Index Design
Index Overview
Take this table as an example:
Applications often query patterns like:
Without preparation, the system performs a sequential scan of every row in test1
to find matching entries. With many rows in the table and only a few—often zero or one—matching such queries, this approach is highly inefficient. However, with an index on the id
column, the system can use a much faster lookup mechanism—potentially requiring just a few IO operations to find the matching row.
You can create such an index using:
Index Type
Multiple index types supported: B-tree, GIN, and SPLIT_BITMAP. Each index within a given Schema
must have a unique name. Different algorithms are used, matching different query requirements. The default for CREATE INDEX
is a B-tree index. Use the USING
clause to select another index type. For example, to create a SPLIT_BITMAP
index:
B-tree
B-tree handles equality and range queries efficiently for sortable data types. These two create equivalent indexes:
Suppose you have a transactions
table:
For queries like WHERE transaction_date = '2023-04-01'
, a B-tree index boosts performance:
- Best for: Range queries, equality, ordering; operators such as
=
,<
,>
,BETWEEN
. Recommended for numeric, string, and date fields.
Bitmap
Column store tables support bitmap indexes—data structures well-suited to columns with low cardinality (few distinct values).
Tacnode automatically creates adaptive Bitmap indexes for TEXT/CHAR/VARCHAR/BPCHAR, DATE, ENUM fields. Manual bitmap index creation is rarely needed; adaptations are managed via the adaptive
parameter.
Consider the sales
table with dimensions like region
and product_category
:
For queries like WHERE region = 'North America' AND product_category = 'Electronics'
, Bitmap indexes accelerate compound queries:
- Best for: Low-cardinality columns; analytical workloads; queries with multi-condition (
AND
,OR
) logic.
GIN
GIN (Generalized Inverted Index)—for data with multiple composite values.
Suppose you store product tags:
- Best for: Array (e.g.,
TEXT[]
),JSONB/JSON
, and full-text search columns. Useful operators:&&
, or functions liketo_tsvector()
. For text data, an operator class like 'gin_trgm_ops' is needed; not necessary for arrays or JSON.
Tacnode supports the following index operators:
Sample query for products tagged 'furniture':
Multi-Column Index
Indexes may span multiple columns (B-tree or bitmap):
Example:
Consider a users
table:
Accelrates queries filtering on both firstname
and lastname
. For example:
- When a query filter involves two or more fields.
- Effective for queries with multiple AND’ed conditions (e.g. composite searches in web apps).
- Fields with high combined selectivity—i.e., the combination narrows results substantially.
Example use cases:
- E-commerce product search:
category
+price range
- Detecting duplicates by both first and last name in social networks
Notes:
- Column order matters for B-tree composite indexes; constraints must apply to leading columns for index usage. Column order in Bitmap indexes makes no practical difference; a composite bitmap index and single-column indexes functionally equivalent for performance.
- Avoid excessively wide composite indexes; very large indexes increase disk usage and maintenance cost, slow inserts and updates.
- Beware index overlap/conflict; evaluate if extra single-column indexes are still beneficial when using composites (no single-column coverage could under-utilize the composite).
Partial Index
Partial Index (Conditional or Filtered Index) targets only table rows that match specified criteria, not the whole table.
Key properties:
- Selective: Only includes rows matching the condition; avoids indexing rarely queried data
- Conditional:
WHERE
clause specifies which rows to index; modifications to unindexed rows don’t update this index - Space efficient: Saves 30–90% storage compared to a full index
- Faster queries: Smaller indexes yield quicker search times
- Solves edge scenarios: e.g., sparse data, hot sub-sets
Syntax:
Example:
Suppose orders
table often filters by status = 'completed'
:
For such queries:
Partial index directly supports this case; non-matching queries (status != 'completed'
) use other indexes or a table scan.
Recommended scenarios:
-
Sparse columns: e.g.
-
Business filtering: Only query specific statuses
-
Time-range locality: Only index recent data
-
Unique value indexing:
Considerations:
- Query must match index condition for automatic use; unmatched queries fall back to other strategies.
- Slightly higher maintenance complexity; index logic requires precise matching with business querying patterns.
- Condition selectivity: Should filter out a large portion of data (typically >90%).
- Composite partial indexes: Use conditions and multiple columns for greater effect.
Expression Index
Expression indexes index the computed result of an expression, rather than storing a column’s raw value. They apply a function or computation to one/more columns, then index based on that result. When queries use similar computations in their WHERE
clause, the indexed search greatly improves performance.
Example:
Employee table:
Frequently search absolute salaries:
Create an index on the expression:
Subsequent queries matching the expression can benefit from index scans.
When to use:
- Frequently target computed forms of a column (functions, arithmetic) in filters
WHERE
conditions routinely use functions/operations on columns
Concurrent Index Building
Index creation can disrupt normal operations. By default, Tacnode locks the target table during index build—blocking reads and writes during a table scan—potentially a major issue in production environments.
Tacnode supports non-blocking, concurrent index construction using the CONCURRENTLY
keyword.
Best Practices
- Selecting the Right Columns:
- Indexing columns frequently used in WHERE clauses, JOIN conditions, as well as ORDER BY or GROUP BY clauses
- Considering B-tree indexes on high-cardinality columns (columns with many distinct values) for efficient filtering and sorting
- Considering bitmap indexes on low-cardinality columns (columns with many repeated values) for filtering
- Avoiding Excessive Indexing:
- Not creating indexes on columns rarely or never used in queries; excessive B-tree indexes may introduce write amplification, reducing write and update throughput
- Creating indexes only when query performance enhancement is required
- Avoiding redundant indexes, for example:
- Being Aware of Index Size:
- Recognizing that indexes consume storage space; large indexes may impact disk I/O and write performance
- Balancing indexing benefits against storage limitations of the database