Index Design

Index Overview

Take this table as an example:

CREATE TABLE test1 (
    id integer,
    content varchar
);

Applications often query patterns like:

SELECT content FROM test1 WHERE id = constant;

Without preparation, the system performs a sequential scan of every row in test1 to find matching entries. With many rows in the table and only a few—often zero or one—matching such queries, this approach is highly inefficient. However, with an index on the id column, the system can use a much faster lookup mechanism—potentially requiring just a few IO operations to find the matching row.

You can create such an index using:

CREATE INDEX test1_id_index ON test1 (id);

Index Type

Multiple index types supported: B-tree, GIN, and SPLIT_BITMAP. Each index within a given Schema must have a unique name. Different algorithms are used, matching different query requirements. The default for CREATE INDEX is a B-tree index. Use the USING clause to select another index type. For example, to create a SPLIT_BITMAP index:

CREATE INDEX idx_content_bitmap ON test1 USING split_bitmap (content);

B-tree

B-tree handles equality and range queries efficiently for sortable data types. These two create equivalent indexes:

CREATE INDEX idx_b ON tbl (b);
CREATE INDEX idx_b ON tbl USING btree (b);

Suppose you have a transactions table:

CREATE TABLE transactions (
    id SERIAL PRIMARY KEY,
    user_id INT,
    amount DECIMAL,
    transaction_date DATE
);

For queries like WHERE transaction_date = '2023-04-01', a B-tree index boosts performance:

CREATE INDEX idx_transaction_date ON transactions (transaction_date);
  • Best for: Range queries, equality, ordering; operators such as =, <, >, BETWEEN. Recommended for numeric, string, and date fields.

Bitmap

Column store tables support bitmap indexes—data structures well-suited to columns with low cardinality (few distinct values).

CREATE INDEX idx_b_bitmap ON tbl USING split_bitmap (b);

Tacnode automatically creates adaptive Bitmap indexes for TEXT/CHAR/VARCHAR/BPCHAR, DATE, ENUM fields. Manual bitmap index creation is rarely needed; adaptations are managed via the adaptive parameter.

CREATE INDEX idx_b_bitmap ON tbl USING split_bitmap (b) with (adaptive=true);

Consider the sales table with dimensions like region and product_category:

CREATE TABLE sales (
    id SERIAL PRIMARY KEY,
    region TEXT,
    product_category TEXT,
    sales_amount DECIMAL
);

For queries like WHERE region = 'North America' AND product_category = 'Electronics', Bitmap indexes accelerate compound queries:

CREATE INDEX idx_region ON sales USING split_bitmap (region);
CREATE INDEX idx_product_category ON sales USING split_bitmap (product_category);
  • Best for: Low-cardinality columns; analytical workloads; queries with multi-condition (AND, OR) logic.

GIN

GIN (Generalized Inverted Index)—for data with multiple composite values.

Suppose you store product tags:

CREATE TABLE products (
    id serial PRIMARY KEY,
    name text,
    tags text[]
);
 
INSERT INTO products (name, tags) VALUES
    ('Product A', ARRAY['electronics', 'gadgets']),
    ('Product B', ARRAY['clothing', 'accessories']),
    ('Product C', ARRAY['electronics', 'accessories']),
    ('Product D', ARRAY['furniture']);
 
CREATE INDEX products_tags_gin ON products USING gin (tags);
  • Best for: Array (e.g., TEXT[]), JSONB/JSON, and full-text search columns. Useful operators: &&, or functions like to_tsvector(). For text data, an operator class like 'gin_trgm_ops' is needed; not necessary for arrays or JSON.

Tacnode supports the following index operators:

<    =    &&

Sample query for products tagged 'furniture':

SELECT count(*) FROM products WHERE tags  '{"furniture"}';

Multi-Column Index

Indexes may span multiple columns (B-tree or bitmap):

Example:

Consider a users table:

CREATE INDEX idx_fullname ON users (firstname, lastname);

Accelrates queries filtering on both firstname and lastname. For example:

SELECT * FROM users WHERE firstname = 'John' AND lastname = 'Doe';
  • When a query filter involves two or more fields.
  • Effective for queries with multiple AND’ed conditions (e.g. composite searches in web apps).
  • Fields with high combined selectivity—i.e., the combination narrows results substantially.

Example use cases:

  • E-commerce product search: category + price range
  • Detecting duplicates by both first and last name in social networks

Notes:

  1. Column order matters for B-tree composite indexes; constraints must apply to leading columns for index usage. Column order in Bitmap indexes makes no practical difference; a composite bitmap index and single-column indexes functionally equivalent for performance.
  2. Avoid excessively wide composite indexes; very large indexes increase disk usage and maintenance cost, slow inserts and updates.
  3. Beware index overlap/conflict; evaluate if extra single-column indexes are still beneficial when using composites (no single-column coverage could under-utilize the composite).

Partial Index

Partial Index (Conditional or Filtered Index) targets only table rows that match specified criteria, not the whole table.

Key properties:

  • Selective: Only includes rows matching the condition; avoids indexing rarely queried data
  • Conditional: WHERE clause specifies which rows to index; modifications to unindexed rows don’t update this index
  • Space efficient: Saves 30–90% storage compared to a full index
  • Faster queries: Smaller indexes yield quicker search times
  • Solves edge scenarios: e.g., sparse data, hot sub-sets

Syntax:

CREATE INDEX index_name
ON table_name (column_name [, ...])
WHERE condition;

Example:

Suppose orders table often filters by status = 'completed':

CREATE INDEX idx_completed_orders
ON orders (order_date, customer_id)
WHERE status = 'completed';

For such queries:

SELECT customer_id, order_date
FROM orders
WHERE status = 'completed'
  AND order_date > '2023-01-01';

Partial index directly supports this case; non-matching queries (status != 'completed') use other indexes or a table scan.

Recommended scenarios:

  • Sparse columns: e.g.

    CREATE INDEX idx_non_null_phone ON customers(phone)
    WHERE phone IS NOT NULL;
  • Business filtering: Only query specific statuses

    CREATE INDEX idx_unpaid_orders ON orders(total)
    WHERE payment_status = 'unpaid';
  • Time-range locality: Only index recent data

    CREATE INDEX idx_recent_logs ON logs(timestamp)
    WHERE timestamp > '2023-01-01';
  • Unique value indexing:

    CREATE INDEX idx_unique_emails ON users(email)
    WHERE email IN (SELECT email FROM users GROUP BY email HAVING COUNT(*) = 1);

Considerations:

  1. Query must match index condition for automatic use; unmatched queries fall back to other strategies.
  2. Slightly higher maintenance complexity; index logic requires precise matching with business querying patterns.
  3. Condition selectivity: Should filter out a large portion of data (typically >90%).
  4. Composite partial indexes: Use conditions and multiple columns for greater effect.

Expression Index

Expression indexes index the computed result of an expression, rather than storing a column’s raw value. They apply a function or computation to one/more columns, then index based on that result. When queries use similar computations in their WHERE clause, the indexed search greatly improves performance.

Example:

Employee table:

CREATE TABLE employee (name VARCHAR(20), salary INT);

Frequently search absolute salaries:

SELECT * FROM employee WHERE ABS(salary) = 5000;

Create an index on the expression:

CREATE INDEX idx_salary_abs ON employee((ABS(salary)));

Subsequent queries matching the expression can benefit from index scans.

When to use:

  • Frequently target computed forms of a column (functions, arithmetic) in filters
  • WHERE conditions routinely use functions/operations on columns

Concurrent Index Building

Index creation can disrupt normal operations. By default, Tacnode locks the target table during index build—blocking reads and writes during a table scan—potentially a major issue in production environments.

Tacnode supports non-blocking, concurrent index construction using the CONCURRENTLY keyword.

CREATE INDEX CONCURRENTLY test2_mm_idx ON test2 (major, minor);

Best Practices

  1. Selecting the Right Columns:
  • Indexing columns frequently used in WHERE clauses, JOIN conditions, as well as ORDER BY or GROUP BY clauses
  • Considering B-tree indexes on high-cardinality columns (columns with many distinct values) for efficient filtering and sorting
  • Considering bitmap indexes on low-cardinality columns (columns with many repeated values) for filtering
  1. Avoiding Excessive Indexing:
  • Not creating indexes on columns rarely or never used in queries; excessive B-tree indexes may introduce write amplification, reducing write and update throughput
  • Creating indexes only when query performance enhancement is required
  • Avoiding redundant indexes, for example:
CREATE INDEX ON t(a);
 
-- The following composite index already includes the above
CREATE INDEX ON t(a, b);
  1. Being Aware of Index Size:
  • Recognizing that indexes consume storage space; large indexes may impact disk I/O and write performance
  • Balancing indexing benefits against storage limitations of the database

On this page