tacnode

Index Design

Index Overview

Consider a table structured as follows:

CREATE TABLE test1 (
id integer,
content varchar
);

The application frequently executes queries in this format:

SELECT content FROM test1 WHERE id = constant;

Without an index, the system must scan the entire test1 table row by row to find matching entries. This method is inefficient if test1 contains many rows but only returns a few (or none). By creating an index on the id column, the system can use a more efficient strategy to locate matching rows, potentially requiring only a few IO operations.

To create an index on the id column, use the following command:

CREATE INDEX test1_id_index ON test1 (id);

Index Type

Three index types are available: B-Tree, GIN, and SPLIT_BITMAP. Index names must be unique within the schema. Each index type uses a distinct algorithm and is optimized for different queries. By default, the CREATE INDEX command generates a B-Tree index. To create a different index type, specify the desired type after the keyword USING. For example, to create a SPLIT_BITMAP index:

CREATE TABLE hybrid_t
(
    a int PRIMARY KEY,
    b int
) USING hybrid;
 
CREATE INDEX hybrid_t_b_bitmap ON hybrid_t USING split_bitmap (b);

B-Tree

B-trees handle equality and range queries on sortable data. The following two statements for index creation are equivalent.

CREATE INDEX hybrid_t_b_btree ON hybrid_t (b);
CREATE INDEX hybrid_t_b_btree ON hybrid_t USING btree (b);

The Tacnode query optimizer considers B-tree indexes when the indexed columns are compared using any of the following operators:

<   <=   =   >=   >

Bitmap

Columnar storage tables use bitmap indexes, a database structure designed to enhance query performance, especially for columns with low cardinality and few distinct values.

CREATE INDEX hybrid_t_b_bitmap ON hybrid_t USING split_bitmap (b);

Starting with version V1.0, the engine automatically generates adaptive Bitmap indexes for text types (TEXT/CHAR/VARCHAR/BPCHAR), date-time types (DATE), and enumeration types (ENUM). Manually creating bitmap indexes is no longer necessary. The engine determines how to produce the index file based on data characteristics, and users can adjust the adaptive behavior by altering parameters.

CREATE INDEX hybrid_t_b_adp_bitmap ON hybrid_t USING split_bitmap (b) with (adaptive=true);

GIN

The Generalized Inverted Index (GIN) is an "inverted index" suitable for data with multiple combined values.

CREATE TABLE products (
    id serial PRIMARY KEY,
    name text,
    tags text[]
);
 
INSERT INTO products (name, tags) VALUES
    ('Product A', ARRAY['electronics', 'gadgets']),
    ('Product B', ARRAY['clothing', 'accessories']),
    ('Product C', ARRAY['electronics', 'accessories']),
    ('Product D', ARRAY['furniture']);
 
CREATE INDEX products_tags_gin ON products USING gin (tags);

Tacnode enables index queries through these operators:

<@   @>   =   &&

Locate products labeled as 'furniture':

SELECT count(*) FROM products WHERE tags @> '{"furniture"}';

Multi-column Index

An index, which can be either a B-Tree or a bitmap index, may be created on several table columns. For instance, consider a table structured as follows:

CREATE TABLE test2 (
  major int,
  minor int,
  name varchar
);

Examples include queries like:

SELECT name FROM test2 WHERE major = constant AND minor = constant;

In this case, it would be advisable to create an index on both the major and minor columns together, as shown:

CREATE INDEX test2_mm_idx ON test2 (major, minor);

It's essential to recognize that the column order in a B-Tree index greatly influences index lookup efficiency. However, this is only valid if a constraint exists on the leading (leftmost) column. Conversely, the column order in a bitmap index does not impact performance. Note that with bitmap indexes, the effect of defining a combined index is the same as defining each column separately; the difference lies in the method of writing.

Concurrent Index Building

Creating indexes can disrupt the database's regular functioning. Typically, Tacnode locks the table intended for indexing, preventing both reads and writes and completes the entire index build in one table scan. This can have significant consequences if the system is a live production database.

Tacnode allows index creation without blocking writes. To activate this feature, use the CONCURRENTLY option in the CREATE INDEX command.

CREATE INDEX CONCURRENTLY test2_mm_idx ON test2 (major, minor);

Best Practices

  1. Choose the appropriate column:
  • Generate indexes for columns commonly found in WHERE clauses, JOIN conditions, and ORDER BY or GROUP BY clauses.
  • Consider implementing B-Tree indexes on high cardinality columns (those with a wide variety of values) to assist in filtering and sorting.
  • For filtering purposes, consider bitmap indexes for low cardinality columns (those with many repeated values).
  1. Avoid over-indexing:
  • Do not create indexes on columns that are seldom or never queried.
  • Create indexes only when query performance requires enhancement.
  • Avoid duplicate indexes like this:
CREATE INDEX ON t(a);
 
-- This composite index includes the index mentioned above.
CREATE INDEX ON t(a, b);
  1. Consider the size of indexes:
  • Indexes consume storage space, and large ones can affect disk I/O and writing performance.
  • Weigh the advantages of indexes against your database's storage constraints.

Example

Creating a B-tree index

CREATE INDEX t_b_idx ON t(b);

Creating a split_bitmap index

Note: split_bitmap indexes do not apply to row tables.

CREATE INDEX t_b_bitmap ON t USING split_bitmap(b);

Avoid blocking reads and writes when adding an index

CREATE INDEX CONCURRENTLY t_b_idx ON t(b);

Verify if the index is utilized in the query

EXPLAIN SELECT a FROM t WHERE b = 5;

On this page