tacnode

Partitioned Table

Partitioned Table Overview

As data in a single table grows over time, managing and querying it can become difficult. To address this, tables can be partitioned, using their characteristics to divide the data into smaller storage blocks. This method can significantly improve query performance, simplify data management, and enhance availability.

A partitioned table is divided into smaller segments called partitions. Each partition contains a portion of the table's data. These partitions are invisible to higher-level applications, which still see the table as a single entity.

Each partition is named and can have unique storage characteristics; for instance, different partitions might be stored in separate tablespaces.

When adding data to a partitioned table, a partition key is typically defined to determine which partition the data should go to. The specific partition for each data piece is identified based on the partition key's value or its calculated result. This partition key can consist of one or more columns.

Appropriate Scenarios

  1. Partition tables are a data management method rather than a traditional approach to optimizing query performance. If not designed effectively and without proper partition pruning, they can hinder query performance for many statements.

  2. The core idea involves logically dividing a large table into smaller, manageable segments. When a query or update targets a significant portion of a single partition, performance can be enhanced by leveraging sequential scans instead of relying on indexes and the scattered random access reads across the table.

  3. The table holds vast historical data in a clear chronological order. It is often essential to clean up or archive this data.

  4. Data must be organized hierarchically, storing various segments on different media. Implementing partition tables is a viable option in this context.

  5. Planning the number of partitions, estimating the maximum required partitions, and establishing a sensible partitioning strategy are crucial.

Advantages of Partitioned Tables

  1. Enhance SQL query performance

Incorporating partition keys in the WHERE clause of SQL queries allows for quick identification of the relevant partitions. This process, known as Partition Pruning, significantly reduces the data scanning range, dramatically improving query performance.

Additionally, when JOIN operations involve multiple partitioned tables based on the join key, the engine can convert the join of two large tables into a more efficient partition-level join. This optimization dramatically accelerates the joining process, known as a partition-wise join.

  1. Enhance table manageability

With partitioned tables, traditional table-level management operations are now decentralized to each partition, allowing independent maintenance tasks. Maintenance on large tables can be broken down into smaller tasks executed during different time intervals. For instance, a typical table backup can now focus on a single partition.

  1. Enhance data availability

After partitioning, each partition operates independently. Actions performed on one partition do not impact the data accessibility of others. If a partition fails, it won’t disrupt transactions on other partitions. Furthermore, partitions can be stored across various table spaces to alleviate I/O pressure.

Partitioning Strategy

Depending on different application scenarios, you can choose different partitioning strategies for the table. The basic partitioning strategy provided by the engine is as follows:

  • Range Partition

  • Hash Partition

  • List Partition

Based on the basic partitioning strategy, there can be further Composite Partition strategies.

Range Partition

Range partitioning organizes partitions based on a set range. It works best for similar, clearly ordered data, allowing easy definition of the partition range in line with the data's sequence. A common use case for range partitioning is dividing data by time, often utilizing a time-based partition key. The partition range should be continuous, without gaps or overlaps, and the boundary values must be compatible with the partition key type of the main table.

-- Create a RANGE Partitioned Table
CREATE TABLE table_name (
    column_definitions... )
    PARTITION BY RANGE (partition_key);
 
-- Create a RANGE Partition
CREATE TABLE partition_name PARTITION OF parent_table_name
    FOR VALUES FROM (start_value) TO (end_value);

Parameter Description

  • start_value: The partition's initial value, defined by the FOR VALUES FROM clause; this value is included.
  • end_value: The final value of the partition specified in FOR VALUES TO. This value is excluded, meaning all data greater than or equal to this value will move to the next partition. In the case of the first partition, the lower limit of each partition is the upper limit of the one before it.
  • MIN_VALUE indicates RANGE, representing the minimum possible value in the partition. This particular constant sets the lower bound; when used as the starting value, it signifies that the partition includes the minimum value of the partition key. FOR VALUES FROM (MIN_VALUE) TO (value) shows that the partition range starts at the minimum value and extends to the specified value, excluding that value.
  • MAX_VALUE indicates RANGE, representing the maximum possible value in a partition. This particular constant sets the upper bound; when used as the end value, it signifies that the partition includes the maximum value of the partition key.
  • MIN_VALUE and MAX_VALUE apply only for RANGE partitions and cannot be used for LIST or HASH sub-partitions.
-- The following uses the time field info_time as the partition key and uses the range partitioning method for partitioning
CREATE TABLE range_pt (
    uid integer NOT NULL,
    info_time timestamp NOT NULL,
    money decimal(5,2) NOT NULL,
    PRIMARY KEY (uid,info_time)
) PARTITION BY RANGE (info_time);
 
--- Add three partitions
CREATE TABLE range_pt_2020_1 PARTITION OF range_pt FOR VALUES FROM ('2020-1-01') TO ('2020-1-01'::timestamp + interval '1 month');
CREATE TABLE range_pt_2020_2 PARTITION OF range_pt FOR VALUES FROM ('2020-2-01') TO ('2020-2-01'::timestamp + interval '1 month');
CREATE TABLE range_pt_2020_3 PARTITION OF range_pt FOR VALUES FROM ('2020-3-01') TO ('2020-3-01'::timestamp + interval '1 month');
 
--- You can also add two partitions to separate (minvalue to 2020-1-01) and (2020-4-01 to maxvalue)
CREATE TABLE range_pt_before_2020_1 PARTITION OF range_pt FOR VALUES FROM (MINVALUE) TO ('2020-1-01');
CREATE TABLE range_pt_after_2020_4 PARTITION OF range_pt FOR VALUES FROM ('2020-4-01') TO (MAXVALUE);

You can now view the table structure:

postgres=> \d+ range_pt
                                              Partitioned table "public.range_pt"
  Column   |            Type             | Collation | Nullable | Default | Storage | Compression | Stats target | Description
-----------+-----------------------------+-----------+----------+---------+---------+-------------+--------------+-------------
 uid       | integer                     |           | not null |         | ???     |             |              |
 info_time | timestamp without time zone |           | not null |         | ???     |             |              |
 money     | numeric(5,2)                |           | not null |         | ???     |             |              |
Partition key: RANGE (info_time)
Indexes:
    "range_pt_pkey" PRIMARY KEY, btree (uid ASC, info_time ASC)
Partitions: range_pt_2020_1 FOR VALUES FROM ('2020-1-01') TO ('2020-1-01'::timestamp without time zone + '1 mon'),
            range_pt_2020_2 FOR VALUES FROM ('2020-2-01') TO ('2020-2-01'::timestamp without time zone + '1 mon'),
            range_pt_2020_3 FOR VALUES FROM ('2020-3-01') TO ('2020-3-01'::timestamp without time zone + '1 mon'),
            range_pt_after_2020_4 FOR VALUES FROM ('2020-4-01') TO (MAXVALUE),
            range_pt_before_2020_1 FOR VALUES FROM (MINVALUE) TO ('2020-1-01')
Access method: row_v1

Default partition: You can create a DEFAULT partition for RANGE PARTITION. If data is not assigned to an existing partition, it will be written to the default partition.

CREATE TABLE range_default PARTITION OF range_pt DEFAULT;

Partitioned tables can be defined using expressions.

-- Create the primary table, add the PARTITION BY RANGE (DATE_TRUNC('day', created_at)) clause, and define the partition key. DATE_TRUNC('day', created_at) indicates partitioning by the date part of the created_at field.
CREATE TABLE sales
(
    id         SERIAL,
    product_id INT            NOT NULL,
    quantity   INT            NOT NULL,
    price      DECIMAL(10, 2) NOT NULL,
    created_at TIMESTAMP      NOT NULL
) PARTITION BY RANGE (DATE_TRUNC('day', created_at));
 
-- Create partitions
CREATE TABLE sales_y2023m01 PARTITION OF sales
    FOR VALUES FROM ('2023-01-01') TO ('2023-02-01');
 
CREATE TABLE sales_y2023m02 PARTITION OF sales
    FOR VALUES FROM ('2023-02-01') TO ('2023-03-01');
 
-- Insert data
INSERT INTO sales (product_id, quantity, price, created_at)
VALUES (1, 10, 100.00, '2023-01-15 10:30:00'),
       (2, 5, 50.00, '2023-01-20 15:45:00'),
       (1, 8, 80.00, '2023-02-03 09:15:00'),
       (3, 3, 30.00, '2023-02-18 14:00:00');

Hash Partition

Hash Partition uses a hash algorithm on the designated PARTITION KEY to determine the data's storage partition. This technique distributes data evenly and randomly across partitions to achieve consistent partition sizes.

Given its random distribution, hash partitioning may not be suitable for handling historical data that follows a specific chronological order. However, it excels when there is a need for an even data allocation across different storage devices. To maximize the effectiveness of hash partitioning, keep these guidelines in mind:

  1. Select a partition key with a unique column (Unique) or numerous unique values (Almost Unique).

  2. When creating hash partitions, choose a quantity that is a power of 2, such as 2, 4, 8, 16, etc.

Tables that use hash partitioning are created with the partition clause in the CREATE TABLE statement, specifically by employing PARTITION BY HASH. During partition creation, you can assign a name to each partition and its corresponding tablespace.

--Create a HASH partitioned table
CREATE TABLE table_name (
    column_definitions... )
    PARTITION BY HASH (partition_key);
 
--Create HASH partition
CREATE TABLE child_table_name PARTITION OF parent_table_name
    FOR VALUES WITH (MODULUS M, REMAINDER R);

Parameter Description:

  • MODULUS: specifies the total number of hash partitions, which must be a positive integer.
  • REMAINDER: Specifies the partition number, ranging from 0 to modulus -1.
CREATE TABLE hash_pt (
    userid int4,
    username character varying(64),
    ctime timestamp(6) without time zone,
    PRIMARY KEY(userid))
PARTITION BY HASH(userid);
 
-- Add 16 partitions:
CREATE TABLE hash_pt_0 PARTITION OF hash_pt FOR VALUES WITH (MODULUS 16, REMAINDER 0);
CREATE TABLE hash_pt_1 PARTITION OF hash_pt FOR VALUES WITH (MODULUS 16, REMAINDER 1);
CREATE TABLE hash_pt_2 PARTITION OF hash_pt FOR VALUES WITH (MODULUS 16, REMAINDER 2);
CREATE TABLE hash_pt_3 PARTITION OF hash_pt FOR VALUES WITH (MODULUS 16, REMAINDER 3);
CREATE TABLE hash_pt_4 PARTITION OF hash_pt FOR VALUES WITH (MODULUS 16, REMAINDER 4);
CREATE TABLE hash_pt_5 PARTITION OF hash_pt FOR VALUES WITH (MODULUS 16, REMAINDER 5);
CREATE TABLE hash_pt_6 PARTITION OF hash_pt FOR VALUES WITH (MODULUS 16, REMAINDER 6);
CREATE TABLE hash_pt_7 PARTITION OF hash_pt FOR VALUES WITH (MODULUS 16, REMAINDER 7);
CREATE TABLE hash_pt_8 PARTITION OF hash_pt FOR VALUES WITH (MODULUS 16, REMAINDER 8);
CREATE TABLE hash_pt_9 PARTITION OF hash_pt FOR VALUES WITH (MODULUS 16, REMAINDER 9);
CREATE TABLE hash_pt_10 PARTITION OF hash_pt FOR VALUES WITH (MODULUS 16, REMAINDER 10);
CREATE TABLE hash_pt_11 PARTITION OF hash_pt FOR VALUES WITH (MODULUS 16, REMAINDER 11);
CREATE TABLE hash_pt_12 PARTITION OF hash_pt FOR VALUES WITH (MODULUS 16, REMAINDER 12);
CREATE TABLE hash_pt_13 PARTITION OF hash_pt FOR VALUES WITH (MODULUS 16, REMAINDER 13);
CREATE TABLE hash_pt_14 PARTITION OF hash_pt FOR VALUES WITH (MODULUS 16, REMAINDER 14);
CREATE TABLE hash_pt_15 PARTITION OF hash_pt FOR VALUES WITH (MODULUS 16, REMAINDER 15);

View table structure

postgres=> \d+ hash_pt
                                               Partitioned table "public.hash_pt"
  Column  |              Type              | Collation | Nullable | Default | Storage | Compression | Stats target | Description
----------+--------------------------------+-----------+----------+---------+---------+-------------+--------------+-------------
 userid   | integer                        |           | not null |         | ???     |             |              |
 username | character varying(64)          |           |          |         | ???     |             |              |
 ctime    | timestamp(6) without time zone |           |          |         | ???     |             |              |
Partition key: HASH (userid)
Indexes:
    "hash_pt_pkey" PRIMARY KEY, btree (userid ASC)
Partitions: hash_pt_0 FOR VALUES WITH (MODULUS 16, REMAINDER 0),
            hash_pt_1 FOR VALUES WITH (MODULUS 16, REMAINDER 1),
            hash_pt_10 FOR VALUES WITH (MODULUS 16, REMAINDER 10),
            hash_pt_11 FOR VALUES WITH (MODULUS 16, REMAINDER 11),
            hash_pt_12 FOR VALUES WITH (MODULUS 16, REMAINDER 12),
            hash_pt_13 FOR VALUES WITH (MODULUS 16, REMAINDER 13),
            hash_pt_14 FOR VALUES WITH (MODULUS 16, REMAINDER 14),
            hash_pt_15 FOR VALUES WITH (MODULUS 16, REMAINDER 15),
            hash_pt_2 FOR VALUES WITH (MODULUS 16, REMAINDER 2),
            hash_pt_3 FOR VALUES WITH (MODULUS 16, REMAINDER 3),
            hash_pt_4 FOR VALUES WITH (MODULUS 16, REMAINDER 4),
            hash_pt_5 FOR VALUES WITH (MODULUS 16, REMAINDER 5),
            hash_pt_6 FOR VALUES WITH (MODULUS 16, REMAINDER 6),
            hash_pt_7 FOR VALUES WITH (MODULUS 16, REMAINDER 7),
            hash_pt_8 FOR VALUES WITH (MODULUS 16, REMAINDER 8),
            hash_pt_9 FOR VALUES WITH (MODULUS 16, REMAINDER 9)
Access method: row_v1

List Partition

List partitioning involves defining unique values (lists) for each partition. When the partition key aligns with a distinct value, the related data is saved in that specific partition. This approach facilitates personalized data organization, such as sorting data by region.

To set up list-partitioned tables, use the CREATE TABLE clause alongside the SELECT statement, employ PARTITION BY LIST, and specify a list of unique values for each partition. The partition key values must remain distinct.

--Create a LIST partitioned table
CREATE TABLE table_name (
    column_definitions... )
    PARTITION BY LIST (partition_key);
 
--Create LIST partitions
CREATE TABLE partition_name PARTITION OF parent_table_name
    FOR VALUES IN (value1, value2, ...);

Parameter Description

  • value1, value2, ...: One or more values ​​of the partition key, separated by commas. Use the FOR VALUES IN clause to specify. Each partition can contain one or more discrete values. Partition values ​​cannot appear repeatedly in multiple partitions.
--Create a List partitioned table by country field
CREATE TABLE list_pt (
    userid int4,
    username character varying(64),
    country character varying(2),
    ctime timestamp(6) without time zone,
    PRIMARY KEY(userid, country))
PARTITION BY LIST(country);
 
CREATE TABLE list_pt_1 PARTITION OF list_pt FOR VALUES IN ('AT', 'DE', 'IT', 'FR', 'ES');
CREATE TABLE list_pt_2 PARTITION OF list_pt FOR VALUES IN ('US');
CREATE TABLE list_pt_3 PARTITION OF list_pt FOR VALUES IN ('CN');

View table structure

postgres=> \d+ list_pt
                                               Partitioned table "public.list_pt"
  Column  |              Type              | Collation | Nullable | Default | Storage | Compression | Stats target | Description
----------+--------------------------------+-----------+----------+---------+---------+-------------+--------------+-------------
 userid   | integer                        |           | not null |         | ???     |             |              |
 username | character varying(64)          |           |          |         | ???     |             |              |
 country  | character varying(2)           |           | not null |         | ???     |             |              |
 ctime    | timestamp(6) without time zone |           |          |         | ???     |             |              |
Partition key: LIST (country)
Indexes:
    "list_pt_pkey" PRIMARY KEY, btree (userid ASC, country ASC)
Partitions: list_pt_1 FOR VALUES IN ('AT', 'DE', 'IT', 'FR', 'ES'),
            list_pt_2 FOR VALUES IN ('US'),
            list_pt_3 FOR VALUES IN ('CN')
Access method: row_v1

Composite Partitions

Composite partitions, as the term suggests, integrate various partitioning strategies. Each partition undergoes another application of the original partitioning method. For instance, range partitioning can be applied to each resulting partition, leading to several sub-partitions. This concept can be likened to China, which is divided into provinces (partitions) and cities (sub-partitions).

The three fundamental partitioning strategies can be freely combined when employing composite partitioning. Range partitioning is the primary strategy, and its sub-partitions utilize range, hash, and list partitioning methods.

A practical example is a secondary branch office where all sub-bureaus are organized as LIST partitions.

--This is an example of two-level partitioning: the first level is organized by province, while the second level is sorted by city.
--A parent table named `places` has been created and partitioned by province and city. When data is inserted, it automatically navigates to the appropriate province and city partition table. This partitioning method is ideal for situations where the list value range is clear and well-defined.
CREATE TABLE places
(
    province   TEXT,
    city       TEXT,
    address    TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (province, city)
) PARTITION BY LIST (province);
 
CREATE TABLE places_california PARTITION OF places FOR VALUES IN ('California') PARTITION BY LIST (city);
CREATE TABLE places_texas PARTITION OF places FOR VALUES IN ('Texas') PARTITION BY LIST (city);
 
--Create partition tables for other states
CREATE TABLE places_california_la PARTITION OF places_california FOR VALUES IN ('Los Angeles');
CREATE TABLE places_california_sf PARTITION OF places_california FOR VALUES IN ('San Francisco');
 
--Create a partition table for other cities in Texas
CREATE TABLE places_texas_houston PARTITION OF places_texas FOR VALUES IN ('Houston');
CREATE TABLE places_texas_dallas PARTITION OF places_texas FOR VALUES IN ('Dallas');

An example of a first-level partition of LIST and a second-level partition of RANGE.

--Example of creating a secondary partition table. The first level uses LIST partitions by tenant ID, and the second uses RANGE partitions by order time. To improve query efficiency, you can quickly locate the corresponding partition based on tenant_id and order_date.
--Create the main table orders, specify the first level LIST partition key as tenant_id
CREATE TABLE orders
(
    order_id   SERIAL,
    tenant_id  INT,
    order_date DATE,
    amount     NUMERIC(10, 2)
) PARTITION BY LIST (tenant_id);
 
--Create a partition table for each tenant and specify the second-level RANGE partition key order_date
CREATE TABLE orders_tenant1 PARTITION OF orders
    FOR VALUES IN (1)
    PARTITION BY RANGE (order_date);
 
CREATE TABLE orders_tenant2 PARTITION OF orders
    FOR VALUES IN (2)
    PARTITION BY RANGE (order_date);
 
--Create monthly RANGE partitions for each tenant's partition table
CREATE TABLE orders_tenant1_202301 PARTITION OF orders_tenant1
    FOR VALUES FROM ('2023-01-01') TO ('2023-02-01');
 
CREATE TABLE orders_tenant1_202302 PARTITION OF orders_tenant1
    FOR VALUES FROM ('2023-02-01') TO ('2023-03-01');
 
CREATE TABLE orders_tenant2_202301 PARTITION OF orders_tenant2
    FOR VALUES FROM ('2023-01-01') TO ('2023-02-01');
 
CREATE TABLE orders_tenant2_202302 PARTITION OF orders_tenant2
    FOR VALUES FROM ('2023-02-01') TO ('2023-03-01');

An example is where the first-level partition is LIST, and the second-level partition is partly RANGE and partly HASH.

--First create the orders table, use LIST partition, including a DEFAULT partition
CREATE TABLE orders
(
    order_id    INT,
    order_date  DATE,
    customer_id INT,
    amount      DECIMAL(10, 2),
    region      VARCHAR(20)
)
    PARTITION BY LIST (region);
 
--For the 'North', 'East' partitions orders_pRegionA, use RANGE secondary partitioning
CREATE TABLE orders_pRegionA
    PARTITION OF orders
        FOR VALUES IN ('North', 'East')
    PARTITION BY RANGE (order_date);
 
CREATE TABLE orders_pRegionA_2022
    PARTITION OF orders_pRegionA
        FOR VALUES FROM (MINVALUE) TO ('2023-01-01');
 
CREATE TABLE orders_pRegionA_2023
    PARTITION OF orders_pRegionA
        FOR VALUES FROM ('2023-01-01') TO (MAXVALUE);
 
--For the 'South' and 'West' partitions orders_pRegionB, RANGE secondary partitioning is also used
CREATE TABLE orders_pRegionB
    PARTITION OF orders
        FOR VALUES IN ('South', 'West')
    PARTITION BY RANGE (order_date);
 
CREATE TABLE orders_pRegionB_2022
    PARTITION OF orders_pRegionB
        FOR VALUES FROM (MINVALUE) TO ('2023-01-01');
 
CREATE TABLE orders_pRegionB_2023
    PARTITION OF orders_pRegionB
        FOR VALUES FROM ('2023-01-01') TO (MAXVALUE);
 
--For the DEFAULT partition orders_pDefault, use HASH partitioning, MODULUS is 5, and create 5 HASH partitions
CREATE TABLE orders_pDefault
    PARTITION OF orders
        DEFAULT
    PARTITION BY HASH (customer_id);
 
CREATE TABLE orders_pDefault_1 PARTITION OF orders_pDefault FOR VALUES WITH (MODULUS 5, REMAINDER 0);
CREATE TABLE orders_pDefault_2 PARTITION OF orders_pDefault FOR VALUES WITH (MODULUS 5, REMAINDER 1);
CREATE TABLE orders_pDefault_3 PARTITION OF orders_pDefault FOR VALUES WITH (MODULUS 5, REMAINDER 2);
CREATE TABLE orders_pDefault_4 PARTITION OF orders_pDefault FOR VALUES WITH (MODULUS 5, REMAINDER 3);
CREATE TABLE orders_pDefault_5 PARTITION OF orders_pDefault FOR VALUES WITH (MODULUS 5, REMAINDER 4);

Partitioned Table Maintenance

Adding a Partition

  1. Create a new table using CREATE TABLE PARTITION OF statements
CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXISTS ] table_name
    PARTITION OF parent_table [ (
  { column_name [ WITH OPTIONS ] [ column_constraint [ ... ] ]
    | table_constraint }
    [, ... ]
) ] { FOR VALUES partition_bound_spec | DEFAULT }
[ PARTITION BY { RANGE | LIST | HASH } ( { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [, ... ] ) ]
[ USING method ]
[ WITH ( storage_parameter [= value] [, ... ] )]
[ TABLESPACE tablespace_name ]

Example

--Partition by date or timestamp:
PARTITION BY RANGE (created_at)
 
--Partition by integer ID:
PARTITION BY RANGE (user_id)
 
--Partition by hash value:
PARTITION BY HASH (id)
 
--Partition by list:
PARTITION BY LIST (country)
 
--Partition by combination keys:
PARTITION BY RANGE (created_at, user_id)
 
--Using expressions or functions:
PARTITION BY RANGE (DATE_TRUNC('day', created_at))
PARTITION BY RANGE (EXTRACT(YEAR FROM created_at))
  1. ALTER TABLE ... ATTACH PARTITION ... First, build the table, and then add partitions to the parent table by
ALTER TABLE table_name ATTACH PARTITION partition_name FOR VALUES FROM () TO ();

Notices:

  • It's advisable to set the partitioning method when establishing the base table. If you choose the ATTACH method, the new partition table must match the main table's structure exactly, and its range must not overlap with any existing partitions.

  • You can query partitioned tables just like regular ones.

  • When adding new data, you can use standard INSERT statements. The partition table will automatically direct the data to the correct partition based on the partition key.

Querying a partitioned table resembles a standard table using a SELECT statement. However, Tacnode efficiently identifies and scans only the necessary partitions for the query. To optimize performance, it's advisable to include the partition's key condition in the WHERE clause to narrow the query's focus.

  • If the table has a primary key, the partition key must include part or all of that primary key. The primary key constraint can only be applied at the partition table level, not individually for each partition.

Deleting a Partition

  1. To delete the partition table directly, use DROP TABLE partition_table to finalize the action.

  2. To complete the process, execute ALTER TABLE parent_table DETACH PARTITION partition_name to retain the partition table data.

Replace a Partition

  1. DETACH works with the current partition table.

  2. For the new partition table, ATTACH it to the parent table.

Merge Partitions

  1. DETACH Execute actions on the partitions intended for merging.

  2. Data from the two new partitions is recorded in the same table.

  3. Integrate the new table ATTACH into the parent table.

  4. Remove unnecessary data.

Split Partition

  1. DETACH Execute operations on the partition intended for splitting.

  2. Utilize SQL to divide the partitions designated for splitting into multiple tables.

  3. Integrate new partitions ATTACH into the parent table.

  4. Eliminate unnecessary data.

Precautions

  • Select an appropriate partition key:

    • The partition key should be a column commonly used in query conditions, as partition pruning can enhance query efficiency.
    • Fundamental partition values should be well-distributed to prevent performance issues from data skew.
  • Pay attention to partition boundaries:

    • Partition boundaries must be contiguous and non-overlapping.
    • When setting these boundaries, consider the anticipated data growth and ensure enough space is reserved for future increases.
  • Manage the number of partitions:

    • Avoid excessive partitions as this may complicate management and heighten system overhead.
    • Regularly assess the number of partitions and adjust based on data volume and access patterns.
    • Add new partitions promptly to accommodate incoming data and avoid insertion failures for range partitions.
    • Timely archive or delete any unnecessary historical partitions to free up storage space.
  • Exercise caution with DDL operations:

    • DDL operations on a partitioned table (like adding columns) will affect all partitions and may be time-consuming.
    • Conduct DDL operations during off-peak times or use the CREATE TABLE ... AS method to create a new table before swapping.
  • Be aware of the limitations of partition tables:

    • Partitioned tables do not support specific features, including foreign keys or ON CONFLICT clauses.
    • Some functions (such as DISTINCT and GROUP BY) may be restricted when querying a partitioned table.