Partitioned Table

Partitioning divides large tables into smaller, manageable segments called partitions. Each partition contains a subset of the table's data while appearing as a single entity to applications. This guide covers implementing and managing partitioned tables for optimal performance and maintainability.

Overview and Benefits

Partitioning is most effective for:

  • Managing Large Datasets: Tables with millions or billions of rows
  • Time-Series Data: Clear chronological data access patterns
  • Data Archival: Regular cleanup or archival of historical data
  • Hierarchical Storage: Different data segments on different storage media
  • Performance Isolation: Independent maintenance operations per partition

Design Considerations

Partitioning is primarily a data management strategy. Without proper partition pruning, it can negatively impact query performance. Plan your partitioning strategy carefully based on access patterns.

Key Benefits:

  • Enhanced Query Performance: Partition pruning, partition-wise joins, parallel execution
  • Improved Manageability: Independent maintenance, parallel operations, targeted operations
  • Enhanced Availability: Isolated failures, distributed storage, reduced I/O pressure

Partition Architecture:

-- Parent table (logical view)
CREATE TABLE sales_data (
    sale_id BIGINT,
    sale_date DATE,
    customer_id INT,
    amount NUMERIC(10,2)
) PARTITION BY RANGE (sale_date);
 
-- Child partitions (physical storage)
CREATE TABLE sales_2024_q1 PARTITION OF sales_data 
    FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');
 
CREATE TABLE sales_2024_q2 PARTITION OF sales_data 
    FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');
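With a filter on the partition key, the planner prunes away partitions whose bounds cannot match. A quick way to confirm this against the tables above (a minimal check; plan output varies by version):

-- Only sales_2024_q1 should appear in the plan; without a sale_date
-- filter, every partition would be scanned
EXPLAIN (COSTS OFF)
SELECT SUM(amount)
FROM sales_data
WHERE sale_date >= DATE '2024-02-01'
  AND sale_date <  DATE '2024-03-01';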

Partitioning Strategies

Choose the appropriate strategy based on your data distribution and access patterns:

Strategy  | Best For                         | Data Distribution         | Use Cases
Range     | Ordered sequential data          | Continuous ranges         | Time-series, sequential IDs
Hash      | Even distribution                | Random across partitions  | Load balancing, write distribution
List      | Discrete categorical values      | Specific value lists      | Geographic regions, categories
Composite | Complex multi-level organization | Hierarchical distribution | Multi-tenant, time + category

Range Partitioning

Range partitioning organizes data into continuous, non-overlapping ranges. Most effective for ordered data like timestamps or sequential IDs.

Time-Based Partitioning:

-- Create range partitioned table by date
CREATE TABLE order_history (
    order_id BIGSERIAL,
    customer_id INT NOT NULL,
    order_date DATE NOT NULL,
    total_amount NUMERIC(12,2),
    status VARCHAR(20),
    PRIMARY KEY (order_id, order_date)
) PARTITION BY RANGE (order_date);
 
-- Create monthly partitions
CREATE TABLE order_history_2024_01 PARTITION OF order_history
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
 
CREATE TABLE order_history_2024_02 PARTITION OF order_history
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
 
CREATE TABLE order_history_2024_03 PARTITION OF order_history
    FOR VALUES FROM ('2024-03-01') TO ('2024-04-01');
 
-- Handle historical and future data
CREATE TABLE order_history_before_2024 PARTITION OF order_history
    FOR VALUES FROM (MINVALUE) TO ('2024-01-01');
 
CREATE TABLE order_history_future PARTITION OF order_history
    FOR VALUES FROM ('2024-04-01') TO (MAXVALUE);
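Rows inserted through the parent table are routed to the matching partition automatically. To confirm where a row physically landed, you can inspect tableoid (a quick check against the partitions above):

-- Insert through the parent; the row is routed to order_history_2024_02
INSERT INTO order_history (customer_id, order_date, total_amount, status)
VALUES (42, '2024-02-14', 99.95, 'shipped');
 
-- tableoid::regclass reports the physical partition holding each row
SELECT tableoid::regclass AS partition, order_id, order_date
FROM order_history
WHERE order_date = '2024-02-14';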

Numeric Range Partitioning:

-- Partition by customer ID ranges
CREATE TABLE customer_orders (
    order_id BIGSERIAL,
    customer_id INT NOT NULL,
    order_date DATE DEFAULT CURRENT_DATE,
    amount NUMERIC(10,2),
    PRIMARY KEY (order_id, customer_id)
) PARTITION BY RANGE (customer_id);
 
-- Create customer ID range partitions
CREATE TABLE customer_orders_1_1000 PARTITION OF customer_orders
    FOR VALUES FROM (1) TO (1001);
 
CREATE TABLE customer_orders_1001_2000 PARTITION OF customer_orders
    FOR VALUES FROM (1001) TO (2001);
 
CREATE TABLE customer_orders_2001_plus PARTITION OF customer_orders
    FOR VALUES FROM (2001) TO (MAXVALUE);

Expression-Based Partitioning:

Use expressions for more flexible partitioning strategies:

-- Partition by year extraction
CREATE TABLE event_log (
    event_id BIGSERIAL,
    event_timestamp TIMESTAMP NOT NULL,
    event_type VARCHAR(50),
    user_id INT,
    event_data JSONB
    -- Note: no PRIMARY KEY here; unique constraints cannot be declared
    -- when the partition key contains an expression
) PARTITION BY RANGE (EXTRACT(YEAR FROM event_timestamp));
 
-- Create yearly partitions
CREATE TABLE event_log_2023 PARTITION OF event_log
    FOR VALUES FROM (2023) TO (2024);
 
CREATE TABLE event_log_2024 PARTITION OF event_log
    FOR VALUES FROM (2024) TO (2025);
 
-- Partition by day truncation for high-frequency data
CREATE TABLE api_requests (
    request_id BIGSERIAL,
    request_time TIMESTAMP NOT NULL,
    endpoint VARCHAR(200),
    response_status INT,
    processing_time_ms INT
    -- (again no PRIMARY KEY: the partition key is an expression)
) PARTITION BY RANGE (DATE_TRUNC('day', request_time));
 
-- Create daily partitions
CREATE TABLE api_requests_20240115 PARTITION OF api_requests
    FOR VALUES FROM ('2024-01-15') TO ('2024-01-16');
 
CREATE TABLE api_requests_20240116 PARTITION OF api_requests
    FOR VALUES FROM ('2024-01-16') TO ('2024-01-17');

Default Partitions:

Handle data that doesn't match existing partitions:

-- Add default partition for unexpected data
CREATE TABLE order_history_default PARTITION OF order_history DEFAULT;

Default Partition Best Practice

Monitor default partitions regularly. Heavy use of the default partition is a sign that you need additional explicit partitions or should revisit your partitioning strategy.
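One simple way to keep an eye on it is a row-count and size check against the statistics views (a sketch, assuming the order_history_default partition created above):

-- Approximate row count and on-disk size of the default partition
SELECT relname AS partition,
       n_live_tup AS approx_rows,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
WHERE relname = 'order_history_default';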

Hash Partitioning

Hash partitioning distributes data evenly across partitions using a hash function. Ideal for load balancing and preventing hotspots.

-- Create hash partitioned table
CREATE TABLE user_sessions (
    session_id UUID DEFAULT gen_random_uuid(),
    user_id INT NOT NULL,
    login_time TIMESTAMP DEFAULT NOW(),
    logout_time TIMESTAMP,
    ip_address INET,
    user_agent TEXT,
    PRIMARY KEY (session_id, user_id)  -- must include the partition key
) PARTITION BY HASH (user_id);
 
-- Create hash partitions (use powers of 2)
CREATE TABLE user_sessions_0 PARTITION OF user_sessions 
    FOR VALUES WITH (MODULUS 8, REMAINDER 0);
 
CREATE TABLE user_sessions_1 PARTITION OF user_sessions 
    FOR VALUES WITH (MODULUS 8, REMAINDER 1);
 
CREATE TABLE user_sessions_2 PARTITION OF user_sessions 
    FOR VALUES WITH (MODULUS 8, REMAINDER 2);
 
CREATE TABLE user_sessions_3 PARTITION OF user_sessions 
    FOR VALUES WITH (MODULUS 8, REMAINDER 3);
 
CREATE TABLE user_sessions_4 PARTITION OF user_sessions 
    FOR VALUES WITH (MODULUS 8, REMAINDER 4);
 
CREATE TABLE user_sessions_5 PARTITION OF user_sessions 
    FOR VALUES WITH (MODULUS 8, REMAINDER 5);
 
CREATE TABLE user_sessions_6 PARTITION OF user_sessions 
    FOR VALUES WITH (MODULUS 8, REMAINDER 6);
 
CREATE TABLE user_sessions_7 PARTITION OF user_sessions 
    FOR VALUES WITH (MODULUS 8, REMAINDER 7);
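If the hash key has enough cardinality, rows should spread roughly evenly across the partitions. A quick distribution check against the partitions above:

-- Row count per hash partition; large skew suggests a poor hash key
SELECT tableoid::regclass AS partition, COUNT(*) AS row_count
FROM user_sessions
GROUP BY tableoid
ORDER BY tableoid;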

Choose Good Hash Keys:

-- Good: High cardinality unique column
CREATE TABLE transaction_log (
    transaction_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    account_number VARCHAR(20) NOT NULL,
    amount NUMERIC(15,2),
    transaction_time TIMESTAMP DEFAULT NOW()
) PARTITION BY HASH (transaction_id);
 
-- Good: Nearly unique business identifier
CREATE TABLE product_reviews (
    review_id BIGSERIAL,
    product_sku VARCHAR(50) NOT NULL,
    customer_id INT NOT NULL,
    rating INT CHECK (rating BETWEEN 1 AND 5),
    review_text TEXT,
    created_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (review_id, product_sku)
) PARTITION BY HASH (product_sku);

Optimal Partition Count:

-- Powers of 2 (4, 8, 16, 32) are convenient partition counts:
-- each modulus divides the next, which simplifies splitting partitions later
 
-- Example: 16 partitions for high-volume table
CREATE TABLE clickstream_events (
    event_id BIGSERIAL,
    user_id INT NOT NULL,
    page_url VARCHAR(500),
    event_type VARCHAR(50),
    timestamp TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (event_id, user_id)
) PARTITION BY HASH (user_id);
 
-- Create 16 hash partitions
-- (Use script to generate multiple partitions)
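One way to generate the 16 partitions without writing each statement by hand is a small PL/pgSQL block; the following is an illustrative sketch rather than a required approach:

-- Generate 16 hash partitions for clickstream_events
DO $$
DECLARE
    i INT;
BEGIN
    FOR i IN 0..15 LOOP
        EXECUTE format(
            'CREATE TABLE IF NOT EXISTS clickstream_events_%s PARTITION OF clickstream_events
                 FOR VALUES WITH (MODULUS 16, REMAINDER %s)',
            i, i
        );
    END LOOP;
END $$;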

List Partitioning

List partitioning assigns specific values to each partition. Perfect for categorical data with known, distinct values.

Geographic Example:

-- Partition by geographic regions
CREATE TABLE customer_data (
    customer_id SERIAL,
    customer_name VARCHAR(100) NOT NULL,
    country_code VARCHAR(2) NOT NULL,
    region VARCHAR(50),
    registration_date DATE DEFAULT CURRENT_DATE,
    PRIMARY KEY (customer_id, country_code)
) PARTITION BY LIST (country_code);
 
-- Create geographic partitions
CREATE TABLE customer_data_north_america PARTITION OF customer_data
    FOR VALUES IN ('US', 'CA', 'MX');
 
CREATE TABLE customer_data_europe PARTITION OF customer_data
    FOR VALUES IN ('DE', 'FR', 'GB', 'IT', 'ES', 'NL');
 
CREATE TABLE customer_data_asia_pacific PARTITION OF customer_data
    FOR VALUES IN ('JP', 'CN', 'IN', 'AU', 'SG', 'KR');
 
-- Default partition for unspecified countries
CREATE TABLE customer_data_other PARTITION OF customer_data DEFAULT;

Category-Based Example:

-- Partition by product categories
CREATE TABLE product_catalog (
    product_id SERIAL,
    product_name VARCHAR(200) NOT NULL,
    category VARCHAR(50) NOT NULL,
    price NUMERIC(10,2),
    inventory_count INT,
    PRIMARY KEY (product_id, category)
) PARTITION BY LIST (category);
 
-- Create category partitions
CREATE TABLE products_electronics PARTITION OF product_catalog
    FOR VALUES IN ('electronics', 'computers', 'mobile', 'gaming');
 
CREATE TABLE products_clothing PARTITION OF product_catalog
    FOR VALUES IN ('clothing', 'shoes', 'accessories', 'jewelry');
 
CREATE TABLE products_home PARTITION OF product_catalog
    FOR VALUES IN ('furniture', 'kitchen', 'garden', 'tools');
 
CREATE TABLE products_books PARTITION OF product_catalog
    FOR VALUES IN ('books', 'ebooks', 'audiobooks');

Status-Based Example:

-- Partition by order status
CREATE TABLE order_management (
    order_id BIGSERIAL,
    customer_id INT NOT NULL,
    order_status VARCHAR(20) NOT NULL,
    order_date DATE DEFAULT CURRENT_DATE,
    total_amount NUMERIC(12,2),
    PRIMARY KEY (order_id, order_status)
) PARTITION BY LIST (order_status);
 
-- Active orders partition
CREATE TABLE orders_active PARTITION OF order_management
    FOR VALUES IN ('pending', 'processing', 'confirmed');
 
-- Completed orders partition
CREATE TABLE orders_completed PARTITION OF order_management
    FOR VALUES IN ('shipped', 'delivered', 'completed');
 
-- Problem orders partition
CREATE TABLE orders_issues PARTITION OF order_management
    FOR VALUES IN ('cancelled', 'returned', 'refunded');
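Keep in mind that order_status changes over an order's lifetime. Because it is the partition key, an UPDATE that changes the status physically moves the row to the matching partition (effectively a delete plus insert), which is more expensive than an in-place update. For example:

-- The row moves from orders_active to orders_completed
UPDATE order_management
SET order_status = 'shipped'
WHERE order_id = 1001 AND order_status = 'processing';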

Composite Partitioning

Composite partitioning combines multiple partitioning strategies to create hierarchical data organization.

Range + List (Time + Tenant):

-- Multi-tenant application with time-based sub-partitioning
CREATE TABLE tenant_events (
    event_id BIGSERIAL,
    tenant_id INT NOT NULL,
    event_date DATE NOT NULL,
    event_type VARCHAR(50),
    event_data JSONB,
    PRIMARY KEY (event_id, tenant_id, event_date)
) PARTITION BY LIST (tenant_id);
 
-- Create tenant partitions with date sub-partitioning
CREATE TABLE tenant_1_events PARTITION OF tenant_events
    FOR VALUES IN (1)
    PARTITION BY RANGE (event_date);
 
CREATE TABLE tenant_2_events PARTITION OF tenant_events
    FOR VALUES IN (2)
    PARTITION BY RANGE (event_date);
 
CREATE TABLE tenant_3_events PARTITION OF tenant_events
    FOR VALUES IN (3)
    PARTITION BY RANGE (event_date);
 
-- Create monthly sub-partitions for each tenant
CREATE TABLE tenant_1_events_2024_01 PARTITION OF tenant_1_events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
 
CREATE TABLE tenant_1_events_2024_02 PARTITION OF tenant_1_events
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
 
CREATE TABLE tenant_2_events_2024_01 PARTITION OF tenant_2_events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
 
CREATE TABLE tenant_2_events_2024_02 PARTITION OF tenant_2_events
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
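Queries that filter on both partition keys are pruned at both levels, so only a single leaf partition is scanned (a minimal check against the partitions above):

-- Only tenant_1_events_2024_01 should appear in the plan
EXPLAIN (COSTS OFF)
SELECT COUNT(*)
FROM tenant_events
WHERE tenant_id = 1
  AND event_date >= DATE '2024-01-01'
  AND event_date <  DATE '2024-02-01';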

Range + Hash (Time + Scale):

-- Time-based partitioning with hash sub-partitioning for scale
CREATE TABLE metrics_data (
    metric_id BIGSERIAL,
    measurement_time TIMESTAMP NOT NULL,
    device_id INT NOT NULL,
    metric_type VARCHAR(50),
    metric_value NUMERIC,
    PRIMARY KEY (metric_id, measurement_time, device_id)
) PARTITION BY RANGE (measurement_time);
 
-- Create monthly partitions
CREATE TABLE metrics_2024_01 PARTITION OF metrics_data
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')
    PARTITION BY HASH (device_id);
 
CREATE TABLE metrics_2024_02 PARTITION OF metrics_data
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01')
    PARTITION BY HASH (device_id);
 
-- Create hash sub-partitions for each month
CREATE TABLE metrics_2024_01_hash_0 PARTITION OF metrics_2024_01
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);
 
CREATE TABLE metrics_2024_01_hash_1 PARTITION OF metrics_2024_01
    FOR VALUES WITH (MODULUS 4, REMAINDER 1);
 
CREATE TABLE metrics_2024_01_hash_2 PARTITION OF metrics_2024_01
    FOR VALUES WITH (MODULUS 4, REMAINDER 2);
 
CREATE TABLE metrics_2024_01_hash_3 PARTITION OF metrics_2024_01
    FOR VALUES WITH (MODULUS 4, REMAINDER 3);

List + Hash (Region + Distribution):

-- Regional partitioning with hash distribution
CREATE TABLE global_transactions (
    transaction_id UUID DEFAULT gen_random_uuid(),
    region VARCHAR(20) NOT NULL,
    customer_id INT NOT NULL,
    transaction_time TIMESTAMP DEFAULT NOW(),
    amount NUMERIC(15,2),
    PRIMARY KEY (transaction_id, region, customer_id)
) PARTITION BY LIST (region);
 
-- Create regional partitions with hash sub-partitioning
CREATE TABLE transactions_americas PARTITION OF global_transactions
    FOR VALUES IN ('north_america', 'south_america')
    PARTITION BY HASH (customer_id);
 
CREATE TABLE transactions_emea PARTITION OF global_transactions
    FOR VALUES IN ('europe', 'middle_east', 'africa')
    PARTITION BY HASH (customer_id);
 
CREATE TABLE transactions_apac PARTITION OF global_transactions
    FOR VALUES IN ('asia_pacific')
    PARTITION BY HASH (customer_id);
 
-- Create hash sub-partitions for each region
CREATE TABLE transactions_americas_hash_0 PARTITION OF transactions_americas
    FOR VALUES WITH (MODULUS 8, REMAINDER 0);
 
CREATE TABLE transactions_americas_hash_1 PARTITION OF transactions_americas
    FOR VALUES WITH (MODULUS 8, REMAINDER 1);
-- ... continue for all hash partitions

Partition Management

Adding Partitions:

-- Add new monthly partition
CREATE TABLE sales_2024_04 PARTITION OF sales_data
    FOR VALUES FROM ('2024-04-01') TO ('2024-05-01');
 
-- Adding hash partitions requires changing the modulus for part of the key space
-- and redistributing the affected rows (one approach is sketched below)
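Because every modulus must divide the next larger one, a single MODULUS 8 partition can be replaced by two MODULUS 16 partitions covering the same slice of the key space. A sketch using the user_sessions table from earlier (inserts that hash to the detached slice will fail until the new partitions exist):

-- 1. Detach the partition to split
ALTER TABLE user_sessions DETACH PARTITION user_sessions_0;
 
-- 2. Create two finer-grained partitions covering the same remainder
CREATE TABLE user_sessions_0a PARTITION OF user_sessions
    FOR VALUES WITH (MODULUS 16, REMAINDER 0);
CREATE TABLE user_sessions_0b PARTITION OF user_sessions
    FOR VALUES WITH (MODULUS 16, REMAINDER 8);
 
-- 3. Reload the detached rows; they are routed to the new partitions
INSERT INTO user_sessions SELECT * FROM user_sessions_0;
DROP TABLE user_sessions_0;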

Attaching Existing Tables:

-- Create table with same structure
CREATE TABLE new_partition (
    LIKE sales_data INCLUDING ALL
);
 
-- Populate with data
INSERT INTO new_partition 
SELECT * FROM external_data_source 
WHERE sale_date BETWEEN '2024-04-01' AND '2024-04-30';
 
-- Attach as partition
ALTER TABLE sales_data 
ATTACH PARTITION new_partition 
FOR VALUES FROM ('2024-04-01') TO ('2024-05-01');
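For a large table, the ATTACH step above scans new_partition to validate that every row satisfies the partition bounds. Adding an equivalent CHECK constraint before attaching lets that scan be skipped (a sketch; the constraint can be dropped afterwards):

-- Run before ATTACH PARTITION: a constraint matching the partition bounds
-- (including NOT NULL) allows the validation scan to be skipped
ALTER TABLE new_partition
    ADD CONSTRAINT new_partition_bounds
    CHECK (sale_date IS NOT NULL
           AND sale_date >= DATE '2024-04-01'
           AND sale_date <  DATE '2024-05-01');
 
-- After the attach succeeds, the helper constraint is redundant
ALTER TABLE new_partition DROP CONSTRAINT new_partition_bounds;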

Removing Partitions:

-- Option 1: Detach the partition, keeping its data as a standalone table
ALTER TABLE sales_data 
DETACH PARTITION sales_2023_01;
 
-- Archive the detached table if needed
ALTER TABLE sales_2023_01 
RENAME TO sales_2023_01_archive;
 
-- Option 2: Drop the partition and its data outright (instead of detaching)
DROP TABLE sales_2023_01;

Maintenance Operations:

Splitting a Partition:

-- 1. Detach the partition to split
ALTER TABLE sales_data 
DETACH PARTITION sales_2024_q1;
 
-- 2. Create new monthly partitions
CREATE TABLE sales_2024_01 PARTITION OF sales_data
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
 
CREATE TABLE sales_2024_02 PARTITION OF sales_data
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
 
CREATE TABLE sales_2024_03 PARTITION OF sales_data
    FOR VALUES FROM ('2024-03-01') TO ('2024-04-01');
 
-- 3. Redistribute data from old partition
INSERT INTO sales_data 
SELECT * FROM sales_2024_q1;
 
-- 4. Drop old partition
DROP TABLE sales_2024_q1;

Merging Partitions:

-- 1. Detach partitions to merge
ALTER TABLE sales_data DETACH PARTITION sales_2024_01;
ALTER TABLE sales_data DETACH PARTITION sales_2024_02;
ALTER TABLE sales_data DETACH PARTITION sales_2024_03;
 
-- 2. Create new merged partition
CREATE TABLE sales_2024_q1_merged (
    LIKE sales_data INCLUDING ALL
);
 
-- 3. Combine data
INSERT INTO sales_2024_q1_merged 
SELECT * FROM sales_2024_01
UNION ALL
SELECT * FROM sales_2024_02
UNION ALL
SELECT * FROM sales_2024_03;
 
-- 4. Attach merged partition
ALTER TABLE sales_data 
ATTACH PARTITION sales_2024_q1_merged 
FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');
 
-- 5. Clean up old partitions
DROP TABLE sales_2024_01, sales_2024_02, sales_2024_03;

Performance Optimization

Best Practices:

Choose Partition Keys Wisely

  • Use columns frequently in WHERE clauses
  • Ensure even data distribution
  • Consider composite keys for complex access patterns
  • Include the partition key in primary key constraints
 
-- Good: Frequently queried time-based partition key
CREATE TABLE user_activity (
    user_id INT,
    activity_date DATE NOT NULL,
    activity_type VARCHAR(50),
    details JSONB,
    PRIMARY KEY (user_id, activity_date)  -- Include partition key
) PARTITION BY RANGE (activity_date);
 
-- Good: Composite partition key for multi-dimensional access
CREATE TABLE multi_tenant_logs (
    tenant_id INT NOT NULL,
    log_date DATE NOT NULL,
    log_level VARCHAR(10),
    message TEXT,
    PRIMARY KEY (tenant_id, log_date, log_level)
) PARTITION BY RANGE (tenant_id, log_date);
 
Automate Recurring Maintenance:
 
-- Regular partition maintenance schedule
-- (create_monthly_partition, archive_old_partitions, and partition_usage_report
--  are user-defined helpers shown for illustration, not built-in functions)
-- 1. Create future partitions
SELECT create_monthly_partition('sales_data', CURRENT_DATE + INTERVAL '1 month');
 
-- 2. Archive old partitions
SELECT archive_old_partitions('sales_data', 24);  -- Keep 24 months
 
-- 3. Update table statistics
ANALYZE sales_data;
 
-- 4. Monitor partition usage
SELECT * FROM partition_usage_report();
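These helpers are not built-in functions. A minimal sketch of what a create_monthly_partition helper might look like, assuming monthly range partitions named parent_YYYY_MM:

-- Hypothetical helper: creates the monthly partition covering target_date
CREATE OR REPLACE FUNCTION create_monthly_partition(parent_table TEXT, target_date TIMESTAMP)
RETURNS void AS $$
DECLARE
    start_date DATE := date_trunc('month', target_date)::date;
    end_date   DATE := (start_date + INTERVAL '1 month')::date;
    part_name  TEXT := format('%s_%s', parent_table, to_char(start_date, 'YYYY_MM'));
BEGIN
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS %I PARTITION OF %I FOR VALUES FROM (%L) TO (%L)',
        part_name, parent_table, start_date, end_date
    );
END;
$$ LANGUAGE plpgsql;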
 
Optimize Queries for Partition Pruning:
 
  • Always include partition keys in WHERE clauses when possible
  • Use EXPLAIN to verify that partition pruning is working
  • Design indexes consistently across partitions
  • Monitor query performance and partition sizes regularly (see the query below)
  • Consider partition-wise joins for related partitioned tables
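For ongoing monitoring, pg_partition_tree (available in PostgreSQL 12 and later) gives a quick per-partition size report; a minimal sketch:

-- On-disk size of every leaf partition under sales_data
SELECT relid AS partition,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_partition_tree('sales_data')
WHERE isleaf;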

Limitations to Consider

  • Foreign keys referencing partitioned tables may not be supported, depending on the database version
  • UNIQUE and PRIMARY KEY constraints must include all partition key columns
  • Some query features (for example, certain DISTINCT and aggregate plans) may not benefit from partitioning
  • DDL on the parent table affects all partitions
  • Plan the partition count carefully (too many partitions slows planning and maintenance)

This comprehensive approach to partitioning enables you to effectively manage large datasets while maintaining optimal query performance and operational flexibility.
