Data Query, Update and Delete

This comprehensive guide covers all fundamental data operations in Tacnode. Learn how to query, modify, and manage your data efficiently using SQL commands optimized for Tacnode's distributed architecture.

Quick Reference

OperationPurposeKey Features
SELECTQuery and retrieve dataFiltering, joins, aggregations, CTEs
DELETERemove recordsConditional deletion, multi-table operations
UPDATEModify existing dataSingle/batch updates, computed values
MERGEUpsert operationsINSERT + UPDATE in one command
TRUNCATERemove all recordsFast table clearing
VIEWVirtual tablesReusable query logic

SELECT

The SELECT statement is your primary tool for data retrieval and analysis. It supports everything from simple filtering to complex analytical queries.

Basic Syntax:

SELECT column1, column2, ...
FROM table_name
WHERE condition
ORDER BY column
LIMIT count;

For complete syntax reference, see SELECT documentation.

Working with Sample Data

Throughout this guide, we'll use this sample table:

tacnode=> \d users
                                 Table "public.users"
  Column  |     Type     | Collation |  Nullable  |                 Default
----------+--------------+-----------+------------+--------------------------------------
 id       | integer      |           | not null   | nextval('users_id_seq'::regclass)
 name     | text         |           |            |
 age      | smallint     |           |            |
 gender   | character(1) |           |            |
 salary   | numeric      |           |            |
Indexes:
    "users_pkey" PRIMARY KEY, btree (id ASC)
 
tacnode=> SELECT * FROM users;
 id |    name    | age | gender | salary
----+------------+-----+--------+--------
  1 | Alice      |  25 | f      |  50000
  2 | Bob        |  30 | m      |  60000
  3 | Carol      |  28 | f      |  55000
  4 | David      |  35 | m      |  70000
  5 | Eve        |  22 | f      |  45000
(5 rows)

Basic Queries

Filter by condition:

-- Find all female employees
SELECT * FROM users WHERE gender = 'f';
 
-- Find employees over 25 with salary > 50000
SELECT name, age, salary 
FROM users 
WHERE age > 25 AND salary > 50000;

Sort and limit results:

-- Top 3 highest paid employees
SELECT name, salary 
FROM users 
ORDER BY salary DESC 
LIMIT 3;
 
-- Employees sorted by age, then name
SELECT name, age 
FROM users 
ORDER BY age, name;

Advanced Querying

Aggregations and grouping:

-- Average salary by gender
SELECT gender, AVG(salary) as avg_salary, COUNT(*) as employee_count
FROM users 
GROUP BY gender;
 
-- Salary statistics
SELECT 
    MIN(salary) as min_salary,
    MAX(salary) as max_salary,
    AVG(salary) as avg_salary,
    COUNT(*) as total_employees
FROM users;

Subqueries

Use subqueries to create complex filtering conditions:

-- Find employees earning above average
SELECT name, salary
FROM users
WHERE salary > (SELECT AVG(salary) FROM users);
 
-- Find employees in top salary quartile
SELECT name, salary
FROM users
WHERE salary >= (
    SELECT PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY salary)
    FROM users
);

Common Table Expressions (CTE)

CTEs make complex queries more readable by breaking them into logical components:

-- Calculate salary rankings
WITH salary_ranks AS (
    SELECT 
        name, 
        salary,
        RANK() OVER (ORDER BY salary DESC) as salary_rank
    FROM users
)
SELECT name, salary, salary_rank
FROM salary_ranks
WHERE salary_rank <= 3;
 
-- Multi-step analysis
WITH dept_stats AS (
    SELECT 
        gender,
        AVG(salary) as avg_salary,
        COUNT(*) as employee_count
    FROM users
    GROUP BY gender
),
overall_stats AS (
    SELECT AVG(salary) as company_avg_salary
    FROM users
)
SELECT 
    ds.gender,
    ds.avg_salary,
    ds.employee_count,
    ds.avg_salary - os.company_avg_salary as salary_difference
FROM dept_stats ds, overall_stats os;

DELETE

Remove data from tables with precise control over which records to delete.

Basic Syntax:

DELETE FROM table_name
WHERE condition;

For complete syntax reference, see DELETE documentation.

Simple Deletions

-- Delete a specific user
DELETE FROM users WHERE id = 5;
 
-- Delete users meeting criteria
DELETE FROM users WHERE age < 25;
 
-- Delete with confirmation (using RETURNING)
DELETE FROM users 
WHERE salary < 50000 
RETURNING id, name, salary;

Multi-table Deletions

Use DELETE USING for complex deletions involving multiple tables:

-- Remove inactive users (using a related table)
DELETE FROM users
USING user_activity
WHERE users.id = user_activity.user_id 
  AND user_activity.last_login < '2024-01-01';
 
-- Remove products with zero inventory
DELETE FROM products
USING inventory
WHERE products.id = inventory.product_id 
  AND inventory.quantity = 0;

Batch Deletions

For large datasets, consider batch deletions to avoid long-running transactions:

-- Delete in batches to avoid locks
DELETE FROM large_table 
WHERE created_at < '2023-01-01'
  AND id IN (
      SELECT id FROM large_table 
      WHERE created_at < '2023-01-01' 
      LIMIT 1000
  );

Important: Tacnode uses MVCC (Multi-Version Concurrency Control). Deleted data isn't immediately removed from disk but cleaned up asynchronously. This temporarily increases storage usage until compaction completes.


UPDATE

Modify existing records in your tables efficiently.

Basic Syntax:

UPDATE table_name
SET column1 = value1, column2 = value2
WHERE condition;

For complete syntax reference, see UPDATE documentation.

Simple Updates

-- Update a single record
UPDATE users 
SET salary = 75000 
WHERE id = 4;
 
-- Update multiple columns
UPDATE users 
SET salary = salary * 1.1, age = age + 1
WHERE gender = 'f';
 
-- Conditional updates
UPDATE users 
SET salary = CASE 
    WHEN age > 30 THEN salary * 1.15
    WHEN age > 25 THEN salary * 1.10
    ELSE salary * 1.05
END;

Computed Updates

-- Update based on aggregated data
UPDATE users 
SET salary = (
    SELECT AVG(salary) * 1.2 
    FROM users u2 
    WHERE u2.gender = users.gender
)
WHERE age > 35;
 
-- Update with joins
UPDATE users 
SET salary = users.salary + bonus.amount
FROM employee_bonus bonus
WHERE users.id = bonus.employee_id 
  AND bonus.year = 2024;

Bulk Updates

-- Update from another table
UPDATE users 
SET 
    salary = new_salaries.salary,
    department = new_salaries.dept
FROM (VALUES 
    (1, 80000, 'Engineering'),
    (2, 70000, 'Sales'),
    (3, 75000, 'Marketing')
) AS new_salaries(id, salary, dept)
WHERE users.id = new_salaries.id;

Important: Like DELETE, UPDATE operations use MVCC. Old versions remain on disk temporarily until asynchronous compaction removes them.


MERGE

Combine INSERT and UPDATE operations in a single, atomic command. Perfect for data synchronization and upsert scenarios.

Basic Syntax:

MERGE INTO target_table 
USING source_table ON (condition)
WHEN MATCHED THEN UPDATE SET column = value
WHEN NOT MATCHED THEN INSERT (columns) VALUES (values);

Data Synchronization

-- Sync customer data from external source
MERGE INTO customers c
USING customer_updates cu ON c.id = cu.id
WHEN MATCHED THEN
    UPDATE SET 
        name = cu.name,
        email = cu.email,
        phone = cu.phone,
        updated_at = NOW()
WHEN NOT MATCHED THEN
    INSERT (id, name, email, phone, created_at)
    VALUES (cu.id, cu.name, cu.email, cu.phone, NOW());

ETL Operations

-- Load daily sales data
MERGE INTO daily_sales ds
USING (
    SELECT 
        DATE(sale_timestamp) as sale_date,
        product_id,
        SUM(amount) as total_amount,
        COUNT(*) as transaction_count
    FROM raw_sales 
    WHERE DATE(sale_timestamp) = CURRENT_DATE
    GROUP BY DATE(sale_timestamp), product_id
) rs ON ds.sale_date = rs.sale_date AND ds.product_id = rs.product_id
WHEN MATCHED THEN
    UPDATE SET 
        total_amount = rs.total_amount,
        transaction_count = rs.transaction_count,
        updated_at = NOW()
WHEN NOT MATCHED THEN
    INSERT (sale_date, product_id, total_amount, transaction_count, created_at)
    VALUES (rs.sale_date, rs.product_id, rs.total_amount, rs.transaction_count, NOW());

For complete syntax reference, see MERGE documentation.


TRUNCATE

Quickly remove all data from a table. Much faster than DELETE for removing all records.

Basic Syntax:

TRUNCATE TABLE table_name [RESTART IDENTITY] [CASCADE];

Fast Table Clearing

-- Simple truncate
TRUNCATE TABLE temp_data;
 
-- Reset auto-increment sequences
TRUNCATE TABLE users RESTART IDENTITY;
 
-- Truncate with foreign key dependencies
TRUNCATE TABLE orders CASCADE;
 
-- Truncate multiple tables
TRUNCATE TABLE table1, table2, table3;

When to Use TRUNCATE vs DELETE

OperationUse WhenPerformanceTransaction Log
TRUNCATERemoving all dataVery fastMinimal logging
DELETEConditional removalSlowerFull logging

Key Differences:

  • TRUNCATE resets identity sequences
  • TRUNCATE cannot have a WHERE clause
  • TRUNCATE is faster but less flexible
  • DELETE allows row-by-row processing

For complete syntax reference, see TRUNCATE documentation.


VIEW

Create virtual tables that store queries, not data. Views provide data abstraction, security, and code reusability.

Basic Syntax:

CREATE VIEW view_name AS
SELECT columns
FROM tables
WHERE conditions;

Simple Views

-- Create a view for active employees
CREATE VIEW active_employees AS
SELECT id, name, age, salary
FROM users
WHERE active = true;
 
-- Use the view like a table
SELECT * FROM active_employees WHERE salary > 60000;

Complex Views

-- Employee summary view
CREATE VIEW employee_summary AS
SELECT 
    gender,
    COUNT(*) as employee_count,
    AVG(salary) as avg_salary,
    MIN(salary) as min_salary,
    MAX(salary) as max_salary,
    AVG(age) as avg_age
FROM users
GROUP BY gender;
 
-- Department performance view
CREATE VIEW department_performance AS
SELECT 
    d.name as department_name,
    COUNT(u.id) as employee_count,
    AVG(u.salary) as avg_salary,
    SUM(p.revenue) as total_revenue
FROM departments d
LEFT JOIN users u ON d.id = u.department_id
LEFT JOIN performance p ON u.id = p.employee_id
GROUP BY d.id, d.name;

View Management

-- Update view definition
CREATE OR REPLACE VIEW active_employees AS
SELECT id, name, age, salary, department_id
FROM users
WHERE active = true AND salary > 40000;
 
-- Rename view
ALTER VIEW employee_summary RENAME TO emp_summary;
 
-- Drop view
DROP VIEW IF EXISTS temp_view;
 
-- List all views
SELECT schemaname, viewname, definition
FROM pg_views
WHERE schemaname = 'public';

Security with Views

Views can provide row-level security and column filtering:

-- HR view - sensitive salary information
CREATE VIEW hr_employee_view AS
SELECT id, name, age, salary, department_id
FROM users;
 
-- Manager view - no salary information
CREATE VIEW manager_employee_view AS
SELECT id, name, age, department_id, performance_rating
FROM users u
JOIN performance p ON u.id = p.employee_id;
 
-- Public view - minimal information
CREATE VIEW public_employee_view AS
SELECT id, name, department_id
FROM users
WHERE public_profile = true;

For complete syntax reference, see CREATE VIEW, ALTER VIEW, and DROP VIEW documentation.


Performance Tips

Query Optimization

  1. Use indexes effectively:

    -- Create indexes for frequently queried columns
    CREATE INDEX idx_users_age ON users(age);
    CREATE INDEX idx_users_gender_salary ON users(gender, salary);
  2. Limit result sets:

    -- Use LIMIT for large datasets
    SELECT * FROM large_table ORDER BY created_at DESC LIMIT 100;
  3. Use EXPLAIN to analyze queries:

    EXPLAIN ANALYZE 
    SELECT * FROM users WHERE age > 25 AND salary > 50000;

Transaction Management

-- Use transactions for data consistency
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
 
-- Handle errors gracefully
BEGIN;
    INSERT INTO orders (customer_id, total) VALUES (123, 500);
    UPDATE inventory SET quantity = quantity - 1 WHERE product_id = 456;
EXCEPTION WHEN OTHERS THEN
    ROLLBACK;
    RAISE;
COMMIT;

Best Practices

  1. Always use WHERE clauses with UPDATE and DELETE to avoid accidental data modification
  2. Test queries on small datasets before running on production data
  3. Use transactions for related operations that must succeed or fail together
  4. Monitor query performance with EXPLAIN and adjust indexes as needed
  5. Use CTEs to break complex queries into readable parts
  6. Consider MERGE for upsert operations instead of separate INSERT/UPDATE logic
  7. Use views to encapsulate complex business logic and provide security layers
  8. Batch large operations to avoid long-running transactions and locks

This comprehensive guide provides the foundation for all data operations in Tacnode. Each operation builds upon these fundamentals to support complex data management scenarios in production environments.