tacnode

Query Tuning

Database diagnosis and tuning involve identifying, analyzing, and improving the database system to enhance its performance, stability, and maintainability. This process typically includes monitoring database operations, identifying the root causes of performance issues, and implementing measures to improve performance, such as developing an appropriate index structure and gathering essential statistical data.

When a user's query encounters unexpected performance issues—especially if it doesn’t complete within the expected time—it is referred to as a slow SQL. To address slow SQL, follow these steps:

  • Check the system load: A primary factor affecting query performance can be system overload. In such cases, new SQL queries can further strain the system, significantly reducing overall performance. You can check the current system load on the management platform or use the pg_stat_activity view to assess the status of queries currently executing in the system.
  • Examine the execution plan of slow SQL: Once you’ve identified the slow SQL, use the EXPLAIN command to observe the actual execution plan. This will help determine if the query is selecting the expected index, if it has adequate statistical information, and if the right execution method is being chosen.
  • Review the execution statistics of slow SQL: While analyzing the SQL execution plan, correlate it with real SQL operational data to identify the exact node where the slow SQL bottleneck occurs, enabling you to take precise corrective actions on the execution plan.

pg_stat_activity

pg_stat_activity is a system view in the Tacnode database that provides insights into active sessions and queries. It displays details such as the executing query, connection status, duration of the connection, and more. The specific field information is as follows:

Field NameField TypeDescription
datidoidOID of the database this backend is connected to
datnamenameName of the database this backend is connected to
pidintegerProcess ID of this backend
leader_pidintegerProcess ID of the parallel group leader, if this process is a parallel query worker. NULL if this process is a parallel group leader or does not participate in parallel query.
usesysidoidOID of the user logged into this backend
usenamenameName of the user logged into this backend
application_nametextName of the application that is connected to this backend
client_addrinetIP address of the client connected to this backend. If this field is null, it indicates either that the client is connected via a Unix socket on the server machine or that this is an internal process such as autovacuum.
client_hostnametextHost name of the connected client, as reported by a reverse DNS lookup of client_addr.
client_portintegerTCP port number that the client is using for communication with this backend, or -1 if a Unix socket is used. If this field is null, it indicates that this is an internal server process.
backend_starttimestamp with time zoneTime when this process was started. For client backends, this is the time the client connected to the server.
xact_starttimestamp with time zoneTime when this process' current transaction was started, or null if no transaction is active. If the current query is the first of its transaction, this column is equal to the query_start column.
query_starttimestamp with time zoneTime when the currently active query was started, or if state is not active, when the last query was started
state_changetimestamp with time zoneTime when the state was last changed
wait_event_typetextThe type of event for which the backend is waiting, if any; otherwise NULL.
wait_eventtextWait event name if backend is currently waiting, otherwise NULL.
statetextCurrent overall state of this backend. Possible values are:
active: The backend is executing a query.
idle: The backend is waiting for a new client command.
idle in transaction: The backend is in a transaction, but is not currently executing a query.
idle in transaction (aborted): This state is similar to idle in transaction, except one of the statements in the transaction caused an error.
fastpath function call: The backend is executing a fast-path function.
backend_xidbigintTop-level transaction identifier of this backend, if any.
backend_xminbigintThe current backend's xmin horizon.
query_idbigintIdentifier of this backend's most recent query. If state is active this field shows the identifier of the currently executing query. In all other states, it shows the identifier of last query that was executed.
querytextText of this backend's most recent query. If state is active this field shows the currently executing query. In all other states, it shows the last query that was executed. By default the query text is truncated at 1024 bytes.
backend_typetextType of current backend. Possible types are autovacuum launcher, autovacuum worker, logical replication launcher, logical replication worker, parallel worker, background writer, client backend, checkpointer, archiver, startup, walreceiver, walsender and walwriter.

Users can access information from the pg_stat_activity view using SQL queries. For instance, the view enables the implementation of the following functions:

  1. Check active queries
SELECT * FROM pg_stat_activity WHERE state = 'active';
  1. Check active sessions for a particular user
SELECT * FROM pg_stat_activity WHERE usename = 'your_username';
  1. Identify queries exceeding ten seconds in execution time
SELECT * FROM pg_stat_activity WHERE now() - query_start > interval '10s';

When utilizing pg_stat_activity, it is important to keep the following precautions in mind:

  • Permissions: Accessing pg_stat_activity views typically requires users to have the necessary permissions. Ensure you run queries with a user who has adequate privileges.
  • Information Interpretation: Understanding the output of pg_stat_activity necessitates some familiarity with Tacnode's internal mechanics, particularly regarding connection status, query status, and so on.
  • Performance Impact: While querying pg_stat_activity minimizes database performance, executing numerous queries on a heavily used database may slightly elevate system overhead. Therefore, it should be used cautiously in production environments.

EXPLAIN

In the Tacnode system, EXPLAIN statements provide insights into query plans. These plans, created by the optimizer, detail the execution process of a query. By examining the execution plan, you can gain insight into how Tacnode determines the execution path, including the indexes utilized and the methods applied for JOIN operations, among other details.

Here are the fundamental usages of EXPLAIN statements:

EXPLAIN
SELECT l_returnflag,
       l_linestatus,
       SUM(l_quantity)                                       AS sum_qty,
       SUM(l_extendedprice)                                  AS sum_base_price,
       SUM(l_extendedprice * (1 - l_discount))               AS sum_disc_price,
       SUM(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
       AVG(l_quantity)                                       AS avg_qty,
       AVG(l_extendedprice)                                  AS avg_price,
       AVG(l_discount)                                       AS avg_disc,
       COUNT(*)                                              AS count_order
FROM lineitem
WHERE l_shipdate <= date '1998-12-01' - INTERVAL ':1' DAY
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus;
    
QUERY PLAN
Response/dop=1
  ->  Noop/dop=1  (cost=0.00..1088857.22 rows=6 width=0)
        ->  Sort/dop=1  (cost=0.00..1088857.12 rows=6 width=0)
"              Sort Key: l_returnflag asc,l_linestatus asc"
              ->  Finalize Aggregate/dop=1  (cost=0.00..1088856.66 rows=6 width=0)
"                    Group Key: l_returnflag, l_linestatus"
                    ->  Gather Channels:6 Method:Unordered
                          ->  Partial Aggregate/dop=6  (cost=0.00..1088854.00 rows=15 width=0)
"                                Group Key: l_returnflag, l_linestatus"
                                ->  Noop/dop=6  (cost=0.00..368708.03 rows=6001215 width=0)
                                      ->  Seq Scan on lineitem/dop=6  (cost=0.00..188671.57 rows=6001215 width=0)
                                            Filter: l_shipdate <= '1998-11-30'

To tune slow queries, use the EXPLAIN statement to examine the current execution plan. Next, assess whether this plan is optimal given the specific data distribution. Consider if the correct index is chosen, the best join algorithm (hash join, merge join, or lookup join) is applied, and the proper aggregation algorithm (hash or merge aggregation) is used.

On this page