Metrics

Nodegroup Metrics

Displays various metrics for a specific Nodegroup over a designated period, including:

Target and Current Size: Shows the Nodegroup's target and current sizes. The current size is the number of units in normal service. When you add a unit or increase capacity, Target Size > Current Size can occur temporarily and then settles to Target Size = Current Size. If Target Size != Current Size persists, the system may be in an abnormal state; contact the administrator.
Utilization: Reflects the utilization rate of Nodegroup resources, factoring in CPU and memory consumption. If this rate consistently exceeds 80%, capacity expansion may be necessary.
QPS: Indicates the number of SQL statements processed by Nodegroup each second, encompassing Select, Update, Insert, Delete, and Copy types.
SQL Latency: P99 is the execution time within which 99% of SQL statements on the Nodegroup complete; P90 is the execution time within which 90% complete. If the P99 or P90 latency remains abnormal for an extended period (for example, several minutes), analyze it together with business and system conditions.
SQL Latency by Type: P99 and P90 latencies are recorded separately for each SQL type (SELECT, INSERT, UPDATE, DELETE, COPY). If these values remain abnormal for an extended period (for example, several minutes), analyze them together with business and system conditions.
Network Throughput: Shows the Nodegroup's network throughput, including the total bytes received and sent.
Connections: Displays the total number of SQL connections on Nodegroup, categorizing them into active and idle connections.
Failed Query Count: Represents the number of SQL statements that failed to execute per second in Nodegroup. A sudden increase in this value necessitates an analysis of business and system conditions.
Affected Rows: Indicates the number of rows impacted by Nodegroup insert (INSERT), update (UPDATE), or delete (DELETE) operations per second. If any exceptions occur or the results diverge from expectations, an analysis in conjunction with business and system conditions is advised.
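
The guidance above on persistent size mismatch and sustained high utilization can be sketched as two small checks. This is a minimal illustration, not part of the product: the function names are hypothetical, the inputs are assumed to be samples taken at regular intervals, and "persistent" is assumed to mean "true for the last `min_samples` consecutive samples".

```python
from typing import Sequence


def size_mismatch_persists(
    target: Sequence[int], current: Sequence[int], min_samples: int = 5
) -> bool:
    """Return True if target != current in each of the last `min_samples` samples.

    A short-lived mismatch (for example, during scaling) returns False.
    """
    if len(target) != len(current):
        raise ValueError("target and current must have the same number of samples")
    if len(target) < min_samples:
        return False  # not enough history to call the mismatch persistent
    recent = zip(target[-min_samples:], current[-min_samples:])
    return all(t != c for t, c in recent)


def utilization_consistently_high(
    utilization: Sequence[float], threshold: float = 0.8, min_samples: int = 5
) -> bool:
    """Return True if utilization exceeded `threshold` in each of the last `min_samples` samples."""
    if len(utilization) < min_samples:
        return False
    return all(u > threshold for u in utilization[-min_samples:])
```

The window length is an assumption to tune against your own sampling interval; the 0.8 default mirrors the 80% guideline stated above.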

Database Metrics

Shows the storage sizes for the Nodegroup and Database dimensions:

Storage size for each Nodegroup: Displays the total storage size for all databases within each Nodegroup.
Storage size for individual databases: Displays the storage size for each separate database.

The database storage size is the total storage space used by all data, including table data, indexes, and transaction logs. Data ingestion, modification, indexing, transaction processing, schema changes, replication, and snapshots can all influence the overall storage size.
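
Storage metrics with keys ending in `_bytes` are raw byte counts, so a small conversion helper can make them easier to read. The sketch below is an illustration only: `format_bytes` is a hypothetical helper, and it assumes binary units (1 KiB = 1024 bytes).

```python
def format_bytes(n: float) -> str:
    """Format a byte count using binary units (KiB, MiB, GiB, ...)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(n)
    for unit in units[:-1]:
        if abs(value) < 1024:
            return f"{value:.2f} {unit}"
        value /= 1024
    # Anything larger is expressed in the largest unit.
    return f"{value:.2f} {units[-1]}"


print(format_bytes(1_073_741_824))  # prints "1.00 GiB"
```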

Monitoring Metrics

Metric Key	Metric Name	Type	Sample Value	Label	Description
Nodegroup Computation Metrics
nodegroup_expect_units	Expected Units	gauge	5	_cloudProvider _region _datacloudId _id _name	Target/Current Units: Displays the target and current unit count of Nodegroup. The current unit count reflects the number of units in normal service. During cluster creation or scaling, target units > current units may occur temporarily before equilibrium. If target units != current units persists, it may indicate an abnormal state—contact support if this occurs.
nodegroup_running_units	Running Units	gauge	5
nodegroup_resource_percent_normalized	Utilization	gauge	0.8		Resource Utilization: Indicates overall resource utilization of Nodegroup, incorporating both CPU and memory usage. If persistently above 80%, consider scaling clusters.
nodegroup_select_qps	Select QPS	gauge	1000		QPS: Number of SQL statements handled per second by Nodegroup, including Select, Update, Insert, Delete, and Copy queries.
nodegroup_update_qps	Update QPS	gauge	1000
nodegroup_insert_qps	Insert QPS	gauge	1000
nodegroup_delete_qps	Delete QPS	gauge	1000
nodegroup_copy_qps	Copy QPS	gauge	1
nodegroup_failure_qps	Failed Query QPS	gauge	1		Failed Queries: Number of failed SQL statements executed per second by Nodegroup. Investigate if this value surges in conjunction with business/system status.
nodegroup_insert_affected_rows	Rows Affected by Insert	gauge	10000		Rows Affected: Shows the number of rows impacted per second by INSERT, UPDATE, or DELETE operations executed by Nodegroup. If anomalies or unexpected results occur, further analysis is required in conjunction with application and system status.
nodegroup_update_affected_rows	Rows Affected by Update	gauge	10000
nodegroup_delete_affected_rows	Rows Affected by Delete	gauge	10000
nodegroup_copy_affected_rows	Rows Affected by Copy	gauge	10000
nodegroup_sql_select_p90_latency	Select Latency (P90)	gauge	38818282.52 (ns)		SQL Latency by Type: Collects P99 and P90 latency metrics for each type of SQL operation (SELECT, INSERT, UPDATE, DELETE, COPY) in Nodegroup. Extended abnormal values (lasting several minutes or more) should be troubleshot relative to business processes and system conditions.
nodegroup_sql_select_p99_latency	Select Latency (P99)	gauge	38818282.52 (ns)
nodegroup_sql_insert_p90_latency	Insert Latency (P90)	gauge	38818282.52 (ns)
nodegroup_sql_insert_p99_latency	Insert Latency (P99)	gauge	38818282.52 (ns)
nodegroup_sql_update_p90_latency	Update Latency (P90)	gauge	38818282.52 (ns)
nodegroup_sql_update_p99_latency	Update Latency (P99)	gauge	38818282.52 (ns)
nodegroup_sql_delete_p90_latency	Delete Latency (P90)	gauge	38818282.52 (ns)
nodegroup_sql_delete_p99_latency	Delete Latency (P99)	gauge	38818282.52 (ns)
nodegroup_sql_copy_p90_latency	Copy Latency (P90)	gauge	38818282.52 (ns)
nodegroup_sql_copy_p99_latency	Copy Latency (P99)	gauge	38818282.52 (ns)
nodegroup_sql_service_p90_latency	SQL Latency (P90)	gauge	38818282.52 (ns)		SQL Latency: P99: 99th percentile of query execution duration measured in Nodegroup. P90: 90th percentile of query execution duration measured in Nodegroup. If P99 or P90 latency metrics remain abnormal for several minutes, business processes and system status must be referenced in diagnosis.
nodegroup_sql_service_p99_latency	SQL Latency (P99)	gauge	38818282.52 (ns)
nodegroup_network_receive_bytes	Network Throughput (Receive)	gauge	92,468.533333 (bytes)		Network Throughput: Displays Nodegroup network throughput, including bytes received and sent.
nodegroup_network_send_bytes	Network Throughput (Send)	gauge	92,468.533333 (bytes)
nodegroup_active_sql_connections	Active SQL Connections	gauge	100		Connections: Shows SQL connections on Nodegroup, including active and idle connections.
nodegroup_idle_sql_connections	Idle SQL Connections	gauge	20
Database Storage Metrics
nodegroup_size_bytes	Database Storage Size	gauge	1,073,741,824 (1 GiB)	_cloudProvider _region _datacloudId _handle	Per-database storage size: Displays the storage size for each database. Includes all underlying physical storage used: table data, indexes, and WAL. Storage size is affected by insertions, updates, index rebuilds, transactions, schema changes, replication, and snapshots.
Backup Metrics
backup_size_bytes	Backup Storage Size	gauge	1,073,741,824 (1 GiB)	_cloudProvider _region _datacloudId _handle	Storage size per backup
Data Sync Metrics
datasync_source_idle_time		gauge	todo	_cloudProvider _region _datacloudId _jobId _jobName	Source idle time (seconds): Current system time - last record event time. Increases when there is no incoming data.
datasync_emit_event_time		gauge	todo		Delay for the most recently received data (seconds): Last system receipt timestamp - last event's business time. Will not increment when there's no data at the source.
datasync_source_heartbeat_time		gauge	todo		Source heartbeat time (seconds): Metric generation time - most recent attempt to read source. Growth indicates downstream backpressure.
datasync_rps		gauge	todo		Records per second
datasync_bps		gauge	todo		Bytes per second
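
The sketch below shows how a few of the values in the table might be post-processed once collected. It assumes each scrape is available as a plain mapping from metric key to value; the helper names are hypothetical, and summing the per-type QPS gauges into a total is an assumption rather than a documented formula. The two data sync functions restate the subtractions given in the descriptions above, with all timestamps assumed to be in seconds.

```python
from typing import Mapping

# Per-type QPS gauges listed in the table.
QPS_KEYS = (
    "nodegroup_select_qps",
    "nodegroup_update_qps",
    "nodegroup_insert_qps",
    "nodegroup_delete_qps",
    "nodegroup_copy_qps",
)


def total_qps(sample: Mapping[str, float]) -> float:
    """Sum the per-type QPS gauges (assumption: total QPS is their sum)."""
    return sum(sample.get(key, 0.0) for key in QPS_KEYS)


def ns_to_ms(nanoseconds: float) -> float:
    """Convert a latency gauge from nanoseconds to milliseconds."""
    return nanoseconds / 1_000_000


def source_idle_seconds(now: float, last_record_event_time: float) -> float:
    """Source idle time: current system time minus last record event time."""
    return now - last_record_event_time


def emit_delay_seconds(last_receipt_time: float, last_event_business_time: float) -> float:
    """Delay of the most recent record: receipt timestamp minus the event's business time."""
    return last_receipt_time - last_event_business_time


# Sample values taken from the table above.
sample = {
    "nodegroup_select_qps": 1000,
    "nodegroup_update_qps": 1000,
    "nodegroup_insert_qps": 1000,
    "nodegroup_delete_qps": 1000,
    "nodegroup_copy_qps": 1,
}
print(total_qps(sample))  # prints 4001
```

For the latency sample value in the table, `ns_to_ms(38818282.52)` is roughly 38.8 ms, which is usually an easier scale for dashboards and alert thresholds.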
