#pg-stat-views
Which system views do you start a live database's diagnostics from?
What to say
First, `pg_stat_activity`: who is connected now, what they run, for how long, in what state (active, idle, idle in transaction), and what they wait on (`wait_event`). It immediately shows stuck transactions and locks. `pg_stat_statements` (an extension) aggregates over normalized queries: total time, call count, average, buffer reads; the main tool for "which queries eat the server". `pg_locks` shows who blocks whom. `pg_stat_user_tables` and `pg_stat_user_indexes` show scans, dead versions, the last autovacuum, index usage. `pg_stat_io` (PG 16) gives a picture of reads and writes by type. This is the set any incident analysis starts from.
What they want to hear
A senior should: - name `pg_stat_activity` as the starting point and what to look at in it (state, wait_event, duration) - know `pg_stat_statements` as the way to find heavy queries by aggregates - list `pg_locks`, `pg_stat_user_tables/indexes`, `pg_stat_io` and what each answers - move from symptom to view: generally slow means statements, one thing hangs means activity plus locks
Pitfalls
- ✗ Starting diagnostics by guessing from the logs instead of `pg_stat_activity` and `pg_stat_statements`
- ✗ Not enabling `pg_stat_statements` ahead of time and being left without history at the moment of an incident
- ✗ Looking only at the average query time and missing a rare but catastrophically expensive one
Follow-up
- ? How do you find a transaction stuck in idle in transaction through `pg_stat_activity`?
- ? What does `pg_stat_statements` give that the slow query log does not?
- ? Where do you look to tell whether an index is used?
Depth in knowledge base