When an agent encounters this scenario, Schema provides these diagnostic steps automatically.
When investigating slow PostgreSQL queries, start by identifying the top offenders based on total execution time (not just per-query duration), then check for stale statistics which are the most common and easiest fix. Next, analyze execution plans to understand what's actually happening, look for missing indexes based on sequential scan patterns, and finally investigate table bloat and memory pressure if the previous steps don't resolve the issue.
1Identify queries with highest total execution time
Query pg_stat_statements ordered by `stat_statements_seconds` (total_exec_time) DESC, not just mean execution time. A query running 50ms but called 10,000 times per hour (139 hours total) is far worse than a 5-second query called once. Look at the `stat_statements_calls` count alongside duration — high call frequency with moderate latency usually offers the best optimization ROI. Focus on the top 5-10 queries by total time for the rest of your investigation.
2Check for stale statistics on affected tables
Check `last-analyze` timestamps and `n_mod_since_analyze` counts in pg_stat_user_tables for the tables used by your slow queries. If `n_mod_since_analyze` shows thousands or millions of rows modified since the last ANALYZE, or `last-analyze` is days/weeks old, your query planner is making decisions on outdated statistics, which can cause exponentially worse query plans. Run ANALYZE on those tables immediately and retest — this is the most common culprit and takes seconds to fix.
3Analyze execution plans with EXPLAIN ANALYZE
Run EXPLAIN (ANALYZE, BUFFERS, VERBOSE) on your top offending queries to see the actual execution plan and I/O patterns. Look for sequential scans on large tables, nested loops with high row counts, and hash joins that spill to disk. Compare `postgresql.blk_read_time` — if block read time is high, you're I/O bound. Pay attention to the planner's row count estimates versus actual rows — large discrepancies confirm stale statistics or missing WHERE clause selectivity.
4Look for missing indexes via sequential scan patterns
Compare `postgresql.sequential_scans` to `postgresql.index.scans` at the table level, and check `postgresql.seq_rows_read` versus `postgresql.index.rows_fetched`. If you see high sequential scan counts on tables with >100K rows, especially in your top slow queries, you likely need indexes. Cross-reference with EXPLAIN output — look for 'Seq Scan' nodes where you expected index usage. Create indexes on columns used in WHERE clauses, JOIN conditions, and ORDER BY for those queries.
5Check table bloat for unavoidable sequential scans
If sequential scans are legitimate (small tables or queries that need full scans), check `n_dead_tup` to identify table bloat. Bloat increases I/O cost linearly since PostgreSQL reads dead tuples during scans even though they're not returned. If `n_dead_tup` is >20% of live tuples and `postgresql.blk_read_time` is elevated, run VACUUM on those tables or tune autovacuum to run more aggressively. This is especially important after bulk operations or heavy UPDATE/DELETE workloads.
6Check for memory pressure via temp file usage
Look at `postgresql.temp_bytes` to identify queries spilling sorts, hashes, or aggregations to disk due to insufficient work_mem. If temp file usage is high (megabytes to gigabytes per query), your queries are memory-starved. Either increase work_mem (carefully — it's per-operation, not per-connection) or rewrite queries to reduce sort/hash operations. This is common with large JOINs, GROUP BY, or ORDER BY on unindexed columns.