#query-lifecycle
What stages does a query go through from text to result?
What to say
Four stages. Parse: the text becomes a tree, and the syntax and object names are checked against the system catalog. Rewrite: rules and views are applied. A reference to a view, for instance, is expanded into a subquery, and row security (RLS) is layered on. Plan: the cost-based optimizer tries access methods and join orders and picks the cheapest plan. Execute: the tree of plan nodes runs in a "pull a row from the top" model (volcano), each node asking its children for rows one at a time. Splitting into stages is why a prepared statement can be planned once and executed many times.
What they want to hear
A senior should: - name the four stages in order and what each one does - give an example of rewriting: expanding a view into a subquery, RLS - explain that planning is separate from execution, and tie that to the plan cache of a prepared statement - understand the optimizer is cost-based: it compares estimates rather than running the variants
Pitfalls
- ✗ Thinking a view is a "stored result". In reality it is expanded into the query text at the rewrite stage
- ✗ Confusing planning with execution. Planning time and execution time are different numbers in EXPLAIN
- ✗ Thinking the optimizer tries plans for real. It estimates their cost by formulas
Follow-up
- ? What happens to a `SELECT` from a view at the rewrite stage?
- ? Why are planning time and execution time separated in EXPLAIN output?
- ? How are the stages tied to a prepared statement's plan cache?
Depth in knowledge base