April 14, 2026
How SQL Really Executes a Query
SQL queries are not evaluated in the order we write them. This post explains the logical execution order, why aliases fail in WHERE, why HAVING exists, and how to debug filtering and grouping mistakes step by step.
SQL does not execute clauses in the order we type them.
Instead, the engine builds the result in stages. Once we see that sequence, several confusing rules stop feeling arbitrary:
- why aliases are not available in
WHERE - why aggregates cannot appear in
WHERE - why
HAVINGexists - why
WHEREandHAVINGchange results differently - why
SELECTis evaluated late
Understanding this is one of the fastest ways to debug filtering and grouping mistakes.
Let's use one mental model: SQL is an assembly line.
Each clause receives input from the previous stage. A clause cannot use data that has not been produced yet.
Here is the logical order:
- FROM - load base table(s)
- JOIN - match related rows inside the FROM phase
- WHERE - filter rows before grouping
- GROUP BY - form groups
- HAVING - filter groups after aggregation
- SELECT - shape the output
- DISTINCT - remove duplicates
- ORDER BY - sort rows
- LIMIT - keep the top N
Three consequences matter immediately:
WHEREcannot useSELECTaliases because aliases are created at stage 6WHEREcannot use aggregates because grouping happens at stage 4HAVINGcan use aggregates because it runs after grouping
Example:
Let's take one complete query and walk through it in logical order.
SELECT
customers.name AS customer_name,
COUNT(*) AS order_count
FROM customers
JOIN orders ON customers.id = orders.customer_id
WHERE orders.status = 'shipped'
GROUP BY customers.name
HAVING COUNT(*) > 5
ORDER BY order_count DESC
LIMIT 10;
This query answers a simple question: which customers have more than 5 shipped orders?
We will read it in execution order, not in writing order.
Stage 1 and 2: FROM and JOIN
For teaching, we split this into two steps. In actual logical processing, the join is resolved inside the FROM phase.
The engine starts here:
FROM customersloads the base tableJOIN orders ON ...combines matching rows
By the end of the FROM phase, including join matching, SQL has a working table. Nothing has been filtered or grouped yet.
Stage 3: WHERE
WHERE filters individual rows before any grouping happens.
In our example, WHERE orders.status = 'shipped' keeps only shipped orders.
That is why it cannot reference:
- aliases from
SELECT - aggregates such as
COUNT,SUM, orAVG
-- Fails: alias does not exist yet
SELECT customer_id, COUNT(*) as order_count
FROM orders
WHERE order_count > 5
GROUP BY customer_id;
-- Fails: aggregate not available yet
SELECT customer_id, COUNT(*) as total
FROM orders
WHERE COUNT(*) > 5
GROUP BY customer_id;
Stage 4: GROUP BY
GROUP BY collapses rows into groups. After that, each group becomes one output row.
In our example, GROUP BY customers.name creates one group per customer.
From this point on, the SELECT list can contain only:
- grouped columns
- aggregate expressions
-- Valid
SELECT customers.name, COUNT(*) AS order_count
FROM customers
JOIN orders ON customers.id = orders.customer_id
GROUP BY customers.name;
-- Invalid: order_date is neither grouped nor aggregated
SELECT customers.name, orders.order_date, COUNT(*)
FROM customers
JOIN orders ON customers.id = orders.customer_id
GROUP BY customers.name;
Stage 5: HAVING
HAVING filters groups after aggregation. That is why aggregate conditions belong here.
In our example, HAVING COUNT(*) > 5 keeps only customers whose shipped-order count is greater than 5.
-- Correct
SELECT customers.name, COUNT(*) AS order_count
FROM customers
JOIN orders ON customers.id = orders.customer_id
GROUP BY customers.name
HAVING COUNT(*) > 5;
-- Wrong
SELECT customers.name, COUNT(*) AS order_count
FROM customers
JOIN orders ON customers.id = orders.customer_id
WHERE COUNT(*) > 5
GROUP BY customers.name;
Keep the split simple:
WHEREfilters rowsHAVINGfilters groups
Stage 6: SELECT
This is where SQL shapes the final output and creates aliases such as order_count.
So if WHERE runs first and SELECT runs later, WHERE cannot see order_count.
ORDER BY can see that alias because it runs later. In practice, all major SQL engines support this.
SELECT customers.name AS customer_name, COUNT(*) AS order_count
FROM customers
JOIN orders ON customers.id = orders.customer_id
GROUP BY customers.name
ORDER BY order_count DESC;
Stages 7 to 9: DISTINCT, ORDER BY, LIMIT
These final stages are straightforward.
In our example:
DISTINCTremoves duplicates from the selected resultORDER BYsorts the resultLIMITkeeps only the requested number of rows
We do not use DISTINCT here because GROUP BY customers.name already produces one row per customer.
SELECT customers.name AS customer_name, COUNT(*) AS order_count
FROM customers
JOIN orders ON customers.id = orders.customer_id
GROUP BY customers.name
ORDER BY order_count DESC
LIMIT 10;
Let's pressure-test this mental model with three common mistakes.
Mistake 1: Using a SELECT Alias in WHERE
-- Fails
SELECT
product_name,
price * quantity as revenue
FROM sales
WHERE revenue > 1000;
-- Correct
SELECT
product_name,
price * quantity as revenue
FROM sales
WHERE price * quantity > 1000;
WHERE runs before SELECT, so the alias does not exist yet.
Mistake 2: Using an Aggregate in WHERE
-- Fails
SELECT customer_id, COUNT(*) as order_count
FROM orders
WHERE COUNT(*) > 5
GROUP BY customer_id;
-- Correct
SELECT customer_id, COUNT(*) as order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5;
COUNT(*) becomes available only after grouping, so the filter belongs in HAVING.
Mistake 3: Breaking a LEFT JOIN with WHERE
-- Wrong: this behaves like an INNER JOIN
SELECT
customers.name,
orders.total
FROM customers
LEFT JOIN orders ON customers.id = orders.customer_id
WHERE orders.status = 'shipped';
-- Correct: keep the filter in the join condition
SELECT
customers.name,
orders.total
FROM customers
LEFT JOIN orders
ON customers.id = orders.customer_id
AND orders.status = 'shipped';
LEFT JOIN preserves unmatched rows first. Then WHERE runs and removes rows where orders.status is NULL. That is how a LEFT JOIN quietly turns into an INNER JOIN.
If we want to preserve unmatched rows, the filter belongs in ON.
Now let's turn the idea into practice.
We have three tables:
customers(id,name,region)orders(id,customer_id,order_date,status)order_items(order_id,product_id,quantity,price)
Write a query that returns:
- customers from the
Northregion - more than 3 completed orders
- total revenue above
$5000 - customer name, order count, and total revenue
- the top 10 rows by total revenue
Before writing the query, answer these four questions:
- Which filters belong in
WHERE? - Which filter belongs in
HAVING? - Can
total_revenuebe reused as an alias inWHEREorHAVING? - Why must
customer_nameappear inGROUP BYif it appears inSELECT?
Bonus challenge: instead of the ranked filter query above, write a second query that returns all North-region customers, including those with zero completed orders, with order_count and total_revenue shown as 0 when no matching order exists. That change depends on join type, null handling, and filter placement.
Takeaway:
Keep this pipeline in mind every time a query behaves strangely:
- FROM -> load base table(s)
- JOIN -> match related rows inside the FROM phase
- WHERE -> filter rows
- GROUP BY -> form groups
- HAVING -> filter groups
- SELECT -> shape the output
- DISTINCT -> remove duplicates
- ORDER BY -> sort rows
- LIMIT -> keep the top N
That is the core idea. SQL is declarative, but its logic is not random. Once we understand the execution order, clause placement becomes much easier to reason about. Next time a query result, metric, or dashboard looks wrong, check the order of the query before blaming the data.