mteq.pro

April 14, 2026

How SQL Really Executes a Query

SQL queries are not evaluated in the order we write them. This post explains the logical execution order, why aliases fail in WHERE, why HAVING exists, and how to debug filtering and grouping mistakes step by step.

SQL does not execute clauses in the order we type them.

Instead, the engine builds the result in stages. Once we see that sequence, several confusing rules stop feeling arbitrary:

  • why aliases are not available in WHERE
  • why aggregates cannot appear in WHERE
  • why HAVING exists
  • why WHERE and HAVING change results differently
  • why SELECT is evaluated late

Understanding this is one of the fastest ways to debug filtering and grouping mistakes.

Let's use one mental model: SQL is an assembly line.

Each clause receives input from the previous stage. A clause cannot use data that has not been produced yet.

Here is the logical order:

  1. FROM - load base table(s)
  2. JOIN - match related rows inside the FROM phase
  3. WHERE - filter rows before grouping
  4. GROUP BY - form groups
  5. HAVING - filter groups after aggregation
  6. SELECT - shape the output
  7. DISTINCT - remove duplicates
  8. ORDER BY - sort rows
  9. LIMIT - keep the top N

Three consequences matter immediately:

  • WHERE cannot use SELECT aliases because aliases are created at stage 6
  • WHERE cannot use aggregates because grouping happens at stage 4
  • HAVING can use aggregates because it runs after grouping

Example:

Let's take one complete query and walk through it in logical order.

SELECT
    customers.name AS customer_name,
    COUNT(*) AS order_count
FROM customers
JOIN orders ON customers.id = orders.customer_id
WHERE orders.status = 'shipped'
GROUP BY customers.name
HAVING COUNT(*) > 5
ORDER BY order_count DESC
LIMIT 10;

This query answers a simple question: which customers have more than 5 shipped orders?

We will read it in execution order, not in writing order.

Stage 1 and 2: FROM and JOIN

For teaching, we split this into two steps. In actual logical processing, the join is resolved inside the FROM phase.

The engine starts here:

  1. FROM customers loads the base table
  2. JOIN orders ON ... combines matching rows

By the end of the FROM phase, including join matching, SQL has a working table. Nothing has been filtered or grouped yet.

Stage 3: WHERE

WHERE filters individual rows before any grouping happens.

In our example, WHERE orders.status = 'shipped' keeps only shipped orders.

That is why it cannot reference:

  • aliases from SELECT
  • aggregates such as COUNT, SUM, or AVG
-- Fails: alias does not exist yet
SELECT customer_id, COUNT(*) as order_count
FROM orders
WHERE order_count > 5
GROUP BY customer_id;

-- Fails: aggregate not available yet
SELECT customer_id, COUNT(*) as total
FROM orders
WHERE COUNT(*) > 5
GROUP BY customer_id;

Stage 4: GROUP BY

GROUP BY collapses rows into groups. After that, each group becomes one output row.

In our example, GROUP BY customers.name creates one group per customer.

From this point on, the SELECT list can contain only:

  • grouped columns
  • aggregate expressions
-- Valid
SELECT customers.name, COUNT(*) AS order_count
FROM customers
JOIN orders ON customers.id = orders.customer_id
GROUP BY customers.name;

-- Invalid: order_date is neither grouped nor aggregated
SELECT customers.name, orders.order_date, COUNT(*)
FROM customers
JOIN orders ON customers.id = orders.customer_id
GROUP BY customers.name;

Stage 5: HAVING

HAVING filters groups after aggregation. That is why aggregate conditions belong here.

In our example, HAVING COUNT(*) > 5 keeps only customers whose shipped-order count is greater than 5.

-- Correct
SELECT customers.name, COUNT(*) AS order_count
FROM customers
JOIN orders ON customers.id = orders.customer_id
GROUP BY customers.name
HAVING COUNT(*) > 5;

-- Wrong
SELECT customers.name, COUNT(*) AS order_count
FROM customers
JOIN orders ON customers.id = orders.customer_id
WHERE COUNT(*) > 5
GROUP BY customers.name;

Keep the split simple:

  • WHERE filters rows
  • HAVING filters groups

Stage 6: SELECT

This is where SQL shapes the final output and creates aliases such as order_count.

So if WHERE runs first and SELECT runs later, WHERE cannot see order_count.

ORDER BY can see that alias because it runs later. In practice, all major SQL engines support this.

SELECT customers.name AS customer_name, COUNT(*) AS order_count
FROM customers
JOIN orders ON customers.id = orders.customer_id
GROUP BY customers.name
ORDER BY order_count DESC;

Stages 7 to 9: DISTINCT, ORDER BY, LIMIT

These final stages are straightforward.

In our example:

  • DISTINCT removes duplicates from the selected result
  • ORDER BY sorts the result
  • LIMIT keeps only the requested number of rows

We do not use DISTINCT here because GROUP BY customers.name already produces one row per customer.

SELECT customers.name AS customer_name, COUNT(*) AS order_count
FROM customers
JOIN orders ON customers.id = orders.customer_id
GROUP BY customers.name
ORDER BY order_count DESC
LIMIT 10;

Let's pressure-test this mental model with three common mistakes.

Mistake 1: Using a SELECT Alias in WHERE

-- Fails
SELECT
    product_name,
    price * quantity as revenue
FROM sales
WHERE revenue > 1000;

-- Correct
SELECT
    product_name,
    price * quantity as revenue
FROM sales
WHERE price * quantity > 1000;

WHERE runs before SELECT, so the alias does not exist yet.

Mistake 2: Using an Aggregate in WHERE

-- Fails
SELECT customer_id, COUNT(*) as order_count
FROM orders
WHERE COUNT(*) > 5
GROUP BY customer_id;

-- Correct
SELECT customer_id, COUNT(*) as order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5;

COUNT(*) becomes available only after grouping, so the filter belongs in HAVING.

Mistake 3: Breaking a LEFT JOIN with WHERE

-- Wrong: this behaves like an INNER JOIN
SELECT
    customers.name,
    orders.total
FROM customers
LEFT JOIN orders ON customers.id = orders.customer_id
WHERE orders.status = 'shipped';

-- Correct: keep the filter in the join condition
SELECT
    customers.name,
    orders.total
FROM customers
LEFT JOIN orders
    ON customers.id = orders.customer_id
    AND orders.status = 'shipped';

LEFT JOIN preserves unmatched rows first. Then WHERE runs and removes rows where orders.status is NULL. That is how a LEFT JOIN quietly turns into an INNER JOIN.

If we want to preserve unmatched rows, the filter belongs in ON.

Now let's turn the idea into practice.

We have three tables:

  • customers (id, name, region)
  • orders (id, customer_id, order_date, status)
  • order_items (order_id, product_id, quantity, price)

Write a query that returns:

  1. customers from the North region
  2. more than 3 completed orders
  3. total revenue above $5000
  4. customer name, order count, and total revenue
  5. the top 10 rows by total revenue

Before writing the query, answer these four questions:

  • Which filters belong in WHERE?
  • Which filter belongs in HAVING?
  • Can total_revenue be reused as an alias in WHERE or HAVING?
  • Why must customer_name appear in GROUP BY if it appears in SELECT?

Bonus challenge: instead of the ranked filter query above, write a second query that returns all North-region customers, including those with zero completed orders, with order_count and total_revenue shown as 0 when no matching order exists. That change depends on join type, null handling, and filter placement.

Takeaway:

Keep this pipeline in mind every time a query behaves strangely:

  1. FROM -> load base table(s)
  2. JOIN -> match related rows inside the FROM phase
  3. WHERE -> filter rows
  4. GROUP BY -> form groups
  5. HAVING -> filter groups
  6. SELECT -> shape the output
  7. DISTINCT -> remove duplicates
  8. ORDER BY -> sort rows
  9. LIMIT -> keep the top N

That is the core idea. SQL is declarative, but its logic is not random. Once we understand the execution order, clause placement becomes much easier to reason about. Next time a query result, metric, or dashboard looks wrong, check the order of the query before blaming the data.