Lab 06: EXPLAIN & Query Optimization

Time: 40 minutes | Level: Practitioner | DB: MySQL 8.0 + PostgreSQL 15

EXPLAIN reveals how the query planner executes your SQL. Reading execution plans is a core skill — it shows whether indexes are used, where time is spent, and which joins are expensive.


Step 1 — MySQL: Basic EXPLAIN

USE labdb;

-- Create a meaningful dataset
CREATE INDEX idx_customer ON orders(customer);
CREATE INDEX idx_product  ON orders(product);

-- Basic EXPLAIN
EXPLAIN
SELECT customer, COUNT(*) AS cnt, SUM(total) AS total_spent
FROM orders
WHERE customer = 'Alice'
GROUP BY customer;

📸 Verified Output:

id  select_type  table   partitions  type  possible_keys  key           key_len  ref    rows  filtered  Extra
1   SIMPLE       orders  NULL        ref   idx_customer   idx_customer  403      const  2     100.00    NULL

Key columns to read:

Column
Meaning

type

Join type: const > eq_ref > ref > range > index > ALL

key

Index actually used (NULL = full scan)

rows

Estimated rows examined

filtered

% of rows after WHERE filter

Extra

Using index = covering index; Using filesort = expensive sort

💡 type=ALL means full table scan — usually bad. type=ref or type=range means index is used.


Step 2 — MySQL: EXPLAIN FORMAT=JSON

The JSON format reveals:

  • cost_info: estimated cost per node

  • used_columns: which columns are read

  • attached_condition: pushed-down filters

  • ordering_operation: whether filesort is needed

💡 Add EXPLAIN ANALYZE in MySQL 8.0.18+ to get actual execution statistics alongside estimates.


Step 3 — PostgreSQL: EXPLAIN ANALYZE

📸 Verified Output:


Step 4 — Reading PostgreSQL Execution Plans

Node types and what they mean:

Node
Meaning

Seq Scan

Full table scan — reads every row

Index Scan

Follows index to heap; good for selective queries

Index Only Scan

Never touches heap — all data in index

Bitmap Index Scan + Bitmap Heap Scan

Batches index lookups, then reads heap

Hash Join

Builds hash table from smaller relation

Nested Loop

For each outer row, scans inner — good for small tables

Merge Join

Requires both inputs sorted; good for large sorted data

Sort

Explicit sort (memory or disk)

HashAggregate

GROUP BY via hash map

📸 Verified Output (with index):


Step 5 — Cost Model and Statistics

💡 If your planner makes bad choices, statistics may be stale. Always ANALYZE after bulk inserts/updates. Increase default_statistics_target for high-cardinality columns.


Step 6 — Identifying Slow Queries

MySQL slow query log:

PostgreSQL:


Step 7 — N+1 Pattern Detection

The N+1 anti-pattern: execute one query to get N records, then execute N more queries for related data.

💡 In PostgreSQL, use pg_stat_activity to catch long-running queries in real-time: SELECT pid, now() - query_start AS duration, query FROM pg_stat_activity WHERE state = 'active';


Step 8 — Capstone: Query Tuning Workflow

Apply the full optimization workflow to a slow query:


Summary

Tool
Purpose

EXPLAIN

Show estimated plan (no execution)

EXPLAIN ANALYZE

Execute and show actual vs estimated

EXPLAIN (BUFFERS)

Add buffer cache hit/miss info

EXPLAIN FORMAT=JSON

Machine-readable plan

pg_stats

Planner statistics per column

ANALYZE

Refresh table statistics

pg_stat_statements

Aggregate slow query tracking

performance_schema (MySQL)

Query digest and timing

Last updated