Lab 05: Aggregate Functions

Time: 30 minutes | Level: Foundations | DB: MySQL 8 / PostgreSQL 15

Overview

Master COUNT, SUM, AVG, MIN, MAX, GROUP BY, HAVING, DISTINCT, and NULL behavior in aggregates. Includes the ROLLUP modifier for subtotals.


Step 1: Setup

docker run -d --name mysql-lab -e MYSQL_ROOT_PASSWORD=rootpass mysql:8.0
for i in $(seq 1 30); do docker exec mysql-lab mysql -uroot -prootpass -e "SELECT 1" 2>/dev/null && break || sleep 2; done

docker exec mysql-lab mysql -uroot -prootpass << 'EOF'
CREATE DATABASE analytics;
USE analytics;

CREATE TABLE sales (
    sale_id    INT NOT NULL AUTO_INCREMENT,
    rep_name   VARCHAR(50) NOT NULL,
    region     VARCHAR(30) NOT NULL,
    product    VARCHAR(50) NOT NULL,
    amount     DECIMAL(10,2),
    sale_date  DATE NOT NULL,
    PRIMARY KEY (sale_id)
);

INSERT INTO sales (rep_name, region, product, amount, sale_date) VALUES
('Alice',  'North', 'Widget A',   1200.00, '2024-01-05'),
('Bob',    'South', 'Widget B',   850.50,  '2024-01-08'),
('Alice',  'North', 'Widget C',   2300.00, '2024-01-12'),
('Carol',  'East',  'Widget A',   975.00,  '2024-01-15'),
('Bob',    'South', 'Widget A',   1100.00, '2024-01-20'),
('Alice',  'North', 'Widget B',   NULL,    '2024-01-22'),
('Carol',  'East',  'Widget B',   1450.00, '2024-02-01'),
('David',  'West',  'Widget C',   3200.00, '2024-02-05'),
('Bob',    'South', 'Widget C',   900.00,  '2024-02-10'),
('David',  'West',  'Widget A',   2100.00, '2024-02-14'),
('Alice',  'North', 'Widget A',   1800.00, '2024-02-18'),
('Carol',  'East',  'Widget C',   NULL,    '2024-02-20'),
('David',  'West',  'Widget B',   1650.00, '2024-02-25'),
('Bob',    'South', 'Widget B',   1250.00, '2024-03-01'),
('Alice',  'North', 'Widget C',   2750.00, '2024-03-05');
EOF

Step 2: COUNT

📸 Verified Output:

💡 COUNT(*) counts ALL rows. COUNT(column) counts only rows where that column is NOT NULL. This difference is important!


Step 3: SUM, AVG, MIN, MAX

📸 Verified Output:

💡 All aggregate functions (SUM, AVG, MIN, MAX) ignore NULL values. The average above is calculated over 13 rows (the 2 NULLs are excluded).


Step 4: GROUP BY

📸 Verified Output:

💡 Every column in SELECT that is NOT inside an aggregate function must appear in GROUP BY.


Step 5: HAVING (Filter After Grouping)

📸 Verified Output (HAVING total > 5000):

💡 WHERE vs HAVING:

  • WHERE filters individual rows before grouping

  • HAVING filters groups after aggregation

  • You CANNOT use aggregate functions in WHERE


Step 6: DISTINCT and COUNT DISTINCT

📸 Verified Output:


Step 7: NULL Handling in Aggregates

📸 Verified Output:


Step 8: Capstone — ROLLUP for Subtotals

📸 Verified Output:

PostgreSQL equivalent:

💡 ROLLUP is powerful for reporting dashboards — it generates subtotals at each level automatically.

Cleanup:


Summary

Function
Description
NULL behavior

COUNT(*)

Count all rows

Includes NULLs

COUNT(col)

Count non-NULL values

Excludes NULLs

COUNT(DISTINCT col)

Count unique non-NULL values

Excludes NULLs

SUM(col)

Sum of non-NULL values

Excludes NULLs

AVG(col)

Average of non-NULL values

Excludes NULLs

MIN(col)

Minimum non-NULL value

Excludes NULLs

MAX(col)

Maximum non-NULL value

Excludes NULLs

GROUP BY

Group rows for aggregation

Required for non-agg columns

HAVING

Filter groups (post-aggregation)

Like WHERE but for groups

WITH ROLLUP

Add subtotal rows

MySQL syntax

ROLLUP(...)

Add subtotal rows

PostgreSQL syntax

Next: Lab 06 — INNER JOIN

Last updated