Lab 12: MongoDB Aggregation Pipeline

Time: 40 minutes | Level: Practitioner | DB: MongoDB 7

The aggregation pipeline transforms documents through a sequence of stages. Each stage outputs documents to the next — like Unix pipes but for data. It replaces complex SQL GROUP BY, JOIN, and subqueries.


Step 1 — Setup: Orders and Customers

use shopdb

db.orders.drop()
db.customers.drop()

db.customers.insertMany([
  { _id: "Alice",  email: "[email protected]", tier: "Gold",   country: "US" },
  { _id: "Bob",    email: "[email protected]",   tier: "Silver", country: "UK" },
  { _id: "Carol",  email: "[email protected]", tier: "Bronze", country: "US" },
  { _id: "Dave",   email: "[email protected]",  tier: "Gold",   country: "JP" }
])

db.orders.insertMany([
  { customer: "Alice", product: "Laptop Pro",    category: "Electronics", amount: 1299.99, qty: 1, date: new Date("2024-01-05") },
  { customer: "Bob",   product: "Mouse",          category: "Electronics", amount: 29.99,  qty: 3, date: new Date("2024-01-10") },
  { customer: "Alice", product: "Python Book",    category: "Books",       amount: 39.99,  qty: 2, date: new Date("2024-01-15") },
  { customer: "Carol", product: "Running Shoes",  category: "Sports",      amount: 79.99,  qty: 1, date: new Date("2024-01-20") },
  { customer: "Bob",   product: "Keyboard",       category: "Electronics", amount: 89.99,  qty: 1, date: new Date("2024-01-22") },
  { customer: "Dave",  product: "Laptop Pro",     category: "Electronics", amount: 1299.99, qty: 2, date: new Date("2024-01-25") },
  { customer: "Alice", product: "Mouse",           category: "Electronics", amount: 29.99,  qty: 2, date: new Date("2024-01-28") }
])
print("Orders:", db.orders.countDocuments())

📸 Verified Output:


Step 2 — $match and $group

📸 Verified Output:


Step 3 — $project: Reshape Documents

📸 Verified Output:


Step 4 — $lookup: Join Collections

📸 Verified Output:

💡 $unwind deconstructs an array field — each array element becomes a separate document. After $lookup, the joined results are in an array; $unwind flattens it to one doc per match.


Step 5 — $addFields, $count, $skip, $limit


Step 6 — $facet and $bucket


Step 7 — Pipeline Optimization and explain()

💡 The aggregation pipeline optimizer automatically moves $match and $sort before $project when possible. But manual placement helps when the optimizer can't prove safety.


Step 8 — Capstone: Monthly Sales Report with $unwind and $bucket


Summary

Stage
SQL Equivalent
Purpose

$match

WHERE / HAVING

Filter documents

$group

GROUP BY

Aggregate and summarize

$project

SELECT columns

Include/exclude/compute fields

$sort

ORDER BY

Sort documents

$limit

LIMIT

Take N documents

$skip

OFFSET

Skip N documents

$lookup

LEFT JOIN

Join with another collection

$unwind

(flatten rows)

Explode array field to documents

$addFields

SELECT computed

Add computed fields

$count

COUNT(*)

Count documents

$facet

Multiple GROUP BYs

Parallel sub-pipelines

$bucket

CASE + GROUP BY

Range-based grouping

Last updated