Lab 09: Recommendation Systems

Objective

Build four recommenders from scratch: content-based filtering using cosine similarity on product features, user-based collaborative filtering using the user-item rating matrix, matrix factorisation via SVD for latent factors, and a hybrid system that combines content-based and collaborative signals. All four are applied to recommending Microsoft products to users.

Background

Content-based filtering recommends items similar to those a user liked before, based on item features. Collaborative filtering recommends what similar users liked: "users who bought X also bought Y". Matrix factorisation decomposes the sparse rating matrix into two low-rank matrices (user latent factors × item latent factors) whose product predicts the missing ratings. The winning Netflix Prize solution was a blend of many models in which matrix factorisation variants such as SVD++ were key components.
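The three techniques above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the lab's reference solution: the product names, the rating matrix `R`, the feature matrix `F`, and the helper names (`content_scores`, `user_cf_scores`, `svd_scores`) are all hypothetical, a rating of 0 is taken to mean "not rated", and every user is assumed to have at least one rating.

```python
import numpy as np

# Hypothetical toy data: 4 users x 5 products, 0 means "not rated".
products = ["Surface Pro", "Surface Laptop", "Xbox Series X",
            "Xbox Game Pass", "Microsoft 365"]
R = np.array([
    [5, 0, 1, 0, 4],
    [4, 5, 0, 1, 5],
    [1, 0, 5, 4, 0],
    [0, 1, 4, 5, 0],
], dtype=float)

# Hypothetical item features; columns: productivity, gaming, hardware.
F = np.array([
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 1],
    [0, 1, 0],
    [1, 0, 0],
], dtype=float)


def cosine_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.where(norms == 0, 1.0, norms)
    return Xn @ Xn.T


def content_scores(R, F, user):
    """Cosine similarity between each item and the user's rating-weighted feature profile."""
    profile = R[user] @ F
    denom = np.linalg.norm(F, axis=1) * np.linalg.norm(profile)
    return np.divide(F @ profile, denom, out=np.zeros(len(F)), where=denom > 0)


def user_cf_scores(R, user):
    """Similarity-weighted average of the ratings given by other users.

    Simplification: similarity is computed on raw rows, so unrated items count as 0.
    """
    sim = cosine_matrix(R)[user].copy()
    sim[user] = 0.0                      # exclude the user themself
    rated = (R > 0).astype(float)
    num = sim @ R                        # weighted rating sums per item
    den = sim @ rated                    # total weight of users who rated each item
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)


def svd_scores(R, k=2):
    """Rank-k SVD reconstruction of the user-mean-centred rating matrix."""
    mask = R > 0
    user_mean = R.sum(axis=1) / mask.sum(axis=1)      # mean of observed ratings
    centred = np.where(mask, R - user_mean[:, None], 0.0)
    U, s, Vt = np.linalg.svd(centred, full_matrices=False)
    return user_mean[:, None] + (U[:, :k] * s[:k]) @ Vt[:k]


cb = content_scores(R, F, user=0)
cf = user_cf_scores(R, user=0)
mf = svd_scores(R, k=2)[0]
for name, scores in [("content", cb), ("user CF", cf), ("SVD", mf)]:
    print(name, dict(zip(products, np.round(scores, 2))))
```

For user 0 in this toy data, both the content-based and the user-based scores rank the unrated Surface Laptop above the unrated Xbox Game Pass. The SVD sketch fills unobserved entries with the user's mean before factorising, which is a simple stand-in for the regularised factorisation used in production systems.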

Time

30 minutes

Prerequisites

  • Lab 07 (PCA) — matrix decomposition

  • Lab 08 (NLP) — cosine similarity

Tools

  • Docker: zchencow/innozverse-python:latest


Lab Instructions

💡 Collaborative filtering suffers from the cold-start problem. A new user has no ratings, so there is no basis for finding similar users. A new product has no ratings, so it never appears in anyone's recommendations. Solutions: (1) Ask new users to rate a few seed items. (2) Use content-based filtering until enough ratings accumulate. (3) Use a hybrid system that blends both signals. This is why Spotify asks for 3 favourite artists during signup and Netflix asks you to rate a few movies.
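One way to combine solutions (2) and (3) is to let the weight on the collaborative score grow with the number of ratings a user has supplied. The sketch below assumes this linear ramp; `hybrid_scores`, `min_max`, the `full_trust_at` threshold, and the score arrays are hypothetical and chosen only for illustration.

```python
import numpy as np


def min_max(x):
    """Rescale scores to [0, 1] so the two signals are comparable."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)


def hybrid_scores(content, collab, n_ratings, full_trust_at=5):
    """Blend content-based and collaborative scores.

    alpha is the weight on the collaborative signal: 0 for a brand-new user,
    rising linearly to 1 once the user has `full_trust_at` ratings (assumed threshold).
    """
    alpha = min(n_ratings / full_trust_at, 1.0)
    return alpha * min_max(collab) + (1.0 - alpha) * min_max(content)


# Hypothetical scores for three products.
content = np.array([0.9, 0.2, 0.6])   # e.g. cosine similarities
collab = np.array([2.0, 4.5, 3.0])    # e.g. predicted ratings

new_user = hybrid_scores(content, collab, n_ratings=0)    # content-based only
regular = hybrid_scores(content, collab, n_ratings=10)    # collaborative only
print(np.argmax(new_user), np.argmax(regular))
```

With zero ratings the ranking follows the content-based scores alone, and once the threshold is reached it follows the collaborative scores alone. Intermediate users get a mix of the two.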



Summary

| Method | Data needed | Pros | Cons |
|---|---|---|---|
| Content-based | Item features | No cold start | Feature engineering |
| User-based CF | Rating matrix | No features needed | Cold start, sparse |
| SVD | Rating matrix | Latent factors | Need ratings |
| Hybrid | Both | Best accuracy | Complex |
