Recommender Systems, an Introduction#
Intro#
… there was a girl called Marinka

and Marinka was happy again

As time passes by …

Idea

But


Recommender Engines#
What are Recommenders?#
Algorithms that find similar items and recommend them to users
Examples: YouTube, Amazon, Netflix, and Spotify
Why Recommenders?#
User engagement: keep users on the platform
Personalization: tailor content to individual preferences
Revenue generation: increase sales through targeted recommendations
Data collection: gather user preferences for future improvements
Data Collection#
User data: user profiles, preferences, and behavior
implicit feedback (views, clicks, purchases)
explicit feedback (ratings, reviews)
Item data: item features, descriptions, and metadata
Types#
Content based
Collaborative filtering
Hybrid models
Content based model#
if the user liked/bought an item, recommend similar items.

How to find similar Items?#
Similarity measure:
Euclidean distance
Pearson correlation
cosine similarity → most used
Cosine similarity#
Dot product of the vectors divided by the product of their lengths
Takes values between -1 and 1; for non-negative vectors (such as ratings or counts) it lies between 0 and 1
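Written as a formula, for two vectors \(A\) and \(B\):

\[
\text{sim}(A, B) = \cos(\theta) = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
\]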

Example#
(Figures on the slides: a calculation example, the resulting similarity matrix, and the outcome)
Collaborative filtering#
if user A and user B have similar tastes, recommend items that user B liked to user A.
Types:
item based
similarities between items based on user ratings
user based
similarities between users based on their ratings
User-Item-Rating Matrix#
Rows are users, columns are items, the cells hold the (mostly missing) ratings
We usually deal with sparse matrices
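As a small illustration (the library choice and the numbers are assumptions, not from the slides), such a sparse rating matrix can be stored efficiently, e.g. with SciPy:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical (user, item, rating) triples -- most cells of the full matrix stay empty
users   = np.array([0, 0, 1, 2, 2])
items   = np.array([0, 3, 1, 0, 2])
ratings = np.array([5.0, 3.0, 4.0, 2.0, 5.0])

# 3 users x 4 items, but only 5 of the 12 cells are filled
R = csr_matrix((ratings, (users, items)), shape=(3, 4))
print(R.toarray())                                     # dense view, 0 = missing rating
print(R.nnz, "ratings stored out of", R.shape[0] * R.shape[1], "cells")
```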

A simple user-based Example#
We want to use the most similar user's rating to predict another user's missing rating
calculate similarity matrix (e.g. Cosine, Pearson, Euclidean)
use most similar user to predict the rating
User-User-Similarity Matrix#
First we calculate the similarity matrix
Then we look for the most similar user

Predict Rating#
Then we use the rating of the most similar user to predict the missing rating
Repeat this (find the most similar user, copy their rating) for all users

Recommend Item#
recommend the item with the highest predicted rating that the user has not rated before
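A minimal sketch of this whole user-based procedure (the rating matrix is invented, 0 marks a missing rating, cosine similarity is used as the similarity measure):

```python
import numpy as np

# Hypothetical user-item rating matrix, 0 = not rated yet
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)

# User-user cosine similarity matrix
R_norm = R / np.linalg.norm(R, axis=1, keepdims=True)
sim = R_norm @ R_norm.T
np.fill_diagonal(sim, -1.0)               # a user must not be their own neighbour

for u in range(R.shape[0]):
    neighbour = int(np.argmax(sim[u]))    # most similar user
    unrated = np.where(R[u] == 0)[0]      # items user u has not rated
    predictions = R[neighbour, unrated]   # copy the neighbour's ratings (0 if they skipped it too)
    best = unrated[np.argmax(predictions)]
    print(f"user {u}: most similar user {neighbour}, recommend item {best}")
```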

Model Based#
(A not so simple Example)
SVD - Singular Value Decomposition
Approximate Rating Matrix by product of three matrices
Latent features can often be interpreted (genre etc.)
Matrix factorization of this kind was central to the winning Netflix Prize solutions
SVD#
Singular Value Decomposition
U - User–Feature Matrix:
Each row corresponds to a user, each column to a latent feature (hidden factor).
Example: A latent feature could represent a hidden preference dimension like “likes action movies” …
The values tell you how strongly each user relates to each latent feature.
Σ - Singular Values (Diagonal Matrix):
Contains the strengths (weights) of the latent features.
Larger singular values = more significant latent dimensions for explaining the data.
\(V^{T}\) - Feature–Item Matrix:
Each row corresponds to a latent feature, each column to an item.
The values indicate how strongly each item relates to each latent feature.
Example: An item might score high on the “action movie” factor and low on the “romantic comedy” factor.
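As a small, hedged illustration of the three factors (the rating matrix is invented and fully filled; with missing entries one cannot call exact SVD directly, see the next slide):

```python
import numpy as np

# Hypothetical, fully observed 4x3 rating matrix, just to show the shapes of U, Sigma, V^T
R = np.array([
    [5, 4, 1],
    [4, 5, 1],
    [1, 1, 5],
    [2, 1, 4],
], dtype=float)

U, s, Vt = np.linalg.svd(R, full_matrices=False)
print(U.shape, s.shape, Vt.shape)     # (4, 3) (3,) (3, 3): user-feature, singular values, feature-item

# Keep only the k strongest latent features -> low-rank approximation of the ratings
k = 2
R_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]
print(np.round(R_k, 1))
```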

SVD#
How to predict the missing ratings?
Reconstruct the matrix as \(U \Sigma V^{T}\); the reconstructed entry for a user-item pair is the predicted rating
Optimal decomposition can be found using gradient descent
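A minimal sketch of that gradient-descent idea (Funk-SVD style: the squared error is taken only over observed ratings; the matrix, learning rate and number of latent features are arbitrary choices). The two factor matrices play the role of \(U \Sigma\) and \(V^{T}\) from the previous slides:

```python
import numpy as np

rng = np.random.default_rng(0)

R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)                   # hypothetical ratings, 0 = missing
observed = np.argwhere(R > 0)     # only these entries enter the squared error

n_users, n_items = R.shape
k, lr, reg = 2, 0.01, 0.02        # latent features, learning rate, regularization
P = 0.1 * rng.standard_normal((n_users, k))   # user-feature factors
Q = 0.1 * rng.standard_normal((n_items, k))   # item-feature factors

for epoch in range(2000):
    for u, i in observed:
        pu = P[u].copy()
        err = R[u, i] - pu @ Q[i]                 # error on one observed rating
        P[u] += lr * (err * Q[i] - reg * pu)      # gradient step for the user factors
        Q[i] += lr * (err * pu - reg * Q[i])      # gradient step for the item factors

R_hat = P @ Q.T                   # predictions, including the formerly missing cells
print(np.round(R_hat, 1))
```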

Recommend Item#
Recommend the item with the highest predicted rating that the user has not rated before

How to evaluate recommenders?#
Offline evaluation (for algorithm tuning):
Rating prediction accuracy (a small sketch follows after this list):
RMSE (Root Mean Squared Error)
MAE (Mean Absolute Error)
Ranking quality:
Precision, Recall
MAP (Mean Average Precision)
Diversity, novelty (how broad and varied the recommendations are)
Online evaluation (for validating the real effect on users and the business):
A/B testing approach:
Split live users into groups:
Control: baseline system (e.g., popularity-based recommendations)
Treatment: new algorithm
Typical online metrics
CTR (Click-Through Rate)
Conversion rate (purchase, subscription, etc.)
Engagement time (session duration, items viewed)
Retention / churn rate
Revenue per user
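A minimal sketch of the offline accuracy metrics mentioned above (the held-out ratings and predictions are made up):

```python
import numpy as np

# Hypothetical held-out ratings and the recommender's predictions for them
y_true = np.array([4.0, 3.0, 5.0, 2.0, 1.0])
y_pred = np.array([3.5, 3.0, 4.0, 2.5, 2.0])

rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))   # Root Mean Squared Error
mae = np.mean(np.abs(y_true - y_pred))            # Mean Absolute Error
print(f"RMSE = {rmse:.2f}, MAE = {mae:.2f}")
```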
Drawbacks of Recommenders#
Cold start problem: new users/items have no data
Sparsity: many items have few ratings, making it hard to find similar items/users
Scalability: large datasets can be computationally expensive
Bias: algorithms can reinforce existing biases in data