Rec Systems


NMF is a 'best in class' option for many recommendation problems:

- Includes overall, user, & item bias as well as latent factor interactions
- Can fit via SGD or ALS
- No need to impute missing ratings
- Use regularization to avoid over-fitting
- Can handle time dynamics, e.g., changes in user preferences
- Used by winning entry in Netflix challenge

Collaborative filtering challenges

- Cold start
- Echo chambers
- Shilling attacks

Content-based pros & cons

- Pro: needs no other users' data
- Con: requires content/context about items

Types of data

Data can be:
Explicit:
- User-provided ratings (1 to 5 stars)
- User like/not-like
Implicit:
- Infer user-item relationships from behavior
- More common
- Example: buy/not-buy; view/not-view
To convert implicit to explicit, create a matrix of 1s (yes) and 0s (no)
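A minimal sketch of the implicit-to-explicit conversion described above, assuming a small pandas DataFrame of (user, item) view events (the column names and data are illustrative):

```python
import pandas as pd

# Hypothetical implicit-feedback log: each row is one "user viewed item" event.
events = pd.DataFrame({
    "user": ["u1", "u1", "u2", "u3"],
    "item": ["i1", "i2", "i2", "i3"],
})

# Convert implicit behavior into an explicit 0/1 utility matrix:
# 1 = the user interacted with the item, 0 = no observed interaction.
utility = (
    events.assign(seen=1)
          .pivot_table(index="user", columns="item", values="seen", fill_value=0)
          .astype(int)
)
print(utility)
```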

Ranking

Candidate generator -> retrieves top-k items -> embedding space built through matrix factorization -> k-NN to find candidates; evaluate the ranking with normalized discounted cumulative gain (NDCG)
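A small sketch of NDCG, assuming a list of relevance scores in the order the ranker returned them:

```python
import numpy as np

def dcg(relevances):
    """Discounted cumulative gain: later positions are discounted by log2(rank + 1)."""
    relevances = np.asarray(relevances, dtype=float)
    ranks = np.arange(1, len(relevances) + 1)
    return float(np.sum(relevances / np.log2(ranks + 1)))

def ndcg(relevances):
    """Normalize by the DCG of the ideal (descending-relevance) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Example: the ranker put a relevance-3 item first, a relevance-0 item second, ...
print(ndcg([3, 0, 2, 1]))
```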

The cold-start problem

Cold-start problem:
- Need utility matrix to recommend
- Can ask users to rate items
- Infer ratings from behavior, e.g., viewing an item
- Must also handle new users and new items
Approaches:
- Use ensemble of (bad) recommenders until you have enough ratings
- Use content-based recommender
- Exploit implicit, tag, and other side data
- Use ItemSimilarityModel until you have enough rating data
- Recommend popular items
- Present random items to sub-groups that might have interest

Content-based recommendations

Compute (cosine) distance between the user profile and item profiles
May want to bucket items first using random-hyperplane and locality-sensitive hashing (LSH)
ML approach:
- Use random forest or equivalent to predict on a per-user basis
- Computationally intensive -- usually only feasible for small problems
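A sketch of the profile-matching step, assuming the user profile and item profiles are already built as vectors over the same feature space (the names and values are illustrative):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical profiles over the same feature space (e.g., TF-IDF terms or tags).
user_profile = np.array([0.9, 0.1, 0.0])
item_profiles = {
    "item_a": np.array([1.0, 0.0, 0.0]),
    "item_b": np.array([0.0, 1.0, 0.5]),
}

# Rank items by similarity to the user profile (higher = closer match).
ranked = sorted(item_profiles,
                key=lambda i: cosine_similarity(user_profile, item_profiles[i]),
                reverse=True)
print(ranked)
```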

Collaborative filtering recipe

Compute predictions by similarity:
- Normalize (demean) utility matrix
- Compute similarity of users or items
- Predict ratings for unrated items
- Add prediction to average rating of user/item
Note:
- Precompute utility matrix for each user -- it is relatively stable
- Only compute predictions at runtime
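A compact sketch of this recipe with NumPy, assuming a small utility matrix with NaN for missing ratings (user-based variant; the matrix is illustrative):

```python
import numpy as np

# Rows = users, columns = items; NaN = unrated.
R = np.array([[5.0, 3.0, np.nan],
              [4.0, np.nan, 1.0],
              [1.0, 1.0, 5.0]])

# 1. Normalize: subtract each user's mean rating (ignoring missing entries).
user_means = np.nanmean(R, axis=1, keepdims=True)
D = np.where(np.isnan(R), 0.0, R - user_means)   # demeaned, missing -> 0

# 2. Compute user-user cosine similarity on the demeaned matrix.
norms = np.linalg.norm(D, axis=1, keepdims=True)
S = (D @ D.T) / (norms @ norms.T + 1e-12)

# 3. Predict a missing rating as a similarity-weighted average of other users'
#    demeaned ratings, then 4. add back the target user's mean.
def predict(u, i):
    w = S[u].copy()
    w[u] = 0.0                                   # exclude the user themselves
    mask = ~np.isnan(R[:, i])                    # users who rated item i
    num = np.sum(w[mask] * D[mask, i])
    den = np.sum(np.abs(w[mask])) + 1e-12
    return float(user_means[u, 0] + num / den)

print(predict(0, 2))  # user 0's predicted rating for item 2
```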

Item Profile

- Consists of (feature, value) pairs
- Consider setting features to 0 or 1
- Consider how to scale non-Boolean features

Choosing a similarity measure

Cosine:
- Use for ratings (non-Boolean) data
- Treat missing ratings as 0
- Cosine + de-meaned data is the same as Pearson
Jaccard:
- Use only for Boolean (e.g., buy/not buy) data
- Loses information with ratings data
Then compute a similarity matrix of pair-wise similarities between items (users)
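A sketch of the two measures side by side (SciPy/scikit-learn have equivalents; this just spells out the definitions on toy vectors):

```python
import numpy as np

def cosine_sim(a, b):
    """For ratings vectors; missing ratings should already be filled with 0 (or demeaned)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def jaccard_sim(a, b):
    """For Boolean vectors (e.g., buy / not-buy): |intersection| / |union|."""
    a, b = a.astype(bool), b.astype(bool)
    union = np.sum(a | b)
    return float(np.sum(a & b) / union) if union else 0.0

ratings_i = np.array([5, 0, 3, 0])   # item i's ratings across users (0 = missing)
ratings_j = np.array([4, 0, 0, 1])
print(cosine_sim(ratings_i, ratings_j))
print(jaccard_sim(ratings_i > 0, ratings_j > 0))
```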

Problem of retraining for each new user

Deep Learning Extension

user profile

Describes user preferences (utility matrix)
Consider how to aggregate item features per user:
- Compute "weight" a user puts on each feature
- E.g., "Julia Roberts" feature = average rating for films with "Julia Roberts"
Normalize: subtract average utility per user
- E.g., "Julia Roberts" feature = average rating for films with "Julia Roberts" - average rating
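A sketch of the aggregation above, assuming one Boolean item feature and one user's ratings (the "Julia Roberts" feature is the running example from this card; the numbers are illustrative):

```python
import numpy as np

# Items the user has rated, with a Boolean "Julia Roberts" feature per item.
has_julia_roberts = np.array([1, 1, 0, 1], dtype=bool)
user_ratings = np.array([5.0, 4.0, 2.0, 3.0])

# Un-normalized weight: average rating of films with the feature.
weight = user_ratings[has_julia_roberts].mean()

# Normalized weight: subtract the user's overall average rating.
normalized_weight = weight - user_ratings.mean()

print(weight, normalized_weight)   # 4.0 and 0.5
```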

Evaluation issues

Historically, used RMSE or MAE
But we only care about predicting the top n items:
- Should you compute the metric over all missing ratings in the test set?
- No need to predict undesirable items well
Precision at n: percentage of the top n predicted ratings that are 'relevant'
Recall at n: percentage of relevant items that appear in the top n predictions
Lift or hit rate are more relevant to business
Performance of a recommender should be viewed in the context of user experience (UX) ⇒ run an A/B test on the entire system
Cross-validation is hard:
- What do you use for labels, given the missing data?
- Users choose to rate only some items ⇒ selection bias
- Not clear how to fix this bias, which is always present
Beware of local optima ⇒ use multiple starts
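A minimal sketch of precision-at-n and recall-at-n, assuming a ranked list of predicted items and a set of items known to be relevant (the item IDs are illustrative):

```python
def precision_at_n(ranked_items, relevant, n):
    """Fraction of the top-n predictions that are relevant."""
    top = ranked_items[:n]
    return sum(item in relevant for item in top) / n

def recall_at_n(ranked_items, relevant, n):
    """Fraction of all relevant items that appear in the top-n predictions."""
    top = ranked_items[:n]
    return sum(item in relevant for item in top) / len(relevant)

ranked = ["i3", "i7", "i1", "i9", "i2"]   # model's ordering, best first
relevant = {"i1", "i2", "i8"}
print(precision_at_n(ranked, relevant, 3))  # 1/3
print(recall_at_n(ranked, relevant, 3))     # 1/3
```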

Cross-validation

- Randomly sample ratings to use in the training set
- Split on users
- Be careful if you split temporally
- Do not split on items

To get best performance with NMF:

- Model bias (overall, user, and item)
- Model time dynamics, such as changes in user preferences
- Add side or implicit information to handle cold-start
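With the bias terms added, the prediction typically takes the form $\hat{r}_{ui} = \mu + b_u + b_i + q_i^T p_u$. A small sketch (the factor vectors and bias values are illustrative):

```python
import numpy as np

def predict_rating(mu, b_u, b_i, p_u, q_i):
    """Biased latent-factor prediction: global mean + user bias + item bias + interaction."""
    return mu + b_u + b_i + float(q_i @ p_u)

mu = 3.6                      # global average rating
b_u, b_i = 0.2, -0.4          # user rates high on average; item rated low on average
p_u = np.array([0.8, -0.1])   # latent user factors
q_i = np.array([0.5, 0.3])    # latent item factors
print(predict_rating(mu, b_u, b_i, p_u, q_i))
```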

NMF

Non-negative matrix factorization

Building a production recommender is also challenging:

Part of the entire UX
Should consider:
- Diversity of recommendations
- Privacy of personal information
- Security against attacks on the recommender
- Social effects
- Provide explanations

Collaborative filtering using matrix factorization

Predict ratings from latent factors:
- Compute latent factors $q_i$ and $p_u$ via matrix factorization
- Latent factors are unobserved user or item attributes:
  - Describe some user or item concept
  - Affect behavior
  - Example: escapist vs. serious, male vs. female films
- Predict rating: $\hat{r}_{ui} = q_i^T p_u$
- Assumes:
  - Utility matrix is the product of two simpler matrices (long, thin)
  - ∃ a small set of users & items which characterize behavior
  - A small set of features determines the behavior of most users
- Can use NMF, UV decomposition, or SVD
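A hedged sketch with scikit-learn's NMF, treating missing ratings as 0 (a simplification; dedicated recommender libraries mask missing entries instead). The utility matrix and number of factors are illustrative:

```python
import numpy as np
from sklearn.decomposition import NMF

# Utility matrix: rows = users, columns = items, 0 = unrated (simplification).
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)

model = NMF(n_components=2, init="random", random_state=0, max_iter=500)
P = model.fit_transform(R)      # user factors p_u (one row per user)
Q = model.components_           # item factors q_i (one column per item)

R_hat = P @ Q                   # predicted ratings, including the unrated cells
print(np.round(R_hat, 2))
```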

Predict ratings from similarity

Predict using a similarity-weighted average of ratings:
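A common item-based form (a sketch; here $N(i;u)$ denotes the items rated by user $u$ that are most similar to item $i$, and $s_{ij}$ is the item-item similarity):

$\hat{r}_{ui} = \dfrac{\sum_{j \in N(i;u)} s_{ij}\, r_{uj}}{\sum_{j \in N(i;u)} s_{ij}}$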

Content-based Recommender

Recommend based on item properties/characteristics
Construct an item profile of characteristics using various features:
- Text: use TF-IDF and keep the top N features or features over a cutoff
- Images: use tags -- only works if tags are frequent & accurate
Construct a user profile
Compute similarity: Jaccard, cosine, or max dot product

Recommend best items

Recommend items with the highest predicted rating:
- Sort predicted ratings $\hat{r}_{ui}$
- Optimize by only searching a neighborhood containing the n items most similar to i
Beware:
- Consumers like variety
- Don't recommend every Star Trek film to someone who liked the first
- Best to offer several different types of item
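A quick sketch of the "take the top n" step with NumPy (the predicted ratings are illustrative):

```python
import numpy as np

r_hat = np.array([2.1, 4.7, 3.9, 4.9, 1.2])   # predicted ratings for one user
n = 3

# argpartition finds the n largest without fully sorting; then sort just those.
top_idx = np.argpartition(-r_hat, n)[:n]
top_idx = top_idx[np.argsort(-r_hat[top_idx])]
print(top_idx, r_hat[top_idx])   # item indices 3, 1, 2
```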

SVD vs. NMF

SVD:
- Must know all ratings -- i.e., no unrated items
- Assumes you can minimize the squared Frobenius norm
- Very slow if the matrix is large & dense
NMF:
- Can estimate via alternating least squares (ALS) or stochastic gradient descent (SGD)
- Must regularize
- Can handle big data, biases, interactions, and time dynamics

SVD

Singular Value Decomposition

long tail

With a long tail of items, learning to rank becomes critical. A secondary problem is diversity: show the user items of different types or categories, since not all recommendations should be similar. Look into diversity-increasing algorithms.

Two methods to estimate NMF factors:

Stochastic gradient descent (SGD):
- Easier and faster than ALS
- Must tune the learning rate
- Sometimes called 'Funk SGD' after its originator
Alternating least squares (ALS):
- Use least squares, alternating between fixing $q_i$ and $p_u$
- Available in Spark/MLlib
- Fast if you can parallelize
- Better for implicit (non-sparse) data
Beware of local optima!
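A minimal sketch of the Funk-style SGD update with L2 regularization, assuming a list of observed (user, item, rating) triples (the data and hyperparameters are illustrative):

```python
import numpy as np

ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (2, 2, 1.0)]  # (user, item, rating)
n_users, n_items, k = 3, 3, 2
lr, reg, epochs = 0.01, 0.1, 200

rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(n_users, k))   # user factors p_u
Q = rng.normal(scale=0.1, size=(n_items, k))   # item factors q_i

for _ in range(epochs):
    for u, i, r in ratings:
        p_u, q_i = P[u].copy(), Q[i].copy()
        err = r - q_i @ p_u                    # prediction error for this rating
        # Gradient step on the regularized squared error, one rating at a time.
        P[u] += lr * (err * q_i - reg * p_u)
        Q[i] += lr * (err * p_u - reg * q_i)

print(P @ Q.T)   # reconstructed ratings (users x items)
```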

Cosine distance

Use for ratings (non-Boolean) data
Treat missing ratings as 0
Cosine + de-meaned data is the same as Pearson

Jaccard Distance

Use only for Boolean (e.g., buy/not buy) data
Loses information with ratings data

CF using similarity

Use similarity to recommend items:
- Make recommendations based on similarity:
  - Between users
  - Between items
- Similarity measures: Pearson, cosine, Jaccard
Matrix sparsity = # ratings / total # elements
- Low sparsity -> don't do collaborative filtering

Utility Matrix

- User ratings of items
- User purchase decisions for items
- Most items are unrated ⇒ matrix is sparse
- Unrated items are coded as 0 or missing
Use a recommender to:
- Determine which attributes users think are important
- Predict ratings for unrated items
- Better than trusting 'expert' opinion

Two types of similarity-based CF

User-based: predict based on similarities between users
- Performs well, but slow if there are many users
- Use item-based CF if |Users| ≫ |Items|
Item-based: predict based on similarities between items
- Faster if you precompute the item-item similarity
- Usually |Users| ≫ |Items| ⇒ item-based CF is most popular
- Items tend to be more stable:
  - Items are often in only one category (e.g., action films)
  - Stable over time
  - Users may like variety or change preferences over time
- Items usually have more ratings than users ⇒ items have more stable average ratings than users

collaborative filtering

a process that automatically groups people with similar buying intentions, preferences, and behaviors and predicts future purchases

presentation bias

Analyze bias in the data; stress diversity
Results appearing lower in the ranking affect:
- whether a post is even seen
- even if seen, users may not interact because it is further down
Correction: divide by the bias (probability the item is clicked over some other relevant, lower-ranked item) * the irrelevant-item bias
Personalized feed signals, e.g., scrolling speed

Normalization

user-item rating baseline (bias) = global average + item bias (item's average deviation from the global average) + user bias (user's average deviation from the global average)
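A hedged sketch of computing these bias terms from a utility matrix with NaN for missing ratings (the matrix is illustrative; the user and item terms are deviations from the global average, which is the usual reading of this baseline):

```python
import numpy as np

R = np.array([[5.0, 3.0, np.nan],
              [4.0, np.nan, 1.0],
              [1.0, 1.0, 5.0]])

global_avg = np.nanmean(R)
item_bias = np.nanmean(R, axis=0) - global_avg   # each item's average rating relative to global
user_bias = np.nanmean(R, axis=1) - global_avg   # each user's average rating relative to global

# Baseline prediction for user u, item i:
u, i = 0, 2
baseline = global_avg + user_bias[u] + item_bias[i]
print(round(float(baseline), 2))
```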

