20 February 2026 · 4 mins
When I wanted to build an item-item recommendation system from implicit ratings only (e.g. viewed, clicked, as opposed to explicit ratings, which are things like liked or purchased), Matrix Factorization (MF) with Bayesian Personalized Ranking (BPR) is what Opus 4.5 recommended I start with. Since understanding these systems helps with identifying and debugging issues, I'll walk through the crucial information I learned over the span of a month (on and off) until I understood enough to fix problems and evaluate model performance. I've since switched to LightFM, which overcomes a cold-start problem and has better recommendation performance, but this knowledge should still be valuable whether you choose MF + BPR or LightFM.
The goal of this algorithm is: given an item i, recommend K items that are similar to it. Those K items could be shown as recommendations alongside item i, but the task is strictly to find related items. Items can be things like Amazon products, Netflix movies, and Spotify songs.
Matrix Factorization with BPR allows us to accomplish this in a series of steps:
1. Sample a triplet (u, i, j): a user u, an item i the user interacted with (the positive item), and an item j they didn't (the negative item). Minimize the BPR loss -Math.log(sigmoid(x̂_{uij}) + 1e-10). The 1e-10 is there to avoid computing log(0).
2. Update each factor k of the user vector, the positive item vector, and the negative item vector with stochastic gradient descent, where sigmoidNegDiff is sigmoid(-x̂_{uij}), u, pi, and ni are the k-th entries of the three vectors, lr is the learning rate, and regularization is the regularization strength:
   - userFactor[k] += lr * (sigmoidNegDiff * (pi - ni) - regularization * u)
   - posFactor[k] += lr * (sigmoidNegDiff * u - regularization * pi)
   - negFactor[k] += lr * (-sigmoidNegDiff * u - regularization * ni)
We now have latent user vectors and latent item vectors, with every entry being a factor representing something about the user or item (these aren't manually labelled).
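To make those update rules concrete, here's a minimal sketch of a single BPR training step. It's written in TypeScript with my own names (dot, sigmoid, bprStep) rather than any particular library's, lr and regularization are placeholder hyperparameters, and the triplet-sampling loop around it is omitted.

```typescript
// Minimal sketch of one BPR stochastic gradient descent step.
// All names and hyperparameters here are illustrative, not from a specific library.

function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let k = 0; k < a.length; k++) sum += a[k] * b[k];
  return sum;
}

function sigmoid(x: number): number {
  return 1 / (1 + Math.exp(-x));
}

// One update for a sampled triplet (u, i, j): user u interacted with item i but not item j.
function bprStep(
  userFactor: number[],
  posFactor: number[],   // latent vector for the positive item i
  negFactor: number[],   // latent vector for the negative item j
  lr: number,
  regularization: number
): number {
  // x̂_uij = x̂_ui − x̂_uj
  const xuij = dot(userFactor, posFactor) - dot(userFactor, negFactor);

  // The gradient of the BPR loss uses sigmoid(−x̂_uij)
  const sigmoidNegDiff = sigmoid(-xuij);

  for (let k = 0; k < userFactor.length; k++) {
    const u = userFactor[k];
    const pi = posFactor[k];
    const ni = negFactor[k];
    userFactor[k] += lr * (sigmoidNegDiff * (pi - ni) - regularization * u);
    posFactor[k]  += lr * (sigmoidNegDiff * u - regularization * pi);
    negFactor[k]  += lr * (-sigmoidNegDiff * u - regularization * ni);
  }

  // BPR loss for this triplet; the 1e-10 avoids log(0)
  return -Math.log(sigmoid(xuij) + 1e-10);
}
```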
With cosine similarity (the dot product of two vectors, normalized by the product of their magnitudes), it's possible to find the closest-matching item vectors to an item, as well as the closest-matching user vectors to a user.
The dot product represents how much two vectors "agree" with each other: two vectors have a high dot product when their entries agree in sign and magnitude.
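As an illustration of how the item-item lookup could work once training is done, here's a sketch that ranks items by cosine similarity over the learned item vectors. The names (cosineSimilarity, topKSimilarItems, itemFactors) are mine for illustration, not from a specific library.

```typescript
// Sketch: find the K most similar items to a given item by cosine similarity
// over the learned latent item vectors.

function cosineSimilarity(a: number[], b: number[]): number {
  let dotProduct = 0, normA = 0, normB = 0;
  for (let k = 0; k < a.length; k++) {
    dotProduct += a[k] * b[k];
    normA += a[k] * a[k];
    normB += b[k] * b[k];
  }
  return dotProduct / (Math.sqrt(normA) * Math.sqrt(normB) + 1e-10);
}

function topKSimilarItems(itemFactors: number[][], itemIndex: number, K: number): number[] {
  const target = itemFactors[itemIndex];
  return itemFactors
    .map((factor, index) => ({ index, score: cosineSimilarity(target, factor) }))
    .filter(({ index }) => index !== itemIndex) // don't recommend the item to itself
    .sort((a, b) => b.score - a.score)
    .slice(0, K)
    .map(({ index }) => index);
}
```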
x_{uij} = x_{ui} - x_{uj} is a score representing whether user u prefers item i over item j: the preference holds if the score is positive, and doesn't if it's negative. Since the sigmoid function maps a number x to a value between 0 and 1, sigmoid(x_{uij}) can be read as P(i > j | u), the probability that user u prefers item i over item j.
Note that gradient descent pushes userFactor • posFactor (x_{ui}) up and away from userFactor • negFactor (x_{uj}). It pushes x_{uij} towards being positive; in other words, it makes the score x_{ui} greater than the score x_{uj}.
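As a quick sanity check, you could run the bprStep sketch from earlier once on made-up vectors and confirm that the gap widens (the numbers below are arbitrary, and this assumes a small learning rate and regularization):

```typescript
// Toy check that one BPR step increases x̂_uij (made-up vectors).
const userFactor = [0.1, -0.2, 0.3];
const posFactor = [0.05, 0.1, -0.1];
const negFactor = [0.2, 0.0, 0.1];

const before = dot(userFactor, posFactor) - dot(userFactor, negFactor);
bprStep(userFactor, posFactor, negFactor, 0.05, 0.01);
const after = dot(userFactor, posFactor) - dot(userFactor, negFactor);

console.log(before < after); // expected: true — x̂_ui is pushed above x̂_uj
```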
While implementing MF + BPR, I ran into co-exposure bias: an item j gets recommended for item i just because the two are visible from the same page. An item from a trending page would end up with only other trending items as its recommendations. There are several mitigations for this.
Two metrics commonly used for evaluating MF + BPR model performance are Recall@K and HR@K (Hit Rate@K). Hit Rate@K, according to ChatGPT, is defined as "the fraction of users for whom at least one relevant item appears in the top-K recommendations." Recall@K is defined as "the fraction of a user's relevant items that appear in the top-K recommendations."
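Here's a sketch of both metrics, assuming you already have each user's top-K recommendation list and the set of items considered relevant for that user (the data shapes and names are my own, and Recall@K is averaged across users):

```typescript
// Sketch of HR@K and Recall@K given, for each user, the top-K recommended
// item ids and the set of relevant (held-out) item ids.

function hitRateAtK(topK: Map<string, string[]>, relevant: Map<string, Set<string>>): number {
  let hits = 0, users = 0;
  for (const [userId, recs] of topK) {
    const rel = relevant.get(userId);
    if (!rel || rel.size === 0) continue;
    users++;
    if (recs.some((itemId) => rel.has(itemId))) hits++; // at least one relevant item in top-K
  }
  return users === 0 ? 0 : hits / users;
}

function recallAtK(topK: Map<string, string[]>, relevant: Map<string, Set<string>>): number {
  let totalRecall = 0, users = 0;
  for (const [userId, recs] of topK) {
    const rel = relevant.get(userId);
    if (!rel || rel.size === 0) continue;
    users++;
    const found = recs.filter((itemId) => rel.has(itemId)).length;
    totalRecall += found / rel.size; // fraction of this user's relevant items retrieved
  }
  return users === 0 ? 0 : totalRecall / users;
}
```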
One way model testing is commonly done is Leave-One-Out. If a user interacted with items 1, 2, 3, and 4, it's possible to train the model on items 1-3 and check whether item 4 appears in that user's top-K recommendations; doing this across all users gives the aforementioned HR@K.
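A minimal sketch of that split, assuming interactions are stored as a map from user id to a chronologically ordered list of item ids (again, names and shapes are my own):

```typescript
// Sketch of a leave-one-out split: train on all but the last interaction
// per user, hold out the last one for evaluation.

function leaveOneOutSplit(interactions: Map<string, string[]>): {
  train: Map<string, string[]>;
  heldOut: Map<string, Set<string>>;
} {
  const train = new Map<string, string[]>();
  const heldOut = new Map<string, Set<string>>();
  for (const [userId, items] of interactions) {
    if (items.length < 2) {
      train.set(userId, items); // not enough interactions to hold one out
      continue;
    }
    train.set(userId, items.slice(0, -1));
    heldOut.set(userId, new Set([items[items.length - 1]]));
  }
  return { train, heldOut };
}

// After training on `train`, generate top-K lists per user and feed them,
// together with `heldOut`, into hitRateAtK from the sketch above.
```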
I hope you enjoyed this blog article and/or found it helpful.