We will be addressing two important issues faced by recommendation systems. First, how do you solve the cold start problem i.e., provide recommendations for new users with very limited behavior data available. Second, even if we have a recommendation list for new users, how do we prevent ourselves from presenting the same recommendation list repeatedly.
I will go over the details of how both of these problems have been solved in sifarish, my OSS recommendation engine. We calculate certain statistical parameters from user engagement historical signals and compute popularity for an item by combining those statistical parameters.
The second issue has to do with what is known as “above the fold issue”. When presented with a long list, typically users will scan only the top few items from the list. Rarely will they scroll down a long list or navigate to another page. If the same popular item list is presented, it’s effectiveness goes down as time goes. We will use a technique called dithering to alter the list little bit, every time it’s presented to the user.
It’s been reported that dithering is very significant in getting higher click through with recommended items. Recommendation algorithms, no matter how complex and smart are not always likely to accurately predict what an user may be interested in because of inherent idiosyncrasies in us. Dithering essentially allows experimentation with the recommendation results.
How do you define popularity for an item. Is it the number of users who bought an item, or the numbers of users who liked it. My answer is all of them and may be more. In sifarish we first calculate the implicit rating from the user engagement historical data and then we calculate the following quantities form implicit rating data.
|Number of users||Reflects the visibility of an item|
|Rating mean||Reflects the likability of an item|
|Rating median||Reflects the likability of an item. It’s outlier proof|
|Rating std deviation||Indicates how unanimous or stable the likability is|
Finally, we define popularity as an weighted average of the all 4 quantities. The relative weights are user configurable.
Popularity Computed with Hadoop
Popularity score is computed by running historical user engagement data through a sequence of Hadoop MR jobs as listed below. The output of one MR goes as input for the next MR in the sequence.
|User engagement signal||ImplicitRatingEstimator which converts signal stream to implicit rating||Estimated rating as (userID,itemID,rating,timestamp)|
|Implicit rating||ItemRatingStat which processes implicit rating to generate stats||Output as (itemID,rating mean,rating median,rating std dev, user count)|
|Rating statistics||Normalizer normalizes the rating stats to same scale||Output as (itemID,rating mean,rating median,rating std dev, user count)|
|Normalizer||WeightedAverage which combines rating stats to generate popularity score||Popularity score as (itemID,popularity score)|
More details on the implicit rating generator MR can be found in my earlier post. The Normalizer MR normalizes all attributes to the same scale.There are two choices for normalizing : minmax and zscore. If zscore based normalization is outlier proof. Since we are dealing with processed statistical parameters, there is no outlier and we have used the minmax based normalization.
The WeightedAverage MR is from my OSS project chombo. It’s general purpose MR that calculates weighted average of all the attribute values, in this case all four statistical parameters.
Here is some sample output for popularity score showing the top of the list. Each line contains itemID, followed by score.
GPV18YKCW4,73 3JN326M9T8,57 82C202EYG2,55 A52II7KG4I,55 1XELPDZF5M,55 82MYO3C6R0,53 52SZVJF1GZ,52
Even though the popular item scores may be recomputed periodically e.g. once a week, during the interim period we are forced to present the same list to the new users until it’s recomputed. When presented with a long list, users typically don’t scan beyond the first few items. It’s a well known problem in search. In case of search an user will rather change the search query than navigate to the next page.
Dithering solves the problem by introducing some random noise into the popularity score so that the list is reordered. It’s done in such a way, that it’s effect decreases as we go towards the top of the list. So the top items are more likely to retain their positions in the list. This process is like making something worse to get better results.
In sifarish, the dithering process computes new score for an item by taking log of existing score and adding a gaussian noise as below.
The parameter stdDev is configurable. As it’s value goes up, randomness goes up and the item list gets more shuffled. After the new scores are computed for each item, the item list is sorted by rating
Dithering in Real Time with Storm
Every time the popular items are to be presented to the user, the list is dithered in real time. In sifarish this done using Storm. The Storm topology involves two stage processing. A Redis spout layer reads userID from a Redis queue.
A bolt layer has the dithering logic. Bolts read the original recommendation list from a Redis cache for a given user. Why do we have user when we are dealing with global popular item list. The reason is that the same storm based solution is used for dithering personalized recommendation list. For personalized recommendation, recommendation list is generated for each user. For popular items, we use an unique ID for a fictitious user.
Every time an item list, whether popular or personal recommendation, is to be presented to the user, the application reads a already dithered item list from the Redis cache and writes the userID in a Redis queue, so that a new dithered list can be generated by the storm topology to be consumed by the next application thread.
The bolt dithering process reads original items list form a Redis Cache and computes new scores for each item in the list based on dithering logic and then sorts the list by rating. It finally writes the new list to Redis cache with userID as the cache key.
Here is some sample output. Comparing with the original list above, we can see that the top item has retained it’s place. Dithering has shuffled items from the second place onwards. I am showing only the top 5 items here.
GPV18YKCW4,3JN326M9T8,82C202EYG2,1XELPDZF5M,82MYO3C6R0 GPV18YKCW4,3JN326M9T8,A52II7KG4I,82C202EYG2,1XELPDZF5M GPV18YKCW4,3JN326M9T8,82C202EYG2,1XELPDZF5M,3OTPEPMHV3 GPV18YKCW4,1XELPDZF5M,52SZVJF1GZ,A52II7KG4I,82MYO3C6R0 GPV18YKCW4,82C202EYG2,A52II7KG4I,3JN326M9T8,1XELPDZF5M
I have used the value of 0.015 for the std deviation parameter dither.std.dev. You could increase it to inject more randomness in the dithering process.
So far we have discussed dithering for popular item recommendation. However, as mentioned earlier, personalized recommendation list can be dithered using the same process.
Personalized recommendation list as computed by collaborative filtering algorithm is used as the source list. Here are the details of how that is done. Every time personalized recommendations are to be presented to the user, a dithered item list is obtained for the user, based on the source list.
We have gone through the details of how to apply dithering to recommendation list, which has been reported to be very effective in getting higher click through for recommended items.
I have used Storm to generate the dithered list in real time. You could use Hadoop also to precompute a set of dithered items list for each user and use them as needed. The advantage processing real time with Storm is that the computation is done as needed.
Here is a step by step tutorial document to do the all processing as described in this post including Hadoop and Storm.