One of the nagging problems in personalized recommendation systems is crowding of items with same attribute values in the recommendation list. For example, if you happen to like certain artiste, the songs by the same artiste will tend to flood recommended song list in a music recommendation system. The flooding happens because of the accuracy of the collaborative filtering algorithm. But that accuracy may not always generate the best outcome. The only criterion for success for a recommender is whether or not the the user has clicked on one or more items in the recommendation impression list.
As I mentioned in my earlier post, getting a recommendation list through Collaborative Filtering (CF) machine learning algorithms goes only part of the way towards providing an effective solution. Various post processing mechanisms are required to modify and improve the recommendation list generated by CF algorithms. Adding diversity is one such mechanism. Recently I have added attribute diffusion based diversification to my open source Personalization and Recommendation engine sifarish. I will go over the details of this feature in this post.
Diversity is a group wise property. If we consider items with set of attributes and the distance between items, then diversity between two items is simply that distance. In other words, it has an inverse relationship with similarity.
Often diversity will be at odds with accuracy of the CF algorithm. A compromise between the two forces is necessary. We will consider the following two approaches for introducing diversity into recommendation, although there are other algorithmic approaches available.
- Content Diversification : All the attributes of an item are considered and a dynamic programming based approach is used to maximize the inter items diversity while maintaining the original rank order of the items to generate a new list
- Attribute Diffusion : Only a subset of attributes id considered. the algorithm modifies the CF generated list such that items with same values for the attribute subset has some minimum rank distance in the newly generated list.
So far, I have implemented only the Attribute Diffusion based diversification and our discussion will focus on that only. In future, when I implement the Content Diversification, I will be back with a separate post.
Because it is more common to add diversity based selected set of attributes, rather than all attributes, I have decided to implement the attribute diffusion based diversification first.
Given a set of n items recommended for an user, we want to reorder the list such that rank is as close as possible to the original list while average inter item diversity is as as high possible. This is a very complex optimization problem. A dynamic programming based approach is taken for the solution, which may be sub optimal. For all recommended items (R) for an user, the algorithm steps are as follows.
- Select the top ranked item from the original list R and put that in the list S
- Sort the remaining items in R as per a score which is a function of the original rank and the item’s average distance to all items in S. Call this list Q.
- Remove the top ranked item in Q from R and add it to S
- Repeat 2 and 3 until R is exhausted or S reaches it’s maximum size
As we process the original list R, the diversity of an item to existing items in S will decrease. It’s better to set a maximum size of S and make that less than the size of R. For example, if the number of recommended items (R) for an user is 300, you might set the maximum size of items in the list S to be 100.
This is a much simpler approach, but very intuitive nonetheless. The diversification is done based on a selected set of attributes. A minimum rank distance is maintained between items with same attribute values.
For example, you might require that in a movie recommendation, no two movies with the same director will be closer than 5 rank distance i.e., there will be at least 4 movies with different directors in between movies with the same director in the new list. Here some examples for attributes from different problem domains.
- Artiste for a song
- Director for a movie
- Category and brand for a product.
The algorithm is also based on dynamic programming, building the new list in an incremental fashion. For the recommendation list of an user, the steps are as follows.As before we start with the original recommendation list R and build a new list S.
- Build a map (M) of item list keyed with attribute values. Each list is sorted by the recommendation rank
- For each attribute value, pick the item with the highest rank. Sort this list by recommendation rank (Q).
- Iterate through Q until we find an item that meets the minimum rank distance requirement in S. Remove the item from M and add to S
- Repeat 2 and 3 until M is exhausted or S has reached it’s maximum size.
As in the first solution, diversity quality will deteriorate as we build the new list and approach the tail end of the list. Since the quality of the result will be better towards the beginning of the list, it’s better to set maximum limit on the size of S. So the new list S is based on reordering of the original list R and it’s truncated.
User and Item Attribute Aggregation Map Reduce
The goal of the map reduce job ItemRatingAttributeAggregator is to link an item along with it’s attributes with the user, for whom this item has been recommended. It essentially prepares the ground for the second map reduce job which does all the heavy lifting for the attribute diffusion algorithm. It operates on the following two kinds of input
- Predicted rating (userID, itemID, predictedRating) as generated by CF
- Item attributes (itemID, attribute1, attribute2,…) as extracted from a product or some other database
The number of attributes used in the second input data set is decided by the user. The mapper uses secondary sorting and emits output with itemID as the base key. The reducer generates output as (userID, itemID, attribute1, attribute2, ..).
Attribute Diffusion Based Diversification Map Reduce
The map reduce job AttributeBasedDiversifier implements the attribute diffusion algorithm described earlier. The mapper takes two data sets as below.
- Predicted rating as (userID, itemID, predictedRating) as generated by CF
- User item attributes (userID, itemID, attribute1, attribute2, ..) as generated by the first map reduce
The mapper uses secondary sorting and generated two kinds of values for each key, which is the userID. The first set contains the all the items along with the predicted rating. The second set contains all the items along with their attributes. The reducer implements the attribute diffusion algorithm discussed earlier.
The final output format same as the basic output from CF i.e., (userID, itemID, predictedRating), except that the list for a given user has been re-ordered.
The following output from Hadoop counters provides some insight into the attribute diffusion algorithm.
15/01/19 17:55:35 INFO mapred.JobClient: Reordering 15/01/19 17:55:35 INFO mapred.JobClient: min rank distance not satisfied=521 15/01/19 17:55:35 INFO mapred.JobClient: min rank distance satisfied=17258 15/01/19 17:55:35 INFO mapred.JobClient: first occurence of this attr=3895
The first counter value tells us that in 521 cases the minimum rank distance requirement could not be met with all the top rated items for different attribute values. In that case we simply select the with item with the maximum rank distance.
We also notice that in 17258 cases, the minimum rank distance requirement could be met. The third counter just gives the number of cases when an items with some attribute value appears first in the list.
Let’s take look at a partial list recommended items for some user. As we can see, the items are not strictly in descending order of the predicted rating. These deviations are caused by the diversity injecting logic. These deviations will get worse as we scroll down the list
105WD8TGQA47G,MKPM48V1Z1,224 105WD8TGQA47G,025E0B6DVQ,169 105WD8TGQA47G,YDL4O1XB85,164 105WD8TGQA47G,J60OH9PJO0,164 105WD8TGQA47G,3370ZWVOZL,156 105WD8TGQA47G,WGSXWO4OUT,203 105WD8TGQA47G,MQ1K7A175M,142 105WD8TGQA47G,2IQYSBYQ45,160
We have gone through an attribute diffusion based diversification process for collaborative filtering generated recommendation list. Section 13 of the tutorial document has all the execution steps for this use case.
Personalized recommendation is one of the rare cases where the accuracy of the machine learning algorithms may turn out to be a curse. Diversification is one of the ways to combat the problem so that the recommendation list does not appear too obvious.
For any one interested to dig deeper into novelty and diversity in recommendation system, here is a paper worth reading.