Diversity in Personalization with Attribute Diffusion

One of the nagging problems in  personalized recommendation systems is crowding of items with same attribute values in the recommendation list. For example, if you happen to like certain artiste, the songs by the same artiste will tend to flood recommended song list in a music recommendation system. The flooding happens because of  the accuracy of the collaborative filtering algorithm. But that accuracy may not always generate the best outcome. The only criterion for success for a recommender  is whether or not the the user has clicked on one or more items in the recommendation impression list.

As I mentioned in my earlier post, getting a recommendation list through Collaborative Filtering (CF) machine learning algorithms goes only part of the way towards providing an effective solution. Various post processing mechanisms are required to modify and improve the recommendation list generated by CF algorithms. Adding diversity is one such mechanism.  Recently I have added attribute diffusion based diversification to my open source Personalization and Recommendation engine sifarish. I will go over the details of this feature in this post.


Diversity is a group wise property. If we consider items with set of attributes and the distance between items, then diversity between two items is simply that distance. In other words, it has an inverse relationship with similarity.

Often diversity will be at odds with accuracy of the CF algorithm. A compromise between the two forces is necessary. We will consider the following two approaches for introducing diversity into recommendation, although there are other algorithmic approaches available.

  1. Content Diversification : All the attributes of an item are considered and a dynamic programming based approach is used to maximize the inter items diversity while maintaining the original rank order of the items to generate a new  list
  2. Attribute Diffusion : Only a subset of attributes id considered. the algorithm modifies the CF generated list such that items with same values for the attribute subset has some minimum rank distance in the newly generated list.

So far, I have implemented only the Attribute Diffusion based diversification and our discussion will focus on that only. In future, when I implement the  Content Diversification, I will be back with a separate post.

Because it is more common to add diversity based selected set of attributes, rather than all attributes, I have decided to implement the attribute diffusion based diversification first.

Content Diversification

Given a set of n items recommended for an user, we want to reorder the list such that  rank is as close as possible to the original list while average inter item diversity is as as high possible. This is a very complex optimization problem. A dynamic programming based approach is taken for the solution, which may be sub optimal.  For all recommended items (R) for an user, the algorithm steps are as follows.

  1. Select the top ranked item from the original list R and put that in the list S
  2. Sort the remaining items in R as per a score which is a function of the original rank and the item’s average distance to all items in S. Call this list Q.
  3. Remove the top ranked  item in Q from R and add it to S
  4. Repeat 2 and 3 until  R is exhausted or S reaches it’s maximum size

As we process the original list R, the diversity of an item to existing items in S will decrease. It’s better to set a maximum size of S and make that less than the size of R.  For example, if the number of recommended items (R) for an user is 300, you might set the maximum size of items in the list S to be 100.

Attribute Diffusion

This is a much simpler approach, but very intuitive nonetheless. The diversification is done based on a selected set of attributes. A minimum rank distance is maintained between items with same attribute values.

For example, you might require that in a movie recommendation, no two movies with the same director will be closer than 5 rank distance i.e., there will be at least 4 movies with different directors in between movies with the same director in the new list. Here some examples for attributes from different problem domains.

  • Artiste for a song
  • Director for a movie
  • Category and brand for a product.

The algorithm is also based on dynamic programming, building the new list in an incremental fashion. For the recommendation list of an user, the steps are as follows.As before we start with the original recommendation list R and build a new list S.

  1. Build a map (M) of item list keyed with attribute values. Each list is sorted by the recommendation rank
  2. For each attribute value, pick the item with the highest rank. Sort this list by recommendation rank (Q). 
  3. Iterate through Q until we find an item that meets the minimum rank distance requirement in S. Remove the item from M and add to S
  4. Repeat 2 and 3 until M is exhausted or S has reached it’s  maximum size.

As in the first solution, diversity quality will deteriorate as we build the new list and approach the tail end of the list. Since the quality of the result will be better towards the beginning of the list, it’s better to set maximum limit on the size of S. So the new list S is based on reordering of the original list R and it’s truncated.

 User and Item Attribute Aggregation Map Reduce

The goal of the map reduce job  ItemRatingAttributeAggregator is to link an item along with it’s attributes  with the user, for whom this item has been recommended. It essentially prepares the ground for the second map reduce job which does all the heavy lifting for the attribute diffusion algorithm. It operates on the following two kinds of input

  1. Predicted rating  (userID, itemID, predictedRating) as generated by CF
  2. Item attributes  (itemID, attribute1, attribute2,…) as extracted from a product or some other  database

The number of attributes used in the second input data set is decided by the user. The mapper uses secondary sorting and emits output with itemID as the base key. The reducer generates output as (userID, itemID, attribute1, attribute2, ..).

Attribute Diffusion Based Diversification Map Reduce

The map reduce job AttributeBasedDiversifier implements the attribute diffusion algorithm described earlier. The mapper takes two data sets as below.

  1. Predicted rating as (userID, itemID, predictedRating) as generated by CF
  2. User item attributes (userID, itemID, attribute1, attribute2, ..) as generated by the first map reduce

The mapper uses secondary sorting and generated two kinds of values for each key, which is the userID. The first set contains the all the items along with the predicted rating. The second set contains all the items along with their attributes. The reducer implements the attribute diffusion algorithm discussed earlier.

The final output format same as the basic output from CF i.e., (userID, itemID, predictedRating), except that the  list for a given user has been re-ordered.


The following output from Hadoop counters provides some insight into the attribute diffusion algorithm.

15/01/19 17:55:35 INFO mapred.JobClient:   Reordering
15/01/19 17:55:35 INFO mapred.JobClient:     min rank distance not satisfied=521
15/01/19 17:55:35 INFO mapred.JobClient:     min rank distance satisfied=17258
15/01/19 17:55:35 INFO mapred.JobClient:     first occurence of this attr=3895

The first counter value tells us that in 521 cases the minimum rank distance requirement could not be met with all the top rated items for different attribute values. In that case we simply select the with item with the maximum rank distance.

We also notice that in 17258 cases, the minimum rank distance requirement could be met. The third counter just gives the number of cases when an items with some attribute value appears first in the list.

Let’s take look at a partial list recommended items for some user. As we can see, the items are not strictly in descending order of the predicted rating. These deviations are caused by the diversity injecting logic. These deviations will get worse as we scroll down the list


Summing Up

We have gone through an attribute diffusion based diversification process for collaborative filtering generated recommendation list. Section 13 of the tutorial document has all the execution steps for this use case.

Personalized recommendation is one of the rare cases where the accuracy of the machine learning algorithms may turn out to be a curse. Diversification is one of the ways to combat the problem so that the recommendation list does not appear too obvious.

For any one interested to dig deeper into novelty and diversity in recommendation system, here is a paper worth reading.


About Pranab

I am Pranab Ghosh, a software professional in the San Francisco Bay area. I manipulate bits and bytes for the good of living beings and the planet. I have worked with myriad of technologies and platforms in various business domains for early stage startups, large corporations and anything in between. I am an active blogger and open source project owner. I am passionate about technology and green and sustainable living. My technical interest areas are Big Data, Distributed Processing, NOSQL databases, Machine Learning and Programming languages. I am fascinated by problems that don't have neat closed form solution.
This entry was posted in Big Data, Data Science, Hadoop and Map Reduce, Personalization, Recommendation Engine and tagged , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s