The basic recommendation output, consisting of (user, item, predicted rating) tuples, is easy to obtain from any Collaborative Filtering (CF) based recommendation and personalization engine, including sifarish. It has been reported that applying various post processing logic to the basic CF output yields a bigger return in recommendation quality than tweaking the core machine learning algorithm. We can think of these post processing units as plugins that operate on the basic recommendation output.
Many of these have been implemented in sifarish. They are not based on rigorous machine learning algorithms, but on simple intuition and heuristics. In this post I will go through one of them that was implemented recently. The actual rating of an item by a user is considered in conjunction with the predicted rating to derive a net rating. This may cause some items to bubble up the rank order. The meaning of actual rating needs some clarification: it is actually an implicit rating derived from the explicit engagement of a user with an item.
Post Processing Plugins
Here is a list of various post processing plugins. Some of them are still under development in sifarish. They are all optional; one or more of them can be used depending on the needs. They are all promising as far as improving the results goes. However, their effectiveness can only be judged by carefully measuring recommendation performance using click through rate or some other metric.
Typically, these plugins impose requirements that are at odds with basic CF based recommendation, so we end up solving an optimization problem with conflicting requirements. The conflicting requirements are reconciled with an appropriate trade off. Typically the trade off logic uses a weighted average, with user configurable weights.
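As a minimal sketch of the weighted average trade off described above (the function and parameter names are my assumptions, not sifarish code):

```python
def reconcile(cf_rating, plugin_rating, plugin_weight=0.3):
    """Blend the base CF rating with a plugin-derived rating.

    plugin_weight is the user-configurable weight given to the
    plugin's score; the remainder goes to the CF prediction.
    (Hypothetical names; sifarish may configure this differently.)
    """
    return (1.0 - plugin_weight) * cf_rating + plugin_weight * plugin_rating

# A higher plugin_weight lets the post processing goal dominate
# the original CF prediction.
net = reconcile(80, 60, plugin_weight=0.25)
```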
| Plugin | Description |
| --- | --- |
| Business goal injection | Business goal is considered in deriving a net rating |
| Adding novelty | Exposing less exposed items, aka long tail items |
| Adding diversity | Increasing inter item diversity in the recommendation list (in progress) |
| Positive feedback driven rank reordering | Explicit positive user feedback is considered (this post) |
| Negative feedback driven rank reordering | Implicit negative user feedback is considered (in progress) |
| Dithering | Random shuffling of rank order |
These plugins can easily be applied to the basic CF output from any recommender system, e.g. Apache Mahout. The only requirement is that the output be available in HDFS as (userID, itemID, rating) tuples in record oriented text files.
To cut down on unnecessary computation, it's a good idea to truncate the recommendation list to retain only the top n items for each user before applying any of the plugins. The truncation is safe, because an item that ranks very low in the list is unlikely to bubble up to the top after the plugins are applied.
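The truncation step can be sketched as follows (an illustrative stand-in for what would be a map reduce job in practice; the function name is my own):

```python
from collections import defaultdict
import heapq

def truncate_top_n(records, n):
    """Keep only the n highest-rated items per user.

    records: iterable of (user_id, item_id, rating) tuples,
    i.e. the same shape as the basic CF output.
    """
    per_user = defaultdict(list)
    for user, item, rating in records:
        per_user[user].append((user, item, rating))
    result = []
    for recs in per_user.values():
        # nlargest keeps the top-n by rating without a full sort
        result.extend(heapq.nlargest(n, recs, key=lambda r: r[2]))
    return result
```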
Prediction Accuracy is not Necessarily the Best Thing
Personalized recommendation is an unusual case where the prediction accuracy of the machine learning algorithm (CF) is not necessarily the best outcome. Other considerations are necessary to increase the effectiveness of the results.
All the plugins listed above modify the results of CF one way or another, for pragmatic reasons, to make the final results more effective. The algorithms in the plugins are not based on machine learning, but mostly on heuristics and intuition.
Explicit User Feedback
In CF based recommendation, we predict a set of (user, item, rating) tuples, given another set of (user, item, rating) tuples, which are the training data. Multiple map reduce jobs are executed to achieve this. The training data is based on explicit actions users have taken on items.
Sometimes there may be overlap between the training set and the predicted set. There are three different scenarios, as below, for a given (user, item) pair.
- Only predicted rating
- Only actual rating
- Both predicted and actual rating
We will discuss in detail how the different scenarios are handled in obtaining a net rating and reordering the recommendation ranks. One important use case that is often not properly handled is the conversion event.
A conversion event happens when an item is fully consumed after an impression for the item has been presented to the user. The interpretation of a conversion event depends on the type of item. For a product, full consumption means a purchase. For a video, it is the complete viewing of the video. Once a conversion event has happened for a (user, item) pair, that item should be excluded from the user's future recommendation lists. Yet many recommender systems don't do that.
Strictly speaking, it's not just the particular item, but all items in that category that should be blocked from future impressions in the user's recommendation list. For example, after a user has purchased a tablet, it makes little sense to present another tablet in the recommendations, at least in the near future.
Some items are repeatedly consumed and have inherent consumption cycles. Generally a user will have renewed interest toward the end of the consumption cycle. However, that may not always be true, since it's difficult to gauge a user's intent. A user may start searching and interacting with tablets sooner than expected, because the intent may be a gift for someone. Those user activities will nonetheless be captured, and eventually tablet recommendations will show up in the user's personalized recommendation list.
Explicit Feedback Handler Map Reduce
The implementation is in the map reduce class PositiveFeedbackBasedRankReorderer. The mapper takes two sets of inputs: the predicted ratings and the actual ratings. The pair (userID, itemID) is used as the mapper key. On the reducer side, for a given (userID, itemID) we get 1. only the predicted rating, 2. only the actual rating, or 3. both ratings. The different cases are handled as below in the reducer. The reducer output format is the same as the input, i.e. the tuple (userID, itemID, rating).
- Actual rating with max value: Since the max rating value implies conversion, no record is emitted
- Only predicted rating: Emit predicted rating. This is pure recommendation. We are exposing a previously unexplored item to the user.
- Only actual rating: Emit actual rating. This case has nothing to do with recommendation. The user may have interacted with an item directly already, possibly after doing a search. We are simply capturing that and making it part of the recommendation impression list, hoping that the user will explore further or convert.
- Both predicted and actual ratings: Emit either max or weighted average of ratings. With max, if the actual rating is higher, we are simply leveraging the user’s direct interactions with an item. With weighted average, we can control the relative weights for predicted rating and actual rating.
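The reducer-side case handling above can be mimicked with a small sketch (this is an illustration of the logic as described, not sifarish source; the rating scale bound and parameter names are my assumptions):

```python
MAX_RATING = 100  # assumed upper bound of the rating scale

def net_rating(predicted, actual, use_max=True, actual_weight=0.5):
    """Return the net rating to emit for a (userID, itemID) pair,
    or None to suppress the record. predicted / actual may be None
    when that rating is absent for the pair."""
    if actual is not None and actual >= MAX_RATING:
        return None                  # conversion: drop from recommendations
    if actual is None:
        return predicted             # pure recommendation, unexplored item
    if predicted is None:
        return actual                # direct interaction only
    if use_max:
        return max(predicted, actual)
    # weighted average with configurable relative weights
    return (1.0 - actual_weight) * predicted + actual_weight * actual
```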
The only difference between the predicted rating and the actual rating is that the actual rating is based on a user's direct interaction with an item. The predicted rating is based on the user's interactions with other items and the correlation between those items and the item in question.
Here is some output based on Hadoop counters, showing the number of cases for the different scenarios listed above. In the majority of cases we have only a predicted rating, which means we are exposing a lot of previously unexplored items. The next most significant case is when we have both predicted and actual ratings.
```
14/12/22 10:13:04 INFO mapred.JobClient: Rating
14/12/22 10:13:04 INFO mapred.JobClient: Both=3991
14/12/22 10:13:04 INFO mapred.JobClient: Only Actual=1411
14/12/22 10:13:04 INFO mapred.JobClient: Actual max and converted=13
14/12/22 10:13:04 INFO mapred.JobClient: Only Predicted=17672
```
Recommendation and Search
As we saw in the previous section, recommendation and search have a symbiotic relationship. Recommendation and search are similar in many ways. The main difference is that in search, the user provides an explicit query indicative of the user's interest, although it may be ambiguous and underspecified in many cases.
With recommendation, there is no explicit query. Based on a user's past behavior, we make an assessment of the user's interests and needs and generate a recommendation list. Since it's very difficult to assess a user's intent at any given moment, recommendation results are likely to be far more ambiguous than explicit search results. A user's past behavior is not necessarily a good indicator of the user's current intent. That is the core issue with recommendation.
In Reinforcement Learning parlance, recommendation facilitates more exploration: it exposes unknown unknowns. Search tends to be more exploitation driven, since the user generally has a specific goal in mind.
Implicit Negative Feedback
Once an item achieves a high predicted rating, it will bubble up and appear in the impression list. However, there is no guarantee that the user will click on it. In recommendation, we don't know the user's true intent and interest. If there is a lack of interest in an item in the impression list, it should start dropping in the rank order.
A user's implicit negative feedback could be measured, for example, in terms of the number of consecutive impressions of an item that the user has not clicked on. This negative feedback could be used to lower the item's rank order, so that it sinks down and eventually disappears from the impression list. This will be the topic of a future post.
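One simple way such a negative feedback signal could work is multiplicative decay per ignored impression (purely hypothetical, since this plugin is still in progress; the function and its parameters are my assumptions):

```python
def decayed_rating(rating, unclicked_impressions, decay=0.9):
    """Attenuate an item's rating by the number of consecutive
    impressions the user has ignored. Each ignored impression
    multiplies the rating by the decay factor, so the item sinks
    in rank and eventually drops out of the impression list."""
    return rating * (decay ** unclicked_impressions)
```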
In this post I have shown one way of improving recommendation results using explicit user feedback. In the process, we have seen how recommendation and search cross each other's paths. I have updated the CF based recommendation tutorial document for this use case.