You may be wondering about the relationship I alluded to in the title. A personalization and recommendation system like sifarish bootstraps from user and item engagement data. This kind of data is gleaned from various signals, e.g., a user’s engagement with various items or a user’s explicit rating of an item. A recommendation system could also benefit significantly from customer service data residing in Customer Relationship Management (CRM) systems. It’s yet another customer item engagement signal.
In this post, the focus will be on extracting customer engagement data from a CRM system and on how it can be combined with other customer item engagement signals to define a hybrid rating for any user item pair. The hybrid rating goes as input to the Collaborative Filtering based Map Reduce workflow in sifarish.
Implicit Signal from Customer Engagement
A user interacts with an item or product in various ways. In an eCommerce setting, the interactions could be
- Product browsing
- Reading product review
- Placing an item in a shopping cart
- Purchasing an item.
Taking another example, for a music download site, some of the user engagement events could be as follows.
- Reading review
- Listening to a song
- Purchasing and downloading a song.
Some events are tagged as conversion events, e.g., purchasing an item on an eCommerce site or purchasing and downloading a song on a music download site. There is a way to convert these events to an implicit rating in sifarish. The details of the solution can be found in this earlier post.
The event mapping solution is based on tracking the different events that happen between a user and an item, and how many times each one happens. All the event data is converted to an implicit rating for a user, item pair based on heuristics, producing output of the form (user, item, rating, timestamp).
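To make the idea concrete, here is a minimal sketch of an event to rating conversion. It is not the heuristic sifarish uses. The event names, the strength values in EVENT_STRENGTH, the 0.5 boost per repeated event and the function name implicit_ratings are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical event strengths on a 1 to 5 scale (illustrative values only).
EVENT_STRENGTH = {"browse": 1, "review": 2, "cart": 3, "purchase": 5}


def implicit_ratings(events):
    """Convert (user, item, event_type, timestamp) events into
    (user, item, rating, timestamp) tuples."""
    counts = Counter()
    latest = {}
    for user, item, event_type, ts in events:
        counts[(user, item, event_type)] += 1
        key = (user, item)
        # Keep the most recent event time as the rating timestamp.
        latest[key] = max(latest.get(key, ts), ts)

    best = {}
    for (user, item, event_type), n in counts.items():
        # Assumed heuristic: base strength plus 0.5 per repeat, capped at 5.
        score = min(EVENT_STRENGTH[event_type] + 0.5 * (n - 1), 5.0)
        key = (user, item)
        # The strongest event type seen for the pair decides the rating.
        best[key] = max(best.get(key, 0.0), score)

    return [(u, i, best[(u, i)], latest[(u, i)]) for (u, i) in sorted(best)]
```

The output has the same (user, item, rating, timestamp) shape as the other signals, which is what allows them to be blended later.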
Explicit Signal from Rating
Explicit signals correspond to the explicit rating of an item by a user after the conversion event has happened, e.g., a user may rate a product after purchasing it from an eCommerce site. Customers generally don’t rate explicitly, so this data is scarce.
Generally, we don’t place a lot of trust in explicit ratings. It’s been reported that they tend to be extreme. In other words, people tend to explicitly rate an item when they have a very high or very low opinion of it. When available, the data is also of the form (user, item, rating, timestamp).
Signal from Customer Service
This is another source of user item engagement signal. A user may contact customer service after the conversion event. It’s most likely to happen when the product didn’t meet the customer’s expectation.
The signal could be explicit, in the form of (user, item, rating, timestamp), or implicit, in the form of notes taken by the customer service representative or other information filled in during the session. Explicit data will be available only when the customer service application has a feature that allows the customer service representative to rate an interaction session with a customer with respect to a product.
If implicit, it needs to be converted to an explicit form. This process could be challenging, as it may involve text mining customer service notes. The rating is a measure of a customer’s satisfaction with a product, as assessed by the customer service representative.
Combining All Signals
A new Hadoop Map Reduce implementation has been added to sifarish to combine the various rating signals. All the signals are of the format (user, item, rating, timestamp). Of the three kinds of signals, the implicit rating signal is always present. The other two may or may not be there.
In the Map Reduce implementation in sifarish, secondary sorting is used to segregate the data into three different sets, corresponding to the three different kinds of signals. The tuple (userID, itemID) serves as the mapper key. On the reducer side, the rating values are aggregated in different ways depending on the configuration. The aggregation strategy is chosen through the configuration parameter explicit.rating.override.
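The grouping step can be imitated in memory with a sort followed by a group-by. The sketch below is plain Python, not the sifarish Hadoop code. The signal tags in SIGNAL_ORDER and the function name group_signals are hypothetical, and each record is assumed to carry a tag saying which source it came from.

```python
from itertools import groupby

# Hypothetical signal tags; the sort order plays the role of the secondary sort.
SIGNAL_ORDER = {"implicit": 0, "explicit": 1, "service": 2}


def group_signals(records):
    """records: iterable of (user, item, rating, timestamp, signal_type).

    Yields ((user, item), buckets), where buckets maps each signal type
    to its list of (rating, timestamp) values for that pair.
    """
    # Sort by the (user, item) key first, then by signal type within the key.
    ordered = sorted(records, key=lambda r: (r[0], r[1], SIGNAL_ORDER[r[4]]))
    for key, group in groupby(ordered, key=lambda r: (r[0], r[1])):
        buckets = {name: [] for name in SIGNAL_ORDER}
        for _, _, rating, ts, signal in group:
            buckets[signal].append((rating, ts))
        yield key, buckets
```

Each yielded pair corresponds to what a reducer would see for one (userID, itemID) key: the ratings already separated by signal type and ready to be aggregated.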
With weighted aggregation, we take a weighted average of the rating values. The weights are provided through the configuration parameter rating.weights. You may decide that, irrespective of what happens post conversion, it’s the behavior data prior to conversion that matters most, and give more weight to the implicit rating.
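As a rough illustration of weighted aggregation, the sketch below averages whichever signals are present for a user, item pair. The function name weighted_rating, the dictionary layout and the example weights are assumptions. Renormalizing the weights over the signals that are actually present is also my assumption, not something confirmed for sifarish.

```python
def weighted_rating(ratings, weights):
    """Weighted average over the signals that are present.

    ratings: dict mapping signal name to a rating, or None when missing.
    weights: dict mapping signal name to a non-negative weight.
    """
    present = {name: value for name, value in ratings.items() if value is not None}
    # Assumption: weights are renormalized over the signals actually present.
    total = sum(weights[name] for name in present)
    return sum(weights[name] * value for name, value in present.items()) / total


# Example weights that favor pre-conversion behavior (illustrative values).
weights = {"implicit": 0.5, "explicit": 0.3, "service": 0.2}
blended = weighted_rating({"implicit": 4, "explicit": 5, "service": 2}, weights)
```

With these weights, a pair that has only an implicit rating keeps that rating unchanged, since the single weight divides out.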
With time stamp based rating override, whichever of the customer explicit rating and the customer service rating happened last prevails and gets selected. For example, if a customer has purchased a product, explicitly rated the product, and then after some time has elapsed called customer service about some issue, the customer service rating will prevail.
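A time stamp based override could be sketched as follows. The function name latest_override and the (rating, timestamp) tuple layout are hypothetical. Falling back to the implicit rating when neither post-conversion signal exists is an assumption on my part.

```python
def latest_override(implicit, explicit=None, service=None):
    """Each argument is a (rating, timestamp) tuple, or None when missing.

    Returns the rating of whichever post-conversion signal came last.
    """
    candidates = [signal for signal in (explicit, service) if signal is not None]
    if not candidates:
        # Assumption: with no post-conversion signal, keep the implicit rating.
        return implicit[0]
    # The signal with the largest timestamp prevails.
    return max(candidates, key=lambda signal: signal[1])[0]
```

For the scenario above, an explicit rating of 4 at time 200 followed by a customer service rating of 2 at time 300 yields 2.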
With specific rating override, one of the customer explicit rating and the customer service rating is chosen to supersede the others. You may decide that the customer service rating is the final arbiter when deciding the rating for a user and product.
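A specific rating override might look like the sketch below. The function name specific_override and the preferred argument are hypothetical and do not reflect the actual configuration values in sifarish. Using the implicit rating when the preferred signal is missing is also an assumption.

```python
def specific_override(implicit, explicit=None, service=None, preferred="service"):
    """Each signal is a (rating, timestamp) tuple, or None when missing.

    preferred names the post-conversion signal that supersedes the others.
    """
    chosen = {"explicit": explicit, "service": service}[preferred]
    if chosen is not None:
        return chosen[0]
    # Assumption: without the preferred signal, keep the implicit rating.
    return implicit[0]
```

Unlike the time stamp based variant, the order in which the ratings arrived plays no role here. Only the configured source matters.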
Using appropriate user engagement signals from various sources is critical for an effective recommendation system. In this post, we have gone through the steps for incorporating the signals from a customer service application into the process of defining a hybrid rating for a user and item.
Depending on the problem domain, there could be other relevant signals that should be taken into account.
Section 2 of the tutorial document has been updated with instructions for generating blended rating data as outlined in this post.