What does email marketing have to do with Markov model? Let’s explore and find out. Any consumer of product and services has a natural rhythm to his or her purchase history. Regular customers tend to visit a business according to some temporal patterns inherent in their buying history of products and services.
Wouldn’t you expect to get better results if your email marketing or any other marketing for that matter is aligned with the temporal pattern inherent in a customer’s purchase behavior. In practical terms, it means that your marketing email will go out at a time, predicted to be optimal by a predictive model. Effectively, your marketing campaign is being customized and timed appropriately based a customer’s past purchase history.
Well it sounds good, but what predictive modeling technique that can do all these. One answer is the Markov model. In this post, I will show how you to to build a Markov Model based on past customer transactions using Hadoop. This is part of my open source Hadoop driven data mining project avenir.
This prediction can be leveraged in many ways e.g., customizing content when an user lands on your site, customized email marketing. If we also track products being purchased along with the transaction, then we can use the model for recommending products for time delayed cross selling.
You can also think of it as a probabilistic state machine. With this model, we can predict the most likely next state, given the current state. One common use of Markov model in the security domain is intrusion detection.
Ideally, in any sequence based predictive model, the current state should be considered to be dependent on all past states. However, in Markov model we assume that the current state depends only on the previous state. It’s as if the system has only short term memory and no long term memory. In spite of this assumption, Markov model has been found to be effective.
For our example, the state consists of a tuple with the following elements. By defining the state this way, it embodies the last two transactions.
- Time since last transaction
- Whether the amount spent is less that, same as or greater than the last transaction
Once we have the probabilistic state transition matrix based on past transactions of all customers, we will be able to predict, in a probabilistic sense, the time for the next purchase and whether the amount that will spent will be less than , more or less same or greater that the last transaction for a given customer. Since our use case is about email marketing, the marketing email will be sent out around the predicted time.
Sequence Generating Map Reduce
Let’s start with our input. Our input of customer transaction consists of the following fields. The Transaction ID field, although present in the input is not used for any processing.
- Customer ID
- Transaction ID
Here is some sample input.
U3N10FWOIE,1365990122,2012-01-01,90 Y3AA3LPV93,1365990123,2012-01-02,127 83OJI5G1MQ,1365990124,2012-01-02,84 Z31AA4A50V,1365990125,2012-01-02,129 JWFHFR4149,1365990126,2012-01-02,162 B4YAHPKAN1,1365990127,2012-01-02,146 G9AYL4Y8YB,1365990128,2012-01-02,171 FOE2HD1SV2,1365990129,2012-01-02,129
The sequence data generated is consumed by the another map reduce that computes the Markov state transition matrix. The map reduce logic is in the class Projection, which implements simple projection, grouping and order by logic as in SQL.
For our data, we are grouping by customer ID and ordering by date. The ordering is achieved by map reduce secondary sort. The output of this map reduce consists of customer ID followed by date sorted list of transactions as shown below. The second field in the input which is transaction ID is ignored.
00VVD1E210,2012-06-18,87 00W6TWFW4S,2012-03-24,22,2012-05-22,80,2012-06-15,33 00W86Y0GFT,2012-02-15,141,2012-03-10,30,2012-03-25,49,2012-05-17,107 00W92K8A1W,2012-04-19,25 00W9W3Y3XH,2012-03-25,123 00XL1QERUO,2012-01-07,81,2012-05-10,154 00XPR1XW1P,2012-04-26,103 00Y1B0Y4CO,2012-03-10,81 00YR97DWWO,2012-07-15,118 00Z5SOHKED,2012-01-28,43,2012-02-25,27 00ZLLMHKND,2012-02-21,185,2012-04-02,63,2012-04-03,30
The next thing we do is little bit of processing of this data to transform a consecutive pair of date and amount into symbol which stands for a state. I have used a script for this purpose. A state is represented by a two letter symbol. Letters are shown below.
|Time elapsed since last transaction||Amount spent compared to previous transaction|
|S : small||L : significantly less than|
|M : medium||E : more or less same|
|L : large||G : significantly greater than|
We have 9 states in our system. For example, the state SL implies that the time elapsed between the last transaction and the one before was short and the customer spent less compared to the transaction before. Here is some sample output.
0B1T50K451,SG 0B1US1Z2OX,SL,SG,LL,SG 0B1WPSU3Q0,MG,LL 0B2L0EOA2I,LL,SG 0B3OQ1WW24,SG,LL,LG 0B4G8PNS7N,LL,SG
Markov State Transition Map Reduce
The implementation of this map reduce is in the class MarkovStateTransitionModel. Essentially we have to count the number of instances of state transitions. Since there are 9 states, there are 81 possible state transitions.
The mapper goes through the state sequence for each customer and emits a count of 1 for each from state, to state pair. This pair is essentially the mapper key. On the reducer side, we simply update the count for each from state, to state pair. In the cleanup method after the reducer has processed all the the data, we normalize and scale the state transition probability matrix and generate the output.
Only one reducer gets used, since we want all mapper output to be aggregated in one reducer. If multiple reducers are used, then each reducer will generate a partial state transition matrix, all of which will need to be consolidated into the final matrix. For simplicity I am using one reducer.
Here is the output for 9 x 9 state transition probability matrix.
4,1,363,246,3,7,366,3,3 250,0,373,0,0,0,377,0,0 377,1,3,251,0,0,366,0,0 0,0,407,0,0,299,293,0,0 142,0,282,0,0,288,285,0,0 9,0,396,285,0,0,307,0,0 0,0,541,0,0,295,0,0,162 38,0,500,0,0,192,38,38,192 19,0,519,0,0,321,139,0,0
Predicting Time of Future Transaction
To use the model, we take the state transition matrix and apply it to some recent customer transactions and predict next likely date of transaction for the customers. Here is the output. This processing is done by a script. The output consists of customer ID and a date.
Y47G4MC20U, 2013-04-28 KNJUW2EYQH, 2013-02-05 T0WALRZ3Q6, 2013-02-08 E4L70D2UAD, 2013-02-07 IS10DE9BQF, 2013-01-27 MJFYX65370, 2013-02-04 R21E4FOYGA, 2013-02-11
Hidden Markov Model
There is another variation of Markov Model called Hidden Markov Model (HMM). So far, in our example the states are functions of customer transaction i.e., something observable. Sometimes there are states that are not visible, but only some observations are visible as manifestation of internal states. In such cases, HMM is the right model to use.
We could take our current problem to the next level using HMM. We could define different levels of customer loyalty and they could represent the hidden states in an HMM. Customer transactions are manifestations of these hidden states. Using HMM based modelling we could predict current customer loyalty level as well as loyalty level trajectory based on customer transaction history.
One use of this model is to detect customers with decreasing loyalty. Once detected, some proactive actions could be taken to retain those customers. This will be a topic of a later post.
Markov model is a powerful tool for building predictive model based on sequence data, including time driven sequence data. There are many use cases in various domains.
If you want to try the example in this post, here is a step by step tutorial document to help. In addition to MR, you have to run several ruby scripts, as described in the tutorial.