Making recommendations based on an user’s current behavior in a small time window is a powerful feature that has been added to sifarish recently. In this post I will go over the details of this feature. The real time feature has been added for social collaborative filtering based recommendations.
In our solution, although Storm is used for processing real time user engagement event stream to find recommended items, Hadoop does lot of heavy lifting by computing the item correlation matrix from historical user engagement event data. Redis has been used Continue reading
Nobody likes hospital readmission soon after discharge, whether it’s the patient or the insurance company. Predictive analytic techniques have been used to predict the likelihood of hospital readmission, using the various medical, personal and demographic input or feature attributes. However, some problems including the one we are discussing has a very large input feature set.
Before we plow ahead with building a predictive model, it’s important to pause and ask ourselves what features are really important, especially with a problem like this with a very large set of input features.
Machine learning algorithms generally work better if the dimensionality i.e. the number of feature attributes is lowered. One of the techniques for lowering the dimensionality is to select a subset of the original feature set, known as feature subset selection. Continue reading
The WordPress.com stats helper monkeys prepared a 2013 annual report for this blog.
Here’s an excerpt:
The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 57,000 times in 2013. If it were a concert at Sydney Opera House, it would take about 21 sold-out performances for that many people to see it.
Click here to see the complete report.
When I go to a web site for for downloading white paper or product data sheet, I often hit the back button if presented with a form asking for lots of personal data. Any user that bounces out, is a potential loss of a lead. Perhaps a redesigned page, asking for fewer personal details, would have circumvented the problem.
In this post we will work on a solution using reinforcement learning algorithms. We will have multiple candidate page designs and have a reinforcement learning algorithm find the optimum page with highest click through rate. The algorithm will run real time deployed on Storm Continue reading
My earlier post was about storing nested objects modeled with composite key in Cassandra. Well, we need to be able to read the data back as objects and that’s the topic for this post. This post will focus on rest of the object story. This is part of my open source project agiato.
Mapping Object to Composite Columns
As described in the earlier post, here are some of the salient features of mapping between an object and Cassandra column family. The mapping logic does not use column family meta data. Instead it relies on introspection of the object passed. Continue reading
Research has shown that customers who have abandoned shopping carts, when subjected to retargeting email campaign, often come back and in many cases end up buying more than what was originally in the shopping cart.
There are many attributes of such email campaigns. In this post, we will find the attribute values that produce the maximum effectiveness for such retargeting campaigns, by including some of those attributes. A Hadoop based decision tree algorithm will be used to mine existing retargeting campaign data. Continue reading