Category Archives: Data Science

Gaining Insight by Mining Simple Rules from Customer Service Call Data

Although the goal of most predictive analytics problems is to make predictions, sometimes we are more interested in the model learnt by the learning algorithm. If the learnt model could be expressed as a set of rules, then those rules … Continue reading

Posted in Big Data, Data Science, Hadoop and Map Reduce, Machine Learning, Rule Mining

Supplier Fulfillment Forecasting with Continuous Time Markov Chain using Spark

In a supply chain, quantities ordered from a downstream supplier or manufacturer are not always completely fulfilled, because of various factors. If the extent of under-fulfillment could be predicted over a time horizon, then the shortfall items … Continue reading

Posted in Big Data, Data Science, Machine Learning, Scala, Spark

Big Data System Design with Bayesian Optimization

Designing a complex Big Data system with a myriad of parameters and design choices is a daunting task. It’s almost a black art. Typically we stay with the default parameter settings, unless they fail to meet our requirements, which forces us to venture out … Continue reading

Posted in Big Data, Cluster Computation, Data Science, Optimization | 1 Comment

Customer Segmentation Based on Online Behavior using ScikitLearn

Customer segmentation, or clustering, is useful in various ways. It could be used for targeted marketing. Sometimes, when building a predictive model, it’s more effective to cluster the data and build a separate predictive model for each cluster. In this post, … Continue reading

Posted in Data Mining, Data Science, Machine Learning | 2 Comments

Inventory Forecasting with Markov Chain Monte Carlo

Sometimes you want to calculate statistics about a variable that has a complex, possibly nonlinear relationship with another variable whose probability distribution is available but may be non-standard or non-parametric. That’s the situation we face when trying to predict … Continue reading

Posted in Data Science, Machine Learning, Optimization, Python, Simulation | 1 Comment

Customer Churn Prediction with SVM using Scikit-Learn

Support Vector Machine (SVM) is unique among supervised machine learning algorithms in the sense that it focuses on the training data points along the separating hyperplanes. In this post, I will go over the details of how I have … Continue reading

Posted in Data Science, Machine Learning, Predictive Analytic, Python | 2 Comments

Is Neural Network Better Off with Big Data

How does a neural network, or for that matter any machine learning model, relate to Big Data? Do we get a better quality learning model with bigger data? That’s what we will explore in this post. We will explore sample complexity … Continue reading

Posted in Big Data, Data Science, Machine Learning, Optimization, Predictive Analytic, Uncategorized | 4 Comments