Tag Archives: Hadoop

Big Road Map for Big Data

The number of choices for big data solutions can be overwhelming and confusing. The purpose of this post is to lay out a road map for big data solutions. I will be categorizing the products under four different categories of … Continue reading

Posted in Big Data | 5 Comments

Explore Customer Churn with Cramer Index

Classification problems involve predicting a response variable based on a set of feature variables for some entity. But there is another problem whose solution is a prerequisite for solving a classification problem. We may want to know which among the set … Continue reading

Posted in Big Data, Correlation, Data Mining, Hadoop and Map Reduce, Predictive Analytic | 1 Comment

Get Social with Pearson Correlation

In one of my earlier posts, I discussed using Pearson correlation for making social recommendations. In this post we will delve deeper into it, including the Hadoop map reduce implementation. There are many correlation techniques, including cosine distance, slope … Continue reading

Posted in Collaborative Filtering, Hadoop and Map Reduce, Predictive Analytic, Recommendation Engine | 3 Comments

Relative Density and Outliers

Recently I did some work on my open source fraud analytic project, beymani. I implemented one of the proximity-based algorithms, using the relative density of a data point as described in my earlier post. When the density of a data … Continue reading

Posted in Big Data, Data Mining, Fraud Detection, Hadoop and Map Reduce | Leave a comment

It’s a lonely life for outliers

In this post, I am back to outliers and fraud analytic. In an earlier post, I gave an overview of outlier detection techniques that are being implemented with Hadoop in my open source project beymani. In another earlier post, I … Continue reading

Posted in Big Data, Data Science, Fraud Detection, Hadoop and Map Reduce, Predictive Analytic | 1 Comment

Big Web Analytic

I started on a Hadoop-based web analytic open source project some time ago. Recently I did some work on it and decided to blog about the development I did on the project. The project is called visitante and … Continue reading

Posted in Big Data, ETL, Hadoop and Map Reduce, Hive, Web Analytic | 9 Comments

Fraudsters, Outliers and Big Data

Recently, I started working on Hadoop-based solutions for fraud detection. Fraud detection is critical for many industries, including but not limited to finance, insurance and retail. Data mining is a key enabler of effective fraud detection. In this and … Continue reading

Posted in Big Data, Data Mining, Fraud Detection | 20 Comments