February 2012

Big Data Caught in Storm

Hadoop is great for batch processing. However, depending on the incoming data throughput and the cluster characteristics, there is a minimum latency threshold for processing data.

Fraudsters are not Model Citizens

In my earlier post, I gave an overview of outlier detection techniques for big data, specifically in the Hadoop context. As I mentioned, fraud detection essentially translates to outlier detection in data mining parlance.

