Category Archives: Anomaly Detection

Time Series Trend and Seasonality Component Decomposition with STL on Spark

You may be interested in decomposing a time series into level, trend, seasonality and remainder components to gain more insight into your time series. You may also be interested in decomposition to separate out the remainder component for anomaly detection. … Continue reading

Posted in Anomaly Detection, Big Data, Data Science, ETL, Spark, Time Series Analytic

Time Series Sequence Anomaly Detection with Markov Chain on Spark

There are many techniques for time series anomaly detection. In this post, the focus is on sequence-based anomaly detection of time series data with a Markov chain. The technique will be elucidated with a use case involving data from a … Continue reading

Posted in Anomaly Detection, Big Data, Data Science, Machine Learning, Outlier Detection, Scala, Spark | 1 Comment

Normal Distribution Fitness Test with Chi Square on Spark

Many Machine Learning models are based on certain assumptions made about the data. For example, in ZScore based anomaly detection, it is assumed that the data has a normal distribution. Your Machine Learning model will be as good as how those … Continue reading

Posted in Anomaly Detection, Big Data, Data Science, Spark, Statistics

Learning Alarm Threshold from User Feedback using Decision Tree on Spark

Alarm fatigue is a phenomenon where someone exposed to a large number of alarms becomes desensitized to them and starts ignoring them. It’s been reported that security professionals ignore 32% of alarms because they are thought to be false. … Continue reading

Posted in Anomaly Detection, Big Data, Data Science, Outlier Detection, Spark | 1 Comment

Contextual Outlier Detection with Statistical Modeling on Spark

Sometimes an outlier is defined with respect to a context. Whether a data point should be labeled as an outlier depends on the associated context. For a bank ATM, transactions that are considered normal between 6 AM and 10 PM, … Continue reading

Posted in Anomaly Detection, Big Data, Data Science, Spark | 3 Comments

Alarm Flooding Control with Event Clustering Using Spark Streaming

You show up at work in the morning and open your email to find 100 alarm emails in your inbox for the same error from an application running on some server within a short time window of 1 minute. You … Continue reading

Posted in Anomaly Detection, Big Data, Real Time Processing, Spark, stream processing | 1 Comment

Anomaly Detection with Robust Zscore

Anomaly detection techniques based on statistical modeling are simple and effective. The Zscore based technique is one of them. Zscore is defined as the absolute difference between a data value and the mean of the data, normalized by the standard deviation. A … Continue reading

Posted in Anomaly Detection, Big Data, data quality, Data Science, Hadoop and Map Reduce | 8 Comments
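
As a rough illustration of the Zscore definition quoted in the excerpt above, here is a minimal Python sketch. It implements only the plain mean and standard deviation form described in the excerpt. It is not the robust variant named in the post title, and it is not the post's own implementation. The function names `zscores` and `flag_anomalies`, the use of the population standard deviation, and the default threshold of 3.0 are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def zscores(values):
    """Return |x - mean| / standard deviation for each value.

    Assumption: the population standard deviation (pstdev) is used.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing deviates from the mean.
        return [0.0 for _ in values]
    return [abs(v - mu) / sigma for v in values]


def flag_anomalies(values, threshold=3.0):
    """Return the values whose Zscore exceeds the threshold.

    The default threshold of 3.0 is a common convention, assumed here.
    """
    return [v for v, z in zip(values, zscores(values)) if z > threshold]


if __name__ == "__main__":
    readings = [10, 12, 11, 13, 12, 11, 10, 12, 11, 100]
    # A lower threshold is used because, with only 10 points, the
    # population Zscore of any single point cannot exceed 3.
    print(flag_anomalies(readings, threshold=2.5))
```

Because a single extreme value inflates both the mean and the standard deviation, small samples can hide the very outlier being sought, which is one motivation for robust variants based on statistics such as the median.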