Sometimes an outlier is defined only with respect to a context: whether a data point should be labeled an outlier depends on the context in which it was observed. For a bank ATM, transactions that are considered normal between 6 AM and 10 PM may be considered anomalous between 10 PM and 6 AM. In this case, the context is the hour of the day.
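
To make the ATM example concrete, here is a minimal sketch in plain Python. It is not the Spark implementation, and it makes several assumptions of its own: the transaction amounts are made-up illustrative values, the context is reduced to two buckets ("day" for 6 AM to 10 PM, "night" otherwise), and a point is flagged when its z-score within its own context exceeds a threshold of 3. The function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def context_of(hour):
    """Map an hour of day (0-23) to a context label (assumed two buckets)."""
    return "day" if 6 <= hour < 22 else "night"


def fit_context_stats(history):
    """Compute (mean, sample std dev) of amounts separately per context.

    history: iterable of (hour, amount) pairs.
    """
    by_context = defaultdict(list)
    for hour, amount in history:
        by_context[context_of(hour)].append(amount)
    return {c: (mean(v), stdev(v)) for c, v in by_context.items()}


def is_contextual_outlier(hour, amount, stats, threshold=3.0):
    """Flag a transaction whose z-score within its own context is too large."""
    mu, sigma = stats[context_of(hour)]
    if sigma == 0:
        # Degenerate context: any deviation from the single seen value is unusual.
        return amount != mu
    return abs(amount - mu) / sigma > threshold


# Illustrative (invented) history: larger withdrawals by day, smaller at night.
history = [(9, 200), (11, 220), (14, 180), (16, 210), (19, 190),
           (23, 40), (1, 50), (2, 60), (4, 45), (5, 55)]
stats = fit_context_stats(history)

day_flag = is_contextual_outlier(14, 200, stats)    # 200 at 2 PM
night_flag = is_contextual_outlier(2, 200, stats)   # 200 at 2 AM
print(day_flag, night_flag)
```

The same amount is judged against different statistics depending on the hour, which is the whole point of a contextual outlier: a single global model would have to treat both transactions identically.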

In this post, we will go through some contextual outlier detection techniques based on statistical modeling of the data. The Spark-based implementation is available.