Tag Archives: supply chain

Combating High Cardinality Features in Supervised Machine Learning

A typical training data set for real-world machine learning problems has a mixture of different types of data, including numerical and categorical. Many machine learning algorithms cannot handle categorical variables. Even for those that can, categorical data can pose a serious problem … Continue reading

Posted in Big Data, Data Science, Data Transformation, ETL, Hadoop and Map Reduce, Predictive Analytic

Supplier Fulfillment Forecasting with Continuous Time Markov Chain using Spark

In a supply chain, quantities ordered from a downstream supplier or manufacturer are not always completely fulfilled, because of various factors. If the extent of underfulfillment could be predicted over a time horizon, then the shortfall items … Continue reading

Posted in Big Data, Data Science, Machine Learning, Scala, Spark

Data Quality Control With Outlier Detection

For many Big Data projects, it has been reported that a significant part of the time, sometimes up to 70-80%, is spent on data cleaning and preparation. Typically, in most ETL tools, you define constraints and rules statically for … Continue reading

Posted in Big Data, Data Science, ETL, Hadoop and Map Reduce, Internet of Things, Outlier Detection, Statistics