Category Archives: Python

Monte Carlo Simulation Library in Python with Project Cost Estimation as an Example

I was working on a solution for change point detection in time series, which led me to a certain two-sample statistic for which critical values didn’t exist. The only option was to simulate the statistic values and estimate critical values … Continue reading

Posted in Data Science, Python, Statistics | Leave a comment
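
As a rough illustration of the idea in the teaser above, critical values for a statistic with no published tables can be estimated by simulating it under the null hypothesis and taking an empirical quantile. The sketch below is not the library from the post. The statistic (absolute difference of sample means), the standard normal null distribution, and the function names are all assumptions chosen for illustration.

```python
import random
import statistics


def two_sample_stat(x, y):
    # Hypothetical statistic for illustration: absolute difference of sample means.
    return abs(statistics.fmean(x) - statistics.fmean(y))


def simulate_critical_value(n1, n2, alpha=0.05, num_sims=2000, seed=42):
    # Assumed null hypothesis: both samples come from a standard normal distribution.
    rng = random.Random(seed)
    values = []
    for _ in range(num_sims):
        x = [rng.gauss(0.0, 1.0) for _ in range(n1)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n2)]
        values.append(two_sample_stat(x, y))
    values.sort()
    # Empirical (1 - alpha) quantile of the simulated null distribution.
    index = min(int((1.0 - alpha) * num_sims), num_sims - 1)
    return values[index]


critical_value = simulate_critical_value(30, 30)
print(f"Estimated 5% critical value: {critical_value:.3f}")
```

An observed statistic larger than the estimated critical value would be treated as significant at the chosen level. More simulations give a more stable estimate.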

Building SciKitLearn Random Forest Model and Tuning Parameters without writing Python Code

Random Forest is a supervised learning algorithm which can be used for classification and regression. In this article we go through the process of training a Random Forest model, including automatic parameter tuning, without writing any Python code. We will use … Continue reading

Posted in Data Science, Machine Learning, Python, ScikitLearn | Leave a comment

Evaluation of Time Series Predictability with Kaboudan Metric using Prophet

You might be getting ready to build a time series forecasting model using a state-of-the-art LSTM network. Before you proceed, you may want to pause and ask yourself whether your time series is inherently predictable at all, i.e. whether … Continue reading

Posted in Python, Time Series Analytic | Leave a comment

Machine Learning Model Interpretation and Prescriptive Analytic with Lime

Machine learning model interpretability is the degree to which a human can comprehend the reasons behind the prediction made by a model. Interpretability may be required for various reasons, e.g. meeting compliance requirements or gaining insight for high-stakes situation … Continue reading

Posted in Data Science, Machine Learning, Python | Leave a comment

Automated Machine Learning with Hyperopt and Scikitlearn without Writing Python Code

The most challenging part of building a supervised machine learning model is the optimization over algorithm selection, feature selection and algorithm-specific hyperparameter values that yields the best performing model. Undertaking such a task manually is not feasible, unless the … Continue reading

Posted in Data Science, Machine Learning, Python, ScikitLearn, Supervised Learning | 3 Comments

Missing Value Imputation with Restricted Boltzmann Machine Neural Network

Missing values are a common problem in many real-world data sets. There are various techniques for imputing missing values. We will use a kind of neural network called an RBM for imputing missing values. Restricted Boltzmann Machines (RBMs) are stochastic … Continue reading

Posted in Data Science, Deep Learning, ETL, Machine Learning, Python | Leave a comment

Six Unsupervised Extractive Text Summarization Techniques Side by Side

In text summarization, we create a summary that is coherent and captures the salient points of the original content. Text summarization has various important uses. Something we face almost every day is the text … Continue reading

Posted in Data Science, NLP, Python, Text Analytic, Text Mining | Leave a comment

Synthetic Training Data Generation for Machine Learning Classification Problems using Ancestral Sampling

Lack of access to a good training data set is a serious impediment to building supervised Machine Learning models. Such data is scarce and, when available, the quality of the data set may be questionable. Even if a good-quality data set is available, … Continue reading

Posted in Python, Statistics, Supervised Learning | 1 Comment

Supervised Machine Learning Parameter Search and Tuning with Simulated Annealing

The most challenging phase in a supervised Machine Learning pipeline is parameter tuning. There are many parameters, each with a range of values. The so-called grid search is a brute-force approach that tries all possible combinations of values for the … Continue reading

Posted in Machine Learning, Python, ScikitLearn, Supervised Learning | 2 Comments
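
To make the brute-force nature of grid search mentioned in the teaser above concrete, here is a minimal sketch that enumerates every combination of parameter values and keeps the best one. It is not code from the post. The parameter names, the grid values, and the toy scoring function are hypothetical stand-ins for training and validating a real model.

```python
import itertools


def grid_search(param_grid, score_fn):
    # Evaluate every combination in the Cartesian product of the parameter values.
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for combo in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    # Hypothetical stand-in for a validation score; peaks at max_depth=6, learning_rate=0.1.
    return -(params["max_depth"] - 6) ** 2 - (params["learning_rate"] - 0.1) ** 2


# Hypothetical grid: 4 x 3 = 12 combinations, each of which is evaluated.
grid = {"max_depth": [2, 4, 6, 8], "learning_rate": [0.01, 0.1, 0.5]}
best_params, best_score = grid_search(grid, toy_score)
print(best_params, best_score)
```

The number of evaluations is the product of the number of values per parameter, so it grows quickly as parameters are added. That cost is what motivates guided search methods such as simulated annealing.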

Improving Elastic Search Query Result with Query Expansion using Topic Modeling

Query expansion is the process of reformulating a query to improve query results and, more specifically, to improve the recall for a query. Topic modeling is a Natural Language Processing (NLP) technique to discover hidden topics or concepts … Continue reading

Posted in elastic search, NLP, Python, Solr, Text Analytic, Text Mining, Topic Modeling | 1 Comment