Designing a complex Big Data system with a myriad of parameters and design choices is a daunting task. It’s almost a black art. Typically we stay with the default parameter settings unless they fail to meet our requirements, which forces us to venture out of the comfort zone of the defaults. Essentially, what we are dealing with is a complex optimization problem with no closed-form solution. We have to search a multidimensional parameter space, where the number of parameter value combinations may run into hundreds of thousands, if not millions.
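To get a feel for how quickly the combinations add up, here is a small back-of-the-envelope sketch. The parameter names and the number of candidate values per parameter are made up for illustration and do not come from any particular system; the point is only that the grid size is the product of the per-parameter counts.

```python
from math import prod

# Hypothetical tuning knobs, each mapped to the number of candidate
# values we might want to try. These names and counts are assumptions
# chosen for illustration only.
candidate_counts = {
    "executor_memory": 8,
    "executor_cores": 6,
    "num_partitions": 10,
    "compression_codec": 5,
    "serializer_buffer": 4,
    "shuffle_fraction": 6,
    "block_size": 5,
}

def grid_size(counts):
    """Number of distinct configurations in an exhaustive grid."""
    return prod(counts.values())

total = grid_size(candidate_counts)
print(f"{total:,} configurations")  # 288,000 configurations

# Assuming (again hypothetically) 10 minutes per test run:
minutes_per_test = 10
print(f"{total * minutes_per_test / 60:,.0f} hours of testing")
```

Even with only seven parameters and a handful of values each, an exhaustive sweep is far beyond any realistic test budget.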

With limited time and resources, the brute-force approach of running tests for every combination of configuration values is not a viable option. It’s clear that we have to do a guided search through the parameter space, so that we can arrive at the desired parameter values with a limited number of tests. In this post we will discuss an optimization technique called Bayesian optimization, which is popular for solving ...