The document describes an evaluation system used to test and compare the performance of different versions of SigOpt's Bayesian optimization service. The system runs optimization methods on benchmark test functions to collect performance metrics like best found value and area under the curve. It analyzes the results using statistical tests and visualization tools to provide empirical comparisons that guide improvements to SigOpt's core optimization engine.