Determining the overall system performance and measuring the quality of complex search systems are tough questions. Changes come from all subsystems of the complex system, at the same time, making it difficult to assess which modification came from which sub-component and whether they improved or regressed the overall performance. If this wasn’t hard enough, the target against which you are measuring your search system is also constantly evolving, sometimes in real time. Regression testing of the system and its components is crucial, but resources are limited. In this talk I discuss some of the issues involved and some possible ways of dealing with these problems. In particular I want to present an academic view of what I should have known about search quality before I joined Cuil in 2008.