This document presents a framework for auditing search engines to detect differences in user satisfaction across demographic groups. It describes three methods that make such audits more meaningful by controlling for natural demographic variation: 1) context matching, which selects near-identical user activity; 2) a hierarchical query-level model, which borrows strength across popular queries; and 3) a query-level pairwise model, which directly estimates relative satisfaction between pairs of users issuing the same query. Applying the framework revealed a slight trend toward older users being more satisfied, but it also showed that auditing is nuanced and differs from simply measuring metrics on binned traffic. The framework offers a general approach for auditing systems with other metrics and user groups.
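To make the contrast between a within-query comparison and a binned comparison concrete, here is a minimal Python sketch. It is not the framework's actual model: it replaces the fitted pairwise model with a plain average of within-query differences. The record layout `(query, group, satisfaction)`, the function names, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product
from statistics import mean


def binned_difference(records, group_a, group_b):
    """Naive audit: mean satisfaction of group_a minus group_b over all traffic.

    Assumes both groups appear at least once in `records`.
    """
    by_group = defaultdict(list)
    for _query, group, satisfaction in records:
        by_group[group].append(satisfaction)
    return mean(by_group[group_a]) - mean(by_group[group_b])


def pairwise_group_difference(records, group_a, group_b):
    """Mean within-query satisfaction difference (group_a minus group_b).

    Only queries issued by both groups contribute, so differences in
    which queries each group issues do not drive the estimate.
    Returns None when no query is shared by the two groups.
    """
    by_query = defaultdict(lambda: defaultdict(list))
    for query, group, satisfaction in records:
        by_query[query][group].append(satisfaction)

    per_query_diffs = []
    for groups in by_query.values():
        a_scores = groups.get(group_a, [])
        b_scores = groups.get(group_b, [])
        if not a_scores or not b_scores:
            continue  # query not shared by both groups; skip it
        # Compare every cross-group pair of users for this query.
        diffs = [a - b for a, b in product(a_scores, b_scores)]
        per_query_diffs.append(mean(diffs))

    return mean(per_query_diffs) if per_query_diffs else None


# Hypothetical toy data: (query, group, satisfaction score in [0, 1]).
records = [
    ("weather", "older", 1.0),
    ("weather", "younger", 0.5),
    ("news", "older", 0.5),
    ("news", "younger", 0.5),
    ("games", "younger", 1.0),  # only one group issues this query
]

binned = binned_difference(records, "older", "younger")
paired = pairwise_group_difference(records, "older", "younger")
```

On this toy data the two estimates disagree, because the binned figure includes the "games" query that only one group issues, while the pairwise figure is restricted to shared queries. A real audit would also need the context matching and uncertainty estimates that this sketch omits.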