The story of the correlation between beer and diaper sales is commonly used to explain product affinities in introductory data mining courses. Rarely does anyone ask about the origin of this story. Is it true? Why is it true? What does true mean anyway?
The last question is the most interesting because it challenges our ideas of accuracy in data and analytic models.
This is the real history of the beer and diapers story, explaining its origins and its truth, based on repeated analyses of retail data over two decades. It shows how analytic models can produce multiple contradictory results, and how all of those results can be true.