This document explains why large datasets and complex queries call for distributed computing. A single computer may be able to handle billions of records when the queries are simple, but more complex queries or larger datasets exceed what one machine can do. Distributed systems parallelize work across multiple machines, tolerate hardware failures without losing the entire dataset, and scale beyond the memory and processing limits of any individual machine. The document uses large census and web-crawl datasets to illustrate how quickly requirements outgrow a single computer.
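The scaling argument can be made concrete with a back-of-the-envelope estimate. The Python sketch below is only an illustration: the record counts, bytes per record, and 256 GiB of RAM are hypothetical placeholders, not figures from the document, and the helper names `fits_in_memory` and `machines_needed` are invented for this example. It counts raw data size only and ignores indexes, intermediate results, and replication, all of which would push real requirements higher.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def dataset_bytes(num_records: int, bytes_per_record: int) -> int:
    """Raw size of a dataset, ignoring indexes and other overhead."""
    return num_records * bytes_per_record


def fits_in_memory(num_records: int, bytes_per_record: int, ram_bytes: int) -> bool:
    """True if the raw dataset would fit in the RAM of one machine."""
    return dataset_bytes(num_records, bytes_per_record) <= ram_bytes


def machines_needed(num_records: int, bytes_per_record: int, ram_bytes: int) -> int:
    """Minimum number of machines needed to hold the raw data in memory."""
    total = dataset_bytes(num_records, bytes_per_record)
    return -(-total // ram_bytes)  # ceiling division on integers


# Hypothetical workloads (assumed numbers, for illustration only).
workloads = {
    "census-like": {"num_records": 1_000_000_000, "bytes_per_record": 100},
    "web-crawl-like": {"num_records": 50_000_000_000, "bytes_per_record": 10_000},
}
ram_per_machine = 256 * GIB  # assumed RAM of a single large server

for name, w in workloads.items():
    size_gib = dataset_bytes(**w) / GIB
    print(
        f"{name}: {size_gib:,.0f} GiB raw, "
        f"fits on one machine: {fits_in_memory(**w, ram_bytes=ram_per_machine)}, "
        f"machines needed: {machines_needed(**w, ram_bytes=ram_per_machine)}"
    )
```

Under these assumed numbers, the census-like dataset fits in one machine's memory while the web-crawl-like dataset does not, which mirrors the document's point that a single computer copes with some billion-record workloads but not with much larger ones.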