This document discusses optimizing columnar data stores. It begins with an overview of row-oriented versus column-oriented data stores, noting that column stores are well suited to read-heavy analytical workloads because a query reads only the columns it references. The document then covers the history of columnar stores and notable features such as data encoding, compression, and lazy decompression, with examples of run-length and dictionary encoding. It also discusses the columnar file formats RCFile, ORC, and Parquet, going into more detail on ORC. It concludes with a case study of a petabyte-scale data warehouse in which sorting, a change of compression, and other configuration changes reduced data size and, as a result, significantly improved query performance.
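
The two encodings named above, run-length and dictionary encoding, can be illustrated with a minimal Python sketch. It is not taken from the document itself: the function names and the sample `country` column are hypothetical, and real formats such as ORC and Parquet use more elaborate binary variants of these ideas.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse each run of consecutive equal values into (value, run_length)."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


def dictionary_encode(values):
    """Replace each value with a small integer code.

    Returns (dictionary, codes), where dictionary[code] is the original value
    and codes are assigned in order of first appearance.
    """
    dictionary = []
    index = {}
    codes = []
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Look each code up in the dictionary to recover the original values."""
    return [dictionary[code] for code in codes]


# Hypothetical column values, already sorted so that equal values are adjacent.
country = ["DE", "DE", "DE", "US", "US", "FR"]

rle = run_length_encode(country)
dictionary, codes = dictionary_encode(country)

print(rle)         # [('DE', 3), ('US', 2), ('FR', 1)]
print(dictionary)  # ['DE', 'US', 'FR']
print(codes)       # [0, 0, 0, 1, 1, 2]
```

Both encodings round-trip losslessly. Run-length encoding benefits from long runs of equal values, so sorting a column before writing it tends to shrink the encoded output; dictionary encoding pays off when a column has few distinct values relative to its length.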