The document discusses optimizations for Hive and Hadoop at Facebook, including RCFile columnar storage, which saves up to 30% of storage space and reduces I/O and CPU costs. It also describes how the sort and bucket properties of data can be used to optimize the group-by operator, cutting its CPU cost by half. Future work includes join skew optimization, indexing, and data sharing across multiple machines in parallel.
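
To illustrate why sorted input makes a group-by cheaper, here is a minimal Python sketch. It is not Hive's implementation. The function name `streaming_group_count` and the sample rows are hypothetical, and the sketch assumes the rows arrive already sorted by the grouping key. Under that assumption each group can be aggregated and emitted as soon as the key changes, so no hash table over all keys is needed.

```python
from itertools import groupby
from operator import itemgetter


def streaming_group_count(sorted_rows):
    """Yield (key, count) pairs from rows that are already sorted by key.

    Because equal keys are adjacent, only one group is held at a time.
    """
    for key, group in groupby(sorted_rows, key=itemgetter(0)):
        yield key, sum(1 for _ in group)


# Hypothetical sample data: (key, value) rows, pre-sorted by key.
rows = [("a", 1), ("a", 2), ("b", 3), ("c", 4), ("c", 5)]
result = list(streaming_group_count(rows))
print(result)
```

If the input were not sorted, the same key could reappear later in the stream and this approach would emit it more than once, which is why the optimization depends on the sort property described in the summary.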







