This document discusses optimizing data crunching tasks by moving processing to more efficient languages and data formats. It summarizes benchmarks of completing two data analysis tasks on taxi trip data using Python, C, Awk and other tools. The key findings are that shell tools and simple C code using memory mapping are over 20x faster than Python for these text-based tasks. It also demonstrates how reworking tasks to use integer-based schemas and avoid floating point calculations can improve reproducibility and efficiency.