This document introduces distributed computing and the tools available on the Big Data Cluster for processing large tabular datasets. It explains how distributed computing allows tabular data to be replicated across nodes and computation on that data to be parallelized. It then gives an overview of Hadoop and shows how tools such as Hue, Hive, and Pig can be used on the Big Data Cluster to run analytics on large datasets. Finally, it walks through a worked example that computes TF-IDF scores for a corpus of text documents from Project Gutenberg.
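
As a rough illustration of what the TF-IDF example computes, the sketch below scores terms on a single machine in plain Python. It is not the Hive or Pig workflow run on the cluster, only a small reference for the arithmetic. The function name `tf_idf`, the toy two-document corpus, and the weighting variant (term count divided by document length, multiplied by the natural log of the number of documents over the document frequency) are all assumptions made for illustration; other TF-IDF variants exist and the walkthrough may use a different one.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF scores for a small in-memory corpus.

    docs: dict mapping a document name to its list of tokens.
    Returns a dict mapping each document name to {term: score}.
    Weighting (an assumed variant): tf = count / document length,
    idf = ln(number of documents / documents containing the term).
    """
    n_docs = len(docs)

    # Document frequency: how many documents contain each term at least once.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    scores = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[name] = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return scores


# Hypothetical toy corpus standing in for the Project Gutenberg texts.
corpus = {
    "a.txt": "the whale swims".split(),
    "b.txt": "the ship sails".split(),
}
scores = tf_idf(corpus)
```

Under this weighting, a term that appears in every document (here, "the") scores zero, while terms unique to one document score highest. On a cluster, the per-document counts and the document frequencies would typically be computed as separate grouped aggregations and then joined.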