This document surveys open source tools for big data analytics. It introduces Hadoop, HDFS, MapReduce, HBase, and Hive as common tools for working with large and diverse datasets, and gives an overview of each tool's purpose, architecture, and components. Examples show how to process log data and count words with these tools. The document also discusses using Pentaho Kettle for ETL and business intelligence projects involving big data.
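
As a rough illustration of the word count example mentioned in the summary, the sketch below imitates the map, shuffle, and reduce steps of a MapReduce job in plain Python, entirely in memory. It does not use Hadoop or its API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated token in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle/sort step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for a single word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, not taken from the original document.
    print(word_count(["big data", "Big tools"]))
```

In a real Hadoop job the map and reduce steps would run in parallel across the cluster and read their input from HDFS; this single-process version only shows how the data flows between the phases.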