This document provides an introduction to big data: its definitions, where it comes from, and its value. It describes the Hadoop ecosystem as a framework for distributed storage and processing of large datasets. Key Hadoop components include HDFS for storage, MapReduce for processing, and related tools such as Pig, Hive, HBase, Sqoop, Kafka, and Oozie. Machine learning platforms such as H2O and Spark MLlib are also discussed. The document closes with an overview of how to start research in big data and of common big data job titles.
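To make the MapReduce processing model mentioned above concrete, here is a minimal sketch of the classic word-count pattern. This is a hypothetical in-process simulation in Python, not the actual Hadoop Java API: the `map_phase` and `reduce_phase` functions are illustrative names standing in for a Hadoop Mapper and Reducer.

```python
from collections import defaultdict

def map_phase(line):
    # Map step: emit (word, 1) pairs for each word in the input line,
    # as a Hadoop Mapper would for each input record.
    return [(word.lower(), 1) for word in line.split()]

def reduce_phase(pairs):
    # Reduce step: group pairs by key and sum the values,
    # as a Hadoop Reducer would after the shuffle phase.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["big data needs big tools", "hadoop processes big data"]
pairs = [p for line in lines for p in map_phase(line)]
counts = reduce_phase(pairs)
# counts["big"] == 3, counts["data"] == 2
```

In real Hadoop, the map and reduce steps run in parallel across the cluster, with HDFS holding the input and output data; the logic per record, however, is the same as in this sketch.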