This document provides an overview of big data and Hadoop. It discusses the concepts of data science, data-driven decision-making, and data analytics, then describes the types of databases and introduces Hadoop as an open-source framework for distributed processing of large datasets across clusters of computers. Key Hadoop topics include its MapReduce-based processing approach, the HDFS architecture with its NameNode and DataNodes, and how Hadoop compares to relational database management systems (RDBMS). The agenda concludes with an introduction to the trainer, Akash Pramanik.