This document provides an overview of big data and the methodologies commonly used to process it. It defines big data as large volumes of complex data, drawn from many sources, that are difficult to process with traditional data management tools. The defining characteristics of big data are volume, variety, and velocity. Hadoop is discussed as a popular framework for processing big data using the MapReduce programming model, and HDFS is summarized as the distributed file system used with Hadoop to store and manage large datasets across clusters of computers. Challenges of big data, such as storage capacity, processing large and complex datasets, and real-time analytics, are also mentioned.
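To make the MapReduce programming model mentioned above concrete, the following is a minimal sketch of its classic word-count example in plain Python. This is not Hadoop itself: the map, shuffle, and reduce phases are simulated in a single process, whereas Hadoop would distribute each phase across a cluster reading from HDFS. All function names here are illustrative, not part of any Hadoop API.

```python
from collections import defaultdict

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in the input split.
    for word in document.lower().split():
        yield (word, 1)

def reduce_phase(pairs):
    # Shuffle + reduce: group pairs by key, then sum the counts per word.
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

# Two "documents" standing in for input splits stored on HDFS.
docs = ["big data is big", "data needs processing"]
pairs = [pair for doc in docs for pair in map_phase(doc)]
result = reduce_phase(pairs)
print(result)  # {'big': 2, 'data': 2, 'is': 1, 'needs': 1, 'processing': 1}
```

Because each mapper sees only its own split and each reducer sees only one key's values, both phases can run in parallel on many machines, which is what lets Hadoop scale this pattern to very large datasets.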