Big data comes from various sources in different formats and structures. It includes social media data, search engine data, medical records, and IoT device data. YouTube alone generates billions of views from over 1 billion users uploading 1.2 million videos daily. Big data is characterized by its variety, velocity, and volume. Hadoop is an open source framework used to process and store big data. It has two core components - HDFS for storage and MapReduce for processing. Hadoop distributes data and tasks across nodes, provides fault tolerance, and scales dynamically with hardware.