- Hadoop is a framework for managing and processing big data distributed across clusters of computers. It allows for parallel processing of large datasets.
- Big data comes from various sources like customer behavior, machine data from sensors, etc. It is used by companies to better understand customers and target ads.
- Hadoop uses a master-slave architecture with a NameNode master and DataNode slaves. Files are divided into blocks and replicated across DataNodes for reliability. The NameNode tracks where data blocks are stored.