This document discusses the problems of processing large volumes of data and proposes an enterprise data warehouse architecture capable of handling big data. It aims to explain how Hadoop can be integrated into existing data warehouses.
The first chapter introduces the challenges posed by increased data volume, variety, and velocity, and discusses the shortage of big data and analytics skills. It argues that existing data warehouses are built for reporting, not for analyzing large volumes of unaggregated data.
The second chapter outlines the requirements for a new architecture and proposes a multi-platform data warehouse environment that incorporates Hadoop. It describes Hadoop components such as HDFS, YARN, and Hive, as well as tools such as Sqoop.
The third chapter focuses on integrating Hadoop into existing data warehouses by implementing star schemas in Hive, addressing security,