Putting Business Intelligence to Work on Hadoop Data Stores
- 1,986 views
An inexpensive way of storing large volumes of data, Hadoop is also scalable and redundant. But getting data out of Hadoop is tough due to a lack of a built-in query language. Also, because users ...
An inexpensive way of storing large volumes of data, Hadoop is also scalable and redundant. But getting data out of Hadoop is tough due to a lack of a built-in query language. Also, because users experience high latency (up to several minutes per query), Hadoop is not appropriate for ad hoc query, reporting, and business analysis with traditional tools.
The first step in overcoming Hadoop's constraints is connecting to HIVE, a data warehouse infrastructure built on top of Hadoop, which provides the relational structure necessary for schedule reporting of large datasets data stored in Hadoop files. HIVE also provides a simple query language called Hive QL which is based on SQL and which enables users familiar with SQL to query this data.
But to really unlock the power of Hadoop, you must be able to efficiently extract data stored across multiple (often tens or hundreds) of nodes with a user-friendly ETL (extract, transform and load) tool that will then allow you to move your Hadoop data into a relational data mart or warehouse where you can use BI tools for analysis.
- Total Views
- Views on SlideShare
- Embed Views