Real Time BI with Hadoop

A brief synopsis of using the Apache Hadoop stack to build a Real-Time Business Intelligence application, including data warehousing and search.

Published in: Technology, Business
2 Comments
  • I hate to display my ignorance among such an august body, but after downloading the presentation (in .key format), absolutely nothing would recognize it: not PowerPoint, not Safari, not QuickTime, nothing. A quick Google search revealed no plausible alternate application tied to '.key' files, and there doesn't appear to be an independent app called 'Apple Keynote'. What gives?
    -- Confused in Denver
  • I guess you mean NoSQL and not NowSQL, right? ;-)

    Cheers,
    Herbert

Real Time BI with Hadoop

  1. Real-Time BI in Hadoop
       Bradford Stephens
       Lead Engineer, Visible Technologies
       Principal Consultant, Drawn to Scale Consulting
  2. Topics
       • Scalability and BI
       • Costs and Abilities
       • Search as BI
  3. What Is BI?
  4. What Is “Real-Time”?
       • Understanding Latency
       • We aim for <5 secs.
  5. Scalability in BI
       • Scalability matters now
       • Social Media: Catalyst
       • All data is important
       • Data doesn’t scale with business size any more
  6. Search as BI
       • Katta = Distributed Search on Hadoop
       • Bobo = Faceted Lucene
  7. Doing it Cheap
       • 100 TB, Structured and Unstructured
       • Oracle: $100,000,000
       • “NewSQL”: $4,000,000
       • Hadoop + Katta: $250,000
  8. Why We Need Hadoop
       • Need to process high-latency data to get the “small stuff” fast
       • Robust Ecosystem
       • Need more than SQL. RDBMS not a Swiss Army knife
  9. Aggregation is Real-Time
       • Distributed Search w/ Katta + Facets = Aggregation-Based BI
       • Sum, Count, Filter, Avg, Group
  10. Protips: Review
       • Understand High vs. Low Latency data
       • Hadoop makes it cheap
       • Pre-aggregate w/ Hadoop, Explore w/ Katta + Faceted Search
  11. The Future
       • Search/BI as a Platform: “Google my Data Warehouse”
       • Real-Time MR on HBase
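
The "Sum, Count, Filter, Avg, Group" idea from slide 9 can be illustrated with a small sketch. This is not the Katta or Bobo API; it is plain Python under the assumption that each search shard can return per-facet counts and sums, which a front end then merges. The documents, the field names (brand, channel, reach), and the functions facet_aggregate and merge_shards are all hypothetical.

```python
from collections import defaultdict

# Hypothetical documents; the field names are illustrative only.
DOCS = [
    {"brand": "acme", "channel": "twitter", "reach": 120},
    {"brand": "acme", "channel": "blog", "reach": 300},
    {"brand": "globex", "channel": "twitter", "reach": 80},
    {"brand": "acme", "channel": "twitter", "reach": 60},
]


def facet_aggregate(docs, group_field, value_field, predicate=lambda d: True):
    """Filter docs, group them by one field, and return count/sum/avg per group."""
    buckets = defaultdict(lambda: {"count": 0, "sum": 0})
    for doc in docs:
        if not predicate(doc):  # Filter
            continue
        bucket = buckets[doc[group_field]]  # Group
        bucket["count"] += 1  # Count
        bucket["sum"] += doc[value_field]  # Sum
    return {
        key: {"count": b["count"], "sum": b["sum"], "avg": b["sum"] / b["count"]}
        for key, b in buckets.items()
    }


def merge_shards(partials):
    """Merge per-shard results: add counts and sums, then recompute the average."""
    merged = defaultdict(lambda: {"count": 0, "sum": 0})
    for partial in partials:
        for key, stats in partial.items():
            merged[key]["count"] += stats["count"]
            merged[key]["sum"] += stats["sum"]
    return {
        key: {"count": m["count"], "sum": m["sum"], "avg": m["sum"] / m["count"]}
        for key, m in merged.items()
    }


if __name__ == "__main__":
    # Treat two slices of DOCS as two shards, aggregate each, then merge.
    shard_results = [
        facet_aggregate(DOCS[:2], "brand", "reach"),
        facet_aggregate(DOCS[2:], "brand", "reach"),
    ]
    print(merge_shards(shard_results))
```

Counts and sums can be added across shards, and the average is recomputed from the merged totals rather than by averaging per-shard averages. That property is what lets this style of aggregation be spread over many nodes and still answer quickly.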
