Real Time BI with Hadoop

A brief synopsis of using the Apache Hadoop stack to build a Real-Time Business Intelligence application, including data warehousing and search.

Published in: Technology, Business
2 Comments
  • I hate to display my ignorance among such an august body, but after downloading the presentation (in .key format) absolutely nothing would recognize it - not powerpoint, not Safari, not QuickTime -- nothing. A quick search of Google revealed no plausible alternate application tied to '.key' suffix-files, and there doesn't appear to be an independent app called 'Apple Keynote'. What gives?
    -- Confused in Denver
  • I guess you mean NoSQL and not NowSQL, right? ;-)

    Cheers,
    Herbert

Transcript

  • 1. Real-Time BI in Hadoop • Bradford Stephens • Lead Engineer, Visible Technologies • Principal Consultant, Drawn to Scale Consulting
  • 2. Topics • Scalability and BI • Costs and Abilities • Search as BI
  • 3. What Is BI?
  • 4. What is “Real-Time”? • Understanding Latency • We aim for <5 secs.
  • 5. Scalability in BI • Scalability matters now • Social Media: Catalyst • All data is important • Data doesn’t scale with business size anymore
  • 6. Search as BI • Katta = Distributed Search on Hadoop • Bobo = Faceted Lucene
  • 7. Doing it Cheap • 100 TB, Structured and Unstructured • Oracle: $100,000,000 • “NewSQL”: $4,000,000 • Hadoop + Katta: $250,000
  • 8. Why We Need Hadoop • Need to process high-latency data to get the “small stuff” fast • Robust Ecosystem • Need more than SQL. RDBMS not a Swiss Army Knife
  • 9. Aggregation is Real-Time • Distributed Search w/ Katta + Facets = Aggregation-Based BI • Sum, Count, Filter, Avg, Group
  • 10. Protips: Review • Understand High vs. Low Latency data • Hadoop makes it cheap • Pre-aggregate w/ Hadoop, Explore w/ Katta + Faceted Search
  • 11. The Future • Search/BI as a Platform: “Google my Data Warehouse” • Real-Time MR on HBase
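
Slide 9 describes aggregation-based BI as faceted search: filter the documents, group them by a facet, then compute Sum, Count and Avg per group. The sketch below illustrates that idea in plain Python over an in-memory list. It does not use the Katta or Bobo APIs, and the function name `facet_aggregate`, the field names, and the sample records are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def facet_aggregate(docs, facet_field, value_field, predicate=lambda d: True):
    """Filter docs, group them by facet_field, and return count/sum/avg of value_field.

    Hypothetical helper: a stand-in for what a faceted search engine
    would compute across index shards.
    """
    buckets = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for doc in docs:
        if not predicate(doc):  # Filter
            continue
        bucket = buckets[doc[facet_field]]  # Group
        bucket["count"] += 1  # Count
        bucket["sum"] += doc[value_field]  # Sum
    return {
        key: {"count": b["count"], "sum": b["sum"], "avg": b["sum"] / b["count"]}
        for key, b in buckets.items()
    }


# Hypothetical sample records (e.g. pre-aggregated by a batch job).
posts = [
    {"source": "blog", "brand": "acme", "mentions": 3},
    {"source": "blog", "brand": "acme", "mentions": 5},
    {"source": "forum", "brand": "acme", "mentions": 2},
    {"source": "forum", "brand": "other", "mentions": 7},
]

# Mentions of one brand, broken down by source.
result = facet_aggregate(posts, "source", "mentions", lambda d: d["brand"] == "acme")
print(result)
```

In a distributed setting each shard could presumably compute its own counts and sums, with a merge step adding them before the average is taken. Averages cannot be merged directly, which is why the sketch tracks `count` and `sum` rather than a running mean.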
