Pentaho big data camp - 5 min

Pentaho for Hadoop presentation at Big Data Camp, February 2011.

Published in: Technology

Pentaho big data camp - 5 min

  1. Importance of the hybrid data model for Hadoop-driven analytics
     Ian Fyfe, Pentaho
     US and Worldwide: +1 (866) 660-7555
     © 2010, Pentaho. All Rights Reserved. www.pentaho.com.
  2. Traditional BI
     [Diagram. Labels: Data Source, Data Mart(s), Tape = Trash; seven "?" marks.]
  3. Data Lake
     • Single source
     • Large volume
     • Not distilled
     • Typically 0-2 lakes per company
     • Known and unknown questions
     • Multiple user communities
     • Does not fit in a traditional RDBMS at a reasonable cost
  4. What if...
     [Diagram. Labels: Data Source, Data Lake(s), Data Warehouse, Data Mart(s), Ad-Hoc, Tape/Trash.]
  5. Hadoop and BI?
     “The working conditions within Hadoop are shocking” (ETL Developer)
  6. Hadoop and BI?
     You have to do this in Java...
         public void map(
             Text key,
             Text value,
             OutputCollector output,
             Reporter reporter)
         public void reduce(
             Text key,
             Iterator values,
             OutputCollector output,
             Reporter reporter)
  7. Hadoop and BI?
     Instead of this...
  8. If only we had a Java, embeddable, data transformation engine...
  9. Pentaho BI Suite for Hadoop
     • Lowers technical barriers by providing an easy-to-use ETL environment for managing data in Hadoop
     • Provides end-to-end BI tools addressing common BI use cases with Hadoop, including Reporting, Ad Hoc Query and Interactive Analysis
     • Extreme ETL scalability through integration with Hadoop’s MapReduce framework
     • Workflow integration of Hadoop jobs with external ETL and BI activities
     • Reduces costs through our subscription-based pricing model, reduced dependency on highly paid technical resources, and easier maintainability
     [Diagram. Labels: Interactive Analysis, Batch Reporting and Ad Hoc Query, Data Marts, Agile BI, Hadoop, Pentaho DI ETL Jobs, PDI, Log Files, DBs and other sources.]
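
To make the "You have to do this in Java..." slide concrete, here is a minimal sketch of the map, group, and reduce steps that such a job hand-codes. It is not Hadoop code: it uses only the JDK so that it runs on its own, and the class name LogLevelCount, the helper methods, and the sample log lines are all hypothetical. A real job would implement Hadoop's own map and reduce signatures shown on that slide.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a MapReduce job. It uses only the JDK, not the Hadoop API.
class LogLevelCount {

    // Map step: emit one (level, 1) pair per non-blank log line, keyed on the first token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        String trimmed = line.trim();
        if (!trimmed.isEmpty()) {
            String level = trimmed.split("\\s+")[0];
            out.add(new AbstractMap.SimpleEntry<>(level, 1));
        }
        return out;
    }

    // Reduce step: sum every value collected for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Driver: map each line, group the pairs by key (the "shuffle"), then reduce each group.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample input standing in for log files stored in Hadoop.
        List<String> lines = Arrays.asList(
            "ERROR disk full",
            "INFO job started",
            "ERROR timeout",
            "INFO job finished",
            "WARN slow response");
        System.out.println(run(lines)); // prints {ERROR=2, INFO=2, WARN=1}
    }
}
```

Even this toy count needs explicit key emission, grouping, and aggregation code, which is the kind of work the deck proposes moving into an ETL tool.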
