Pentaho big data camp - 5 min

Pentaho for Hadoop presentation at Big Data Camp, February 2011.

Published in: Technology

  1. 1. Importance of the hybrid data model for Hadoop-driven analytics
     Ian Fyfe, Pentaho
     US and Worldwide: +1 (866) 660-7555
     © 2010, Pentaho. All Rights Reserved. www.pentaho.com.
  2. 2. Traditional BI
     (Diagram labels: Data Source, Data Mart(s), Tape = Trash, and a series of question marks)
  3. 3. Data Lake
     • Single source
     • Large volume
     • Not distilled
     • Typically no more than 0-2 lakes per company
     • Known and unknown questions
     • Multiple user communities
     • Doesn't fit in a traditional RDBMS at a reasonable cost
  10. 10. What if...
     (Diagram labels: Data Source, Data Lake(s), Data Warehouse, Data Mart(s), Ad-Hoc, Tape/Trash)
  11. 11. Hadoop and BI?
     “The working conditions within Hadoop are shocking” (ETL Developer)
  12. 12. Hadoop and BI?
     You have to do this in Java...
       public void map(Text key, Text value, OutputCollector output, Reporter reporter)
       public void reduce(Text key, Iterator values, OutputCollector output, Reporter reporter)
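
The two signatures on this slide match the shape of Hadoop's older `org.apache.hadoop.mapred` interfaces. To make concrete what a developer has to write by hand, here is a minimal word-count sketch in plain Java. It is an illustration, not code from the presentation and not real Hadoop code: `Collector`, `WordCountSketch`, and `run` are hypothetical stand-ins so that the example runs without a cluster or the Hadoop libraries, and the `Reporter` argument is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for Hadoop's OutputCollector; not the real interface.
interface Collector<K, V> {
    void collect(K key, V value);
}

class WordCountSketch {

    // Map step: emit (word, 1) for every whitespace-separated token in the line.
    static void map(String key, String value, Collector<String, Integer> output) {
        for (String word : value.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                output.collect(word, 1);
            }
        }
    }

    // Reduce step: sum all counts collected for a single word.
    static void reduce(String key, Iterator<Integer> values,
                       Collector<String, Integer> output) {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next();
        }
        output.collect(key, sum);
    }

    // Tiny in-memory driver playing the role of the framework (an assumption
    // for illustration): map every line, group values by key, reduce each group.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (int i = 0; i < lines.size(); i++) {
            map(String.valueOf(i), lines.get(i),
                (k, v) -> grouped.computeIfAbsent(k, x -> new ArrayList<>()).add(v));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            reduce(e.getKey(), e.getValue().iterator(), result::put);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hadoop and bi", "bi and etl");
        System.out.println(run(lines)); // prints {and=2, bi=2, etl=1, hadoop=1}
    }
}
```

Even for a task this small, the logic has to be split into two callbacks plus job wiring, which is the overhead the slide is pointing at.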
  13. 13. Hadoop and BI?
     Instead of this...
  14. 14. If only we had a Java, embeddable, data transformation engine...
  15. 15. Pentaho BI Suite for Hadoop
     • Lowers technical barriers by providing an easy-to-use ETL environment for managing data in Hadoop
     • Provides end-to-end BI tools addressing common BI use cases with Hadoop, including Reporting, Ad Hoc Query, and Interactive Analysis
     • Extreme ETL scalability through integration with Hadoop's MapReduce framework
     • Workflow integration of Hadoop jobs with external ETL and BI activities
     • Reduces costs through our subscription-based pricing model, reduced dependency on highly paid technical resources, and easier maintainability
     (Diagram labels: Log Files, DBs and other sources, PDI, Pentaho DI ETL Jobs, Hadoop, Data Marts, Agile BI, Batch Reporting and Ad Hoc Query, Interactive Analysis)
