Hadoop uk user group meeting final
Published in: Technology
  • Pentaho provides a complete enterprise BI suite, from ETL and data integration through OLAP to reporting, dashboards and ad hoc analysis. Our Enterprise Edition BI Suite is modular, enabling users to use the entire set of functionality or to start wherever their priority lies, such as building and deploying a data warehouse or providing management dashboards. And because Pentaho’s BI Suite is modular, users can easily deploy additional functionality as their needs grow or change. Individually, Pentaho’s BI and data integration applications (Pentaho Data Integration, Pentaho Analysis, Pentaho Reporting, Pentaho Dashboards and Pentaho Analyzer) are purpose built and best of breed, providing users with world class BI and data integration functionality to meet the needs of customers ranging from innovative new companies to the Fortune 1000.
    At the most basic level, Pentaho helps you turn the data stored throughout your organization into actionable business intelligence. This functionality can be divided into three core areas: accessing data, optimizing and analyzing data, and then visualizing information via reports or dashboards.
    In terms of accessing data, we integrate with both structured data, such as data stored in a relational database or coming from core business applications such as CRM or ERP, and unstructured complex data via our integration with Apache Hadoop. We offer a graphical interface that allows you to quickly connect and transform data sources simply by dragging and dropping them into the Pentaho development environment.
    Optimizing data means you can slice and dice data to find meaningful trends, uncover root causes or surface other business-relevant information. It allows you to “have a conversation with the data”, interactively exploring data as you see fit. Pentaho also provides data mining capabilities to discover hidden patterns in the data and identify indicators for predicting future performance.
    Visualization consists of reports and dashboards. Reporting is often where organizations start with business intelligence, trying to get business information out of existing systems and make it available to business users in an attractive, easy-to-consume format. Pentaho Reporting provides both operational reporting, such as invoices or bills of lading, and historical and analytical reports.
    Dashboards have become a very popular BI capability because they let end users see their key performance indicators and business metrics in a very easy-to-consume format. Rather than combing through large volumes of reports, users can immediately see which metrics are on track and which ones require immediate attention.
    The underlying BI server integrates all of these end-user capabilities, providing developers a single view of data across the entire suite. No other BI vendor offers the combination of a comprehensive BI suite with the breadth of Pentaho and a single, intuitive development interface that greatly simplifies the creation of new BI applications.
    To meet the range of user needs, Pentaho can be deployed either as an on-premise or an on-demand application. In either deployment scenario you have exactly the same product set and functionality, so it is very easy to migrate in either direction. If you decide to deploy an initial project in the cloud via the on-demand offering in order to deliver business value more quickly, you can move it back on-premise at a later date very easily.
  • Transcript

    • 1. UK Hadoop User Group Meeting
      Davy Nys, RVP of Enterprise Sales EMEA
      October, 2010
      © 2010, Pentaho. All Rights Reserved. www.pentaho.com.
    • 2. About Pentaho
      Recognized leader in open source BI & Data Integration
      Average one download every 30 seconds
      Over 8,000 active production deployments
      Over 1,200 customers in 65 countries
      Saved customers >$2 billion in cumulative licenses and maintenance costs
      Backed by Benchmark Capital, Index Ventures and NEA
    • 3. Driven by Customer and Market Need
      Pentaho has been an industry pioneer and innovator since its founding in 2004. As an OSBI company since its start, Pentaho continues to be driven by customer and market need.
      2004 - Founded
      2005 - First open source BI Platform
      2006 - First to offer live integration with Google Maps
      2008 - First BI company to integrate with the iPhone
      2009 - Announced groundbreaking Agile BI Initiative to address the market need of bringing BI closer to business users.
      Customers approached Pentaho with big data problems
      2010 - First to offer ad hoc analytics to iPad
      2010 - First to announce and deliver code to support Hadoop and big data analytics
    • 4. Why Pentaho BI for Hadoop?
      Pentaho offers full BI Suite
      Data to dashboards (ETL, OLAP, reporting, dashboards, mining)
      Pentaho lowers on-ramp for Hadoop users
      Lowers complexity and learning curve for Big Data analytics
      Enables users to combine structured and unstructured data
      Few Hadoop applications available, critical need
      Rapidly integrate Hadoop into existing data architectures by easily moving data between Hadoop and databases, data warehouses and other enterprise data stores
      Agile BI and modern platform, deployed on-premise or on-demand
      Pentaho brings scalability, clustering and deployment options
      100% Java
      Commitment to open source
      COSS frees up $$ for more servers, CPUs
    • 5. Pentaho for Hadoop Announcements
      Pentaho for Hadoop Download Capability
      Includes support for development; production support will follow with GA
      Collaborative effort between Pentaho and the Pentaho Community
      60+ beta sites over a three-month beta cycle
      Pentaho contributed code for API integration with Hive to the Apache Software Foundation
      Pentaho and Amazon Web Services Partnership
      Combines Pentaho Data Integration for Hadoop with Amazon’s Elastic MapReduce (EMR) to facilitate easy integration with Hadoop data stored in EC2
      Enables hybrid data model between EMR, databases, data warehouses and other on-premise data stores
      Pentaho’s Amazon EC2 offering includes tightly integrated report design for building production or ad hoc reports from data spanning cloud and on-premise data sources (available November, 2010)
    • 6. Pentaho for Hadoop Announcements (cont)
      Pentaho and Cloudera Partnership
      Combines Pentaho’s business intelligence and data integration capabilities with Cloudera’s Distribution for Hadoop (CDH)
      Enables business users to take advantage of Hadoop with the ability to easily and cost-effectively mine, visualize and analyze their Hadoop data
      Pentaho and Impetus Technologies Partnership
      Incorporates Pentaho Agile BI and Pentaho BI Suite for Hadoop into Impetus Large Data Analytics practice
      First major SI to adopt Pentaho for Hadoop
      Facilitates large data analytics projects including expert consulting services, best practices support in Hadoop implementations and nCluster including deployment on private and public clouds
    • 7. Hadoop and BI?
      90% of new Hadoop use cases are transformation of semi/structured data*
      * of those companies we’ve talked to...
      US and Worldwide: +1 (866) 660-7555 | Slide
    • 8. Big Data
      Terabytes and petabytes of data
      Sometimes per day
    • 9. [Diagram. Labels: Data Source, Data Mart(s), Traditional BI, Tape/Trash, plus several question marks]
    • 10. Data Lake
    • 13. Data Lakes
      • 0-2 lakes per company
      • Known and unknown questions
      • Multiple user communities
      • $1-10k questions, not $1m ones
      • Don’t fit in traditional RDBMS with a reasonable cost
    • 18. Data Lake Requirements
      • Store all the data
      • Satisfy routine reporting and analysis
      • Satisfy ad-hoc query / analysis / reporting
      • Balance performance and cost
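The data lake requirements listed on this slide can be illustrated with a toy sketch. It is not Pentaho or Hadoop code: the event records, field names and the helpers `build_mart` and `ad_hoc` are all hypothetical, and an in-memory list stands in for the lake. It only shows the split between keeping all raw data, pre-aggregating for routine reports, and scanning the raw data for unanticipated questions.

```python
from collections import defaultdict

# Hypothetical raw events kept in full ("store all the data").
RAW_EVENTS = [
    {"day": "2010-10-01", "page": "/home", "bytes": 512},
    {"day": "2010-10-01", "page": "/docs", "bytes": 2048},
    {"day": "2010-10-02", "page": "/home", "bytes": 256},
]


def build_mart(events):
    """Pre-aggregate bytes per day: cheap to query for routine reports."""
    totals = defaultdict(int)
    for event in events:
        totals[event["day"]] += event["bytes"]
    return dict(totals)


def ad_hoc(events, predicate):
    """Scan the raw data for a question nobody planned for (slower, flexible)."""
    return [event for event in events if predicate(event)]


mart = build_mart(RAW_EVENTS)
docs_hits = ad_hoc(RAW_EVENTS, lambda e: e["page"] == "/docs")
```

The mart answers the known question quickly, while the full scan trades speed for flexibility, which is the performance/cost balance the slide refers to.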
    • 22. Tape/Trash
      Ad-Hoc
      Data Lake(s)
      Data Warehouse
      What if...
      Data Mart(s)
      Data
      Source
    • 23. Pentaho BI Suite for Hadoop
      Data Marts, Data Warehouse,
      Analytical Applications
      Design
      Deploy
      Orchestrate
      Pentaho Data Integration
      Hadoop
      Pentaho Data Integration
      Pentaho Data Integration
    • 24. Big Data Does Not Replace Data Marts
      • It’s not a database
      • High latency
      • Optimized for massive data-crunching
      • Databases are immature
      • Databases are no-SQL
    • 29. Reporting / Dashboards / Analysis
      Web Tier
      DM & DW
      RDBMS
      Metadata
      Hive
      Hadoop
      Files / HDFS
      Applications & Systems
    • 30. Data Lake(s)
      Data Mart(s)
      Data Warehouse
      Ad-Hoc
      Data
      Source
    • 31. Data Lake
      Reporting / Dashboards / Analysis
      Web Tier
      RDBMS
      Hadoop
      Applications & Systems
    • 32. Visualize
      Reporting / Dashboards / Analysis
      Web Tier
      DM & DW
      RDBMS
      Optimize
      Hive
      Hadoop
      Files / HDFS
      Access
      Applications & Systems
    • 33. Pentaho Turns Data into Information
    • 34. Pentaho BI Suite 3.7
      Data Integration 4.1
      Hadoop integration
      Simple file management for HDFS
      Input data from and output to HDFS
      Use PDI Jobs to coordinate Hadoop job execution
      Transformations as MapReduce jobs in Hadoop
      Integration with Amazon Elastic MapReduce
      User Console Improvements
      Thin client Agile BI Wizard
      Upload and stage data
      Simple generation of reporting/OLAP metadata
      Immediate access to self-service BI
      Analyzer/Mondrian
      Drill through to underlying details
      Conditional formatting (traffic lighting)
      Localization Support
      iPad integration
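For readers unfamiliar with the MapReduce model mentioned in the feature list above, here is a minimal single-process sketch of its three phases in plain Python. It does not use Hadoop or Pentaho Data Integration and says nothing about how PDI translates transformations into jobs; the function names `map_phase`, `shuffle`, `reduce_phase` and `run_job` are made up for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle and reduce in sequence on a small in-memory input."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = run_job(["big data", "Big lake"])
```

In a real cluster the map and reduce calls run in parallel on different machines and the shuffle moves data across the network; the sketch keeps only the data flow.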
    • 35. Benefits for Users
      Pentaho application tools far easier than native Hadoop
      Enables combined hybrid model of structured and unstructured data
      Faster Time-to-Value
      Widens the potential user base of Hadoop
      Commercial Open Source Software (COSS) economics
      Pentaho’s data integration, reporting and analytical capabilities enable Hadoop developers and business analysts to quickly and easily create BI applications without coding
      Pentaho Data Integration (PDI) is a natural fit for Hadoop given its rich design tools, scalable architecture, open source distribution and adoption at a large number of Hadoop sites
    • 36. Pentaho BI Suite Resources & Events
      Resources
      Pentaho BI Suite landing page: www.pentaho.com/hadoop
      Upcoming resources
      Agile BI White Paper by Joshua Greenbaum. In-depth look at why Agile BI is important and how it is changing the BI industry.
      Technical Agile BI White Paper from Pentaho CTO, James Dixon
      Events
      Agile BI Tour: Data to Dashboards in Minutes
      October 13, Oslo, NO
      October 15, Barcelona, ES
      October 19, Seattle, WA
      October 20, Portland, OR
      October 21, San Mateo, CA
      October 22, Kontich, BE
      October 27, Houston, TX
      October 27, Florence, IT
    • 37. Questions and Answers
      Davy Nys
      dnys@pentaho.com or +32 498 160 363
      Join the conversation. You can find us on:
      http://blog.pentaho.com
      @Pentaho
      Pentaho Facebook Group
      Pentaho - Open Source Business Intelligence Group
