Hadoop uk user group meeting final
  • Pentaho provides a complete enterprise BI suite, from ETL and data integration through OLAP, reporting, dashboards and ad hoc analysis. Our Enterprise Edition BI Suite is modular, so users can adopt the entire set of functionality or start wherever their priority lies, such as building and deploying a data warehouse or providing management dashboards. Because the suite is modular, users can easily deploy additional functionality as their needs grow or change. Individually, Pentaho’s BI and data integration applications (Pentaho Data Integration, Pentaho Analysis, Pentaho Reporting, Pentaho Dashboards and Pentaho Analyzer) are purpose built and best of breed, providing world-class BI and data integration functionality to customers ranging from innovative new companies to the Fortune 1000.

    At the most basic level, Pentaho helps you turn the data stored throughout your organization into actionable business intelligence. This functionality can be divided into three core areas: accessing data, optimizing and analyzing data, and visualizing information via reports or dashboards.

    In terms of accessing data, we integrate with structured data, such as data stored in a relational database or coming from a core business application such as CRM or ERP, as well as unstructured, complex data via our integration with Apache Hadoop. We offer a graphical interface that allows you to quickly connect and transform data sources simply by dragging and dropping them into the Pentaho development environment.

    Optimizing data means you can slice and dice data to find meaningful trends, uncover root causes or surface other business-relevant information. It allows you to “have a conversation with the data”, interactively exploring it as you see fit. Pentaho also provides data mining capabilities to discover hidden patterns in the data and identify indicators for predicting future performance.

    Visualization consists of reports and dashboards. Reporting is often where organizations start with business intelligence, trying to get business information out of existing systems and make it available to business users in an attractive, easy-to-consume format. Pentaho Reporting provides operational reporting, such as invoices or bills of lading, as well as historical and analytical reports. Dashboards have become a very popular BI capability because they let end users see their key performance indicators and business metrics at a glance. Rather than combing through large volumes of reports, users can immediately see which metrics are on track and which require attention.

    The underlying BI server integrates all of these end-user capabilities, providing developers with a single view of data across the entire suite. No other BI vendor combines a BI suite of Pentaho’s breadth with a single, intuitive development interface that greatly simplifies the creation of new BI applications.

    To meet the range of user needs, Pentaho can be deployed either on-premise or on-demand. In either deployment scenario you have exactly the same product set and functionality, so it is very easy to migrate in either direction. If you decide to deploy an initial project in the cloud via the on-demand offering in order to deliver business value more quickly, you can move it back on-premise at a later date very easily.

Hadoop uk user group meeting final Presentation Transcript

  • UK Hadoop User Group Meeting
    Davy Nys, RVP of Enterprise Sales EMEA
    October, 2010
    © 2010, Pentaho. All Rights Reserved. www.pentaho.com.
  • About Pentaho
    Recognized leader in open source BI & Data Integration
    Average one download every 30 seconds
    Over 8,000 active production deployments
    Over 1,200 customers in 65 countries
    Saved customers >$2 billion in cumulative licenses and maintenance costs
    Backed by Benchmark Capital, Index Ventures and NEA
  • Driven by Customer and Market Need
    Pentaho has been an industry pioneer and innovator since its founding in 2004. As an OSBI company since its start, Pentaho continues to be driven by customer and market need.
    2004 - Founded
    2005 - First open source BI Platform
    2006 - First to offer live integration with Google Maps
    2008 - First BI company to integrate with the iPhone
    2009 - Announced groundbreaking Agile BI Initiative to address the market need of bringing BI closer to business users.
    Customers approached Pentaho with big data problems
    2010 - First to offer ad hoc analytics to iPad
    2010 - First to announce and deliver code to support Hadoop and big data analytics
  • Why Pentaho BI for Hadoop?
    Pentaho offers full BI Suite
    Data to dashboards (ETL, OLAP, reporting, dashboards, mining)
    Pentaho lowers on-ramp for Hadoop users
    Lowers complexity and learning curve for Big Data analytics
    Enables users to combine structured and unstructured data
    Few Hadoop applications available, critical need
    Rapidly integrate Hadoop into existing data architectures by easily moving data between Hadoop and databases, data warehouses and other enterprise data stores
    Agile BI and modern platform, deployed on-premise or on-demand
    Pentaho brings scalability, clustering and deployment options
    100% Java
    Commitment to open source
    COSS frees up $$ for more servers, CPUs
  • Pentaho for Hadoop Download Capability
    Includes support for development; production support will follow with GA
    Collaborative effort between Pentaho and the Pentaho Community
    60+ beta sites over three month beta cycle
    Pentaho contributed code for API integration with Hive to the Apache Software Foundation
    Pentaho and Amazon Web Services Partnership
    Combines Pentaho Data Integration for Hadoop with Amazon’s Elastic MapReduce (EMR) to facilitate easy integration with Hadoop data stored in EC2
    Enables hybrid data model between EMR, databases, data warehouses and other on-premise data stores
    Pentaho’s Amazon EC2 offering includes tightly integrated report design for building production or ad hoc reports from data spanning cloud and on-premise data sources (available November, 2010)
    Pentaho for Hadoop Announcements
  • Pentaho for Hadoop Announcements (cont)
    Pentaho and Cloudera Partnership
    Combines Pentaho’s business intelligence and data integration capabilities with Cloudera’s Distribution for Hadoop (CDH)
    Enables business users to take advantage of Hadoop with ability to easily and cost-effectively mine, visualize and analyze their Hadoop data
    Pentaho and Impetus Technologies Partnership
    Incorporates Pentaho Agile BI and Pentaho BI Suite for Hadoop into Impetus Large Data Analytics practice
    First major SI to adopt Pentaho for Hadoop
    Facilitates large data analytics projects including expert consulting services, best practices support in Hadoop implementations and nCluster including deployment on private and public clouds
  • Hadoop and BI?
    90% of new Hadoop use cases are transformation of semi/structured data*
    * of those companies we’ve talked to...
    US and Worldwide: +1 (866) 660-7555
  • Big Data
    Terabytes and petabytes of data
    Sometimes per day
  • [Diagram labels: Data Source; Data Mart(s); Traditional BI; Tape/Trash; remaining elements shown as question marks]
  • Data Lake
    • Single source
    • Large volume
    • Not distilled
  • Data Lakes
    • 0-2 lakes per company
    • Known and unknown questions
    • Multiple user communities
    • $1-10k questions, not $1m ones
    • Don’t fit in traditional RDBMS with a reasonable cost
  • Data Lake Requirements
    • Store all the data
    • Satisfy routine reporting and analysis
    • Satisfy ad-hoc query / analysis / reporting
    • Balance performance and cost
  • What if...
    [Diagram labels: Data Source; Data Lake(s); Data Mart(s); Data Warehouse; Ad-Hoc; Tape/Trash]
  • Pentaho BI Suite for Hadoop
    [Diagram labels: Design; Deploy; Orchestrate; Pentaho Data Integration (shown three times); Hadoop; Data Marts, Data Warehouse, Analytical Applications]
  • Big Data Does Not Replace Data Marts
    • It’s not a database
    • High latency
    • Optimized for massive data-crunching
    • Databases are immature
    • Databases are no-SQL
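
The division of labour on this slide (batch crunching in Hadoop, low-latency queries in a data mart) can be sketched as follows. This is only an illustration under stated assumptions: `BATCH_OUTPUT` stands in for the result of a batch job over raw data, SQLite stands in for the data mart RDBMS, and the table and function names are hypothetical. None of it is Pentaho or Hadoop API.

```python
import sqlite3

# Hypothetical output of a batch job over the raw data lake:
# one (day, country, revenue) row per group. Values are made up.
BATCH_OUTPUT = [
    ("2010-10-01", "UK", 10.0),
    ("2010-10-01", "BE", 5.0),
    ("2010-10-02", "UK", 7.0),
]


def load_data_mart(rows):
    """Load pre-aggregated rows into a small relational table (the 'data mart')."""
    conn = sqlite3.connect(":memory:")  # in-memory SQLite stands in for the RDBMS
    conn.execute(
        "CREATE TABLE daily_revenue (day TEXT, country TEXT, revenue REAL)"
    )
    conn.executemany("INSERT INTO daily_revenue VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def revenue_by_country(conn):
    """Interactive-style query answered by the data mart, not by the batch system."""
    cur = conn.execute(
        "SELECT country, SUM(revenue) FROM daily_revenue "
        "GROUP BY country ORDER BY country"
    )
    return cur.fetchall()


if __name__ == "__main__":
    mart = load_data_mart(BATCH_OUTPUT)
    print(revenue_by_country(mart))
```

The heavy transformation happens once, in batch; reports and dashboards then query the small relational table repeatedly.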
  • [Architecture diagram labels: Reporting / Dashboards / Analysis; Web Tier; DM & DW; RDBMS; Metadata; Hive; Hadoop; Files / HDFS; Applications & Systems]
  • [Diagram labels: Data Source; Data Lake(s); Data Mart(s); Data Warehouse; Ad-Hoc]
  • [Architecture diagram labels: Data Lake; Reporting / Dashboards / Analysis; Web Tier; RDBMS; Hadoop; Applications & Systems]
  • [Architecture diagram labels, in extraction order: Visualize; Reporting / Dashboards / Analysis; Web Tier; DM & DW; RDBMS; Optimize; Hive; Hadoop; Files / HDFS; Access; Applications & Systems]
  • Pentaho Turns Data into Information
  • Pentaho BI Suite 3.7
    Data Integration 4.1
    Hadoop integration
    Simple file management for HDFS
    Input data from and output to HDFS
    Use PDI Jobs to coordinate Hadoop job execution
    Transformations as MapReduce jobs in Hadoop
    Integration with Amazon Elastic MapReduce
    User Console Improvements
    Thin client Agile BI Wizard
    Upload and stage data
    Simple generation of reporting/OLAP metadata
    Immediate access to self-service BI
    Analyzer/Mondrian
    Drill through to underlying details
    Conditional formatting (traffic lighting)
    Localization Support
    iPad integration
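
For readers unfamiliar with what running a transformation "as a MapReduce job" involves, the sketch below shows the general map, shuffle and reduce phases in plain Python. It is not Pentaho Data Integration or Hadoop code: the input lines, field names and function names are all hypothetical, and everything runs in a single process purely to illustrate the shape of such a job.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical semi-structured input: one "key=value" log record per line.
RAW_LINES = [
    "ts=2010-10-01 country=UK product=widget amount=10",
    "ts=2010-10-01 country=BE product=widget amount=5",
    "ts=2010-10-02 country=UK product=gadget amount=7",
    "malformed line without fields",
]


def map_record(line):
    """Map phase: parse one raw line and emit (country, amount) pairs."""
    fields = dict(part.split("=", 1) for part in line.split() if "=" in part)
    if "country" not in fields or "amount" not in fields:
        return []  # skip records that lack the fields we need
    return [(fields["country"], int(fields["amount"]))]


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_group(key, values):
    """Reduce phase: collapse each key's values into a single total."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle and reduce in-process and return totals per key."""
    mapped = chain.from_iterable(map_record(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_group(k, vs) for k, vs in sorted(grouped.items()))


if __name__ == "__main__":
    print(run_job(RAW_LINES))
```

In a real cluster the map and reduce functions run in parallel across many machines and the framework handles the shuffle; the structure of the computation is the same.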
  • Benefits for Users
    Pentaho application tools far easier than native Hadoop
    Enables combined hybrid model of structured and unstructured data
    Faster Time-to-Value
    Widens the potential user base of Hadoop
    Commercial Open Source Software (COSS) economics
    Pentaho’s data integration, reporting and analytical capabilities enable Hadoop developers and business analysts to quickly and easily create BI applications without coding
    Pentaho Data Integration (PDI) is a natural fit for Hadoop given its rich design tools, scalable architecture, open source distribution and adoption at a large number of Hadoop sites
  • Pentaho BI Suite Resources & Events
    Resources
    Pentaho BI Suite landing page: www.pentaho.com/hadoop
    Upcoming resources
    Agile BI White Paper by Joshua Greenbaum. In-depth look at why Agile BI is important and how it is changing the BI industry.
    Technical Agile BI White Paper from Pentaho CTO James Dixon
    Events
    Agile BI Tour: Data to Dashboards in Minutes
    October 13, Oslo, NO
    October 15, Barcelona, ES
    October 19, Seattle, WA
    October 20, Portland, OR
    October 21, San Mateo, CA
    October 22, Kontich, BE
    October 27, Houston, TX
    October 27, Florence, IT
  • Questions and Answers
    Davy Nys
    dnys@pentaho.com or +32 498 160 363
    Join the conversation. You can find us on:
    http://blog.pentaho.com
    @Pentaho
    Pentaho Facebook Group
    Pentaho - Open Source Business Intelligence Group