Pentaho Big Data Analytics with Vertica and Hadoop


Overview of the Pentaho Big Data Analytics Suite from the Pentaho + Vertica presentation at Big Data Techcon 2014 in Boston for the session called "The Ultimate Selfie | Picture Yourself with the Fastest Analytics on Hadoop with HP Vertica and Pentaho"


Transcript

  • 1. The Ultimate Selfie | Picture Yourself with the Fastest Analytics on Hadoop with HP Vertica and Pentaho
  • 2. © 2014, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-7555
      The Ultimate Selfie | Picture Yourself with the Fastest Analytics on Hadoop with HP Vertica and Pentaho
      Pentaho Big Data Analytics
      Mark Kromer, Pentaho Big Data Analytics Product Manager
  • 3. Pentaho Business Analytics Platform
      Users: DBA; ETL/BI Developer; Business Users & Executives; Analysts & Data Scientists
      Data sources: Operational Data; Big Data; Data Stream; Public/Private Clouds
      Pentaho Business Analytics: Enterprise & Interactive Reporting; Interactive Analysis; Dashboards; Predictive Analytics
      Data Integration: Instaview | Visual MapReduce
      Direct Access
  • 4. Product Components
      Pentaho Data Integration
        • Visual development for big data
        • Broad connectivity
        • Data quality & enrichment
        • Integrated scheduling
        • Security integration
      Pentaho Analyzer
        • Visual data exploration
        • Ad hoc analysis
        • Interactive charts & visualizations
      Pentaho Dashboards
        • Self-service dashboard builder
        • Content linking & drill through
        • Highly customized mash-ups
      Pentaho Data Mining & Predictive Analytics
        • Model construction & evaluation
        • Learning schemes
        • Integration with 3rd-party models using PMML
      Pentaho Enterprise & Interactive Reports
        • Both ad hoc & distributed reporting
        • Drag & drop interactive reporting
        • Pixel-perfect enterprise reports
      Pentaho for Big Data MapReduce & Instaview
        • Visual interface for developing MapReduce
        • Self-service big data discovery
        • Big data access for data analysts
  • 5. Pentaho Interactive Analysis & Data Discovery: Highly Flexible Advanced Visualizations
      ❯ Simple, easy-to-use visual data exploration
      ❯ Web-based thin client; in-memory caching
      ❯ Rich library of interactive visualizations
        • Geo-mapping, heat grids, scatter plots, bubble charts, line over bar and more
        • Pluggable visualizations
      ❯ Java ROLAP engine to analyze structured and unstructured data, with SQL dialects for querying data from RDBMSs
      ❯ Pluggable cache integrating with leading caching architectures: Infinispan (JBoss Data Grid) & Memcached
  • 6. Pentaho Data Integration: Easy to Use, Highly Scalable
      ❯ Graphical ETL designer
      ❯ Data agnostic
        • Structured, unstructured, web services, packaged apps (Google, SAS, SFDC, etc.), big data sources, traditional sources, JSON, XML, HL7, etc.
      ❯ Batch, low-latency & real-time processing
      ❯ Scale-out architecture, deployable to PDI clusters, Hadoop clusters
      ❯ 100% Java engine; plug-in architecture for extensibility
      ❯ Workflow, alerting, monitoring
      Integration, Manipulation & Enrichment Use Cases:
        • Classic ETL – data warehouse creation, population & maintenance
        • Information Delivery – extraction from multiple data sources, transformation and streaming to a report
        • MapReduce Applications – implementing “code-free” transformation pipelines within Hadoop
        • Extensibility – adding 3rd-party functionality that automatically works within any of the above use cases
  • 7. Pentaho Big Data Analytics: Accelerate the time to big data value
        • Full continuity from data access to decisions – complete data integration & analytics for any big data store
        • Faster development, faster runtime – visual development, distributed execution
        • Instant and interactive analysis – no coding and no ETL required
  • 8. Pentaho Visual Development Eliminates the Need for Complex Coding
      Would you rather do this? Scheduling; Modeling; Ingestion / Manipulation / Integration … or this?
  • 9. Pentaho Visual MapReduce: Drag & Drop, Then Run in the Cluster
      Parallel Execution as MapReduce in the Hadoop Cluster
      As Much as 15x Faster Than Hand-Written Code
  • 10. Pentaho Predictive Analytics: Full Predictive Analytics Lifecycle Support
        • Major sponsor of the open source project Weka
        • Data exploration/visualization, model construction and export, preliminary evaluation
        • Numerous classification/regression and clustering algorithms
        • Integration with Pentaho Data Integration
      ❯ Import 3rd-party models using Predictive Modeling Markup Language (PMML)
      ❯ Operationalize models inside or outside of a Hadoop cluster
      ❯ Incorporate algorithms into Pentaho visual interface; store and version models using the Pentaho repository
  • 11. Streamlined Data Refinery: Drive a Sustainable Analytics Strategy with Big Data Orchestration at Scale
      Diagram labels: Transactions – Batch & Real-time; Enrollments & Redemptions; Location, Email, Other Data; Hadoop Cluster; Analyzer; Reports; Data Orchestration
  • 12. Pentaho Business Analytics
      JOIN THE CONVERSATION. YOU CAN FIND US ON: blog.pentaho.com; @Pentaho; Facebook.com/Pentaho
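
Slide 9 refers to transformations running as "Parallel Execution as MapReduce in the Hadoop Cluster". For readers unfamiliar with the model, the following is a minimal single-process sketch of the map, shuffle and reduce phases in plain Python. It is not Pentaho code and does not use Hadoop; the function names, the record format and the sample data are all assumptions made up for illustration.

```python
from collections import defaultdict

def map_phase(records, mapper):
    """Apply the mapper to every input record and stream out (key, value) pairs."""
    for record in records:
        yield from mapper(record)

def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}

def run_job(records, mapper, reducer):
    """Run the three phases in sequence (a real cluster runs them in parallel)."""
    return reduce_phase(shuffle(map_phase(records, mapper)), reducer)

# Hypothetical mapper: parse "location, amount" lines into (location, amount).
def location_mapper(record):
    location, amount = record.split(",")
    yield location.strip(), float(amount)

# Hypothetical reducer: total the amounts seen for one location.
def sum_reducer(key, values):
    return sum(values)

# Made-up sample input, not data from the presentation.
SAMPLE = ["Boston, 10.0", "Austin, 5.5", "Boston, 2.5"]

totals = run_job(SAMPLE, location_mapper, sum_reducer)
print(totals)
```

In a Hadoop cluster the map and reduce functions run on many nodes at once and the shuffle moves data between them; this sketch only shows the shape of the logic that a mapper and reducer express.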