Pentaho Analytics on MongoDB

A quick run-through of Pentaho Analytics 5.1 on MongoDB, which provides native support for ETL, reporting, and analytics on your MongoDB collections.

Published in: Technology

  • Pentaho 5.0 reinforces Pentaho’s mission of delivering the future of analytics. Pentaho has continued to invest in BI and DI together, with over 100 new features in PDI and over 250 in the platform overall.
    Continued investment in big data, including new integrations with MongoDB and Cassandra, continues to shield customers from changes in the market.
    The open-core, pluggable platform allows us to innovate quickly.
    Pentaho is battle-tested, with over 1,200 commercial customers.
  • Icons are nice and the build-order is great!

    My suggestion for the top 3 icons on the left-hand side:
    Customer
    Provisioning
    Billing

    Suggestion for the bottom 3 icons:
    Web
    Network
    Social Media
    (note: Location seems to be important to AT&T but we can just mention this)

    I need to come up with an explanation for why the arrow below “Just in Time Integration” is bi-directional instead of just flowing to Analytics.
  • http://wiki.pentaho.com/display/EAI/Job+checkpoints+and+restartability
  • Reference Architecture Notes

    Financial services company: ingest data from various sources into a single Big Data store, then process and summarize the data at the customer unique ID level.
    The information is available in the call center application for service, is accessible to research analysts, and is leveraged in predictive applications as well.
    Pentaho Data Integration can ingest into NoSQL, pull out of NoSQL, and connect to Pentaho Business Analytics for end-user needs.
  • Sharding is a method for storing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high-throughput operations.
    Unlike vertical scaling, which adds capacity to a single server, sharding (horizontal scaling) divides the data set and distributes the data over multiple servers, or shards. Each shard is an independent database, and collectively the shards make up a single logical database.
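The idea in the sharding note above can be sketched in a few lines of Python. This is only an illustration of routing documents to shards by a key, not MongoDB's actual implementation; the `customer_id` field, the shard count, and the helper names `shard_for`, `distribute`, and `find_customer` are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 3  # hypothetical cluster size, chosen for illustration


def shard_for(customer_id, num_shards=NUM_SHARDS):
    """Map a shard key to a shard index using a stable hash."""
    digest = hashlib.md5(customer_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribute(documents, num_shards=NUM_SHARDS):
    """Place each document on the shard chosen by its customer_id."""
    shards = defaultdict(list)
    for doc in documents:
        shards[shard_for(doc["customer_id"], num_shards)].append(doc)
    return shards


def find_customer(shards, customer_id, num_shards=NUM_SHARDS):
    """Look only at the one shard that can hold this customer."""
    target = shard_for(customer_id, num_shards)
    return [d for d in shards.get(target, []) if d["customer_id"] == customer_id]


docs = [{"customer_id": f"c{i}", "balance": i * 10} for i in range(100)]
shards = distribute(docs)
```

Each shard holds a disjoint subset of the documents, and together they make up the full data set. A lookup by the shard key touches a single shard, which is the property that lets throughput grow as shards are added.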
  • Pentaho Analytics on MongoDB

    1. Pentaho Analytics for MongoDB. Mark Kromer, Pentaho Big Data Analytics Product Manager, @kromerbigdata. © 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-7555
    2. Pentaho Mission: Enabling the future of analytics
       • Modern, unified data integration and business analytics platform: broadest and deepest big data integration; embeddable, cloud-ready analytics; big data blending at the source
       • Fast and broad innovation: open source development model; 100% Java, pluggable and extensible
       • Critical mass achieved: over 1,200 commercial customers; over 10,000 production deployments
    3. Evolving big data architectures: Blending brings the two worlds together
       [Diagram labels: Customer, Provisioning, Billing, Location, Web, Social Media, Network, Existing ETL Tool or PDI, Existing Process or PDI, PDI, EDW, Data Marts, Hadoop Cluster, Analytic DB, BI Tools, Analytics, On-Demand Integration & Blending]
    4. Pentaho 5.1: Powering Big Data Analytics @ Scale
    5. Powering Big Data Analytics @ Scale: Meeting the demands of the big data-driven enterprise
       • Unleash operational analytics on MongoDB for IT and Business Analysts
       • Unlock value of data in MongoDB for analysts with no coding required
       • Offload data preparation for data scientists
       • Focus on analytics, better understand customer behavior
       • Reduce complexity for big data developers
       • Leverage existing skilled resources and reduce complexity
       • Improve efficiency and performance for analytics
    6. Pentaho Analytics for MongoDB
       [Diagram labels: INGEST, ORCHESTRATE, ANALYZE, ERP, DW, CRM, Unstructured Data, Structured Data, Ingestion, Processing, Raw Data, Parsed Data, Analytic Datasets, Master Data, AGG FRAMEWORK, Data Integration, Analytics, Analysis & Reporting]
    7. Pentaho Interactive Analysis & Data Discovery: Highly Flexible Advanced Visualizations
       ❯ Simple, easy-to-use visual data exploration
       ❯ Web-based thin client; in-memory caching
       ❯ Rich library of interactive visualizations
         • Geo-mapping, heat grids, scatter plots, bubble charts, line over bar and more
         • Pluggable visualizations
       ❯ Java ROLAP engine to analyze structured and unstructured data, with SQL dialects for querying data from RDBMSs
       ❯ Pluggable cache integrating with leading caching architectures: Infinispan (JBoss Data Grid) & Memcached
    8. “The Pentaho platform is meeting unmet market needs, allowing users to directly analyze data in MongoDB. We have seen more accurate results with new analyses and are no longer constrained by having to only pull part of our data.”
       [Diagram labels: Business User (COO), Reporting on Operations and Overhead; End Users, Dashboards and Reports on Customer Policy Data; Data Scientist, Data Mining and Data Governance; PDI; Data Marts; Web Services; Customer Portal; Log Files; Cross Department Operations Data; Transaction and Policy Data; RDBMS; PDI JSON transformation; Analyzer tuned for MongoDB]
    9. Data Integration: ETL, Scheduling, Events, Orchestration
       • 100% Java engine
       • Metadata-driven architecture with a graphical ETL designer
       • Scale-out architecture, deployable to desktop, PDI clusters, and Hadoop clusters
       • Plugin architecture for extensibility
       • Batch, low-latency and real-time processing
       • Rapid onboarding of analytics
       • Embeddable
    10. Concept – Data Transformations: INPUT(S) – PROCESS(ES) – OUTPUT(S)
    11. Concept – Jobs (orchestrate): START – CHECK – WATCH – EXECUTE – NOTIFY – FINISH
    12. Broad Connectivity: Broad connectivity combined with powerful data integration
        [Diagram labels: MongoDB cluster, PDI ETL, Analytics]
    13. Customer 360 – NoSQL Architecture: A Blended View to Drive Revenue Growth and Service Improvements
        Why
        • Ability to blend traditional data sources with Big Data
        • Rapid time to value through drag/drop visual development for Big Data integration
        • Adaptive Big Data layer guards system from changing Big Data versions – reduces risk
        • Comprehensive analytics: visualizations, reports, dashboards, ad hoc analysis
        Reference Architecture Notes
        • Financial services company: Ingest data from source systems into single Big Data store, then process & summarize data at customer unique ID level
        • Information is available in call center application for service, accessible by research analysts, and leveraged in predictive applications as well
        [Diagram labels: NoSQL, CRM System, Documents & Images, Admin. Info, Claims, Online Interactions, Call Center View, Research Analysts, Predictive Analytics, PDI, Analyzer, Reports]
    14. Flexible Schema for Big Data Variety
        Every document in a single collection could have different customer data:
        {name: "jeff", eyes: "blue", loc: [40.7, 73.4], boss: "ben"}
        {name: "brendan", aliases: ["el diablo"]}
        {name: "ben", hat: "yes"}
        {name: "matt", pizza: "DiGiorno", height: 72, loc: [44.6, 71.3]}
        {name: "will", eyes: "blue", birthplace: "NY", aliases: ["bill", "la ciacco"], loc: [32.7, 63.4], boss: "ben"}
        50M Customers = 50M Documents = 1TB
    15. Documents Accelerate Time to Market
        • Reduces development effort
        • Data is more useful than independent representations
        • Documents make it easy to integrate data from multiple schemas into a shared representation
    16. Scale Like an Accordion: Automatic horizontal scaling based on customer ID
    17. New Book – Pentaho Analytics for MongoDB
    18. Thank You
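
The flexible-schema point on slide 14 can be made concrete with a small sketch. It uses the five sample customer documents from that slide as plain Python dictionaries (no database connection is assumed), and the helper names `field_coverage` and `names_where` are hypothetical, introduced only for this example.

```python
from collections import Counter

# The sample documents from slide 14; each one carries a different set of fields.
customers = [
    {"name": "jeff", "eyes": "blue", "loc": [40.7, 73.4], "boss": "ben"},
    {"name": "brendan", "aliases": ["el diablo"]},
    {"name": "ben", "hat": "yes"},
    {"name": "matt", "pizza": "DiGiorno", "height": 72, "loc": [44.6, 71.3]},
    {"name": "will", "eyes": "blue", "birthplace": "NY",
     "aliases": ["bill", "la ciacco"], "loc": [32.7, 63.4], "boss": "ben"},
]


def field_coverage(docs):
    """Count how many documents contain each field."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.keys())
    return counts


def names_where(docs, field, value):
    """Return names of documents whose field equals value; missing fields never match."""
    return [d["name"] for d in docs if d.get(field) == value]


coverage = field_coverage(customers)
reports_to_ben = names_where(customers, "boss", "ben")
```

Only `name` appears in every document, so any analysis over such a collection has to tolerate absent fields, which is why the lookup uses `dict.get` instead of direct indexing.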