Real Use Cases - Pentaho & Big Data Ecosystem

1,148 views

Presentation to demonstrate a set of inspiring use cases.
By Ricardo Pires – Partner & BI Lead @XpandIT

Published in: Technology

  1. Real Use Cases. Ricardo Pires – Partner & BI Lead, Xpand IT
  2. A SET OF INSPIRING USE CASES
  3. USE CASE 1: ALL TRANSACTIONS, ONE DASHBOARD
  4. DYNAMIC VIEW ACROSS SALES
     • Dashboard providing a common view across sales transactions
     • Multiple roles
       • Top management
       • Brand managers
       • Channel managers
     • Requires organizing data in multiple ways
     • Dynamic hierarchies established from multiple attributes
  5. DYNAMIC HIERARCHIES
     • Diagram: hierarchy levels Holding, Brand, Channel, Shop, each with a sum (Σ); attribute filter ATTR = abc
  6. CHALLENGES
     • 3 years of historical data
     • 7.2 billion transactions representing 4.5 TB
     • Wide group of users spread across the organization
     • Intuitive user interface with a great user experience
     • Detailed visualization
     • Row-level security
     • Maximum dashboard load time: 5 s
  7. THE SOLUTION
     • Diagram labels: HDFS, Hive, Impala, HBase, Hadoop, Pentaho Data Integration (PDI), Web Application
  8. KEY TAKEAWAYS
     • Impala on Cloudera Hadoop can be used as an interactive database
     • Hadoop's distributed nature makes it possible to implement use cases that wouldn't be viable on other technologies
       • We went from 7 days of data to 3 years
     • Pentaho Data Integration implements and orchestrates the whole ETL process, making it much easier
       • From traditional data sources to summarized data on Hadoop
  9. USE CASE 2: LOADING THE DATA LAKE
  10. DATA INGESTION
     • The goal of a data lake is to make data available in a centralized location
     • Requires dealing with
       • A wide set of sources
       • Disparate technologies
     • In this case it is a repetitive batch loading process
  11. THE SMART SOLUTION: METADATA DRIVEN INGESTION
     • Configure a Metadata Repository
     • Ingestion Engine based on Templates
     • Use Hadoop as the Data Repository
  12. ARCHITECTURE
     • Diagram labels: Any Datasource, PDI, {REST}, Ingestion Engine, Web UI, Hadoop, HDFS, Hive
  13. KEY TAKEAWAYS
     • Pentaho Data Integration's flexibility is a great match for Hadoop's semi-structured nature
     • Cloudera Hadoop can easily be used to store data and make it immediately available through a SQL interface
     • Patterns and well-defined workflows are essential to data governance
  14. USE CASE 3: FOSTERING TRANSPARENCY
  15. GOVERNMENT CHALLENGE
     • Government agencies have long collected data, but that doesn't mean citizens can easily understand it
     • Challenge
       • Create an intuitive UI to represent more than 100 KPIs across 308 municipalities
       • Become a standard for transparency
  16. ARCHITECTURE
     • Diagram labels: BA Server, Public Data Service, PDI, ETL, Web Application
  17. KEY TAKEAWAYS
     • Pentaho Business Analytics is a comprehensive suite
     • Pentaho Server components are flexible and extensible, allowing custom UIs such as:
       • Analytics portals
       • Analytics embedded in existing products
  18. THANK YOU
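
The dynamic hierarchies of Use Case 1 (Holding, Brand, Channel, Shop, with a sum at each level and an attribute filter such as ATTR = abc) can be illustrated with a small sketch. This is not the implementation from the slides, which relied on Impala and Hadoop; it is plain Python over in-memory rows. The sample data, the field names and the `rollup` function are all hypothetical and only show how the same transactions can be summed under different level orderings.

```python
from collections import defaultdict

# Hypothetical sample transactions; field names are assumptions, not from the slides.
TRANSACTIONS = [
    {"holding": "H1", "brand": "B1", "channel": "online", "shop": "S1", "attr": "abc", "amount": 10.0},
    {"holding": "H1", "brand": "B1", "channel": "retail", "shop": "S2", "attr": "abc", "amount": 5.0},
    {"holding": "H1", "brand": "B2", "channel": "retail", "shop": "S3", "attr": "xyz", "amount": 7.0},
    {"holding": "H1", "brand": "B2", "channel": "online", "shop": "S4", "attr": "abc", "amount": 3.0},
]


def rollup(rows, levels, attr_filter=None):
    """Sum `amount` at every prefix of `levels`.

    `levels` is the hierarchy chosen at query time, so the same rows can be
    viewed as holding > brand > ... or holding > channel > ... and so on.
    Rows whose `attr` differs from `attr_filter` are skipped when a filter is given.
    """
    totals = defaultdict(float)
    for row in rows:
        if attr_filter is not None and row["attr"] != attr_filter:
            continue
        for depth in range(1, len(levels) + 1):
            key = tuple(row[level] for level in levels[:depth])
            totals[key] += row["amount"]
    return dict(totals)


# Two different hierarchies over the same data, both filtered on attr == "abc".
by_brand = rollup(TRANSACTIONS, ["holding", "brand", "channel", "shop"], attr_filter="abc")
by_channel = rollup(TRANSACTIONS, ["holding", "channel", "brand", "shop"], attr_filter="abc")
```

Here `by_brand[("H1", "B1")]` holds the brand-level sum and `by_channel[("H1", "online")]` the channel-level sum, both computed from the same filtered rows.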
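
The volumes quoted in the Use Case 1 challenges (7.2 billion transactions, 4.5 TB, 3 years) can be put in perspective with some back-of-the-envelope arithmetic. The sketch below assumes decimal terabytes, 365-day years and an even spread of transactions over time. None of these assumptions is stated in the slides, so the results are rough orders of magnitude only.

```python
# Figures taken from the slides.
TOTAL_TRANSACTIONS = 7_200_000_000
TOTAL_BYTES = 4_500_000_000_000  # 4.5 TB, assuming decimal terabytes
YEARS = 3

# Assumption: 365-day years and an even spread of transactions over time.
days = YEARS * 365

bytes_per_transaction = TOTAL_BYTES // TOTAL_TRANSACTIONS
transactions_per_day = TOTAL_TRANSACTIONS / days
transactions_in_7_days = transactions_per_day * 7

print(f"average size per transaction: {bytes_per_transaction} bytes")
print(f"average transactions per day: {transactions_per_day:,.0f}")
print(f"rows in a 7-day window:       {transactions_in_7_days:,.0f}")
```

Under these assumptions, moving from 7 days of data to 3 years means going from tens of millions of rows to billions, which is the jump the key takeaways refer to.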
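
The metadata-driven ingestion of Use Case 2 (a metadata repository that configures a template-based ingestion engine, with Hadoop as the data repository) can be sketched as follows. In the slides the engine is built with Pentaho Data Integration; this example swaps in plain Python with `string.Template` purely for illustration. The metadata layout, the source names, the `/data/lake` base path and the `render_ddl` helper are hypothetical.

```python
from string import Template

# Hypothetical metadata repository: one entry per source to ingest.
METADATA = [
    {"source": "crm_customers", "columns": [("id", "BIGINT"), ("name", "STRING")], "delimiter": ","},
    {"source": "erp_orders", "columns": [("order_id", "BIGINT"), ("total", "DOUBLE")], "delimiter": "|"},
]

# A single template shared by every source; only the metadata changes.
HIVE_DDL = Template(
    "CREATE EXTERNAL TABLE IF NOT EXISTS $table ($columns)\n"
    "ROW FORMAT DELIMITED FIELDS TERMINATED BY '$delimiter'\n"
    "STORED AS TEXTFILE\n"
    "LOCATION '$location'"
)


def render_ddl(entry, base_path="/data/lake"):
    """Render the Hive DDL for one metadata entry (base_path is an assumed layout)."""
    columns = ", ".join(f"{name} {dtype}" for name, dtype in entry["columns"])
    return HIVE_DDL.substitute(
        table=entry["source"],
        columns=columns,
        delimiter=entry["delimiter"],
        location=f"{base_path}/{entry['source']}",
    )


# Adding a new source means adding a metadata entry, not writing a new job.
statements = [render_ddl(entry) for entry in METADATA]
for statement in statements:
    print(statement, end="\n\n")
```

The point of the design is that the repetitive batch loads differ only in their metadata, so one template can serve every source.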
