How to Transform Into a Data-Driven Organization
From data to actionable, enterprise-transforming insights!
WEBINAR
EXIST SOFTWARE LABS INC
Warren Cruz
Data Solutions Architect
Exist Software Labs
Jonas Lim
VP for Tech Services
Exist Software Labs
Make the most of our time together
1. Take Notes
2. Ask Questions
3. Review Later
Data is the new OIL; Data is the new GOLD.
How is your organization holding up in the 4th Industrial Revolution?
"The worldwide data is expected to grow at 61% CAGR (compound annual growth rate) to 175 zettabytes
(a zettabyte is a trillion gigabytes) by 2025, with as much of the data residing in the cloud as in data
centers as per IDC. That’s a lot of data coming into our universe." (Yogesh Gupta, CIO)
● What is Data-Driven Transformation and Can Your Organization Afford to Stay the Same?
● The 4 Key Value-Drivers of Data-Driven Transformation
● Where Are You Now? Your Organization’s Data Maturity Level
● The Key Components of a Data-Driven Transformational Journey
● Putting It All Together: An Open-Source Data-Driven Transformation Starter Pack
What is Data-Driven Transformation and Can Your Organization Afford to Stay the Same?
● What does it mean to be data-driven?
● When and how does data become transformational?
● So what is a Data-Driven Organization?
What are the 4 Key Value-Drivers of Data-Driven Transformation?
DATA turned into INSIGHT helps you...
● Understand and respond to the customer better
● Reimagine and improve business processes
● Identify new opportunities for revenue
● Balance risk and reward
Where Are You Now? Your Organization’s Data Maturity Level
Challenge: To gain insights from all kinds of data, from structured to semi-structured to unstructured!
1. “Where’s the Data?”: No digitized data consolidation yet
2. “What Happened?”: Data Ingestion, Operational Reporting, & Ad-Hoc Reporting
3. “Why Did It Happen?”: Dashboards / Scorecards & Statistical Analysis
4. “What’s Likely to Happen Next?”: Forecasting & Predictive Analysis
5. “What’s the Best Possible Thing That Could Happen?”: Optimization
Chart axes: VALUE TO ORGANIZATION; STRUCTURED DATA, SEMI-STRUCTURED DATA, UNSTRUCTURED DATA
The Key Components of a Data-Driven Transformational Journey
● Data Source
● Data Ingestion
● Data Quality
● Data Hub / Warehouse
● Data Consumer
Data Source
● Structured Data
● Semi-Structured Data
● Unstructured Data
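As a rough illustration of the three kinds of source data, the sketch below reads one small sample of each using only the Python standard library. The sample records, field names, and the keyword check are assumptions made up for illustration; they are not taken from the webinar.

```python
import csv
import io
import json

# Structured: fixed columns, e.g. a CSV export from a database table.
structured = "customer_id,amount\n101,250.00\n102,99.50\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: self-describing but flexible, e.g. a JSON event.
semi_structured = '{"customer_id": 101, "tags": ["vip"], "device": {"os": "android"}}'
event = json.loads(semi_structured)

# Unstructured: free text with no schema; here we only look for a keyword.
unstructured = "The delivery was late, but support resolved my complaint quickly."
mentions_complaint = "complaint" in unstructured.lower()

print(rows[0]["amount"], event["device"]["os"], mentions_complaint)
```

The further right a source sits on this spectrum, the more processing it generally needs before it can be queried like a table.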
Data Ingestion / Data Integration
● ETL (Extract, Transform, Load)
● Batch
● Streaming
CRM, LOB, ERP → Data Warehouse
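The ETL flow named above can be sketched in a few lines of Python. This is a minimal batch example under stated assumptions: the CRM export, its column names, and the `customers` table are hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical CRM export; in practice this would come from a file or an API.
CRM_EXPORT = "id,name,revenue\n1, Acme ,1200\n2,Globex,\n3,Initech,300\n"

def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim names, cast types, and skip rows with no revenue."""
    cleaned = []
    for row in rows:
        if not row["revenue"].strip():
            continue
        cleaned.append((int(row["id"]), row["name"].strip(), float(row["revenue"])))
    return cleaned

def load(conn, records):
    """Load: write the cleaned records into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, name TEXT, revenue REAL)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stand-in for a real data warehouse
load(conn, transform(extract(CRM_EXPORT)))
total = conn.execute("SELECT SUM(revenue) FROM customers").fetchone()[0]
print(total)
```

A batch job like this runs on a schedule over a complete export; a streaming pipeline would apply the same transform step to records one at a time as they arrive.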
Data Ingestion
Licensed and Cloud Service
ETL, Batch, Streaming
Pros
● Support
● Features
Cons
● $$$$
Cloud Service
● Azure Data Factory
● AWS Glue
Data Ingestion
Open Source
ETL, Batch, Streaming
Pros
● $$$
● Might be enough for your needs
Cons
● Support
● Features
Data Quality
● ACCURATE, COMPLETE, CONSISTENT, TIMELY, UNIQUE, and VALID data
● Quality data ensures quality insights
● No DQ (Data Quality) without Data Integration
● Can be comprehensive or minimalist
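A "minimalist" data quality pass might look like the sketch below, which checks three of the dimensions listed above (complete, unique, valid). The records, field names, and the deliberately simple email pattern are assumptions for illustration only; a real tool would cover more rules and dimensions.

```python
import re

# Hypothetical customer records; field names are illustrative only.
records = [
    {"id": 1, "email": "ana@example.com", "country": "PH"},
    {"id": 2, "email": "", "country": "PH"},                 # incomplete
    {"id": 2, "email": "ben@example.com", "country": "PH"},  # duplicate id
    {"id": 3, "email": "not-an-email", "country": "PH"},     # invalid email
]

# Intentionally loose pattern: something@something.something
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def quality_report(rows):
    """Count rows failing three simple checks: complete, unique, valid."""
    seen_ids = set()
    report = {"incomplete": 0, "duplicate": 0, "invalid": 0}
    for row in rows:
        if any(value in ("", None) for value in row.values()):
            report["incomplete"] += 1
        if row["id"] in seen_ids:
            report["duplicate"] += 1
        seen_ids.add(row["id"])
        if row["email"] and not EMAIL_PATTERN.match(row["email"]):
            report["invalid"] += 1
    return report

report = quality_report(records)
print(report)
```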
Data Quality
Licensed and Cloud Services
Pros
● Support
● Features
Cons
● $$$
Cloud Service
● Azure Data Factory
Data Quality
Open Source
● Data Preparator
Pros
● $$$
● Might be enough for your needs
Cons
● Support
● Features
Data Hub / Warehouse
● The heart of a Modern Data Analytics Platform
● More than a central data repository
● Governed data access for all business users
● Massively Parallel Processing
● Advanced Analytics (AI/ML)
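The idea behind Massively Parallel Processing can be sketched as follows: rows are spread across segments by a distribution key, each segment computes a partial result on its own rows, and a coordinator merges the partials. This is a conceptual illustration only, not how any particular product implements it; the segment count, the sample sales rows, and all function names are hypothetical, and the "segments" here run sequentially in one process.

```python
from collections import defaultdict

NUM_SEGMENTS = 2  # assumed for illustration

def distribute(rows, key, num_segments):
    """Assign each row to a segment based on its distribution key."""
    segments = [[] for _ in range(num_segments)]
    for row in rows:
        segments[row[key] % num_segments].append(row)
    return segments

def segment_sum(rows):
    """Work done locally on one segment: partial totals per region."""
    partial = defaultdict(float)
    for row in rows:
        partial[row["region"]] += row["amount"]
    return partial

def coordinator_merge(partials):
    """Work done by the coordinator: combine the partial results."""
    totals = defaultdict(float)
    for partial in partials:
        for region, amount in partial.items():
            totals[region] += amount
    return dict(totals)

sales = [
    {"order_id": 1, "region": "north", "amount": 100.0},
    {"order_id": 2, "region": "north", "amount": 25.5},
    {"order_id": 3, "region": "south", "amount": 50.0},
    {"order_id": 4, "region": "south", "amount": 10.0},
]

segments = distribute(sales, "order_id", NUM_SEGMENTS)
totals = coordinator_merge(segment_sum(rows) for rows in segments)
print(totals)
```

Because each segment only touches its own share of the rows, adding segments lets the same query scan more data in roughly the same time.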
Data Hub / Warehouse
Licensed and Cloud Service
Pros
● Support
● Features
● Critical Component
Cons
● $$$
Cloud Service
EXADATA Parallel Data Warehouse
Data Hub / Warehouse
Open Source
Pros
● $$$
● Might be enough for your needs
Cons
● Support
● Features
Data Consumer
What is Business Intelligence?
● Traditional BI
● Agile (Self-Service) BI
Applications
● Smart Apps
Data Consumer
What are the different data consumers?
Business Intelligence
● Licensed: Tableau, PowerBI, Pentaho, TIBCO Spotfire
● Open source: Pentaho
Applications
● Smart applications, custom applications
● Machine learning and data processing components
Putting It All Together: An Open Source Modern Data Analytic Platform in the Cloud
Identify your data source
Putting It All Together: An Open Source Modern Data Analytic Platform in the Cloud
Set up your ETL
ETL Server: 4 cores, 8GB RAM
Download the Talend Open Studio for Data Integration and Talend Open Studio for Data Quality from the following location:
https://www.talend.com/products/talend-open-studio/
The installation guide can be found here:
https://help.talend.com/reader/i2E~KbXvDgp6fxD072VUmg/Inlarig4UvzOOasUF3HQjg
Putting It All Together: An Open Source Modern Data Analytic Platform in the Cloud
Set up your Data Hub
● Master Host: 8 cores, 16GB RAM
● Segment Host x2: 8 cores, 24GB RAM each
You can download the installer from this location:
https://network.pivotal.io/products/pivotal-gpdb#/releases
The installation guide can be found here:
https://gpdb.docs.pivotal.io/6-7/install_guide/install_guide.html
Putting It All Together: An Open Source Modern Data Analytic Platform in the Cloud
Connect your BI tools / Apps
You can download the installer here:
https://community.jaspersoft.com/download
The installation guide can be found here:
https://community.jaspersoft.com/project/jasperreports-server/resources
Putting It All Together: An Open Source Modern Data Analytic Platform in the Cloud
Q & A
Putting It All Together: An Open Source Modern Data Analytic Platform in the Cloud
Set up your ETL
Secondly, you can provision either a Linux bare-metal machine or a cloud compute instance to function as your ETL Server. A 4-core, 8GB RAM machine should suffice.
Download the Talend Open Studio for Data Integration and Talend Open Studio for Data Quality from the following location:
https://www.talend.com/products/talend-open-studio/
The installation guide can be found here:
https://help.talend.com/reader/i2E~KbXvDgp6fxD072VUmg/Inlarig4UvzOOasUF3HQjg
Putting It All Together: An Open Source Modern Data Analytic Platform in the Cloud
Set up your Data Hub
Thirdly, you can provision 3 instances of either Linux bare-metal machines or cloud compute instances to act as your Data Hub. We will be using Greenplum as our Data Hub platform, which will consist of 1 Master Host and 2 Segment Hosts. The Master Host can have 8 cores with 16GB RAM, while each of the Segment Hosts can have 8 cores with 24GB RAM.
You can download the installer from this location:
https://network.pivotal.io/products/pivotal-gpdb#/releases
The installation guide can be found here:
https://gpdb.docs.pivotal.io/6-7/install_guide/install_guide.html
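For capacity planning, the machine sizes suggested so far (one ETL Server, one Master Host, two Segment Hosts) can be added up with a short script. The `Host` class and its field names are hypothetical helpers for this sketch; the core and RAM figures are the ones quoted in the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Host:
    role: str
    cores: int
    ram_gb: int
    count: int = 1

# Sizes as suggested in the slides.
hosts = [
    Host("ETL Server", cores=4, ram_gb=8),
    Host("Greenplum Master Host", cores=8, ram_gb=16),
    Host("Greenplum Segment Host", cores=8, ram_gb=24, count=2),
]

total_machines = sum(h.count for h in hosts)
total_cores = sum(h.cores * h.count for h in hosts)
total_ram_gb = sum(h.ram_gb * h.count for h in hosts)

print(f"{total_machines} machines, {total_cores} cores, {total_ram_gb}GB RAM")
```

That comes to 4 machines, 28 cores, and 72GB of RAM before adding a host for the BI tool.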
Putting It All Together: An Open Source Modern Data Analytic Platform in the Cloud
Connect your BI tools / Apps
Lastly, you can make use of Jaspersoft Reports as your BI tool.
You can download the installer here:
https://community.jaspersoft.com/download
The installation guide can be found here:
https://community.jaspersoft.com/project/jasperreports-server/resources
