Pivotal Confidential–Internal Use Only
Modern Data Architecture
Alexey Grishchenko

About me
Enterprise Architect @ Pivotal
• 7 years in data processing
• 5 years with MPP
• 4 years with Hadoop
• Spark contributor
• http://0x0fff.com

How it started…
[Diagram, built up one box per slide: Front End, Back End, DBMS]
What about BI?
Just put it there!
[Diagram: Front End, Back End, DBMS, BI]
Was it fast?
[Latency labels added: 10ms, 100ms, 200ms, 1-2 min]
yes, single server…

First Issues
More users got workstations
[Latency labels change from 10ms, 100ms, 200ms, 1-2 min to 10ms, 400ms, 800ms, 1-2 min]
Split!
[Latency labels: 10ms, 300ms, 600ms, 1-2 min]
Even more users?
Split!
[Diagram: three Front End / Back End pairs, one DBMS, BI. Latency labels: 10ms, 100ms, 400ms, 1-2 min]
What about automated systems?
[Diagram: five Front End / Back End pairs, one DBMS, BI. Latency labels: 10ms, 100ms, 1 sec, 5-10 min]
Database, please, live!
[Latency labels: 10ms, 100ms, 800ms, 15-20 min]
What if “split” didn’t help this time?
Split more! Eventually it will help…
[Diagram: five Front End / Back End pairs, five DBMS boxes, BI. Latency labels: 10ms, 100ms, 300ms, 35-40 min]
Sales went 10% up!
Sales went 20% down!
[Latency labels: 10ms, 100ms, 600ms, 2-3 hrs]
Stop loading my system with your stupid reports!

The Era of Data Warehouse
[Diagram: five FE / BE pairs, five DBMS boxes, ETL, DWH, BI. Labels: 100ms, 300ms, 2 days, 1 day]
We need more reports!
[Data Mining and OLAP… are added next to BI. Labels: 100ms, 300ms, 3-4 days, 1 day]
We need secondary site!
[A second set of FE / BE pairs and DBMS boxes is added. Label: WAL Replication, 3-5 minutes late]
Where is our DWH? We need this data now!
[A second ETL, DWH, BI, Data Mining and OLAP… are added. Labels: NAS, NAS; Backup / Restore, 3 days late; 5-7 days]
Why is this data so old?

Advanced Architecture – ELT
[Main two-site diagram as before, with two detail panels]
[ETL panel labels: DBMS, DBMS, DBMS…; ETL; DDS; Data Marts; Reports; Aggregates; OLAP]
[ELT panel labels: DBMS, DBMS, DBMS…; ODS, ODS, ODS…; ELT; DDS; Data Marts; Reports; Aggregates; OLAP]
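
The two panels differ in where the transform step runs: before the load (ETL) or inside the target after a raw load into an ODS (ELT). A minimal sketch of that difference, assuming SQLite stands in for the warehouse. The table names (`ods_sales`, `dds_daily_sales_etl`, `dds_daily_sales_elt`) and the sample rows are hypothetical and do not come from the slides.

```python
import sqlite3

# Hypothetical source rows, as they might be extracted from an OLTP DBMS.
SOURCE_ROWS = [
    ("2015-01-01", "eu", "10.50"),
    ("2015-01-01", "us", "4.00"),
    ("2015-01-02", "eu", "2.25"),
]


def run_etl(conn, rows):
    """ETL: transform outside the warehouse, then load only the result."""
    totals = {}
    for day, _region, amount in rows:
        totals[day] = totals.get(day, 0.0) + float(amount)
    conn.execute("CREATE TABLE dds_daily_sales_etl (day TEXT, total REAL)")
    conn.executemany(
        "INSERT INTO dds_daily_sales_etl VALUES (?, ?)", sorted(totals.items())
    )


def run_elt(conn, rows):
    """ELT: load raw rows into an ODS table, then transform with SQL in place."""
    conn.execute("CREATE TABLE ods_sales (day TEXT, region TEXT, amount TEXT)")
    conn.executemany("INSERT INTO ods_sales VALUES (?, ?, ?)", rows)
    conn.execute(
        """
        CREATE TABLE dds_daily_sales_elt AS
        SELECT day, SUM(CAST(amount AS REAL)) AS total
        FROM ods_sales
        GROUP BY day
        """
    )


def daily_totals(conn, table):
    # `table` is one of the two fixed names created above, never user input.
    return conn.execute(f"SELECT day, total FROM {table} ORDER BY day").fetchall()


conn = sqlite3.connect(":memory:")
run_etl(conn, SOURCE_ROWS)
run_elt(conn, SOURCE_ROWS)
```

Both paths end with the same daily totals. The ELT path also keeps the raw rows in `ods_sales`, so further transforms can be run later without going back to the source.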

[Main diagram: both ETL boxes are now labelled ELT. Latency labels unchanged: 100ms, 300ms, 3-4 days, 1 day, 5-7 days]

Advanced Architecture – CDC
[Main two-site diagram unchanged, with two detail panels]
[Panel 1 labels: DBMS, DBMS, DBMS…; ODS, ODS, ODS…; ELT; DDS; Data Marts; Reports; Aggregates; OLAP]
[Panel 2 labels: the same components plus CDC. Time labels: 1 day, 1 hour]
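
The CDC panel adds change data capture in front of the ODS. A minimal sketch of the idea, assuming changes arrive as an ordered list of insert, update and delete events keyed by primary key. The event format and the `apply_changes` helper are hypothetical; the slides do not name a CDC tool or record layout.

```python
from typing import Dict, Iterable, Tuple

# A change event: (operation, primary key, row values). Hypothetical format;
# real CDC tools emit richer records parsed from the database log.
Change = Tuple[str, int, dict]


def apply_changes(ods: Dict[int, dict], changes: Iterable[Change]) -> Dict[int, dict]:
    """Apply an ordered stream of change events to an ODS copy of a table."""
    for op, key, row in changes:
        if op == "insert":
            ods[key] = dict(row)
        elif op == "update":
            # Merge the changed columns into the existing row.
            ods[key] = {**ods.get(key, {}), **row}
        elif op == "delete":
            ods.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return ods


ods_orders = {1: {"status": "new", "amount": 10}}
change_log = [
    ("insert", 2, {"status": "new", "amount": 25}),
    ("update", 1, {"status": "paid"}),
    ("delete", 2, {}),
    ("insert", 3, {"status": "new", "amount": 7}),
]
apply_changes(ods_orders, change_log)
```

Only the changed rows move, so the copy can be refreshed far more often than a full reload would allow.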

[Main diagram with CDC added next to ELT. Labels: 100ms, 300ms, 1-4 days, 3-24 hrs; WAL Replication, 3-5 minutes late; NAS, NAS; Backup / Restore, 3 days late; 4-7 days]
Why is our secondary site’s DWH so old?

Moving Forward
[Diagram unchanged]
Our problems are
• Time to action takes up to 7 days
• Amount of data is growing
• DWH MPP storage is expensive

Modern Architectures
[Diagram and problem list unchanged; two captions are added]
Data Lake
Lambda

Modern Architectures – Data Lake
[Detail panel labels: Hadoop; DBMS, DBMS, DBMS…; ELT; DDS; OLAP; Data Marts; Aggregates; Reports; ODS, ODS, ODS…; CDC; DWH; ODS; UDS; Analytical Archives; BI; Data Mining; OLAP; SQL-on-Hadoop; Data Mining At Scale]
[Main diagram redrawn with three FE / BE pairs and three DBMS boxes per site. Labels: 100ms, 300ms, 1-4 days, 3-24 hrs; WAL Replication, 3-5 minutes late; NAS, NAS; Backup / Restore, 2 days late; 3-6 days; boxes DWH, Hadoop, Hadoop and a ? mark]

Modern Architectures – Lambda
[Main diagram unchanged]
[Lambda panel labels: Source Data; Speed Layer; Batch Layer; Serving Layer; Master Dataset; Batch View (×3); Real-time View (×3); Query (×2)]
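
The Lambda panel names a batch layer, a speed layer and a serving layer that answers queries from batch views and real-time views. A minimal sketch of how a query might combine the two kinds of view, assuming page-view counting as the workload. The event data, the `batch_cutoff` parameter and the function names are hypothetical. For brevity the real-time view is recomputed here, whereas a real speed layer would normally update it incrementally.

```python
from collections import Counter

# Hypothetical page-view events: (timestamp, page). The master dataset is append-only.
master_dataset = [(1, "home"), (2, "cart"), (3, "home"), (4, "home"), (5, "cart")]


def build_batch_view(events, batch_cutoff):
    """Batch layer: recompute counts from scratch over events up to the cutoff."""
    return Counter(page for ts, page in events if ts <= batch_cutoff)


def build_realtime_view(events, batch_cutoff):
    """Speed layer: cover only the events the last batch run has not seen."""
    return Counter(page for ts, page in events if ts > batch_cutoff)


def query(page, batch_view, realtime_view):
    """Serving side: a query merges the batch view with the real-time view."""
    return batch_view[page] + realtime_view[page]


batch_view = build_batch_view(master_dataset, batch_cutoff=3)
realtime_view = build_realtime_view(master_dataset, batch_cutoff=3)
```

When the next batch run finishes, the cutoff moves forward and the real-time view only has to cover the events after it.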

[Main diagram with two In-Memory Data Store boxes and RTDM added. Labels: 100ms, 300ms, 0-4 days, 0-24 hrs; WAL Replication, 3-5 minutes late; NAS, NAS; Backup / Restore, 2 days late; 3-6 days; the ? mark remains]

Modern Architectures
[Diagram unchanged]
Our problems are
• Too many standby systems
• How to replicate Hadoop cluster?
• How to sync data in real-time systems?
• How to better sync DWH?
Pipelining

Modern Data Architecture – Pipelining
[Main diagram unchanged; a pipeline panel is built up one step per slide]
[FE labels: App, App, App, …; HTTP]
[BE labels: Srv, Srv, Srv, …; SOAP]
[OLTP labels: SP, Table, Log; JDBC]
[CDC labels: copy, Parse, Batch]
[ETL labels: cp, Batch, ETL, load]
[DWH labels: ODS, DDS, DataMart; BI; JDBC]
[SOAP and cp are then removed; labels added: API, Queue, and further ETL, Batch, App and load steps]
[Hadoop labels: STG, Batch, App, HDFS, SQL On Hadoop]
[RTI labels: App, Replicate]
[Final two-site diagram: FE / BE pairs, DBMS, ELT, CDC, DWH, Hadoop, two In-Memory Data Store boxes, BI, Data Mining, OLAP…, RTBI. Labels: Replication Queue, 3-5 minutes late; WAL Replication, 3-5 minutes late. The NAS, Backup / Restore and ? labels no longer appear]

Pivotal and Modern Data Architecture
[Diagram labels: BI; HTTP; Pivotal Cloud Foundry (FE: App, App, App, …; Queue; BE: App, App, App, …); Pivotal GemFire (App); Spring XD (Streaming); Pivotal HD (Streaming Data, Pivotal HAWQ, Data Mart); PostgreSQL (SP, Table); ETL; ETL; Pivotal Greenplum (ES, ODS, DDS, DataMart)]
• Pivotal Labs – agile software development for next-generation applications
• Pivotal Cloud Foundry – PaaS for customer applications
• RabbitMQ – distributed message queue service on top of PCF
• Spring IO – foundation platform for modern applications
• Pivotal GemFire and Apache Geode (incubating) – in-memory data grid enabling real-time data processing and real-time decision making for enterprises
• Spring XD – unified, distributed and extensible framework for data pipelining: ingesting, batching, processing and exporting
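
The four pipelining stages named here (ingesting, batching, processing, exporting) can be sketched as chained generators. This is plain Python written for illustration, not the Spring XD API. The record format, the stage functions and the list used as a sink are all assumptions.

```python
from itertools import islice


def ingest(lines):
    """Ingest: parse raw 'key=value' lines into (key, int) records, skipping junk."""
    for line in lines:
        key, sep, value = line.strip().partition("=")
        if sep and value.isdigit():
            yield key, int(value)


def batch(records, size):
    """Batch: group the record stream into lists of at most `size` items."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def process(batches):
    """Process: reduce each batch to per-key sums."""
    for chunk in batches:
        sums = {}
        for key, value in chunk:
            sums[key] = sums.get(key, 0) + value
        yield sums


def export(results, sink):
    """Export: append each processed batch to a sink (here a plain list)."""
    for result in results:
        sink.append(result)
    return sink


raw = ["a=1", "b=2", "garbage", "a=3", "b=4", "a=5"]
sink = export(process(batch(ingest(raw), size=2)), sink=[])
```

Each stage pulls from the previous one lazily, so records flow through one batch at a time rather than being materialised in full between stages.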
• Pivotal HD – leading Hadoop distribution based on ODP
• Pivotal HAWQ and Apache HAWQ (incubating) – bringing the power of MPP to the Hadoop cluster, best in class SQL-on-Hadoop solution
• Apache Spark – component of the Pivotal HD distribution, modern framework for distributed data processing
• Pivotal PostgreSQL – open source distribution of PostgreSQL, commercially supported by Pivotal
• Pivotal Greenplum – leading analytical MPP database, foundation for the enterprise data warehousing systems and advanced analytics
[The same diagram is then repeated under three captions:]
Data Lake
Lambda Architecture
Pipelining
99Pivotal Confidential–Internal Use Only
Questions?
BUILT FOR THE SPEED OF BUSINESS

Modern Data Architecture

  • 1. 1Pivotal Confidential–Internal Use Only 1Pivotal Confidential–Internal Use Only Modern Data Architecture Alexey Grishchenko
  • 2. 2Pivotal Confidential–Internal Use Only About me Enterprise Architect @ Pivotal  7 years in data processing  5 years with MPP  4 years with Hadoop  Spark contributor  http://0x0fff.com
  • 3. 3Pivotal Confidential–Internal Use Only How it started… Front End
  • 4. 4Pivotal Confidential–Internal Use Only How it started… Front End Back End
  • 5. 5Pivotal Confidential–Internal Use Only How it started… Front End Back End DBMS
  • 6. 6Pivotal Confidential–Internal Use Only How it started… Front End Back End DBMS What about BI?
  • 7. 7Pivotal Confidential–Internal Use Only How it started… Front End Back End DBMS Just put it there!
  • 8. 8Pivotal Confidential–Internal Use Only How it started… Front End Back End DBMS BI
  • 9. 9Pivotal Confidential–Internal Use Only How it started… Front End Back End DBMS BI Was it fast?
  • 10. 10Pivotal Confidential–Internal Use Only How it started… Front End 10ms Back End DBMS BI 100ms 200ms 1-2 min
  • 11. 11Pivotal Confidential–Internal Use Only How it started… Front End 10ms Back End DBMS BI 100ms 200ms 1-2 min yes, single server…
  • 12. First Issues. Diagram: Front End, Back End, DBMS, BI. Latencies: 10ms, 100ms, 200ms, 1-2 min. "More users got workstations"
  • 13. First Issues. Diagram: Front End, Back End, DBMS, BI. Latencies: 10ms, 400ms, 800ms, 1-2 min.
  • 14. First Issues. Diagram: Front End, Back End, DBMS, BI. Latencies: 10ms, 400ms, 800ms, 1-2 min. "Split!"
  • 15. First Issues. Diagram: Front End, Back End, DBMS, BI. Latencies: 10ms, 300ms, 600ms, 1-2 min.
  • 16. First Issues. Diagram: Front End, Back End, DBMS, BI. Latencies: 10ms, 300ms, 600ms, 1-2 min. "Even more users?"
  • 17. First Issues. Diagram: Front End, Back End, DBMS, BI. Latencies: 10ms, 300ms, 600ms, 1-2 min. "Split!"
  • 18. First Issues. Diagram: Front End/Back End ×3, DBMS, BI. Latencies: 10ms, 100ms, 400ms, 1-2 min.
  • 19. First Issues. Diagram: Front End/Back End ×3, DBMS, BI. Latencies: 10ms, 100ms, 400ms, 1-2 min. "What about automated systems?"
  • 20. First Issues. Diagram: Front End/Back End ×5, DBMS, BI. Latencies: 10ms, 100ms, 1 sec, 5-10 min.
  • 21. First Issues. Diagram: Front End/Back End ×5, DBMS, BI. Latencies: 10ms, 100ms, 1 sec, 5-10 min. "Database, please, live!"
  • 22. First Issues. Diagram: Front End/Back End ×5, DBMS, BI. Latencies: 10ms, 100ms, 1 sec, 5-10 min.
  • 23. First Issues. Diagram: Front End/Back End ×5, DBMS, BI. Latencies: 10ms, 100ms, 800ms, 15-20 min.
  • 24. First Issues. Diagram: Front End/Back End ×5, DBMS, BI. Latencies: 10ms, 100ms, 800ms, 15-20 min. "What if “split” didn’t help this time?"
  • 25. First Issues. Diagram: Front End/Back End ×5, DBMS, BI. Latencies: 10ms, 100ms, 800ms, 15-20 min. "Split more! Eventually it will help…"
  • 26. First Issues. Diagram: Front End/Back End ×5, DBMS ×5, BI. Latencies: 10ms, 100ms, 300ms, 35-40 min.
  • 27. First Issues. Diagram: Front End/Back End ×5, DBMS ×5, BI. Latencies: 10ms, 100ms, 300ms, 35-40 min.
  • 28. First Issues. Diagram: Front End/Back End ×5, DBMS ×5, BI. Latencies: 10ms, 100ms, 300ms, 35-40 min. "Sales went 10% up!"
  • 29. First Issues. Diagram: Front End/Back End ×5, DBMS ×5, BI. Latencies: 10ms, 100ms, 300ms, 35-40 min. "Sales went 10% up!" "Sales went 20% down!"
  • 30. First Issues. Diagram: Front End/Back End ×5, DBMS ×5, BI. Latencies: 10ms, 100ms, 600ms, 2-3 hrs. "Sales went 10% up!" "Sales went 20% down!"
  • 31. First Issues. Diagram: Front End/Back End ×5, DBMS ×5, BI. Latencies: 10ms, 100ms, 600ms, 2-3 hrs. "Sales went 10% up!" "Sales went 20% down!" "Stop loading my system with your stupid reports!"
  • 32. The Era of Data Warehouse. Diagram: FE/BE ×5, DBMS ×5, ETL, DWH (1 day), BI. Latencies: 100ms, 300ms, 2 days.
  • 33. The Era of Data Warehouse. Diagram: FE/BE ×5, DBMS ×5, ETL, DWH (1 day), BI. Latencies: 100ms, 300ms, 2 days. "We need more reports!"
  • 34. The Era of Data Warehouse. Diagram: FE/BE ×5, DBMS ×5, ETL, DWH (1 day), BI, Data Mining, OLAP… Latencies: 100ms, 300ms, 3-4 days.
  • 35. The Era of Data Warehouse. Diagram: FE/BE ×5, DBMS ×5, ETL, DWH (1 day), BI, Data Mining, OLAP… Latencies: 100ms, 300ms, 3-4 days. "We need secondary site!"
  • 36. The Era of Data Warehouse. Diagram: FE/BE ×5, DBMS ×5, ETL, DWH (1 day), BI, Data Mining, OLAP… Latencies: 100ms, 300ms, 3-4 days.
  • 37. The Era of Data Warehouse. Primary site: FE/BE ×5, DBMS ×5, ETL, DWH (1 day), BI, Data Mining, OLAP… Latencies: 100ms, 300ms, 3-4 days. Secondary site: FE/BE ×5, DBMS ×5, WAL Replication (3-5 minutes late).
  • 38. The Era of Data Warehouse. Diagram unchanged from slide 37.
  • 39. The Era of Data Warehouse. Diagram unchanged from slide 37. "Where is our DWH? We need this data now!"
  • 40. The Era of Data Warehouse. Diagram unchanged from slide 37.
  • 41. The Era of Data Warehouse. Primary site: FE/BE ×5, DBMS ×5, ETL, DWH (1 day), BI, Data Mining, OLAP… Latencies: 100ms, 300ms, 3-4 days. Secondary site: FE/BE ×5, DBMS ×5, WAL Replication (3-5 minutes late); NAS ×2, Backup / Restore (3 days late); ETL, DWH, BI, Data Mining, OLAP… (5-7 days).
  • 42. The Era of Data Warehouse. Diagram unchanged from slide 41. "Why is this data so old?"
  • 43. The Era of Data Warehouse. Diagram unchanged from slide 41.
  • 44. Advanced Architecture – ELT. Diagram unchanged from slide 41. ETL panel: DBMS, DBMS, DBMS…; ETL; DDS, Data Marts, Reports, Aggregates, OLAP. ELT panel: DBMS, DBMS, DBMS…; ODS, ODS, ODS…; ELT; DDS, Data Marts, Reports, Aggregates, OLAP.
  • 45. Advanced Architecture – ELT. Diagram as on slide 41, with ETL replaced by ELT on both sites.
  • 46. Advanced Architecture – CDC. Diagram unchanged from slide 45. Two ELT panels (DBMS, DBMS, DBMS…; ODS, ODS, ODS…; ELT; DDS, Data Marts, Reports, Aggregates, OLAP), one of them with CDC. Latencies shown: 1 day, 1 hour.
  • 47. Advanced Architecture – CDC. Primary site: FE/BE ×5, DBMS ×5, ELT, CDC, DWH (3-24 hrs), BI, Data Mining, OLAP… Latencies: 100ms, 300ms, 1-4 days. Secondary site: FE/BE ×5, DBMS ×5, WAL Replication (3-5 minutes late); NAS ×2, Backup / Restore (3 days late); ELT, CDC, DWH, BI, Data Mining, OLAP… (4-7 days).
  • 48. Advanced Architecture – CDC. Diagram unchanged from slide 47. "Why is our secondary site’s DWH so old?"
  • 49. Moving Forward. Diagram unchanged from slide 47.
  • 50. Moving Forward. Diagram unchanged from slide 47. Our problems are:
  • 51. Moving Forward. Diagram unchanged from slide 47. Our problems are: Time to action takes up to 7 days.
  • 52. Moving Forward. Diagram unchanged from slide 47. Our problems are: Time to action takes up to 7 days; Amount of data is growing.
  • 53. Moving Forward. Diagram unchanged from slide 47. Our problems are: Time to action takes up to 7 days; Amount of data is growing; DWH MPP storage is expensive.
  • 54. Modern Architectures. Diagram unchanged from slide 47. Our problems are: Time to action takes up to 7 days; Amount of data is growing; DWH MPP storage is expensive. Label: Data Lake.
  • 55. Modern Architectures. Diagram unchanged from slide 47. Our problems are: Time to action takes up to 7 days; Amount of data is growing; DWH MPP storage is expensive. Labels: Lambda, Data Lake.
  • 56. Modern Architectures – Data Lake. Diagram unchanged from slide 47. Data Lake panel labels: Hadoop; DBMS, DBMS, DBMS…; ELT; DDS, OLAP, Data Marts, Aggregates, Reports; ODS, ODS, ODS…; CDC; DWH; ODS, UDS, Analytical Archives; BI, Data Mining, OLAP; SQL-on-Hadoop; Data Mining At Scale.
  • 57. Modern Architectures – Data Lake. Diagram unchanged from slide 47.
  • 58. Modern Architectures – Data Lake. Primary site: FE/BE ×3, DBMS ×3, ELT, CDC, DWH (3-24 hrs), OLAP, Data Mining, BI… Latencies: 100ms, 300ms, 1-4 days. Secondary site: FE/BE ×3, DBMS ×3, WAL Replication (3-5 minutes late); NAS ×2, Backup / Restore (2 days late); ELT, CDC, DWH, Data Mining, BI, OLAP… (3-6 days). Hadoop ×2, marked "?".
  • 59. Modern Architectures – Lambda. Diagram unchanged from slide 58. Lambda panel labels: Source Data, Speed Layer, Batch Layer, Serving Layer, Query ×2, Master Dataset, Batch View ×3, Real-time View ×3.
  • 60. Modern Architectures – Lambda. Diagram unchanged from slide 58.
  • 61. Modern Architectures – Lambda. Diagram as on slide 58, plus In-Memory Data Store ×2 and RTDM. Latencies: 100ms, 300ms, 0-4 days; DWH 0-24 hrs; secondary site OLAP… 3-6 days.
  • 62. Modern Architectures. Diagram unchanged from slide 61. Our problems are:
  • 63. Modern Architectures. Diagram unchanged from slide 61. Our problems are: Too many standby systems.
  • 64. Modern Architectures. Diagram unchanged from slide 61. Our problems are: Too many standby systems; How to replicate Hadoop cluster?
  • 65. Modern Architectures. Diagram unchanged from slide 61. Our problems are: Too many standby systems; How to replicate Hadoop cluster? How to sync data in real-time systems?
  • 66. Modern Architectures. Diagram unchanged from slide 61. Our problems are: Too many standby systems; How to replicate Hadoop cluster? How to sync data in real-time systems? How to better sync DWH?
  • 67. Modern Architectures. Diagram unchanged from slide 61. Our problems are: Too many standby systems; How to replicate Hadoop cluster? How to sync data in real-time systems? How to better sync DWH? Label: Pipelining.
  • 68. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61.
  • 69. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61. Pipeline labels: FE, App, App, App, …, HTTP.
  • 70. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61. Pipeline labels: FE, App, App, App, …, HTTP, BE, Srv, Srv, Srv, …, SOAP.
  • 71. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61. Pipeline labels: FE, App, App, App, …, HTTP, BE, Srv, Srv, Srv, …, SOAP, OLTP, SP, JDBC, Table.
  • 72. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61. Pipeline labels: FE, App, App, App, …, HTTP, BE, Srv, Srv, Srv, …, SOAP, OLTP, SP, JDBC, Log, Table.
  • 73. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61. Pipeline labels: FE, App, App, App, …, HTTP, BE, Srv, Srv, Srv, …, SOAP, OLTP, SP, JDBC, Log, Table, CDC, copy, Parse, Batch.
  • 74. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61. Pipeline labels: FE, App, App, App, …, HTTP, BE, Srv, Srv, Srv, …, SOAP, OLTP, SP, JDBC, Log, Table, CDC, copy, Parse, Batch, ETL, cp, Batch, ETL.
  • 75. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61. Pipeline labels: FE, App, App, App, …, HTTP, BE, Srv, Srv, Srv, …, SOAP, OLTP, SP, JDBC, Log, Table, CDC, copy, Parse, Batch, ETL, cp, Batch, ETL, load, ODS, DWH.
  • 76. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61. Pipeline labels: FE, App, App, App, …, HTTP, BE, Srv, Srv, Srv, …, SOAP, OLTP, SP, JDBC, Log, Table, CDC, copy, Parse, Batch, ETL, cp, Batch, ETL, load, ODS, DDS, DWH.
  • 77. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61. Pipeline labels: FE, App, App, App, …, HTTP, BE, Srv, Srv, Srv, …, SOAP, OLTP, SP, JDBC, Log, Table, CDC, copy, Parse, Batch, ETL, cp, Batch, ETL, load, ODS, DDS, DataMart, DWH.
  • 78. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61. Pipeline labels: FE, BI, App, App, App, …, HTTP, BE, Srv, Srv, Srv, …, SOAP, OLTP, SP, JDBC, Log, Table, CDC, copy, Parse, Batch, ETL, cp, Batch, ETL, load, ODS, DDS, DataMart, DWH, JDBC.
  • 79. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61. Pipeline labels: FE, BI, App, App, App, …, HTTP, BE, Srv, Srv, Srv, …, OLTP, SP, JDBC, Log, Table, CDC, copy, Parse, Batch, ETL, cp, Batch, ETL, ODS, DDS, DataMart, DWH, JDBC.
  • 80. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61. Pipeline labels: FE, BI, App, App, App, …, HTTP, BE, Srv, Srv, Srv, …, OLTP, SP, JDBC, Log, Table, CDC, copy, Parse, Batch, load, ODS, DDS, DataMart, DWH, JDBC, API, Queue, ETL, ETL, Batch.
  • 81. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61. Pipeline labels: FE, BI, App, App, App, …, HTTP, BE, Srv, Srv, Srv, …, OLTP, SP, JDBC, Log, Table, CDC, copy, Parse, Batch, load, ODS, DDS, DataMart, DWH, JDBC, API, Queue, ETL, ETL, Batch, load, ETL.
  • 82. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61. Pipeline labels: FE, BI, App, App, App, …, HTTP, BE, Srv, Srv, Srv, …, OLTP, SP, JDBC, Log, Table, CDC, copy, Parse, Batch, load, ODS, DDS, DataMart, DWH, JDBC, API, Queue, ETL, ETL, Batch, App, ETL, Batch, load, load, ETL.
  • 83. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61. Pipeline labels: FE, BI, App, App, App, …, HTTP, BE, Srv, Srv, Srv, …, OLTP, SP, JDBC, Log, Table, CDC, copy, Parse, Batch, load, ODS, DDS, DataMart, DWH, JDBC, API, Queue, ETL, ETL, Batch, App, ETL, Batch, load, load, ETL, STG, Batch, App, Hadoop, HDFS, SQL On Hadoop.
  • 84. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61. Pipeline labels: FE, BI, App, App, App, …, HTTP, BE, Srv, Srv, Srv, …, OLTP, SP, JDBC, Log, Table, CDC, copy, Parse, Batch, load, ODS, DDS, DataMart, DWH, JDBC, API, Queue, ETL, ETL, Batch, App, ETL, Batch, load, load, ETL, STG, Batch, App, Hadoop, HDFS, SQL On Hadoop, RTI, App.
  • 85. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61. Pipeline labels: FE, BI, App, App, App, …, HTTP, BE, Srv, Srv, Srv, …, OLTP, SP, JDBC, Log, Table, CDC, copy, Parse, Batch, load, ODS, DDS, DataMart, DWH, JDBC, API, Queue, ETL, ETL, Batch, App, ETL, Batch, load, load, ETL, STG, Batch, App, Hadoop, HDFS, SQL On Hadoop, RTI, App, Replicate.
  • 86. Modern Data Architecture – Pipelining. Diagram unchanged from slide 61.
  • 87. Modern Data Architecture – Pipelining. Primary site: FE/BE ×3, DBMS ×3, ELT, CDC, DWH, Hadoop, In-Memory Data Store, BI, OLAP, Data Mining, RTBI… Secondary site: FE/BE ×3, DBMS ×3, WAL Replication (3-5 minutes late), Replication Queue (3-5 minutes late), ELT, CDC, DWH, Hadoop, In-Memory Data Store, BI, Data Mining, RTBI, OLAP…
  • 88. Pivotal and Modern Data Architecture. Diagram: BI; HTTP; Pivotal Cloud Foundry (FE: App, App, App, …; Queue; BE: App, App, App, …); Pivotal GemFire (App); Spring XD (Streaming); Pivotal HD (Streaming, Data, Pivotal HAWQ, Data Mart); Pivotal Greenplum (ES, DDS, DataMart, ODS); PostgreSQL (SP, Table); ETL ×2.
  • 89. Pivotal and Modern Data Architecture. Diagram unchanged from slide 88. Pivotal Labs: agile software development for next-generation applications. Pivotal Cloud Foundry: PaaS for customer applications. RabbitMQ: distributed message queue service on top of PCF. Spring IO: foundation platform for modern applications.
  • 90. Pivotal and Modern Data Architecture. Diagram unchanged from slide 88. Pivotal GemFire and Apache Geode (incubating): in-memory data grid enabling real-time data processing and real-time decision making for enterprises.
  • 91. Pivotal and Modern Data Architecture. Diagram unchanged from slide 88. Spring XD: unified, distributed and extensible framework for data pipelining (ingesting, batching, processing and exporting).
  • 92. Pivotal and Modern Data Architecture. Diagram unchanged from slide 88. Pivotal HD: leading Hadoop distribution based on ODP. Pivotal HAWQ and Apache HAWQ (incubating): bringing the power of MPP to the Hadoop cluster, best-in-class SQL-on-Hadoop solution. Apache Spark: component of the Pivotal HD distribution, modern framework for distributed data processing.
  • 93. Pivotal and Modern Data Architecture. Diagram unchanged from slide 88. Pivotal PostgreSQL: open source distribution of PostgreSQL, commercially supported by Pivotal.
  • 94. Pivotal and Modern Data Architecture. Diagram unchanged from slide 88. Pivotal Greenplum: leading analytical MPP database, foundation for enterprise data warehousing systems and advanced analytics.
  • 95. Pivotal and Modern Data Architecture. Diagram unchanged from slide 88. Label: Data Lake.
  • 96. Pivotal and Modern Data Architecture. Diagram unchanged from slide 88. Label: Lambda Architecture.
  • 97. Pivotal and Modern Data Architecture. Diagram unchanged from slide 88. Label: Pipelining.
  • 98. Pivotal and Modern Data Architecture. Diagram unchanged from slide 88.
  • 99. Questions?
  • 100. BUILT FOR THE SPEED OF BUSINESS
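
Note on slide 59 (Modern Architectures – Lambda): the labels on that slide (Master Dataset, Batch View, Real-time View, Serving Layer, Query) can be read as a small data flow. The Python sketch below is only an illustration under stated assumptions: the event shape (key, amount), the sample data, and the names `master_dataset`, `recent_events`, `build_view` and `query` are hypothetical and do not come from the deck.

```python
from collections import defaultdict

# Hypothetical sample data; each event is a (key, amount) pair.
# Master dataset: immutable, append-only history (already batch-processed).
master_dataset = [("sales", 10), ("sales", 5), ("returns", 2)]
# Recent events: arrived after the last batch run, not yet in the batch view.
recent_events = [("sales", 3), ("returns", 1)]


def build_view(events):
    """Aggregate events into a key -> total view."""
    view = defaultdict(int)
    for key, amount in events:
        view[key] += amount
    return dict(view)


def query(key, batch_view, realtime_view):
    """Serving step: merge the batch view with the real-time view."""
    return batch_view.get(key, 0) + realtime_view.get(key, 0)


# Batch layer: view recomputed from the whole master dataset.
batch_view = build_view(master_dataset)
# Speed layer: view covering only the recent events.
realtime_view = build_view(recent_events)

print(query("sales", batch_view, realtime_view))
```

In this reading, the batch view can be slow to rebuild because the real-time view covers the gap since the last batch run; a query sees both by merging them.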