Getting denormalized data from SAP at sub-15-minute intervals using NiFi, then automating the process so NiFi builds the flow for you: a database crawler and a data type inference engine drive API calls that construct the NiFi flows.
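The abstract's "API calls to build the NiFi flows" step can be sketched as follows. This is a minimal illustration, not the presenters' actual code: it builds the JSON body that NiFi's REST API (`POST /nifi-api/process-groups/{id}/processors`) expects when creating a `QueryDatabaseTable` processor for one crawled table. The table name `EKPO` and the property spellings are assumptions to verify against your NiFi version.

```python
# Construct a create-processor request body for one source table,
# as the automated flow builder would per crawled table.
import json

def build_processor_request(table_name: str, max_value_column: str) -> dict:
    """Build the create-processor request body for one source table."""
    return {
        "revision": {"version": 0},  # new NiFi components start at version 0
        "component": {
            "type": "org.apache.nifi.processors.standard.QueryDatabaseTable",
            "name": f"Extract {table_name}",
            "config": {
                "properties": {
                    "Table Name": table_name,
                    # incremental fetch: NiFi tracks the max value seen so far
                    "Maximum-value Columns": max_value_column,
                },
                # poll well under the 15-minute target from the title
                "schedulingPeriod": "5 min",
            },
        },
    }

body = build_processor_request("EKPO", "UPDATE_TS")  # EKPO: example SAP table
print(json.dumps(body, indent=2))
```

The body would then be POSTed to the target process group; repeating this per table discovered by the crawler is what lets NiFi "build the flow for you".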
1. Bridging the Gap
John Kuchmek – American Water
Adam Michalsky – American Water
Nagaraj Jayakumar – Hortonworks
2. WHO WE ARE
We have a broad national footprint and a strong
local presence.
We provide services to approximately 15 million
people in 46 states and Ontario, Canada.
We employ 6,900 dedicated employees and maintain
an ongoing commitment to community support and
corporate responsibility.
We treat and deliver more than one billion gallons
of water daily.
We are the largest and most geographically
diverse publicly traded water and wastewater
service provider in the United States.
3. Problem Statement
Achieve fast change data capture from SAP while providing
denormalized data sets to end consumers, without impacting the
source transactional systems.
HANA table replication maintains source system normalization,
which can be a problem for business logic design in application
use.
No HANA change data capture existed using denormalized table
structures.
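The normalization problem above can be made concrete with a toy example. Assuming hypothetical header and item tables (the real SAP tables and columns are not named on this slide), denormalization means joining header attributes onto every item row so consumers get one wide row per item instead of having to re-join the replicated tables themselves:

```python
# Hypothetical normalized header/item rows as they would arrive via replication.
headers = [{"doc_id": 1, "vendor": "ACME"}]
items = [
    {"doc_id": 1, "item": 10, "amount": 250.0},
    {"doc_id": 1, "item": 20, "amount": 75.0},
]

def denormalize(headers, items):
    """Join header attributes onto every item row (one wide row per item)."""
    by_id = {h["doc_id"]: h for h in headers}
    return [{**by_id[i["doc_id"]], **i} for i in items]

rows = denormalize(headers, items)
# → [{'doc_id': 1, 'vendor': 'ACME', 'item': 10, 'amount': 250.0},
#    {'doc_id': 1, 'vendor': 'ACME', 'item': 20, 'amount': 75.0}]
```

The point of the problem statement is that this flattening has to happen downstream of replication, and change capture has to work on the flattened form.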
4. Environment
4 management nodes (32 cores × 78 GB each)
8 compute nodes (32 cores × 128 GB each)
2 management nodes (6 cores × 16 GB each)
5 NiFi nodes (16 cores × 64 GB each)
14. Average Memory Used (hourly)
[Chart: average memory used (GB, 0–100) over time across the 8-node cluster, plotting the average of minimum, average, and peak memory used.]
The end result in HANA will look like this; UPDATE_TS is our timestamp field.
Special notes:
A timestamp is only populated once a change occurs. After initial replication, timestamps will be NULL or 0.
If you want to add a timestamp to a table that already exists in SLT, the table must be re-replicated.
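The special notes above imply a watermark-style change pull: changed rows carry a real UPDATE_TS, while freshly replicated, never-modified rows stay NULL or 0 and are only seen by the initial full load. A minimal sketch of that query logic, with a placeholder table name (the presenters' actual schema is not shown here):

```python
def incremental_query(table: str, last_ts: int) -> str:
    """Return a HANA-style query for rows changed since the last run.

    NULL/0 timestamps mean 'replicated but never changed since', so those
    rows are only captured by the initial full load (last_ts == 0), never
    by an incremental run.
    """
    if last_ts == 0:
        return f"SELECT * FROM {table}"  # initial full load
    return (
        f"SELECT * FROM {table} "
        f"WHERE UPDATE_TS > {last_ts}"   # changed since the watermark
    )

print(incremental_query("EKPO", 20240101000000))  # incremental pull
print(incremental_query("EKPO", 0))               # initial full load
```

After each run, the maximum UPDATE_TS seen becomes the next watermark, which is exactly the bookkeeping NiFi's maximum-value tracking automates.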