1
Webinar:
Automation of Data Warehouses / Lakes / Stores
About the speaker
Our competencies
Company stats
This webinar
begins shortly
Petr Hájek
Senior Advisor
Information Management
SOFTWARE DEVELOPMENT
APPLICATION OUTSOURCING
ENTERPRISE INTEGRATION
BUSINESS INTELLIGENCE/DWH
BIG DATA AND DATA SCIENCE
22+ yrs.
On the
tech market
since 1998.
Prague
Headquarters
at the centre of
Europe.
500+
Experienced
and enthusiastic
professionals.
Top 3
CAD company
in Czech Republic
(IDC study).
28M €
Company
revenue in
2019.
Multiple areas
Clients from
Finance, Insurance
and Telco industry.
– Profinit Data Management
Competency Senior Advisor
– 20+ years of professional
experience in Data Management,
Data Warehousing and Business
Intelligence
– Data Architect, Solution Architect,
Management Consultant
– Citigroup, KPMG
Petr Hájek February 3, 2021
Webinar:
Automation of Data Warehouses /
Data Lakes / Data Stores
About the Lecturer
Petr Hájek
Seasoned and Windswept
Information Management
Professional
4
About Profinit
ISO 9000 ISO 27000 ISO 20000
Certifications, culture & quality
50+
We serve
many prominent
world clients.
A long history of technical engineering excellence has led Western
companies to rely heavily on skills and expertise from the Czech
Republic. We are proud of the quality of our services and of our
ISO 9001, ISO 27001, ISO 20000 and PRINCE2 certificates, which underpin our
commitment to providing high-quality, sustainable services.
5
What are the specifics of “Data Oriented Solutions”?
› Data Warehouses (DWH)
› Operational Data Stores (ODS)
› Data Lakes (DL)
› Accumulation of Large and
Historical Data from
Heterogeneous Sources
› High Complexity & Robustness
6
Data oriented solutions are usually:
UNDOCUMENTED FRAGILE
COMPLICATED
7
What are the major challenges?
› Manual coding
› Too many people to
be organized
› Lack of transparency
8
Reduce the complexity: elementary building blocks
Data Oriented Solutions are:
orchestrated steps of transformations and storage of data
Data → Transformation → Data → Transformation → Data → Transformation → Data
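The notion of "orchestrated steps" can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not the toolset presented in the webinar: the step names and their dependencies are hypothetical, and the ordering relies on the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical steps: each key is a transformation step, each value is the
# set of steps whose stored output it reads.
STEPS = {
    "load_stage_customer": set(),
    "load_stage_contract": set(),
    "build_core_customer": {"load_stage_customer"},
    "build_core_contract": {"load_stage_contract", "build_core_customer"},
    "build_mart_revenue": {"build_core_contract"},
}


def execution_order(steps):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(steps).static_order())


order = execution_order(STEPS)
for position, step in enumerate(order, start=1):
    print(position, step)
```

Because the dependencies are plain data, the same structure could drive a scheduler, a documentation page, or an impact analysis without extra hand-written code.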
9
Decomposition to Metadata / Rules / Templates
Metadata Templates
Rules
10
Generating scripts like printing envelopes
Everything that
is variable (dynamic)
will be written in the
form of metadata.
Everything that
is repeated (static)
will be prepared in
templates.
Metadata Templates
Rules
Generator
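The envelope-printing analogy can be sketched as a mail merge in Python. This is a minimal sketch, assuming a single load template and two metadata rows; the table names, column names and the `generate` helper are hypothetical and only illustrate the split between the static template and the variable metadata.

```python
from string import Template

# Static part: one template per repeated script shape (hypothetical example).
LOAD_TEMPLATE = Template(
    "INSERT INTO $target ($columns)\n"
    "SELECT $columns\n"
    "FROM $source;"
)

# Variable part: one metadata row per table to be loaded (hypothetical names).
METADATA = [
    {"source": "stage.customer", "target": "core.customer",
     "columns": ["customer_id", "name", "country"]},
    {"source": "stage.contract", "target": "core.contract",
     "columns": ["contract_id", "customer_id", "valid_from"]},
]


def generate(template, rows):
    """Merge each metadata row into the template, like a mail merge."""
    scripts = []
    for row in rows:
        scripts.append(template.substitute(
            source=row["source"],
            target=row["target"],
            columns=", ".join(row["columns"]),
        ))
    return scripts


scripts = generate(LOAD_TEMPLATE, METADATA)
print("\n\n".join(scripts))
```

Adding a third table then means adding one metadata row; the template and the generator stay untouched.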
11
Framework
FRAMEWORK DATA SOLUTION
Metadata Templates
Rules
Smart
Automation
Reverse
engineering
12
Investment into the framework
Chart: TIME & COSTS plotted against SCOPE, QUANTITY, COMPLEXITY, ROBUSTNESS
FRAMEWORK SETUP
Gartner: 200 % to 400 % productivity increase
Source: Automating Data Warehouse Development, 2020, G00465794
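The trade-off on this slide can be written as a simple break-even calculation. The symbols are assumptions introduced here for illustration, not figures from the webinar: F is the one-off framework setup cost, c is the manual cost per unit of scope, n is the amount of scope delivered, and p > 1 is the productivity factor of automated delivery relative to manual delivery.

```latex
\begin{align*}
  C_{\text{manual}}(n)    &= n\,c \\
  C_{\text{automated}}(n) &= F + \frac{n\,c}{p} \\
  C_{\text{automated}}(n^{*}) = C_{\text{manual}}(n^{*})
    \;&\Longrightarrow\;
    n^{*} = \frac{F}{c\,\bigl(1 - \tfrac{1}{p}\bigr)} = \frac{F\,p}{c\,(p - 1)}
\end{align*}
```

If a 200 % productivity increase is read as p = 3, the break-even scope is 1.5 F/c; if a 400 % increase is read as p = 5, it is 1.25 F/c. Beyond that scope the framework investment pays off under this simplified linear model.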
13
Transformations Patterns
Technical
› Surrogate key assignment
› Consolidation (Deduplication)
› Historization
Mapping
› 1:1
› Column level mapping
› Join & Lookup
› Aggregation
› Union
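One way to picture reusable patterns is a small pattern library in which each metadata record names the pattern it uses. The sketch below is hypothetical: the function names, metadata keys and generated SQL are illustrative assumptions, and the union pattern uses UNION ALL as one possible choice.

```python
# Hypothetical pattern library: each function turns one metadata record into SQL.

def one_to_one(m):
    """1:1 mapping: copy the listed columns from a single source."""
    cols = ", ".join(m["columns"])
    return f"SELECT {cols} FROM {m['source']}"


def aggregation(m):
    """Aggregation: group by the keys and apply one function per measure."""
    keys = ", ".join(m["group_by"])
    measures = ", ".join(
        f"{func}({col}) AS {col}_{func.lower()}" for col, func in m["measures"]
    )
    return f"SELECT {keys}, {measures} FROM {m['source']} GROUP BY {keys}"


def union(m):
    """Union: stack the same columns from several sources (UNION ALL here)."""
    cols = ", ".join(m["columns"])
    return " UNION ALL ".join(f"SELECT {cols} FROM {src}" for src in m["sources"])


PATTERNS = {"1:1": one_to_one, "aggregation": aggregation, "union": union}


def render(m):
    """Look up the pattern named in the metadata and render its SQL."""
    return PATTERNS[m["pattern"]](m)


EXAMPLES = [
    {"pattern": "1:1", "source": "stage.customer",
     "columns": ["customer_id", "name"]},
    {"pattern": "aggregation", "source": "core.payment",
     "group_by": ["customer_id"], "measures": [("amount", "SUM")]},
    {"pattern": "union", "sources": ["stage.customer_cz", "stage.customer_sk"],
     "columns": ["customer_id", "name"]},
]

generated = [render(m) for m in EXAMPLES]
for sql in generated:
    print(sql)
```

A metadata record that names an unknown pattern fails immediately with a `KeyError`, which is a crude form of the early feedback that a real generator would report more helpfully.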
14
Maintain the framework and automate the lifecycle!
BUSINESS ANALYST
TECHNICAL ANALYST
FRAMEWORK ENGINEER
OPERATIONS & SUPPORT
BUSINESS USERS
Business specs
Prepare METADATA
15
Enable agile development
SQUAD 5: BUSINESS ANALYST, TECHNICAL ANALYST, BUSINESS USER
SQUAD 4: BUSINESS ANALYST, TECHNICAL ANALYST, BUSINESS USER
SQUAD 3: BUSINESS ANALYST, TECHNICAL ANALYST, BUSINESS USER
SQUAD 2: BUSINESS ANALYST, TECHNICAL ANALYST, BUSINESS USER
SQUAD 1: BUSINESS ANALYST, TECHNICAL ANALYST, BUSINESS USER
FRAMEWORK ENGINEER
OPERATIONS & SUPPORT
16
Case Study – DWH for Gambling Industry Regulation
› Data from ~80 public gambling companies loaded every 8 hours
› Maintaining over 25 TB of data in more than 100,000 database tables
› First version delivered in 5 months by a team of 5 engineers
› 99.9 % of SQL code automatically generated
› "DATA_FRAME" - DWH Automation methodology & toolset
17
Advantages of Smart Data Automation
We do the analysis by reusing pre-defined patterns
The results of the analysis are always both human- and machine-readable
We foster prototyping
We enjoy the “license to make mistakes” at almost no cost
We provide immediate feedback in case of errors in the designed metadata
We eliminate manual coding
We streamline the whole development lifecycle and minimize time-to-market
We enable data lineage tracking even before the solution is implemented
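Two of the points above, immediate feedback on metadata errors and lineage before implementation, can be sketched directly on the metadata. This is a minimal sketch under stated assumptions: the catalog, the column mappings and both helper functions are hypothetical, and the mappings are assumed to contain no cycles.

```python
# Hypothetical source catalog: table name -> set of column names.
CATALOG = {
    "stage.customer": {"customer_id", "first_name", "last_name"},
    "stage.contract": {"contract_id", "customer_id", "amount"},
}

# Hypothetical column-level mappings designed by the analyst.
MAPPINGS = [
    {"target": "core.customer.customer_id",
     "sources": ["stage.customer.customer_id"]},
    {"target": "core.customer.full_name",
     "sources": ["stage.customer.first_name", "stage.customer.last_name"]},
    {"target": "mart.revenue.total_amount",
     "sources": ["stage.contract.amount"]},
    {"target": "mart.revenue.customer_name",
     "sources": ["core.customer.full_name"]},
]


def validate(mappings, catalog):
    """Return error messages for source columns that nothing defines."""
    known = {f"{table}.{col}" for table, cols in catalog.items() for col in cols}
    known |= {m["target"] for m in mappings}
    return [
        f"{m['target']}: unknown source {src}"
        for m in mappings
        for src in m["sources"]
        if src not in known
    ]


def lineage(column, mappings):
    """Trace a column back to the original source columns it derives from."""
    by_target = {m["target"]: m["sources"] for m in mappings}
    if column not in by_target:
        return {column}  # not produced by any mapping, so it is an origin
    origins = set()
    for src in by_target[column]:
        origins |= lineage(src, mappings)
    return origins


errors = validate(MAPPINGS, CATALOG)
origins = lineage("mart.revenue.customer_name", MAPPINGS)
print(errors)
print(sorted(origins))
```

Both checks run on the design alone: no table has to exist and no SQL has to be generated before the analyst sees a broken reference or the origin of a report column.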
18
Traditional Approach vs. Smart Automation
Traditional Approach: UNDOCUMENTED, FRAGILE, COMPLICATED
Smart Automation: TRANSPARENT, AGILE, ORGANIZED
19
Questions?
20
Webinar:
Automation of Data Oriented Solutions
We need your help to be better!
The webinar
has ended.
Thank you
very much for
attending!
Since you are here, please help us
improve our events and webinars by taking
a look at our short survey. We appreciate
your willingness to help us grow.
Contacts
www.bigdataforbanking.com
linkedin.com/company/profinit
petr.hajek@profinit.eu
www.profinit.eu
Petr Hájek
Senior Advisor
Information Management
Profinit EU, s.r.o.
Tychonova 2, 160 00 Prague 6 | Phone +420 224 316 016
Web: www.profinit.eu
LinkedIn: linkedin.com/company/profinit
Twitter: twitter.com/Profinit_EU
Facebook: facebook.com/Profinit.EU
YouTube: Profinit EU
Thank you
for your attention

Networking Case Study prepared by teacher.pptx
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptx
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptx
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 

Automating Data Lakes, Data Warehouses and Data Stores

  • 1. Webinar: Automation of Data Warehouses / Lakes / Stores. This webinar begins shortly. About the speaker: Petr Hájek, Senior Advisor, Information Management › Profinit Data Management Competency Senior Advisor › 20+ years of professional experience in Data Management, Data Warehousing and Business Intelligence › Data Architect, Solution Architect, Management Consultant › Citigroup, KPMG. Our competencies: SOFTWARE DEVELOPMENT, APPLICATION OUTSOURCING, ENTERPRISE INTEGRATION, BUSINESS INTELLIGENCE/DWH, BIG DATA AND DATA SCIENCE. Company stats: 22+ yrs. on the tech market, since 1998. Prague: headquarters at the centre of Europe. 500+ experienced and enthusiastic professionals. Top 3 CAD company in the Czech Republic (IDC study). 28M € company revenue in 2019. Multiple areas: clients from the Finance, Insurance and Telco industries.
  • 2. Webinar: Automation of Data Warehouses / Data Lakes / Data Stores. Petr Hájek, February 3, 2021.
  • 3. About the Lecturer › Profinit Data Management Competency Senior Advisor › 20+ years of professional experience in Data Management, Data Warehousing and Business Intelligence › Data Architect, Solution Architect, Management Consultant › Citigroup, KPMG Petr Hájek Seasoned and Windswept Information Management Professional
  • 4. About Profinit. ISO 9000, ISO 27000, ISO 20000. Our competencies: SOFTWARE DEVELOPMENT, APPLICATION OUTSOURCING, ENTERPRISE INTEGRATION, BUSINESS INTELLIGENCE/DWH, BIG DATA AND DATA SCIENCE. Company stats: 22+ yrs. on the tech market, since 1998. Prague: headquarters at the centre of Europe. 500+ experienced and enthusiastic professionals. Top 3 CAD company in the Czech Republic (IDC study). 28M € company revenue in 2019. Multiple areas: clients from the Finance, Insurance and Telco industries. 50+: we serve many prominent world clients. Certifications, culture & quality: A long history of technical engineering excellence has led Western companies to rely heavily on skills and expertise from the Czech Republic. We are proud of the quality of our services and of the certificates ISO 9001, ISO 27001, ISO 20000 and PRINCE2, which underpin our commitment to provide high-quality, sustainable services.
  • 5. What are the specifics of “Data Oriented Solutions”? › Data Warehouses (DWH) › Operational Data Stores (ODS) › Data Lakes (DL) › Accumulation of Large and Historical Data from Heterogeneous Sources › High Complexity & Robustness
  • 6. Data-oriented solutions are usually: UNDOCUMENTED, FRAGILE, COMPLICATED
  • 7. What are the major challenges? › Manual coding › Too many people to be organized › Lack of transparency
  • 8. Reduce the complexity: elementary building blocks. Data Oriented Solutions are orchestrated steps of transformations and storage of data. (Diagram: Data → Transformation → Data → Transformation → Data → Transformation → Data)
  • 9. Decomposition to Metadata / Rules / Templates
  • 10. Generating scripts like printing envelopes: Everything that is variable (dynamic) will be written in the form of metadata. Everything that is repeated (static) will be prepared in templates. (Diagram: Metadata, Templates, Rules → Generator)
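
The "printing envelopes" idea on slide 10 can be illustrated with a minimal sketch. It assumes a single hypothetical 1:1 load pattern; the table names, column names and the `generate` function are invented for illustration and are not taken from the webinar or from any Profinit tool. The static SQL lives in a template, the variable parts live in metadata, and a small generator merges the two.

```python
from string import Template

# Static part: one template per pattern (a hypothetical 1:1 load pattern).
LOAD_TEMPLATE = Template(
    "INSERT INTO $target ($columns)\n"
    "SELECT $columns\n"
    "FROM $source;"
)

# Dynamic part: one metadata record per table (illustrative names only).
METADATA = [
    {"source": "stage.customer", "target": "dwh.customer",
     "columns": ["id", "name"]},
    {"source": "stage.account", "target": "dwh.account",
     "columns": ["id", "customer_id", "balance"]},
]


def generate(meta: dict) -> str:
    """Merge one metadata record into the static template."""
    return LOAD_TEMPLATE.substitute(
        source=meta["source"],
        target=meta["target"],
        columns=", ".join(meta["columns"]),
    )


# One generated script per metadata record.
scripts = [generate(m) for m in METADATA]

if __name__ == "__main__":
    print("\n\n".join(scripts))
```

Adding a table then means adding one metadata record, while changing the load logic means editing one template; no generated script is edited by hand.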
  • 11. Framework (diagram labels: FRAMEWORK, DATA SOLUTION, Metadata, Templates, Rules, Smart Automation, Reverse engineering)
  • 12. Investment into the framework (chart: TIME & COSTS versus SCOPE, QUANTITY, COMPLEXITY, ROBUSTNESS; FRAMEWORK SETUP). Gartner: 200 % - 400 % productivity increase. Source: Automating Data Warehouse Development, 2020, G00465794
  • 13. Transformation Patterns. Technical: › Surrogate key assignment › Consolidation (Deduplication) › Historization. Mapping: › 1:1 › Column level mapping › Join & Lookup › Aggregation › Union
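
Two of the technical patterns named on slide 13, consolidation (deduplication) and surrogate key assignment, can be sketched in plain Python. This is an illustrative sketch only: the rule "the later row wins", the `sk` column name and both function names are assumptions, not the patterns as implemented in the framework presented.

```python
from itertools import count


def deduplicate(rows, business_key):
    """Consolidation: keep one row per business key; later rows win (assumed rule)."""
    latest = {}
    for row in rows:
        latest[row[business_key]] = row
    return list(latest.values())


def assign_surrogate_keys(rows, business_key, key_map=None):
    """Give every business key a stable integer surrogate key.

    key_map carries keys assigned in earlier loads, so a business key
    that was seen before keeps the same surrogate key.
    """
    key_map = {} if key_map is None else key_map
    next_key = count(max(key_map.values(), default=0) + 1)
    keyed = []
    for row in rows:
        bk = row[business_key]
        if bk not in key_map:
            key_map[bk] = next(next_key)
        keyed.append({**row, "sk": key_map[bk]})
    return keyed, key_map


# Illustrative input with one duplicated business key.
rows = [
    {"customer_no": "A1", "name": "Ann"},
    {"customer_no": "B2", "name": "Bob"},
    {"customer_no": "A1", "name": "Ann Smith"},
]
clean = deduplicate(rows, "customer_no")
keyed, key_map = assign_surrogate_keys(clean, "customer_no")

# A later load reuses key_map: known keys are stable, new ones continue the sequence.
later, key_map = assign_surrogate_keys(
    [{"customer_no": "C3", "name": "Cy"}, {"customer_no": "A1", "name": "Ann Smith"}],
    "customer_no",
    key_map,
)
```

Because each pattern is a small, parameterized function, the only per-table input is metadata such as the business key, which is what makes such patterns suitable for generation.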
  • 14. Maintain the framework and automate the lifecycle! Roles: BUSINESS ANALYST, TECHNICAL ANALYST, FRAMEWORK ENGINEER, OPERATIONS & SUPPORT, BUSINESS USERS. Steps shown: Business specs, Prepare METADATA
  • 16. Case Study – DWH for Gambling Industry Regulation › Data from ~80 public gambling companies loaded every 8 hours › Maintaining over 25 TB of data in >100,000 database tables › First version delivered in 5 months by a team of 5 engineers › 99.9 % of SQL code automatically generated › "DATA_FRAME" - DWH Automation methodology & toolset
  • 17. Advantages of Smart Data Automation › We do the analysis by reusing pre-defined patterns › The results of the analysis are always both human- and machine-readable › We foster prototyping › We enjoy the "license to make mistakes" at almost no cost › We provide immediate feedback in case of errors in the designed metadata › We eliminate manual coding › We streamline the whole development lifecycle and minimize time-to-market › We enable data lineage tracking even before the solution is implemented
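
Two of the advantages on slide 17, immediate feedback on metadata errors and lineage tracking before implementation, follow from the metadata being machine-readable. The sketch below is a hypothetical illustration: the metadata layout, the table and column names, and the `validate` and `lineage` functions are assumptions made for this example, not part of the toolset described in the webinar.

```python
# Declared tables and their columns (illustrative names only).
TABLES = {
    "stage.customer": {"id", "name"},
    "dwh.customer": {"customer_sk", "customer_name"},
}

# Column-level mappings designed by an analyst; the second one
# contains a deliberate typo ("idd") to show the feedback.
MAPPINGS = [
    {"target": "dwh.customer.customer_name", "sources": ["stage.customer.name"]},
    {"target": "dwh.customer.customer_sk", "sources": ["stage.customer.idd"]},
]


def split_column(ref):
    """Split 'schema.table.column' into ('schema.table', 'column')."""
    table, _, column = ref.rpartition(".")
    return table, column


def validate(tables, mappings):
    """Report references to tables or columns that are not declared."""
    errors = []
    for mapping in mappings:
        for ref in [mapping["target"], *mapping["sources"]]:
            table, column = split_column(ref)
            if table not in tables:
                errors.append(f"unknown table: {table}")
            elif column not in tables[table]:
                errors.append(f"unknown column: {ref}")
    return errors


def lineage(mappings):
    """Derive (source column, target column) edges from metadata alone."""
    return [(src, m["target"]) for m in mappings for src in m["sources"]]


errors = validate(TABLES, MAPPINGS)
edges = lineage(MAPPINGS)
```

Both checks run on metadata only, so they can be executed before any SQL is generated or deployed.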
  • 18. Traditional Approach vs. Smart Automation. Traditional approach: UNDOCUMENTED, FRAGILE, COMPLICATED. Smart Automation: TRANSPARENT, AGILE, ORGANIZED.
  • 20. Webinar: Automation of Data Oriented Solutions. The webinar has ended. Thank you very much for attending! We need your help to be better! Since you are here, please help us improve our events and webinars by taking a look at our short survey. We appreciate your interest in helping us grow. Contacts: www.bigdataforbanking.com | linkedin.com/company/profinit | petr.hajek@profinit.eu | www.profinit.eu. Petr Hájek, Senior Advisor, Information Management
  • 21. Thank you for your attention. Profinit EU, s.r.o., Tychonova 2, 160 00 Prague 6 | Phone +420 224 316 016 | Web www.profinit.eu | LinkedIn linkedin.com/company/profinit | Twitter twitter.com/Profinit_EU | Facebook facebook.com/Profinit.EU | YouTube Profinit EU