Fast forward Data Warehouse/ETL Testing & Migration Testing
The Agile Way
ETL Testing & Monitoring Platform
Author: Sandesh Gawande
CTO - iCEDQ
Torana, Inc.
Email: Sandesh.g@ToranaInc.com
Office: 203 666 4442
Twitter: @sandesh_gawande
Skype: Sandesh.Gawande.ToranaInc
LinkedIn: https://www.linkedin.com/in/sandesh-gawande-1a25757
About: Torana, Inc.
• Year established: 2005
• Stamford, CT
• Developers of Big Data Integration & Data Migration, Data Warehouse/ETL Testing Software
• Fortune 500 customers in Banking, Insurance, Healthcare, E-commerce, Manufacturing
Find the issues early, or the costs are too high (1x, 100x, 10,000x)
Finding Issues in the QA Stage is the Best, but QA is…
• Not Agile
  • With a waterfall approach it's too late...
• Not Automated
  • Manual data checks → Wasted Time
  • No repeatability or consistency
  • No way to test millions of rows
  • Wrong focus on creating scripts rather than the business problem
  • Cannot reconcile data across systems (e.g. Files vs. Database)
• Not Collaborative
  • QA teams work in isolation
  • No feedback to developers or business users
• Disorganized
  • No Transparency
  • Late Discovery of Issues
  • Project Failure or High Costs
Not Agile
Manual & Slow
Not Collaborative or Feedback
No Transparency or Compliance
But Why is it so Difficult to Automate ETL Testing?
• ETL processes don't have screens
  • Conventional QA automation products were designed for screen-based testing
• New Concepts
  • Source Data + Transformation = Target Data
  • Quality of an ETL Process = Expected Data vs. Actual Data
• Most developers are from traditional software development
  • New to concepts such as data reconciliation for ETL Testing
  • Mix-up of QA/QC concepts with Data Quality
• High Volume of Data (Millions of rows)
  • Since the source data and target data could be in two different systems, reconciliation is difficult
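The two equations above (Source Data + Transformation = Target Data, and quality as expected vs. actual data) can be sketched in a few lines of Python. This is an illustration only, not how iCEDQ is implemented; the `reconcile` function, the sample rows and the name-cleaning transformation are all hypothetical.

```python
def reconcile(source_rows, target_rows, transform, key):
    """Apply the transformation to the source, then diff expected vs. actual by key."""
    expected = {row[key]: row for row in map(transform, source_rows)}
    actual = {row[key]: row for row in target_rows}
    shared = expected.keys() & actual.keys()
    return {
        "missing": sorted(expected.keys() - actual.keys()),      # expected but never loaded
        "unexpected": sorted(actual.keys() - expected.keys()),   # loaded but not in the source
        "mismatched": sorted(k for k in shared if expected[k] != actual[k]),
    }


# Hypothetical ETL rule: trim and upper-case customer names.
def clean_name(row):
    return {"id": row["id"], "name": row["name"].strip().upper()}


source = [{"id": 1, "name": " alice "}, {"id": 2, "name": "bob"}, {"id": 3, "name": "carol"}]
target = [{"id": 1, "name": "ALICE"}, {"id": 2, "name": "Bob"}, {"id": 4, "name": "DAVE"}]

report = reconcile(source, target, clean_name, key="id")
print(report)
```

Here row 3 never reached the target, row 4 has no source, and row 2 was transformed incorrectly. At millions of rows the same comparison would normally be pushed to an engine rather than held in Python dictionaries.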
Introducing iCEDQ…
Automated & Fast
Agile
Collaborative
Feedback & Transparency
An automation platform for:
A. Data Warehouse Testing
• ETL Testing
• MDM Testing
• Data Integration Testing
B. Data Migration Testing
C. Data Monitoring
iCEDQ has an in-Memory Rules Engine
• Validation Rule
  • It tests ETL transformations by validating the output data generated by ETL
• Reconciliation Rule
  • It tests ETL transformations by reconciling source data vs. target data
• …
ETL
Data Warehouse/ETL Test Automation
[Diagram: Data Sources → ETL → Data Warehouse. Validation Tests: Tech Validation Test, Biz Validation Test. Reconciliation Tests: Tech Reconciliation Test, Biz Reconciliation Test]
Technical Validation Rules, Business Validation Rules, Technical Reconciliation Rules, Business Reconciliation Rules
Technical Validation Rules: validate incoming data before processing. Test for…
• Data format
• Nulls
• Data types
• more
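The checks listed above (data format, nulls, data types) could look roughly like this for a single incoming row. This is a minimal sketch under assumed conventions: the column names, the `P` plus six digits policy ID pattern and the ISO date format are hypothetical, not taken from iCEDQ.

```python
import re
from datetime import datetime

# Assumed conventions for the example feed (hypothetical).
POLICY_ID_PATTERN = re.compile(r"^P\d{6}$")
DATE_FORMAT = "%Y-%m-%d"
REQUIRED_COLUMNS = ("policy_id", "effective_date", "premium")


def validate_row(row):
    """Return a list of technical validation errors for one incoming row."""
    errors = []
    # Null check: required columns must be present and non-empty.
    for column in REQUIRED_COLUMNS:
        if row.get(column) in (None, ""):
            errors.append(f"{column}: null")
    # Format check: the policy ID must match the expected pattern.
    policy_id = row.get("policy_id")
    if policy_id and not POLICY_ID_PATTERN.match(policy_id):
        errors.append("policy_id: bad format")
    # Data type check: the effective date must parse as a date.
    effective_date = row.get("effective_date")
    if effective_date:
        try:
            datetime.strptime(effective_date, DATE_FORMAT)
        except ValueError:
            errors.append("effective_date: not a date")
    # Data type check: the premium must parse as a number.
    premium = row.get("premium")
    if premium:
        try:
            float(premium)
        except ValueError:
            errors.append("premium: not numeric")
    return errors


good = validate_row({"policy_id": "P123456", "effective_date": "2024-01-31", "premium": "100.50"})
bad = validate_row({"policy_id": "123", "effective_date": "31/01/2024", "premium": ""})
print(good, bad)
```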
Business Validation Rules: business-rule-based validation will indicate whether there is a data issue because of ETL processes, source data or wrong requirements...
• Check if Net Amount =? Gross Amount - (Taxes + Fees + Commissions)
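The Net Amount check above translates directly into code. The sketch below is one possible form of such a rule; the row layout, column names and sample figures are made up for illustration, and `Decimal` is used so that currency comparisons are exact.

```python
from decimal import Decimal


def net_amount_violations(rows):
    """Return (id, actual net, expected net) for rows that break the business rule."""
    violations = []
    for row in rows:
        expected_net = row["gross"] - (row["taxes"] + row["fees"] + row["commissions"])
        if row["net"] != expected_net:
            violations.append((row["id"], row["net"], expected_net))
    return violations


# Hypothetical target rows produced by an ETL process.
rows = [
    {"id": "T1", "gross": Decimal("100.00"), "taxes": Decimal("8.00"),
     "fees": Decimal("2.00"), "commissions": Decimal("5.00"), "net": Decimal("85.00")},
    {"id": "T2", "gross": Decimal("200.00"), "taxes": Decimal("16.00"),
     "fees": Decimal("4.00"), "commissions": Decimal("10.00"), "net": Decimal("175.00")},
]

violations = net_amount_violations(rows)
print(violations)
```

Row T2 is flagged because its stored net amount is 175.00 while the rule expects 170.00. The rule alone does not say whether the ETL process, the source data or the requirement is at fault.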
Technical Reconciliation Rules: these rules are specific to an ETL process that performs a transformation…
• An ETL process calculating end-of-day balances from daily transactions can be tested: Sum of today's transactions =? Today's End of Day balance - Yesterday's End of Day balance
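The end-of-day balance check above can be written as a small function. This is only a sketch of the arithmetic; the function name and the sample amounts are hypothetical.

```python
from decimal import Decimal


def balance_difference(transactions, eod_today, eod_yesterday):
    """Sum of today's transactions minus the change in end-of-day balance.

    A result of zero means the ETL output reconciles with its input.
    """
    return sum(transactions, Decimal("0")) - (eod_today - eod_yesterday)


transactions = [Decimal("250.00"), Decimal("-75.50"), Decimal("10.25")]

reconciled = balance_difference(transactions, Decimal("1184.75"), Decimal("1000.00"))
off_by = balance_difference(transactions, Decimal("1200.00"), Decimal("1000.00"))
print(reconciled, off_by)
```

Returning the difference rather than a plain pass/fail shows how far off a failing balance is, which helps when tracing the defect back to the ETL process.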
Business Reconciliation Rules: these tests are designed to test the overall system, independent of the ETL processes, source data or business requirements
A
Data Migration Test Automation
[Diagram: Legacy System → ETL → New System, with an Initial Reconciliation Test and a Post Reconciliation Test]
Initial Migration Testing Post Migration Testing
Initial Migration Testing: first, create the data structures (e.g. tables, columns) in the target system; second, copy the initial data from the legacy system to the new database.
• iCEDQ can validate the tables, columns, data types & precision
• Reconcile the legacy vs. target data to make sure they have the same initial state
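The two initial-migration checks above (structure, then initial state) are sketched below, with two in-memory SQLite databases standing in for the legacy and new systems. This is an assumption-laden illustration, not iCEDQ's method: the `customer` table, its columns and both helper functions are hypothetical, and table and key names are interpolated into SQL, so they must come from trusted configuration.

```python
import sqlite3


def table_schema(conn, table):
    """Return (column name, declared type) pairs for a table."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(name, col_type.upper()) for _cid, name, col_type, *_rest in rows]


def compare_initial_state(legacy, target, table, key):
    """Diff the column definitions, then compare all rows ordered by the key column."""
    schema_diff = sorted(set(table_schema(legacy, table)) ^ set(table_schema(target, table)))
    query = f"SELECT * FROM {table} ORDER BY {key}"
    rows_match = legacy.execute(query).fetchall() == target.execute(query).fetchall()
    return {"schema_diff": schema_diff, "rows_match": rows_match}


legacy = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
# The target was created with a different precision on purpose.
legacy.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, balance NUMERIC(10,2))")
target.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, balance NUMERIC(12,2))")
for conn in (legacy, target):
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", [(1, "Ann", 10.5), (2, "Bob", 20)])

result = compare_initial_state(legacy, target, "customer", "id")
print(result)
```

The data matches, but the precision of `balance` differs between the two systems, which is the kind of structural difference the first bullet refers to.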
Post Migration Testing: once the initial state is populated and tested, the post-migration phase involves feeding the same data, or triggering the same business processes, in both the legacy system and the new system.
• iCEDQ can reconcile the data to make sure that, after running the business processes, the data generated is the same
• Regardless of the system change, unless there is a business rule change, the net output from a business point of view must be the same
B
Production Data Monitoring Automation
Source Stage Stage Data Warehouse Data Warehouse Data
Marts
Data Marts Reports /
Extracts
Process
Load Stage
Customer
Process
Load Stage
Policy
Process
Load Stage
Claims
Process
Load Dim
Customer
Process
Load Daily
Claims
Process
Load Month
Policy
Process
Load P&L
Process
Load Dim
Customer
Process
Load Month
Claims
Start
Stop
Monitoring in Series
Embed iCEDQ rules in the batch process
• If an audit fails, the users are notified and the process can be stopped automatically
Monitoring in Parallel
The audit rules are run in parallel to the batch process
• If an audit fails, the users are notified but the process is not stopped automatically
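The difference between the two monitoring modes can be sketched as follows. The example models only the behaviour described above, namely whether a failed audit halts the batch; in a real parallel setup the audits would run outside the batch rather than inside its loop. The `run_batch` function, the step names and the audit are hypothetical.

```python
def run_batch(steps, audits, mode, notify):
    """Run (name, step) pairs in order; `audits` maps a step name to a check returning bool."""
    completed = []
    for name, step in steps:
        step()
        completed.append(name)
        check = audits.get(name)
        if check is not None and not check():
            notify(f"Audit failed after {name}")  # users are notified in both modes
            if mode == "series":
                break  # embedded audit: stop the batch automatically
    return completed


alerts = []
steps = [("load_stage", lambda: None), ("load_dim", lambda: None), ("load_mart", lambda: None)]
audits = {"load_dim": lambda: False}  # an audit that always fails, for demonstration

series_run = run_batch(steps, audits, "series", alerts.append)
parallel_run = run_batch(steps, audits, "parallel", alerts.append)
print(series_run, parallel_run, alerts)
```

In series mode the failed audit keeps `load_mart` from running on bad data; in parallel mode the batch finishes and the alert is handled afterwards.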
iCEDQ
C
iCEDQ is Agile
• Test processes in parallel to the development pipeline
• No reason to wait!
[Diagram: a User Story feeds two pipelines. Development Pipeline: Tech Requirements → Mapping Document → ETL Process. QA Pipeline: Audit Requirements → Test Case → iCEDQ Rule 1, iCEDQ Rule 2, …]
iCEDQ - Central Repository & Collaborative
• Centralized repository for the Rules Library
• A collaborative environment to work together
• Work together regardless of
  • Location
  • Time
  • Role
iCEDQ - Feedback & Transparency
• Dashboard
• Fails & Custom Reports
• Integration with ALM & Issue management
• Auto Notification
• Ability to drill down to a defect
• Audit Logs & execution history…
iCEDQ - What changed?
Before | Item | After
NO | Reconcile Across Files & Database | YES
Very Complicated SQL | | NO SQL or Simple SQL
1000… | Test millions of rows | Millions…
High | Cost | Low
60% | Test Coverage | 100%
NO | Repeatability & Consistency | YES
NO | Scheduling | YES
Desktop Based | Test Execution | Server Based
NO | Transparency & Reporting | YES
High | Cost of Defect | Low
NO | Regression Testing & Audit | YES
NO | Production Monitoring | YES
Who uses iCEDQ?
• Stock Exchange
• Banks
• Insurance
• Manufacturing
• Healthcare
• E-Commerce
• …
iCEDQ Healthcare Client
iCEDQ Usage
• iCEDQ was used for Migration Testing
  • Test provider data migration from Mainframe to MDM
• iCEDQ Enterprise Data Warehouse Testing
  • Test Members Data, Enrolment Data, Plans Data, Claims Data load from Legacy to Enterprise Data Warehouse (EDW) & Health Rules to EDW
• iCEDQ to Validate External Feeds
  • Test data feeds to State of Maryland, CMS (Centers for Medicare & Medicaid Services)
iCEDQ Feedback
• Helped Finalize Requirements
  • It found anomalies in the requirements and mapping documents and provided feedback
• Helped Test Automation
  • It was able to automatically reconcile feeds from the legacy as well as the new system
  • This was impossible to test manually
• Transparency to management
  • It linked with the defect management system and auto-generated status
Fast forward Data Warehouse
& Migration Testing
ETL Testing & Monitoring Platform

 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Vladimir Iglovikov, Ph.D.
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 

Automate Data Warehouse ETL Testing and Migration Testing the Agile Way

  • 1. Fast forward Data Warehouse/ETL Testing and Migration Testing, the Agile Way. ETL Testing & Monitoring Platform.
  • 2. Author: Sandesh Gawande, CTO - iCEDQ, Torana, Inc. Email: Sandesh.g@ToranaInc.com; Office: 203 666 4442; Twitter: @sandesh_gawande; Skype: Sandesh.Gawande.ToranaInc; LinkedIn: https://www.linkedin.com/in/sandesh-gawande-1a25757 About Torana, Inc.: established 2005; Stamford, CT; developers of Big Data integration, data migration, and data warehouse/ETL testing software; Fortune 500 customers in banking, insurance, healthcare, e-commerce, and manufacturing.
  • 4. Finding issues in the QA stage is best, but QA is… Not Agile: with a waterfall approach it's too late. Not Automated: manual data checks waste time; no repeatability or consistency; no way to test millions of rows; wrong focus on creating scripts rather than on the business problem; cannot reconcile data across systems (e.g. files vs. database). Not Collaborative: QA teams work in isolation; no feedback to developers or business users. Disorganized: no transparency; late discovery of issues; project failure or high costs. (Not Agile; Manual & Slow; Not Collaborative or Feedback; No Transparency or Compliance.)
  • 5. But why is it so difficult to automate ETL testing? ETL processes don't have screens: conventional QA automation products were designed for screen-based testing. New concepts: Source Data + Transformation = Target Data; Quality of an ETL Process = Expected Data vs. Actual Data. Most developers come from traditional software development: they are new to concepts such as data reconciliation for ETL testing, and mix up QA/QC concepts with data quality. High volume of data (millions of rows): since the source data and target data could be in two different systems, reconciliation is difficult.
  • 6. Introducing iCEDQ… Automated & Fast; Agile; Collaborative; Feedback & Transparency. An automation platform for: A. Data Warehouse Testing (ETL Testing, MDM Testing, Data Integration Testing); B. Data Migration Testing; C. Data Monitoring.
  • 7. iCEDQ has an in-memory rules engine. Validation Rule: it tests ETL transformations by validating the output data generated by the ETL. Reconciliation Rule: it tests ETL transformations by reconciling source data vs. target data. …
  • 8. Data Warehouse/ETL Test Automation (A). Diagram: Data Sources to ETL to Data Warehouse, with technical and business validation tests and technical and business reconciliation tests. Technical Validation Rules: validate incoming data before processing; test for data format, nulls, data types, and more. Business Validation Rules: business-rule-based validation indicates whether there is a data issue because of ETL processes, source data, or wrong requirements; for example, check whether Net Amount = Gross Amount - (Taxes + Fees + Commissions). Technical Reconciliation Rules: these rules are specific to an ETL process that performs a transformation; for example, an ETL process calculating end-of-day balances from daily transactions can be tested by checking whether the sum of today's transactions = today's end-of-day balance - yesterday's end-of-day balance. Business Reconciliation Rules: these tests are designed to test the overall system independent of the ETL processes, source data, or business requirements.
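
The two example checks on slide 8 can be sketched in a few lines of Python. This is only an illustration of the rule logic, not iCEDQ's rule syntax; the function names, field names, and sample rows are all hypothetical.

```python
from decimal import Decimal


def net_amount_failures(rows):
    """Business validation: return rows where net != gross - (taxes + fees + commissions)."""
    return [
        row for row in rows
        if row["net"] != row["gross"] - (row["taxes"] + row["fees"] + row["commissions"])
    ]


def balances_reconcile(transactions, today_eod, yesterday_eod):
    """Technical reconciliation: sum of today's transactions equals the change in end-of-day balance."""
    return sum(transactions, Decimal("0")) == today_eod - yesterday_eod


# Hypothetical sample data; row 2 carries a deliberately wrong net amount.
rows = [
    {"id": 1, "gross": Decimal("100.00"), "taxes": Decimal("8.00"),
     "fees": Decimal("2.00"), "commissions": Decimal("5.00"), "net": Decimal("85.00")},
    {"id": 2, "gross": Decimal("200.00"), "taxes": Decimal("16.00"),
     "fees": Decimal("4.00"), "commissions": Decimal("10.00"), "net": Decimal("175.00")},
]

failed_ids = [row["id"] for row in net_amount_failures(rows)]
balanced = balances_reconcile(
    [Decimal("50.00"), Decimal("-20.00")],  # today's transactions
    Decimal("130.00"),                      # today's end-of-day balance
    Decimal("100.00"),                      # yesterday's end-of-day balance
)
```

`Decimal` is used instead of `float` so that monetary equality checks are exact; in practice the same comparisons would more likely be expressed as queries against the source and target systems.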
  • 9. Data Migration Test Automation (B). Diagram: Legacy System to ETL to New System, with an initial reconciliation test and a post reconciliation test. Initial Migration Testing: first, create the data structures in the target system (e.g. tables, columns); second, copy the initial data from the legacy system to the new database. iCEDQ can validate the tables, columns, data types, and precision, and reconcile the legacy vs. target data to make sure they have the same initial state. Post Migration Testing: once the initial state is populated and tested, the post-migration phase involves feeding the same data, or triggering the same business processes, in both the legacy system and the new system. iCEDQ can reconcile the data to make sure that, after running the business processes, the data generated is the same, because regardless of the system change, unless there is a business rule change, the net output from a business point of view must be the same.
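
As a rough illustration of the initial migration test on slide 9, the sketch below compares table structure and then reconciles rows by key between a "legacy" and a "target" database. It assumes two in-memory SQLite databases and a hypothetical `provider` table; it is not how iCEDQ itself is implemented.

```python
import sqlite3


def table_schema(conn, table):
    """Return (column name, declared type) pairs for a table."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    return [(row[1], row[2]) for row in conn.execute(f"PRAGMA table_info({table})")]


def reconcile(legacy, target, table, key):
    """Return keys missing from target, extra in target, and present in both but different."""
    def load(conn):
        cur = conn.execute(f"SELECT * FROM {table}")
        key_index = [col[0] for col in cur.description].index(key)
        return {row[key_index]: row for row in cur}

    src, tgt = load(legacy), load(target)
    missing = sorted(src.keys() - tgt.keys())
    extra = sorted(tgt.keys() - src.keys())
    mismatched = sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k])
    return missing, extra, mismatched


# Hypothetical legacy and target systems with the same table structure.
legacy = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for conn in (legacy, target):
    conn.execute("CREATE TABLE provider (id INTEGER PRIMARY KEY, name TEXT, state TEXT)")

legacy.executemany(
    "INSERT INTO provider VALUES (?, ?, ?)",
    [(1, "Acme Clinic", "MD"), (2, "Bay Health", "CT"), (3, "Cedar Care", "NY")],
)
target.executemany(
    "INSERT INTO provider VALUES (?, ?, ?)",
    [(1, "Acme Clinic", "MD"), (2, "Bay Health", "NY")],  # row 3 missing, row 2 altered
)

schemas_match = table_schema(legacy, "provider") == table_schema(target, "provider")
missing, extra, mismatched = reconcile(legacy, target, "provider", "id")
```

Loading both sides into dictionaries keeps the sketch short but only suits small tables; for millions of rows the comparison would need to be chunked or pushed down to the databases.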
  • 10. Production Data Monitoring Automation (C). Diagram: Source to Stage to Data Warehouse to Data Marts to Reports/Extracts, with a batch running from Start to Stop through processes such as Load Stage Customer, Load Stage Policy, Load Stage Claims, Load Dim Customer, Load Daily Claims, Load Month Policy, Load Month Claims, and Load P&L. Monitoring in Series: embed iCEDQ rules in the batch process; if an audit fails, the users are notified and the process can be stopped automatically. Monitoring in Parallel: the audit rules are run in parallel to the batch process; if an audit fails, the users are notified but the process is not stopped automatically.
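
The difference between the two monitoring modes on slide 10 can be sketched as follows. This is a minimal model, assuming each batch step is a (name, load function, audit function) triple and that notification is a simple callback; the step names are borrowed from the slide, and everything else is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_series(steps, notify):
    """Audit after each step; notify and stop the batch at the first failed audit."""
    completed = []
    for name, load, audit in steps:
        load()
        completed.append(name)
        if not audit():
            notify(f"audit failed after {name}; batch stopped")
            break
    return completed


def run_in_parallel(steps, notify):
    """Run audits alongside the batch; notify on failure but never stop the batch."""
    completed = []
    pending = []
    with ThreadPoolExecutor() as pool:
        for name, load, audit in steps:
            load()
            completed.append(name)
            pending.append((name, pool.submit(audit)))  # audit runs in the background
        for name, future in pending:
            if not future.result():
                notify(f"audit failed after {name}; batch continued")
    return completed


# Hypothetical batch: the audit after "Load Dim Customer" fails.
steps = [
    ("Load Stage Customer", lambda: None, lambda: True),
    ("Load Dim Customer", lambda: None, lambda: False),
    ("Load P&L", lambda: None, lambda: True),
]

series_alerts, parallel_alerts = [], []
series_done = run_in_series(steps, series_alerts.append)
parallel_done = run_in_parallel(steps, parallel_alerts.append)
```

In series mode the batch halts after the failing step, so "Load P&L" never runs; in parallel mode all three steps complete and the failure is only reported.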
  • 11. iCEDQ is Agile. Development pipeline: User Story, Tech Requirements, Mapping Document, ETL Process. QA pipeline: Audit Requirements, Test Case, iCEDQ Rule 1, iCEDQ Rule 2, … Test processes in parallel to the development pipeline. No reason to wait!
  • 12. iCEDQ: Central Repository & Collaborative. Centralized repository for the rules library. A collaborative environment to work together, regardless of location, time, or role.
  • 13. iCEDQ: Feedback & Transparency. Dashboard; fails and custom reports; integration with ALM and issue management; auto notification; ability to drill down to a defect; audit logs and execution history…
  • 14. iCEDQ: What changed? (Before → After) Reconcile across files & database: NO → YES. SQL: very complicated SQL → no SQL or simple SQL. Rows tested: 1000… → millions… Test coverage: 60% → 100%. Cost: high → low. Cost of defect: high → low. Repeatability & consistency: NO → YES. Scheduling: NO → YES. Test execution: desktop based → server based. Transparency & reporting: NO → YES. Regression testing & audit: NO → YES. Production monitoring: NO → YES.
  • 15. Who uses iCEDQ? Stock exchanges, banks, insurance, manufacturing, healthcare, e-commerce, …
  • 16. iCEDQ Healthcare Client. iCEDQ usage: (1) migration testing, to test provider data migration from mainframe to MDM; (2) enterprise data warehouse testing, to test members data, enrolment data, plans data, and claims data loads from legacy to the Enterprise Data Warehouse (EDW) and from Health Rules to EDW; (3) validation of external feeds, to test data feeds to the State of Maryland and CMS (Centers for Medicare & Medicaid Services). iCEDQ feedback: (1) helped finalize requirements: it found anomalies in the requirements and mapping documents and provided feedback; (2) helped test automation: it was able to automatically reconcile feeds from the legacy as well as the new system, which was impossible to test manually; (3) transparency to management: it linked with the defect management system and auto-generated status.
  • 17. Fast forward Data Warehouse & Migration Testing ETL Testing & Monitoring Platform