Data Warehouse, ETL & Migration projects are exposed to huge financial risks due to a lack of QA automation. At iCEDQ, we recommend an agile, rules-based testing approach for all data integration projects.
Creating a Data validation and Testing Strategy by RTTS
This document discusses strategies for creating an effective data validation and testing process. It provides examples of common data issues found during testing such as missing data, wrong translations, and duplicate records. Solutions discussed include identifying important test points, reviewing data mappings, developing automated and manual testing approaches, and assessing how much data needs validation. The presentation also includes a case study of a company that improved its process by centralizing documentation, improving communication, and automating more of its testing.
What is a Data Warehouse and How Do I Test It? by RTTS
ETL Testing: A primer for Testers on Data Warehouses, ETL, Business Intelligence and how to test them.
Are you hearing and reading about Big Data, Enterprise Data Warehouses (EDW), the ETL Process and Business Intelligence (BI)? The software markets for EDW and BI are quickly approaching $22 billion, according to Gartner, and Big Data is growing at an exponential pace.
Are you being tasked to test these environments or would you like to learn about them and be prepared for when you are asked to test them?
RTTS, the Software Quality Experts, provided this groundbreaking webinar, based upon our many years of experience in providing software quality solutions for more than 400 companies.
You will learn the answer to the following questions:
• What is Big Data and what does it mean to me?
• What are the business reasons for building a Data Warehouse and for using Business Intelligence software?
• How do Data Warehouses, Business Intelligence tools and ETL work from a technical perspective?
• Who are the primary players in this software space?
• How do I test these environments?
• What tools should I use?
This slide deck is geared towards:
QA Testers
Data Architects
Business Analysts
ETL Developers
Operations Teams
Project Managers
...and anyone else who (a) is new to the EDW space, (b) wants to be educated on the business and technical sides, and (c) wants to understand how to test these environments.
These slides demonstrate ETL Testing; anyone who wants to start learning ETL Testing can make use of this deck. It includes content covering the complete ETL Testing schema.
Testing data warehouse applications by Kirti Bhushan
This document outlines a data warehouse testing strategy. It begins with an introduction that defines a data warehouse and discusses the need for data warehouse testing and challenges it presents. It then describes the testing model, including phases for project definition, test design, development, execution and acceptance. Next, it covers the goals of data warehouse testing like data completeness, transformation, quality and various types of non-functional testing. Finally, it discusses roles, artifacts, tools and references related to data warehouse testing.
Building an Effective Data Warehouse Architecture by James Serra
Why use a data warehouse? What is the best methodology to use when creating a data warehouse? Should I use a normalized or dimensional approach? What is the difference between the Kimball and Inmon methodologies? Does the new Tabular model in SQL Server 2012 change things? What is the difference between a data warehouse and a data mart? Is there hardware that is optimized for a data warehouse? What if I have a ton of data? During this session James will help you to answer these questions.
The document introduces data engineering and provides an overview of the topic. It discusses (1) what data engineering is, how it has evolved with big data, and the required skills, (2) the roles of data engineers, data scientists, and data analysts in working with big data, and (3) the structure and schedule of an upcoming meetup on data engineering that will use an agile approach over monthly sprints.
The document provides an overview of key concepts in data warehousing and business intelligence, including:
1) It defines data warehousing concepts such as the characteristics of a data warehouse (subject-oriented, integrated, time-variant, non-volatile), grain/granularity, and the differences between OLTP and data warehouse systems.
2) It discusses the evolution of business intelligence and key components of a data warehouse such as the source systems, staging area, presentation area, and access tools.
3) It covers dimensional modeling concepts like star schemas, snowflake schemas, and slowly and rapidly changing dimensions.
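To make the dimensional-modeling terms above concrete, here is a minimal sketch of a star schema with one dimension handled as a Type 2 slowly changing dimension. It is not taken from the deck; the table and column names are illustrative assumptions, and Python's built-in sqlite3 stands in for a real warehouse.

```python
import sqlite3

# In-memory database purely for illustration; any relational store would do.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension table: descriptive attributes, one surrogate key per row.
cur.execute("""
    CREATE TABLE dim_customer (
        customer_key INTEGER PRIMARY KEY,   -- surrogate key
        customer_id  TEXT,                  -- natural/business key
        city         TEXT,
        valid_from   TEXT,                  -- SCD Type 2 effective date
        valid_to     TEXT,                  -- NULL while the row is current
        is_current   INTEGER
    )
""")

# Fact table: measures at a chosen grain, foreign keys to the dimensions.
cur.execute("""
    CREATE TABLE fact_sales (
        customer_key INTEGER REFERENCES dim_customer(customer_key),
        order_date   TEXT,
        amount       REAL
    )
""")

# SCD Type 2: when an attribute changes, close the old row and insert a new one
# instead of overwriting, so history is preserved.
cur.execute("INSERT INTO dim_customer VALUES (1, 'C001', 'Boston', '2020-01-01', NULL, 1)")
cur.execute("""
    UPDATE dim_customer
    SET valid_to = '2021-06-30', is_current = 0
    WHERE customer_id = 'C001' AND is_current = 1
""")
cur.execute("INSERT INTO dim_customer VALUES (2, 'C001', 'Denver', '2021-07-01', NULL, 1)")
conn.commit()
```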
As part of this session, I will give an introduction to Data Engineering and Big Data. It covers up-to-date trends.
* Introduction to Data Engineering
* Role of Big Data in Data Engineering
* Key Skills related to Data Engineering
* Overview of Data Engineering Certifications
* Free Content and ITVersity Paid Resources
Don't worry if you miss the live session - you can use the link below to watch the video after the scheduled time.
https://youtu.be/dj565kgP1Ss
* Upcoming Live Session - Overview of Big Data Certifications (Spark Based) - https://www.meetup.com/itversityin/events/271739702/
Relevant Playlists:
* Apache Spark using Python for Certifications - https://www.youtube.com/playlist?list=PLf0swTFhTI8rMmW7GZv1-z4iu_-TAv3bi
* Free Data Engineering Bootcamp - https://www.youtube.com/playlist?list=PLf0swTFhTI8pBe2Vr2neQV7shh9Rus8rl
* Join our Meetup group - https://www.meetup.com/itversityin/
* Enroll for our labs - https://labs.itversity.com/plans
* Subscribe to our YouTube Channel for Videos - http://youtube.com/itversityin/?sub_confirmation=1
* Access Content via our GitHub - https://github.com/dgadiraju/itversity-books
* Lab and Content Support using Slack
Data Warehouse - Incremental Migration to the Cloud by Michael Rainey
A data warehouse (DW) migration is no small undertaking, especially when moving from on-premises to the cloud. A typical data warehouse has numerous data sources connecting and loading data into the DW, ETL tools and data integration scripts performing transformations, and reporting, advanced analytics, or ad-hoc query tools accessing the data for insights and analysis. That’s a lot to coordinate and the data warehouse cannot be migrated all at once. Using a data replication technology such as Oracle GoldenGate, the data warehouse migration can be performed incrementally by keeping the data in-sync between the original DW and the new, cloud DW. This session will dive into the steps necessary for this incremental migration approach and walk through a customer use case scenario, leaving attendees with an understanding of how to perform a data warehouse migration to the cloud.
Presented at RMOUG Training Days 2019
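The key idea in the summary above, keeping the original and cloud DW in sync while migrating table by table, implies a recurring reconciliation step. Below is a minimal, hypothetical sketch of such a check in Python; the connections and table list are placeholders (in-memory SQLite stand-ins), not GoldenGate or any vendor API.

```python
import sqlite3

def reconcile(source_conn, target_conn, tables):
    """Compare row counts per table between the legacy DW and the cloud DW."""
    mismatches = {}
    for table in tables:
        src = source_conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        tgt = target_conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        if src != tgt:
            mismatches[table] = (src, tgt)
    return mismatches

# Stand-in databases; in practice these would be connections to the two warehouses.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for db in (source, target):
    db.execute("CREATE TABLE fact_orders (id INTEGER, amount REAL)")
source.execute("INSERT INTO fact_orders VALUES (1, 10.0)")

print(reconcile(source, target, ["fact_orders"]))  # {'fact_orders': (1, 0)}
```

Running a check like this after each incremental cutover gives an objective signal that the two warehouses are still in sync before the next set of tables is moved.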
Microsoft Azure Data Factory Hands-On Lab Overview Slides by Mark Kromer
This document outlines modules for a lab on moving data to Azure using Azure Data Factory. The modules will deploy necessary Azure resources, lift and shift an existing SSIS package to Azure, rebuild ETL processes in ADF, enhance data with cloud services, transform and merge data with ADF and HDInsight, load data into a data warehouse with ADF, schedule ADF pipelines, monitor ADF, and verify loaded data. Technologies used include PowerShell, Azure SQL, Blob Storage, Data Factory, SQL DW, Logic Apps, HDInsight, and Office 365.
This document provides an overview of current ETL techniques from a big data perspective. It discusses the evolution of ETL from traditional batch-based techniques to near real-time and real-time approaches. However, existing real-time ETL approaches are inadequate to address the volume, velocity, and variety characteristics of data streams. The document also surveys available ETL tools and techniques for handling data streams, and concludes that the ETL process needs to be redefined to better address issues in processing dynamic data streams.
Informatica to ODI Migration – What, Why and How | Informatica to Oracle Dat... by Jade Global
Learn about the First and Only Automated Solution for Informatica to Oracle Data Integrator (ODI) conversion
Do you want to know:
“What” is Informatica vs ODI?
“Why” do you need to move to ODI?
“How” is the migration from Informatica to ODI possible?
Learn how you can achieve up to 90% automated conversion, up to 90% reduced implementation time, up to 50% cost savings and up to 5X productivity gain.
To know more, please visit: http://informaticatoodi.jadeglobal.com/
Azure data analytics platform - A reference architecture by Rajesh Kumar
This document provides an overview of Azure data analytics architecture using the Lambda architecture pattern. It covers Azure data and services, including ingestion, storage, processing, analysis and interaction services. It provides a brief overview of the Lambda architecture including the batch layer for pre-computed views, speed layer for real-time views, and serving layer. It also discusses Azure data distribution, SQL Data Warehouse architecture and design best practices, and data modeling guidance.
Most organisations think that they have poor data quality, but don’t know how to measure it or what to do about it. Teams of data scientists, analysts, and ETL developers are either blindly taking a “garbage in -> garbage out” approach, or worse still, “cleansing” data to fit their limited perspectives. DataOps is a systematic approach to measuring data and for planning mitigations for bad data.
Oracle provides a comprehensive cloud infrastructure platform with compute, storage, networking and database services. Key features include fast NVMe SSD storage both locally and network attached, high performance bare metal and VM instances with GPU and AMD EPYC options, autonomous database services, and advanced networking capabilities like low latency and RDMA. Oracle's regional architecture and dedicated fast interconnects enable high availability across availability domains and regions.
What is BI Testing and The Importance of BI Report Testing by Torana, Inc.
Business intelligence (BI) report testing helps validate the accuracy of BI reports, dashboards, and the underlying data and metadata. It is important because inaccurate reports can mislead business decisions, damage credibility, and potentially cause legal issues. BI reports are the output of a long data pipeline, so defects may occur anywhere from raw data collection to report generation. Effective BI testing requires evaluating the data processing, storage, reports, and dashboards to catch errors throughout the entire reporting system. Automated testing tools can programmatically compare reports and data across systems and over time to help users efficiently test BI outputs.
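As a concrete illustration of the "output of a long pipeline" point, the sketch below (an assumption, not from the deck) recomputes a report's aggregates directly from the detail table and compares them with the figures the report layer stored:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_detail (region TEXT, amount REAL);
    CREATE TABLE report_summary (region TEXT, total REAL);
    INSERT INTO sales_detail VALUES ('EMEA', 100.0), ('EMEA', 50.0), ('APAC', 75.0);
    INSERT INTO report_summary VALUES ('EMEA', 150.0), ('APAC', 70.0);  -- APAC figure is wrong
""")

# Recompute the report figures from the raw data and flag any disagreement.
rows = conn.execute("""
    SELECT r.region, r.total, COALESCE(SUM(d.amount), 0) AS recomputed
    FROM report_summary r
    LEFT JOIN sales_detail d ON d.region = r.region
    GROUP BY r.region, r.total
""").fetchall()

for region, reported, recomputed in rows:
    status = "OK" if abs(reported - recomputed) < 1e-9 else "MISMATCH"
    print(f"{region}: reported={reported} recomputed={recomputed} -> {status}")
```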
Microsoft Azure BI Solutions in the Cloud by Mark Kromer
This document provides an overview of several Microsoft Azure cloud data and analytics services:
- Azure Data Factory is a data integration service that can move and transform data between cloud and on-premises data stores as part of scheduled or event-driven workflows.
- Azure SQL Data Warehouse is a cloud data warehouse that provides elastic scaling for large BI and analytics workloads. It can scale compute resources on demand.
- Azure Machine Learning enables building, training, and deploying machine learning models and creating APIs for predictive analytics.
- Power BI provides interactive reports, visualizations, and dashboards that can combine multiple datasets and be embedded in applications.
What is ETL testing & how to enforce it in a Data Warehouse by BugRaptors
BugRaptors always stays up to date with the latest technologies and ongoing trends in testing. Techniques like ETL testing bring significant changes and broaden the scope of testing by keeping in mind both positive and negative scenarios.
Watch full webinar here: https://bit.ly/2N1Ndz9
How is a logical data fabric different from a physical data fabric? What are the advantages of one type of fabric over the other? Attend this session to firm up your understanding of a logical data fabric.
The document provides an overview of the Databricks platform, which offers a unified environment for data engineering, analytics, and AI. It describes how Databricks addresses the complexity of managing data across siloed systems by providing a single "data lakehouse" platform where all data and analytics workloads can be run. Key features highlighted include Delta Lake for ACID transactions on data lakes, auto loader for streaming data ingestion, notebooks for interactive coding, and governance tools to securely share and catalog data and models.
Introduction to QuerySurge Webinar
Wednesday, April 29th 2020 @11am ET
Eric Smyth, Director of Alliances
Bill Hayduk, CEO
Matt Moss, Product Manager
This is the slide deck for our webinar. Learn how QuerySurge automates the data validation and testing of Big Data, Data Warehouses, Business Intelligence Reports and Enterprise Applications with full DevOps functionality for continuous testing.
---------------------------------------------------------------------------------
Objective
During this webinar, we demonstrate how QuerySurge solves the following challenges:
- Your need for data quality at speed
- How to automate your ETL testing process
- Your ability to test across your different data platforms
- How to integrate ETL testing into your DataOps pipeline
- How to analyze your data and pinpoint anomalies quickly
-------------------------------------------------------------------------------------
Who should view this?
- ETL Developers /Testers
- Data Architects / Analysts
- DBAs
- BI Developers / Analysts
- IT Architects
- Managers of Data, BI & Analytics groups: CTOs, Directors, Vice Presidents, Project Leads
And anyone else with an interest in the Data & Analytics space who is interested in an automation solution for data validation & testing while improving data quality.
This document provides an overview of using Azure Data Factory (ADF) for ETL workflows. It discusses the components of modern data engineering, how to design ETL processes in Azure, and gives an overview of ADF and its components. It also previews a demo on creating an ADF pipeline to copy data into Azure Synapse Analytics. The agenda includes data ingestion techniques in ADF and ADF components such as linked services, datasets, pipelines and triggers. It concludes with references, a Q&A section and a request for feedback.
DAS Slides: Data Architect vs. Data Engineer vs. Data Modeler by DATAVERSITY
This document discusses the roles of data architect, data engineer, and data modeler. A data architect requires comprehensive experience and must work with both technical and business teams. Data engineers specialize in big data solutions using technologies like data lakes and warehouses. Data modelers translate business rules into data models and designs. Hiring good data modelers is important for projects.
Azure Data Factory Mapping Data Flow allows users to stage and transform data in Azure during a limited preview period beginning in February 2019. Data can be staged from Azure Data Lake Storage, Blob Storage, or SQL databases/data warehouses, then transformed using visual data flows before being landed to staging areas in Azure like ADLS, Blob Storage, or SQL databases. For information, contact adfdataflowext@microsoft.com or visit http://aka.ms/dataflowpreview.
This presentation explains the basics of the ETL (Extract-Transform-Load) concept in relation to data solutions such as data warehousing, data migration, and data integration. CloverETL is presented in detail as an example of an enterprise ETL tool. It also covers the typical phases of data integration projects.
The document discusses options for moving Oracle E-Business Suite (EBS) workloads to Oracle Cloud. It addresses customer concerns about ongoing support for EBS and outlines business drivers for cloud adoption like reducing costs and improving insights. The document presents three paths to the cloud: 1) re-platforming EBS on Oracle Cloud Platform by lifting and shifting workloads, 2) extending on-premises EBS with additive SaaS applications, and 3) shifting specific EBS environments like development, testing, reporting or disaster recovery to the cloud. Oracle Cloud is positioned as providing benefits like centralized management, rapid provisioning and integration with Oracle infrastructure services.
Data Lakehouse, Data Mesh, and Data Fabric (r1) by James Serra
So many buzzwords of late: Data Lakehouse, Data Mesh, and Data Fabric. What do all these terms mean and how do they compare to a data warehouse? In this session I’ll cover all of them in detail and compare the pros and cons of each. I’ll include use cases so you can see what approach will work best for your big data needs.
This document discusses challenges and opportunities in automating testing for data warehouses and BI systems. It notes that while BI projects have adopted agile methodologies, testing has not. Large and diverse data volumes make it difficult to cover a nearly infinite space of test cases. It proposes a testing lifecycle and V-model for BI systems. Automating complex functional tests, SQL validation, reconciliation, and test data generation can help address these challenges by shortening regression cycles and enabling continuous testing. Various automation tools are discussed, including how they can validate ETL processes and reporting integrity. Automation can help complete testing and ensure data quality, compliance, and performance.
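One of the automatable checks that summary mentions, SQL validation of an ETL transformation, can be sketched as follows. The table names and the transformation rule (uppercasing a country code) are invented for illustration; the pattern is simply "re-apply the expected rule to the source and diff against the target":

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customers (id INTEGER, country TEXT);
    CREATE TABLE dw_customers  (id INTEGER, country_code TEXT);
    INSERT INTO src_customers VALUES (1, 'us'), (2, 'de');
    INSERT INTO dw_customers  VALUES (1, 'US'), (2, 'de');   -- second row was not transformed
""")

# Re-apply the expected transformation to the source and diff against the target.
bad_rows = conn.execute("""
    SELECT s.id, UPPER(s.country) AS expected, t.country_code AS actual
    FROM src_customers s
    JOIN dw_customers t ON t.id = s.id
    WHERE UPPER(s.country) <> t.country_code
""").fetchall()

for row in bad_rows:
    print("Transformation mismatch:", row)   # -> (2, 'DE', 'de')
```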
Preparing a data migration plan: A practical guide by ETLSolutions
The document provides guidance on preparing a data migration plan. It discusses the importance of project scoping, methodology, data preparation, and data security when planning a data migration. Specifically, it recommends thoroughly reviewing all aspects of the project and data in the planning stages to identify risks and issues early. This helps reduce risks and ensures the migration is completed according to best practices.
A mechanism to leverage the strengths of an existing system while attempting to improve accessibility or while considering an application's redesign.
QuerySurge - the automated Data Testing solution by RTTS
The document discusses QuerySurge, an automated data testing solution that helps verify data quality and find errors. It notes that traditional data quality tools focus on profiling, cleansing and monitoring data, while QuerySurge also enables data testing through easy-to-use query wizards and comparison of source and target data without SQL coding. QuerySurge allows collaborative testing across teams and platforms, integrates with development tools, and can significantly reduce testing time and improve data quality.
The document discusses applying an agile methodology to a data migration project from Basecamp to SharePoint. It describes agile as an iterative approach involving collaboration between cross-functional teams in sprint sessions to deliver functionality. An agile data migration would involve these teams working in sprints to map data from the old to new systems and transfer it over incrementally. The document outlines the various stages and roles needed in an agile data migration project, including planning, analysis, development, testing, deployment and closing stages.
This document summarizes WhereScape as the pioneer in data warehouse automation software. It discusses WhereScape's background, customers in various industries and regions, and value proposition of providing an integrated development environment that manages the entire data warehouse lifecycle in an automated, simplified, and faster manner compared to traditional approaches. The document also outlines the challenges of managing an EDW/BI environment with multiple tools and skills, and how WhereScape addresses this with a single tool, skillset, and lower cost of change.
This document discusses best practices for migrating to open source software. It covers important decision points to consider like licenses, project health and management, training needs, deployment, security, change and license management, community involvement, and managing the migration process.
IBM InfoSphere MDM v10.1 includes several new features:
1) Advanced business rules capabilities through integration with IBM Operational Decision Manager (ODM), allowing rules to be managed through a single interface.
2) Advanced catalog management features for WebSphere Commerce, improving eCommerce operations through a tailored data model and integration framework.
3) Enhancements to the collaborative edition, including improved user interfaces, in-line editing, and new capabilities for managing business rules.
4) Reference data management hub for centralized governance of reference data through role-based access, versioning, and lifecycle management.
5) Master data governance tools including policy administration, monitoring of data quality
Presentation on the final release of Magento 2, given at the 12th Magento Stammtisch Kiel meetup on 09.12.2015. The focus is on the most important technical innovations and changes in the system.
The Chief Technology Officer oversees the technology organization which includes the Chief of Operations, Chief Learning Officer, and Chief of Data. They lead various project managers, technology committees, and specialists in areas like instructional media, assessment, networking, and business technology who support teachers, administrators, students, and the community.
Data Migration In An Agile Open Source World by Craig Smith
This document discusses data migration in an agile open source world. It provides an overview of data migration, including definitions and common patterns like extract-transform-load (ETL). It also discusses agile principles and practices like Scrum and extreme programming (XP). Open source tools are presented for continuous integration, testing, ETL/ESB and more. The document aims to help organizations perform data migration in an agile and open source manner.
TeraStream - Data Integration/Migration/ETL/Batch Tool by DataStreams
TeraStream™ leads the Korean data migration and ETL market. Take a look at the powerful performance, features and user conveniences of TeraStream™.
IBM's InfoSphere Master Data Management v11 features a unified MDM solution that supports virtual, physical and hybrid implementation styles within a single instance. It provides enhanced governance capabilities, improved support for reference data management and advanced hierarchies. The release also aims to accelerate time to value through simplifying upgrades, pre-built accelerators and modularity. Additionally, v11 further integrates MDM with big data and analytics capabilities, allowing the augmentation of master data with insights from unstructured sources.
The document discusses the modern data warehouse and the key trends driving changes from traditional data warehouses. It describes how modern data warehouses incorporate Hadoop, traditional data warehouses, and other data stores from multiple locations including cloud, mobile, sensors and IoT. Modern data warehouses use a massively parallel processing (MPP) architecture for distributed computing and scale-out. The Hadoop ecosystem, including components like HDFS, YARN, Hive, Spark and Zookeeper, provides functionality for storage, processing, and analytics. Major vendors like Oracle provide technical innovations on Hadoop for data discovery, exploration, transformation, and sharing capabilities. The document concludes with an overview of descriptive, predictive and prescriptive analytics capabilities in a big data value assessment.
A First Look at San Francisco’s New ETL Job Platform by Safe Software
One of the strategies to achieve the City and County of San Francisco’s goal of increasing the number and timeliness of datasets on the city’s official open data portal (SF OpenData) is to “develop our program to automate the publication of data”. Toward that end, the team’s technical staff have designed and deployed an ETL job platform which prominently features FME technology. This talk will highlight San Francisco’s historic use of FME, the impetus for improving its ETL job platform, the design and architecture of this new platform, and some thoughts about the platform’s future. This discussion will be of most interest to those attendees whose organizations are considering whether to undertake an enterprise-level effort to automate the publication of its data to an open data portal.
Testing and Migration
1. Legacy systems often lack tests, but writing tests enables safe evolution by allowing incremental changes and constant feedback.
2. Migration is a restructuring that changes a system's infrastructure, and big-bang migrations often fail due to user resistance to change.
3. Incremental migration with a bridge between old and new systems preserves familiarity while building confidence in the new system through prototyping and testing after every small change.
Database migration is the process of transferring data between different database systems or upgrades. It involves analyzing and mapping data from the source to the target system, transforming the data, validating data quality, and maintaining the migrated data. For example, Capital One migrated from Oracle to Teradata databases as their data volume grew too large for Oracle to efficiently handle. The migration process includes pre-migration planning, extraction, transformation, data loading, validation, and post-migration maintenance.
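For the validation step described above, one common lightweight approach, sketched here with made-up table and column names, is to compare simple column profiles between the source and target databases rather than every row:

```python
import sqlite3

PROFILE_SQL = "SELECT COUNT(*), SUM(amount), MIN(amount), MAX(amount) FROM payments"

def column_profile(conn):
    """Cheap fingerprint of a migrated table: row count plus numeric aggregates."""
    return conn.execute(PROFILE_SQL).fetchone()

# In-memory stand-ins for the pre- and post-migration databases.
old_db, new_db = sqlite3.connect(":memory:"), sqlite3.connect(":memory:")
for db in (old_db, new_db):
    db.execute("CREATE TABLE payments (id INTEGER, amount REAL)")
    db.executemany("INSERT INTO payments VALUES (?, ?)", [(1, 9.5), (2, 20.0)])

assert column_profile(old_db) == column_profile(new_db), "post-migration validation failed"
print("Profiles match:", column_profile(new_db))
```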
Part 4 - Data Warehousing Lecture at BW Cooperative State University (DHBW) by Andreas Buckenhofer
Part 4(4)
The slides contain a DWH lecture given for students in 5th semester. Content:
- Introduction DWH and Business Intelligence
- DWH architecture
- DWH project phases
- Logical DWH Data Model
- Multidimensional data modeling
- Data import strategies / data integration / ETL
- Frontend: Reporting and analysis, information design
- OLAP
The document discusses how organizations can leverage automated testing using tools like Informatica to validate data quality in the ETL process. It provides the following key points:
1) Manual ETL testing is time-consuming and error-prone, while automated testing using tools like Informatica can significantly reduce time spent on testing and increase accuracy.
2) Automated testing provides a sustainable long-term framework for continuous data quality testing and reduces data delivery timelines.
3) The document demonstrates how Informatica was used to automate an organization's testing process, reducing hours spent on testing while improving coverage and accuracy of data validation.
Deliver Trusted Data by Leveraging ETL Testing by Cognizant
We explore how extract, transform and load (ETL) testing with SQL scripting is crucial to data validation and show how to test data on a large scale in a streamlined manner with an Informatica ETL testing tool.
The document discusses tips for designing test data before executing test cases. It recommends creating fresh test data specific to each test case rather than relying on outdated standard data. It also suggests keeping personal copies of test data to avoid corruption when multiple testers access shared data. The document provides examples of how to prepare large data sets needed for performance testing.
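The "prepare large data sets" tip lends itself to simple scripting. A minimal sketch follows; the schema is hypothetical and not from the document, and the point is simply that fresh, test-case-specific data can be generated on demand rather than copied from stale shared data:

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)")

# Generate fresh, reproducible data for a performance test run.
random.seed(42)  # same seed -> same data set on every run
rows = ((i, random.randint(1, 10_000), round(random.uniform(1, 500), 2))
        for i in range(1_000_000))
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])  # 1000000
```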
Completing the Data Equation: Test Data + Data Validation = Success by RTTS
Completing the Data Equation
In this presentation, we tackle 2 major challenges to assuring your data quality:
1) Test Data Generation
2) Data Validation
We illustrate how GenRocket and QuerySurge, used in conjunction, can solve these challenges. Also see how they can be easily integrated into your Continuous Integration/Continuous Delivery pipeline.
Session Overview
- Primary challenges organizations are facing with their data projects
- Key success factors for data validation & testing
- How to setup a workflow around test data generation and data validation using GenRocket & QuerySurge
- How to automate this workflow in your CI/CD DataOps pipeline
To see the video, go to https://www.youtube.com/embed/Zy25i74l-qo?autoplay=1&showinfo=0
Data Quality Integration (ETL) Open Source by Stratebi
Data quality is the process of ensuring data values conform to business requirements. It is important for business intelligence projects which involve data integration from multiple sources. Pentaho Data Integration and DataCleaner are open source tools that can be used together for data integration and quality tasks like extraction, transformation, loading, cleansing and profiling. Performing data quality as part of the ETL process through tools like these helps standardize processes and improve scalability.
Leveraging HPE ALM & QuerySurge to test HPE Vertica by RTTS
Are you using HPE ALM or Quality Center (QC) for your requirements gathering and test management?
RTTS, an alliance partner of HPE and a member of HPE’s Big Data community, can show you how to use ALM/QC and RTTS’ QuerySurge to effectively manage your data validation & testing of Vertica (or any data warehouse).
In this webinar video you will see:
- a custom view of ALM to store source-to-target mappings
- data validation tests in QuerySurge
- the execution of QuerySurge tests from ALM
- the results of data validation tests stored in ALM
- custom ALM reports that show data validation coverage of Vertica
- how we improve your data quality while reducing your costs & risks
Presented by:
Bill Hayduk, Founder & CEO of RTTS, the developers of QuerySurge
Chris Thompson, Senior Domain Expert, Big Data testing
To learn more about QuerySurge, visit www.QuerySurge.com
This document provides an overview of data quality and the fundamentals of ensuring data quality in an organization. It discusses the importance of data quality and outlines the key steps in the data quality pipeline including extract, clean, conform, and deliver. It also covers determining the system of record, cleaning data from multiple sources, prioritizing data quality goals, different types of data quality enforcement, and tracking and monitoring data quality failures. The document emphasizes that achieving high quality data requires planning, well-defined processes, and continuous monitoring.
The document discusses testing for a data warehouse. It describes requirements testing to validate requirements, unit testing of ETL procedures and mappings, and integration testing of ETL job sequences and initial data loading. Integration testing also covers end-to-end scenarios like count validation, source isolation, and data quality checks. Report data is validated by verifying it against source data. User acceptance testing tests the full system functionality. Continuous testing is needed as data warehouse schema and data evolve over time.
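Two of the end-to-end checks listed there, duplicate detection and mandatory-field validation, reduce to short queries. Here is a hedged sketch with invented table and column names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dw_customer (customer_id TEXT, email TEXT);
    INSERT INTO dw_customer VALUES ('C1', 'a@x.com'), ('C1', 'a@x.com'), ('C2', NULL);
""")

# Duplicate records on the business key.
dupes = conn.execute("""
    SELECT customer_id, COUNT(*) FROM dw_customer
    GROUP BY customer_id HAVING COUNT(*) > 1
""").fetchall()

# Completeness: mandatory columns must not be NULL.
missing = conn.execute(
    "SELECT COUNT(*) FROM dw_customer WHERE email IS NULL").fetchone()[0]

print("duplicate keys:", dupes)        # [('C1', 2)]
print("rows missing email:", missing)  # 1
```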
This document summarizes a presentation about managing enterprise data quality using SAP Information Steward. It discusses:
1) How data quality challenges can arise within a business intelligence information pipeline as data moves between systems.
2) The role of Information Steward in providing visibility into data quality issues across systems and addressing those issues.
3) Best practices for implementing a data quality tool, such as defining roles and responsibilities, and using the tool to monitor quality and detect issues.
The document provides an overview of DataOps and continuous integration/continuous delivery (CI/CD) practices for data management. It discusses:
- DevOps principles like automation, collaboration and agility can be applied to data management through a DataOps approach.
- CI/CD practices allow for data products and analytics to be developed, tested and released continuously through an automated pipeline. This includes orchestration of the data pipeline, testing, and monitoring.
- Adopting a DataOps approach with CI/CD enables faster delivery of data and analytics, more efficient and compliant data pipelines, improved productivity, and better business outcomes through data-driven decisions.
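To make the "testing as part of an automated pipeline" idea concrete, here is a small, hypothetical pytest module that a CI job could run after each data pipeline deployment; the connection, schema and thresholds are placeholders, not part of the original document:

```python
# test_dataops_checks.py - run by the CI stage, e.g. `pytest -q`
import sqlite3
import pytest

@pytest.fixture
def warehouse():
    # Placeholder connection; a real pipeline would point at the target warehouse.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE fact_clicks (event_date TEXT, clicks INTEGER);
        INSERT INTO fact_clicks VALUES ('2024-01-01', 120), ('2024-01-02', 95);
    """)
    yield conn
    conn.close()

def test_no_negative_measures(warehouse):
    bad = warehouse.execute(
        "SELECT COUNT(*) FROM fact_clicks WHERE clicks < 0").fetchone()[0]
    assert bad == 0

def test_table_is_not_empty(warehouse):
    assert warehouse.execute("SELECT COUNT(*) FROM fact_clicks").fetchone()[0] > 0
```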
How to Automate your Enterprise Application / ERP Testing by RTTS
This document discusses automating enterprise application and data warehouse testing using QuerySurge. It begins with an introduction to QuerySurge and its modules for automating data interface testing. These modules allow testing across different data sources with no coding required. The document then covers data maturity models and how QuerySurge can help improve testing processes. It demonstrates how QuerySurge can automate testing to gain full coverage while decreasing testing time. In conclusion, it discusses how QuerySurge provides value through increased testing efficiency and data quality.
The document discusses ETL (extract, transform, load) which is a process used to clean and prepare data from various sources for analysis in a data warehouse. It describes how ETL extracts data from different source systems, transforms it into a uniform format, and loads it into a data warehouse. It also provides examples of ETL tools, the purpose of ETL testing including testing for data accuracy and integrity, and SQL queries commonly used for ETL testing.
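A typical example of the "SQL queries commonly used for ETL testing" mentioned above is a set-difference check between source and target. The sketch below uses illustrative table names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_products (sku TEXT, price REAL);
    CREATE TABLE dw_products  (sku TEXT, price REAL);
    INSERT INTO stg_products VALUES ('A1', 10.0), ('B2', 5.0);
    INSERT INTO dw_products  VALUES ('A1', 10.0);
""")

# Rows present in the source but missing (or different) in the target.
missing_in_target = conn.execute("""
    SELECT sku, price FROM stg_products
    EXCEPT
    SELECT sku, price FROM dw_products
""").fetchall()

print(missing_in_target)  # [('B2', 5.0)]
```

Running the same query with source and target swapped catches extra or orphaned rows on the target side.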
Leveraging Automated Data Validation to Reduce Software Development Timeline... by Cognizant
Our enterprise solution for automating data validation - called dataTestPro - facilitates quality assurance (QA) by managing heterogeneous data testing, improving test scheduling, increasing data testing speed and drastically reducing data-validation errors.
The document discusses testing processes for data warehouses, including requirements testing, unit testing, integration testing, and user acceptance testing. It describes validating that requirements are complete and testable. Unit testing checks ETL procedures and mappings. Integration testing verifies initial and incremental loads as well as error handling. Integration testing scenarios include count validation, source isolation, and data quality checks. User acceptance testing tests full functionality for production use.
Testing in the New World of Off-the-Shelf Software by Josiah Renaudin
Testing an off-the-shelf, sometimes called COTS, system? Often, project managers and stakeholders mistakenly believe that one benefit of purchasing software is that there is little, if any, testing required. This could not be further from the truth. Testing COTS software requires a different focus from traditional testing approaches. Although no software package will be delivered free of bugs, the testing focus from the purchasing organization’s perspective is not on validating the base functionality. Gerie Owen and Peter Varhol share a framework for testing COTS packages and discuss in detail each of the major focus areas―customizations and configurations, integration, data, and performance. Discover how to work with business processes and integration maps to design an effective test strategy. Whether you are testing a small COTS package or a large enterprise COTS application, join Gerie and Peter to learn how to focus your testing effectively and develop a new test skill set.
The document discusses the importance of data integration and some signs that an organization has poor data integration. It notes that data is distributed across disparate systems and integrating data brings value by combining related information. Poor integration can result in incomplete or inconsistent data, inability to get a single view of the truth, and high maintenance costs. The document advocates providing integrated solutions to avoid these issues.
Query Wizards - data testing made easy - no programming by RTTS
Fast and easy. No Programming needed. The latest QuerySurge release introduces the new Query Wizards. The Wizards allow both novice and experienced team members to validate their organization's data quickly with no SQL programming required.
The Wizards provide an immediate ROI through their ease-of-use and ensure that minimal time and effort are required for developing tests and obtaining results. Even novice testers are productive as soon as they start using the Wizards!
According to a recent survey of Data Architects and other data experts on LinkedIn, approximately 80% of columns in a data warehouse have no transformations, meaning the Wizards can test all of these columns quickly & easily. (Columns with transformations can be tested using the QuerySurge Design Library with custom SQL coding.)
There are 3 Types of automated Data Comparisons:
- Column-Level Comparison
- Table-Level Comparison
- Row Count Comparison
There are also automated features for filtering (‘Where’ clause) and sorting (‘Order By’ clause).
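To illustrate what these comparison types reduce to, here is a generic sketch, not QuerySurge's implementation; the table and column names are invented, and the filtering and ordering correspond to the 'Where' and 'Order By' features just mentioned:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (id INTEGER, city TEXT);
    CREATE TABLE tgt (id INTEGER, city TEXT);
    INSERT INTO src VALUES (1, 'Oslo'), (2, 'Lima');
    INSERT INTO tgt VALUES (1, 'Oslo'), (2, 'Rome');
""")

# Row-count comparison.
src_n = conn.execute("SELECT COUNT(*) FROM src").fetchone()[0]
tgt_n = conn.execute("SELECT COUNT(*) FROM tgt").fetchone()[0]
print("row counts equal:", src_n == tgt_n)

# Column-level comparison with a filter ('WHERE') and deterministic ordering ('ORDER BY').
src_rows = conn.execute("SELECT id, city FROM src WHERE id > 0 ORDER BY id").fetchall()
tgt_rows = conn.execute("SELECT id, city FROM tgt WHERE id > 0 ORDER BY id").fetchall()
print("column values equal:", src_rows == tgt_rows)  # False: Lima vs Rome
```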
The Wizards provide both novices and non-technical team members with a fast & easy way to be productive immediately and speed up testing for team members skilled in SQL.
Trial our software either as a download or in the cloud at www.QuerySurge.com. The trial comes with a built-in tutorial and sample data.
Curiosity and Lemontree present - Data Breaks DevOps: Why you need automated ... by Curiosity Software Ireland
This webinar was co-hosted by Curiosity and Lemontree on April 22nd, 2021. Watch the webinar on demand - https://opentestingplatform.curiositysoftware.ie/data-breaks-devops-webinar
DevOps and continuous delivery are only as fast as their slowest part. For many organisations, testing remains the major sticking point. It’s viewed as a necessary bottleneck, at fault for delaying releases, yet still unable to catch bugs before they hit production. One persistent, yet often overlooked, barrier is commonly at fault: test data. Data is the place to improve release velocity and quality today.
For many test teams today, test data delays remain their greatest bottleneck. Many still rely on a central team for data provisioning, before spending further time finding and making the data they need for a particular test suite. This siloed “request and receive” approach to data provisioning will always be a game of catch-up. Development is constantly getting faster, releasing systems that require increasingly complex data. Manually finding, securing and copying that data will never be able to keep up.
Delivering quality systems at speed instead requires on demand access to rich and interrelated data. With today’s technologies, that means “allocating” data during CI/CD processes and automated testing, making rich and compliant data available to parallel teams and frameworks automatically.
This webinar will present a pragmatic approach for moving from current test data processes to “just in time” data allocation. Veteran test data innovator, Huw Price, will offer cutting edge techniques for allocating rich test data from a range of sources on-the-fly. This “Test Data Automation” ensures that every test and tester has the data they need, exactly when and where they need it.
Data Warehouse Testing in the Pharmaceutical IndustryRTTS
In the U.S., pharmaceutical firms and medical device manufacturers must meet electronic record-keeping regulations set by the Food and Drug Administration (FDA). The regulation is Title 21 CFR Part 11, commonly known as Part 11.
Part 11 requires regulated firms to implement controls for software and systems involved in processing many forms of data as part of business operations and product development.
Enterprise data warehouses are used by the pharmaceutical and medical device industries for storing data covered by Part 11 (for example, Safety Data and Clinical Study project data). QuerySurge, the only test tool designed specifically for automating the testing of data warehouses and the ETL process, has been effective in testing data warehouses used by Part 11-governed companies. The purpose of QuerySurge is to assure that your warehouse is not populated with bad data.
In industry surveys, bad data has been found in every database and data warehouse studied and is estimated to cost firms on average $8.2 million annually, according to analyst firm Gartner. Most firms test far less than 10% of their data, leaving at risk the rest of the data they are using for critical audits and compliance reporting. QuerySurge can test up to 100% of your data and help assure your organization that this critical information is accurate.
QuerySurge not only helps in eliminating bad data, but is also designed to support Part 11 compliance.
Learn more at www.QuerySurge.com
Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing...RTTS
Testing of Hadoop, NoSQL and Data Warehouses Visually
-----------------------------------------------------------------------------
We just made automated data testing really easy. Automate your Big Data testing visually, with no programming needed.
See how to automate Hadoop, NoSQL and Data Warehouse testing visually, without writing any SQL or HQL. See how QuerySurge, the leading Big Data testing solution, provides novices and non-technical team members with a fast & easy way to be productive immediately while speeding up testing for team members skilled in SQL/HQL.
This webinar is geared towards:
- Big Data & Data Warehouse Architects, ETL Developers
- ETL Testers, Big Data Testers
- Data Analysts
- Operations teams
- Business Intelligence (BI) Architects
- Data Management Officers & Directors
You will learn how to:
• Improve your Data Quality
• Accelerate your data testing cycles
• Reduce your costs & risks
• Realize a huge ROI
Similar to Automate data warehouse etl testing and migration testing the agile way (20)
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Communications Mining Series - Zero to Hero - Session 1DianaGray10
This session provides an introduction to UiPath Communication Mining, its importance and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silos continue to crumble, many organizations still relegate monitoring & observability to the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and I will share these foundational concepts to build on.
What do a Lego brick and the XZ backdoor have in common?Speck&Tech
ABSTRACT: At first glance, a Lego brick and the XZ backdoor might seem to have in common only the fact that both are building blocks, or dependencies, of creative and software projects. In reality, a Lego brick and the XZ backdoor case have much more than that in common.
Join the presentation to dive into a story of interoperability, open standards and formats, and then discuss the important role contributors play in a sustainable open source community.
BIO: An advocate for free software and for standard, open formats. She has been an active member of the Fedora and openSUSE projects and co-founded the LibreItalia Association, where she was involved in several LibreOffice-related events, migrations and training efforts. She previously worked on LibreOffice migrations and training courses for several public administrations and private organizations. Since January 2020 she has worked at SUSE as a Software Release Engineer for Uyuni and SUSE Manager, and when she is not following her passion for computers and for Geeko she cultivates her curiosity about astronomy (which is where her nickname deneb_alpha comes from).
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed to release software to market, along with traditionally slow and manual security checks, has caused gaps in continuous security, an important piece of the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer's life and facilitate a rapid transition from concept to production-ready applications. He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
UiPath Test Automation using UiPath Test Suite series, part 5DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series, part 5. In this session, we will cover CI/CD with DevOps.
Topics covered:
CI/CD within UiPath
End-to-end overview of the CI/CD pipeline with Azure DevOps
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices want to take full advantage of the features available on those devices, but many of those features provide convenience and capability at the expense of security. This best practices guide outlines steps users can take to better protect personal devices and information.
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIVladimir Iglovikov, Ph.D.
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
4. Finding Issues in the QA Stage is Best, but QA is…
Not Agile
- With a waterfall approach it is too late...
Not Automated (Manual & Slow)
- Manual data checks waste time
- No repeatability or consistency
- No way to test millions of rows
- Wrong focus on creating scripts rather than on the business problem
- Cannot reconcile data across systems (e.g. files vs. database)
Not Collaborative, No Feedback
- QA teams work in isolation
- No feedback to developers or business users
Disorganized: No Transparency or Compliance
The result: late discovery of issues, and project failure or high costs.
5. But Why is it so Difficult to Automate ETL Testing?
ETL processes don't have screens
- Conventional QA automation products were designed for screen-based testing
New concepts
- Source Data + Transformation = Target Data
- Quality of an ETL process = Expected Data vs. Actual Data
Most developers come from traditional software development
- New to concepts such as data reconciliation for ETL testing
- Mix-up of QA/QC concepts with data quality
High volume of data (millions of rows)
- Since the source data and target data may sit in two different systems, reconciliation is difficult
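As a rough illustration of the reconciliation concept (Source Data + Transformation = Target Data, with the two sides in different systems), the sketch below compares a flat-file source extract with a target warehouse table. The file name, table name, columns and connection string are assumptions made for the example; no particular tool's implementation is implied.

```python
# Minimal sketch of source-vs-target reconciliation across two systems.
# Assumes the CSV extract contains order_id and net_amount columns;
# file path, table name and connection string are hypothetical.
import pandas as pd
from sqlalchemy import create_engine

source = pd.read_csv("source_extract/orders.csv")          # flat-file source system
target = pd.read_sql("SELECT order_id, net_amount FROM fact_orders",
                     create_engine("postgresql://user:pass@dwh/warehouse"))

# Row-count reconciliation: every source row should land in the target.
print(f"source rows: {len(source)}, target rows: {len(target)}")

# Key-based reconciliation: join on the business key and flag differences.
merged = source.merge(target, on="order_id", how="outer",
                      suffixes=("_src", "_tgt"), indicator=True)
missing = merged[merged["_merge"] != "both"]                # present on one side only
mismatched = merged[(merged["_merge"] == "both") &
                    (merged["net_amount_src"] != merged["net_amount_tgt"])]

print(f"{len(missing)} unmatched rows, {len(mismatched)} value mismatches")
```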
7. iCEDQ has an in-Memory Rules Engine
Validation Rule
- Tests ETL transformations by validating the output data generated by the ETL
Reconciliation Rule
- Tests ETL transformations by reconciling source data vs. target data
8. Data Warehouse/ETL Test Automation
(Diagram: Data Sources → ETL → Data Warehouse, with technical and business validation tests applied to incoming data and technical and business reconciliation tests applied across source and target)

Validation Tests
Technical Validation Rules
• Validate incoming data before processing
• Test for data format, nulls, data types and more
Business Validation Rules
• Business-rule-based validation indicates whether there is a data issue caused by the ETL processes, the source data or wrong requirements
• Example: check whether Net Amount =? Gross Amount - (Taxes + Fees + Commissions)

Reconciliation Tests
Technical Reconciliation Rules
• These rules test a specific ETL process that performs a transformation
• Example: an ETL process calculating end-of-day balances from daily transactions can be tested as Sum of today's transactions =? Today's end-of-day balance - Yesterday's end-of-day balance
Business Reconciliation Rules
• These tests are designed to test the overall system independent of the ETL processes, the source data or the business requirements
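To show what such rules can look like in practice, here is a small hypothetical sketch expressing the Net Amount validation and the end-of-day balance reconciliation as SQL checks run from Python. The table and column names, and the connection string, are invented for illustration; this is not iCEDQ's rules engine, only the underlying idea.

```python
# Hypothetical examples of a business validation rule and a technical
# reconciliation rule, expressed as SQL checks. All names are illustrative.
from sqlalchemy import create_engine, text

engine = create_engine("postgresql://user:pass@dwh/warehouse")

# Business validation rule:
# Net Amount =? Gross Amount - (Taxes + Fees + Commissions)
validation_sql = text("""
    SELECT COUNT(*) FROM fact_sales
    WHERE net_amount <> gross_amount - (taxes + fees + commissions)
""")

# Technical reconciliation rule for an end-of-day balance ETL process:
# Sum of today's transactions =? Today's EOD balance - Yesterday's EOD balance
reconciliation_sql = text("""
    SELECT
      (SELECT COALESCE(SUM(amount), 0) FROM transactions
        WHERE txn_date = CURRENT_DATE)
      -
      ((SELECT balance FROM eod_balances WHERE balance_date = CURRENT_DATE)
       - (SELECT balance FROM eod_balances WHERE balance_date = CURRENT_DATE - 1))
      AS difference
""")

with engine.connect() as conn:
    bad_rows = conn.execute(validation_sql).scalar_one()
    difference = conn.execute(reconciliation_sql).scalar_one()

print(f"Validation rule: {bad_rows} rows violate the net-amount formula")
print(f"Reconciliation rule: difference = {difference} (expected 0)")
```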
9. Data Migration Test Automation
(Diagram: Legacy System → ETL → New System, with an initial reconciliation test and a post-migration reconciliation test)

Initial Migration Testing
First, create the data structures in the target system (e.g. tables, columns); second, copy the initial data from the legacy system to the new database.
• iCEDQ can validate the tables, columns, data types and precision
• Reconcile the legacy vs. target data to make sure both systems start from the same initial state

Post Migration Testing
Once the initial state is populated and tested, the post-migration phase involves feeding the same data, or triggering the same business processes, in both the legacy system and the new system.
• iCEDQ can reconcile the data to make sure that, after running the business processes, both systems generate the same data
• Regardless of the system change, unless a business rule changes, the net output from a business point of view must be the same
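As an illustration of the initial-migration structure check described above, the sketch below compares table structures between a legacy and a new database by diffing their information_schema views. The connection strings, schema name and the choice of pandas are assumptions for the example; it is not iCEDQ's own mechanism.

```python
# Illustrative structure check for initial migration testing.
# Compares column names, data types and numeric precision between the legacy
# and the new database using information_schema (names are hypothetical).
import pandas as pd
from sqlalchemy import create_engine

QUERY = """
SELECT table_name, column_name, data_type, numeric_precision
FROM information_schema.columns
WHERE table_schema = 'public'
ORDER BY table_name, column_name
"""

legacy = pd.read_sql(QUERY, create_engine("postgresql://user:pass@legacy-host/legacy"))
target = pd.read_sql(QUERY, create_engine("postgresql://user:pass@new-host/newdb"))

# Outer-join the two catalogs so columns missing or differing on either side show up.
diff = legacy.merge(target, how="outer", indicator=True,
                    on=["table_name", "column_name", "data_type", "numeric_precision"])
problems = diff[diff["_merge"] != "both"]

if problems.empty:
    print("Structures match.")
else:
    print(problems)  # rows present only in the legacy or only in the new database
```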
10. Production Data Monitoring Automation
(Diagram: batch flow Source → Stage → Data Warehouse → Data Marts → Reports/Extracts, with load processes such as Load Stage Customer/Policy/Claims, Load Dim Customer, Load Daily Claims, Load Month Policy/Claims and Load P&L, monitored from Start to Stop either in series or in parallel)

Monitoring in Series
Embed iCEDQ rules in the batch process.
• If an audit fails, the users are notified and the process can be stopped automatically.

Monitoring in Parallel
The audit rules run in parallel to the batch process.
• If an audit fails, the users are notified but the process is not stopped automatically.
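A minimal sketch of the "monitoring in series" idea follows: an audit step runs inside the batch job and aborts the pipeline when its rule fails. The audit query, connection string and mail relay are hypothetical placeholders, and the pattern is shown generically rather than as iCEDQ's implementation.

```python
# Sketch of an in-series audit: the batch step runs an audit rule and stops
# the pipeline when the rule fails. All names and endpoints are hypothetical.
import sys
import smtplib
from email.message import EmailMessage
from sqlalchemy import create_engine, text

AUDIT_SQL = text("""
    SELECT COUNT(*) FROM fact_claims
    WHERE load_date = CURRENT_DATE AND claim_amount IS NULL
""")

def notify(subject: str, body: str) -> None:
    msg = EmailMessage()
    msg["Subject"], msg["From"], msg["To"] = subject, "etl@corp.example", "dwh-team@corp.example"
    msg.set_content(body)
    with smtplib.SMTP("mail.corp.example") as smtp:   # assumed internal mail relay
        smtp.send_message(msg)

def run_audit() -> None:
    engine = create_engine("postgresql://user:pass@dwh/warehouse")
    with engine.connect() as conn:
        bad_rows = conn.execute(AUDIT_SQL).scalar_one()
    if bad_rows:
        notify("Audit failed: null claim amounts",
               f"{bad_rows} rows loaded today have NULL claim_amount.")
        sys.exit(1)   # non-zero exit stops the batch (series monitoring)

if __name__ == "__main__":
    run_audit()       # for parallel monitoring, you would notify but not exit
```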
11. iCEDQ is Agile
(Diagram: Development Pipeline: User Story → Tech Requirements → Mapping Document → ETL Process, running in parallel with QA Pipeline: Audit Requirements → Test Case → iCEDQ Rule 1, iCEDQ Rule 2, …)
Test processes run in parallel to the development pipeline. No reason to wait!
12. iCEDQ - Central Repository & Collaboration
- Centralized repository for the rules library
- A collaborative environment to work together
- Work together regardless of location, time or role
13. iCEDQ - Feedback & Transparency
- Dashboard
- Failure & custom reports
- Integration with ALM & issue management
- Auto notification
- Ability to drill down to a defect
- Audit logs & execution history…
14. iCEDQ - What Changed? (Before → After)
- Reconcile across files & database: NO → YES
- SQL: Very complicated SQL → No SQL or simple SQL
- Rows tested: 1000… → Millions…
- Test coverage: 60% → 100%
- Repeatability & consistency: NO → YES
- Scheduling: NO → YES
- Test execution: Desktop based → Server based
- Transparency & reporting: NO → YES
- Regression testing & audit: NO → YES
- Production monitoring: NO → YES
- Cost: High → Low
- Cost of defect: High → Low
16. iCEDQ Healthcare Client
iCEDQ Usage
- Migration testing: test provider data migration from the mainframe to MDM
- Enterprise data warehouse testing: test the load of member data, enrolment data, plan data and claims data from legacy systems to the Enterprise Data Warehouse (EDW), and from Health Rules to the EDW
- External feed validation: test data feeds to the State of Maryland and to CMS (Centers for Medicare & Medicaid Services)
iCEDQ Feedback
- Helped finalize requirements: it found anomalies in the requirements and mapping documents and provided feedback
- Helped test automation: it automatically reconciled feeds from the legacy as well as the new system, which was impossible to test manually
- Transparency to management: it linked with the defect management system and auto-generated status
17. Fast Forward Data Warehouse & Migration Testing
ETL Testing & Monitoring Platform