Data Warehouse (ETL) Testing
By:
Rakesh Hansalia
in.linkedin.com/in/rakeshhansalia/
Index
Introduction
About Data Warehouse
  Data Warehouse definition
Testing Process for Data warehouse
  Requirements Testing
  Unit Testing
  Integration Testing
    Scenarios to be covered in Integration Testing
    Validating the Report data
  User Acceptance Testing
Conclusion
Introduction
This document describes the data warehouse testing process and the areas that tests should
cover. It explains why testing a data warehouse application matters and walks through the
steps of the testing process.
About Data Warehouse
A data warehouse is the main repository of an organization's historical data. It holds the data
behind management's decision support system. The main reason to use a data warehouse is
that a data analyst can run complex queries and analysis (such as data mining) on the
information in the warehouse without slowing down the operational systems.
Data Warehouse definition
• Subject-oriented: Data warehouses are designed to help you analyse data. For example,
to learn more about your company's sales data, you can build a warehouse that
concentrates on sales. Using this warehouse, you can answer questions like "Who was
our best customer for this item last year?" This ability to define a data warehouse by
subject matter, sales in this case, makes the data warehouse subject-oriented. The data
is organized so that all the data elements relating to the same real-world event or
object are linked together.
• Integrated: Integration is closely related to subject orientation. Data warehouses must
put data from disparate sources into a consistent format. The database contains data
from most or all of an organization's operational applications, and that data is made
consistent.
• Time-variant: Changes to the data in the database are tracked and recorded so that
reports can show how the data changed over time. To discover trends in business,
analysts need large amounts of data. A data warehouse's focus on change over time is
what is meant by the term time-variant.
• Non-volatile: Data in the database is never overwritten or deleted. Once committed,
the data is static and read-only, and it is retained for future reporting. This is logical
because the purpose of a data warehouse is to enable you to analyse what has occurred.
Testing Process for Data warehouse:
Testing for a Data warehouse consists of requirements testing, unit testing, integration testing
and acceptance testing.
Requirements Testing :
The main aim of requirements testing is to check the stated requirements for
completeness. Requirements can be tested against the following questions:
1. Are the requirements complete?
2. Are the requirements singular?
3. Are the requirements unambiguous?
4. Are the requirements developable?
5. Are the requirements testable?
In a data warehouse, the requirements are mostly around reporting, so it is especially
important to verify that these reporting requirements can be met using the data
available.
Successful requirements are structured closely around business rules and address
functionality and performance. These business rules and requirements provide a solid
foundation for the data architects. Using the defined requirements and business rules, the
high-level design of the data model is created. Once requirements and business rules are
available, rough scripts can be drafted to validate the data model constraints against the
defined business rules.
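As an illustration of such a rough script, the sketch below pairs each business rule with a query that returns the rows violating it. The table names (`dim_customer`, `fact_sales`), the two rules, and the sample rows are all hypothetical, and an in-memory SQLite database stands in for the real warehouse.

```python
import sqlite3

# Hypothetical business rules, each paired with a query that returns violating keys.
RULE_CHECKS = {
    "every sale references a known customer":
        "SELECT s.sale_id FROM fact_sales s "
        "LEFT JOIN dim_customer c ON c.customer_id = s.customer_id "
        "WHERE c.customer_id IS NULL ORDER BY s.sale_id",
    "sale amount is never negative":
        "SELECT sale_id FROM fact_sales WHERE amount < 0 ORDER BY sale_id",
}


def build_sample_db():
    """Create an in-memory database with made-up rows, two of which break a rule."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE fact_sales (sale_id INTEGER PRIMARY KEY,
                                 customer_id INTEGER, amount REAL);
        INSERT INTO dim_customer VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO fact_sales VALUES (10, 1, 250.0), (11, 2, -5.0), (12, 3, 40.0);
    """)
    return conn


def find_violations(conn):
    """Return {rule: [violating keys]} for every rule with at least one violation."""
    violations = {}
    for rule, query in RULE_CHECKS.items():
        keys = [row[0] for row in conn.execute(query)]
        if keys:
            violations[rule] = keys
    return violations


conn = build_sample_db()
violations = find_violations(conn)
for rule, keys in violations.items():
    print(f"VIOLATED: {rule} -> keys {keys}")
```

Keeping the rules in one dictionary means a new business rule can be added as one more entry, without touching the checking logic.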
Unit Testing :
Unit testing for data warehouses is white-box testing. It should check the ETL
procedures/mappings/jobs and the reports developed. This is usually done by the
developers.
Unit testing involves the following:
1. Checking that ETLs access and pick up the right data from the right source.
2. Checking that all data transformations are correct according to the business rules
and that the data warehouse is correctly populated with the transformed data.
3. Testing the rejected records that don’t fulfil the transformation rules.
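The checks above can be sketched as a small unit test around one transformation. The rules used here (trim and upper-case the customer name, round the amount to two decimals, convert DD/MM/YYYY dates to ISO format, reject anything that fails) are assumptions made for illustration, as are the function names `transform` and `run_etl`.

```python
from datetime import datetime


def transform(record):
    """Apply the assumed rules; return (clean_row, None) or (None, reject_reason)."""
    try:
        amount = round(float(record["amount"]), 2)
    except (KeyError, TypeError, ValueError):
        return None, "amount is not numeric"
    if amount < 0:
        return None, "amount is negative"
    try:
        parsed = datetime.strptime(record["sale_date"], "%d/%m/%Y")
    except (KeyError, TypeError, ValueError):
        return None, "sale_date is not DD/MM/YYYY"
    clean = {
        "customer": str(record.get("customer", "")).strip().upper(),
        "amount": amount,
        "sale_date": parsed.date().isoformat(),
    }
    return clean, None


def run_etl(source_rows):
    """Split source rows into loaded rows and (row, reason) rejects."""
    loaded, rejected = [], []
    for row in source_rows:
        clean, reason = transform(row)
        if clean is not None:
            loaded.append(clean)
        else:
            rejected.append((row, reason))
    return loaded, rejected


source_rows = [
    {"customer": " acme ", "amount": "250.456", "sale_date": "05/01/2024"},
    {"customer": "globex", "amount": "abc", "sale_date": "06/01/2024"},
    {"customer": "initech", "amount": "10", "sale_date": "2024-01-07"},
]
loaded, rejected = run_etl(source_rows)
reject_reasons = [reason for _, reason in rejected]
print(loaded)
print(reject_reasons)
```

A useful invariant to assert in such a test is that every source row ends up either loaded or rejected, so no record disappears silently.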
Integration Testing :
Completed unit testing forms the basis for starting integration testing.
Integration testing should exercise both initial and incremental loading of the data
warehouse. Integration testing involves the following:
1. Sequence of ETL jobs in a batch.
2. Initial loading of records into the data warehouse.
3. Incremental loading of records at a later date to verify the newly inserted or updated
data.
4. Testing the rejected records that don’t fulfil the transformation rules.
5. Error log generation.
The overall integration testing life cycle is planned in four phases: Requirements
Understanding, Test Planning and Design, Test Case Preparation, and Test Execution.
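One possible way to check initial and incremental loading is to snapshot the target after each load and classify the keys as inserted, updated, or untouched. The sketch below assumes a hypothetical `dwh_customer` table and a simplified load step that overwrites changed rows. A real warehouse may keep history rows instead, so treat this only as an outline of the comparison.

```python
import sqlite3


def load(conn, rows):
    """Hypothetical load step: insert new keys and overwrite changed ones."""
    conn.executemany(
        "INSERT OR REPLACE INTO dwh_customer (customer_id, city) VALUES (?, ?)",
        rows,
    )
    conn.commit()


def snapshot(conn):
    """Return the target table as {customer_id: city}."""
    return dict(conn.execute("SELECT customer_id, city FROM dwh_customer"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dwh_customer (customer_id INTEGER PRIMARY KEY, city TEXT)")

# Initial load: the target should contain exactly the source rows.
initial_rows = [(1, "Pune"), (2, "Mumbai")]
load(conn, initial_rows)
after_initial = snapshot(conn)

# Incremental load at a later date: one changed row and one new row.
delta_rows = [(2, "Delhi"), (3, "Chennai")]
load(conn, delta_rows)
after_incremental = snapshot(conn)

inserted = sorted(set(after_incremental) - set(after_initial))
updated = sorted(k for k in after_initial if after_incremental[k] != after_initial[k])
untouched = sorted(k for k in after_initial if after_incremental[k] == after_initial[k])
print("inserted:", inserted, "updated:", updated, "untouched:", untouched)
```

Checking the untouched keys matters as much as the changed ones: an incremental load that rewrites rows outside the delta is a defect even when the delta itself lands correctly.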
Scenarios to be covered in Integration Testing
Integration testing covers end-to-end testing for the DWH. The coverage of the tests
includes the following:
1. Count Validation
- Record count verification: run DWH backend/reporting queries against source and
target as an initial check.
2. Source Isolation
- Validation after isolating the driving sources.
3. Dimensional Analysis
- Data integrity between the various source tables and relationships.
4. Statistical Analysis
- Validation for various calculations.
5. Data Quality Validation
- Check for missing data, negatives and consistency. Field-by-field data verification
can be done to check the consistency of source and target data.
6. Granularity
- Validate at the lowest granular level possible (lowest in the hierarchy, e.g.
Country-City-Street: start with test cases on street).
7. Other validations
- Graphs, slice/dice, meaningfulness, accuracy.
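A minimal sketch of the count validation, field-by-field verification, and data quality checks listed above is shown here. The tables `src_orders` and `tgt_orders` and their rows are made up for illustration, and both live in one in-memory SQLite database, whereas a real source and target would usually sit in separate systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_orders (order_id INTEGER PRIMARY KEY, city TEXT, amount REAL);
    CREATE TABLE tgt_orders (order_id INTEGER PRIMARY KEY, city TEXT, amount REAL);
    INSERT INTO src_orders VALUES (1, 'Pune', 100.0), (2, 'Mumbai', 250.0), (3, 'Delhi', 75.0);
    INSERT INTO tgt_orders VALUES (1, 'Pune', 100.0), (2, 'Mumbai', 205.0), (3, NULL, 75.0);
""")


def row_count(conn, table):
    """Count validation. Table names come from this script, not from user input."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def field_mismatches(conn):
    """Field-by-field comparison of rows that share a key (IS NOT is NULL-safe)."""
    query = """
        SELECT s.order_id, s.city, t.city, s.amount, t.amount
        FROM src_orders s
        JOIN tgt_orders t ON t.order_id = s.order_id
        WHERE s.city IS NOT t.city OR s.amount IS NOT t.amount
        ORDER BY s.order_id
    """
    return conn.execute(query).fetchall()


def quality_issues(conn):
    """Data quality check: missing values or negative amounts in the target."""
    query = """
        SELECT order_id FROM tgt_orders
        WHERE city IS NULL OR amount IS NULL OR amount < 0
        ORDER BY order_id
    """
    return [row[0] for row in conn.execute(query)]


source_count = row_count(conn, "src_orders")
target_count = row_count(conn, "tgt_orders")
mismatches = field_mismatches(conn)
issues = quality_issues(conn)
print("counts match:", source_count == target_count)
print("field mismatches:", mismatches)
print("quality issues in target keys:", issues)
```

In this sample the counts match even though two target rows are wrong, which is why the count check serves only as an initial check and is followed by field-level verification.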
[Figure: Process for Data warehouse Testing. Inputs: Business Requirement Document/Requirement Traceability Matrix, High Level Design document. Stages shown: Requirements Testing, Review of HLD, Test Case Preparation, Functional Testing, Regression Testing, Performance Testing, User Acceptance Testing (UAT), Unit Testing. QA activities shown: QA team reviews BRD for completeness; QA team builds test plan; develop test cases and SQL queries; test execution.]
Validating the Report data
Once the ETLs are tested for count and data verification, the data shown on the
reports is of utmost importance. The QA team should verify the reported data against
the source data for consistency and accuracy.
1. Verify Report data with source
- Although the data in a data warehouse is stored at an aggregate level compared to
the source systems, the QA team should verify the granular data stored in the data
warehouse against the available source data.
2. Field level data verification
- The QA team must understand the linkages for the fields displayed in the report,
trace them back, and compare them with the source systems.
3. Creating SQLs
- Create SQL queries to fetch and verify the data from source and target.
Sometimes it’s not possible to reproduce in SQL the complex transformations done in
ETL. In such a case the data can be exported to a file and the calculations performed
there.
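The approach of exporting data to a file and recomputing the figures outside the database might look like the sketch below. It assumes a hypothetical source table `src_sales` and a report table `rpt_sales_by_city`, uses CSV text in memory as a stand-in for the exported file, and treats differences of up to 0.005 as rounding noise. All of these are illustrative choices.

```python
import csv
import io
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_sales (sale_id INTEGER PRIMARY KEY, city TEXT, amount REAL);
    CREATE TABLE rpt_sales_by_city (city TEXT PRIMARY KEY, total_amount REAL);
    INSERT INTO src_sales VALUES (1, 'Pune', 100.0), (2, 'Pune', 50.0), (3, 'Delhi', 75.0);
    INSERT INTO rpt_sales_by_city VALUES ('Pune', 150.0), ('Delhi', 70.0);
""")


def export_source(conn):
    """Dump the source rows to CSV text, standing in for an exported file."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["city", "amount"])
    writer.writerows(conn.execute("SELECT city, amount FROM src_sales"))
    return buffer.getvalue()


def expected_totals(csv_text):
    """Recompute the aggregate outside the database, from the exported rows."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["city"]] += float(row["amount"])
    return dict(totals)


def compare(conn, expected, tolerance=0.005):
    """Return {city: (expected, reported)} for every figure that disagrees."""
    reported = dict(conn.execute("SELECT city, total_amount FROM rpt_sales_by_city"))
    diffs = {}
    for city in sorted(set(expected) | set(reported)):
        exp, rep = expected.get(city), reported.get(city)
        if exp is None or rep is None or abs(exp - rep) > tolerance:
            diffs[city] = (exp, rep)
    return diffs


expected = expected_totals(export_source(conn))
diffs = compare(conn, expected)
print("expected totals:", expected)
print("differences:", diffs)
```

Recomputing the figure independently of the ETL logic is the point of the exercise: reusing the ETL's own transformation code in the test would only confirm that the code agrees with itself.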
User Acceptance Testing
Here the system is tested with full functionality and is expected to function as in production. At
the end of UAT, the system should be acceptable to the client for use in terms of ETL process
integrity, business functionality, and reporting.
Conclusion
Evolving needs of the business and changes in the source systems will drive continuous change
in the data warehouse schema and the data being loaded. Hence, development and testing
processes must be clearly defined and backed by impact analysis and strong alignment
between development, operations and the business.

 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 

Data Warehouse (ETL) testing process

  • 1. Data Warehouse(ETL) Testing By: Rakesh Hansalia in.linkedin.com/in/rakeshhansalia/
  • 2. Index
    Introduction ... 3
    About Data Warehouse ... 3
    Data Warehouse definition ... 3
    Testing Process for Data warehouse ... 3
    Requirements Testing ... 3
    Unit Testing ... 4
    Integration Testing ... 4
    Scenarios to be covered in Integration Testing ... 5
    Validating the Report data ... 6
    User Acceptance Testing ... 6
    Conclusion ... 6
  • 3. Introduction
    This document details the testing process involved in data warehouse testing and the test coverage areas. It explains the importance of data warehouse application testing and the various steps of the testing process.
    About Data Warehouse
    A data warehouse is the main repository of an organization's historical data. It contains the data for management's decision support system. The important factor leading to the use of a data warehouse is that a data analyst can perform complex queries and analysis (data mining) on the information within the data warehouse without slowing down the operational systems.
    Data Warehouse definition
    • Subject-oriented: Data warehouses are designed to help you analyse data. For example, to learn more about your company's sales data, you can build a warehouse that concentrates on sales. Using this warehouse, you can answer questions like "Who was our best customer for this item last year?" This ability to define a data warehouse by subject matter, sales in this case, makes the data warehouse subject oriented. The data is organized so that all the data elements relating to the same real-world event or object are linked together.
    • Integrated: Integration is closely related to subject orientation. Data warehouses must put data from disparate sources into a consistent format. The database contains data from most or all of an organization's operational applications, and this data is made consistent.
    • Time-variant: The changes to the data in the database are tracked and recorded so that reports can show how the data changed over time. In order to discover trends in business, analysts need large amounts of data. A data warehouse's focus on change over time is what is meant by the term time variant.
    • Non-volatile: Data in the database is never overwritten or deleted. Once committed, the data is static and read-only, but retained for future reporting. Once entered into the warehouse, data should not change.
    This is logical because the purpose of a data warehouse is to enable you to analyse what has occurred.
    Testing Process for Data warehouse
    Testing for a data warehouse consists of requirements testing, unit testing, integration testing and acceptance testing.
    Requirements Testing
    The main aim of requirements testing is to check the stated requirements for completeness. Requirements can be tested on the following factors:
    1. Are the requirements complete?
    2. Are the requirements singular?
    3. Are the requirements unambiguous?
    4. Are the requirements developable?
    5. Are the requirements testable?
    In a data warehouse, the requirements are mostly around reporting. Hence it becomes more important to verify whether these reporting requirements can be met using the data available. Successful requirements are those that are structured closely around business rules and that address functionality and performance. These business rules and requirements provide a solid foundation to the data architects.
  • 4. Using the defined requirements and business rules, a high-level design of the data model is created. Once requirements and business rules are available, rough scripts can be drafted to validate the data model constraints against the defined business rules.
    Unit Testing
    Unit testing for data warehouses is white-box testing. It should check the ETL procedures/mappings/jobs and the reports developed. This is usually done by the developers. Unit testing involves checking the following:
    1. Whether the ETLs are accessing and picking up the right data from the right source.
    2. Whether all the data transformations are correct according to the business rules, and whether the data warehouse is correctly populated with the transformed data.
    3. Whether records that don't fulfil the transformation rules are rejected as expected.
    Integration Testing
    Completed unit testing should form the basis for starting integration testing. Integration testing should test the initial and incremental loading of the data warehouse. Integration testing involves the following:
    1. Sequence of ETL jobs in a batch.
    2. Initial loading of records into the data warehouse.
    3. Incremental loading of records at a later date, to verify the newly inserted or updated data.
    4. Testing the rejected records that don't fulfil the transformation rules.
    5. Error log generation.
    The overall integration testing life cycle is planned in four phases: Requirements Understanding, Test Planning and Design, Test Case Preparation, and Test Execution.
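    The unit and integration checks listed above (transformation rules, rejected records, initial and incremental loads, error logging) can be illustrated with a small sketch. Everything in it is hypothetical and not taken from the slides: the `Sale` record, the business rules inside `transform`, and the in-memory dictionary standing in for the warehouse are assumptions chosen only to keep the example self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sale:
    sale_id: int
    country: str
    amount: float


def transform(raw: dict) -> Sale:
    """Apply assumed business rules; raise ValueError to reject a record."""
    if raw.get("amount") is None or raw["amount"] < 0:
        raise ValueError("amount must be present and non-negative")
    if not raw.get("country"):
        raise ValueError("country is required")
    # Assumed rule: country codes are stored trimmed and upper-cased.
    return Sale(raw["sale_id"], raw["country"].strip().upper(), float(raw["amount"]))


def load(batch, warehouse, error_log):
    """Load one batch: insert or update by sale_id and log rejected records."""
    for raw in batch:
        try:
            sale = transform(raw)
        except ValueError as exc:
            error_log.append((raw.get("sale_id"), str(exc)))
            continue
        warehouse[sale.sale_id] = sale


warehouse, error_log = {}, []

# Initial load: record 2 violates the amount rule and should be rejected.
initial_batch = [
    {"sale_id": 1, "country": " in ", "amount": 100.0},
    {"sale_id": 2, "country": "us", "amount": -5.0},
    {"sale_id": 3, "country": "de", "amount": 42.5},
]
load(initial_batch, warehouse, error_log)

# Incremental load at a later date: one update (3) and one new record (4).
incremental_batch = [
    {"sale_id": 3, "country": "de", "amount": 50.0},
    {"sale_id": 4, "country": "fr", "amount": 10.0},
]
load(incremental_batch, warehouse, error_log)
```

    A test would then assert on three things: the set of keys present in the warehouse after each load, the transformed field values, and the contents of the error log for the rejected record.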
  • 5. Scenarios to be covered in Integration Testing
    Integration testing covers end-to-end testing for the DWH. The coverage of the tests includes the following:
    1. Count Validation - Record count verification, running DWH backend/reporting queries against source and target as an initial check.
    2. Source Isolation - Validation after isolating the driving sources.
    3. Dimensional Analysis - Data integrity between the various source tables and relationships.
    4. Statistical Analysis - Validation for various calculations.
    5. Data Quality Validation - Check for missing data, negatives and consistency. Field-by-field data verification can be done to check the consistency of source and target data.
    6. Granularity - Validate at the lowest granular level possible (lowest in the hierarchy, e.g. for Country-City-Street, start with test cases on street).
    7. Other validations - Graphs, slice/dice, meaningfulness, accuracy.
    Figure: Process for Data warehouse Testing. Boxes recoverable from the diagram: Business Requirement Document/Requirement Traceability Matrix, Requirements Testing, High Level Design document, Review of HLD, Test Case Preparation, Unit Testing, Functional Testing, Regression Testing, Performance Testing, User Acceptance Testing (UAT). QA activities shown: QA team reviews BRD for completeness, QA team builds test plan, develop test cases and SQL queries, test execution.
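    As an illustration of the count validation and data quality scenarios, the sketch below runs the checks as SQL from Python against an in-memory SQLite database. The table names `src_sales` and `dwh_sales`, their columns, and the sample rows are assumptions made for this example; a real test would run equivalent queries against the actual source and target systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_sales (sale_id INTEGER PRIMARY KEY, city TEXT, amount REAL);
    CREATE TABLE dwh_sales (sale_id INTEGER PRIMARY KEY, city TEXT, amount REAL);
    INSERT INTO src_sales VALUES (1, 'Pune', 100.0), (2, 'Delhi', 250.0), (3, 'Mumbai', 75.0);
    -- The target deliberately contains a wrong amount (row 2) and a missing city (row 3).
    INSERT INTO dwh_sales VALUES (1, 'Pune', 100.0), (2, 'Delhi', 205.0), (3, NULL, 75.0);
""")


def scalar(sql):
    """Return the single value produced by a one-row, one-column query."""
    return conn.execute(sql).fetchone()[0]


# Count validation: record counts of source and target as an initial check.
source_count = scalar("SELECT COUNT(*) FROM src_sales")
target_count = scalar("SELECT COUNT(*) FROM dwh_sales")
counts_match = source_count == target_count

# Data quality validation: missing data and negative values in the target.
missing_or_negative = conn.execute(
    "SELECT sale_id FROM dwh_sales "
    "WHERE city IS NULL OR amount IS NULL OR amount < 0 "
    "ORDER BY sale_id"
).fetchall()

# Field-by-field verification: source rows with no identical row in the target.
mismatched = conn.execute(
    "SELECT sale_id, city, amount FROM src_sales "
    "EXCEPT "
    "SELECT sale_id, city, amount FROM dwh_sales "
    "ORDER BY sale_id"
).fetchall()
```

    Here the counts match even though two rows differ, which is why the count check is only an initial check and is followed by field-level comparison.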
  • 6. Validating the Report data
    Once the ETLs are tested for count and data verification, the data shown in the reports is of utmost importance. The QA team should verify the reported data against the source data for consistency and accuracy.
    1. Verify report data with source - Although the data in a data warehouse is stored at an aggregate level compared to the source systems, the QA team should verify the granular data stored in the data warehouse against the source data available.
    2. Field-level data verification - The QA team must understand the linkages for the fields displayed in the report, and should trace them back and compare them with the source systems.
    3. Creating SQLs - Create SQL queries to fetch and verify the data from source and target. Sometimes it is not possible to reproduce in a query the complex transformations done in ETL. In such a case the data can be exported to a file and the calculations performed there.
    User Acceptance Testing
    Here the system is tested with full functionality and is expected to function as in production. At the end of UAT, the system should be acceptable to the client for use in terms of ETL process integrity, business functionality and reporting.
    Conclusion
    Evolving needs of the business and changes in the source systems will drive continuous change in the data warehouse schema and the data being loaded. Hence, it is necessary that development and testing processes are clearly defined, followed by impact analysis and strong alignment between development, operations and the business.
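    The idea of exporting data to a file and recomputing report figures outside the ETL can be sketched as follows. The CSV layout, the per-country totals and the `report_totals` values are invented for illustration; an in-memory `io.StringIO` stands in for the exported file so that the example runs on its own.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Hypothetical granular extract from the source system (stands in for a file).
source_extract = io.StringIO(
    "country,city,amount\n"
    "IN,Pune,100.50\n"
    "IN,Delhi,250.25\n"
    "US,Austin,75.00\n"
)

# Hypothetical aggregate figures as displayed on the report under test.
report_totals = {"IN": Decimal("350.75"), "US": Decimal("57.00")}

# Recompute the aggregate from the granular rows, using Decimal to avoid
# floating-point rounding noise when comparing monetary totals.
recomputed = defaultdict(Decimal)
for row in csv.DictReader(source_extract):
    recomputed[row["country"]] += Decimal(row["amount"])

# Collect every reported figure that disagrees with the recomputed value.
discrepancies = {
    country: (recomputed.get(country), reported)
    for country, reported in report_totals.items()
    if recomputed.get(country) != reported
}
```

    This sketch only checks the countries that appear on the report. A fuller comparison would also flag countries present in the source extract but missing from the report.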