paper
Topic: Ensure Accuracy in Data Transformation with Data Testing Framework (DTF)
Abstract:
Need for a Data Testing Framework
Testing plays a high-stakes role in helping businesses make insightful and intelligent decisions using
available information. Given the growing complexity of the Data Warehousing and Business Intelligence
space in the IT industry, L&T InfoTech has developed a cost-effective solution to address the following
challenges faced by clients:
• Unavailability of comprehensive testing tools
• Varied skill sets required to understand various file formats
• Voluminous data from heterogeneous sources
• 100% data validation is not feasible
• Manual comparison of data is tedious and error-prone
Data Testing Framework is a testing framework that easily integrates with users’ needs for different
types of data validation processes. It enables users to compare and validate data across various types of
data sources and databases.
DTF Overview
Information stored in a data warehouse is critical to organizations for decision making and predictive
analysis. The huge volume of data loaded into a data warehouse makes exhaustive manual comparison of
data impractical. The existing quality tools are either manual or have other limitations, and do not cover
all aspects of data warehouse testing. Therefore, a holistic solution is required to test high-volume
applications that are built on a Data Warehouse (DW) or Business Intelligence (BI) architecture.
DTF is an open-source data validation and comparison framework that allows a user to perform
data-centric testing. Its simple User Interface (UI) enables users to easily configure the tool as per their
testing needs. The framework also provides detailed results of the test cases, enabling faster analysis of
test results.
What is DTF?
DTF has been developed by synthesizing years of experience in the Database Testing area. DTF can
be used to compare data from two different data feeds after data migration or reconciliation. The
source and target data feeds can each be a database table, a database query, a flat file, a CSV file, a PSV
file or an Excel file. DTF has a proven track record of comparing high volumes of data and supports
leading databases in the market.
DTF can be configured to perform the following types of comparisons:
• File to File comparator
• File to Database table comparator
• Database to Database comparator
• Query output to File comparator
• Database Table comparator
• Database Table to Query output
• Database table to fixed-length file comparator
• Database table to XML comparator
• Database table to Stored Procedure output comparator
DTF provides a user-friendly UI for testers from non-technical backgrounds and allows them to
configure the tool to operate in different modes for different types of comparisons.
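Comparing feeds of different kinds presupposes that both can be read into a common row format. The following is a minimal Python sketch of that idea, not DTF's actual implementation: the function names, the sample pipe-separated data and the in-memory SQLite table are all assumptions made for illustration.

```python
import csv
import io
import sqlite3


def rows_from_delimited(text, delimiter=","):
    """Parse delimited text (CSV or PSV) into string tuples, skipping the header row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    next(reader, None)  # drop the header row
    return [tuple(row) for row in reader]


def rows_from_query(conn, sql):
    """Run a query and return rows as string tuples so they compare with file data."""
    return [
        tuple("" if value is None else str(value) for value in row)
        for row in conn.execute(sql)
    ]


# Hypothetical sample feeds: a pipe-separated file and an in-memory table.
psv_text = "id|name\n1|Asha\n2|Ravi\n"
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])

file_rows = rows_from_delimited(psv_text, delimiter="|")
table_rows = rows_from_query(conn, "SELECT id, name FROM customer ORDER BY id")
print(file_rows == table_rows)
```

Converting every value to a string is one simple way to make file data and database data comparable; a real tool would likely need type-aware handling for dates and decimals.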
Execution Steps
Common test scenarios required for data conversion testing can be broadly classified into the following
categories:
• Table/schema validation (includes the verification of indexes, stored procedures and triggers)
• Count and data validation
• Data character set conversion
• File processing (in cases where the source is a file)
• Batch job and business rule validation
• Interface testing
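As an illustration of the first two categories, the sketch below checks column names and row counts between a source and a target table. It uses SQLite only because it ships with Python; the table name, the sample data and the helper functions are hypothetical and are not part of DTF.

```python
import sqlite3


def row_count(conn, table):
    """Return the number of rows in a table (table name comes from trusted test config)."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def column_names(conn, table):
    """Return the column names of a SQLite table in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# Hypothetical source and target databases holding the same table definition.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for conn in (source, target):
    conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 25.5)])
target.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0)])

schema_matches = column_names(source, "orders") == column_names(target, "orders")
count_matches = row_count(source, "orders") == row_count(target, "orders")
print(schema_matches, count_matches)
```

Here the schema check passes while the count check fails, because one row was not loaded into the target.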
Figure 1: DTF Process
The process for data testing using DTF is as follows:
1. Analyze – Study the data model of the source and the target databases to understand the
conversion process. If the source is a flat file, analyze the file’s structure and its mapping with
the target database.
2. Data Mapping – The mapping between the source and the target databases and tables needs to be
configured in the DTF. If there are no schema changes, mapping the source and target
databases at database level is enough. There may be scenarios where either the data of one
source table is distributed to multiple target tables or the data of multiple source tables is merged
into one target table. In such cases, the mapping of the source and the target tables needs to
be configured in the DTF at column level.
3. Test Case Creation – The test cases for various data comparison and validation scenarios can be
created in DTF using the configured data mappings. DTF also gives the user the option to create test
suites and execute multiple test cases in a single framework execution.
4. Execute & Report – A DTF test case or test suite is executed by supplying run-time execution
options. Some of the available execution options are:
• Trim Data before Comparison
• Ignore case in Comparison
• Database Schema Comparison
• Full Database Comparison
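The first two options can be pictured as a normalization step applied to every row before it is compared. This is a minimal sketch under that assumption; the function name and its flags are hypothetical, and the way DTF applies these options internally is not described here.

```python
def normalize(row, trim=True, ignore_case=False):
    """Return a copy of the row with the selected comparison options applied."""
    cleaned = []
    for value in row:
        text = str(value)
        if trim:
            text = text.strip()  # "Trim Data before Comparison"
        if ignore_case:
            text = text.lower()  # "Ignore case in Comparison"
        cleaned.append(text)
    return tuple(cleaned)


source_row = (" Pune ", "MH")
target_row = ("pune", "mh")
print(normalize(source_row, ignore_case=True) == normalize(target_row, ignore_case=True))
```

With both options enabled the two rows are treated as equal; with `ignore_case=False` they would be reported as a mismatch.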
Once the execution is complete, a detailed report is generated which gives the following details:
• Summary report
• Mismatched records
• Extra records in source
• Extra records in Target
All reports are generated as spreadsheets, which are detailed and convenient to analyze.
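The report categories above map naturally onto a keyed comparison of the two feeds. The sketch below is one possible way to derive them, assuming every record carries a unique key in a known column; the function name, the key position and the sample rows are illustrative and are not taken from DTF.

```python
def compare_feeds(source_rows, target_rows, key_index=0):
    """Classify rows into mismatched, extra-in-source and extra-in-target by key."""
    # Assumption: keys are unique within each feed.
    source = {row[key_index]: row for row in source_rows}
    target = {row[key_index]: row for row in target_rows}
    shared = sorted(source.keys() & target.keys())

    mismatched = [(source[k], target[k]) for k in shared if source[k] != target[k]]
    extra_in_source = [source[k] for k in sorted(source.keys() - target.keys())]
    extra_in_target = [target[k] for k in sorted(target.keys() - source.keys())]
    summary = {
        "source_count": len(source),
        "target_count": len(target),
        "matched": len(shared) - len(mismatched),
        "mismatched": len(mismatched),
        "extra_in_source": len(extra_in_source),
        "extra_in_target": len(extra_in_target),
    }
    return summary, mismatched, extra_in_source, extra_in_target


source_rows = [("1", "A"), ("2", "B"), ("3", "C")]
target_rows = [("1", "A"), ("2", "X"), ("4", "D")]
summary, mismatched, extra_in_source, extra_in_target = compare_feeds(source_rows, target_rows)
print(summary)
```

Each of the four return values corresponds to one sheet or section of the report described above.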
Building Blocks of DTF
Figure 2: DTF Building Blocks
DTF comprises the following three blocks:
• DTF Util Manager - DTF Util Manager is responsible for reading/writing data from and to
files/databases and for data conversion, if required, for internal DTF logic. It ensures that the source
and target data are in the same format before the data goes to the DTF Compare Engine. It implements
the logic for all activities other than the actual data comparison and report generation.
• DTF Compare Engine - DTF Compare Engine is responsible for the actual comparison of source and
target data. If the data volume is large, it divides the data into chunks of a predefined size and compares
them. Chunk formation and data comparison are done in parallel to speed up the
comparison. This engine communicates with DTF Report Manager to pass on the details of the comparison
execution result.
• DTF Report Manager - DTF Report Manager is responsible for generating DTF reports by taking
comparison execution results from DTF Compare Engine. It generates reports in Excel format.
Reports are generated in two categories: summary reports and detailed reports. It takes the comparison
execution time as a reference and creates a folder with that name to store the reports for every
execution.
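The chunked, parallel comparison attributed to the Compare Engine can be sketched as follows. This is an illustration of the general technique rather than DTF's code (DTF runs on the JRE): it assumes both feeds are already sorted in the same order and have the same number of rows, and the chunk size, worker count and function names are arbitrary choices.

```python
from concurrent.futures import ThreadPoolExecutor


def chunks(rows, size):
    """Yield (start_index, chunk) pairs of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield start, rows[start:start + size]


def compare_chunk(start, source_chunk, target_chunk):
    """Return the absolute positions at which the two chunks differ."""
    return [
        start + offset
        for offset, (s, t) in enumerate(zip(source_chunk, target_chunk))
        if s != t
    ]


def compare_in_chunks(source_rows, target_rows, chunk_size=1000, workers=4):
    """Compare aligned feeds chunk by chunk, submitting each chunk to a worker pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        jobs = [
            pool.submit(compare_chunk, start, chunk, target_rows[start:start + chunk_size])
            for start, chunk in chunks(source_rows, chunk_size)
        ]
        mismatches = []
        for job in jobs:  # collect in submission order so positions stay sorted
            mismatches.extend(job.result())
    return mismatches


source_rows = [(i, i * 2) for i in range(10)]
target_rows = list(source_rows)
target_rows[3] = (3, 99)
target_rows[7] = (7, 0)
print(compare_in_chunks(source_rows, target_rows, chunk_size=4, workers=2))
```

In CPython, threads mainly help when chunks are being read from files or databases; for CPU-bound comparison a process pool (`ProcessPoolExecutor`) would be the closer analogue.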
In addition to the three primary blocks, DTF has the following building blocks, each of which
represents a different data feed:
• Excel Files
• Flat Files
• Database Tables
• Database Query
The Excel Config file block represents the Excel files used as configuration input. Typically, a user lists
the parameters for comparison between the source and destination in these configuration file(s).
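The layout of these configuration files is not described here, so the following sketch only illustrates the kind of parameters one comparison entry might carry, including a column-level mapping between source and target. Every field name and sample value is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ComparisonConfig:
    """One hypothetical configuration entry describing a single comparison."""
    source: str                  # e.g. a file path or a query
    target: str                  # e.g. a table name
    key_column: str              # target column that identifies a record
    column_mapping: dict = field(default_factory=dict)  # source column -> target column
    trim: bool = True
    ignore_case: bool = False


def to_target_columns(record, config):
    """Rename a source record's columns to target names; unmapped columns are dropped."""
    return {
        config.column_mapping[column]: value
        for column, value in record.items()
        if column in config.column_mapping
    }


config = ComparisonConfig(
    source="customers.csv",
    target="DW.CUSTOMER",
    key_column="CUST_ID",
    column_mapping={"id": "CUST_ID", "full_name": "CUST_NAME"},
)
record = {"id": "1", "full_name": "Asha", "legacy_flag": "Y"}
print(to_target_columns(record, config))
```

Once source records are expressed in target column names, they can be compared directly with rows read from the target.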
The DTF Report block represents the DTF summary report as well as the DTF detailed reports generated
after comparison execution.
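A per-execution report folder named after the execution time could be produced along these lines. This sketch writes CSV files because Python's standard library has no Excel writer, whereas DTF itself is described as producing Excel reports; the folder name format, file names and function name are assumptions.

```python
import csv
import tempfile
from datetime import datetime
from pathlib import Path


def write_reports(base_dir, summary, mismatches, started_at=None):
    """Write summary and detail reports into a folder named after the execution time."""
    started_at = started_at or datetime.now()
    run_dir = Path(base_dir) / started_at.strftime("%Y%m%d_%H%M%S")
    run_dir.mkdir(parents=True, exist_ok=True)

    with open(run_dir / "summary.csv", "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["metric", "value"])
        writer.writerows(summary.items())

    with open(run_dir / "mismatched_records.csv", "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["source_row", "target_row"])
        writer.writerows(mismatches)

    return run_dir


# Demonstration in a temporary directory with a fixed, hypothetical execution time.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = write_reports(
        tmp,
        summary={"mismatched": 1, "extra_in_source": 0},
        mismatches=[("2|B", "2|X")],
        started_at=datetime(2024, 1, 2, 3, 4, 5),
    )
    report_names = sorted(path.name for path in run_dir.iterdir())
print(run_dir.name, report_names)
```

Keeping one folder per execution means earlier results are never overwritten and runs can be compared over time.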
Software Requirements
• JRE 1.6
• Microsoft Office
• Windows Operating System
Hardware Requirements
• 1 GB RAM or greater
• 3 GHz CPU
Benefits offered by DTF
• DTF is a very cost-effective solution as it is developed using open-source tools.
• Detailed reports help in identifying problems.
• Reduction in test execution effort.
• Reusability of the framework across different Data Warehousing projects.
• Less maintenance because of the modular structure of the framework.
• Ability to work with different types of data feeds.
• Easier result analysis through Excel sheets.
Differentiators
• Simple test script creation and execution
• Tester productivity increased with improved quality of testing
• Cost savings of 30%
• Compressed testing cycle
Conclusion
DTF, a framework based on open-source technology that supports all databases currently available in the
market, creates detailed reports that help organizations identify defects and take corrective actions based
on the inputs. Enterprises are thus able to achieve cost and effort savings with enhanced test coverage
through automation. Accurate, real-time information is readily available to help in making informed
decisions.
Conclusion
DTF, the open source technology based framework that supports all databases currently available in the
market, creates detailed reports that help organizations identify defects and take corrective actions based
on the inputs. Enterprises are thus able to achieve cost and efforts savings with enhanced test coverage
through automation. Accurate, real-time information is readily available to help in making informed
decisions.

More Related Content

What's hot

Cts informatica interview question answers
Cts informatica interview question answersCts informatica interview question answers
Cts informatica interview question answersSweta Singh
 
Physical Database Design & Performance
Physical Database Design & PerformancePhysical Database Design & Performance
Physical Database Design & PerformanceAbdullah Khosa
 
3 tier data warehouse
3 tier data warehouse3 tier data warehouse
3 tier data warehouseJ M
 
SAS DATAFLUX DATA MANAGEMENT STUDIO TRAINING
SAS DATAFLUX DATA MANAGEMENT STUDIO TRAININGSAS DATAFLUX DATA MANAGEMENT STUDIO TRAINING
SAS DATAFLUX DATA MANAGEMENT STUDIO TRAININGbidwhm
 
Informatica interview questions and answers|Informatica Faqs 2014
Informatica interview questions and answers|Informatica Faqs 2014Informatica interview questions and answers|Informatica Faqs 2014
Informatica interview questions and answers|Informatica Faqs 2014BigClasses.com
 
Introduction To Msbi By Yasir
Introduction To Msbi By YasirIntroduction To Msbi By Yasir
Introduction To Msbi By Yasiryasir873
 
What is ETL testing & how to enforce it in Data Wharehouse
What is ETL testing & how to enforce it in Data WharehouseWhat is ETL testing & how to enforce it in Data Wharehouse
What is ETL testing & how to enforce it in Data WharehouseBugRaptors
 
Physical database design(database)
Physical database design(database)Physical database design(database)
Physical database design(database)welcometofacebook
 
Sas dataflux management studio Training ,data flux corporate trainig
Sas dataflux management studio Training ,data flux corporate trainig Sas dataflux management studio Training ,data flux corporate trainig
Sas dataflux management studio Training ,data flux corporate trainig bidwhm
 
Data Warehouses & Deployment By Ankita dubey
Data Warehouses & Deployment By Ankita dubeyData Warehouses & Deployment By Ankita dubey
Data Warehouses & Deployment By Ankita dubeyAnkita Dubey
 
Chapter12 designing databases
Chapter12 designing databasesChapter12 designing databases
Chapter12 designing databasesDhani Ahmad
 
Data Warehouse (ETL) testing process
Data Warehouse (ETL) testing processData Warehouse (ETL) testing process
Data Warehouse (ETL) testing processRakesh Hansalia
 
Lecture 04 - Granularity in the Data Warehouse
Lecture 04 - Granularity in the Data WarehouseLecture 04 - Granularity in the Data Warehouse
Lecture 04 - Granularity in the Data Warehousephanleson
 
Introduction to ETL process
Introduction to ETL process Introduction to ETL process
Introduction to ETL process Omid Vahdaty
 
127556030 bisp-informatica-question-collections
127556030 bisp-informatica-question-collections127556030 bisp-informatica-question-collections
127556030 bisp-informatica-question-collectionsAmit Sharma
 

What's hot (20)

Data warehouse physical design
Data warehouse physical designData warehouse physical design
Data warehouse physical design
 
Cts informatica interview question answers
Cts informatica interview question answersCts informatica interview question answers
Cts informatica interview question answers
 
Physical Database Design & Performance
Physical Database Design & PerformancePhysical Database Design & Performance
Physical Database Design & Performance
 
3 tier data warehouse
3 tier data warehouse3 tier data warehouse
3 tier data warehouse
 
SAS DATAFLUX DATA MANAGEMENT STUDIO TRAINING
SAS DATAFLUX DATA MANAGEMENT STUDIO TRAININGSAS DATAFLUX DATA MANAGEMENT STUDIO TRAINING
SAS DATAFLUX DATA MANAGEMENT STUDIO TRAINING
 
Informatica interview questions and answers|Informatica Faqs 2014
Informatica interview questions and answers|Informatica Faqs 2014Informatica interview questions and answers|Informatica Faqs 2014
Informatica interview questions and answers|Informatica Faqs 2014
 
Introduction To Msbi By Yasir
Introduction To Msbi By YasirIntroduction To Msbi By Yasir
Introduction To Msbi By Yasir
 
What is ETL testing & how to enforce it in Data Wharehouse
What is ETL testing & how to enforce it in Data WharehouseWhat is ETL testing & how to enforce it in Data Wharehouse
What is ETL testing & how to enforce it in Data Wharehouse
 
Physical database design(database)
Physical database design(database)Physical database design(database)
Physical database design(database)
 
Sas dataflux management studio Training ,data flux corporate trainig
Sas dataflux management studio Training ,data flux corporate trainig Sas dataflux management studio Training ,data flux corporate trainig
Sas dataflux management studio Training ,data flux corporate trainig
 
Transaction
TransactionTransaction
Transaction
 
Data Warehouses & Deployment By Ankita dubey
Data Warehouses & Deployment By Ankita dubeyData Warehouses & Deployment By Ankita dubey
Data Warehouses & Deployment By Ankita dubey
 
Chapter12 designing databases
Chapter12 designing databasesChapter12 designing databases
Chapter12 designing databases
 
Etl testing
Etl testingEtl testing
Etl testing
 
Data Warehouse (ETL) testing process
Data Warehouse (ETL) testing processData Warehouse (ETL) testing process
Data Warehouse (ETL) testing process
 
Lecture 04 - Granularity in the Data Warehouse
Lecture 04 - Granularity in the Data WarehouseLecture 04 - Granularity in the Data Warehouse
Lecture 04 - Granularity in the Data Warehouse
 
Introduction to ETL and Data Integration
Introduction to ETL and Data IntegrationIntroduction to ETL and Data Integration
Introduction to ETL and Data Integration
 
Introduction to ETL process
Introduction to ETL process Introduction to ETL process
Introduction to ETL process
 
Fundamentals of Database Design
Fundamentals of Database DesignFundamentals of Database Design
Fundamentals of Database Design
 
127556030 bisp-informatica-question-collections
127556030 bisp-informatica-question-collections127556030 bisp-informatica-question-collections
127556030 bisp-informatica-question-collections
 

Viewers also liked

10 3 all handouts animal diversity 2010 jewett edit
10 3 all handouts animal diversity 2010 jewett edit10 3 all handouts animal diversity 2010 jewett edit
10 3 all handouts animal diversity 2010 jewett editMrJewett
 
The australian sea lion [recovered]
The australian sea lion [recovered]The australian sea lion [recovered]
The australian sea lion [recovered]y3ehps
 
Dissertation Proposal Meeting
Dissertation Proposal MeetingDissertation Proposal Meeting
Dissertation Proposal Meetingroycekimmons
 
07 reflejos
07 reflejos07 reflejos
07 reflejosjotesoul
 

Viewers also liked (6)

10 3 all handouts animal diversity 2010 jewett edit
10 3 all handouts animal diversity 2010 jewett edit10 3 all handouts animal diversity 2010 jewett edit
10 3 all handouts animal diversity 2010 jewett edit
 
The australian sea lion [recovered]
The australian sea lion [recovered]The australian sea lion [recovered]
The australian sea lion [recovered]
 
Totalitarismo 1 em
Totalitarismo 1 emTotalitarismo 1 em
Totalitarismo 1 em
 
Dissertation Proposal Meeting
Dissertation Proposal MeetingDissertation Proposal Meeting
Dissertation Proposal Meeting
 
07 reflejos
07 reflejos07 reflejos
07 reflejos
 
Calcium
CalciumCalcium
Calcium
 

Similar to Ranjitbanshpal1

Ajith_kumar_4.3 Years_Informatica_ETL
Ajith_kumar_4.3 Years_Informatica_ETLAjith_kumar_4.3 Years_Informatica_ETL
Ajith_kumar_4.3 Years_Informatica_ETLAjith Kumar Pampatti
 
Test strategy utilising mc useful tools
Test strategy utilising mc useful toolsTest strategy utilising mc useful tools
Test strategy utilising mc useful toolsMark Chappell
 
Managing Data Integration Initiatives
Managing Data Integration InitiativesManaging Data Integration Initiatives
Managing Data Integration InitiativesAllinConsulting
 
Office automation system report
Office automation system reportOffice automation system report
Office automation system reportAmit Kulkarni
 
Office automation system report
Office automation system reportOffice automation system report
Office automation system reportAmit Kulkarni
 
E&P data management: Implementing data standards
E&P data management: Implementing data standardsE&P data management: Implementing data standards
E&P data management: Implementing data standardsETLSolutions
 
rough-work.pptx
rough-work.pptxrough-work.pptx
rough-work.pptxsharpan
 
3 Software Estmation.ppt
3 Software Estmation.ppt3 Software Estmation.ppt
3 Software Estmation.pptSoham De
 
TensorFlow Extension (TFX) and Apache Beam
TensorFlow Extension (TFX) and Apache BeamTensorFlow Extension (TFX) and Apache Beam
TensorFlow Extension (TFX) and Apache Beammarkgrover
 
MetaSuite productfolder- ETL-Tool für große Datenmengen
MetaSuite productfolder- ETL-Tool für große DatenmengenMetaSuite productfolder- ETL-Tool für große Datenmengen
MetaSuite productfolder- ETL-Tool für große DatenmengenMinerva SoftCare GmbH
 
Mukhtar_Resume_ETL_Developer
Mukhtar_Resume_ETL_DeveloperMukhtar_Resume_ETL_Developer
Mukhtar_Resume_ETL_DeveloperMukhtar Mohammed
 
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...Daniel Zivkovic
 
Mukhtar resume etl_developer
Mukhtar resume etl_developerMukhtar resume etl_developer
Mukhtar resume etl_developerMukhtar Mohammed
 
MuleSoft Surat Virtual Meetup#30 - Flat File Schemas Transformation With Mule...
MuleSoft Surat Virtual Meetup#30 - Flat File Schemas Transformation With Mule...MuleSoft Surat Virtual Meetup#30 - Flat File Schemas Transformation With Mule...
MuleSoft Surat Virtual Meetup#30 - Flat File Schemas Transformation With Mule...Jitendra Bafna
 

Similar to Ranjitbanshpal1 (20)

Ajith_kumar_4.3 Years_Informatica_ETL
Ajith_kumar_4.3 Years_Informatica_ETLAjith_kumar_4.3 Years_Informatica_ETL
Ajith_kumar_4.3 Years_Informatica_ETL
 
Data Mesh
Data MeshData Mesh
Data Mesh
 
Test strategy utilising mc useful tools
Test strategy utilising mc useful toolsTest strategy utilising mc useful tools
Test strategy utilising mc useful tools
 
Managing Data Integration Initiatives
Managing Data Integration InitiativesManaging Data Integration Initiatives
Managing Data Integration Initiatives
 
Office automation system report
Office automation system reportOffice automation system report
Office automation system report
 
Office automation system report
Office automation system reportOffice automation system report
Office automation system report
 
E&P data management: Implementing data standards
E&P data management: Implementing data standardsE&P data management: Implementing data standards
E&P data management: Implementing data standards
 
Planning Data Warehouse
Planning Data WarehousePlanning Data Warehouse
Planning Data Warehouse
 
Info sphere overview
Info sphere overviewInfo sphere overview
Info sphere overview
 
rough-work.pptx
rough-work.pptxrough-work.pptx
rough-work.pptx
 
3 Software Estmation.ppt
3 Software Estmation.ppt3 Software Estmation.ppt
3 Software Estmation.ppt
 
TensorFlow Extension (TFX) and Apache Beam
TensorFlow Extension (TFX) and Apache BeamTensorFlow Extension (TFX) and Apache Beam
TensorFlow Extension (TFX) and Apache Beam
 
Streaming is a Detail
Streaming is a DetailStreaming is a Detail
Streaming is a Detail
 
Unit 4.pptx
Unit 4.pptxUnit 4.pptx
Unit 4.pptx
 
MetaSuite productfolder- ETL-Tool für große Datenmengen
MetaSuite productfolder- ETL-Tool für große DatenmengenMetaSuite productfolder- ETL-Tool für große Datenmengen
MetaSuite productfolder- ETL-Tool für große Datenmengen
 
Abdul ETL Resume
Abdul ETL ResumeAbdul ETL Resume
Abdul ETL Resume
 
Mukhtar_Resume_ETL_Developer
Mukhtar_Resume_ETL_DeveloperMukhtar_Resume_ETL_Developer
Mukhtar_Resume_ETL_Developer
 
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
 
Mukhtar resume etl_developer
Mukhtar resume etl_developerMukhtar resume etl_developer
Mukhtar resume etl_developer
 
MuleSoft Surat Virtual Meetup#30 - Flat File Schemas Transformation With Mule...
MuleSoft Surat Virtual Meetup#30 - Flat File Schemas Transformation With Mule...MuleSoft Surat Virtual Meetup#30 - Flat File Schemas Transformation With Mule...
MuleSoft Surat Virtual Meetup#30 - Flat File Schemas Transformation With Mule...
 

More from ranjit banshpal

Designing Hybrid Cryptosystem for Secure Transmission of Image Data using Bio...
Designing Hybrid Cryptosystem for Secure Transmission of Image Data using Bio...Designing Hybrid Cryptosystem for Secure Transmission of Image Data using Bio...
Designing Hybrid Cryptosystem for Secure Transmission of Image Data using Bio...ranjit banshpal
 
SECURE IMAGE RETRIEVAL BASED ON HYBRID FEATURES AND HASHES
SECURE IMAGE RETRIEVAL BASED ON HYBRID FEATURES AND HASHESSECURE IMAGE RETRIEVAL BASED ON HYBRID FEATURES AND HASHES
SECURE IMAGE RETRIEVAL BASED ON HYBRID FEATURES AND HASHESranjit banshpal
 
Secure Image Retrieval based on Hybrid Features and Hashes
Secure Image Retrieval based on Hybrid Features and HashesSecure Image Retrieval based on Hybrid Features and Hashes
Secure Image Retrieval based on Hybrid Features and Hashesranjit banshpal
 
Data mining technique for classification and feature evaluation using stream ...
Data mining technique for classification and feature evaluation using stream ...Data mining technique for classification and feature evaluation using stream ...
Data mining technique for classification and feature evaluation using stream ...ranjit banshpal
 
Parallelization using open mp
Parallelization using open mpParallelization using open mp
Parallelization using open mpranjit banshpal
 
Face recognition technology
Face recognition technologyFace recognition technology
Face recognition technologyranjit banshpal
 
using big-data methods analyse the Cross platform aviation
 using big-data methods analyse the Cross platform aviation using big-data methods analyse the Cross platform aviation
using big-data methods analyse the Cross platform aviationranjit banshpal
 
E mail image spam filtering techniques
E mail image spam filtering techniquesE mail image spam filtering techniques
E mail image spam filtering techniquesranjit banshpal
 

More from ranjit banshpal (15)

Designing Hybrid Cryptosystem for Secure Transmission of Image Data using Bio...
Designing Hybrid Cryptosystem for Secure Transmission of Image Data using Bio...Designing Hybrid Cryptosystem for Secure Transmission of Image Data using Bio...
Designing Hybrid Cryptosystem for Secure Transmission of Image Data using Bio...
 
SECURE IMAGE RETRIEVAL BASED ON HYBRID FEATURES AND HASHES
SECURE IMAGE RETRIEVAL BASED ON HYBRID FEATURES AND HASHESSECURE IMAGE RETRIEVAL BASED ON HYBRID FEATURES AND HASHES
SECURE IMAGE RETRIEVAL BASED ON HYBRID FEATURES AND HASHES
 
Secure Image Retrieval based on Hybrid Features and Hashes
Secure Image Retrieval based on Hybrid Features and HashesSecure Image Retrieval based on Hybrid Features and Hashes
Secure Image Retrieval based on Hybrid Features and Hashes
 
LCT in day2 day life
LCT in day2 day lifeLCT in day2 day life
LCT in day2 day life
 
Fingerprint recognition
Fingerprint recognitionFingerprint recognition
Fingerprint recognition
 
“Web crawler”
“Web crawler”“Web crawler”
“Web crawler”
 
Data mining technique for classification and feature evaluation using stream ...
Data mining technique for classification and feature evaluation using stream ...Data mining technique for classification and feature evaluation using stream ...
Data mining technique for classification and feature evaluation using stream ...
 
Parallelization using open mp
Parallelization using open mpParallelization using open mp
Parallelization using open mp
 
Face recognition technology
Face recognition technologyFace recognition technology
Face recognition technology
 
using big-data methods analyse the Cross platform aviation
 using big-data methods analyse the Cross platform aviation using big-data methods analyse the Cross platform aviation
using big-data methods analyse the Cross platform aviation
 
E mail image spam filtering techniques
E mail image spam filtering techniquesE mail image spam filtering techniques
E mail image spam filtering techniques
 
Hybrid encryption
Hybrid encryption Hybrid encryption
Hybrid encryption
 
Autocorrelators1
Autocorrelators1Autocorrelators1
Autocorrelators1
 
Static Networks
Static NetworksStatic Networks
Static Networks
 
Ranjitbanshpal
RanjitbanshpalRanjitbanshpal
Ranjitbanshpal
 

Recently uploaded

Arti Languages Pre Seed Send Ahead Pitchdeck 2024.pdf
Arti Languages Pre Seed Send Ahead Pitchdeck 2024.pdfArti Languages Pre Seed Send Ahead Pitchdeck 2024.pdf
Arti Languages Pre Seed Send Ahead Pitchdeck 2024.pdfwill854175
 
ICS2208 Lecture4 Intelligent Interface Agents.pdf
ICS2208 Lecture4 Intelligent Interface Agents.pdfICS2208 Lecture4 Intelligent Interface Agents.pdf
ICS2208 Lecture4 Intelligent Interface Agents.pdfVanessa Camilleri
 
LEAD6001 - Introduction to Advanced Stud
LEAD6001 - Introduction to Advanced StudLEAD6001 - Introduction to Advanced Stud
LEAD6001 - Introduction to Advanced StudDr. Bruce A. Johnson
 
Plant Tissue culture., Plasticity, Totipotency, pptx
Plant Tissue culture., Plasticity, Totipotency, pptxPlant Tissue culture., Plasticity, Totipotency, pptx
Plant Tissue culture., Plasticity, Totipotency, pptxHimansu10
 
3.12.24 The Social Construction of Gender.pptx
3.12.24 The Social Construction of Gender.pptx3.12.24 The Social Construction of Gender.pptx
3.12.24 The Social Construction of Gender.pptxmary850239
 
EDD8524 The Future of Educational Leader
EDD8524 The Future of Educational LeaderEDD8524 The Future of Educational Leader
EDD8524 The Future of Educational LeaderDr. Bruce A. Johnson
 
3.12.24 Freedom Summer in Mississippi.pptx
3.12.24 Freedom Summer in Mississippi.pptx3.12.24 Freedom Summer in Mississippi.pptx
3.12.24 Freedom Summer in Mississippi.pptxmary850239
 
BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (GLOB...
BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (GLOB...BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (GLOB...
BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (GLOB...Nguyen Thanh Tu Collection
 
The basics of sentences session 8pptx.pptx
The basics of sentences session 8pptx.pptxThe basics of sentences session 8pptx.pptx
The basics of sentences session 8pptx.pptxheathfieldcps1
 
Metabolism , Metabolic Fate& disorders of cholesterol.pptx
Metabolism , Metabolic Fate& disorders of cholesterol.pptxMetabolism , Metabolic Fate& disorders of cholesterol.pptx
Metabolism , Metabolic Fate& disorders of cholesterol.pptxDr. Santhosh Kumar. N
 
2024.03.16 How to write better quality materials for your learners ELTABB San...
2024.03.16 How to write better quality materials for your learners ELTABB San...2024.03.16 How to write better quality materials for your learners ELTABB San...
2024.03.16 How to write better quality materials for your learners ELTABB San...Sandy Millin
 
ASTRINGENTS.pdf Pharmacognosy chapter 5 diploma in Pharmacy
ASTRINGENTS.pdf Pharmacognosy chapter 5 diploma in PharmacyASTRINGENTS.pdf Pharmacognosy chapter 5 diploma in Pharmacy
ASTRINGENTS.pdf Pharmacognosy chapter 5 diploma in PharmacySumit Tiwari
 
PHARMACOGNOSY CHAPTER NO 5 CARMINATIVES AND G.pdf
PHARMACOGNOSY CHAPTER NO 5 CARMINATIVES AND G.pdfPHARMACOGNOSY CHAPTER NO 5 CARMINATIVES AND G.pdf
PHARMACOGNOSY CHAPTER NO 5 CARMINATIVES AND G.pdfSumit Tiwari
 
Research Methodology and Tips on Better Research
Research Methodology and Tips on Better ResearchResearch Methodology and Tips on Better Research
Research Methodology and Tips on Better ResearchRushdi Shams
 
AI Uses and Misuses: Academic and Workplace Applications
AI Uses and Misuses: Academic and Workplace ApplicationsAI Uses and Misuses: Academic and Workplace Applications
AI Uses and Misuses: Academic and Workplace ApplicationsStella Lee
 
25 CHUYÊN ĐỀ ÔN THI TỐT NGHIỆP THPT 2023 – BÀI TẬP PHÁT TRIỂN TỪ ĐỀ MINH HỌA...
25 CHUYÊN ĐỀ ÔN THI TỐT NGHIỆP THPT 2023 – BÀI TẬP PHÁT TRIỂN TỪ ĐỀ MINH HỌA...25 CHUYÊN ĐỀ ÔN THI TỐT NGHIỆP THPT 2023 – BÀI TẬP PHÁT TRIỂN TỪ ĐỀ MINH HỌA...
25 CHUYÊN ĐỀ ÔN THI TỐT NGHIỆP THPT 2023 – BÀI TẬP PHÁT TRIỂN TỪ ĐỀ MINH HỌA...Nguyen Thanh Tu Collection
 
Dhavni Theory by Anandvardhana Indian Poetics
Dhavni Theory by Anandvardhana Indian PoeticsDhavni Theory by Anandvardhana Indian Poetics
Dhavni Theory by Anandvardhana Indian PoeticsDhatriParmar
 

Recently uploaded (20)

Arti Languages Pre Seed Send Ahead Pitchdeck 2024.pdf
Arti Languages Pre Seed Send Ahead Pitchdeck 2024.pdfArti Languages Pre Seed Send Ahead Pitchdeck 2024.pdf
Arti Languages Pre Seed Send Ahead Pitchdeck 2024.pdf
 
Problems on Mean,Mode,Median Standard Deviation
Problems on Mean,Mode,Median Standard DeviationProblems on Mean,Mode,Median Standard Deviation
Problems on Mean,Mode,Median Standard Deviation
 
ICS2208 Lecture4 Intelligent Interface Agents.pdf
ICS2208 Lecture4 Intelligent Interface Agents.pdfICS2208 Lecture4 Intelligent Interface Agents.pdf
ICS2208 Lecture4 Intelligent Interface Agents.pdf
 
Least Significance Difference:Biostatics and Research Methodology

Ranjitbanshpal1

  • 1. paper
Topic: Ensure Accuracy in Data Transformation with Data Testing Framework (DTF)

Abstract: Need for a Data Testing Framework

Testing holds high stakes in helping businesses make insightful and intelligent decisions using available information. Given the growing complexity of the Data Warehousing and Business Intelligence space in the IT industry, L&T InfoTech has developed a cost-effective solution to address the following challenges faced by clients:
• Unavailability of comprehensive testing tools
• Varied skill sets required to understand various file formats
• Voluminous data from heterogeneous sources
• 100% data validation is not feasible
• Manual comparison of data is tedious and error-prone

Data Testing Framework is a testing framework that integrates easily with users’ needs for different types of data validation processes. It enables users to compare and validate data across various types of data sources and databases.

Information stored in a data warehouse is critical to organizations for decision making and predictive analysis. The huge volume of data loaded onto a data warehouse makes exhaustive manual comparison of data impractical. The existing quality tools are either manual or have other limitations, and do not cover all aspects of data warehouse testing. Therefore, a holistic solution is required to test high-volume applications that are built on Data Warehouse (DW) or Business Intelligence (BI) architecture.

DTF Overview
DTF is an open source data validation and comparison framework that allows a user to perform data-centric testing. Its simple User Interface (UI) enables users to easily configure the tool as per their testing needs. The framework also provides detailed results of the test cases, enabling faster analysis of
  • 2. test results.

What is DTF?
The DTF has been developed by synthesizing years of experience in the database testing area. DTF can be used to compare data from two different data feeds after data migration or reconciliation. These source and target data feeds can be a database table, database query, flat file, CSV, PSV or Excel file. DTF has a proven track record of comparing high volumes of data and supports leading databases in the market.

DTF can be configured to perform the following types of comparisons:
• File to File comparator
• File to Database table comparator
• Database to Database comparator
• Query output to File comparator
• Database Table comparator
• Database Table to Query output
• Database table to Fixed Length File comparator
• Database table to XML comparator
• Database table to Stored Procedure output comparator

DTF provides a user-friendly UI for testers from non-technical backgrounds and allows them to configure the tool to operate in different modes for different types of comparisons.

Execution Steps
Common test scenarios required for data conversion testing can be broadly classified into the following categories:
• Table/schema validation (includes the verification of indexes, stored procedures and triggers)
• Count and data validation
• Data character set conversion
• File processing (in cases where the source is a file)
• Batch job and business rule validation
• Interface testing
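As an illustration of the count and data validation scenario listed above, the following minimal Python sketch loads a CSV feed and a PSV (pipe-separated) feed into a common row format and checks their row counts. It is not DTF's implementation; the function names `read_feed` and `count_validation` and the sample data are assumptions made for this example.

```python
import csv
import io


def read_feed(text, delimiter):
    """Parse delimited text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def count_validation(source_rows, target_rows):
    """Return (counts_match, source_count, target_count) for a row-count check."""
    return len(source_rows) == len(target_rows), len(source_rows), len(target_rows)


# Hypothetical sample feeds: the same two records as CSV and as PSV.
source_csv = "id,name\n1,Asha\n2,Ravi\n"
target_psv = "id|name\n1|Asha\n2|Ravi\n"

source = read_feed(source_csv, ",")
target = read_feed(target_psv, "|")

ok, n_source, n_target = count_validation(source, target)
print(ok, n_source, n_target)
```

Once both feeds are reduced to the same row representation, the file format no longer matters to the comparison step, which is the idea behind supporting several feed types with one comparator.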
  • 3. Figure 1: DTF Process

The process for data testing using DTF is as follows:
1. Analyze – Study the data model of the source and the target databases to understand the conversion process. If the source is a flat file, analyze the file’s structure and its mapping to the target database.
2. Data Mapping – The mapping between the source and the target databases and tables needs to be configured in DTF. If there are no schema changes, mapping the source and target databases at database level is enough. There may be scenarios where either the data of one source table is distributed to multiple target tables or the data of multiple source tables is merged into one target table. In such cases, the mapping of the source and target tables needs to be configured in DTF at column level.
3. Test Case Creation – The test cases for various data comparison and validation scenarios can be created in DTF using the data mappings. DTF also gives the user an option to create test suites and execute multiple test cases in a single framework execution.
4. Execute & Report – A DTF test case or test suite can be executed by providing different run-time execution options. Following are some DTF execution options:
• Trim Data before Comparison
• Ignore case in Comparison
• Database Schema Comparison
  • 4. • Full Database Comparison

Once the execution is complete, a detailed report is generated which gives the following details:
• Summary report
• Mismatched records
• Extra records in source
• Extra records in target
All reports are generated as spreadsheets, which are detailed and convenient to analyze.

Building Blocks of DTF
Figure 2: DTF Building Blocks

DTF comprises the following three blocks:
• DTF Util Manager - DTF Util Manager is responsible for reading/writing data into files/databases and for data conversion, if required, for internal DTF logic. It ensures that the source and target data are in the same format before the data goes to the DTF Compare Engine. It implements the logic for all activities other than actual data comparison and report generation.
• DTF Compare Engine - DTF Compare Engine is responsible for the actual comparison of source and target data. If the data is huge, it divides the data into chunks of a predefined size and compares them. Formation of the data chunks and data comparison are done in parallel for faster comparison. This engine passes the comparison execution results to the DTF Report Manager.
• DTF Report Manager - DTF Report Manager is responsible for generating DTF reports from the comparison execution results provided by the DTF Compare Engine. It generates reports in Excel format. Reports are generated in two categories: summary reports and detailed reports. It takes the comparison execution time as a reference and creates a folder with that name to store the reports for every execution.
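The chunked, parallel comparison described for the DTF Compare Engine can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not DTF's actual code: the functions `normalize`, `compare_chunk` and `compare`, the `chunk_size` parameter and the sample rows are all hypothetical, and rows are assumed to be keyed by a primary key. The `trim` and `ignore_case` flags mirror the "Trim Data before Comparison" and "Ignore case in Comparison" execution options, and the result uses the three detail categories from the report.

```python
from concurrent.futures import ThreadPoolExecutor


def normalize(value, trim=True, ignore_case=False):
    """Apply the optional 'trim' and 'ignore case' rules to one field."""
    text = str(value)
    if trim:
        text = text.strip()
    if ignore_case:
        text = text.lower()
    return text


def compare_chunk(keys, source, target, trim, ignore_case):
    """Compare one chunk of keys and classify each into a report category."""
    result = {"mismatched": [], "extra_in_source": [], "extra_in_target": []}
    for key in keys:
        if key not in target:
            result["extra_in_source"].append(key)
        elif key not in source:
            result["extra_in_target"].append(key)
        else:
            src = [normalize(v, trim, ignore_case) for v in source[key]]
            tgt = [normalize(v, trim, ignore_case) for v in target[key]]
            if src != tgt:
                result["mismatched"].append(key)
    return result


def compare(source, target, chunk_size=2, trim=True, ignore_case=False):
    """Split the key space into fixed-size chunks and compare them in parallel."""
    keys = sorted(set(source) | set(target))
    chunks = [keys[i:i + chunk_size] for i in range(0, len(keys), chunk_size)]
    report = {"mismatched": [], "extra_in_source": [], "extra_in_target": []}
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(compare_chunk, chunk, source, target, trim, ignore_case)
            for chunk in chunks
        ]
        for future in futures:  # read results in submission order for stable output
            for category, found in future.result().items():
                report[category].extend(found)
    return report


# Hypothetical data: rows keyed by primary key, values are the other columns.
source = {1: ["Asha ", "Pune"], 2: ["Ravi", "Delhi"], 3: ["Meera", "Goa"]}
target = {1: ["asha", "Pune"], 2: ["Ravi", "Mumbai"], 4: ["Kiran", "Kochi"]}

print(compare(source, target, ignore_case=True))
```

Threads are used here only to keep the sketch short. In CPython they do not speed up CPU-bound comparison, so a real engine would more likely use separate processes or push the chunking down to the database.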
  • 5. In addition to the three primary blocks, DTF has the following building blocks, each of which represents a different data feed:
• Excel Files
• Flat Files
• Database Tables
• Database Query

The Excel Config file block represents the configuration input Excel files. Typically, a user lists the parameters for comparison between the source and destination in these configuration file(s).
The DTF Report block represents the DTF summary and detailed reports generated after comparison execution.

Software Requirements
• JRE 1.6
• Microsoft Office
• Windows Operating System

Hardware Requirements
• 1 GB RAM or greater
• 3 GHz CPU

Benefits offered by DTF
• DTF is a very cost-effective solution as it is developed using open source tools.
• Detailed reports help in identifying problems.
• Reduction in test execution effort.
• Reusability of the framework across different Data Warehousing projects.
• Less maintenance because of the modular structure of the framework.
• Ability to work with different types of data feeds.
• Easier result analysis through Excel sheets.

Differentiators
• Simple test script creation and execution
• Tester productivity increased with improved quality of testing
• Cost savings of 30%
• Compressed testing cycle
  • 6. Conclusion
DTF, the open source based framework that supports all databases currently available in the market, creates detailed reports that help organizations identify defects and take corrective actions based on the inputs. Enterprises are thus able to achieve cost and effort savings with enhanced test coverage through automation. Accurate, real-time information is readily available to help in making informed decisions.