ETL Validator provides a quick and easy way to create test cases that compare counts and measures between source and target data sources. An allowed variance can also be specified. Here, we create a Checksum test case that compares measures and counts. The same functionality is also available in the Component test case through 'Measure Validation'.
4. Usecase:
This use case shows how to compare measures and counts between two data sources. A variance can also be specified.
Start by creating a new Checksum Test Case.
Checksum Testcase
5. Usecase:
Name the test case.
Select the Target and Source Database Connections.
Navigate to the next screen.
6. Usecase:
SQL can be typed into the Target and Source Query areas, or the Query Builder can be used.
Here we use custom SQL:
Source:
SELECT CUST_ID,
       Count(*) COUNT_ALL,
       avg(cust_id) AVG_ID,
       min(cust_id) MIN_ID,
       max(cust_id) MAX_ID,
       sum(cust_id) SUM_ID,
       count(distinct(cust_id)) DISTINCT_ID,
       max(length(cust_first_name)) as Max_Fst_Name,
       min(length(cust_first_name)) as Min_Fst_Name
FROM SRC_CUSTOMERS
GROUP BY CUST_ID
7. Usecase:
Target:
SELECT CUST_ID,
       Count(*) as COUNT_ALL,
       avg(cust_id) AVG_ID,
       min(cust_id) MIN_ID,
       max(cust_id) MAX_ID,
       sum(cust_id) SUM_ID,
       count(distinct(cust_id)) DISTINCT_ID,
       max(length(cust_first_name)) Max_Fst_Name,
       min(length(cust_first_name)) Min_Fst_Name
FROM TGT_CUSTOMERS
GROUP BY CUST_ID
Execute the query in both the source and target panes.
The results are displayed in the grids below.
Navigate to the next screen.
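To try the same idea outside the tool, the two queries above can be approximated with a small script. This is only a sketch: it assumes an in-memory SQLite database, made-up sample rows, and a reduced set of measures. The helper name run_checksum is hypothetical and is not part of ETL Validator.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the queries above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SRC_CUSTOMERS (cust_id INTEGER, cust_first_name TEXT);
    CREATE TABLE TGT_CUSTOMERS (cust_id INTEGER, cust_first_name TEXT);
    INSERT INTO SRC_CUSTOMERS VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO TGT_CUSTOMERS VALUES (1, 'Alice'), (2, 'Robert'), (4, 'Dave');
""")

# A reduced version of the checksum query: one row of measures per CUST_ID.
CHECKSUM_SQL = """
    SELECT cust_id,
           COUNT(*) AS count_all,
           SUM(cust_id) AS sum_id,
           MAX(LENGTH(cust_first_name)) AS max_fst_name,
           MIN(LENGTH(cust_first_name)) AS min_fst_name
    FROM {table}
    GROUP BY cust_id
    ORDER BY cust_id
"""

def run_checksum(table):
    """Return {cust_id: (count_all, sum_id, max_fst_name, min_fst_name)}."""
    rows = conn.execute(CHECKSUM_SQL.format(table=table)).fetchall()
    return {row[0]: tuple(row[1:]) for row in rows}

source = run_checksum("SRC_CUSTOMERS")
target = run_checksum("TGT_CUSTOMERS")

for cust_id in sorted(source.keys() | target.keys()):
    print(cust_id, source.get(cust_id), target.get(cust_id))
```

Each dictionary holds one row of measures per CUST_ID, which is the shape the later mapping and comparison steps work on.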
8. Usecase:
The list of fields from both datasets is displayed.
Select or de-select fields as required.
The order of the target fields should match that of the source fields. If a field is mismatched, select the right one from the drop-down.
Specify the variance or leave it as is.
Specify the ‘Join’ criteria.
Navigate to the next screen.
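The effect of an allowed variance on a single measure can be sketched as follows. The slides do not state the formula ETL Validator uses, so this assumes variance is the percentage difference of the target value relative to the source value. The function names percent_variance and measure_status are hypothetical.

```python
def percent_variance(source_value, target_value):
    """Percent difference of target from source (assumed formula, relative to source)."""
    if source_value == target_value:
        return 0.0
    if source_value == 0:
        # No meaningful relative difference when the source measure is zero.
        return float("inf")
    return abs(target_value - source_value) * 100.0 / abs(source_value)

def measure_status(source_value, target_value, allowed_variance=0.0):
    """Return 'Pass' when the variance is within the allowed percentage, else 'Fail'."""
    if percent_variance(source_value, target_value) <= allowed_variance:
        return "Pass"
    return "Fail"

print(measure_status(100, 110, allowed_variance=10))  # within the allowed 10%
print(measure_status(100, 111, allowed_variance=10))  # exceeds the allowed 10%
```

With the default allowed variance of 0, only exactly equal measures pass, which corresponds to leaving the variance as is.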
Variance
10. Usecase:
The resulting datasets are categorized into ‘Unmatched’, ‘Matched’, ‘Source’ and ‘Target’ data.
Unmatched data is listed and further sub-categorized.
Click on the downward arrows to see the records.
11. Usecase:
A ‘Fail’ status indicates that there was a difference in the measure between the two data sources.
The first 2 datasets are records present only in the source or only in the target. Since there is no corresponding record, the status is ‘Fail’.
‘Run Summary’ gives a quick overview of Matched/Unmatched data.
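The categorization described above can be illustrated with a simplified sketch. It assumes each dataset is a dictionary keyed by the join field, with a tuple of measures as the value, and it uses exact comparison with no variance. The names categorize and overall_status are hypothetical, and the tool's own grouping may differ in detail.

```python
def categorize(source, target):
    """Split keyed measure rows into source-only, target-only, matched and unmatched keys."""
    result = {"only_in_source": [], "only_in_target": [], "matched": [], "unmatched": []}
    for key in sorted(source.keys() | target.keys()):
        if key not in target:
            result["only_in_source"].append(key)   # no corresponding target record
        elif key not in source:
            result["only_in_target"].append(key)   # no corresponding source record
        elif source[key] == target[key]:
            result["matched"].append(key)
        else:
            result["unmatched"].append(key)        # joined, but a measure differs
    return result

def overall_status(result):
    """'Fail' if any record is missing on one side or has differing measures."""
    problems = result["only_in_source"] + result["only_in_target"] + result["unmatched"]
    return "Fail" if problems else "Pass"

# Hypothetical measures per CUST_ID: (count_all, max_fst_name)
source = {1: (1, 5), 2: (1, 3), 3: (1, 5)}
target = {1: (1, 5), 2: (1, 6), 4: (1, 4)}

summary = categorize(source, target)
print(summary)
print(overall_status(summary))
```

Records that exist on only one side count as failures here because there is nothing to compare them against, in line with the explanation on this slide.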
12. Usecase:
In the ‘Unmatched’ results, both the Source and Target values are displayed. The status is ‘Pass’ if they match and ‘Fail’ if they don’t.
Notice that the ‘variance’ is also displayed.
These differences can be exported to Excel.
Notice that the variance of max and min fst_name is >40%.
Export to Excel
14. Usecase:
All the records that matched show the values of the source and target measures, plus the variance value.
The ‘Pass’/‘Fail’ status is indicated per measure for each record pair.
The left panel shows the run durations, queries and data sources.
15. Usecase:
Datasets from the Source and Target are displayed in the other categories.
Now, let us go back to the mapping and change the variance.
16. Usecase:
Change the Variance to 50% for max_fst_name and min_fst_name.
Navigate to the next screen.
17. Usecase:
Notice that only one record is in ‘Unmatched Results’ with a ‘Fail’ status. The other record ‘Passed’ because of the allowed ‘Variance’.
The left panel shows the run durations, queries and data sources.
The same report can be viewed in a browser.
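Why raising the allowed variance moves a record from ‘Fail’ to ‘Pass’ can be shown with a short sketch. The record keys and name lengths below are made up for illustration, the variance formula (percent difference relative to the source value) is an assumption, and failing_keys is a hypothetical helper.

```python
def failing_keys(pairs, allowed_variance):
    """Return the keys whose (source, target) measure differs by more than the allowed percent."""
    failed = []
    for key, (src, tgt) in sorted(pairs.items()):
        variance = abs(tgt - src) * 100.0 / abs(src)  # assumes a non-zero source value
        if variance > allowed_variance:
            failed.append(key)
    return failed

# Hypothetical max_fst_name values per record: (source, target)
max_fst_name = {101: (7, 10), 102: (4, 7)}   # about 42.9% and 75% variance

before = failing_keys(max_fst_name, allowed_variance=0)
after = failing_keys(max_fst_name, allowed_variance=50)
print(before)
print(after)
```

In this made-up data, record 101 has a variance a little above 40%, so it fails with no allowed variance and passes once 50% is allowed, while record 102 still fails.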
Report in Browser
19. Usecase:
The same functionality is also available in the Component Testcase through ‘Measure Validation’.
In the ‘Mapping Component’, click on the ‘+’ to add ‘Measure Validation’.
Navigate to the next screen.
To learn how to create a Component Testcase, refer to:
https://www.slideshare.net/ProductMarketingdata/etl-validator-usecase-testing-transformations-or-derived-fields
Component Testcase
20. Usecase:
The same source and target SQLs that were used earlier are the data sources here as well.
The list of fields from both datasets is displayed.
Select or de-select fields as required.
21. Usecase:
The order of the target fields should match that of the source fields. If a field is mismatched, select the right one from the drop-down.
Specify the variance or leave it as is.
Specify the ‘Join’ criteria.
Navigate to the next screen.
23. Usecase:
All the results are displayed in the same way as in the Checksum testcase earlier.
24. More with ETL Validator…
• Validating Field and Data Format
• Data counts validation with allowed variance
• Check Data Quality using Data Rules Test Plan
• Advanced ETL Testing using a Component Test Case
• Avoiding inline views on your queries in ETL Validator
• Checking for Mandatory Fields
• Data Profiling of Source and Target
www.datagaps.com