ETL Validator gives a quick and easy way to create test cases for identifying duplicates in data sources. Here, we will create a test case that identifies records sharing the same First Name + Last Name.
4. Usecase:
Duplicate Records Check
Create a test case: identify records from the Customers table which have a duplicate First Name + Last Name.
Start by creating a new Data Rules Test Plan.
5. Usecase:
Name the test plan.
Select the Database Connection.
Navigate to the next screen.
6. Usecase:
Expand the schema; in this example, ‘public’.
Select and expand the table ‘Customers’.
Click on ‘Add Duplicate Check Rule’.
7. Usecase:
By default, the Rule Builder shows the SQL:
    SELECT customers.cust_id, Count(*) as Custom_RowCount
    FROM customers customers
    GROUP BY customers.cust_id
    HAVING ( Count(*) > 1 )
This can be modified by deleting or adding columns as needed.
Right-click on the empty square next to ‘cust_id’ and select the option ‘Delete Column’.
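To see why the default rule usually needs adjusting, here is a minimal sketch that runs the same SQL against an in-memory SQLite table. The sample rows and the use of Python's `sqlite3` module are assumptions made for illustration; only the table and column names come from the walkthrough. When every `cust_id` is unique, grouping on it finds nothing, even though two rows share a name.

```python
import sqlite3

# Hypothetical sample data; only the table/column names come from the walkthrough.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "cust_id INTEGER, cust_first_name TEXT, cust_last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ann", "Lee"), (2, "Ann", "Lee"), (3, "Bob", "Ray")],
)

# The default duplicate rule shown by the Rule Builder: group on cust_id.
DEFAULT_RULE_SQL = """
SELECT customers.cust_id, Count(*) AS Custom_RowCount
FROM customers customers
GROUP BY customers.cust_id
HAVING ( Count(*) > 1 )
"""

default_rule_rows = conn.execute(DEFAULT_RULE_SQL).fetchall()
print(default_rule_rows)  # [] because no cust_id value repeats
```

Grouping on the name columns instead of `cust_id` is what exposes the repeated "Ann Lee" rows in this sample.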
8. Usecase:
Rule1 is the default duplicate rule.
Select ‘cust_first_name’ from the field list and drag it to the right pane. Similarly, drag and drop ‘cust_last_name’.
Click on ‘Build Query’. Duplicates are displayed in the grid below.
The SQL changes to:
    SELECT Count(*) as Custom_RowCount, customers.cust_first_name, customers.cust_last_name
    FROM customers customers
    GROUP BY customers.cust_first_name, customers.cust_last_name
    HAVING ( Count(*) > 1 )
Name the rule and save the query.
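The rebuilt query can be tried outside the tool as well. The following is a small sketch, assuming an in-memory SQLite database and made-up sample rows (neither is part of ETL Validator); the SQL itself is the statement shown above.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "cust_id INTEGER, cust_first_name TEXT, cust_last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "Ann", "Lee"),
        (2, "Ann", "Lee"),   # same first + last name as cust_id 1
        (3, "Bob", "Ray"),
        (4, "Ann", "Ray"),   # shares only one name part, so not a duplicate
    ],
)

# Duplicate rule on the First Name + Last Name combination.
NAME_RULE_SQL = """
SELECT Count(*) AS Custom_RowCount,
       customers.cust_first_name,
       customers.cust_last_name
FROM customers customers
GROUP BY customers.cust_first_name, customers.cust_last_name
HAVING ( Count(*) > 1 )
"""

name_duplicates = conn.execute(NAME_RULE_SQL).fetchall()
print(name_duplicates)  # [(2, 'Ann', 'Lee')]
```

Each returned row is one duplicated name combination together with the number of records that carry it.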
9. Usecase:
Notice that the rule ‘Duplicates’ is now shown in the Custom Data Rules pane.
Navigate to the next screen.
10. Usecase:
To run test cases for only the ‘Customers’ table:
• Click on the settings icon
• Unselect the other tables.
Save the settings.
Click on ‘Run’.
11. Usecase:
Click on ‘Run’.
‘FAILED’ indicates that there are records that didn’t satisfy the rule; for this rule, that means duplicates were found.
The grid below shows the results of the first test case listed in the top pane.
Click on subsequent test cases to see their results.
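The pass/fail idea can be sketched in a few lines: a duplicate rule fails as soon as its query returns at least one row. This is an illustration only, not ETL Validator's implementation; the helper `evaluate_duplicate_rule`, the "PASSED" label, and the sample data are all assumptions.

```python
import sqlite3


def evaluate_duplicate_rule(conn, rule_sql):
    """Hypothetical helper: return (status, offending_rows) for a rule query."""
    rows = conn.execute(rule_sql).fetchall()
    # Any returned row is a group of duplicates, so the rule fails.
    status = "FAILED" if rows else "PASSED"
    return status, rows


# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "cust_id INTEGER, cust_first_name TEXT, cust_last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ann", "Lee"), (2, "Ann", "Lee"), (3, "Bob", "Ray")],
)

RULE_SQL = """
SELECT Count(*) AS Custom_RowCount,
       customers.cust_first_name,
       customers.cust_last_name
FROM customers customers
GROUP BY customers.cust_first_name, customers.cust_last_name
HAVING ( Count(*) > 1 )
"""

status, offending = evaluate_duplicate_rule(conn, RULE_SQL)
print(status, offending)  # FAILED [(2, 'Ann', 'Lee')]
```

The offending rows play the role of the result grid: they are the records that caused the failure.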
12. Usecase:
The results can also be exported to Excel.
Once the export is done, an alert is displayed.
Click on ‘View Report in Browser’ to see the same results in a web layout.
13. Usecase:
The same information is displayed in the web layout.
The link can be shared with others.
Click on the upward arrow of other test cases to see their record results.
14. More with ETL Validator…
• Validating Field and Data Format
• Data counts validation with allowed variance
• Check Data Quality using Data Rules Test Plan
• Advanced ETL Testing using a Component Test Case
• Avoiding inline views on your queries in ETL Validator
• Checking for Mandatory Fields
www.datagaps.com