Table29 Data Validation 95

Informatica DVO

  1. (no slide text)
  2. Data Validation Overview
  3. What is Data Validation?
     Identifying errors in data sets that have been Moved or Transformed to ensure they are Complete and Accurate and meet Expectations or Requirements.
  4. How most companies test today
     • Many companies perform testing manually by writing SQL scripts, using Excel, or hand-coding testing logic into their integration processes
     • Reconciliation is done manually (if done at all)
       − Basic SQL scripts (row counts, aggregates) or manually written mappings/logic
     • Customers estimate data testing SHOULD take 25-30% of all hours spent on Data Integration
       − Most customers admit they do not do enough data validation, resulting in poorer data quality and higher project risk
     • PowerCenter upgrades can take weeks or months to complete due to manual testing effort
       − It takes one day to upgrade the ETL software
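The "basic SQL scripts (row counts, aggregates)" that slide 4 describes can be pictured with a minimal sketch like the one below. It is only an illustration of the manual approach, not anything taken from the deck: the table names `src_orders` and `tgt_orders`, the `amount` column, and the `reconcile` helper are all hypothetical, and an in-memory SQLite database stands in for real source and target systems.

```python
import sqlite3


def reconcile(conn, source, target, amount_col):
    """Compare the row count and the SUM of one numeric column between two tables.

    Table and column names are interpolated directly, so this sketch is only
    suitable for trusted, hard-coded identifiers.
    """
    checks = {}
    for label, expr in (("row_count", "COUNT(*)"), ("sum", f"SUM({amount_col})")):
        src = conn.execute(f"SELECT {expr} FROM {source}").fetchone()[0]
        tgt = conn.execute(f"SELECT {expr} FROM {target}").fetchone()[0]
        checks[label] = (src, tgt, src == tgt)
    return checks


# Hypothetical data: the target load dropped one row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_orders (id INTEGER PRIMARY KEY, amount INTEGER);
    CREATE TABLE tgt_orders (id INTEGER PRIMARY KEY, amount INTEGER);
    INSERT INTO src_orders VALUES (1, 100), (2, 250), (3, 75);
    INSERT INTO tgt_orders VALUES (1, 100), (2, 250);
""")

result = reconcile(conn, "src_orders", "tgt_orders", "amount")
for name, (src, tgt, ok) in result.items():
    print(name, src, tgt, "PASS" if ok else "FAIL")
```

A script like this shows that something is wrong (3 rows versus 2) but not which row is missing, and it leaves no audit trail, which is the limitation the next slide raises.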
  5. Problems with Manual Testing
     • Takes a long time and is expensive
       − Time is spent writing queries, waiting for them to run, and then searching through the results
     • Error-prone manual process
       − “Stare and compare”
     • Cannot perform thorough testing
       − Time/cost pressure leads to a “try it here and there” approach
       − Testing ends when the deadline is reached, done or not
     • Usual problems associated with writing custom code
       − No audit trail
       − No reuse
       − No methodology
  6. Current Approach: Like a Photo Hunt
  7. Current Approach: Stare and Compare (Data Set #1 vs. Data Set #2)
  8. What is the Data Validation Option?
     DVO is an independent (black box) testing solution that provides automation, repeatability and auditability to virtually any data testing or reconciliation process.
  9. Some DVO use cases
     Data Being Transformed
     • ETL Reconciliation
     • Data Masking
     • ETL Testing
     • Application Migration
     Data is Identical
     • ETL version upgrade
     • ETL Migration
     • Database migration
     • Application Retirement
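For the "Data is Identical" use cases on slide 9 (upgrades and migrations where output should not change), one common way to check identity is to compare per-row fingerprints of the two data sets. The sketch below is an assumption about how such a check could be written by hand; it does not describe how DVO performs the comparison, and the helper names `row_fingerprint` and `tables_identical` are hypothetical.

```python
import hashlib


def row_fingerprint(row):
    """Return a SHA-256 hex digest of a row's values.

    repr() keeps None, '' and numeric types distinct, so 1 and '1' hash differently.
    """
    return hashlib.sha256(repr(tuple(row)).encode("utf-8")).hexdigest()


def tables_identical(rows_a, rows_b):
    """True when both row collections contain the same rows, ignoring order.

    Sorting the fingerprints (rather than using a set) keeps duplicate rows significant.
    """
    fp_a = sorted(row_fingerprint(r) for r in rows_a)
    fp_b = sorted(row_fingerprint(r) for r in rows_b)
    return fp_a == fp_b


# Hypothetical extracts taken before and after a migration.
before = [(1, "alice", 100.0), (2, "bob", None)]
after_same = [(2, "bob", None), (1, "alice", 100.0)]
after_changed = [(1, "alice", 100.0), (2, "bob", 0.0)]

print(tables_identical(before, after_same))
print(tables_identical(before, after_changed))
```

For the "Data Being Transformed" cases the rows are expected to differ, so a fingerprint comparison does not apply directly; the expected transformation has to be expressed in the test itself.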
  10. Two Value Propositions for DVO
      Ensure the integrity of data as it moves through the IT environment.
      Development & Test
      • Provide automation for unit and regression testing of integration logic.
      • Ensure that data produced by DI code meets requirements and expectations
      Production Reconciliation
      • Protect the integrity of data that is loaded into production systems.
      • Erroneous data due to failed loads, faulty logic or operational issues is caught in a proactive automated manner and can be addressed as needed
  11. Two Value Propositions for DVO
      Ensure the integrity of data as it moves through the IT environment.
  12. How DVO works with PowerCenter
      [Architecture diagram. Recoverable labels: Data Validation Option; DVO Clients (Define Tests, Execute Tests); Repository Database & Warehouse Views; Reports; views V_Summary, V_Tests, V_Results; Results; Data Accessed; PowerCenter Repository and Integration Services; Enterprise Data. Sample table fields shown in the diagram: Id, name: string, Price: integer, Date in: date, Date out: date, Salary: float.]
  13. Key Features of DVO
      • Broad data connectivity
        • DBMS (Oracle, SQL Server, DB2, Sybase, Teradata, Netezza)
        • Mainframe (DB2 z/OS, DB2 AS/400, IMS, Adabas, MF Flat files, VSAM)
        • SalesForce.com, SAP transparent tables, SAS, ODBC and Flat files
      • Numerous built-in tests
        • COUNT, COUNT_DISTINCT, COUNT_ROWS, MIN, MAX, AVG, SUM
        • SET AinB, SET BinA, SET AeqB
        • VALUE, OUTER VALUE, Expressions
      • Model ETL constructs
        • LOOKUPs, Arbitrary SQL Relationships
      • Other
        • Run from GUI or CLI (DVOCmd)
        • Built-in reporting
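The set and value tests named on slide 13 can be read roughly as follows. This is a sketch of one plausible interpretation inferred from the test names alone, not DVO's implementation: here "AinB" is taken to mean that every key of data set A exists in B, "AeqB" that both key sets match, VALUE that rows joined on a key carry equal values, and OUTER VALUE that rows missing from either side also count as failures. All function names are hypothetical.

```python
_MISSING = object()  # sentinel so a missing key never compares equal to a real None value


def set_a_in_b(a_keys, b_keys):
    """Assumed meaning of SET AinB: every key in A is also present in B."""
    return set(a_keys) <= set(b_keys)


def set_a_eq_b(a_keys, b_keys):
    """Assumed meaning of SET AeqB: A and B contain exactly the same keys."""
    return set(a_keys) == set(b_keys)


def value_mismatches(a_rows, b_rows, outer=False):
    """Return the sorted keys whose values differ between two key -> value mappings.

    outer=False compares only keys present on both sides (inner join);
    outer=True also reports keys that exist on one side only (full outer join).
    """
    keys = set(a_rows) | set(b_rows) if outer else set(a_rows) & set(b_rows)
    return sorted(
        k for k in keys if a_rows.get(k, _MISSING) != b_rows.get(k, _MISSING)
    )


# Hypothetical source (A) and target (B): key 3 was not loaded, key 2 was altered.
source = {1: 100, 2: 250, 3: 75}
target = {1: 100, 2: 999}

print(set_a_in_b(source, target))              # A in B
print(set_a_in_b(target, source))              # B in A (SET BinA)
print(set_a_eq_b(source, target))              # A equals B
print(value_mismatches(source, target))        # inner-join value check
print(value_mismatches(source, target, True))  # outer-join value check
```

Under this reading, the aggregate tests (COUNT, SUM and so on) detect that two data sets differ, while the set and value tests point to the specific keys responsible.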
  14. Comparing DVO with Manual Testing
  15. Technology Company (Development and Test): Reduced data testing time by 80% with Data Validation Option
      KEY BUSINESS IMPERATIVE AND IT INITIATIVE
      SaaS provider of Sales Compensation and analytics
      • Data absolutely has to be correct as it affects people’s paychecks
      • Very high visibility of the data with users
      • Trust in the data is key
      THE CHALLENGE
      • New release every ~1 month
      • 1 full week of data testing by QA team per release
      • Developers wrote SQL for testing the data
      • Testers would execute the SQL, track errors and work with Developers to resolve
      • And who was testing the SQL to make sure it was correct?
      INFORMATICA ADVANTAGE
      • With DVO they are able to test 100,000s of rows of data in regression tests
      • Developers no longer required to write SQL
      • Testers are now empowered and independent of developers
      RESULTS/BENEFITS
      • Have created a test suite of over 1000 Tests
      • Testers can manage the testing environment
      • Can test large volumes of data
      • Testing time reduced from 1 week to 1 day (80% less)
      • Spend “free time” on higher level tasks
      Informatica Confidential – Under NDA
  16. Financial Services Company (Production Reconciliation): Ensures DW is Complete and Accurate with Data Validation Option
      KEY BUSINESS IMPERATIVE AND IT INITIATIVE
      Good data is essential to good business decisions. Their calculations of portfolio risk and value must be correct.
      • Spends “hundreds of millions” purchasing troubled debt in the USA
      • The data and risk calculations on those assets must be correct.
      • Bad data could cost them “millions” and put them out of business.
      THE CHALLENGE
      • Business users were complaining about missing data in the systems.
      • Data errors can lead to very costly bad business decisions.
      • They were doing manual testing via developer-written mappings and PL/SQL
      • Other products available today could not meet their requirements
      INFORMATICA ADVANTAGE
      • With DVO they are able to perform detailed reconciliations across source and target systems.
      • With DVO, they have a complete audit trail.
      RESULTS/BENEFITS
      • DVO found where data was missing
      • Found thousands of missing records due to bad coding, & improperly rerun failed jobs
      • Reloaded all missing data in two weeks
      • They are looking to implement ongoing incremental validation for all new data loaded into tables
  17. Mid-size Technology Company (Production Reconciliation): Reconciling MDM data using Data Validation Option
      KEY BUSINESS IMPERATIVE AND IT INITIATIVE
      • Customer and contact hub is pivotal to efficient business operations
      • Millions of records processed across various systems
      • Ensure BAs, line managers and customers had access to accurate and complete data based on their needs
      THE CHALLENGE
      • No easy way to reconcile data in systems to identify bad data or identify extent of errors
      • Incorrectly augmented data in systems
      • Gold record data didn’t always match across systems
      • Faulty records propagated downstream.
      INFORMATICA ADVANTAGE
      • DVO reconciled data across systems (e.g. SalesForce and Hub) and found:
        • 1000s of missing records between systems
        • Incorrectly augmented D&B data
        • Improperly coded golden records
      RESULTS/BENEFITS
      • Identified errors due to faulty DI logic and error handling process
      • Ensured incorrect records are no longer being used in marketing campaigns
      • Bad customer data no longer reaching customer in portal
  18. (no slide text)