Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Requirements


Published on

In the U.S., pharmaceutical firms must meet electronic record-keeping regulations set by the Food and Drug Administration (FDA). The regulation is Title 21 CFR Part 11, commonly known as Part 11.

Part 11 requires regulated firms to implement controls for software and systems involved in processing many forms of data as part of business operations and product development.

Enterprise data warehouses are used by the pharmaceutical and medical device industries for storing data covered by Part 11. QuerySurge, the only test tool designed specifically for automating the testing of data warehouses and the ETL process, is the market leader in testing data warehouses used by Part 11-governed companies.

For more on QuerySurge and Pharma, please visit

Published in: Technology, Business
  • Be the first to comment

Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Requirements

  1. 1. 1 Presenter Bill Hayduk Founder / President Presenter Jeff Bocarsly, Ph.D. Senior Architect built bybuilt by QuerySurge™
  2. 2. built by The average organization loses $8.2 million annually through poor Data Quality. - Gartner 46% of companies cite Data Quality as a barrier for adopting Business Intelligence products. - InformationWeek The cost per patient data of Phase 3 clinical studies of new pharmaceuticals exceeds $26,000. - Journal of Clinical Research Best Practices built by QuerySurge™
  3. 3. Pharma’s 2 Largest Data Warehousing Concerns built by QuerySurge™
  4. 4. Pharma’s Largest Data Warehouse Concerns (1) Data Integrity (2) Compliance built by QuerySurge™
  5. 5. (1) Data Integrity high risk of defects that are not readily visible Missing Data Truncation of Data Data Type Mismatch Null Translation errors Incorrect Type Translation Misplaced Data Extra Records Transformation Logic Errors/Holes Simple/Small Errors Sequence Generator errors Undocumented Requirements Not Enough Records built by Pharma’s Data Warehouse Concerns QuerySurge™
  6. 6. Pharma’s Data Warehouse Concerns (2) Compliance Need to comply with Part 11 mandates historical test information test version history test execution data: who, what & when test cycle information visibility of assets archived test results built by QuerySurge™
  7. 7. Why is this Important?  Periodic data reporting to FDA  Periodic data reporting to int’l bodies (1) Data Integrity (2) Compliance  FDA announced audits  Unannounced FDA audits Consequences Severe financial and business built by QuerySurge™
  8. 8. Pharma’s Testing & Reporting Needs built by QuerySurge™
  9. 9.  automate the manual testing of data  compare millions of rows of data quickly  flag mismatches and inconsistencies in data sets  provide flexibility in scheduling test runs  generate informative reports that can easily be shared with the team  validate up to 100% of all of all data, mitigating the risk Data Integrity needs Need a testing solution that can… built by QuerySurge™
  10. 10. Part 11 Reporting needs  track test history  provide reporting on test version history  record all test execution by testing owner’s name and date  deliver auditable reports of test cycles  store all test outcomes and test data  offer a read-only user for reviewing test assets  support archiving of results Need a testing solution that can… built by QuerySurge™
  11. 11. built by The solution… built by QuerySurge™
  12. 12. What is QuerySurge™? the collaborative Data Testing solution that finds bad data & provides a holistic view of your data’s health built by QuerySurge™
  13. 13. • Reduce your costs & risks • Improve your data quality • Accelerate your testing cycles • Share information with your team with QuerySurge™ you can: built by QuerySurge™ • Provides huge ROI (i.e. 1,300%)* *based on client’s calculation of Return on Investment
  14. 14. Finding Bad Data SQL HQL SQL HQL SQL SQL  QS pulls data from data sources  QS pulls data from target data store  QS compares data quickly  QS generates reports, audit trails How? Reports, Data Health Dashboard built by QuerySurge™
  15. 15. QuerySurge™ Architecture Web-based… Installs on... Linux Connects to… …or any other JDBC compliant data source built by QuerySurge™ QuerySurge Controller QuerySurge Server QuerySurge Agents Flat Files
  16. 16. Collaboration Testers - functional testing - regression testing - result analysis Developers / DBAs - unit testing - result analysis Data Analysts - review, analyze data - verify mapping failures Operations teams - monitoring - result analysis Managers - oversight - result analysis Share information on the built by QuerySurge™
  17. 17. the QuerySurge advantage built by QuerySurge™ Automate the entire testing cycle  Automate kickoff, tests, comparison, auto-emailed results Create Tests easily with no SQL programming  ensures minimal time & effort to create tests / obtain results Test across different platforms  data warehouse, Hadoop, NoSQL, database, flat file, XML Collaborate with team  Data Health dashboard, shared tests & auto-emailed reports Verify more data & do it quickly  verifies up to 100% of all data up to 1,000 x faster Integrate for Continuous Delivery  Integrates with most Build, ETL & QA management software
  18. 18. built by QuerySurge™ QuerySurge™ Modules Design Tests SchedulingDeep-Dive Reporting Run Dashboard Query WizardsData Health Dashboard
  19. 19. Fast and Easy. No programming needed. built by QuerySurge™ QuerySurge™ Modules Compare by Table, Column & Row • Perform 80% of all data tests •Automatically generates SQL code • Opens up testing to novice & non- technical team members • Speeds up testing for skilled SQL coders • provides a huge Return-On-Investment
  20. 20. built by QuerySurge™ QuerySurge™ Modules 3 Types of Data Comparison Wizards: The also provide you with automated features for: o filtering (‘Where’ clause) and o sorting (‘Order By’ clause) Column-Level Comparison: This is great for Big Data stores and Data Warehouses where tables will have some columns containing transformations and some columns with no transformations. Many tables and columns can be compared simultaneously and quickly. Table-Level Comparison: This comparator is great for Data Migrations and Database Upgrades with no transformations at all. Many tables can be compared simultaneously and quickly. Row Count Comparison: Great for all - Big Data stores, Data Warehouses, Data Migrations and Database Upgrades. Many tables and rows can be compared simultaneously and quickly.
  21. 21. Design Library  Create custom Query Pairs (source & target SQLs)  Great for team members skilled with SQL QuerySurge™ Modules Scheduling  Build groups of Query Pairs  Schedule Test Runs for: • immediately • at a specific date/time • automatically after build or ETL process built by QuerySurge™
  22. 22. Deep-Dive Reporting  Examine and automatically email test results Run Dashboard  View real-time execution  Analyze real-time results QuerySurge™ Modules built by QuerySurge™
  23. 23. built by QuerySurge™ • view data reliability & pass rate • add, move, filter, zoom-in on any data widget & underlying data • verify build success or failure
  24. 24. Test Management Connectors built by QuerySurge™  Drive QuerySurge execution from your Test Management Solution  Outcome results (Pass/Fail/etc.) are returned from QuerySurge to your Test Management Solution  Results are linked in your Test Management Solution so that you can click directly into detailed QuerySurge results • HP ALM (Quality Center) • Microsoft Team Foundation Server • IBM Rational Quality Manager Integration with leading Test Management Solutions
  25. 25. Case Study Fortune 500 firm: Clinical Trial Data built by QuerySurge™
  26. 26. Case Study: Fortune 500 Pharma Challenge How can a Data Warehouse team assure data integrity over multiple builds when the cost per patient data of Phase 3 clinical studies exceeds $26,000 and volume of live case data is > 1 TB? Strategy Implement QuerySurge™ to dramatically increase coverage of data that is verified for each build. Implementation • 1,000 SQL queries written to compare case data from the source systems to the DWH after ETL. • QuerySurge™automated the scheduling, test runs, comparisons and reporting for each build. built by QuerySurge™
  27. 27. Metrics  500 mappings  2.5 million data items  1.25 billion verifications  Complete run finished in 7 days  45% of data was covered.  14 builds were deployed  115 defects were discovered and remediated Case Study: Fortune 500 Pharma Benefits • 10-fold increase in the speed of testing. • Huge increase in coverage of data (from less than 1/10 % to 45%) • Production defects discovered that were missed in previous cycles • Huge savings on clean records (115 defects x $26,000/record) • A huge time savings (3.6 years x 10 people) • Avoidance of lawsuits and FDA fines built by QuerySurge™
  28. 28. (1) a Trial in the Cloud of QuerySurge, including self-learning tutorial that works with sample data for 3 days or (2) a Downloaded Trial of QuerySurge, including self-learning tutorial with sample data or your data for 15 days or (3) a Proof of Concept of QuerySurge, including a kickoff & setup meeting and weekly meetings with our team of experts for 30 days more information, Go here TRIAL IN THE CLOUD built by QuerySurge™ Free Trials
  29. 29. built by QuerySurge™ QuerySurge For more on the Pharma & QuerySurge, go to