This presentation provides an overview of how reconciliation and validation rules can be defined and how trial data can be checked against them. Using JReview’s built-in browser and advanced functionality, objects with drill-down capabilities can be defined to perform these data validation checks. For example, reconciliation checks between EDC data and external lab data can be performed by reviewing a summary object that shows discrepant information such as subject ID, discrepancy category, and discrepancy message, with the ability to drill down to detailed discrepant data listings. This approach supports proactive data management for ongoing trials and should increase overall data quality. A similar approach can be applied to review data against sponsor-defined data standards checks.
1. Leveraging JReview as a Data Quality Solution
Raj Indupuri &
Chandi Kodthiwada
Confidential Presentation
September 18, 2012
2. Agenda
• Data Quality Challenges
• JReview Solution Overview
• Data Reconciliation Business Case
• Data Standards Business Case
• Q&A
3. Data Quality Challenges
Data Reconciliation
• Very tedious
  Different sources and systems
  Variant structures and formats
  Labor intensive
  Different refresh cycles
  Error-prone if performed using spreadsheets
Data Standards
• Difficult to validate compliance checks ongoing
• Difficult to validate sponsor and protocol related checks
• Difficult to get visibility during trial conduct
  Intensive programming and SAS based backend processes
JReview
• Interactive with drill-down capabilities
• Access and Ease of use
  Self-service
  Why did it happen?
  What’s happening now?
• Proactive Data Management
  Ongoing review and verification
• Reusable across trials
  Global Objects
  Customizable
4. JReview Solution Overview – How?
Specifications
• Define Categories and Items for creating an analysis friendly
discrepancy panel
• Add Notes to provide further insight into the discrepancy
• Conceptualize Run-time parameters
5. JReview Solution Overview – How?
Design/Programming
• Implement a Materialized View
• Programming will abstract all the source data type disparities &
structure variances in source data from end-user
JReview Integration/Object Development
• Import SQL development [Discrepancy Item Categorization &
Identification]
• Develop Objects based on business needs: ranging from
Discrepancy metrics per site to Subject level discrepancy
listings
• Slice and Dice data: Allow Object drill-down from a high-level
summary to a detail subject level listing
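The drill-down idea above (a high-level summary that opens into a subject-level listing) can be sketched outside JReview in a few lines of Python. This is only an illustration of the concept, not JReview's object model: the `discrepancies` rows, the `site` and `usubjid` keys, and the `summarize` and `drill_down` helpers are all hypothetical.

```python
from collections import Counter

# Hypothetical discrepancy rows, shaped like a Category/Item/Notes panel.
discrepancies = [
    {"site": "101", "usubjid": "S-001", "category": "Subject Identifiers",
     "item": "Sex", "notes": "Sex Mismatch"},
    {"site": "101", "usubjid": "S-001", "category": "Data Discrepancies",
     "item": "ECG Result", "notes": "Result Mismatch"},
    {"site": "102", "usubjid": "S-007", "category": "Visit Discrepancies",
     "item": "Visit/Planned Time point Name", "notes": "Not in eCRF Data"},
]


def summarize(rows, *keys):
    """Count discrepancies grouped by the given keys (the summary level)."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def drill_down(rows, **filters):
    """Return the detail rows that sit behind one summary cell."""
    return [row for row in rows
            if all(row[k] == v for k, v in filters.items())]


per_site = summarize(discrepancies, "site")           # discrepancy metrics per site
site_101_listing = drill_down(discrepancies, site="101")  # subject-level listing
```

Grouping by `("site", "category")` instead of `"site"` gives a finer summary without changing the drill-down step.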
6. Data Reconciliation - Requirements
Define discrepancy details
Category | Item | Notes
Subject Identifiers | Subject Initials | Subject Initials Mismatch
Subject Identifiers | Date of Birth | Date of Birth Mismatch
Subject Identifiers | Sex | Sex Mismatch
Visit Discrepancies | Visit/Planned Time point Name | Not in eCRF Data
Visit Discrepancies | Visit/Planned Time point Name | Not in External Vendor Data
Data Discrepancies | Date/Time of ECG | Date Mismatch
Data Discrepancies | ECG Result | Result Mismatch
Data Discrepancies | Completion Status | Test marked complete but not in External Vendor Data
7. Data Reconciliation - Requirements
Variables to reconcile (ECG eCRF vs. ECG External Provider)
Field Name | Column Heading | Field Name | Column Heading
Derived | Category | |
Derived | Item | |
Derived | Notes | |
EG.USUBJID/EP.USUBJID | Unique Subject ID | |
EG.EGTEST/EP.ECTEST | ECG Test Name | |
EG.VISITNUM/EP.VISITNUM | Visit Number | |
EG.VISIT/EP.VISIT | Visit | |
EG.EGTPT/EP.ECTPT | eCRF Planned Time Point Name | EP.EPTPT | External Planned Time Point Name
EG.EGSEQ | eCRF Sequence Number | EP.EPSEQ | External Sequence Number
EG.EGDTC | eCRF Date/Time of ECG | EP.EPDTC | External Date/Time of ECG
EG.EGSTAT | eCRF Completion Status | EP.EPSTAT | External Completion Status
EG.EGSTAT1 | eCRF Completion Status at each Time point | EP.EPSTAT | External Completion Status
DM.SEX | eCRF Subject Sex | EP.EPSEX | External Sex
DS.SUBINIT | eCRF Subject Initials | EP.SUBJINIT | External Subject Initials
DM.BRTHDTC | eCRF Birth Date | EP.EPDOB | External Birth Date
EG.EGORRES | eCRF Result | EP.EPVAL | External ECG Evaluation
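As a rough illustration of how the paired variables could be turned into Category/Item/Notes rows, the Python sketch below compares one eCRF record with its external counterpart. The Category, Item, and Notes values come from the requirements slide; the `FIELD_CHECKS` pairing, the flat record layout, the `normalize` rule (trim and upper-case), and the `reconcile` function are assumptions made for the example, not the presenters' implementation.

```python
# (Category, Item, Notes, eCRF field, external field).
# The pairing of fields to checks is assumed for illustration.
FIELD_CHECKS = [
    ("Subject Identifiers", "Subject Initials", "Subject Initials Mismatch", "SUBINIT", "SUBJINIT"),
    ("Subject Identifiers", "Date of Birth", "Date of Birth Mismatch", "BRTHDTC", "EPDOB"),
    ("Subject Identifiers", "Sex", "Sex Mismatch", "SEX", "EPSEX"),
    ("Data Discrepancies", "Date/Time of ECG", "Date Mismatch", "EGDTC", "EPDTC"),
    ("Data Discrepancies", "ECG Result", "Result Mismatch", "EGORRES", "EPVAL"),
]


def normalize(value):
    """Hide trivial formatting differences (case, padding) between sources."""
    return None if value is None else str(value).strip().upper()


def reconcile(ecrf, external):
    """Return one Category/Item/Notes row per mismatching field pair."""
    found = []
    for category, item, notes, ecrf_field, ext_field in FIELD_CHECKS:
        if normalize(ecrf.get(ecrf_field)) != normalize(external.get(ext_field)):
            found.append({"USUBJID": ecrf["USUBJID"], "Category": category,
                          "Item": item, "Notes": notes})
    return found


# Hypothetical records for one subject, visit, and time point.
ecrf_row = {"USUBJID": "S-001", "SUBINIT": "ABC", "BRTHDTC": "1970-01-01",
            "SEX": "F", "EGDTC": "2012-03-01T10:00", "EGORRES": "NORMAL"}
external_row = {"USUBJID": "S-001", "SUBJINIT": "ABC", "EPDOB": "1970-01-01",
                "EPSEX": "M", "EPDTC": "2012-03-01T10:00", "EPVAL": "normal "}

rows = reconcile(ecrf_row, external_row)
```

Keeping the checks in a table rather than in code means a new reconciliation item is one more tuple, which mirrors how the Category/Item/Notes panel is specified.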
10. Data Reconciliation – Design and Develop
Source Dataset/Table
• Identify Sources:
  • EG (eCRF ECG Data)
  • EP (External Vendor ECG Data)
View Programming
• Develop a view with aggregated Identifier information from both sources and join the source data back to the aggregated Identifier information, effectively joining data wherever applicable
Materialized View/Table
• Performance: Run the view every time? Query a static table [Maintenance]?
Import SQL
• Discrepancy Categorization
• Discrepancy Identification
JReview Object Development
• Build Objects
• Summary, Detailed & Graphs
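The view design described on this slide (aggregate the identifiers from both sources, then join each source back to that set) behaves like a full outer join. A minimal Python sketch of the same idea follows; the key columns (`USUBJID`, `VISITNUM`, and a common `TPT` name), the helper names, and the sample rows are assumptions, and the sketch assumes at most one record per key in each source.

```python
def key_of(row):
    """Identifier used to line up eCRF and external records (assumed key)."""
    return (row["USUBJID"], row["VISITNUM"], row["TPT"])


def join_on_identifiers(eg_rows, ep_rows):
    """Union the identifiers from both sources, then join each source back."""
    eg_by_key = {key_of(r): r for r in eg_rows}
    ep_by_key = {key_of(r): r for r in ep_rows}
    all_keys = sorted(eg_by_key.keys() | ep_by_key.keys())
    return [(k, eg_by_key.get(k), ep_by_key.get(k)) for k in all_keys]


def visit_discrepancies(joined):
    """Flag identifiers that are present in only one of the two sources."""
    out = []
    for key, eg, ep in joined:
        if eg is None:
            out.append((key, "Not in eCRF Data"))
        elif ep is None:
            out.append((key, "Not in External Vendor Data"))
    return out


# Hypothetical sample data.
eg = [{"USUBJID": "S-001", "VISITNUM": 1, "TPT": "PREDOSE"},
      {"USUBJID": "S-001", "VISITNUM": 2, "TPT": "PREDOSE"}]
ep = [{"USUBJID": "S-001", "VISITNUM": 1, "TPT": "PREDOSE"},
      {"USUBJID": "S-002", "VISITNUM": 1, "TPT": "PREDOSE"}]

joined = join_on_identifiers(eg, ep)      # computed once, then reused
flags = visit_discrepancies(joined)
```

Storing `joined` and querying it repeatedly corresponds to the static-table option raised under Performance: it avoids recomputing the join on every run but has to be refreshed when either source changes.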
13. Data Standards - Requirements
Define data standards checks
Data Validation Category | Data Validation ID | Data Validation Item | Severity
Consistency | C0001 | Duplicate --SEQ | Error
Consistency | C0002 | Duplicate USUBJID, with different SUBJID | Error
Presence | SD0001 | No records in data source | Warning
Presence | SD0069 | No Disposition record found for subject | Warning
Presence | SD0070 | No Exposure record found for subject | Warning
Presence | SD0002 | Null value in variable marked as Required | Error
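Two of the checks listed above can be sketched in Python to show the shape such rules might take. The validation IDs and severities are taken from the slide; the function names, the finding layout, the choice to treat --SEQ as unique per subject, and the list of required variables are assumptions for illustration only.

```python
from collections import Counter


def check_duplicate_seq(rows, seq_var):
    """C0001: flag (USUBJID, --SEQ) pairs that occur more than once.

    Per-subject uniqueness of --SEQ is an assumption of this sketch.
    """
    counts = Counter((r["USUBJID"], r[seq_var]) for r in rows)
    return [
        {"id": "C0001", "severity": "Error", "usubjid": subj,
         "detail": f"Duplicate {seq_var}={seq}"}
        for (subj, seq), n in counts.items() if n > 1
    ]


def check_required_not_null(rows, required_vars):
    """SD0002: flag null or empty values in variables marked as Required."""
    findings = []
    for r in rows:
        for var in required_vars:
            if r.get(var) in (None, ""):
                findings.append({"id": "SD0002", "severity": "Error",
                                 "usubjid": r.get("USUBJID"),
                                 "detail": f"Null value in {var}"})
    return findings


# Hypothetical ECG records; which variables are Required is assumed here.
eg_rows = [
    {"USUBJID": "S-001", "EGSEQ": 1, "EGTEST": "QT Interval"},
    {"USUBJID": "S-001", "EGSEQ": 1, "EGTEST": ""},
    {"USUBJID": "S-002", "EGSEQ": 1, "EGTEST": "QT Interval"},
]

findings = (check_duplicate_seq(eg_rows, "EGSEQ")
            + check_required_not_null(eg_rows, ["USUBJID", "EGTEST"]))
```

Because every finding carries its validation ID and severity, the combined list can feed the same kind of summary and detail objects used for reconciliation.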
Two areas that affect overall data quality: data reconciliation and data standards compliance checks
Specifications
• Add Categories and Items for creating an analysis friendly discrepancy panel
• For similar discrepancies, add Notes to give further insight into the discrepancy
• Conceptualize Run-time parameters