D. Divya, M. Pharmacy
 Clinical Data Management (CDM) is the process of
collecting, entering, validating, and cleaning the data
obtained in a clinical trial.
 The pharmaceutical industry relies on electronically
captured data for the evaluation of medicines, so there
is a need to follow good practices in CDM and maintain
standards in electronic data capture. These electronic
records have to comply with the Code of Federal
Regulations, 21 CFR Part 11.
 The Society for Clinical Data Management (SCDM)
publishes the Good Clinical Data Management
Practices (GCDMP) guidelines, a document providing
the standards of good practice within CDM.
 Many software tools are available for data
management, and these are called Clinical Data
Management Systems (CDMS).
Commonly used CDM tools are:
 Clintrial
 InForm
 Rave
 Oracle Clinical
 OpenClinica
 TrialDB
 PhOSCo
 eClinical Suite
Set-up Phase:
1) CRF design and development
2) Writing edit checks
3) Programming
4) CRF annotation
5) Database design
6) User Acceptance Testing (UAT)
Conduct Phase:
1) Data collection and data entry
2) Validation
3) Discrepancy management
4) MedDRA coding
5) SAE/lab data reconciliation
Close-out Phase:
1) Database audit
2) Database freeze/lock
 According to ICH-GCP, a case report form (CRF) is
a printed, optical, or electronic document designed
to record all of the protocol-required information
to be reported to the sponsor on each trial subject.
 A CRF is also known as a Data Collection Instrument
(DCI).
Types of CRF:
 Standard (general) case report forms.
Ex: subject enrollment form, eligibility form,
subject randomization form, medical history,
physical examination, clinical laboratory data,
compliance, concomitant medication, adverse
events, off study and death.
 Study-specific case report forms.
Ex: biomarker forms such as cell differentiation
biomarkers, inflammatory cytokines, DNA
ploidy analysis, PGE2 levels, proliferation
analysis and nuclear morphometry;
pharmacokinetics forms.
Header: Protocol ID, Site ID, Subject ID, Patient initials
Footer: Investigator signatures, Date of signature,
Version number, Page number
Main module: Details of the domain.
 Use consistent formats, font styles and font sizes.
 Select portrait, landscape or combination layouts
deliberately.
 Use clear and concise questions, prompts and instructions.
 Provide visual cues.
Ex: check boxes, radio buttons
 Give clear guidance about skip patterns (what to skip and
what not to skip) at the appropriate places.
 Skips (instructions provided on the CRF page to maintain
the connectivity between pages) should be kept to a
minimum through the placement of questions, to avoid
confusion.
 Separate the columns with thick lines.
 Provide instructions in bold and italics.
 Minimize free-text responses.
 Page numbering, if necessary, should be consistent
throughout.
 Avoid using “check all that apply”, as it forces
assumptions about the clinical data.
 Specify the unit of measurement.
 Indicate the number of decimal places to be recorded.
 Use a standard date format.
Ex: dd/mm/yy throughout the CRF.
 Use precoded answer sets.
Ex: Yes/No
Male/Female
Mild/Moderate/Severe
 Do not split modules/sections (a set of one or
more related groups of questions that pertain to a
single clinical study visit).
Ex: The AE section should not be split and laid across
pages such that information related to a single AE
has to be collected from different pages.
Edit Checks are invaluable tools used to
enhance the quality of data or to clean the data.
Types of Edit Checks:
 Univariate edit checks
 Multivariate edit checks
Domain | Variable | Condition/Discrepancy logic | Error msg | Resolution | Type
DM (Demographics) | Site ID | If Site ID is missing | Site ID is missing. Please enter Site ID | Site ID should be provided | Univariate
CM (Concomitant medication) | Therapy provided = Yes | Drug name is missing | If therapy is provided, please mention the drug name also | Drug name should be provided | Multivariate
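The two rows above can be sketched in code as one univariate and one multivariate check. This is only an illustrative sketch: the field names `SITEID`, `THERAPY` and `DRUGNAME` and the function names are assumptions made for the example, not part of any particular CDMS.

```python
from typing import Optional


def check_site_id(record: dict) -> Optional[str]:
    """Univariate check: looks at a single variable (Site ID)."""
    if not record.get("SITEID"):
        return "Site ID is missing. Please enter Site ID"
    return None


def check_drug_name(record: dict) -> Optional[str]:
    """Multivariate check: one variable is tested against another."""
    therapy_given = record.get("THERAPY") == "Yes"
    if therapy_given and not record.get("DRUGNAME"):
        return "If therapy is provided, please mention the drug name also"
    return None


def run_edit_checks(record: dict) -> list:
    """Run every check and collect the error messages that fired."""
    checks = [check_site_id, check_drug_name]
    messages = [check(record) for check in checks]
    return [m for m in messages if m is not None]


# Hypothetical record with both problems present.
record = {"SITEID": "", "THERAPY": "Yes", "DRUGNAME": ""}
for message in run_edit_checks(record):
    print(message)
```

Each message that fires would normally be raised as a discrepancy against the record rather than printed.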
Process for defining and implementing edit checks:
1) CDM prepares a draft edit check specification (ECS)
2) CDM circulates the draft ECS for review
3) Team meeting to finalize the ECS
4) Team reviews and approves the final ECS
5) CDM issues the final ECS
To annotate the case report forms we have to
maintain standards. These standards are given by
CDISC.
CDISC: Clinical Data Interchange Standards Consortium
PRM: Protocol Representation Model
CDASH: Clinical Data Acquisition Standards Harmonization
LAB: Laboratory Data Model
SDTM: Study Data Tabulation Model
ADaM: Analysis Data Model
SEND: Standard for Exchange of Nonclinical Data
PRM is used for designing the protocol, CDASH and LAB
standards for data collection and exchange, SDTM for
tabulating the data, and ADaM for analyzing the data.
According to SDTM, every variable name has at most 8
uppercase characters and every domain is identified by
a 2-letter uppercase code.
Annotation
Domain:
DM = Demographics, CM = Concomitant medication, etc.
Variable:
Site ID = SITEID
Subject ID = SUBJID
Visit type = VISIT
Visit date = VISITDTC
Date of ICF = RFICDTC etc.
 A variable name should not exceed 8 uppercase characters.
 Special characters other than “_” should not be used.
 A variable name should never start with a number.
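The three naming rules above can be expressed as a single pattern. The sketch below is a minimal illustration; the function name `is_valid_variable_name` is hypothetical, and the pattern encodes only the rules listed on this slide, not the full SDTM specification.

```python
import re

# At most 8 characters, drawn from uppercase letters, digits and "_",
# where the first character must not be a digit.
VARIABLE_NAME = re.compile(r"[A-Z_][A-Z0-9_]{0,7}")


def is_valid_variable_name(name: str) -> bool:
    """Return True if the name satisfies all three rules."""
    return VARIABLE_NAME.fullmatch(name) is not None


for name in ["SITEID", "VISITDTC", "1VISIT", "SUBJECT_ID", "SITE-ID"]:
    print(name, is_valid_variable_name(name))
```

Names such as `1VISIT` (starts with a number), `SUBJECT_ID` (10 characters) and `SITE-ID` (contains a hyphen) are rejected.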
 Database designers should understand what data
should be collected, and what the data type, length
and response type of each item will be.
 What are the key questions to be answered?
 How will the data be analyzed?
 What reports will be used?
 How often will a new report be generated?
 Questionnaire elements are of 2 types:
 Closed-ended questions
Ex: Gender? Male, female, other.
 Open-ended questions
Ex: Why are you participating in clinical trials?
The answers cannot be anticipated.
 Required-value entry: checking whether the data
that need to be entered can actually be entered in a
particular field.
 Format of values
 Range checks
 Floating decimals
 Negative value checks
 Future date checks
 Confirmation of logic between particular fields
 Comparison of extracted data with original data
 Personal data protection evaluation (PHI:
Protected Health Information)
 Correctness of lab value units and ranges
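Several of the checks listed above (format, range, decimal places, negative values, future dates) are simple enough to sketch. The code below is a hedged illustration only: the function names, the returned labels and the limits in the usage lines are assumptions, and a real UAT script would take its limits from the study's edit check specification.

```python
from datetime import date
from decimal import Decimal, InvalidOperation
from typing import Optional


def check_numeric(text: str, low: Decimal, high: Decimal, decimals: int) -> list:
    """Return the names of the checks that an entered value fails."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        return ["format"]
    if not value.is_finite():  # rejects NaN and Infinity
        return ["format"]
    failed = []
    if value < 0:
        failed.append("negative value")
    if not low <= value <= high:
        failed.append("range")
    # A negative exponent counts the digits after the decimal point.
    if -value.as_tuple().exponent > decimals:
        failed.append("decimal places")
    return failed


def is_future_date(entered: date, today: Optional[date] = None) -> bool:
    """Future date check: a visit date must not lie after today."""
    return entered > (today or date.today())


# Hypothetical limits for a body temperature field, one decimal place.
print(check_numeric("36.65", Decimal("35"), Decimal("42"), decimals=1))
print(is_future_date(date(2030, 1, 1), today=date(2024, 1, 1)))
```

Parsing the value as entered text, rather than as a float, keeps the decimal place count exactly as the operator typed it.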
 Data collection is done using the CRF, which may
exist on paper or as an electronic version (eCRF).
Data from paper CRFs are transferred to the database
by in-house data entry.
 In eCRF-based CDM, the investigator logs into
the CDM system and enters the data directly at the
site. With the eCRF method the chances of errors are
lower and discrepancies are resolved faster.
 CRF tracking: The entries made in the CRF are
monitored by the Clinical Research Associate (CRA)
for completeness, and filled-in CRFs are
retrieved and handed over to the CDM team.
 Data entry:
Single data entry
Double data entry
Mostly double data entry is performed, in which
the data are entered by two operators
separately.
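The point of double data entry is that the two independent entries can be compared field by field. A minimal sketch of that comparison, assuming each pass is held as a dictionary of field name to entered value (the function name and the sample values are made up for illustration):

```python
def compare_entries(first: dict, second: dict) -> dict:
    """Return the fields where the two operators' entries differ."""
    mismatches = {}
    for field in sorted(set(first) | set(second)):
        a, b = first.get(field), second.get(field)
        if a != b:
            mismatches[field] = (a, b)
    return mismatches


# Hypothetical first and second pass for the same CRF page.
first_pass = {"SITEID": "101", "SUBJID": "0007", "WEIGHT": "72.5"}
second_pass = {"SITEID": "101", "SUBJID": "0007", "WEIGHT": "75.2"}
print(compare_entries(first_pass, second_pass))
```

Each mismatch would then be checked against the CRF and the correct value kept.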
 Data validation is a series of documented tests
of the data with the goal of ensuring the quality
and integrity of the data.
 The validation process is complex and dependent
on the data captured, business and regulatory
concerns, the data management software used
and several other factors.
Why do clinical data need validation?
From a business perspective, the data are how
the FDA, other regulators and business
partners evaluate the worth of the product.
From an ethical perspective, clinical data affect
treatment decisions, which affect patient health
and the patient population.
Validation is of 2 types:
1) Clinical database validation
2) Clinical data validation
Clinical database validation: this means
performing the edit checks. The sponsor decides
what checks should be used, what code lists are
appropriate and what procedures will be used
for invalid results.
Clinical Data Validation:
By Investigator:
 The investigator should ensure the accuracy,
completeness, legibility and timeliness of the
data reported to the sponsor in the CRFs and
in all required reports.
 The investigator should ensure that any data
reported on the CRF are consistent with the
patient’s medical records and that, where applicable,
discrepancies are explained.
By Monitor (CRA):
 The monitor should check the CRF entries against the
source documents, inform the investigator of
any errors or omissions, and ensure that all data are
correctly and completely recorded and reported.
This is called source data verification (SDV).
 The CRF is compared with the original medical records to
ensure it is complete and accurate.
 Through the SDV process, the monitor should confirm
accurate transcription of data from source files to
the CRF.
There are 2 methods of SDV:
 Direct access: the monitor is given direct access
to the actual source document and conducts an
independent comparison against the CRF.
 Indirect access: the monitor is not allowed
access either to the actual or to the photocopied
source document. Key variables are chosen
for which the investigator or a member of staff
reads the source document entry while the
monitor compares it with the CRF entry.
By CDM:
 CDM data validation activities are an integral part of
GCP and fundamental to the delivery of high-quality
data for statistical analysis and reporting.
 Attention should be focused on ensuring that the data
are a reasonable representation of what actually
happened at the investigator site.
 The aim is to transform data recorded on CRFs into
information that can be used in the final clinical report,
from which the right conclusions about the new drug
can be drawn.
 Data clarification queries are issued to the
investigator at various stages in the process, in
particular as a result of pre-entry review, data entry
and running of edit checks.
 This is also called query resolution. It helps in
cleaning the data and gathers enough evidence for
the deviations observed in the data.
 Based on the types identified, discrepancies are
either flagged to the investigator for clarification or
closed in house by Self-Evident Corrections (SECs)
without sending a Data Clarification Form (DCF) to
the site. The most common SECs are obvious
spelling errors.
 For discrepancies that require clarification from
the investigator, a DCF is sent to the site.
DCF contains the following elements:
 Site ID:
 Investigator name:
 Date:
 Subject ID:
 Module name:
 Reviewer:
 Comments:
 Resolution:
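The elements listed above map naturally onto a small record type. The sketch below is illustrative only: the class name, the attribute names and the sample values are assumptions, and the rule that a form counts as open until a resolution is entered is a simplification made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DataClarificationForm:
    site_id: str
    investigator_name: str
    issued_on: date
    subject_id: str
    module_name: str
    reviewer: str
    comments: str
    resolution: str = ""  # left empty until the site answers the query

    @property
    def is_open(self) -> bool:
        """A form stays open until a non-blank resolution is recorded."""
        return not self.resolution.strip()


# Hypothetical query raised against a concomitant medication page.
dcf = DataClarificationForm(
    site_id="101",
    investigator_name="Dr. Example",
    issued_on=date(2024, 1, 15),
    subject_id="0007",
    module_name="Concomitant medication",
    reviewer="Data manager",
    comments="Therapy is marked Yes but the drug name is missing.",
)
print(dcf.is_open)
dcf.resolution = "Drug name confirmed from source notes and entered."
print(dcf.is_open)
```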
MedDRA (Medical Dictionary for Regulatory Activities)
is a rich and highly specific standardized
medical terminology to facilitate sharing of
regulatory information internationally for
medical products used by humans. Products
covered by the scope of MedDRA include
pharmaceuticals, biologics, vaccines and
drug-device combination products.
Why do we need to code?
 To standardize the data, so that they are easy to
exchange from one country to another, with
regulatory agencies and with multinational
companies (MNCs).
Other coding dictionaries include:
 WHO-DD – WHO Drug Dictionary
 WHO-ART – WHO Adverse Reaction Terminology
 COSTART – Coding Symbols for a Thesaurus of
Adverse Reaction Terms.
Structure of MedDRA (highest to lowest level):
SOC (System Organ Class)
HLGT (High Level Group Term)
HLT (High Level Term)
PT (Preferred Term)
LLT (Lowest Level Term)
 Reconciliation is the process of ensuring
consistency in clinical trial data between the data
collection database and the safety database.
Examples of items that have to be reconciled:
 AE – Event description
 Start of event – onset date
 End of event – stop date
 Severity – intensity
 Seriousness criteria – AE serious
 Relationship to study drug – investigator causality.
Concomitant medication
 Drug name
 Start date
 Stop date
 Dosages
 Frequency
 Database audits are conducted between soft lock and
hard lock (freeze and lock) of the database. Prior to the
database audit the auditor should receive the following
documents related to the trial and database.
 Study protocol including amendments.
 CRF
 Data management plan
 Statistical analysis plan
 Annotated CRF
 List of coding dictionaries
 List of the laboratory units
 List of all electronic and manual plausibility checks
 SOPs of all procedures related to data management.
Auditing Process:
 Data managers compare data in the database
against the CRF and any associated correction
forms.
 Data managers may review listings of text
fields. A separate listing review by Clinical
Research Associates is often required for study
lock.
 The most common reconciliation with external
systems is for serious adverse events (SAEs). Data on
SAEs are typically stored both in the clinical
data management system and in a separate
SAE system. When reconciling at study
close, data management staff look for:
 Cases found in the SAE system but not in the
CDM system.
 Events found in the CDM system but not in the
SAE system.
 Deaths reported in one but not the other.
 Instances where the basic data match up but
where there are differences, such as in onset
date.
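The four things looked for at study close can be sketched as set comparisons. This is a hedged illustration, not a description of any real safety system: it assumes both systems can be exported as dictionaries keyed by (subject ID, event term), and the `onset` and `death` keys and all sample records are made up.

```python
def reconcile_sae(cdm: dict, safety: dict) -> dict:
    """Compare SAE records from the CDM and safety systems by shared key."""
    shared = sorted(set(cdm) & set(safety))
    return {
        "only_in_safety": sorted(set(safety) - set(cdm)),
        "only_in_cdm": sorted(set(cdm) - set(safety)),
        "death_mismatch": [k for k in shared if cdm[k]["death"] != safety[k]["death"]],
        "onset_mismatch": [k for k in shared if cdm[k]["onset"] != safety[k]["onset"]],
    }


# Illustrative records only.
cdm = {
    ("001", "Nausea"): {"onset": "2024-01-05", "death": False},
    ("002", "Sepsis"): {"onset": "2024-02-01", "death": True},
    ("003", "Rash"): {"onset": "2024-03-10", "death": False},
}
safety = {
    ("001", "Nausea"): {"onset": "2024-01-06", "death": False},
    ("002", "Sepsis"): {"onset": "2024-02-01", "death": False},
    ("004", "Stroke"): {"onset": "2024-04-02", "death": False},
}
for finding, cases in reconcile_sae(cdm, safety).items():
    print(finding, cases)
```

Matching on an exact event term is a simplification; in practice the two systems may describe the same event differently, so matching usually needs manual review.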
 Database locking is usually a two-step process.
The first step is often referred to as soft lock or
database freeze and occurs after all data
cleaning, validation and QC activities have
been finalized. The second step is called hard
lock or database lock. At this stage the
database is handed over to statistics for data
analysis, and the data can be unblinded in the case
of a blinded study.
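The two-step lock can be pictured as a small state machine with a precondition on each step. The sketch below is an assumption-laden illustration: the names `LockState` and `advance` are hypothetical, the audit precondition follows the earlier statement that audits happen between soft lock and hard lock, and reopening a locked database is deliberately not modeled.

```python
from enum import Enum


class LockState(Enum):
    OPEN = "open"
    SOFT_LOCK = "soft lock (freeze)"
    HARD_LOCK = "hard lock"


# Forward-only transitions; unlocking is outside the scope of this sketch.
NEXT_STATE = {
    LockState.OPEN: LockState.SOFT_LOCK,
    LockState.SOFT_LOCK: LockState.HARD_LOCK,
}


def advance(state: LockState, cleaning_done: bool, audit_done: bool) -> LockState:
    """Move one step towards hard lock if the precondition is met."""
    if state is LockState.HARD_LOCK:
        raise ValueError("database is already hard locked")
    if state is LockState.OPEN and not cleaning_done:
        raise ValueError("soft lock requires cleaning, validation and QC to be finished")
    if state is LockState.SOFT_LOCK and not audit_done:
        raise ValueError("hard lock requires the database audit to be finished")
    return NEXT_STATE[state]


state = advance(LockState.OPEN, cleaning_done=True, audit_done=False)
print(state.value)
state = advance(state, cleaning_done=True, audit_done=True)
print(state.value)
```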

Clinical Data Management
