2. OVERVIEW
• Case Report Form (CRF)
A printed, optical, or electronic document designed to
record all of the protocol-required information to be reported to the sponsor
on each trial subject.
• Derived Dataset Requirements / Specifications document (DDR/DDS)
This document provides the specifications/guidelines for creating the
value-added datasets from the raw dataset.
• Statistical Analysis Plan (SAP)
A document that provides a comprehensive and detailed description of the
methods and presentation of the data analysis proposed for a clinical trial.
• Protocol
The plan followed in a clinical trial. It describes the objective(s),
design, methodology, statistical considerations, and organization of the trial.
3. RAW DATA
The basic requirement in the A&R process is the raw data, which is the
basis for all reports. This data directly reflects the data collected on the CRF.
Each organization has its own standards for collecting raw data.
After data managers resolve all data discrepancies, the cleaned data
resides in the database. This data needs to be extracted from the database
to the reporting server. This process is called Data Extraction, and it can be done in two ways:
• Extraction codes: SAS programs written by the Statistical Programmer
to generate raw datasets from the database.
• Client-specific tool: raw datasets can be generated from the database
using client-specific tools.
4. SDTM
Under FDA requirements, all clinical trial data has to
conform to one globally accepted and approved standard.
Initially, each pharmaceutical company had its own way of collecting, mining
and submitting data to the FDA for drug approval. This proved
cumbersome because the data was inconsistent from one company to the next,
which created the need for one common submission standard.
CDISC therefore developed the SDTM (Study Data Tabulation Model),
which defines a standard structure for clinical trial study data
tabulations. These tabulations represent the essential data collected
about subjects. Note, however, that the SDTM is not
a data collection standard.
5. MAPPING PROCESS
Mapping:
Mapping is the process of transforming data from any other standard into the
required data standard.
The raw data needs to be mapped to SDTM standards to make it
compliant for regulatory submissions.
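As an illustration only, the mapping step can be sketched in Python (in practice this work would be done in SAS or a client tool). The raw column names, the study identifier and the way USUBJID is built are assumptions made for this example, and the output uses only a simplified subset of the SDTM Vital Signs (VS) variables.

```python
# Hypothetical raw vital-signs records, one row per subject.
RAW_VITALS = [
    {"subjid": "1001", "ht_cm": 150, "wt_kg": 60},
    {"subjid": "1002", "ht_cm": 155, "wt_kg": 55},
]

# Assumed mapping rules: raw column -> (VSTESTCD, VSTEST, VSORRESU).
VS_MAP = {
    "ht_cm": ("HEIGHT", "Height", "cm"),
    "wt_kg": ("WEIGHT", "Weight", "kg"),
}


def map_to_vs(raw_rows, studyid):
    """Turn one-row-per-subject raw data into one-row-per-test VS-style records."""
    out = []
    for row in raw_rows:
        for raw_col, (testcd, test, unit) in VS_MAP.items():
            out.append({
                "STUDYID": studyid,
                "DOMAIN": "VS",
                # Assumed convention: study id joined to the raw subject id.
                "USUBJID": f"{studyid}-{row['subjid']}",
                "VSTESTCD": testcd,
                "VSTEST": test,
                "VSORRES": str(row[raw_col]),  # original result kept as text
                "VSORRESU": unit,
            })
    return out


vs = map_to_vs(RAW_VITALS, "STUDY01")
for rec in vs:
    print(rec)
```

The point of the sketch is the change of shape: the raw data holds one row per subject with a column per measurement, while the mapped data holds one row per subject per test with standard variable names.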
6. DERIVED DATASET - VAD CREATION
A Derived dataset is obtained by applying the specifications defined in the
specification documents to the mapped (SDTM) dataset.
The Protocol, SAP and CRF are referred to when creating the Derived
dataset. These documents are used to create the
1. Specification document
2. Programming Plan document (which contains the Test Plan)
both of which are required for creating the Derived dataset.
7. Why is a Derived dataset required?
• To make the datasets ready for reporting
• To perform additional analysis on the data
Derived Datasets

Raw data:
Subjid  Ht (cm)  Wt (kg)
1001    150      60
1002    155      55
1003    160      49
1004    136      64
1005    171      77

Derived dataset:
Subjid  Ht (cm)  Wt (kg)  BMI (kg/m^2)
1001    150      60       26.67
1002    155      55       22.89
1003    160      49       19.14
1004    136      64       34.60
1005    171      77       26.33

BMI = Wt / (Ht * Ht) * 10000, with Wt in kg and Ht in cm
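The derivation can be sketched in Python as follows, assuming the standard BMI definition (weight in kg divided by the square of height in metres) and the five subjects from the raw data. The function and variable names are illustrative only.

```python
def bmi(height_cm, weight_kg):
    """Body Mass Index in kg/m^2 from height in cm and weight in kg."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m * height_m)


# Raw data: (Subjid, Ht in cm, Wt in kg)
raw = [
    (1001, 150, 60),
    (1002, 155, 55),
    (1003, 160, 49),
    (1004, 136, 64),
    (1005, 171, 77),
]

# Derived dataset: the raw variables plus the derived BMI variable.
derived = [
    {"Subjid": s, "Ht": h, "Wt": w, "BMI": round(bmi(h, w), 2)}
    for s, h, w in raw
]

for row in derived:
    print(row)
```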
8. SPECIFICATION DOCUMENT
Creation of Specification Documents
The statistician works with the statistical programmer to develop the Specification
Document according to the Protocol, SAP and CRF.
The specification document gives the format, type, length, label and computational
details of the variables to be derived (data fields that are not in the raw dataset).
It also names the variables from the raw datasets that are used
to create these derived variables.
Example: ‘Body Mass Index (BMI)’ is a derived data field which is calculated using ‘Height’
and ‘Weight’ from the raw dataset.
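One way to picture a specification entry is as a small structured record. The Python sketch below is hypothetical: the field names, the attribute values and the check function are assumptions chosen to mirror the attributes listed above (format, type, length, label, computation, source variables), not an actual specification layout.

```python
# Hypothetical specification entry for the derived variable BMI.
BMI_SPEC = {
    "name": "BMI",
    "label": "Body Mass Index (kg/m^2)",
    "type": "num",
    "length": 8,
    "format": "8.2",
    "source_vars": ["HT", "WT"],           # raw variables used in the derivation
    "derivation": "WT / (HT / 100) ** 2",  # computational details, as text
}


def check_against_spec(record, spec):
    """Return a list of problems found when comparing one record to a spec entry."""
    problems = []
    for src in spec["source_vars"]:
        if src not in record:
            problems.append(f"missing source variable {src}")
    name = spec["name"]
    if name not in record:
        problems.append(f"missing derived variable {name}")
    elif spec["type"] == "num" and not isinstance(record[name], (int, float)):
        problems.append(f"{name} should be numeric")
    return problems


good = {"SUBJID": 1001, "HT": 150, "WT": 60, "BMI": 26.67}
bad = {"SUBJID": 1002, "HT": 155, "BMI": "22.89"}

print(check_against_spec(good, BMI_SPEC))
print(check_against_spec(bad, BMI_SPEC))
```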
9. PROGRAMMING PLAN
• Creation of Programming Plan
The programming plan consists of the algorithms (flow diagram), data presentations and
programming standards to be followed to generate the Derived dataset.
The Programming Plan also contains the Test Plan, which is used to validate the dataset.
The Test Plan validates the programming used to create derived datasets and checks whether
the statistical output meets the requirements stated in the specification document.
It provides an easy way to confirm that the non-standard derived datasets and outputs have been
programmed correctly.
These documents can be either created by the programmer or provided by the client.
10. DERIVED DATASET CREATION
Once the Specifications and Programming Plan documents are finalized, the
SAS programmer writes the code to create the Derived dataset
containing the required variables, as specified in the Specifications and
Programming Plan.
Coding conventions are followed during programming. These cover
variable naming, labels and coding style, as
required.
11. QUALITY CHECK (QC) / PEER REVIEW OF DERIVED DATASET
QC of Derived Dataset:
Once the Derived dataset is created, it is reviewed by a peer reviewer as per client requirements.
Peer review ensures that the developed code fulfills the agreed requirements/specifications
and produces accurate results.
Examples of ways QC can be done:
1) The reviewer or programmer checks the dataset against the QC checklist. The QC checklist
consists of guidelines for verifying the contents and attributes of the dataset.
2) The reviewer performs a thorough check of the generated datasets by writing independent
review code and then comparing the output of the review code with the output of the original code.
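The second QC approach, comparing the original output with an independently programmed one, can be sketched in Python as below. This is a minimal illustration under stated assumptions: the function name, the record layout and the numeric tolerance are hypothetical, and the QC run contains a deliberately wrong value so that the comparison has something to report.

```python
def compare_datasets(production, qc, key, tol=1e-8):
    """Return (key value, variable, production value, qc value) for each mismatch."""
    prod_by_key = {r[key]: r for r in production}
    qc_by_key = {r[key]: r for r in qc}
    diffs = []
    for k in sorted(set(prod_by_key) | set(qc_by_key)):
        p, q = prod_by_key.get(k), qc_by_key.get(k)
        if p is None or q is None:
            # Record present on one side only.
            diffs.append((k, "<record>", p is not None, q is not None))
            continue
        for var in sorted(set(p) | set(q)):
            pv, qv = p.get(var), q.get(var)
            both_num = isinstance(pv, (int, float)) and isinstance(qv, (int, float))
            same = abs(pv - qv) <= tol if both_num else pv == qv
            if not same:
                diffs.append((k, var, pv, qv))
    return diffs


# Output of the original code and of the reviewer's independent code.
production = [{"Subjid": 1001, "BMI": 26.67}, {"Subjid": 1002, "BMI": 22.89}]
qc_run = [{"Subjid": 1001, "BMI": 26.67}, {"Subjid": 1002, "BMI": 2.29}]

mismatches = compare_datasets(production, qc_run, key="Subjid")
print(mismatches)
```

An empty result means the two programs agree on every record and variable. Any entry in the list points to a value that the programmer and the reviewer need to reconcile against the specification.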
12. GENERATION OF REPORTS
This is the key process of the Analytics & Reporting team. Once the
QC/Peer Review of the Derived dataset is done, programmers go ahead and
generate reports based on these Derived datasets.
Reports could be:
• Summary reports (summarizing the data)
• Listings (listing the entire data as it is)
• Figures/Graphs (graphical representation of the data)
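To make the idea of a summary report concrete, here is a minimal Python sketch that summarizes one variable with descriptive statistics. The choice of statistics and the use of the weight values from the derived dataset example are assumptions for illustration. Actual summary tables follow the SAP and the mock shells.

```python
from statistics import mean, median, stdev


def summarize(values):
    """Descriptive statistics of the kind shown in a summary table."""
    return {
        "n": len(values),
        "mean": round(mean(values), 2),
        "sd": round(stdev(values), 2),  # sample standard deviation
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


# Weight (kg) for the five subjects in the derived dataset example.
weights = [60, 55, 49, 64, 77]
summary = summarize(weights)

for stat, value in summary.items():
    print(f"{stat:>6}: {value}")
```

A listing, by contrast, would simply print every record as it is, without aggregation.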
15. GENERATION OF REPORTS
• List of Tables (LOT):
This document contains the list of tables to be generated, along with the table
numbers and headings.
• Mock Shells:
This document contains the mock tables (formats) of all the reports that need to
be generated.
The programmer generates the reports in SAS based on the programming plan,
the table shells and the LOT.
The programmer can also refer to the Protocol document and CRFs when
generating the reports.
16. QC/PEER REVIEW OF REPORTS
• After the reports are generated, the reviewer reviews them.
The peer reviewer performs a thorough check of the generated reports and compares the code with the
requirements stated in the SAP, Programming Plan and LOT/TOC.
Examples of ways QC can be done:
1) The reviewer or programmer checks the report against the QC checklist. The QC checklist consists of guidelines
for verifying the contents, attributes and formatting of the reports.
2) The reviewer performs a thorough check of the generated reports by writing independent review code and then
comparing the output of the review code with the output of the original code.
17. E-PUBLISHING
Once the reports are generated as per the requirements, they have to be
published in a common repository.
To do this, a publishing request is sent through the appropriate system.
Tasks of E-Publishers:
• Load Tables, Datasets and CRFs into a central repository for further
processing and FDA submission.
• Provide data for Medical Communications to work on.
• Enable submission of files for regulatory review.
Because publishing relies on high-end technology, this process is much more automated
than any other process in CDM.
18. A & R FLOW CHART
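[Flow chart: Database -> Extraction -> Raw Data -> Mapping -> SDTM -> Derived Dataset -> QC (issues? Yes / No) -> Report Generation -> Reports -> Report QC (issues? Yes / No) -> E-Publishing -> Published Reports]
[Inputs to Derived Dataset creation: Protocol, CRF, SAP, Programming Plan, DDR/DDS. Inputs to Report Generation: Protocol, SAP, Programming Plan, LOT, TOC.]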
19. PK/PD
Pharmacokinetics (PK) - The study of what the body does to the drug, i.e. the time course of
the drug's passage through the body.
Pharmacodynamics (PD) - The study of what the drug does to the body, i.e. its biological
effects (efficacy).
Pharmacokineticists in the Clinical department analyse population pharmacokinetics
and pharmacodynamics (POP PK/PD) using PK analysis software. Examples:
NONMEM (Nonlinear Mixed Effects Modelling) and WinNonlin (developed by
Pharsight, now part of Certara).
An intermediate SAS dataset is generated from the raw datasets using the specification (Global Data Request Form)
handed over by the pharmacokineticist for the study.
These SAS datasets are then converted into a .csv file using generic SAS code.
The .csv file is fed into the PK software, which computes the PK parameters such
as Cmax, Tmax, …
The QC method involves scrutinizing the .csv file and comparing it to the specification.
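As a rough illustration of the last two steps, the Python sketch below writes a concentration-time profile to .csv text and reads off Cmax (the highest observed concentration) and Tmax (the time at which it occurs). The profile values, column names and function names are invented for the example. In the process described here the .csv is produced by SAS code and the parameters are computed by the PK software.

```python
import csv
import io

# Hypothetical concentration-time profile for one subject: (time in h, conc in ng/mL).
profile = [
    (0.0, 0.0),
    (0.5, 12.4),
    (1.0, 18.9),
    (2.0, 15.2),
    (4.0, 8.1),
    (8.0, 2.3),
]


def to_csv(rows, header):
    """Serialize rows to CSV text, as would be handed to the PK software."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def cmax_tmax(rows):
    """Return (Cmax, Tmax): the peak concentration and the time it was observed."""
    t, c = max(rows, key=lambda r: r[1])
    return c, t


csv_text = to_csv(profile, ["TIME_H", "CONC_NG_ML"])
cmax, tmax = cmax_tmax(profile)

print(csv_text)
print("Cmax =", cmax, "Tmax =", tmax)
```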
20. PK/PD FLOW CHART
[Flow chart: Raw Data + Global Data Request Form -> Dataset creation using codes -> Intermediate dataset -> .csv creation using codes -> .CSV File -> QC (issues? Yes / No) -> Inserting .csv into PK software -> PK Parameters]
21. ANNUAL REPORTS & INVESTIGATOR’S BROCHURE
Annual Reports
• Document the progress of clinical studies, i.e. they summarize
the progress of the drug investigation on a periodic basis.
• Are submitted annually to the US FDA.
• Summarize some aspects of the clinical trial, including:
o Demographic characteristics
o Reporting of Adverse Events (serious and non-serious)
o Study discontinuations that occurred in the reporting period
22. ANNUAL REPORTS & INVESTIGATOR’S BROCHURE
Investigator’s Brochure
The Investigator’s Brochure (IB) is a basic document required in a clinical trial, together with the clinical
trial protocol. According to FDA regulations (Title 21 CFR 312.23), an Investigator’s Brochure must contain:
• a description of the drug substance and the formulation,
• a summary of the pharmacological and toxicological effects,
• a summary of information relating to safety and effectiveness in humans, and
• a description of possible risks and adverse reactions to be anticipated, and of precautions or special monitoring.
An IB may be requested by the physician at any stage of the trial.
The number of tables and the timelines may vary as per the request.
Example: summary reports for Adverse Events, Discontinuations and Laboratory data.
23. ALGORITHMS
• The Algorithms team supports the other teams in SAS programming.
• Any change to code in the Reporting System requires an issue ticket to be
raised.
• Once the ticket is passed, it is assigned to a developer and a reviewer.
• The developer develops the specification and the code.
• At the same time, the reviewer develops the Test Plan, which is used to
validate the code.
• The Test Plan prepared by the reviewer is reviewed by the developer.
• Once the Test Plan is passed, the reviewer reviews the specification and code
based on the checks listed in the Test Plan.
• Once the code is validated, it is tested on a testing server.
• After peer testing is complete, the issue is assigned to system testers, the
system-level code is changed and the issue is closed.
24. ALGORITHMS FLOW CHART
[Flow chart: Issue Request -> Impact Assessment -> Change Request -> Assigned to Developer (Specification and Code) and Assigned to Peer-reviewer (Testplan) -> Issues? (Yes / No) -> Peer Test -> System Testing -> Implementation]
25. TEST YOUR UNDERSTANDING
o What are the three types of reports?
o What is a programming plan?
o Which document is created for QC?
o What is SDTM?
o To whom are the annual reports submitted?