Clinical Data Interchange Standards
Consortium (CDISC)
Study Data Tabulation Model
Presented By:
Ankur Sharma
Biostatistical Programmer
PAREXEL International,
Baltimore, MD, USA
Definitions
• CDISC: Clinical Data Interchange Standards
Consortium
• SDTM: Study Data Tabulation Model
• ADaM: Analysis Data Model
• SDS: Submission Data Standards
• DDT: Data Definition Tables
Clinical Data Interchange
Standards Consortium (CDISC)
• CDISC is a global, open, multidisciplinary, non-profit
organization that has established standards to
support the acquisition, exchange, submission and
archiving of clinical research data and metadata.
• Leads the development of standards that improve
efficiency while supporting the scientific nature of
clinical research.
• Recognizes the ultimate goal of creating regulatory
submissions that allow for flexibility in scientific
content and are easily interpreted, understood and
navigated by regulatory reviewers.
Study Data Tabulation Model
(SDTM)
• SDTM defines a standard structure for study data
tabulations (datasets) that are to be submitted to a
regulatory authority such as the Food and Drug
Administration (FDA).
• Benefits of SDTM:
A common structure allows reviewers at the FDA to
build a repository of all submitted studies and to
create standalone tools to access, manipulate and
view the study data.
SDTM Implementation Guide and Its
Versions
• The Study Data Tabulation Model Implementation Guide
(SDTMIG) for Clinical Trials is prepared by the
Submission Data Standards (SDS) Team.
• Two versions of the implementation guide are in use:
i) SDTMIG V3.1.1
ii) SDTMIG V3.1.2 (newest and preferred)
Continued….
• The SDTM describes the general conceptual model for
representing clinical study data that is submitted
to regulatory authorities. The SDTMIG V3.1.2 builds
on that model and provides specific domain models,
assumptions, business rules, and examples for
preparing standard tabulation datasets.
SDTM Fundamentals
• SDTM Variable Classification:
1.) Identifier: Variables that identify the study,
the subject involved, the domain and the
sequence number of the record.
2.) Topic: Specifies the focus of the observation.
3.) Timing: Describes the timing of the observation.
4.) Qualifier: Contains additional text, values
or results that describe the observation.
Continued….
• 5.) Rule: Expresses the algorithm or calculation
used to derive date-times or visits. Rule variables
are mainly used in the Trial Design domains.
• Classification of Qualifier Variables:
Qualifier variables are further categorized into
five classes:
1.) Grouping Qualifier: Used to group together a
collection of observations. Example: LBCAT.
2.) Result Qualifier: Describes the specific result
associated with the topic variable. Example:
LBORRES.
Continued….
• 3.) Synonym Qualifier: Contains an alternative name
for a particular variable in an observation. For ex:
AEMODIFY and AEDECOD are synonyms for the topic
variable AETERM.
• 4.) Record Qualifier: Defines additional attributes
of the observation record as a whole. Ex: AEREL.
• 5.) Variable Qualifier: Further describes the value
of a specific variable in a record. Ex: lab units
(LBORRESU).
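The role classification above can be sketched as a simple lookup. This is only an illustration: the table covers a handful of variables, and AESEQ, AESTDTC, LBORRES and LBORRESU are added here as assumed examples of their roles; the function name group_by_role is hypothetical.

```python
from collections import defaultdict

# Roles for a few example variables; an illustrative subset, not the full SDTMIG list.
VARIABLE_ROLE = {
    "STUDYID": "Identifier",
    "USUBJID": "Identifier",
    "AESEQ": "Identifier",
    "AETERM": "Topic",
    "AEMODIFY": "Synonym Qualifier",
    "AEDECOD": "Synonym Qualifier",
    "AEREL": "Record Qualifier",
    "AESTDTC": "Timing",
    "LBCAT": "Grouping Qualifier",
    "LBORRES": "Result Qualifier",
    "LBORRESU": "Variable Qualifier",
}


def group_by_role(variables):
    """Group variable names by role; names not in the table go under 'Unknown'."""
    grouped = defaultdict(list)
    for var in variables:
        grouped[VARIABLE_ROLE.get(var, "Unknown")].append(var)
    return dict(grouped)
```

Grouping the variables of a dataset this way gives a quick check that it has identifiers, one topic variable and the expected timing variables.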
Observation Class
Datasets containing observations are classified into
three classes:
• Interventions: Captures information about
investigational treatments, therapeutic treatments
and procedures. Ex: CM, EX, SU.
• Events: Captures occurrences and incidents that
take place during the trial. Ex: AE, MH, DS, DV.
• Findings: Captures observations resulting from
planned evaluations. Ex: IE, LB, QS, PE, PC,
PP, SC, VS.
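As a minimal sketch, the class of a domain can be looked up from its two-letter code. The mapping below contains only the domains listed on this slide, and the function name observation_class is hypothetical.

```python
from typing import Optional

# Domain codes per observation class, limited to the examples on this slide.
OBSERVATION_CLASS = {
    "Interventions": {"CM", "EX", "SU"},
    "Events": {"AE", "MH", "DS", "DV"},
    "Findings": {"IE", "LB", "QS", "PE", "PC", "PP", "SC", "VS"},
}


def observation_class(domain: str) -> Optional[str]:
    """Return the observation class of a domain code, or None if it is not listed."""
    code = domain.upper()
    for cls, domains in OBSERVATION_CLASS.items():
        if code in domains:
            return cls
    return None
```

A code such as DM returns None here because it is not one of the listed observation-class domains.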
Special Purpose Domain:
• Special purpose domains hold subject-level data and
do not conform to any of the three classes of
observation datasets.
• Examples are:
– Demographics (DM)
– Comments (CO)
– Subject Visits (SV)
– Subject Elements (SE)
Trial Design Model
• Trial Design: The design of a clinical trial is a
plan for which assessments will be performed on
subjects and which data will be collected during the
trial to address the trial's objectives.
• These datasets fall under this model:
– Trial Arms (TA)
– Trial Elements (TE)
– Trial Visits (TV)
– Trial Inclusion/Exclusion Criteria (TI)
– Trial Summary Information (TS)
Special Purpose Relationship Datasets
• Supplemental Qualifiers – SUPPQUAL:
SUPPQUAL datasets capture non-standard variables
and their association with parent records.
• Related Records – RELREC:
RELREC describes the relationship between records
in two or more datasets. For ex: an Adverse Event
record related to a Concomitant Medication record.
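The association between a supplemental qualifier and its parent record can be sketched as below. The SUPPQUAL variables RDOMAIN, USUBJID, IDVAR, IDVARVAL, QNAM, QLABEL and QVAL are standard; the sample values, the QNAM "PEABN" and the function name find_parent are hypothetical and only serve the illustration.

```python
def find_parent(supp_row, parent_rows):
    """Return the parent record that a SUPPQUAL row points to, or None."""
    idvar = supp_row["IDVAR"]
    for row in parent_rows:
        same_subject = row["USUBJID"] == supp_row["USUBJID"]
        same_domain = row["DOMAIN"] == supp_row["RDOMAIN"]
        # IDVARVAL is stored as character, so compare as strings.
        if same_subject and same_domain and str(row.get(idvar)) == supp_row["IDVARVAL"]:
            return row
    return None


# Hypothetical sample data: two PE records and one SUPPPE record.
pe_rows = [
    {"DOMAIN": "PE", "USUBJID": "STUDY01-001", "PESEQ": 1, "PETEST": "Head"},
    {"DOMAIN": "PE", "USUBJID": "STUDY01-001", "PESEQ": 2, "PETEST": "Abdomen"},
]
supp_row = {
    "RDOMAIN": "PE", "USUBJID": "STUDY01-001",
    "IDVAR": "PESEQ", "IDVARVAL": "2",
    "QNAM": "PEABN", "QLABEL": "Abnormal Finding", "QVAL": "Tenderness",
}
```

Here find_parent(supp_row, pe_rows) returns the second PE record, the one with PESEQ equal to 2.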
Continued….
• Supplemental Qualifiers – SUPPQUAL:
Supplemental Qualifiers are always created in the
following situations:
1.) The study collected data that has no standard
SDTM variable and therefore cannot be stored in
the parent domain.
For ex: PE abnormal findings are stored in SUPPPE.
Continued….
2.) An SDTM variable in any dataset has a text value
longer than the 200-character limit. The text is
split so that characters 1-200 stay in the parent
domain and the characters beyond 200 go into the
SUPPQUAL dataset.
The only exception is the Trial Inclusion/Exclusion
(TI) domain. If the value of IETEST is longer than
200 characters, the remaining text goes into the
metadata and is linked from the define.xml.
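The split described above can be sketched as follows. This is a simplified illustration: it cuts at exactly 200 characters without looking for word boundaries, and the QNAM naming (parent variable name plus a running digit) is an assumption of this sketch. The function name split_long_text is hypothetical.

```python
MAX_LEN = 200  # character limit for a parent-domain text value


def split_long_text(varname, text, max_len=MAX_LEN):
    """Return (parent_value, supp_records) for a possibly over-long text value."""
    parent_value = text[:max_len]
    remainder = text[max_len:]
    supp_records = []
    for start in range(0, len(remainder), max_len):
        number = start // max_len + 1
        supp_records.append({
            "QNAM": f"{varname}{number}",  # assumed naming, e.g. MHTERM1, MHTERM2
            "QVAL": remainder[start:start + max_len],
        })
    return parent_value, supp_records
```

A 450-character value yields a 200-character parent value plus two supplemental records of 200 and 50 characters; a value of 200 characters or fewer yields no supplemental records.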
Continued….
• RELREC is created only at the sponsor's request, in
the following cases:
1.) Information was collected about the relationship
between a concomitant medication and an Adverse
Event.
2.) Records in multiple datasets are linked and
there is a scientific rationale behind the link.
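A record-level relationship of this kind can be sketched as a pair of RELREC rows that share one RELID. The variables STUDYID, RDOMAIN, USUBJID, IDVAR, IDVARVAL, RELTYPE and RELID are the standard RELREC variables; the study, subject and RELID values and the function name relrec_rows are hypothetical.

```python
def relrec_rows(studyid, usubjid, relid, records):
    """Build one RELREC row per related record; all rows share the same RELID."""
    return [
        {
            "STUDYID": studyid,
            "RDOMAIN": domain,
            "USUBJID": usubjid,
            "IDVAR": idvar,
            "IDVARVAL": str(idvarval),  # stored as character
            "RELTYPE": "",  # left null for record-to-record relationships
            "RELID": relid,
        }
        for domain, idvar, idvarval in records
    ]


# Hypothetical example: adverse event 3 was treated with concomitant medication 5.
rows = relrec_rows(
    "STUDY01", "STUDY01-001", "AECM1",
    [("AE", "AESEQ", 3), ("CM", "CMSEQ", 5)],
)
```

Both rows carry RELID "AECM1", which is what ties the AE record and the CM record together.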
Sponsor Defined or Custom Domain
• Custom domains are created when a study collects
data that does not follow the SDTM standard and
cannot be included in any standard SDTM domain.
Sponsor-defined (custom) domains allow this data to
be included in the study tabulations.
• Naming of these domains is also left to the
sponsor, but the first letter of the domain code is
assigned according to the observation class, as
defined on the next slide:
Continued….
• Interventions = X-
• Events = Y-
• Findings = Z-
• Based on the data available, we decide which
observation class the data belongs to and name the
domain accordingly.
• For ex: assessment data provides new findings, so
it belongs to the Findings class and the dataset
can be named ZA.
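The naming rule can be sketched as a small helper. The prefix table follows the slide; the function name custom_domain_code and the choice of the second letter are hypothetical and up to the sponsor.

```python
# First letter of a custom domain code, by observation class (as on the slide).
CLASS_PREFIX = {"Interventions": "X", "Events": "Y", "Findings": "Z"}


def custom_domain_code(observation_class, suffix):
    """Build a two-character custom domain code such as 'ZA'."""
    if observation_class not in CLASS_PREFIX:
        raise ValueError(f"unknown observation class: {observation_class!r}")
    if len(suffix) != 1 or not suffix.isalnum():
        raise ValueError("suffix must be a single letter or digit")
    return CLASS_PREFIX[observation_class] + suffix.upper()
```

For the assessment example, custom_domain_code("Findings", "A") gives "ZA".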
Metadata Contents and Attributes
• Core Variables:
a.) Required: These variables must be present in
the dataset and cannot be null for any record.
b.) Expected: These variables must be present in
the dataset but can have null values.
c.) Permissible: These variables should be included
in the dataset as appropriate when the data is
collected. If all records have a null value, the
variable should be dropped.
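The three core categories translate into simple dataset checks. The sketch below assumes a dataset held as a list of dictionaries and a core map using the abbreviations Req, Exp and Perm; the function name check_core and the message wording are hypothetical, and the core values in any call are for illustration only.

```python
NULLS = (None, "")  # values treated as null in this sketch


def check_core(rows, core):
    """Return a list of messages for variables that break the core rules."""
    issues = []
    for var, level in core.items():
        present = any(var in row for row in rows)
        values = [row.get(var) for row in rows]
        if level in ("Req", "Exp") and not present:
            issues.append(f"{var}: {level} variable is missing")
        elif level == "Req" and any(v in NULLS for v in values):
            issues.append(f"{var}: Req variable has null values")
        elif level == "Perm" and present and all(v in NULLS for v in values):
            issues.append(f"{var}: Perm variable is all null, consider dropping")
    return issues
```

An Expected variable with some null values passes, while a Required variable with even one null value is reported.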
Controlled Terminology
• Controlled terminology is the set of allowed values
that governs the content of a variable. (See
Appendix C, Page 271 SDTMIG 3.1.2).
• Almost every SDTM domain contains variables that
always have controlled terminology associated with
them. If the SDTMIG lists a codelist such as ACN, NY,
STENRF or NCOMPLT under Controlled Terms or Formats
for a variable, then all values of that variable
must be populated from that controlled terminology.
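A check against a codelist can be sketched as below. The NY values shown (N, Y, U, NA) are written from memory of the published codelist and should be verified against the current terminology file; the function name invalid_terms is hypothetical.

```python
# Assumed content of the NY (No/Yes response) codelist; verify against the terminology file.
NY = {"N", "Y", "U", "NA"}


def invalid_terms(values, codelist):
    """Return the distinct non-null values that are not in the codelist, sorted."""
    return sorted({v for v in values if v not in (None, "") and v not in codelist})
```

The comparison is case-sensitive, so a collected response such as "Yes" or "y" is reported and must be mapped to "Y" before it is stored.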
Continued….
TerminologySDTM Terminology.xls
• This file contains all controlled terminology
values for SDTM variables, together with their
synonyms. While creating an SDTM domain, the
programmer must look up the allowed values of each
controlled variable in this file and map the
collected responses to them.
Lab_controlled_terminology.xls
Date and Time Variables
• --DTC and --STDTC:
The timing variables --DTC and --STDTC are either
permissible or expected, depending on the dataset,
so they are allowed to be null. As per CDISC
guidelines, all date and time variables must be
represented in ISO 8601 format in every SDTM
domain.
ISO 8601 Format
• As per the FDA guidelines, all dates in SDTM
datasets must use the following format for date and
time values in variables such as --DTC and --STDTC:
YYYY-MM-DDThh:mm:ss
• The value of these variables must also be a
character value.
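Building such a character value can be sketched as follows. The sketch assumes that a partially known date is represented by truncating the string at the first missing component; the function name to_dtc is hypothetical.

```python
def to_dtc(year, month=None, day=None, hour=None, minute=None, second=None):
    """Format date/time parts as an ISO 8601 string, stopping at the first missing part."""
    value = f"{year:04d}"
    if month is None:
        return value
    value += f"-{month:02d}"
    if day is None:
        return value
    value += f"-{day:02d}"
    if hour is None:
        return value
    value += f"T{hour:02d}"
    if minute is None:
        return value
    value += f":{minute:02d}"
    if second is None:
        return value
    return value + f":{second:02d}"
```

A complete value looks like 2010-03-05T14:30:00, while a date with only year and month known comes out as 2010-03.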
ISO 8601 Duration Values
• When a duration is collected instead of dates and
times, it must be stored in ISO 8601 duration
format. The left column shows the duration as
recorded and the right column shows the
corresponding --DUR value.
Duration recorded          --DUR value
2 Years                    P2Y
10 Weeks                   P10W
3 Months 10 days           P3M10D
2 hours before RFSTDTC     -PT2H*
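The conversions in the table can be sketched as below. The sketch assumes that weeks are not mixed with other units and that a leading minus sign marks a duration before the reference point; the function name iso_duration and its before parameter are hypothetical.

```python
def iso_duration(years=0, months=0, weeks=0, days=0,
                 hours=0, minutes=0, seconds=0, before=False):
    """Build an ISO 8601 duration string such as 'P3M10D' or '-PT2H'."""
    sign = "-" if before else ""
    if weeks:
        # Assumption of this sketch: the week form stands alone.
        if any((years, months, days, hours, minutes, seconds)):
            raise ValueError("weeks cannot be combined with other units here")
        return f"{sign}P{weeks}W"
    date_part = "".join(
        f"{value}{unit}" for value, unit in ((years, "Y"), (months, "M"), (days, "D")) if value
    )
    time_part = "".join(
        f"{value}{unit}" for value, unit in ((hours, "H"), (minutes, "M"), (seconds, "S")) if value
    )
    if not date_part and not time_part:
        raise ValueError("duration must have at least one non-zero component")
    return sign + "P" + date_part + ("T" + time_part if time_part else "")
```

The "T" separator is what distinguishes months (P3M) from minutes (PT3M).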
References:
• www.cdisc.org
• http://www.cancer.gov/cancertopics/terminologyresources/CDISC
Questions??
