International Journal of Trend in Scientific Research and Development (IJTSRD)
Volume 7 Issue 4, July-August 2023 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470
Data Management and Analysis in Clinical Trials
S. Reddemma1, Chetana Menda2, Manoj Kumar3
1,2 Pharm D, Student at ClinoSol Research, Hyderabad, Telangana, India
3 M. Pharmacy (Pharmaceutics), Student at ClinoSol Research, Hyderabad, Telangana, India
ABSTRACT
Data management and analysis play a critical role in the successful
conduct of clinical trials. Proper collection, validation, and handling
of data are essential for ensuring the reliability and integrity of study
findings. Data management involves the design and implementation
of data capture tools, such as electronic case report forms (eCRFs), to
efficiently collect and store clinical data. Additionally, data analysis
is a crucial step that involves applying statistical methods to extract
meaningful insights from the collected data. This paper provides an
overview of the key components of data management and analysis in
clinical trials, highlighting the importance of adherence to data
standards, ensuring data quality, and maintaining data security.
Effective data management and analysis not only lead to robust study
outcomes but also contribute to the overall advancement of medical
knowledge and patient care.
KEYWORDS: Data Management, Data Analysis, Clinical Trials,
Electronic Case Report Forms (eCRFs), Data Collection, Data
Validation
How to cite this paper: S. Reddemma | Chetana Menda | Manoj Kumar, "Data Management and Analysis in Clinical Trials", Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-7 | Issue-4, August 2023, pp. 270-288, URL: www.ijtsrd.com/papers/ijtsrd59667.pdf
Copyright © 2023 by author(s) and International Journal of Trend in Scientific Research and Development Journal. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0)
I. INTRODUCTION
A clinical trial aims to address the research question
by producing data that may be used to either confirm
or disprove a theory. The outcome of the
investigation is significantly influenced by the quality
of the generated data. A relevant and essential
component of a clinical trial is clinical data
management (CDM). In the course of their study, all
researchers engage in CDM activities, whether
intentionally or unintentionally. Clinical trials require effective data management to produce trustworthy, high-quality, and statistically sound data while substantially decreasing the number of stages needed for drug development and marketing. As a result, the time between drug development and marketing is greatly shortened.
Clinical data management ensures that studies are carried out, maintained, and analyzed with the right level of quality and cost, and that the data are reliable enough to support the results. CDM is extremely important in the phase of clinical research where study data are gathered, and for bioequivalence research it is necessary for producing high-quality, accurate, trustworthy, and statistically sound datasets [1].
OBJECTIVES
CDM stands for the collection, cleaning, and
management of subject data in compliance with the
rules and regulations. The primary objective of CDM
processes is to produce high-quality data by reducing
errors and missing data while gathering as much data
as is necessary for analysis [1].
To accomplish this aim, best practices are applied to ensure that the data are accurate, reliable, and managed appropriately.
II. DATA MANAGEMENT IN CLINICAL
TRIALS
The whole process of collecting, validating, and
organizing subject research data is carried out in
compliance with established procedures to guarantee
high-quality data that is accurate and comprehensive.
The main objective is to collect as much information as is necessary for the study while minimizing the overall number of errors. To do this, specialized techniques, such as software tools, are used to maintain audit trails that enable rapid identification and rectification of data discrepancies even in large and complicated clinical investigations.
The Data Management Plan (DMP) document defines the CDM
procedures to be followed during the experiment and
offers a step-by-step guide for processing the data
under controlled circumstances. The DMP includes
information on the database architecture, data entry
and tracking policies, quality control practices, SAE
reconciliation policies, discrepancy handling, data
transfer/extraction policies, and database locking
policies.
List of data management activities
1. Data collection
2. CRF collection
3. CRF annotation
4. Database design
5. Data entry
6. Medical coding
7. Data validation.
8. Discrepancy management
9. Database lock
Clinical software applications called databases were developed to make it simpler for the CDM team to perform its various activities. These tools are often easy to use and built in accordance with regulatory standards. Data entry is done using CRF layouts, and the database contains details about the study's objectives, timetables, visits, investigators, sites, and participants. These entry screens are verified using dummy data before being used for actual data collection.
CRF design and data collection
The CRF, which is available in both paper and electronic versions, is used to collect data. In contrast to conventional approaches, which require in-house data entry, e-CRF-based CDM enables investigators or designated individuals to enter data directly at the site. e-CRF approaches are favoured by pharmaceutical companies for their drug development programmes because they minimize errors and expedite discrepancy resolution.
Monitoring CRF
The correctness and completeness of the CRF submissions are checked by the Clinical Research Associate (CRA). The retrieved CRFs are handed over to the CDM team, which tracks and records them. The investigator is contacted for clarification if any data are missing or unclear, and the issue is then resolved.
Data entry
Data entry for paper CRFs obtained from sites is done in accordance with the DMP requirements. Double data entry requires two operators working independently, which helps spot transcription errors and discrepancies. Compared with single data entry, this reduces error rates and aids in verification, reconciliation, and maintaining a cleaner database.
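As a simple illustration of how double-entered data can be reconciled programmatically, the following Python sketch compares two independently keyed files and lists mismatching fields; the file names, keys, and column names are hypothetical.

    # Illustrative sketch: comparing two independently keyed datasets to flag
    # transcription discrepancies. File and column names are hypothetical.
    import pandas as pd

    entry_1 = pd.read_csv("first_pass_entry.csv")   # operator 1
    entry_2 = pd.read_csv("second_pass_entry.csv")  # operator 2

    # Align both passes on subject and visit identifiers before comparing.
    key = ["subject_id", "visit"]
    merged = entry_1.merge(entry_2, on=key, suffixes=("_op1", "_op2"))

    discrepancies = []
    for field in ["systolic_bp", "weight_kg", "dose_mg"]:
        mismatch = merged[merged[f"{field}_op1"] != merged[f"{field}_op2"]]
        for _, row in mismatch.iterrows():
            discrepancies.append({
                "subject_id": row["subject_id"], "visit": row["visit"],
                "field": field,
                "operator_1": row[f"{field}_op1"],
                "operator_2": row[f"{field}_op2"],
            })

    # Each mismatch would be resolved against the source CRF before database lock.
    print(pd.DataFrame(discrepancies))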
Validation of data
Testing the accuracy of data in accordance with
protocol requirements is known as data validation.
Data validity is ensured by edit check programs,
which find errors in entered data. These programs are evaluated using dummy data containing known errors, such as missing data, inconsistent data, out-of-range values, or protocol violations. The system will
indicate discrepancies if the condition is TRUE, and
Data Clarification Forms (DCFs) can be created to
remedy the discovered discrepancies.
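The following Python sketch illustrates the edit-check logic described above: each check flags a discrepancy when its condition is TRUE. The field names and ranges are illustrative only and would in practice come from the study's edit-check specifications.

    # Minimal sketch of programmed edit checks; a TRUE condition flags a discrepancy.
    # Field names and limits are illustrative, not from any specific protocol.
    import pandas as pd

    data = pd.DataFrame({
        "subject_id": ["001", "002", "003"],
        "age": [34, 17, 45],              # assumed protocol range: 18-65
        "visit_date": ["2023-02-01", "2023-01-15", None],
        "systolic_bp": [120, 250, 118],   # assumed plausibility range: 70-220
    })

    edit_checks = [
        ("Missing visit date", data["visit_date"].isna()),
        ("Age outside protocol range 18-65", ~data["age"].between(18, 65)),
        ("Systolic BP outside plausible range 70-220",
         ~data["systolic_bp"].between(70, 220)),
    ]

    # Discrepancies found here would be raised as Data Clarification Forms (DCFs).
    for description, condition in edit_checks:
        for subject in data.loc[condition, "subject_id"]:
            print(f"DCF candidate - subject {subject}: {description}")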
Discrepancy management
Reviewing discrepancies, looking into their causes,
resolving them with documentary evidence, or
deeming them irresolvable, is the process of
discrepancy management, also known as query
resolution. A discrepancy database, which often
comes with audit trails, is used by CDMS to track and
store discrepancies. The CDM team periodically
evaluates discrepancies and marks them as "closed"
to avoid further validation failures. For cleaning data,
discrepancy management is essential and needs to be
constantly monitored.
Database locking
Quality checks are followed by final data validation,
and SAS datasets are finalized after consulting a
statistician. Before database locking, data
management tasks must be finished using a pre-lock
checklist. Clean data is extracted for statistical
analysis after approval, at the moment when the
database is locked [1] [2].
Quality Assurance
In order to ensure compliance with regulatory
requirements, quality assurance in data management
is a dynamic process that foresees errors and flaws in
data generation. The high quality and integrity of the
data must be maintained throughout the clinical
research process.
Quality Control
The quality control process in clinical research
ensures internal consistency through routine
operational checks at each level of the trial process
and data handling to confirm compliance with the trial procedures and the accuracy of the data [3].
III. ELECTRONIC DATA CAPTURE
SYSTEMS (EDC)
Clinical trials employ an electronic platform known
as an Electronic Data Capture (EDC) system to
electronically collect, organize, and store trial data.
EDC systems are designed to replace paper-based data collection methods, which are usually time-consuming, error-prone, and inefficient. Electronic data capture has recently become more common in clinical studies, where EDC systems can improve accuracy and efficiency.
Benefits of EDC
Electronic data capture (EDC) technologies are
beneficial for clinical trials because they improve data
quality, efficiency, data monitoring, patient safety,
and compliance with regulations.
The EDC method also has the following advantages:
It speeds up study completion;
It is more secure;
It is simpler to discover what you need.
Types of EDC systems:
1. Web-based EDC Systems
2. Cloud-Based EDC Systems
3. Hybrid EDC Systems
4. Custom EDC Systems
5. Open-Source EDC Systems
6. Vendor-Supported EDC Systems [4].
Features and functions
To fulfil the needs of researchers and market
developments, EDC software providers regularly
create new features. CRF Designer, Data Entry,
Query Management, and Data Export are examples of
basic functionalities.
Examples of advanced features include Clinical Trial Management System (CTMS) software, end-to-end use of CDISC standards, risk-based monitoring, patient-reported outcomes (PRO) software, and standardized coding dictionaries such as WHO Drug and MedDRA. Other features include randomization of participants and risk management systems.
Automatic query methods, automatic medical word
coding, the creation of customized reports, strong
data cleaning modules, and source data verification
are examples of advanced functionality. Regulatory requirements call for medications and adverse events to be coded against the WHO Drug and MedDRA dictionaries, respectively.
Implementing EDC system
Even though putting an Electronic Data Capture (EDC) system in place can be difficult, clinical trials can gain a great deal from it if it is done with care.
The steps involved in implementing an EDC system
include
1. Defining your EDC needs
2. Select the appropriate EDC solution
3. Implementation of EDC
4. Design the EDC database
5. Test the EDC
6. Implement the EDC.
7. Monitor the EDC system [4][5][6]
Data validation and query management
To ensure accurate, reliable, and high-quality clinical trial data, data cleaning is essential. To streamline data cleaning and reduce manual review and correction, EDC systems provide capabilities such as automatic edit checks and data validation rules. Data cleaning is crucial for regulatory compliance and decision-making because it eliminates biased outcomes and potential patient harm; without it, clinical study findings might be distorted, leading to poor outcomes and inaccurate conclusions. Data cleaning activities include:
1. Detection of errors and discrepancies
2. Correction of errors and discrepancies
3. Documentation of data cleaning activities
Query management is essential for resolving data queries in clinical trials and guaranteeing complete and accurate data. It helps find and eliminate errors, discrepancies, and inconsistencies, enhancing data quality and lowering the risk of regulatory non-compliance. EDC systems provide a query management system for effective query tracking and resolution. Query management activities include:
1. Query identification
2. Developing queries
3. Responding to queries
4. Recording query management activities [7] [8].
IV. DATA STANDARDS AND
HARMONIZATION:
Data standards are established by the regulatory authorities that assess clinical research reports; they constitute written guidelines for the collection, use, and management of data obtained during clinical trials.
CDISC
A nonprofit organization called the "Clinical Data
Interchange Standards Consortium," or CDISC for
short, operates on a global scale and actively develops
data standards utilizing the skills and experience of
volunteers working in the pharmaceutical industry.
CDISC creates and promotes standards to make it
possible for data to be collected, shared, submitted,
and archived for use in the development of medical
and biopharmaceutical products. The consortium
works with international organizations like the
Chinese National Medical Products Administration
(NMPA), the Japanese Pharmaceuticals and Medical
Devices Agency (PMDA), the European Medicines
Agency (EMA), and the U.S. Food and Drug
Administration (FDA) to develop rules and
specifications that affect the standards for both
clinical and nonclinical data. CDISC standards continue to expand and evolve alongside global regulations.
CDISC Domains:
Domains exist for SEND, SDTM, and ADaM data. A
domain is a group of connected observations made on
a single topic that are gathered for all human or
animal test participants in a clinical or nonclinical
study.
Therapeutic PK domains:
For SEND and SDTM, the PK domains of
pharmacokinetic concentrations (PC),
pharmacokinetic parameters (PP), and related records
(RELREC) are crucial.
Non-therapeutic PK domains:
SEND domains comprise PC and PP domains as well
as pool definitions (POOLDEF) and additional
qualifiers for pharmacokinetic concentration
(SUPPPC). While SUPPPC offers additional
qualifiers not included in PC variables, POOLDEF
aggregates and identifies individual animals in a
pooled profile for study.
SEND
A manual for organizing, developing, and formatting
nonclinical data is called SEND, or Standard for the
Exchange of Nonclinical Data. For nonclinical data,
such as data from carcinogenicity studies, repeat
dosage toxicology, single dose toxicology, respiratory
and cardiovascular tests, and specimen assessments
for immunogenicity, the SEND Implementation
Guide (SEND-IG) offers specified regions and
samples.
SDTM
A well-known CDISC standard for organized and structured content in clinical research data is the SDTM, or Study Data Tabulation Model. Based on
the SDTM's structure and metadata, the SDTM
Implementation Guide (SDTM-IG) offers a
standardized set of domains for clinical data
reporting.
ADaM
ADaM, or Analysis Data Model, is a data format for analysis-ready data commonly submitted to the FDA, with SDTM data typically serving as its source. This format differs from the SDTM standard, which concentrates on organizing and mapping data that has been acquired from raw sources [9].
Harmonization of data
Data harmonization is the process of combining data
from several studies into a single, useable dataset in
order to improve comparability and answer research
questions. Critical issues in medical, psychological,
and behavioral research can be addressed by
combining data from multiple-center, independent, or
single-center studies and restricted data sources. Two
different types of harmonization exist.
In prospective harmonization, investigators establish rules for data collection, management, and pooling before the research begins. This method offers flexibility and a common core of measures, and it requires less work than overcoming the drawbacks of retrospective harmonization.
Retrospective harmonization involves combining information from several studies and utilizing the expertise of subject-matter experts to recognize and translate variables into standard definitions and units of measurement [10].
Role of data standards in data sharing and
interoperability
Increasing interoperability
Improving comparability
Increasing discoverability
Enabling aggregation
Enabling linkability [11].
V. STATISTICAL ANALYSIS PLANNING
In clinical trials, statistical analysis planning is a
crucial phase that requires developing an overall
strategy for analyzing trial data. The main objective
of statistical analysis planning is to make sure that the
statistical methods and processes utilized in the
analysis are well-defined, valid, and suitable for
addressing the research objectives [12].
A. Overview of statistical analysis planning in
clinical trials
The SAP is a defined overview of the planned basic and advanced statistical procedures for clinical trial analysis, written both in the study protocol and as a stand-alone document. The SAP is essential and one of the most important confidential regulatory documents in the planning of a clinical study.
According to ICH E9, SAP is typically referred to as
reporting and analysis plans, although in some
organisations it may alternatively be referred to as
data analysis plans (DAP) or statistical analysis plans
(SAP).
In the clinical study report, these analyses will
evaluate the efficacy and safety of Investigational
Medicinal Product in comparison to the standard
drug.
B. Statistical analysis plan (SAP) development
and documentation
Significant guidelines utilized in SAP development
include ICH E9 (International Conference on
Harmonisation of Technical Requirements for
Pharmaceuticals for Human Use) and SPIRIT
(Standard Protocol Items: Recommendations for
Interventional Trials).
The following must be clearly explained in the
SAP: the aims, primary objectives, secondary
objectives, exploratory objectives, primary,
secondary, and exploratory endpoints, trial
population, trial design, sample size calculations with
justifications/assumptions, and randomization
procedures. A SAP must also provide a thorough
explanation of the statistical technique, including
efficacy analysis, safety data analysis, reporting
formats, etc [13].
According to ICH E9 guidance, "the principal
features of the eventual SAP of the data should be
described in the statistical section of the protocol."
A detailed, thorough, and transparent summary of the
intended statistical analyses must be made available
in order to support the reproducibility of research and
dispel worries about misconduct and fraud in clinical
research. According to standard guidelines, SAP
should be kept in a confidential clinical trial master
file and used during regulatory authorization audits to
verify whether statistical records comply with
standards [14].
C. Selection of appropriate statistical methods
and endpoints
A crucial step in the analysis of biomedical data is
choosing a suitable method for statistical analysis. A
poor choice of statistical approach not only causes
major problems during the interpretation of the
findings, but it also has an impact on the study's
conclusion. For every specific situation, statistical
procedures are available in statistics for the analysis
and interpretation of the data. Understanding the
assumptions and conditions of the statistical methods
is necessary for a researcher in order to choose the
best statistical method for data analysis [15].
Data analysis uses two basic statistical methods:
Descriptive statistics, which uses indices like
mean, median, and standard deviation to
summarize data, and
Inferential statistics, which uses statistical tests
like the student's t-test, the ANOVA test, etc. to
make conclusions from the data [16].
The selection of an appropriate statistical method
depends on the following three things:
1. Aim and objective of the study,
2. Type and distribution of the data used, and
3. Nature of the observations (paired/unpaired).
Type and distribution of the data used:
The selection of the statistical test varies as per the data type.

Data type | Statistical test
Nominal, ordinal, or discrete data | Nonparametric methods
Continuous data | Parametric methods (t-test, F test, Pearson correlation coefficient, linear regression) as well as nonparametric methods
Concept of Parametric and Nonparametric
Methods:
The term "parametric" refers to all statistical
techniques used to compare means, whereas the term
"nonparametric" refers to techniques used to compare
variables other than means, such as median, mean
ranks, and proportions. Parametric tests are based on
the assumption that the variable is continuous and
normally distributed. Nonparametric approaches are
employed when the continuous data has a non-normal
distribution or when there are other types of data
besides continuous variables.
The most frequently used parametric methods have nonparametric counterparts, which act as backup analyses when the assumptions of a parametric test are violated [17].
ENDPOINTS:
An endpoint is a specific outcome of a clinical trial
that is statistically evaluated to determine the safety
and effectiveness of the medicine being tested.
Endpoints (or outcomes) are the statistical measures
chosen for each research participant that are required
to meet the objectives.
Examples of endpoints used in clinical trials are blood pressure, weight, survival time, etc. [18].
A clinical trial may have one or more primary,
secondary, and exploratory outcomes.
In order to assess whether the research met its goals
or, in the case of interventional clinical trials, to get
regulatory approval, primary endpoints will be
employed. Secondary endpoints are measurements
that add to our understanding of how a medication
affects the primary endpoint or show additional
impacts on the disease or condition. Exploratory
endpoints might be clinically important events that
are anticipated to occur too infrequently to show a
treatment effect [19].
The design of the clinical trial, the nature of the
illness or condition being treated, and the expected
effect of the medicine under investigation all have an
impact on endpoint selection [20].
D. Sample size calculation and power analysis
The most important phase in the design of a research
study is calculating power and sample size. Power is
the probability that the null hypothesis-which claims
that there are no statistically significant differences
between study groups in the underlying population
for sample estimates like mean, percentage, odds, and
correlation coefficient-will be successfully rejected.
In simpler terms, researchers utilize estimates of
sample size and power to figure out how many
participants are required to answer the study question
(or null hypothesis) [21].
The following details must be taken into account when determining the sample size: (i) alpha (α), (ii) beta (β), and (iii) effect size (ES).
Alpha (or the P-value) is the probability of finding a
difference between the treatments when a difference
does not exist. This is usually expressed as a 5%
chance (i.e., P < 0.05) that the null hypothesis is
falsely rejected. A type I statistical error refers to falsely claiming a difference at P < 0.05, and the risk of this error increases when groups are compared repeatedly. Each time another variable is compared, there is a 5% risk that the treatments will appear to differ due to chance alone. After a number of these variables have been examined, the probability that at least one variable differs between treatments is approximately 5% multiplied by the number of variables compared.
Table: The relationship between the null
hypothesis and the true effects of the treatment
Beta is the probability of failing to find a difference between the treatments when a difference exists; incorrectly accepting the null hypothesis in this way is a type II statistical error. In the statistical literature, the maximum acceptable value of beta is 0.20, which corresponds to a 20% risk that the null hypothesis is incorrectly accepted. This is usually expressed in terms of the power of the study, that is, the probability that the null hypothesis will be rejected if the treatments truly differ. Power is defined as 1−β. For a β of 0.2, the power is 0.8, which is generally regarded as the minimum acceptable power; studies with power <0.8 carry an elevated risk of type II statistical errors.
The effect size (ES) is a measure of the smallest clinically acceptable difference between treatments normalized by the standard deviation of the data: ES = (smallest clinically acceptable difference) / (standard deviation).
The power of a study can also be calculated after the investigation is completed using the actual sample size, effect size, and α value. Power tables using the sample size and ES (calculated from the results of the study) as well as the α value are available in standard textbooks [22].
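As an illustration, the sketch below computes a per-group sample size and a post-hoc power for a two-arm comparison of means using the statsmodels package (assumed to be available); the effect size, alpha, and power values are examples only.

    # Hedged sketch of sample-size and power calculations for a two-arm trial
    # with a continuous endpoint; input values are illustrative.
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()

    # Sample size per group for ES = 0.5, alpha = 0.05, power = 0.80.
    n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
    print(f"Required subjects per group: {n_per_group:.1f}")

    # Post-hoc power for the sample size actually achieved (e.g., 50 per group).
    achieved_power = analysis.solve_power(effect_size=0.5, nobs1=50, alpha=0.05)
    print(f"Power with 50 subjects per group: {achieved_power:.2f}")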
VI. STATISTICAL ANALYSIS
TECHNIQUES
A. Descriptive statistics and data summarization
Descriptive analysis is the process of summarizing
and describing data using various numerical metrics
and graphical representations. These statistics offer a
brief and meaningful summary of the main
characteristics, patterns, and distributions present in a
dataset. It allows you to draw a conclusion about the
distribution of your data and identify similarities
among variables, enabling you to conduct additional
statistical analysis [23].
Types of Descriptive Analysis:
1. Frequency tabulations and distributions:
Frequency tabulation and distributions provide an
efficient counting summary of a sample of data points
for ease of interpretation.
Frequency tabulation is a very commonly applied
procedure used to summarize information from
questionnaires, both in terms of tabulating various
demographic characteristics (e.g., gender, age,
education level, occupation) and in terms of actual
responses to questions (e.g., numbers responding
‘yes’ or ‘no’ to a particular question).
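A brief pandas sketch of frequency tabulation is shown below; the questionnaire variables and responses are invented for illustration.

    # Frequency tabulation of categorical questionnaire data (illustrative values).
    import pandas as pd

    responses = pd.DataFrame({
        "gender": ["F", "M", "F", "F", "M", "F"],
        "smoker": ["yes", "no", "no", "yes", "no", "no"],
    })

    # Counts and relative frequencies for a single demographic variable.
    print(responses["gender"].value_counts())
    print(responses["gender"].value_counts(normalize=True) * 100)

    # Cross-tabulation of responses against a demographic characteristic.
    print(pd.crosstab(responses["gender"], responses["smoker"]))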
2. Central tendency:
Central tendency provides a set of numerical
summary measures that indicate the central, average,
or typical score in a given variable's distribution of
scores. The three most commonly reported measures
of central tendency are the mean, median, and mode.
Measures of central tendency are useful in almost any
type of experimental design, and in any observational
studies where quantitative data are available and must
be summarized.
3. Variability:
The purpose of variance assessment is to indicate the
degree of spread in a sample of scores; that is, how
different the scores are from one another in relation to
a corresponding measure of central tendency. The
three most commonly reported measures of variance
are the range, variance, and standard deviation.
Range is the most basic measure of variability for a
set of data scores. The range is simply the sample's
highest score minus the sample's lowest score. It
gives a very basic idea of the degree of spread in the
scores.
The Variance assesses the variability of scores
relative to the sample mean by using information
from every individual in the sample. Variance
measures the average squared deviation of each score
from the sample mean. Deviation refers to the difference between an observed score value and the sample mean; these deviations are squared before being averaged.
The most often reported measure of variability is the
Standard deviation (SD). The standard deviation is
obtained by taking the variance's square root. The
standard deviation is a measure of how far apart each
score is from the sample mean. In many instances,
parametric statistical methods make use of both the
sample mean and sample standard deviation. The
standard deviation is therefore an essential measure
for both testing hypotheses and describing data.
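The sketch below computes the central-tendency and variability measures described above for a small set of illustrative scores, using pandas.

    # Central tendency and variability for a small illustrative sample of scores.
    import pandas as pd

    scores = pd.Series([4.2, 5.1, 5.1, 6.0, 6.8, 7.3, 9.5])

    print("Mean:", scores.mean())
    print("Median:", scores.median())
    print("Mode:", scores.mode().tolist())
    print("Range:", scores.max() - scores.min())
    print("Variance:", scores.var())           # average squared deviation (n-1 denominator)
    print("Standard deviation:", scores.std())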
4. Correlation:
Correlation coefficients describe the strength of
association between two variables. Correlation coefficients range between -1 and 1; a value of 0 indicates no correlation, while -1 and 1 indicate a perfect association. The square of the
correlation coefficient can be viewed as the
proportion of one variable's variation predicted by the
other variable. Positive correlation coefficients
describe situations in which increases in value of one
of the variables are associated with increases in the
other variable, while negative correlation coefficients
describe situations in which increases in one of the
variables are associated with decreases in the other
[24].
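The following sketch computes Pearson and Spearman correlation coefficients with SciPy (assumed available); the paired measurements are invented.

    # Correlation between two continuous measures on the same subjects (invented data).
    from scipy import stats

    dose_mg = [10, 20, 30, 40, 50, 60]
    response = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]

    r, p_value = stats.pearsonr(dose_mg, response)
    print(f"Pearson r = {r:.2f}, p = {p_value:.4f}")
    print(f"Proportion of variation explained (r squared): {r**2:.2f}")

    # Spearman's rank correlation is a nonparametric alternative.
    rho, p_value = stats.spearmanr(dose_mg, response)
    print(f"Spearman rho = {rho:.2f}, p = {p_value:.4f}")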
5. Graphical methods to display data:
Its main purpose is to visually summarize data sample
characteristics for ease of interpretation.
Bar graphs and Pie charts:
These two graph forms are useful for summarizing
the frequency of occurrence of various values (or
ranges of values) when the data are categorical
(nominal or ordinal level of measurement).
A bar chart summarizes data by using vertical and
horizontal axes. The vertical axis represents the
frequency (number) of occurrence or the relative
frequency (%) of occurrence; the horizontal axis
represents the data categories of interest.
A pie chart visualizes category frequencies more
simply by dividing a circular plot into wedges or
slices whose sizes are proportional to the relative
frequency (%) of occurrence of various data
categories.
Histograms and Frequency polygons:
These two graph forms are useful for summarizing
the frequency of occurrence of various values (or
ranges of values) when the data is essentially
continuous (interval or ratio level of measurement).
Both histograms and frequency polygons summarize
data using vertical and horizontal axes. The vertical
axis represents the frequency (number) of occurrences
or the relative frequency (%) of occurrences, while
the horizontal axis represents data values or ranges of
values of interest. The frequency polygon employs
lines and points to represent frequency, whereas the
histogram uses bars of variable heights.
Line graphs:
The line graph has a similar appearance to the frequency polygon but is far more adaptable in terms of data summarization. Rather than plotting frequencies, it summarizes data, such as averages or means, for various groups of people. As a result, one major application of the line graph is to display scores on a specific variable according to membership in the categories of a second variable.
Scatter plots:
In correlational research, scatterplots are particularly
useful for demonstrating the link between two
interval- or ratio-scaled variables or measures of
interest obtained on the same people. In a scatterplot,
all data point pairs in the sample are graphed, with
one variable selected to be represented on the
horizontal axis, the second variable selected to be
represented on the vertical axis.
A scatterplot's shape and tilt reveal visual information
about the direction and strength of the association
between the two variables. Positive relationships are
shown by a cloud of dots that are often directed
upward and towards the right side of the graph. A
cloud of points with a general downward tilt towards
the right side of the graph indicates a negative
relationship [25].
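For illustration, the matplotlib sketch below draws a bar chart, a histogram, and a scatter plot from invented values of the kinds of variables discussed above.

    # Simple examples of the graph types discussed above (invented data).
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, 3, figsize=(12, 3))

    # Bar chart of categorical frequencies.
    axes[0].bar(["Mild", "Moderate", "Severe"], [24, 11, 3])
    axes[0].set_title("Adverse event severity")

    # Histogram of a continuous variable.
    axes[1].hist([118, 121, 125, 130, 131, 135, 140, 142, 150, 155], bins=5)
    axes[1].set_title("Systolic BP distribution")

    # Scatter plot of two continuous measures on the same subjects.
    axes[2].scatter([55, 60, 62, 70, 75, 80], [110, 118, 121, 130, 138, 145])
    axes[2].set_title("Weight vs systolic BP")

    plt.tight_layout()
    plt.show()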
B. Inferential statistics and hypothesis testing
Inferential statistics are frequently used to compare
treatment group differences. Inferential statistics
compare treatment groups using measures from the
experiment's sample of subjects and draw conclusions
about the larger population of subjects.
Note that inferential statistics usually suggest but
cannot absolutely prove an explanation or cause-and-
effect relationship [26].
Most inferential statistics are based on the
principle of a test-statistic value. This value, coupled
with the degrees of freedom (a sample size measure)
and the rejection criteria, is used to assess whether
there are differences between the treatment groups.
The larger the sample size, the more likely it is that a
statistic will show that differences exist between
treatment groups.
The null hypothesis, H₀, is first proposed as a written
statement or a mathematical expression. The most
common null hypothesis is that "no difference(s) exist
between the groups, and they all come from the same
population." Second, an alternative hypothesis, H₁, is
proposed, which will be accepted if there is sufficient
evidence to reject the null hypothesis i.e., H₁ will be
“the two populations are different”. The tests all
generate a significance probability (p value, or SP)
that shows the chance that the observed value of the
test statistic belongs to the distribution specified by
the null hypothesis for the test.
A p value of 0.5 indicates that there is a 50% chance
that the observation fits the null hypothesis, or one in
two chance, whereas a p value of 0.05 indicates that
this probability is only 5%, or one in twenty chance.
A one-in-two chance is not low enough to be certain
that the null hypothesis is incorrect, whereas a one-in-
twenty chance makes it far more likely. At this level,
we may agree that the null hypothesis is false; hence,
a p value of 0.05 is commonly used as the 'cut-off'
probability. p values less than 0.05 are considered
good evidence against the null hypothesis; values
greater than this are not.
Before doing an inferential test, it is important to
understand the sort of data being analyzed as well as
whether the data, or modified data, is regularly
distributed. If so, parametric tests can be employed to
analyze the data; otherwise, nonparametric tests can
be utilized [27].
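As a worked illustration of this logic, the sketch below compares two treatment groups with an independent t-test and, as a nonparametric alternative, the Mann-Whitney U test, using SciPy; the outcome values are invented.

    # Comparing two treatment groups with parametric and nonparametric tests.
    from scipy import stats

    control = [5.2, 6.1, 5.8, 6.4, 5.9, 6.0, 5.5]
    treated = [6.8, 7.2, 6.9, 7.5, 7.1, 6.6, 7.0]

    t_stat, p_value = stats.ttest_ind(treated, control)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
    if p_value < 0.05:
        print("Reject the null hypothesis of no difference between groups.")

    # Mann-Whitney U test: nonparametric counterpart of the independent t-test.
    u_stat, p_value = stats.mannwhitneyu(treated, control)
    print(f"U = {u_stat:.1f}, p = {p_value:.4f}")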
C. Survival analysis and time-to-event endpoints
The time until an event happens is the primary
endpoint of interest in survival analysis, a statistical
tool for data analysis. It's commonly utilized in
medical research, particularly clinical trials and
observational studies. In clinical research, time-to-
event endpoints, also known as survival endpoints,
are frequently used to determine how long it takes for
an event to occur. These endpoints give important
information regarding the time it takes for an event,
such as death, illness recurrence, or treatment failure,
to occur [28].
Here are some key concepts in survival analysis and
time-to-event endpoints:
Survival Time: Survival time refers to the duration
from a defined starting point (e.g., diagnosis,
treatment initiation) until the occurrence of a specific
event or endpoint. The event of interest could be
death, disease recurrence, progression, or any other
predefined outcome.
The survival function, S (t), expresses the probability
that a person will survive longer than some specified
time t. It expresses the probability that the random
variable T will exceed the specified time t. The
survival function is fundamental to a survival
analysis.
Kaplan-Meier Estimator: The Kaplan-Meier
estimator is a nonparametric method for estimating
the survival function, which represents the probability
of surviving beyond a certain time point. It considers
censored observations and produces survival
estimates at various time periods.
Survival Curve: The survival curve is a graphic
representation of the estimated survival function over
time. It represents a probability of survival or the
occurrence of an event as a function of time. To
analyse differences in survival outcomes, the survival
curves can be compared between different therapy
groups or patient subgroups.
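A minimal sketch of Kaplan-Meier estimation and a log-rank comparison of two groups is shown below, assuming the lifelines package is available; the survival times and event indicators are invented.

    # Kaplan-Meier estimation and log-rank comparison with lifelines (invented data).
    # event_observed = 1 means the event occurred, 0 means the time is censored.
    from lifelines import KaplanMeierFitter
    from lifelines.statistics import logrank_test

    time_treated = [5, 8, 12, 12, 16, 20, 24]
    event_treated = [1, 1, 0, 1, 0, 1, 0]
    time_control = [3, 5, 7, 9, 11, 14, 18]
    event_control = [1, 1, 1, 0, 1, 1, 1]

    kmf = KaplanMeierFitter()
    kmf.fit(time_treated, event_observed=event_treated, label="treated")
    kmf.plot_survival_function()          # survival curve for the treated group
    print(kmf.median_survival_time_)

    # Compare the two survival curves with the log-rank test.
    result = logrank_test(time_treated, time_control,
                          event_observed_A=event_treated,
                          event_observed_B=event_control)
    print(f"Log-rank p = {result.p_value:.4f}")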
Censoring: Censoring occurs when the event of
interest for some individuals has not occurred by the
end of the study or when they are lost to follow-up.
Censoring can be either right-censoring (the event did
not occur by the end of the study) or interval-
censoring (the event occurred between two
observation times, but the exact time is unknown).
Censored observations provide only partial
information and must be handled carefully in survival analysis.
Hazard Function: The hazard function (also known
as the hazard rate) is a measure of the instantaneous
rate at which events occur at a certain period,
assuming the individual has survived up until that
time. It shows the probability of experiencing an
event in a short time interval, assuming survival up to
that time.
Hazard ratio (HR): The hazard ratio (HR) is similar
to the concept of relative risk. It has been used to
characterize the results of therapeutic studies in which
the question is whether treatment can reduce the
duration of an illness. The hazard ratio is an estimate
of the ratio of hazard rate in the treated and control
groups [29] [30].
Cox Proportional Hazards Model:
A common regression model in survival analysis is
the Cox proportional hazards model. It enables
evaluation of the relationship between covariates
(e.g., treatment, age, and gender) and the hazard
function while assuming that the hazard ratio remains
constant all over time. The Cox model provides
hazard ratios, which quantify the effect of a covariate
on the probability of an event occurring.
Cox regression is a semi-parametric approach since the baseline hazard function, h0(t), does not need to be specified. The Cox proportional hazards model makes two assumptions: the hazard ratio of any two individuals is constant over time, and the covariates themselves are time-independent. This means that the hazard functions for any two individuals are proportional at any point in time.
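A minimal sketch of fitting a Cox proportional hazards model with the lifelines package (assumed available) is shown below; the covariates and data values are hypothetical, and real datasets would be much larger.

    # Cox proportional hazards regression with lifelines (hypothetical data).
    import pandas as pd
    from lifelines import CoxPHFitter

    df = pd.DataFrame({
        "time": [5, 8, 12, 3, 9, 14, 20, 7],
        "event": [1, 1, 0, 1, 1, 0, 0, 1],        # 1 = event, 0 = censored
        "treated": [1, 1, 1, 0, 0, 0, 1, 0],
        "age": [61, 55, 48, 70, 66, 52, 59, 63],
    })

    cph = CoxPHFitter()
    cph.fit(df, duration_col="time", event_col="event")
    cph.print_summary()                  # hazard ratios appear as exp(coef)

    # Checking the proportional hazards assumption discussed above.
    cph.check_assumptions(df)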
Log-Rank Test:
The log-rank test is a popular statistical test for
comparing survival curves between two or more
groups. It determines whether there are statistically
significant differences in survival times across
groups. The log-rank test considers both observed and
censored observations [31].
Survival analysis and time-to-event endpoints in
clinical research provide useful insights into the
probability and timing of events of interest. They
enable the assessment of therapy efficacy, the
identification of prognostic variables, and an
understanding of disease development. These
methods assist researchers and clinicians in making
informed decisions about patient care and contribute
to the advancement of evidence-based medicine.
D. Regression analysis and modeling
Regression models are increasingly being used in
clinical and epidemiologic investigations to assess
therapy efficacy, investigate risk variables,
investigate prognostic patterns, and make predictions
for specific patients, among other things.
Two classes of models have become predominant: models for binary responses, such as in-hospital death or the presence/absence of a certain condition, and models for censored continuous responses, such as time until death or time to therapeutic response in a sample of patients, not all of whom may have died or responded.
There are four kinds of regression model
assumptions:
1. The participants of the study are a random sample
of the population from whom the inference is to
be drawn, using independent observations.
2. The response variable's distribution has certain
characteristics. The logistic model makes no such
assumption, whereas the Cox model assumes that
the hazard functions for two individuals are
proportionate over time, or that the log[-log
survival] or log hazard curves for two individuals
are equidistant over time.
3. The function that connects a prediction to a
response has a specific structure. In its most basic
form, the logistic model assumes that a
continuous predictor is linearly connected to the
log odds of the result.
4. Predictors act additively unless "interaction"
variables are included in the model [32].
Overview of regression analysis and modeling in
clinical trials:
Linear Regression: Linear regression can be used
in clinical trials to assess the influence of a
treatment on a continuous outcome while
controlling for other variables. It can, for
example, be used to evaluate the effect of a drug
dosage on blood pressure levels while controlling
for variables such as age, gender, and baseline
blood pressure.
Logistic Regression: Logistic regression is
typically employed when the outcome variable is
binary or categorical (e.g., response vs. non-
response, presence vs. absence of a disease). In
clinical trials, logistic regression can be used to
model treatment response or the occurrence of
adverse events.
Cox proportional hazards regression: Survival
analysis uses Cox proportional hazards regression
to assess the relationship between predictor
variables and time-to-event outcomes. In clinical
trials, Cox regression is commonly used to
evaluate the effects of treatments or prognostic
factors on survival outcomes.
Model Selection and Validation: In clinical trials,
it is important to carefully select variables and
adjust for potential confounders before applying
regression analysis and modeling [33].
In clinical trials, regression analysis and modeling
provide useful insights into the correlations between
variables, treatment effects, and outcome predictions.
Researchers can use these strategies to account for
confounding effects, adjust for covariates, and make
evidence-based decisions. On the other hand, proper
model development, validation, and interpretation are
critical to ensuring the correctness and
trustworthiness of the outcomes.
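To illustrate one of the models listed above, the following sketch fits a logistic regression for a binary responder outcome adjusted for age, using statsmodels on simulated data; the variable names and simulation parameters are assumptions made purely for illustration.

    # Logistic regression for a binary outcome, adjusted for a covariate.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n = 200
    df = pd.DataFrame({
        "treated": rng.integers(0, 2, n),
        "age": rng.normal(60, 10, n).round(),
    })
    # Simulate a binary response whose log-odds depend on treatment and age.
    log_odds = -2.0 + 1.0 * df["treated"] + 0.02 * df["age"]
    df["responder"] = rng.binomial(1, 1 / (1 + np.exp(-log_odds)))

    model = smf.logit("responder ~ treated + age", data=df).fit()
    print(model.summary())
    print(np.exp(model.params))   # odds ratios for treatment and age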
E. Subgroup analysis and stratification
In general, a subgroup is a subset or portion of a study
population that is distinguished by a particular
characteristic or set of characteristics.
Subgroup analysis is defined as any data analysis
focused on a selected subgroup. Subgroup analysis
aims at characterizing observed differences among
multiple subgroups, such as comparing treatment
differences in a trial for different patient subgroups
defined by gender, age at entry, and other baseline
characteristics. It is a type of exploratory data
analysis in which the goal is to discover a subgroup of
people who can account for an observed difference,
such as in a study to evaluate whether or not an
observed treatment difference can be explained by
some subgroup.
Stratification:
Stratification is the deliberate and defined allocation
of participants to distinct treatment groups based on
specific predefined variables or characteristics. A
stratification variable is also a subgrouping variable.
The goal is to ensure that essential parameters are
represented fairly across treatment arms, which can
increase internal validity and reduce confounding
effects.
Stratification in trials is done at "randomization time"
to guarantee that the treatment groups are comparable
with reference to the stratification variable.
Comparability is achieved by ensuring that the
treatment groups have the same mix of individuals in
terms of the stratification variable. For example, if gender is a stratification variable, each treatment group will have the same proportion of males and females [34].
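A simplified sketch of stratified block randomization is shown below: within each stratum, treatment labels are drawn from shuffled, balanced blocks so that each stratum receives a similar mix of arms. The subject identifiers and block size are illustrative.

    # Simplified stratified block randomization (illustrative only).
    import random

    random.seed(42)

    def stratified_block_assignment(subjects, block=("A", "A", "B", "B")):
        """subjects: list of (subject_id, stratum) pairs."""
        assignments = {}
        pending = {}                      # unused labels remaining per stratum
        for subject_id, stratum in subjects:
            if not pending.get(stratum):
                labels = list(block)      # start a new shuffled, balanced block
                random.shuffle(labels)
                pending[stratum] = labels
            assignments[subject_id] = pending[stratum].pop()
        return assignments

    subjects = [("S01", "F"), ("S02", "M"), ("S03", "F"), ("S04", "F"),
                ("S05", "M"), ("S06", "M"), ("S07", "F"), ("S08", "M")]
    print(stratified_block_assignment(subjects))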
F. Sensitivity analysis and missing data handling
Sensitivity analysis examines the impact of different
assumptions or variations in the analysis to determine
the robustness of study findings or conclusions. It
helps in determining the stability of data and the
impact of important parameters on study outcomes. If
the sensitivity analyses produce results that are
consistent with the primary results, researchers can be
confident that the assumptions used in the primary
analysis had little impact on the outcomes, giving
support to the trial findings.
A particular analysis is considered a sensitivity
analysis if it meets the following criteria:
1. The proposed analysis aims to respond to the
same question as the primary analysis,
2. There is a possibility that the proposed analysis
will yield results that differ from those of the
primary analysis, and
3. There would be uncertainty as to which analysis
to believe if the proposed analysis produced
results that differed from the primary analysis.
These criteria can help guide sensitivity analysis
and identify what to consider when assessing
sensitivity analysis results [35].
Handling of Missing data:
Although researchers make every effort to avoid
missing data, it is present in almost every study.
Ignoring missing data in statistical analysis can result
in significantly biased study outcomes. Rubin was the
first to create a framework of several sorts of missing
data (missing data mechanisms) that are important in
determining the next stages in missing data handling.
The three missing data mechanisms are missing
completely at random (MCAR), missing at random
(MAR), and missing not at random (MNAR).
MCAR means that missing values are distributed at
random across the data sample. The cause of missing
data is unrelated to study variables or outcomes. Data
on blood pressure, for example, are lacking because
some persons were unable to visit the study centre
due to, say, a public transport strike; these missing
data are MCAR.
The term MAR denotes that the probability of
missing data is related to other variables. For
example, when more data on blood pressure of
persons with high BMI are absent, these missing data
are referred to as MAR.
MNAR occurs when the probability of missing data is
determined by the variable's values. This happens
when the patients with the highest blood pressure
values do not come to the research centre. When
missing data are MNAR, there is no straightforward
way to obtain meaningful results. One possibility is to
conduct a large number of sensitivity studies to
evaluate the influence of missing data on study
findings.
Multiple imputation (MI) method is used to handle
the missing data. MI consists of three phases:
imputation, analysis, and pooling. In the imputation
phase, each missing value is substituted by several
different values, which leads to multiple imputed
datasets. In the MI analysis phase, the multiple
datasets are analyzed with the appropriate statistical
approach, and the results are summarized into one
final estimate according to Rubin's guidelines during
the pooling phase [36].
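The sketch below illustrates the three MI phases in a simplified way using scikit-learn's IterativeImputer (assumed available); only point estimates are pooled here, whereas a full analysis would pool standard errors according to Rubin's rules.

    # Simplified illustration of multiple imputation: impute, analyze, pool.
    import numpy as np
    import pandas as pd
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401
    from sklearn.impute import IterativeImputer

    df = pd.DataFrame({
        "bmi": [22.1, 31.4, 27.8, 24.0, 35.2, 29.5, 21.7, 33.0],
        "sbp": [118.0, np.nan, 135.0, 121.0, np.nan, 142.0, 115.0, np.nan],
    })

    estimates = []
    for i in range(5):                                    # imputation phase
        imputer = IterativeImputer(sample_posterior=True, random_state=i)
        imputed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)
        estimates.append(imputed["sbp"].mean())           # analysis phase

    print("Pooled mean SBP:", np.mean(estimates))         # pooling phase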
Missing data handling and sensitivity analyses help
ensure the validity and reliability of study results by
accounting for the impact of missing data on the
analysis's conclusions.
VII. SAFETY AND EFFICACY ANALYSIS
A. Analysis of safety endpoints and adverse
events
An important part of clinical research and drug
development is the analysis of safety endpoints and
adverse events. It involves examining the occurrence
of adverse events and their potential impact on
patients to determine the safety profile of a medical
intervention, such as a medicine or treatment.
Safety endpoints are specific events or measures used
to evaluate the safety of a treatment. These endpoints
may include adverse events, which are any untoward
medical occurrences that occur during treatment.
Adverse reactions might range from moderate to
severe or even life-threatening. Analyzing safety
endpoints involves collecting and documenting
adverse events reported by study participants,
categorizing them according to severity and relevance
to therapy, and estimating their incidence and overall
impact.
Analysing safety endpoints and adverse events
usually includes calculating the incidence rates of
adverse events, determining the severity and duration
of each event, investigating potential patterns or
trends, and comparing the safety profile of the
treatment under study to a control group or reference
standard. To analyze the significance of recorded
adverse events and their association with the
treatment, statistical approaches such as chi-square
testing or logistic regression may be used.
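As a simple illustration, the sketch below compares adverse event incidence between two arms with a chi-square test using SciPy; the counts are invented.

    # Chi-square comparison of adverse event incidence between two arms (invented counts).
    from scipy.stats import chi2_contingency

    #                  AE present  AE absent
    contingency = [[18,         82],    # treatment arm (n = 100)
                   [9,          91]]    # control arm (n = 100)

    chi2, p_value, dof, expected = chi2_contingency(contingency)
    print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")

    # Crude incidence rates per arm.
    print("Treatment AE rate:", 18 / 100)
    print("Control AE rate:", 9 / 100)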
B. Analysis of efficacy endpoints and treatment
outcomes
Analyzing efficacy endpoints and treatment outcomes
evaluates the effectiveness of medical intervention in
attaining its intended purpose or therapeutic goals.
Efficacy endpoints are specific measurements or
outcomes that represent a treatment's therapeutic
effect.
Depending on the nature of the study, efficacy
endpoint analysis may include examining various
parameters such as symptom improvement, disease
progression, or overall survival rates. In most cases,
these endpoints are set and outlined in the study
protocol or statistical analysis plan.
To examine efficacy endpoints, researchers collect
relevant information from study participants, compare
outcomes between treatment groups (e.g.,
experimental vs. control), and use statistical analysis
to establish the significance of observed differences.
T-tests, analysis of variance (ANOVA), survival
analysis, and regression models are common
statistical approaches used in efficacy analysis.
Treatment adherence, patient compliance, and
potential confounding variables that may influence
treatment outcomes are all elements considered in the
analysis of efficacy endpoints. Subgroup analyses may also be used to see whether treatment efficacy varies across patient populations or demographic factors [37].
C. Data monitoring committees and interim
analysis
Data monitoring committees (DMCs), also known as
Data Safety Monitoring Boards (DSMBs) or
independent monitoring committees, are independent
bodies of experts who oversee clinical trial progress
and safety. To assure participant safety and the
study's scientific integrity, they provide an additional
level of oversight and monitoring.
Interim analyses, which are scheduled evaluations of
accumulating research data performed at pre-
specified time points during a clinical trial, are
performed by DMCs. Interim analyses provide for an
early assessment of the study's progress, safety, and
efficacy. The basic goals of interim analyses are to
determine whether the trial should proceed as
planned, modify the protocol, or cancel the study
prematurely owing to safety or efficacy issues.
The DMC reviews unblinded data related to safety
endpoints and efficacy outcomes during interim
analyses. They examine adverse event rates, treatment
effects, and statistical studies to assess the overall
risk-benefit balance of the intervention. Based on the
interim analysis findings, the DMC's
recommendations assist the research sponsor or
investigators in making informed decisions about the
trial's continuation or modification.
D. Independent statistical analysis and review
Independent statistical analysis and review play a
crucial role in ensuring the validity, reliability, and
transparency of clinical trial data. Apart from the
study investigators, independent statisticians or
statistical review committees conduct a thorough
investigation of the study's statistical methodologies,
data collection, and analytic processes.
Their responsibilities include ensuring the accuracy
and consistency of the obtained data, evaluating the
appropriateness of the statistical methods used in the
analysis, and ensuring the validity of the study
conclusions. To validate the presented conclusions,
independent statisticians may conduct further
analyses or reanalyze the data.
Potential biases or errors in the study design,
collection of data, or analysis can be detected and
addressed by performing an independent statistical
analysis and review. This procedure boosts the
credibility of the study's findings and ensures that the
conclusions generated from the data are accurate and
reliable. Independent statistical analysis and review
are especially important in regulatory submissions
when other experts may be involved in evaluating the
methodology and data quality of the study [38].
VIII. DATA MONITORING AND QUALITY
ASSURANCE
The quality assurance and monitoring program is
crucial for ensuring high-quality data acquisition and
reporting in study operations. It should be developed
and implemented by coordinating center
investigators, with the support of study leadership and
field site personnel. Data quality monitoring should
cover the entire data collection, transmission, storage,
and analysis process, with summary reports prepared
and distributed to study leadership.
A. Data monitoring committees and their role
The Data and Safety Monitoring Committee (DSMC) is an independent body that makes recommendations on whether a trial should be continued or terminated based on safety and effectiveness factors. Its major goal is to protect patient safety, and the DSMC is therefore normally not blinded to treatment assignment.
up of three to five doctors, statisticians, chemists, and
ethicists. The DSMC must create statistical analysis
guidelines at the start of the trial, but the DSMC
cannot depend primarily on statistical rules. The
DSMC must maintain a current Endpoint Committee
and assure appropriate data quality.
B. Monitoring of data quality and integrity
For organisations that rely on precise, dependable
data for reporting, compliance, and customer
satisfaction, it is essential to uphold high standards of
data quality and integrity. Data quality and integrity
are assessed by examining factors such as accuracy,
completeness, timeliness, consistency, validity, and
relevance.
Measures that help assure data quality include prioritising data quality and defining metrics for it, investigating failures, investing in internal training, creating data governance guidelines, auditing processes and assigning data stewards, keeping security requirements up to date, implementing a centralised source of truth, integrating and automating data streams, and using the cloud to ensure consistent data management [39].
C. Data cleaning procedures and validation
Data cleaning is a three-stage process that involves
screening, diagnosing, and editing suspected data
abnormalities. It is more efficient to detect errors by
actively searching for them. In small studies, there
may be little or no distinction between databases and
analysis datasets [40].
1. Screening phase
Screening data involves distinguishing four types of
oddities: lack or excess of data, outliers, strange
patterns in distributions, and unexpected analysis
results. Screening methods need not only be statistical
but can also be based on investigator experience, pilot
studies, literature evidence, or common sense. To
make screening objective and systematic, researchers
should examine data with simple descriptive tools,
predefine expectations, plan applications, and
compare data with screening criteria. Erroneous
inliers can be identified using scatter plots, regression
analysis, or consistency checks.
Screening methods include
Checking of questionnaires using fixed
algorithms.
Validated data entry and double data entry.
Browsing of data tables after sorting.
Printouts of variables not passing range checks
and of records not passing consistency checks.
Graphical exploration of distributions: box plots,
histograms, and scatter plots [41].
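The following sketch illustrates two of the screening methods listed above: a predefined range check and the interquartile-range rule that underlies box plots; the values and limits are illustrative only.

    # Screening sketch: plausibility range check plus the IQR (box-plot) rule.
    import pandas as pd

    weights_kg = pd.Series([58, 61, 64, 66, 67, 69, 72, 75, 78, 143])

    # Predefined plausibility range (soft cutoff) from the screening plan.
    out_of_range = weights_kg[(weights_kg < 30) | (weights_kg > 200)]

    # IQR rule: values beyond 1.5 x IQR from the quartiles are flagged.
    q1, q3 = weights_kg.quantile([0.25, 0.75])
    iqr = q3 - q1
    outliers = weights_kg[(weights_kg < q1 - 1.5 * iqr) | (weights_kg > q3 + 1.5 * iqr)]

    print("Out-of-range values:", out_of_range.tolist())
    print("IQR-rule outliers:", outliers.tolist())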
2. Diagnostic phase
This phase aims to clarify the nature of data points,
patterns, and statistics. Possible diagnoses include
erroneous, true extreme, true normal, or idiopathic.
Some data points are logically or biologically
impossible, requiring both soft and hard cutoffs [42].
In some cases, a combination of diagnostic
procedures may be necessary.
To ensure consistency in data flow, access well-
archived and documented data and justify any
changes.
In the diagnostic phase, investigators must use
subject-matter knowledge to identify acceptable
values. Correcting input errors and reviewing quality
assurance procedures can help. The diagnostic phase
is labor-intensive, with budgetary, logistical, and
personnel requirements often underestimated.
3. Treatment phase
Researchers must decide how to handle problematic
observations, including errors, missing values, and true
values. They can correct impossible values, delete them if
they cannot be corrected, or leave them unchanged. For
continuous biological variables measured in duplicate,
accuracy can be enhanced by taking the average of the two
values.
For true extreme values and suspect values, the
investigator should examine their influence on
analysis results and use statistical methods to evaluate
their impact. Some authors recommend keeping true
extreme values in the analysis, but many exceptions
exist [43]. True extreme values may be excluded due
to unexpected extraneous processes or protocol-
prescribed exclusion criteria.
Data validation
The Society for Clinical Data Management's
guidelines for good clinical data management
practices lack focus on data quality standards for
clinical trial data.
The Data Validation stage involves identifying
missing data through standard data cleaning reports. It
is crucial to understand the difference between
handling missing data for data cleansing purposes and
for efficacy and safety analysis. Data reconciliation
occurs at the end of clinical trials, comparing two sets
of records to ensure accuracy and validity. Questions
during reconciliation should be handled in the same
manner as clinical queries. Standard operating
procedures and quality analysis should be part of
every study, regardless of the endpoints [44] [45].
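The sketch below illustrates the reconciliation idea with a hypothetical comparison of adverse event records held in the clinical database and in a safety database; the identifiers, fields, and merge keys are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical reconciliation of adverse events between the clinical
# database and the safety database; fields are illustrative.
clinical_db = pd.DataFrame({
    "subject_id": ["001", "002", "003"],
    "ae_term": ["Headache", "Nausea", "Syncope"],
    "onset_date": ["2023-02-01", "2023-02-03", "2023-02-10"],
})
safety_db = pd.DataFrame({
    "subject_id": ["001", "003", "004"],
    "ae_term": ["Headache", "Syncope", "Rash"],
    "onset_date": ["2023-02-01", "2023-02-11", "2023-02-12"],
})

# Outer merge keeps events present in only one of the two sources.
recon = clinical_db.merge(safety_db, on=["subject_id", "ae_term"],
                          how="outer", suffixes=("_clin", "_safety"),
                          indicator=True)

# Events missing from one source, and events whose onset dates disagree,
# become queries handled like any other clinical query.
missing = recon[recon["_merge"] != "both"]
mismatched_dates = recon[(recon["_merge"] == "both")
                         & (recon["onset_date_clin"] != recon["onset_date_safety"])]
print(missing)
print(mismatched_dates)
```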
D. Handling of protocol deviations and missing
data
Missing data is identified during data validation using
common data cleaning reports. It's crucial to
differentiate between treating missing data for data
cleansing, efficacy, and safety analysis. Data
cleansing should gather missing information on
CRFs, and data loading should be properly
performed.
Major (serious) PD: deviations that may affect the
subject’s rights, safety, or well-being; completeness
and accuracy of study data.
The research subject received the wrong treatment
or dose.
Subjects did not meet inclusion criteria but were
not withdrawn from the study (e.g. age
requirements, certain health conditions, test
results out of specified range, etc.)
The research subject received an excluded,
concomitant medication.
Breaches of confidentiality
Minor (non-serious): Minor protocol deviations do not
significantly impact subjects' rights, safety, or well-being,
or the completeness and reliability of study data, and are
not eligible for prospective review.
Participants do not show up for scheduled
research visits.
Study procedures conducted out-of-sequence.
Failure to perform required lab tests,
measurements, or evaluations.
Both emergent and non-emergent deviations are
possible. According to GCP guidelines and 21 CFR
312.56(d), the sponsor must notify the reviewer as
soon as possible if a deviation occurs in an
emergency situation, such as when a participant's life
or physical well-being is in danger. The FDA will, upon
request, consult with a sponsor about the necessity of
ending an investigation.
The term "change in research" (for medical research)
should be used to describe non-emergent deviations
that constitute a significant modification in the
approved Protocol.
From the perspective of data management, there are
six main groups of Protocol Deviation:
1. Protocol Inclusion/Exclusion criteria
2. Discontinuation of treatment
3. Compliance
4. Study drug related
5. Medication related
6. Pain related [46].
IX. REGULATORY CONSIDERATIONS
AND REPORTING:
A. Regulatory requirements for data
management and analysis
It is necessary to adhere to rules and standards while
managing and analyzing data. Effective CDM
procedures and electronic data capture standards must
be followed since the pharmaceutical sector depends
on electronically generated data to evaluate
medications.
The Society for Clinical Data Management (SCDM)
has developed a document called the Good Clinical
Data Management Practices (GCDMP) Guidelines
that offers advice on the regulatory-compliant
approved CDM practices.
Study Data Tabulation Model Implementation Guide
for Human Clinical Trials (SDTMIG) and the Clinical
Data Acquisition Standards Harmonization (CDASH)
standards, available free of cost from the CDISC
website, are some of the standards developed to
support the acquisition, exchange, submission, and
archival of clinical research data.
Data validation is done to check whether the collected
data are in accordance with the protocol and any approved modifications.
Edit check programs are built to identify the
discrepancies in the entered data, which are
incorporated in the database, to ensure data validity.
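A minimal sketch of how such edit checks might be expressed is shown below; it is not the API of any particular EDC system, and the field names, date logic, and age limits are assumed for illustration only.

```python
import pandas as pd

# Each rule pairs a query text with the records that violate it;
# field names and limits are hypothetical, not from a real EDC.
def run_edit_checks(df: pd.DataFrame) -> pd.DataFrame:
    checks = [
        ("Missing informed consent date",
         df["consent_date"].isna()),
        ("Visit date before informed consent",
         pd.to_datetime(df["visit_date"]) < pd.to_datetime(df["consent_date"])),
        ("Age outside protocol range 18-75",
         ~df["age"].between(18, 75)),
    ]
    queries = []
    for message, mask in checks:
        for subject in df.loc[mask.fillna(False), "subject_id"]:
            queries.append({"subject_id": subject, "query": message})
    return pd.DataFrame(queries)

crf = pd.DataFrame({
    "subject_id": ["001", "002", "003"],
    "consent_date": ["2023-01-05", None, "2023-01-08"],
    "visit_date": ["2023-01-10", "2023-01-09", "2023-01-07"],
    "age": [45, 81, 63],
})
print(run_edit_checks(crf))  # each row becomes a data clarification query
```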
The most difficult challenge would be planning and
implementing data management systems in a rapidly
changing operational environment where technology
growth outpaces current infrastructure [47].
Submission of statistical analysis plans (SAP) and
datasets
Requirements for SAP and dataset submission may
differ among regulatory agencies and countries.
Thorough consultation with the guidelines and
regulations of the respective regulatory authority,
such as the FDA in the United States or the EMA in the
European Union, is needed.
In the planning of a clinical study, the SAP is crucial
and among the most significant confidential
regulatory documents. The ICH E9 (International
Conference on Harmonization of Technical
Requirements for Pharmaceuticals for Human Use)
recommendations were important standards used in
the development of SAP. The results and key findings
are supported by statistical analysis and datasets. The
SAP, which incorporates intended statistical methods
and analyses to be utilized in assessing the research
objectives, is created during the first stage of a
clinical investigation. The study team and statisticians
review the SAP to ensure its accuracy and
appropriateness before presenting it to regulatory
organizations. Regulatory authorities are provided a
copy of SAP to evaluate the reliability and validity of
the planned analysis. Some regulatory submissions,
such as the New Drug Application (NDA) and the
Investigational New Drug (IND) application, contain
an authorized SAP.
Datasets, along with raw and derived data, need to be
prepared for submission. Preparation involves organizing
the data in a standardized format that complies with
regulatory guidelines such as the Clinical Data Interchange
Standards Consortium (CDISC) standards. Before submission,
data integrity checks and quality control procedures are
performed to ensure the completeness and consistency of the
datasets. Datasets are submitted in a specified format
depending on the regulatory agency's requirements; commonly
used formats are the Study Data Tabulation Model (SDTM) for
collected clinical trial data and the Analysis Data Model
(ADaM) for analysis datasets.
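As a simplified illustration, the sketch below performs a basic structural check on an SDTM-like Demographics (DM) dataset before submission; the expected variable list is a reduced, assumed subset chosen for illustration and not the full SDTMIG specification.

```python
import pandas as pd

# Simplified, assumed subset of DM variables used only for this illustration.
EXPECTED_DM_VARIABLES = ["STUDYID", "DOMAIN", "USUBJID", "AGE", "SEX", "ARM"]

dm = pd.DataFrame({
    "STUDYID": ["ABC-101"] * 2,
    "DOMAIN": ["DM"] * 2,
    "USUBJID": ["ABC-101-001", "ABC-101-002"],
    "AGE": [54, 61],
    "SEX": ["F", "M"],
})

# Structural checks: expected variables present, unique subject identifiers.
missing_vars = [v for v in EXPECTED_DM_VARIABLES if v not in dm.columns]
duplicate_subjects = dm["USUBJID"].duplicated().any()

print("Missing expected variables:", missing_vars)           # ['ARM']
print("Duplicate subject identifiers:", duplicate_subjects)  # False
```

Dedicated validation tools are normally used for full conformance checking; the point here is only that such structural checks belong to the quality control performed before submission.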
B. Clinical study reports (CSR) and statistical
summaries
CSRs provide detailed information about the study's
design, methodology, findings, and conclusions. A
CSR is the primary document for presenting the
findings of a clinical study.
CSRs should adhere to regulatory guidelines issued
by the International Council for Harmonization of
Technical Requirements for Pharmaceuticals for
Human Use (ICH). The statistical analyses in CSRs
include efficacy and safety analyses, handling of
missing data, subgroup analyses, etc.
Statistical Summaries provide a concise overview of
key statistical analyses and findings from the clinical
study. Their aim is to present the statistical findings in
a clear and understandable manner. Summaries are
important documents as they provide information to
assess the robustness and significance of the clinical
study.
Both CSRs and statistical summaries are crucial for
regulatory submissions, as they provide the
information regulatory authorities need to evaluate
the validity, safety, and efficacy of the clinical study.
C. Adverse event reporting and safety analysis
Adverse event reporting and safety analysis are
essential in the context of drug development and
clinical trials.
Researchers, sponsors, and investigators have an
obligation to report an adverse event to regulatory
bodies. Serious adverse events are reported within a
defined period and reporting forms may include
standardized forms or electronic reporting systems.
Safety analysis involves the evaluation of the safety
profile of the clinical study that includes the
assessment of adverse events, laboratory data, vital
signs collected during the study. These safety analysis
findings are documented in the clinical study report
(CSR) and other regulatory submissions. Safety
analysis is ultimately about the benefit-risk assessment of
the study intervention: it helps to determine the safety of
the intervention and informs regulatory decisions
regarding its approval.
Compliance with regulatory requirements for data
management, analysis, submission, and reporting is
important to ensure the reliability of clinical study
and to protect the well-being of study participants.
X. EMERGING TRENDS AND
TECHNOLOGIES
Emerging trends and technology are reshaping the
clinical trial landscape, transforming how research is
conducted, data is collected, and patient outcomes are
monitored. Some important clinical trial trends and
technology include:
1. Decentralized clinical trials
2. Electronic informed consent
3. Artificial Intelligence (AI) and Machine Learning
(ML)
4. Real World Data and Real-World Evidence
5. Blockchain technology
A. Use of real-world data and real-world evidence
in clinical trial analysis:
In the field of healthcare, the use of real-world data
(RWD) and real-world evidence (RWE) in clinical
trial analysis is a developing trend. Traditional
clinical trials are carried out in controlled conditions
with stringent inclusion and exclusion criteria, which
may not adequately represent the diversity and
complexity of real-world patient populations [48].
Data collected from ordinary clinical practice, such as
electronic health records, claims databases, patient
registries, wearable devices, and mobile health
applications, is referred to as real-world data. Real-
world evidence, on the other hand, is derived through
analyzing this data in order to comprehend the safety,
efficacy, and effectiveness of medical therapies in
real-world contexts [49].
Some important advantages of incorporating real-world
data and evidence include:
Real-world data provides a more diversified patient
population, including a range of demographics,
comorbidities, and treatment trends. This provides a
more complete picture of how interventions perform
in different patient groups, which can be used to
inform personalized healthcare methods.
Long-term results: Clinical trials frequently have
short follow-up periods, whereas real-world data can
capture long-term outcomes and assess the durability
of therapeutic benefits. This allows for a more precise
assessment of safety profiles, treatment adherence,
and long-term effectiveness.
Safety monitoring: Real-world data can help detect
rare or long-term adverse events that may not be
visible in controlled clinical studies. This allows for
the early detection of warning signs and the continual
monitoring of drug safety profiles.
Wearables and Remote Monitoring: Wearable
technologies such as fitness trackers, smartwatches,
and biosensors allow for continuous monitoring of
patient data such as vital signs, physical activity,
sleep patterns, and medication adherence. Remote
monitoring systems improve data collection accuracy,
provide real-time insights, and enable remote patient
treatment, decreasing participant burden and
enhancing data quality.
RWE has been demonstrated to improve
healthcare quality and cost-effectiveness:
Real-world costs and resource use can be taken
into consideration when using real-world evidence to
assess the financial impact of interventions. Health policy,
value-based healthcare, and reimbursement choices
can all be influenced by this data.
Real-World Evidence is a powerful tool for
supporting clinicians and patients in making informed
treatment decisions. It is crucial to note that RWE
does not replace clinical trials, but it can frequently
add to and complement them in order to advance
clinical knowledge [50].
B. Integration of artificial intelligence and
machine learning in data analysis
Machine learning (ML), a subset of AI, deals
with the algorithms and statistical models that enable
computers to learn and make predictions from data.
AI is a broad term for machines that imitate human
intelligence. In the context of emerging trends,
researchers and healthcare professionals can acquire
relevant insights from huge and complicated datasets
by integrating AI and ML technologies in data
analysis.
AI and machine learning algorithms are useful in
identifying patterns, detecting abnormalities, and
making predictions based on vast quantities of data.
These technologies can help with disease diagnosis,
therapy optimization, medication development, and
personalized medicine in healthcare. Healthcare
practitioners may improve patient outcomes, increase
efficiency, and speed medical research by integrating
AI and ML in data analysis.
The use of AI and machine learning in data analysis
provides a number of benefits, including:
Large volumes of data, including electronic health
records, genetic data, imaging data, and patient-
reported outcomes, may be processed and analyzed
by AI and ML algorithms. Researchers may now
discover correlations, trends, and insights that might
otherwise go unnoticed when utilizing conventional
analysis techniques.
Predictive models may be developed using AI and
machine learning methods to help researchers detect
possible risks, forecast patient outcomes, and improve
trial design. The effectiveness and success rate of
clinical trials can be increased by using these models
to aid in patient grouping, treatment response
prediction, and endpoint selection.
Data Quality Assurance: AI and machine learning
algorithms can help ensure data quality by finding
and repairing errors, highlighting discrepancies, and
validating data against established criteria.
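As one illustration of this idea, the sketch below uses an unsupervised isolation forest (scikit-learn) to flag records with implausible vital-sign combinations for manual review; the simulated data, contamination rate, and injected errors are assumptions, and a real system would tune and validate such a model against curated data.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Simulate plausible vital signs, then inject a few data-entry errors.
rng = np.random.default_rng(0)
systolic = rng.normal(120, 12, size=200)
heart_rate = rng.normal(72, 8, size=200)
X = np.column_stack([systolic, heart_rate])
X[:3] = [[300, 40], [40, 200], [500, 5]]  # implausible records

# Unsupervised anomaly detection; -1 marks records suspected to be erroneous.
model = IsolationForest(contamination=0.02, random_state=0).fit(X)
flags = model.predict(X)
print("Records flagged for manual review:", np.where(flags == -1)[0])
```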
AI and ML help healthcare professionals in making
evidence-based decisions by providing real-time
insights, treatment recommendations, and risk
assessments based on a combination of clinical data,
trial results, and medical literature.
Human skill and interpretation are still important in
clinical trial analyses, but AI and ML are effective
tools for augmenting decision-making and improving
efficiency [51].
C. Cloud-based data management and analytics
solutions
Cloud-based data management and analytics systems
are gaining recognition in the healthcare business due
to their scalability, flexibility, and cost-effectiveness.
These technologies enable healthcare organizations to
securely store, manage, and analyze large quantities
of data in the cloud, eliminating the need for on-
premises infrastructure [52].
Cloud-based platforms provide increased
accessibility, allowing researchers, physicians, and
other stakeholders to interact and access data from
any location at any time. Furthermore, cloud-based
analytics solutions offer extensive data processing
capabilities, enabling for efficient healthcare data
analysis and visualization. This trend encourages
accessibility and the development of data-driven
healthcare apps and services by facilitating seamless
data sharing.
Data Security and Privacy: Cloud providers often
provide advanced security measures to safeguard data
from unauthorized access while also adhering to
industry norms and standards. Cloud systems provide
security features such as encryption, access limits,
regular backups, and disaster recovery protocols. Data
privacy is also protected through stringent data
protection procedures and compliance with privacy
regulations.
Cloud-based data management and analytics solutions
improve collaboration and enable organizations to fully
utilize their clinical trial data [53].
D. Blockchain technology for data security and
integrity
Blockchain is the technology powering Bitcoin,
serving as an open, distributed public ledger that
records all Bitcoin transactions in a safe and
verifiable manner, eliminating the need for a third
party to process payments.
Blockchain technology produces a permanent record
of transactions or data entry. Once a data block is
added to the blockchain, it becomes almost
impossible to change or tamper with. This feature
ensures that clinical trial data, such as patient records,
informed consent, and study findings, cannot be
edited retroactively, enhancing data integrity and
auditability [54].
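The hash-chaining principle behind this immutability can be illustrated in a few lines of Python; the sketch below is a conceptual toy, assuming a simple append-only ledger, and is not a production blockchain or any specific clinical-trial system.

```python
import hashlib
import json

# Each block stores the hash of the previous block, so any retroactive
# edit breaks the chain and is detected on verification.
def add_block(chain, record):
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    payload = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
    chain.append({"record": record, "prev": prev_hash,
                  "hash": hashlib.sha256(payload.encode()).hexdigest()})
    return chain

def verify(chain):
    for i, block in enumerate(chain):
        prev_hash = chain[i - 1]["hash"] if i else "0" * 64
        payload = json.dumps({"record": block["record"], "prev": prev_hash},
                             sort_keys=True)
        if block["prev"] != prev_hash or \
           block["hash"] != hashlib.sha256(payload.encode()).hexdigest():
            return False
    return True

ledger = []
add_block(ledger, {"subject": "001", "event": "informed consent recorded"})
add_block(ledger, {"subject": "001", "event": "visit 1 data entered"})
print(verify(ledger))                       # True
ledger[0]["record"]["event"] = "edited"     # retroactive tampering
print(verify(ledger))                       # False: the chain no longer verifies
```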
Blockchain uses cryptographic techniques to encrypt
data, adding an additional layer of protection.
Sensitive patient data can be securely kept on the
blockchain, maintaining privacy and limiting access
to certain data elements to authorized parties. Smart
contracts allow patients to have control over their data
and express consent for its use.
Blockchain technology enables investigators,
auditors, and regulators to verify the origin,
authenticity, and integrity of data by enhancing data
traceability and transparency.
Blockchain technology holds promise for data
security and integrity in clinical trials [55].
CONCLUSION:
Data management and analysis are essential in
clinical trials, and several major findings and insights
have emerged in this field. To begin, the use of
electronic data capture (EDC) systems has improved
data quality and integrity by reducing errors and
accelerating data collection processes. Second, the
introduction of standardised data formats has
improved data exchange and sharing between trial
sites and systems.
In addition, the utilisation of recent analytics
techniques like machine learning and artificial
intelligence has made it easier to analyse huge
quantities of information more accurately and
efficiently, improving decision-making and leading to
the identification of patient subgroups. Furthermore,
the incorporation of real-world data from many sources,
including electronic health records, has produced
insightful information on patient outcomes and
treatment success. The development of robust data
governance and compliance structures to safeguard
patient confidentiality and adhere to regulatory
requirements ensures that maintaining data privacy
and security remains a top concern.
Effective data management and analysis in clinical
trials have significant implications for the overall
success and advancement of medical research:
1. Improved decision making.
2. Effective data management accelerates data
collection, monitoring, and reporting processes,
reducing manual errors and enhancing trial
efficiency.
3. Data management systems allow for real-time
monitoring of patient safety and adverse events,
allowing for early intervention and risk
minimization. This assures participant well-being
and compliance to ethical standards.
4. Effective data management supports post-trial
analysis by helping researchers in identifying
long-term safety profiles and assessing outcomes
after treatment. Furthermore, well-documented
and well-organized data management systems
assure regulatory compliance, making regulatory
submissions and approvals easier.
5. The accumulation of high-quality data through
effective management and analysis aids in the
creation of robust proof for ongoing research
projects and provides input for evidence-based
medical practices, allowing for continuous patient
care improvement.
Researchers and sponsors might consider the
following suggestions to improve data management
and analysis practices in clinical trials:
1. Standardize data collection to ensure consistency
and accuracy across trial sites.
2. Develop Data Management Plans
3. Training on best practices for data management,
such as data input, quality assurance, and data
protection, should be provided for the research
staff. Ensure that suitable resources, such as data
management software, statistical tools, and data
analysis experience, are available.
4. Implement Data Quality Assurance Measures
such as data validation checks, during data
collection to minimize errors and maintain data
quality.
5. Utilize Advanced Analytics Techniques to take
advantage of the potential of large and complex
datasets.
6. Ensure Data Security and Privacy to protect
patient confidentiality and comply with data
privacy regulations.
7. Conduct regular assessments and audits of data
management and analysis practices to identify
areas for improvement.
8. Collaborate with data management and analysis
professionals, such as biostatisticians and data
scientists, to optimise study design, analysis
plans, and results interpretation.
In conclusion, effective data management and
analysis in clinical trials are essential for informed
decision-making, improved trial efficiency,
personalized medicine, patient safety, collaboration,
regulatory compliance, and advancing evidence-based
medicine. Standardizing data collection, investing in
training and resources, utilizing advanced analytics,
ensuring data security, fostering data sharing,
engaging data experts, embracing technology
innovations, and regularly evaluating practices are
key recommendations to enhance data management
and analysis in clinical trials. By implementing these
practices, researchers and sponsors can optimize trial
outcomes, accelerate medical research, and improve
patient care.
References:
[1] Krishnankutty B, Bellary S, Kumar NB,
Moodahadu LS. Data management in clinical
research: an overview. Indian journal of
pharmacology. 2012 Mar; 44(2):168.
[2] Kala N. G.*, O. Shruti, Aishwarya B. M
Overview of clinical data management and
statistical analysis of bioequivalence study
International journal of clinical Trials 2021
Nov; 8(4):323-329
[3] Clinical Data Management: Top 5 Important
Aspects, Trial Expert,
https://credevo.com/articles/2021/08/15/clinical
-data-management-top-5-important-aspects/
[4] What is an Electronic Data Capture System in
clinical trials?, ClinVigilant.
https://www.clinvigilant.com/blog/what-is-an-
electronic-data-capture-edc-in-clinical-trials/
[5] Raviteja MN, Gupta NV. A review on
electronic data management in pharmaceutical
industry. Asian Journal of Pharmaceutical and
Clinical Research. 2013; 6(2):38-42.
[6] Welker JA. Implementation of electronic data
capture systems: barriers and solutions.
Contemporary clinical trials. 2007 May 1;
28(3):329-36.
[7] Mudaranthakam DP, Krebill R, Singh RD,
Price C, Thompson J, Gajewski B, Koestler D,
Mayo MS. Case Study: Electronic Data Capture
System Validation at an Academic Institution.
Data basics: a publication supported by and for
the members of The Society for Clinical Data
Management, Inc. 2019;25(2):16.
[8] Data Cleaning and Query Management
Importance in EDC
https://www.biopharmaservices.com/blog/data-
cleaning-and-query-management-importance-
inedc/#:~:text=Data%20cleaning%20and%20q
uery%20management%20ensure%20that%20d
ata%20is%20accurate,of%20success%20in%20
clinical%20trials.
[9] What Is CDISC and What Are CDISC Data
Standards?, Allucent.
https://www.allucent.com/resources/blog/what-
cdisc-and-what-are-cdisc-data-standards
[10] Gurugubelli VS, Fang H, Shikany JM, Balkus
SV, Rumbut J, Ngo H, Wang H, Allison JJ,
Steffen LM. A review of harmonization
methods for studying dietary patterns. Smart
Health. 2022 Mar 1; 23:100263.
[11] Standards for data and interoperability, gitbook.
https://open-data-institute.gitbook.io/data-
governance-playbook/play-four-making-data-
interoperable/standards-for-data-and-
interoperability
[12] Clinical Trial Statistical analysis: How to
minimize noise, Cognivia, June 21, 2021.
https://cognivia.com/clinical-trial-statistical-
analysis-how-to-minimize-
noise/#:~:text=Statistical%20analyses%20in%2
0clinical%20trials,factor%20impacts%20a%20r
esponse%20variable.
[13] Ramya Sriram, How-to-develop-a-statistical-
analysis-plan-sap-for-clinical-trials?, 08
February, 2021.
https://www.kolabtree.com/blog/how-to-
develop-a-statistical-analysis-plan-sap-for-
clinical-trials/
[14] Gamble C, Krishan A, Stocken D, et al.
Guidelines for the Content of Statistical
Analysis Plans in Clinical Trials. JAMA. 2017;
318(23):2337–2343.
[15] Nayak, Barun K, and Avijit Hazra. “How to
choose the right statistical test?.” Indian journal
of ophthalmology vol. 59, 2 (2011): 85-6.
[16] Prabhaker Mishra, Sabaretnam Mayilvaganan,
Amit Agarwal, Statistical Methods in
Endocrine Surgery Journal Club, World j
Endoc surg 2015;7(1):2123.
[17] Mishra, Prabhaker et al. “Selection of
appropriate statistical methods for data
analysis.” Annals of cardiac anaesthesia vol.
22, 3 (2019): 297-301.
[18] Objectives and Endpoints, Pennstate Eberly
College of Science.
https://online.stat.psu.edu/stat509/book/export/
html/653
[19] Endpoint, U.S. Food and Drug Administration
(FDA) Patient-Focused Drug Development
Glossary, NCImetathesaurus
https://toolkit.ncats.nih.gov/glossary/endpoint
[20] James Yang, clinical-trial-endpoint-selection,
memoinoncology
https://memoinoncology.com/clinical-
trials/clinical-trial-endpoint-selection/
[21] Suresh, Kp, and S Chandrashekara. “Sample
size estimation and power analysis for clinical
research studies.” Journal of human
reproductive sciences vol. 5, 1 (2012): 7-13.
[22] Lerman, J. Study design in clinical research:
sample size estimation and power analysis. Can
J Anaesth 43, 184–191 (1996).
[23] Ayush Singh Rawat, An Overview of
Descriptive Analysis, 31 March, 2021.
https://www.analyticssteps.com/blogs/overview
-descriptive-analysis
[24] Barkan, Howard. “Statistics in clinical research:
Important considerations.” Annals of cardiac
anaesthesia vol. 18, 1 (2015): 74-82.
[25] Barkan, Howard. “Statistics in clinical research:
Important considerations.” Annals of cardiac
anaesthesia vol. 18, 1 (2015): 74-82.
[26] Hypothesis-testing, Scalestatistics
https://www.scalestatistics.com/hypothesis-
testing.html
[27] Richard Chin, Bruce Y. Lee, Introduction to
Clinical Trial Statistics, in Principles and
Practice of Clinical Trial Medicine,, Academic
Press, 2008, 43-60,
[28] Dr. Deng, On Biostatistics and Clinical Trials:
Time to Event end points.
http://onbiostatistics.blogspot.com/2015/09/tim
e-to-event-end-points.html?m=1
[29] Time-To-Event Data Analysis, Columbia
University Mailman School of Public Health.
https://www.publichealth.columbia.edu/researc
h/population-health-methods/time-event-data-
analysis
[30] Jennifer Le-Rademacher, Xiaofei Wang, Time-
To-Event Data: An Overview and Analysis
Considerations, Journal of Thoracic Oncology,
Volume 16, Issue 7, 2021, Pages 1067-1074,
ISSN 1556-0864.
[31] Singh, Ritesh, and Keshab Mukhopadhyay.
“Survival analysis in clinical trials: Basics and
must know areas.” Perspectives in clinical
research vol. 2,4 (2011): 145-8.
[32] Frank E. Harre, Jr. and others, Regression
Models in Clinical Studies: Determining
Relationships Between Predictors and
Response, JNCI: Journal of the National Cancer
Institute, Volume 80, Issue 15, 5 October 1988,
Pages 1198–1202
[33] Prashanth Sharma, Different Types of
Regression Models, 19 January 2022.
https://www.analyticsvidhya.com/blog/2022/01
/different-types-of-regression-models/
[34] Curt Meinert, subgroup analysis in trials, 11
March 2001.
[35] Parpia, S., Morris, T.P., Phillips, M.R. et al.
Sensitivity analysis in clinical trials: three
criteria for a valid sensitivity analysis. Eye 36,
2073–2074 (2022).
[36] Martijn W. Heymans, Jos W.R. Twisk,
Handling missing data in clinical research,
Journal of Clinical Epidemiology, Volume 151,
2022, 185-188, 0895-4356.
[37] Fanaroff, Alexander C et al. “Methods for
safety and endpoint ascertainment:
identification of adverse events through
scrutiny of negatively adjudicated events.”
Trials vol. 21, 1 323. 9 Apr. 2020.
[38] Sartor, Oliver, and Susan Halabi. “Independent
data monitoring committees: an update and
overview.” Urologic oncology vol. 33, 3
(2015): 143-8.
[39] Gassman JJ, Owen WW, Kuntz TE, Martin JP,
Amoroso WP. Data quality assurance,
monitoring, and reporting. Control Clin Trials.
1995 Apr; 16(2 Suppl):104S-136S.
[40] Van den Broeck J, Cunningham SA, Eeckels R,
Herbst K. Data cleaning: detecting, diagnosing,
and editing data abnormalities. PLoS Med.
2005 Oct; 2(10):e267.
[41] Winkler WE. Problems with inliers.
Washington (DC): Census Bureau. Research
Reports Series RR98. 1998.
[42] Altman DG. Practical statistics for medical
research. CRC press; 1990 Nov 22.
[43] Gardner MJ, Altman DG. Statistics with
confidence: confidence intervals and statistical
guidelines. British Medical Journal; 1995.
[44] Andrew J. Fletcher, Lionel D. Edwards,
Anthony W. Fox, Peter D. Stonier, "Principles
and Practice of Pharmaceutical Medicine"
(Wiley and Sons, Ltd., 2nd ed., 2007), pp.88-
91.
[45] Society for Clinical Data Management, "Good
clinical data management practices", (Society
for Clinical Data Management, version 3.0,
2003)
[46] U.S. Department of Health and Human
Services, FDA, CDER, CBER, "Guideline for
industry: E6 Good Clinical Practice:
Consolidated Guidance". (Geneva:
International Conference on Harmonization,
1996), accessed January 2009.
[47] Krishnankutty, Binny et al. “Data management
in clinical research: An overview.” Indian
journal of Pharmacology, vol. 44, 2 (2012):
168-72.
[48] Chodankar, Deepa. “Introduction to real-world
evidence studies.” Perspectives in clinical
research vol. 12, 3 (2021): 171-174.
[49] Real-World Evidence, U.S. Food & Drug
Administration https://www.fda.gov/science-
research/science-and-research-special-
topics/real-world-evidence
[50] What-is-real-world-evidence-in-clinical-trials?,
VIAL. https://vial.com/blog/articles/what-is-
real-world-evidence-in-clinical-
trials/&utm_source=organic
[51] Shah, P., Kendall, F., Khozin, S. et al. Artificial
intelligence and machine learning in clinical
development: a translational perspective. npj
Digit. Med. 2, 69 (2019).
[52] Julie Clements, How cloud data management
solutions benefit clinical research industry?, 02
January, 2018.
[53] Luthria, Gaurav, and Qingbo Wang.
“Implementing a Cloud Based Method for
Protected Clinical Trial Data Sharing.” Pacific
Symposium on Biocomputing. Pacific
Symposium on Biocomputing vol. 25 (2020):
647-658.
[54] de-Melo-Diogo, Marta & Tavares, Jorge &
Luís, Ângelo. (2021). Data Security in Clinical
Trials Using Blockchain Technology.
10.4018/978-1-7998-7363-1.ch010.
[55] Benchoufi, M., Ravaud, P. Blockchain
technology for improving clinical research
quality. Trials 18, 335 (2017).
More Related Content

Similar to Data management and analysis in clinical trials

Streamlining Data Collection eCRF Design and Machine Learning
Streamlining Data Collection eCRF Design and Machine LearningStreamlining Data Collection eCRF Design and Machine Learning
Streamlining Data Collection eCRF Design and Machine Learningijtsrd
 
Clinical data-management-overview
Clinical data-management-overviewClinical data-management-overview
Clinical data-management-overviewAcri India
 
Enhancing Data Quality in Clinical Trials: Best Practices and Quality Control...
Enhancing Data Quality in Clinical Trials: Best Practices and Quality Control...Enhancing Data Quality in Clinical Trials: Best Practices and Quality Control...
Enhancing Data Quality in Clinical Trials: Best Practices and Quality Control...ClinosolIndia
 
CLINICAL DATA MANAGEMENT- AN OVERVIEW
CLINICAL DATA MANAGEMENT- AN OVERVIEWCLINICAL DATA MANAGEMENT- AN OVERVIEW
CLINICAL DATA MANAGEMENT- AN OVERVIEWClinosolIndia
 
Database Designing in Clinical Data Management
Database Designing in Clinical Data ManagementDatabase Designing in Clinical Data Management
Database Designing in Clinical Data ManagementClinosolIndia
 
Data mining and data warehousing
Data mining and data warehousingData mining and data warehousing
Data mining and data warehousingJuliaWilson68
 
Understanding clinical data management
Understanding clinical data managementUnderstanding clinical data management
Understanding clinical data managementfinenessinstitute
 
Revelatory Trends in Clinical Research and Data Management
Revelatory Trends in Clinical Research and Data ManagementRevelatory Trends in Clinical Research and Data Management
Revelatory Trends in Clinical Research and Data ManagementSagar Ghotekar
 
Developing Protocols & Procedures for CT Data Integrity
Developing Protocols & Procedures for CT Data Integrity Developing Protocols & Procedures for CT Data Integrity
Developing Protocols & Procedures for CT Data Integrity Bhaswat Chakraborty
 
Study start up activities in clinical data management
Study start up activities in clinical data managementStudy start up activities in clinical data management
Study start up activities in clinical data managementsoumyapottola
 
Who needs fast data? - Journal for Clinical Studies
Who needs fast data? - Journal for Clinical Studies Who needs fast data? - Journal for Clinical Studies
Who needs fast data? - Journal for Clinical Studies KCR
 
Qa what is_clinical_data_management
Qa what is_clinical_data_managementQa what is_clinical_data_management
Qa what is_clinical_data_managementHitesh Kadam
 
Clinical Data Management
Clinical Data ManagementClinical Data Management
Clinical Data ManagementMahesh Koppula
 
Clinical Data Management
Clinical Data ManagementClinical Data Management
Clinical Data ManagementDABBETA DIVYA
 
Data Mapping in Clinical Trials
Data Mapping in Clinical TrialsData Mapping in Clinical Trials
Data Mapping in Clinical TrialsClinosolIndia
 
Overview of Risk Based Monitoring in Clinical Trial Processes
Overview of Risk Based Monitoring in Clinical Trial ProcessesOverview of Risk Based Monitoring in Clinical Trial Processes
Overview of Risk Based Monitoring in Clinical Trial ProcessesEditorIJTSRD1
 

Similar to Data management and analysis in clinical trials (20)

CDM
CDMCDM
CDM
 
Streamlining Data Collection eCRF Design and Machine Learning
Streamlining Data Collection eCRF Design and Machine LearningStreamlining Data Collection eCRF Design and Machine Learning
Streamlining Data Collection eCRF Design and Machine Learning
 
Clinical data-management-overview
Clinical data-management-overviewClinical data-management-overview
Clinical data-management-overview
 
Enhancing Data Quality in Clinical Trials: Best Practices and Quality Control...
Enhancing Data Quality in Clinical Trials: Best Practices and Quality Control...Enhancing Data Quality in Clinical Trials: Best Practices and Quality Control...
Enhancing Data Quality in Clinical Trials: Best Practices and Quality Control...
 
CLINICAL DATA MANAGEMENT- AN OVERVIEW
CLINICAL DATA MANAGEMENT- AN OVERVIEWCLINICAL DATA MANAGEMENT- AN OVERVIEW
CLINICAL DATA MANAGEMENT- AN OVERVIEW
 
Database Designing in Clinical Data Management
Database Designing in Clinical Data ManagementDatabase Designing in Clinical Data Management
Database Designing in Clinical Data Management
 
Data mining and data warehousing
Data mining and data warehousingData mining and data warehousing
Data mining and data warehousing
 
Understanding clinical data management
Understanding clinical data managementUnderstanding clinical data management
Understanding clinical data management
 
Revelatory Trends in Clinical Research and Data Management
Revelatory Trends in Clinical Research and Data ManagementRevelatory Trends in Clinical Research and Data Management
Revelatory Trends in Clinical Research and Data Management
 
Developing Protocols & Procedures for CT Data Integrity
Developing Protocols & Procedures for CT Data Integrity Developing Protocols & Procedures for CT Data Integrity
Developing Protocols & Procedures for CT Data Integrity
 
Study start up activities in clinical data management
Study start up activities in clinical data managementStudy start up activities in clinical data management
Study start up activities in clinical data management
 
Who needs fast data? - Journal for Clinical Studies
Who needs fast data? - Journal for Clinical Studies Who needs fast data? - Journal for Clinical Studies
Who needs fast data? - Journal for Clinical Studies
 
Clinical_Data_Management.docx
Clinical_Data_Management.docxClinical_Data_Management.docx
Clinical_Data_Management.docx
 
Qa what is_clinical_data_management
Qa what is_clinical_data_managementQa what is_clinical_data_management
Qa what is_clinical_data_management
 
Clinical Data Management
Clinical Data ManagementClinical Data Management
Clinical Data Management
 
Clinical data management
Clinical data management Clinical data management
Clinical data management
 
04_GCDMP.pptx
04_GCDMP.pptx04_GCDMP.pptx
04_GCDMP.pptx
 
Clinical Data Management
Clinical Data ManagementClinical Data Management
Clinical Data Management
 
Data Mapping in Clinical Trials
Data Mapping in Clinical TrialsData Mapping in Clinical Trials
Data Mapping in Clinical Trials
 
Overview of Risk Based Monitoring in Clinical Trial Processes
Overview of Risk Based Monitoring in Clinical Trial ProcessesOverview of Risk Based Monitoring in Clinical Trial Processes
Overview of Risk Based Monitoring in Clinical Trial Processes
 

More from ijtsrd

‘Six Sigma Technique’ A Journey Through its Implementation
‘Six Sigma Technique’ A Journey Through its Implementation‘Six Sigma Technique’ A Journey Through its Implementation
‘Six Sigma Technique’ A Journey Through its Implementationijtsrd
 
Edge Computing in Space Enhancing Data Processing and Communication for Space...
Edge Computing in Space Enhancing Data Processing and Communication for Space...Edge Computing in Space Enhancing Data Processing and Communication for Space...
Edge Computing in Space Enhancing Data Processing and Communication for Space...ijtsrd
 
Dynamics of Communal Politics in 21st Century India Challenges and Prospects
Dynamics of Communal Politics in 21st Century India Challenges and ProspectsDynamics of Communal Politics in 21st Century India Challenges and Prospects
Dynamics of Communal Politics in 21st Century India Challenges and Prospectsijtsrd
 
Assess Perspective and Knowledge of Healthcare Providers Towards Elehealth in...
Assess Perspective and Knowledge of Healthcare Providers Towards Elehealth in...Assess Perspective and Knowledge of Healthcare Providers Towards Elehealth in...
Assess Perspective and Knowledge of Healthcare Providers Towards Elehealth in...ijtsrd
 
The Impact of Digital Media on the Decentralization of Power and the Erosion ...
The Impact of Digital Media on the Decentralization of Power and the Erosion ...The Impact of Digital Media on the Decentralization of Power and the Erosion ...
The Impact of Digital Media on the Decentralization of Power and the Erosion ...ijtsrd
 
Online Voices, Offline Impact Ambedkars Ideals and Socio Political Inclusion ...
Online Voices, Offline Impact Ambedkars Ideals and Socio Political Inclusion ...Online Voices, Offline Impact Ambedkars Ideals and Socio Political Inclusion ...
Online Voices, Offline Impact Ambedkars Ideals and Socio Political Inclusion ...ijtsrd
 
Problems and Challenges of Agro Entreprenurship A Study
Problems and Challenges of Agro Entreprenurship A StudyProblems and Challenges of Agro Entreprenurship A Study
Problems and Challenges of Agro Entreprenurship A Studyijtsrd
 
Comparative Analysis of Total Corporate Disclosure of Selected IT Companies o...
Comparative Analysis of Total Corporate Disclosure of Selected IT Companies o...Comparative Analysis of Total Corporate Disclosure of Selected IT Companies o...
Comparative Analysis of Total Corporate Disclosure of Selected IT Companies o...ijtsrd
 
The Impact of Educational Background and Professional Training on Human Right...
The Impact of Educational Background and Professional Training on Human Right...The Impact of Educational Background and Professional Training on Human Right...
The Impact of Educational Background and Professional Training on Human Right...ijtsrd
 
A Study on the Effective Teaching Learning Process in English Curriculum at t...
A Study on the Effective Teaching Learning Process in English Curriculum at t...A Study on the Effective Teaching Learning Process in English Curriculum at t...
A Study on the Effective Teaching Learning Process in English Curriculum at t...ijtsrd
 
The Role of Mentoring and Its Influence on the Effectiveness of the Teaching ...
The Role of Mentoring and Its Influence on the Effectiveness of the Teaching ...The Role of Mentoring and Its Influence on the Effectiveness of the Teaching ...
The Role of Mentoring and Its Influence on the Effectiveness of the Teaching ...ijtsrd
 
Design Simulation and Hardware Construction of an Arduino Microcontroller Bas...
Design Simulation and Hardware Construction of an Arduino Microcontroller Bas...Design Simulation and Hardware Construction of an Arduino Microcontroller Bas...
Design Simulation and Hardware Construction of an Arduino Microcontroller Bas...ijtsrd
 
Sustainable Energy by Paul A. Adekunte | Matthew N. O. Sadiku | Janet O. Sadiku
Sustainable Energy by Paul A. Adekunte | Matthew N. O. Sadiku | Janet O. SadikuSustainable Energy by Paul A. Adekunte | Matthew N. O. Sadiku | Janet O. Sadiku
Sustainable Energy by Paul A. Adekunte | Matthew N. O. Sadiku | Janet O. Sadikuijtsrd
 
Concepts for Sudan Survey Act Implementations Executive Regulations and Stand...
Concepts for Sudan Survey Act Implementations Executive Regulations and Stand...Concepts for Sudan Survey Act Implementations Executive Regulations and Stand...
Concepts for Sudan Survey Act Implementations Executive Regulations and Stand...ijtsrd
 
Towards the Implementation of the Sudan Interpolated Geoid Model Khartoum Sta...
Towards the Implementation of the Sudan Interpolated Geoid Model Khartoum Sta...Towards the Implementation of the Sudan Interpolated Geoid Model Khartoum Sta...
Towards the Implementation of the Sudan Interpolated Geoid Model Khartoum Sta...ijtsrd
 
Activating Geospatial Information for Sudans Sustainable Investment Map
Activating Geospatial Information for Sudans Sustainable Investment MapActivating Geospatial Information for Sudans Sustainable Investment Map
Activating Geospatial Information for Sudans Sustainable Investment Mapijtsrd
 
Educational Unity Embracing Diversity for a Stronger Society
Educational Unity Embracing Diversity for a Stronger SocietyEducational Unity Embracing Diversity for a Stronger Society
Educational Unity Embracing Diversity for a Stronger Societyijtsrd
 
Integration of Indian Indigenous Knowledge System in Management Prospects and...
Integration of Indian Indigenous Knowledge System in Management Prospects and...Integration of Indian Indigenous Knowledge System in Management Prospects and...
Integration of Indian Indigenous Knowledge System in Management Prospects and...ijtsrd
 
DeepMask Transforming Face Mask Identification for Better Pandemic Control in...
DeepMask Transforming Face Mask Identification for Better Pandemic Control in...DeepMask Transforming Face Mask Identification for Better Pandemic Control in...
DeepMask Transforming Face Mask Identification for Better Pandemic Control in...ijtsrd
 
Cyber Ethics An Introduction by Paul A. Adekunte | Matthew N. O. Sadiku | Jan...
Cyber Ethics An Introduction by Paul A. Adekunte | Matthew N. O. Sadiku | Jan...Cyber Ethics An Introduction by Paul A. Adekunte | Matthew N. O. Sadiku | Jan...
Cyber Ethics An Introduction by Paul A. Adekunte | Matthew N. O. Sadiku | Jan...ijtsrd
 

More from ijtsrd (20)

‘Six Sigma Technique’ A Journey Through its Implementation
‘Six Sigma Technique’ A Journey Through its Implementation‘Six Sigma Technique’ A Journey Through its Implementation
‘Six Sigma Technique’ A Journey Through its Implementation
 
Edge Computing in Space Enhancing Data Processing and Communication for Space...
Edge Computing in Space Enhancing Data Processing and Communication for Space...Edge Computing in Space Enhancing Data Processing and Communication for Space...
Edge Computing in Space Enhancing Data Processing and Communication for Space...
 
Dynamics of Communal Politics in 21st Century India Challenges and Prospects
Dynamics of Communal Politics in 21st Century India Challenges and ProspectsDynamics of Communal Politics in 21st Century India Challenges and Prospects
Dynamics of Communal Politics in 21st Century India Challenges and Prospects
 
Assess Perspective and Knowledge of Healthcare Providers Towards Elehealth in...
Assess Perspective and Knowledge of Healthcare Providers Towards Elehealth in...Assess Perspective and Knowledge of Healthcare Providers Towards Elehealth in...
Assess Perspective and Knowledge of Healthcare Providers Towards Elehealth in...
 
The Impact of Digital Media on the Decentralization of Power and the Erosion ...
The Impact of Digital Media on the Decentralization of Power and the Erosion ...The Impact of Digital Media on the Decentralization of Power and the Erosion ...
The Impact of Digital Media on the Decentralization of Power and the Erosion ...
 
Online Voices, Offline Impact Ambedkars Ideals and Socio Political Inclusion ...
Online Voices, Offline Impact Ambedkars Ideals and Socio Political Inclusion ...Online Voices, Offline Impact Ambedkars Ideals and Socio Political Inclusion ...
Online Voices, Offline Impact Ambedkars Ideals and Socio Political Inclusion ...
 
Problems and Challenges of Agro Entreprenurship A Study
Problems and Challenges of Agro Entreprenurship A StudyProblems and Challenges of Agro Entreprenurship A Study
Problems and Challenges of Agro Entreprenurship A Study
 
Comparative Analysis of Total Corporate Disclosure of Selected IT Companies o...
Comparative Analysis of Total Corporate Disclosure of Selected IT Companies o...Comparative Analysis of Total Corporate Disclosure of Selected IT Companies o...
Comparative Analysis of Total Corporate Disclosure of Selected IT Companies o...
 
The Impact of Educational Background and Professional Training on Human Right...
The Impact of Educational Background and Professional Training on Human Right...The Impact of Educational Background and Professional Training on Human Right...
The Impact of Educational Background and Professional Training on Human Right...
 
A Study on the Effective Teaching Learning Process in English Curriculum at t...
A Study on the Effective Teaching Learning Process in English Curriculum at t...A Study on the Effective Teaching Learning Process in English Curriculum at t...
A Study on the Effective Teaching Learning Process in English Curriculum at t...
 
The Role of Mentoring and Its Influence on the Effectiveness of the Teaching ...
The Role of Mentoring and Its Influence on the Effectiveness of the Teaching ...The Role of Mentoring and Its Influence on the Effectiveness of the Teaching ...
The Role of Mentoring and Its Influence on the Effectiveness of the Teaching ...
 
Design Simulation and Hardware Construction of an Arduino Microcontroller Bas...
Design Simulation and Hardware Construction of an Arduino Microcontroller Bas...Design Simulation and Hardware Construction of an Arduino Microcontroller Bas...
Design Simulation and Hardware Construction of an Arduino Microcontroller Bas...
 
Sustainable Energy by Paul A. Adekunte | Matthew N. O. Sadiku | Janet O. Sadiku
Sustainable Energy by Paul A. Adekunte | Matthew N. O. Sadiku | Janet O. SadikuSustainable Energy by Paul A. Adekunte | Matthew N. O. Sadiku | Janet O. Sadiku
Sustainable Energy by Paul A. Adekunte | Matthew N. O. Sadiku | Janet O. Sadiku
 
Concepts for Sudan Survey Act Implementations Executive Regulations and Stand...
Concepts for Sudan Survey Act Implementations Executive Regulations and Stand...Concepts for Sudan Survey Act Implementations Executive Regulations and Stand...
Concepts for Sudan Survey Act Implementations Executive Regulations and Stand...
 
Towards the Implementation of the Sudan Interpolated Geoid Model Khartoum Sta...
Towards the Implementation of the Sudan Interpolated Geoid Model Khartoum Sta...Towards the Implementation of the Sudan Interpolated Geoid Model Khartoum Sta...
Towards the Implementation of the Sudan Interpolated Geoid Model Khartoum Sta...
 
Activating Geospatial Information for Sudans Sustainable Investment Map
Activating Geospatial Information for Sudans Sustainable Investment MapActivating Geospatial Information for Sudans Sustainable Investment Map
Activating Geospatial Information for Sudans Sustainable Investment Map
 
Educational Unity Embracing Diversity for a Stronger Society
Educational Unity Embracing Diversity for a Stronger SocietyEducational Unity Embracing Diversity for a Stronger Society
Educational Unity Embracing Diversity for a Stronger Society
 
Integration of Indian Indigenous Knowledge System in Management Prospects and...
Integration of Indian Indigenous Knowledge System in Management Prospects and...Integration of Indian Indigenous Knowledge System in Management Prospects and...
Integration of Indian Indigenous Knowledge System in Management Prospects and...
 
DeepMask Transforming Face Mask Identification for Better Pandemic Control in...
DeepMask Transforming Face Mask Identification for Better Pandemic Control in...DeepMask Transforming Face Mask Identification for Better Pandemic Control in...
DeepMask Transforming Face Mask Identification for Better Pandemic Control in...
 
Cyber Ethics An Introduction by Paul A. Adekunte | Matthew N. O. Sadiku | Jan...
Cyber Ethics An Introduction by Paul A. Adekunte | Matthew N. O. Sadiku | Jan...Cyber Ethics An Introduction by Paul A. Adekunte | Matthew N. O. Sadiku | Jan...
Cyber Ethics An Introduction by Paul A. Adekunte | Matthew N. O. Sadiku | Jan...
 

Recently uploaded

Grade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxGrade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxChelloAnnAsuncion2
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
Romantic Opera MUSIC FOR GRADE NINE pptx
Romantic Opera MUSIC FOR GRADE NINE pptxRomantic Opera MUSIC FOR GRADE NINE pptx
Romantic Opera MUSIC FOR GRADE NINE pptxsqpmdrvczh
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
processes is to produce high-quality data by reducing errors and missing data while gathering as much data as is necessary for analysis [1]. To accomplish this aim, best practices are applied to ensure that the data are accurate, reliable, and managed appropriately.
II. DATA MANAGEMENT IN CLINICAL TRIALS
The whole process of collecting, validating, and organizing subject research data is carried out in compliance with established procedures to guarantee high-quality data that are accurate and comprehensive. The main objective is to collect as much information as is necessary for the study while minimizing the overall number of errors. To do this, specialized techniques, such as software tools, are used to maintain audit trails that enable rapid identification and rectification of data discrepancies, even in large and complicated clinical investigations.
The Data Management Plan (DMP) document defines the CDM procedures to be followed during the trial and offers a step-by-step guide for processing the data under controlled circumstances. The DMP includes information on the database architecture, data entry and tracking policies, quality control practices, SAE reconciliation policies, discrepancy handling, data transfer/extraction policies, and database locking policies.
List of data management activities
1. Data collection
2. CRF collection
3. CRF annotation
4. Database design
5. Data entry
6. Medical coding
7. Data validation
8. Discrepancy management
9. Database lock
Clinical software applications, called databases, have been developed to make it simpler for CDM teams to carry out the various activities listed above. These tools are generally easy to use and built in accordance with regulatory standards. Data entry is done using CRF layouts, and the database contains details about the study's objectives, timetables, visits, investigators, sites, and participants. The entry screens are verified with dummy data before being used for actual data collection.
CRF design and data collection
The CRF, which is available in both paper and electronic versions, is the instrument used to collect data. In contrast to conventional paper-based approaches, which require in-house data entry, e-CRF-based CDM enables investigators or designated site staff to enter data directly at the site. e-CRF methods are favored by pharmaceutical companies for their drug development programs because they minimize errors and expedite the resolution of discrepancies.
Monitoring of CRFs
The correctness and completeness of CRF submissions is checked by the Clinical Research Associate (CRA). The retrieved CRFs are handed over to the CDM team, which tracks and records them. The investigator is contacted for clarification if any data are missing or unclear, and the issue is then resolved.
Data entry
Data from paper CRFs received from sites are entered in accordance with DMP requirements. Double data entry requires two operators working independently, which helps to spot transcription errors and discrepancies. Compared with single data entry, this reduces error rates and aids verification, reconciliation, and maintenance of a cleaner database.
Validation of data
Data validation is the testing of data accuracy against protocol requirements. Data validity is ensured by edit check programs, which find errors in entered data. These programs are first evaluated with dummy data containing deliberate errors, such as missing values, inconsistent data, out-of-range values, or protocol violations. The system flags a discrepancy whenever an edit check condition evaluates to TRUE, and Data Clarification Forms (DCFs) can then be created to resolve the discovered discrepancies (a minimal sketch of such edit checks follows this section).
Discrepancy management
Discrepancy management, also known as query resolution, is the process of reviewing discrepancies, investigating their causes, and either resolving them with documentary evidence or deeming them irresolvable. A discrepancy database, which usually includes an audit trail, is used by the CDMS to track and store discrepancies. The CDM team periodically reviews discrepancies and marks them as "closed" to avoid further validation failures. Discrepancy management is essential for data cleaning and needs to be monitored continuously.
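The edit checks and Data Clarification Forms described above can be thought of as simple rules evaluated against each entered record. The sketch below is illustrative only; the field names, plausibility ranges, and query texts are hypothetical and would in practice come from the study's DMP and eCRF specification.

# Illustrative edit checks for entered CRF data (hypothetical fields and limits).
# Each check that evaluates to True raises a discrepancy, analogous to a DCF/query.

records = [
    {"subject": "001", "visit": "V1", "sbp": 250, "dbp": 80,   "sex": "M"},
    {"subject": "002", "visit": "V1", "sbp": 120, "dbp": None, "sex": "F"},
]

def run_edit_checks(record):
    """Return a list of discrepancy messages for one CRF record."""
    discrepancies = []
    # Missing-value check
    for field in ("sbp", "dbp", "sex"):
        if record.get(field) is None:
            discrepancies.append(f"{field} is missing")
    # Range check (hypothetical plausibility limits)
    if record.get("sbp") is not None and not 60 <= record["sbp"] <= 220:
        discrepancies.append(f"sbp={record['sbp']} outside plausible range 60-220")
    # Consistency check: systolic should exceed diastolic
    if record.get("sbp") is not None and record.get("dbp") is not None:
        if record["sbp"] <= record["dbp"]:
            discrepancies.append("systolic not greater than diastolic")
    return discrepancies

for rec in records:
    for msg in run_edit_checks(rec):
        print(f"Query for subject {rec['subject']} ({rec['visit']}): {msg}")

In a real system each flagged condition would open a query in the discrepancy database with an audit trail, rather than simply printing a message.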
Database locking
Final data validation follows the quality checks, and the SAS datasets are finalized in consultation with a statistician. Before the database is locked, all data management tasks must be completed against a pre-lock checklist. Once approval is given and the database is locked, clean data are extracted for statistical analysis [1] [2].
Quality Assurance
Quality assurance in data management is a dynamic process that anticipates errors and flaws in data generation in order to ensure compliance with regulatory requirements. The high quality and integrity of the data must be maintained throughout the clinical research process.
Quality Control
The quality control process in clinical research ensures internal consistency through routine operational checks at each stage of the trial and of data handling, confirming compliance with trial procedures and the accuracy of the data [3].
III. ELECTRONIC DATA CAPTURE SYSTEMS (EDC)
Clinical trials employ an electronic platform known as an Electronic Data Capture (EDC) system to electronically collect, organize, and store trial data. EDC systems are designed to replace
paper-based data collection methods, which are usually time-consuming, error-prone, and inefficient. Electronic data capture has lately become more common in clinical studies, and in clinical trials EDC systems can improve both accuracy and efficiency.
Benefits of EDC
Electronic data capture (EDC) technologies are beneficial for clinical trials because they improve data quality, efficiency, data monitoring, patient safety, and compliance with regulations. The EDC approach also has the following advantages: it speeds up study completion, it is more secure, and it makes the required information easier to find.
Types of EDC systems:
1. Web-based EDC Systems
2. Cloud-Based EDC Systems
3. Hybrid EDC Systems
4. Custom EDC Systems
5. Open-Source EDC Systems
6. Vendor-Supported EDC Systems [4].
Features and functions
To meet the needs of researchers and keep up with market developments, EDC software providers regularly create new features. CRF design, data entry, query management, and data export are examples of basic functionality. Integration with Clinical Trial Management System (CTMS) software, end-to-end use of CDISC standards, risk-based monitoring, patient-reported outcome (PRO) software, and standardized coding dictionaries such as WHO Drug and MedDRA are examples of advanced features. Other features include randomization of participants and risk management systems. Automatic query generation, automatic medical term coding, creation of customized reports, strong data cleaning modules, and source data verification are further examples of advanced functionality. Regulatory agencies expect adverse events and medications to be coded against the MedDRA and WHO Drug dictionaries, respectively (an illustrative coding sketch follows this section).
Implementing an EDC system
Even though putting an EDC system in place can be difficult, clinical trials can gain a great deal from it if it is done with care. The steps involved in implementing an EDC system include:
1. Defining the EDC needs
2. Selecting the appropriate EDC solution
3. Planning the implementation
4. Designing the EDC database
5. Testing the EDC system
6. Deploying the EDC system
7. Monitoring the EDC system [4][5][6]
Data validation and query management
Data cleaning is essential for accurate, reliable, high-quality clinical trial data. To streamline data cleaning and reduce manual review and correction, an EDC system can provide capabilities such as automatic edit checks and data validation rules. Data cleaning is crucial for regulatory compliance and decision-making because it prevents biased outcomes and potential patient harm; without it, clinical study findings may be distorted, leading to poor outcomes and inaccurate conclusions. The main data cleaning activities are the detection of errors and discrepancies, the correction of errors and discrepancies, and the documentation of data cleaning activities.
Query management is essential for resolving data queries in clinical trials and guaranteeing complete and accurate data. It helps to find and eliminate errors, discrepancies, and inconsistencies, enhancing data quality and lowering the risk of regulatory non-compliance. For effective query tracking and resolution, an EDC system provides a query management module covering query identification, query creation, responses to queries, and recording of query management activities [7] [8].
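As a purely illustrative sketch of the automatic medical term coding mentioned above, the snippet below matches verbatim adverse event terms against a small lookup table. The terms and codes shown are invented placeholders, not actual MedDRA content; real coding relies on licensed dictionaries plus synonym handling and manual review of unmatched terms.

# Hypothetical dictionary-based coding of reported adverse event terms.
# The preferred terms and codes below are placeholders, not real MedDRA entries.

coding_dictionary = {
    "headache":  {"preferred_term": "Headache",  "code": "PT0001"},
    "nausea":    {"preferred_term": "Nausea",    "code": "PT0002"},
    "dizziness": {"preferred_term": "Dizziness", "code": "PT0003"},
}

reported_terms = ["Headache", "mild nausea", "DIZZINESS"]

for verbatim in reported_terms:
    key = verbatim.strip().lower()
    match = coding_dictionary.get(key)
    if match:
        print(f"'{verbatim}' -> {match['preferred_term']} ({match['code']})")
    else:
        # Unmatched terms would be routed to a medical coder for manual review.
        print(f"'{verbatim}' -> no automatic match; manual coding required")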
IV. DATA STANDARDS AND HARMONIZATION
Data standards are established by the regulatory authorities that assess clinical research reports; they constitute published guidelines for the collection, use, and management of data obtained during clinical trials.
CDISC
The Clinical Data Interchange Standards Consortium (CDISC) is a nonprofit organization that operates on a global scale and actively develops data standards using the skills and experience of volunteers working in the pharmaceutical industry. CDISC creates and promotes standards that make it possible for data to be collected, shared, submitted, and archived for use in the development of medical and biopharmaceutical products. The consortium works with international organizations such as the Chinese National Medical Products Administration
(NMPA), the Japanese Pharmaceuticals and Medical Devices Agency (PMDA), the European Medicines Agency (EMA), and the U.S. Food and Drug Administration (FDA) to develop rules and specifications that shape the standards for both clinical and nonclinical data. CDISC standards continue to grow and change alongside global regulation.
CDISC Domains: Domains exist for SEND, SDTM, and ADaM data. A domain is a group of related observations on a single topic that are gathered for all human or animal test participants in a clinical or nonclinical study.
Therapeutic PK domains: For SEND and SDTM, the PK domains of pharmacokinetic concentrations (PC), pharmacokinetic parameters (PP), and related records (RELREC) are crucial.
Non-therapeutic PK domains: SEND domains comprise the PC and PP domains as well as pool definitions (POOLDEF) and supplemental qualifiers for pharmacokinetic concentrations (SUPPPC). While SUPPPC offers additional qualifiers not included in the PC variables, POOLDEF aggregates and identifies the individual animals contributing to a pooled profile in a study.
SEND
SEND, the Standard for the Exchange of Nonclinical Data, is a guide for organizing, developing, and formatting nonclinical data. For nonclinical data, such as data from carcinogenicity studies, repeat-dose toxicology, single-dose toxicology, respiratory and cardiovascular tests, and specimen assessments for immunogenicity, the SEND Implementation Guide (SEND-IG) provides specified domains and examples.
SDTM
The SDTM, or Study Data Tabulation Model, is a well-known CDISC standard for organized and structured content in clinical research data. Based on the SDTM's structure and metadata, the SDTM Implementation Guide (SDTM-IG) offers a standardized set of domains for clinical data reporting.
ADaM
ADaM, the Analysis Data Model, is the data format the FDA commonly expects for analysis-ready data, typically derived from SDTM data. This format differs from the SDTM standard, which concentrates on organizing and mapping data acquired from raw sources [9].
Harmonization of data
Data harmonization is the process of combining data from several studies into a single, usable dataset in order to improve comparability and answer research questions. Critical issues in medical, psychological, and behavioral research can be addressed by combining data from multi-center, independent, or single-center studies and from restricted data sources. Two types of harmonization exist. In prospective harmonization, investigators establish rules for data collection, management, and pooling before starting the research; this approach offers flexibility and core statistics and requires less work to overcome the drawbacks of retrospective harmonization. Retrospective harmonization involves combining information from several studies and using the expertise of subject-matter experts to recognize and translate variables into standard definitions and units of measurement [10].
Role of data standards in data sharing and interoperability
Increasing interoperability
Improving comparability
Increasing discoverability
Enabling aggregation
Enabling linkability [11].
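To make the idea of SDTM-style tabulation described above more concrete, the sketch below maps a few raw demographic fields into variables resembling an SDTM DM (Demographics) domain. It is a simplified illustration under assumed column names; the real SDTM-IG defines many more required variables and controlled terminology.

import pandas as pd

# Raw demographics as collected on the eCRF (hypothetical column names).
raw = pd.DataFrame({
    "subject_id":  ["001", "002"],
    "birth_date":  ["1980-05-12", "1975-11-03"],
    "sex":         ["Male", "Female"],
    "enroll_date": ["2023-01-10", "2023-01-15"],
})

# Map raw fields to SDTM-like DM variables (greatly simplified).
dm = pd.DataFrame({
    "STUDYID": "STUDY01",                       # assumed study identifier
    "DOMAIN":  "DM",
    "USUBJID": "STUDY01-" + raw["subject_id"],  # unique subject identifier
    "BRTHDTC": raw["birth_date"],               # ISO 8601 dates as collected
    "SEX":     raw["sex"].str[0].str.upper(),   # controlled terminology: M/F
    "RFSTDTC": raw["enroll_date"],              # reference start date
})

print(dm)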
V. STATISTICAL ANALYSIS PLANNING
In clinical trials, statistical analysis planning is a crucial phase in which an overall strategy for analyzing the trial data is developed. The main objective of statistical analysis planning is to make sure that the statistical methods and processes used in the analysis are well defined, valid, and suitable for addressing the research objectives [12].
A. Overview of statistical analysis planning in clinical trials
The statistical analysis plan (SAP) is a defined overview of the planned basic and advanced statistical procedures for the trial analysis, which is described both in the study protocol and in a stand-alone document. The SAP is essential and is one of the most important confidential regulatory documents in the planning of a clinical study. In ICH E9 such documents are typically referred to as reporting and analysis plans, although in some organisations they may alternatively be called data analysis plans (DAP) or statistical analysis plans (SAP). In the clinical study report, these analyses evaluate the efficacy and safety of the investigational medicinal product in comparison with the standard drug.
B. Statistical analysis plan (SAP) development and documentation
Key guidelines used in SAP development include ICH E9 (International Conference on Harmonisation of Technical Requirements for Pharmaceuticals for Human Use) and SPIRIT (Standard Protocol Items: Recommendations for Interventional Trials). The following must be clearly described in the SAP: the aims; the primary, secondary, and exploratory objectives; the primary, secondary, and exploratory endpoints; the trial population; the trial design; the sample size calculations with their justifications and assumptions; and the randomization procedures. A SAP must also provide a thorough explanation of the statistical techniques, including the efficacy analysis, safety data analysis, reporting formats, and so on [13]. According to the ICH E9 guidance, "the principal features of the eventual SAP of the data should be described in the statistical section of the protocol." A detailed, thorough, and transparent summary of the intended statistical analyses must be made available in order to support the reproducibility of research and dispel concerns about misconduct and fraud in clinical research. According to standard guidelines, the SAP should be kept in the confidential clinical trial master file and used during regulatory audits to verify whether the statistical records comply with standards [14].
C. Selection of appropriate statistical methods and endpoints
A crucial step in the analysis of biomedical data is choosing a suitable method for statistical analysis. A poor choice of statistical approach not only causes major problems during the interpretation of the findings but also affects the study's conclusions. Statistical procedures are available for the analysis and interpretation of data in every specific situation, and a researcher must understand the assumptions and conditions of the statistical methods in order to choose the best one for the data at hand [15]. Data analysis uses two basic kinds of statistics: descriptive statistics, which use indices such as the mean, median, and standard deviation to summarize data, and inferential statistics, which use statistical tests such as Student's t-test and the ANOVA test to draw conclusions from the data [16].
The selection of an appropriate statistical method depends on the following three things:
1. Aim and objective of the study,
2. Type and distribution of the data used, and
3. Nature of the observations (paired/unpaired).
Type and distribution of the data used: The selection of the statistical test varies with the data type.
DATA TYPE: Nominal, ordinal, or discrete data — STATISTICAL TEST: Nonparametric methods
DATA TYPE: Continuous data — STATISTICAL TEST: Parametric methods (t-test, F test, Pearson correlation coefficient, linear regression) as well as nonparametric methods
Concept of parametric and nonparametric methods: The term "parametric" refers to all statistical techniques used to compare means, whereas "nonparametric" refers to techniques used to compare quantities other than means, such as medians, mean ranks, and proportions. Parametric tests are based on the assumption that the variable is continuous and normally distributed.
Nonparametric approaches are employed when continuous data have a non-normal distribution or when the data are of types other than continuous variables. The most frequently used parametric methods have nonparametric counterparts, which act as backup analyses when the assumptions of a parametric test are violated [17] (a small sketch of this fallback logic appears after the discussion of endpoints below).
ENDPOINTS: An endpoint is a specific outcome of a clinical trial that is statistically evaluated to determine the safety and effectiveness of the medicine being tested. Endpoints (or outcomes) are the statistical measures chosen for each research participant that are required to meet the objectives. Examples of endpoints used in clinical trials are blood pressure, weight, survival time, and so on [18]. A clinical trial may have one or more primary, secondary, and exploratory endpoints. Primary endpoints are used to assess whether the research met its goals or, in the case of interventional clinical trials, to obtain regulatory approval. Secondary endpoints are measurements that add to our understanding of how a medication affects the primary endpoint or show additional effects on the disease or condition. Exploratory endpoints might be clinically important events that are anticipated to occur too infrequently to show a treatment effect [19]. The design of the clinical trial, the nature of the illness or condition being treated, and the expected effect of the medicine under investigation all influence the choice of endpoints [20].
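The following sketch illustrates the fallback from a parametric to a nonparametric comparison described above: a normality check is run first, and a two-sample t-test or a Mann-Whitney U test is chosen accordingly. The simulated data and the 0.05 threshold for the Shapiro-Wilk test are illustrative assumptions, not a prescription from this paper.

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Hypothetical endpoint values for two treatment groups.
group_a = rng.normal(loc=120, scale=10, size=30)
group_b = rng.normal(loc=115, scale=10, size=30)

# Shapiro-Wilk test of normality in each group (illustrative alpha = 0.05).
normal_a = stats.shapiro(group_a).pvalue > 0.05
normal_b = stats.shapiro(group_b).pvalue > 0.05

if normal_a and normal_b:
    # Parametric comparison of means.
    result = stats.ttest_ind(group_a, group_b)
    test_name = "Two-sample t-test"
else:
    # Nonparametric counterpart when normality is doubtful.
    result = stats.mannwhitneyu(group_a, group_b)
    test_name = "Mann-Whitney U test"

print(f"{test_name}: statistic={result.statistic:.3f}, p-value={result.pvalue:.4f}")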
D. Sample size calculation and power analysis
The most important phase in the design of a research study is calculating power and sample size. Power is the probability of correctly rejecting the null hypothesis, which claims that there are no statistically significant differences between study groups in the underlying population for sample estimates such as the mean, percentage, odds, or correlation coefficient. In simpler terms, researchers use estimates of sample size and power to figure out how many participants are required to answer the study question (or test the null hypothesis) [21].
The following quantities must be taken into account when determining the sample size: (i) alpha (α), (ii) beta (β), and (iii) the effect size (ES).
Alpha (or the P value) is the probability of finding a difference between the treatments when a difference does not exist. This is usually expressed as a 5% chance (i.e., P < 0.05) that the null hypothesis is falsely rejected. Falsely claiming a difference at P < 0.05 when none exists is a type I statistical error, and the risk of such errors grows when groups are compared repeatedly: each additional variable compared carries roughly a 5% risk that the treatments will appear to differ purely through random variation, so after examining several variables the probability that at least one of them differs between treatments is approximately 5% multiplied by the number of variables examined.
Table: The relationship between the null hypothesis and the true effects of the treatment (table not reproduced here).
Beta is the probability of failing to find a difference between the treatments when a difference exists. In the statistical literature, the maximum acceptable value is 0.20, which corresponds to a 20% risk that the null hypothesis is incorrectly accepted; exceeding 0.2 constitutes a type II statistical error, which happens when the null hypothesis is incorrectly accepted. This is usually expressed in terms of the power of the study, that is, the probability that the null hypothesis can be rejected if the treatments truly differ. Power is defined as 1 − β. For a β of 0.2 the power is 0.8, which is conventionally the minimum acceptable power for a study; type II statistical errors become a concern when the power of the study is below 0.8.
The effect size (ES) is a measure of the smallest clinically acceptable difference between treatments normalized by the standard deviation of the data (ES = Δ/SD, where Δ is the smallest clinically acceptable difference and SD is the standard deviation). The power of a study can also be calculated after the investigation is completed using the actual sample size, the effect size, and the α value; power tables based on the sample size, the ES calculated from the study results, and the α value are available in standard textbooks [22].
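As a minimal sketch of the sample size calculation described above, the snippet below uses statsmodels to solve for the per-group sample size of a two-sample t-test given an assumed standardized effect size, α, and power; the effect size of 0.5 is an illustrative assumption, not a recommendation.

from statsmodels.stats.power import TTestIndPower

# Assumed design inputs (illustrative values only).
effect_size = 0.5   # ES = smallest clinically acceptable difference / SD
alpha = 0.05        # two-sided type I error rate
power = 0.80        # 1 - beta

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=effect_size,
                                   alpha=alpha,
                                   power=power,
                                   alternative="two-sided")
print(f"Approximately {n_per_group:.0f} participants per group are required.")

# Post hoc: power achieved with, say, 50 participants per group.
achieved = analysis.power(effect_size=effect_size, nobs1=50,
                          ratio=1.0, alpha=alpha)
print(f"Power with 50 per group: {achieved:.2f}")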
VI. STATISTICAL ANALYSIS TECHNIQUES
A. Descriptive statistics and data summarization
Descriptive analysis is the process of summarizing and describing data using numerical metrics and graphical representations. These statistics offer a brief and meaningful summary of the main characteristics, patterns, and distributions present in a dataset. Descriptive analysis allows conclusions to be drawn about the distribution of the data and similarities among variables to be identified, enabling further statistical analysis [23]. A brief illustrative sketch of these summaries appears after the graphical methods below.
Types of descriptive analysis:
1. Frequency tabulations and distributions: Frequency tabulations and distributions provide an efficient counting summary of a sample of data points for ease of interpretation. Frequency tabulation is a very commonly applied procedure used to summarize information from questionnaires, both for tabulating demographic characteristics (e.g., gender, age, education level, occupation) and for actual responses to questions (e.g., the numbers responding 'yes' or 'no' to a particular question).
2. Central tendency: Central tendency provides a set of numerical summary measures that indicate the central, average, or typical score in a given variable's distribution of scores. The three most commonly reported measures of central tendency are the mean, median, and mode. Measures of central tendency are useful in almost any type of experimental design and in any observational study where quantitative data are available and must be summarized.
3. Variability: The purpose of assessing variability is to indicate the degree of spread in a sample of scores, that is, how different the scores are from one another relative to a corresponding measure of central tendency. The three most commonly reported measures of variability are the range, the variance, and the standard deviation. The range is the most basic measure of variability for a set of data scores; it is simply the sample's highest score minus the sample's lowest score and gives a very basic idea of the degree of spread in the scores. The variance assesses the variability of scores relative to the sample mean by using information from every individual in the sample; it measures the average squared deviation of each score from the sample mean, where the deviation is the difference between an observed score and the sample mean. The most often reported measure of variability is the standard deviation (SD), obtained by taking the square root of the variance. The standard deviation measures how far, on average, each score lies from the sample mean. In many instances, parametric statistical methods make use of both the sample mean and the sample standard deviation, so the standard deviation is an essential measure both for testing hypotheses and for describing data.
4. Correlation: Correlation coefficients describe the strength of association between two variables. They commonly range from 0, which indicates no correlation, to -1 and +1, which indicate perfect association. The square of the correlation coefficient can be viewed as the proportion of one variable's variation predicted by the other variable. Positive correlation coefficients describe situations in which increases in the value of one variable are associated with increases in the other variable, while negative correlation coefficients describe situations in which increases in one variable are associated with decreases in the other [24].
5. Graphical methods to display data: The main purpose of graphical methods is to visually summarize the characteristics of a data sample for ease of interpretation.
Bar graphs and pie charts: These two graph forms are useful for summarizing the frequency of occurrence of various values (or ranges of values) when the data are categorical (nominal or ordinal level of measurement). A bar chart summarizes data using vertical and horizontal axes: the vertical axis represents the frequency (number) of occurrences or the relative frequency (%) of occurrences, and the horizontal axis represents the data categories of interest. A pie chart visualizes category frequencies more simply by dividing a circular plot into wedges or slices whose sizes are proportional to the relative frequency (%) of the various data categories.
Histograms and frequency polygons: These two graph forms are useful for summarizing the frequency of occurrence of various values (or ranges of values) when the data are essentially continuous (interval or ratio level of measurement). Both histograms and frequency polygons summarize data using vertical and horizontal axes: the vertical axis represents the frequency (number) of occurrences or the relative frequency (%) of occurrences, while the horizontal axis represents the data values or ranges of values of interest.
The frequency polygon uses lines and points to represent frequency, whereas the histogram uses bars of variable height.
Line graphs: The line graph has an appearance similar to the frequency polygon but is far more adaptable for data summarization. Rather than plotting frequencies, it summarizes data such as averages or means for various groups of people. One major application of the line graph is therefore to display scores on a specific variable according to membership in the categories of a second variable.
Scatter plots: In correlational research, scatterplots are particularly useful for demonstrating the relationship between two interval- or ratio-scaled variables or measures of interest obtained on the same people. In a scatterplot, all data point pairs in the sample are graphed, with one variable represented on the horizontal axis and the second variable on the vertical axis. A scatterplot's shape and tilt give visual information about the direction and strength of the association between the two variables: positive relationships are shown by a cloud of dots directed upward and towards the right side of the graph, whereas a cloud of points with a general downward tilt towards the right indicates a negative relationship [25].
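The brief sketch promised above summarizes a small, made-up dataset with the descriptive measures just described: a frequency tabulation, measures of central tendency and variability, and a correlation coefficient. Column names and values are illustrative only.

import pandas as pd

# Hypothetical baseline data for a handful of participants.
df = pd.DataFrame({
    "sex": ["M", "F", "F", "M", "F", "M"],
    "age": [54, 61, 47, 58, 66, 50],
    "sbp": [132, 145, 121, 138, 150, 128],
})

# Frequency tabulation of a categorical variable.
print(df["sex"].value_counts())

# Central tendency and variability for a continuous variable.
print("mean:", df["age"].mean())
print("median:", df["age"].median())
print("range:", df["age"].max() - df["age"].min())
print("variance:", df["age"].var())      # sample variance
print("SD:", df["age"].std())            # sample standard deviation

# Correlation between two continuous variables (Pearson by default).
print("corr(age, sbp):", round(df["age"].corr(df["sbp"]), 2))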
B. Inferential statistics and hypothesis testing
Inferential statistics are frequently used to compare treatment group differences. They compare treatment groups using measures from the experiment's sample of subjects and draw conclusions about the larger population of subjects. Note that inferential statistics usually suggest, but cannot absolutely prove, an explanation or cause-and-effect relationship [26].
Most inferential statistics are based on the principle of a test-statistic value. This value, coupled with the degrees of freedom (a measure related to sample size) and the rejection criteria, is used to assess whether there are differences between the treatment groups. The larger the sample size, the more likely it is that a statistic will show that differences exist between treatment groups. The null hypothesis, H₀, is first proposed as a written statement or a mathematical expression; the most common null hypothesis is that "no difference(s) exist between the groups, and they all come from the same population." Second, an alternative hypothesis, H₁, is proposed, which will be accepted if there is sufficient evidence to reject the null hypothesis, i.e., H₁ will be "the two populations are different". The tests all generate a significance probability (p value) that shows the chance that the observed value of the test statistic belongs to the distribution specified by the null hypothesis. A p value of 0.5 indicates a 50% chance, or one chance in two, that the observation fits the null hypothesis, whereas a p value of 0.05 indicates that this probability is only 5%, or one chance in twenty. A one-in-two chance is not low enough to be confident that the null hypothesis is incorrect, whereas a one-in-twenty chance makes it far more likely; at this level we may agree that the null hypothesis is false. Hence, a p value of 0.05 is commonly used as the cut-off probability: p values less than 0.05 are considered good evidence against the null hypothesis, while larger values are not. Before doing an inferential test, it is important to understand the type of data being analyzed and whether the data, or transformed data, are normally distributed. If so, parametric tests can be employed to analyze the data; otherwise, nonparametric tests can be used [27].
C. Survival analysis and time-to-event endpoints
Survival analysis is a statistical tool for data in which the primary endpoint of interest is the time until an event happens. It is commonly used in medical research, particularly in clinical trials and observational studies. In clinical research, time-to-event endpoints, also known as survival endpoints, are frequently used to determine how long it takes for an event, such as death, disease recurrence, or treatment failure, to occur [28]. Some key concepts in survival analysis and time-to-event endpoints are listed below.
Survival time: Survival time refers to the duration from a defined starting point (e.g., diagnosis or treatment initiation) until the occurrence of a specific event or endpoint. The event of interest could be death, disease recurrence, progression, or any other predefined outcome. The survival function, S(t), expresses the probability that a person will survive longer than some specified time t, that is, the probability that the random variable T exceeds t. The survival function is fundamental to survival analysis.
Kaplan-Meier estimator: The Kaplan-Meier estimator is a nonparametric method for estimating the survival function, which represents the probability of surviving beyond a certain time point. It accounts for censored observations and produces survival estimates at successive event times (a minimal worked example follows below).
Survival curve: The survival curve is a graphical representation of the estimated survival function over time. It represents the probability of survival, or of the occurrence of an event, as a function of time. To analyse differences in survival outcomes, survival curves can be compared between different therapy groups or patient subgroups.
Censoring: Censoring occurs when the event of interest has not occurred for some individuals by the end of the study or when they are lost to follow-up. Censoring can be either right-censoring (the event did not occur by the end of the study) or interval-censoring (the event occurred between two observation times, but the exact time is unknown). Censored observations provide only partial information and must be handled carefully in survival analysis.
Hazard function: The hazard function (also known as the hazard rate) is a measure of the instantaneous rate at which events occur at a given time, assuming the individual has survived up to that time. It reflects the probability of experiencing an event in a short time interval, conditional on survival up to that time.
Hazard ratio (HR): The hazard ratio (HR) is similar in concept to the relative risk. It has been used to characterize the results of therapeutic studies in which the question is whether treatment can reduce the duration of an illness. The hazard ratio is an estimate of the ratio of the hazard rates in the treated and control groups [29] [30].
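The following sketch computes a Kaplan-Meier product-limit estimate by hand for a small, invented set of follow-up times with right-censoring, to show how censored subjects contribute to the risk set but not to the event count. In practice a validated package (for example, the lifelines library) would be used instead.

import numpy as np

# Hypothetical follow-up times (months) and event indicators (1 = event, 0 = censored).
time  = np.array([2, 3, 3, 5, 6, 8, 8, 12])
event = np.array([1, 1, 0, 1, 0, 1, 1, 0])

survival = 1.0
print("time  at_risk  events  S(t)")
for t in np.unique(time[event == 1]):          # distinct event times only
    at_risk = np.sum(time >= t)                # subjects still in the risk set at t
    d = np.sum((time == t) & (event == 1))     # events occurring exactly at t
    survival *= 1 - d / at_risk                # product-limit update
    print(f"{t:4d}  {at_risk:7d}  {d:6d}  {survival:.3f}")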
Cox proportional hazards model: A common regression model in survival analysis is the Cox proportional hazards model. It enables evaluation of the relationship between covariates (e.g., treatment, age, and gender) and the hazard function while assuming that the hazard ratio remains constant over time. The Cox model provides hazard ratios, which quantify the effect of a covariate on the probability of an event occurring. Cox regression is a semi-parametric approach, since the baseline hazard function, h0(t), does not need to be specified. The Cox proportional hazards model makes two assumptions: the hazard ratios of two people are time-independent, and the model applies to time-independent covariates. This means that the hazard functions for any two individuals are proportional at every point in time.
Log-rank test: The log-rank test is a popular statistical test for comparing survival curves between two or more groups. It determines whether there are statistically significant differences in survival times across groups, taking into account both observed and censored observations [31].
Survival analysis and time-to-event endpoints in clinical research provide useful insights into the probability and timing of events of interest. They enable the assessment of therapy efficacy, the identification of prognostic variables, and an understanding of disease development. These methods assist researchers and clinicians in making informed decisions about patient care and contribute to the advancement of evidence-based medicine.
D. Regression analysis and modeling
Regression models are increasingly being used in clinical and epidemiologic investigations to assess therapy efficacy, investigate risk factors, explore prognostic patterns, and make predictions for individual patients, among other things. Two kinds of models have become prominent: models for binary responses, such as in-hospital death or the presence/absence of a certain condition, and models for censored continuous responses, such as time until death or therapeutic response in a sample of patients, not all of whom may have died or responded.
There are four kinds of regression model assumptions:
1. The participants of the study are a random sample of the population to which the inference is to be drawn, with independent observations.
2. The response variable's distribution has certain characteristics. The logistic model makes no such assumption, whereas the Cox model assumes that the hazard functions for two individuals are proportional over time, or equivalently that the log[-log survival] or log hazard curves for two individuals are equidistant over time.
3. The function that connects a predictor to the response has a specific structure. In its most basic form, the logistic model assumes that a continuous predictor is linearly related to the log odds of the outcome.
4. Predictors act additively unless "interaction" variables are included in the model [32].
Overview of regression analysis and modeling in clinical trials:
Linear regression: Linear regression can be used in clinical trials to assess the influence of a treatment on a continuous outcome while controlling for other variables. It can, for example, be used to evaluate the effect of a drug dosage on blood pressure levels while controlling for variables such as age, gender, and baseline blood pressure.
Logistic regression: Logistic regression is typically employed when the outcome variable is binary or categorical (e.g., response vs. non-response, presence vs. absence of a disease). In clinical trials, logistic regression can be used to model treatment response or the occurrence of adverse events.
Cox proportional hazards regression: Survival analysis uses Cox proportional hazards regression to assess the relationship between predictor variables and time-to-event outcomes. In clinical trials, Cox regression is commonly used to evaluate the effects of treatments or prognostic factors on survival outcomes.
Model selection and validation: In clinical trials, it is important to carefully select variables and adjust for potential confounders before applying regression analysis and modeling [33].
In clinical trials, regression analysis and modeling provide useful insights into the relationships between variables, treatment effects, and outcome predictions. Researchers can use these strategies to account for confounding effects, adjust for covariates, and make evidence-based decisions. At the same time, proper model development, validation, and interpretation are critical to ensuring the correctness and trustworthiness of the results (a brief illustrative sketch of a covariate-adjusted logistic regression follows below).
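As a minimal sketch of the logistic regression use case described above, the snippet below fits a model for a binary response with treatment arm and age as covariates using statsmodels; the dataset, variable names, and effect sizes are simulated assumptions, not trial results.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200

# Simulated trial data: treatment arm, age, and a binary response.
df = pd.DataFrame({
    "treatment": rng.integers(0, 2, size=n),   # 0 = control, 1 = active
    "age": rng.normal(60, 10, size=n),
})
logit_p = -2.0 + 1.0 * df["treatment"] - 0.02 * (df["age"] - 60)
df["response"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

# Covariate-adjusted logistic regression: response ~ treatment + age.
model = smf.logit("response ~ treatment + age", data=df).fit(disp=False)
print(model.summary())

# Odds ratios with 95% confidence intervals.
odds_ratios = np.exp(model.params)
conf_int = np.exp(model.conf_int())
print(pd.concat([odds_ratios.rename("OR"), conf_int], axis=1))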
E. Subgroup analysis and stratification
In general, a subgroup is a subset or portion of a study population that is distinguished by a particular characteristic or set of characteristics. Subgroup analysis is defined as any data analysis focused on a selected subgroup. Subgroup analysis aims at characterizing observed differences among multiple subgroups, such as comparing treatment differences in a trial for different patient subgroups defined by gender, age at entry, and other baseline characteristics. It is a type of exploratory data analysis in which the goal is to discover a subgroup of people who can account for an observed difference, for example in a study evaluating whether an observed treatment difference can be explained by some subgroup.
Stratification: Stratification is the deliberate and defined allocation of participants to distinct treatment groups based on specific predefined variables or characteristics. A stratification variable is also a subgrouping variable. The goal is to ensure that essential characteristics are represented fairly across treatment arms, which can increase internal validity and reduce confounding effects. Stratification in trials is done at randomization time to guarantee that the treatment groups are comparable with respect to the stratification variable. Comparability is achieved by ensuring that the treatment groups have the same mix of individuals in terms of the stratification variable; for example, if gender is stratified, each treatment group will have the same proportion of males to females [34].
F. Sensitivity analysis and missing data handling
Sensitivity analysis examines the impact of different assumptions or variations in the analysis to determine the robustness of study findings or conclusions. It helps in determining the stability of the results and the impact of important parameters on study outcomes. If the sensitivity analyses produce results that are consistent with the primary results, researchers can be confident that the assumptions used in the primary analysis had little impact on the outcomes, lending support to the trial findings. A particular analysis is considered a sensitivity analysis if it meets the following criteria:
1. The proposed analysis aims to answer the same question as the primary analysis,
2. There is a possibility that the proposed analysis will yield results that differ from those of the primary analysis, and
3. There would be uncertainty as to which analysis to believe if the proposed analysis produced results that differed from the primary analysis.
These criteria can help guide sensitivity analyses and identify what to consider when assessing their results [35].
Handling of missing data: Although researchers make every effort to avoid missing data, they are present in almost every study. Ignoring missing data in statistical analysis can result in significantly biased study outcomes. Rubin was the first to create a framework of several sorts of missing data (missing data mechanisms) that are important in determining the next steps in missing data handling. The three missing data mechanisms are missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). MCAR means that missing values are distributed at random across the data sample; the cause of the missing data is unrelated to study variables or outcomes.
If, for example, data on blood pressure are lacking because some persons could not visit the study centre owing to, say, a public transport strike, these missing data are MCAR. The term MAR denotes that the probability of missing data is related to other observed variables; for example, when blood pressure data are more often absent for persons with a high BMI, these missing data are referred to as MAR. MNAR occurs when the probability of missing data depends on the values of the variable itself; this happens, for instance, when the patients with the highest blood pressure values do not come to the research centre. When missing data are MNAR, there is no straightforward way to obtain unbiased results; one possibility is to conduct a large number of sensitivity analyses to evaluate the influence of the missing data on the study findings.
The multiple imputation (MI) method is used to handle missing data. MI consists of three phases: imputation, analysis, and pooling. In the imputation phase, each missing value is replaced by several different plausible values, which leads to multiple imputed datasets. In the analysis phase, the multiple datasets are analyzed with the appropriate statistical approach, and in the pooling phase the results are combined into one final estimate according to Rubin's rules [36] (a simplified sketch of this imputation-analysis-pooling workflow follows below). Missing data handling and sensitivity analyses help ensure the validity and reliability of study results by accounting for the impact of missing data on the analysis conclusions.
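The sketch below imitates the three MI phases on a toy dataset: several imputed copies are generated with scikit-learn's IterativeImputer under different random seeds, a simple estimate (the mean systolic blood pressure) is computed on each copy, and the estimates are then pooled by averaging. This is a simplified illustration under assumed data; full Rubin's rules would also combine the within- and between-imputation variances.

import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Toy dataset with a missing systolic blood pressure value.
df = pd.DataFrame({
    "age": [54, 61, 47, 58, 66, 50],
    "bmi": [27.1, 31.4, 24.0, 29.8, 33.0, 25.5],
    "sbp": [132, 145, 121, np.nan, 150, 128],
})

m = 5  # number of imputations
estimates = []
for seed in range(m):
    # Imputation phase: draw one completed dataset per seed.
    imputer = IterativeImputer(sample_posterior=True, random_state=seed)
    completed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)
    # Analysis phase: compute the quantity of interest on the completed data.
    estimates.append(completed["sbp"].mean())

# Pooling phase (simplified): average the per-imputation estimates.
print("Per-imputation means:", np.round(estimates, 1))
print("Pooled estimate of mean SBP:", round(float(np.mean(estimates)), 1))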
VII. SAFETY AND EFFICACY ANALYSIS
A. Analysis of safety endpoints and adverse events
An important part of clinical research and drug development is the analysis of safety endpoints and adverse events. It involves examining the occurrence of adverse events and their potential impact on patients to determine the safety profile of a medical intervention, such as a medicine or treatment. Safety endpoints are specific events or measures used to evaluate the safety of a treatment. These endpoints may include adverse events, which are any untoward medical occurrences arising during treatment; adverse reactions may range from moderate to severe or even life-threatening. Analyzing safety endpoints involves collecting and documenting adverse events reported by study participants, categorizing them according to severity and relevance to therapy, and estimating their incidence and overall impact. The analysis of safety endpoints and adverse events usually includes calculating the incidence rates of adverse events, determining the severity and duration of each event, investigating potential patterns or trends, and comparing the safety profile of the treatment under study with a control group or reference standard. To assess the significance of recorded adverse events and their association with the treatment, statistical approaches such as chi-square testing or logistic regression may be used (an illustrative sketch appears at the end of this section).
B. Analysis of efficacy endpoints and treatment outcomes
Analyzing efficacy endpoints and treatment outcomes evaluates the effectiveness of a medical intervention in attaining its intended purpose or therapeutic goals. Efficacy endpoints are specific measurements or outcomes that represent a treatment's therapeutic effect. Depending on the nature of the study, efficacy endpoint analysis may include examining parameters such as symptom improvement, disease progression, or overall survival rates. In most cases, these endpoints are defined and outlined in the study protocol or statistical analysis plan. To examine efficacy endpoints, researchers collect relevant information from study participants, compare outcomes between treatment groups (e.g., experimental vs. control), and use statistical analysis to establish the significance of observed differences. T-tests, analysis of variance (ANOVA), survival analysis, and regression models are common statistical approaches used in efficacy analysis. Treatment adherence, patient compliance, and potential confounding variables that may influence treatment outcomes are all elements considered in the analysis of efficacy endpoints. Subgroup analyses may also be used to see whether treatment efficacy varies by patient population or demographic factor [37].
C. Data monitoring committees and interim analysis
Data monitoring committees (DMCs), also known as Data Safety Monitoring Boards (DSMBs) or independent monitoring committees, are independent bodies of experts who oversee clinical trial progress and safety. They provide an additional level of oversight and monitoring to assure participant safety and the study's scientific integrity. DMCs perform interim analyses, which are scheduled evaluations of accumulating research data carried out at pre-specified time points during a clinical trial. Interim analyses provide an early assessment of the study's progress, safety, and efficacy. Their basic goals are to determine whether the trial should proceed as planned, whether the protocol should be modified, or whether the study should be stopped prematurely owing to safety or efficacy issues.
During interim analyses, the DMC reviews unblinded data related to safety endpoints and efficacy outcomes. The committee examines adverse event rates, treatment effects, and statistical analyses to assess the overall risk-benefit balance of the intervention. Based on the interim analysis findings, the DMC's recommendations assist the research sponsor or investigators in making informed decisions about the trial's continuation or modification.
D. Independent statistical analysis and review
Independent statistical analysis and review play a crucial role in ensuring the validity, reliability, and transparency of clinical trial data. Independent statisticians or statistical review committees, separate from the study investigators, conduct a thorough examination of the study's statistical methodologies, data collection, and analytic processes. Their responsibilities include ensuring the accuracy and consistency of the obtained data, evaluating the appropriateness of the statistical methods used in the analysis, and confirming the validity of the study conclusions. To validate the presented conclusions, independent statisticians may conduct further analyses or reanalyze the data. Potential biases or errors in the study design, data collection, or analysis can be detected and addressed through independent statistical analysis and review. This procedure boosts the credibility of the study's findings and ensures that the conclusions drawn from the data are accurate and reliable. Independent statistical analysis and review are especially important in regulatory submissions, where other experts may be involved in evaluating the study's methodology and data quality [38].
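To make the adverse event comparison mentioned in subsection A concrete, the sketch below compares the incidence of an adverse event between two arms with a chi-square test on a 2x2 contingency table; the counts are invented for illustration.

from scipy.stats import chi2_contingency

# Hypothetical counts: rows = treatment arm, columns = AE yes / AE no.
#                 AE   no AE
table = [[18, 82],   # active arm (n = 100)
         [ 9, 91]]   # control arm (n = 100)

chi2, p_value, dof, expected = chi2_contingency(table)

incidence_active = table[0][0] / sum(table[0])
incidence_control = table[1][0] / sum(table[1])
print(f"AE incidence: active {incidence_active:.1%}, control {incidence_control:.1%}")
print(f"Chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.3f}")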
VIII. DATA MONITORING AND QUALITY ASSURANCE
The quality assurance and monitoring program is crucial for ensuring high-quality data acquisition and reporting in study operations. It should be developed and implemented by coordinating center investigators, with the support of study leadership and field site personnel. Data quality monitoring should cover the entire data collection, transmission, storage, and analysis process, with summary reports prepared and distributed to study leadership.
A. Data monitoring committees and their role
The Data and Safety Monitoring Committee (DSMC) is an independent body that makes recommendations on whether a trial should be continued or terminated based on safety and effectiveness considerations. Its major goal is to protect patient safety, and blinding the DSMC to the assigned treatment is normally avoided. The committee is typically made up of three to five doctors, statisticians, chemists, and ethicists. The DSMC must establish statistical analysis guidelines at the start of the trial, but it cannot depend solely on statistical rules. The DSMC must also maintain a current Endpoint Committee and assure appropriate data quality.
B. Monitoring of data quality and integrity
For organisations that rely on precise, dependable data for reporting, compliance, and customer satisfaction, it is essential to uphold high standards of data quality and integrity. Data quality and integrity are assessed by examining factors such as accuracy, completeness, timeliness, consistency, validity, and relevance. To assure data quality: prioritise it and create measurements; investigate failures; invest in internal training; create data governance guidelines; audit processes and assign data stewards; keep security requirements up to date; implement a centralised source of truth; integrate and automate data streams; and use the cloud to ensure consistent data management [39].
C. Data cleaning procedures and validation
Data cleaning is a three-stage process that involves screening, diagnosing, and editing suspected data abnormalities. It is more efficient to detect errors by actively searching for them. In small studies, there may be little or no distinction between databases and analysis datasets [40].
1. Screening phase
Screening data involves distinguishing four types of oddities: lack or excess of data, outliers, strange patterns in distributions, and unexpected analysis results. Screening methods need not only be statistical; they can also be based on investigator experience, pilot studies, literature evidence, or common sense. To make screening objective and systematic, researchers should examine the data with simple descriptive tools, predefine expectations, plan applications, and compare the data with the screening criteria. Erroneous inliers can be identified using scatter plots, regression analysis, or consistency checks. Screening methods include checking of questionnaires using fixed algorithms; validated data entry and double data entry; browsing of data tables after sorting; printouts of variables not passing range checks and of records not passing consistency checks; and graphical exploration of distributions with box plots, histograms, and scatter plots [41] (a small outlier-screening sketch follows below).
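The following sketch illustrates one simple statistical screening step described above: flagging potential outliers in a continuous variable with the interquartile-range rule, the same logic that underlies a box plot. The data and the 1.5 x IQR multiplier are illustrative conventions, and any flagged value would still go through the diagnostic phase before being changed.

import pandas as pd

# Hypothetical laboratory values for screening.
values = pd.Series([5.1, 4.8, 5.3, 5.0, 4.9, 12.7, 5.2, 4.7],
                   index=[f"subj-{i:03d}" for i in range(1, 9)],
                   name="lab_value")

q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = values[(values < lower) | (values > upper)]
print(f"Acceptable range by IQR rule: {lower:.2f} to {upper:.2f}")
print("Values flagged for diagnostic review:")
print(outliers)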
2. Diagnostic phase
This phase aims to clarify the nature of suspect data points, patterns, and statistics. Possible diagnoses include an erroneous value, a true extreme, a true normal value, or an idiopathic (unexplained) finding. Some data points are logically or biologically impossible, which is why both soft and hard cutoffs are needed [42]. In some cases, a combination of diagnostic procedures may be necessary. To ensure consistency in the data flow, investigators should work from well-archived and documented data and justify any changes. In the diagnostic phase, investigators must use subject-matter knowledge to identify acceptable values. Correcting input errors and reviewing quality assurance procedures can help. The diagnostic phase is labor-intensive, and its budgetary, logistical, and personnel requirements are often underestimated.

3. Treatment phase
Researchers must decide how to handle problematic observations, including errors, missing values, and true values. They can correct impossible values, delete them if they cannot be corrected, or leave doubtful values unchanged. For biological continuous variables, accuracy can be enhanced by taking the average of duplicate measurements. For true extreme values and suspect values, the investigator should examine their influence on the analysis results and use statistical methods to evaluate their impact. Some authors recommend keeping true extreme values in the analysis, but many exceptions exist [43]. True extreme values may be excluded because of unexpected extraneous processes or protocol-prescribed exclusion criteria.

Data validation
The Society for Clinical Data Management's guidelines for good clinical data management
practices lack focus on data quality standards for clinical trial data. The data validation stage involves identifying missing data through standard data cleaning reports. It is crucial to understand the difference between handling missing data for data cleansing purposes and for efficacy and safety analysis. Data reconciliation occurs at the end of a clinical trial and compares two sets of records to ensure accuracy and validity; questions raised during reconciliation should be handled in the same manner as clinical queries. Standard operating procedures and quality analysis should be part of every study, regardless of the endpoints [44][45].

D. Handling of protocol deviations and missing data
Missing data are identified during data validation using common data cleaning reports. It is crucial to differentiate between treating missing data for data cleansing and for efficacy and safety analysis. Data cleansing should gather missing information on CRFs, and data loading should be properly performed.

Major (serious) protocol deviations may affect the subject's rights, safety, or well-being, or the completeness and accuracy of study data. Examples include:
1. The research subject received the wrong treatment or dose.
2. Subjects did not meet inclusion criteria but were not withdrawn from the study (e.g., age requirements, certain health conditions, test results outside the specified range).
3. The research subject received an excluded concomitant medication.
4. Breaches of confidentiality.

Minor (non-serious) protocol deviations are not eligible for prospective review because they do not significantly affect subjects' rights, safety, or well-being, or the completeness and reliability of the data. Examples include:
1. Participants do not show up for scheduled research visits.
2. Study procedures conducted out of sequence.
3. Failure to perform required laboratory tests, measurements, or evaluations.

Both emergent and non-emergent deviations are possible. According to GCP guidelines and 21 CFR 312.56(d), the sponsor must notify the reviewer as soon as possible if a deviation occurs in an emergency, such as when a participant's life or physical well-being is in danger. The FDA will consult with a sponsor about the necessity of ending an investigation upon request. The term "change in research" (for medical research) should be used to describe non-emergent deviations that constitute a significant modification of the approved protocol. From the perspective of data management, there are six main groups of protocol deviations:
1. Protocol inclusion/exclusion criteria
2. Discontinuation of treatment
3. Compliance
4. Study drug related
5. Medication related
6. Pain related [46].

IX. REGULATORY CONSIDERATIONS AND REPORTING
A. Regulatory requirements for data management and analysis
It is necessary to adhere to rules and standards while managing and analyzing data. Effective CDM procedures and electronic data capture standards must be followed, since the pharmaceutical sector depends on electronically generated data to evaluate medications. The Society for Clinical Data Management (SCDM) has developed the Good Clinical Data Management Practices (GCDMP) guidelines, which offer advice on accepted, regulatory-compliant CDM practices.
The Study Data Tabulation Model Implementation Guide for Human Clinical Trials (SDTMIG) and the Clinical Data Acquisition Standards Harmonization (CDASH) standards, available free of cost from the CDISC website, are among the standards developed to support the acquisition, exchange, submission, and archival of clinical research data. Data validation is performed to check whether the collected data conform to the protocol specifications. Edit check programs, incorporated into the database, are built to identify discrepancies in the entered data and ensure data validity. The most difficult challenge is planning and implementing data management systems in a rapidly changing operational environment where technology growth outpaces the current infrastructure [47].
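To make the idea of standardized submission datasets more concrete, the rough sketch below reshapes a few collected demographic fields into an SDTM-style DM (demographics) domain. The source column names, study identifier, and mapping are simplified assumptions for illustration and do not constitute a complete SDTMIG-conformant implementation.

# Simplified sketch of reshaping collected demographics into an SDTM-style DM domain.
# Source column names are hypothetical; the target variables (STUDYID, USUBJID, AGE,
# SEX, ARM) follow common SDTM naming, but this is not a full SDTMIG mapping.
import pandas as pd

raw = pd.DataFrame({
    "site":           ["01", "01", "02"],
    "subject":        ["001", "002", "001"],
    "birth_sex":      ["F", "M", "F"],
    "age_years":      [52, 61, 47],
    "randomized_arm": ["Drug A", "Placebo", "Drug A"],
})

dm = pd.DataFrame({
    "STUDYID": "ABC-123",                                   # hypothetical study identifier
    "DOMAIN":  "DM",
    "USUBJID": "ABC-123-" + raw["site"] + "-" + raw["subject"],
    "AGE":     raw["age_years"],
    "AGEU":    "YEARS",
    "SEX":     raw["birth_sex"],
    "ARM":     raw["randomized_arm"],
})
print(dm)

In practice, such mappings are documented in define.xml metadata, and submission datasets are typically exchanged in SAS transport (XPT) format.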
Submission of statistical analysis plans (SAP) and datasets
Requirements for SAP and dataset submission may differ among regulatory agencies and countries. Thorough consultation with the guidelines and regulations of the respective regulatory authority, such as the FDA in the United States or the EMA in the European Union, is needed. In the planning of a clinical study, the SAP is crucial and is among the most significant confidential regulatory documents. The ICH E9 (International Conference on Harmonization of Technical Requirements for Pharmaceuticals for Human Use) recommendations were important standards used in the development of the SAP. The results and key findings are supported by the statistical analyses and datasets. The SAP, which specifies the statistical methods and analyses to be used in assessing the research objectives, is created during the first stage of a clinical investigation. The study team and statisticians review the SAP to ensure its accuracy and appropriateness before presenting it to the regulatory organizations. Regulatory authorities are provided a copy of the SAP to evaluate the reliability and validity of the planned analysis. Some regulatory submissions, such as the New Drug Application (NDA) and the Investigational New Drug (IND) application, contain the authorized SAP.

Datasets, along with raw and derived data, need to be prepared for submission. The preparation of datasets involves organizing the data in a standardized format that complies with regulatory guidelines such as the Clinical Data Interchange Standards Consortium (CDISC) standards. Before submission, data integrity checks and quality control procedures are performed to ensure the completeness and consistency of the datasets. Datasets are submitted in a specified format depending on the regulatory agency's requirements; commonly used formats are the Study Data Tabulation Model (SDTM) for clinical trial data and the Analysis Data Model (ADaM) for analysis datasets.

B. Clinical study reports (CSR) and statistical summaries
CSRs provide detailed information about the study's design, methodology, findings, and conclusions. A CSR is the primary document for presenting the findings of a clinical study. CSRs should adhere to regulatory guidelines issued by the International Council for Harmonization of Technical Requirements for Pharmaceuticals for Human Use (ICH). The statistical analyses in CSRs include efficacy and safety analyses, the handling of missing data, subgroup analyses, and related topics. Statistical summaries provide a concise overview of the key statistical analyses and findings from the clinical study; their aim is to present the statistical findings in a clear and understandable manner. Summaries are important documents because they provide the information needed to assess the robustness and significance of the clinical study. Both CSRs and statistical summaries are crucial for regulatory submissions, as they provide the information regulatory authorities need to evaluate the validity, safety, and efficacy of the clinical study.

C. Adverse event reporting and safety analysis
Adverse event reporting and safety analysis are essential in the context of drug development and clinical trials. Researchers, sponsors, and investigators have an obligation to report adverse events to regulatory bodies. Serious adverse events are reported within a defined period, and reporting may use standardized forms or electronic reporting systems. Safety analysis involves evaluating the safety profile of the study intervention, including the assessment of adverse events, laboratory data, and vital signs collected during the study. These safety findings are documented in the clinical study report (CSR) and other regulatory submissions. Safety analysis ultimately supports the benefit-risk assessment of the study intervention.
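As a rough illustration of the descriptive tabulations that feed such an assessment, the sketch below counts adverse events by treatment arm and severity and identifies subjects with at least one serious event; the dataset and column names are hypothetical.

# Rough sketch of a descriptive safety tabulation: adverse events by arm and severity.
# The dataset and column names are hypothetical.
import pandas as pd

ae = pd.DataFrame({
    "subject_id": ["001", "002", "002", "003", "004"],
    "arm":        ["Drug A", "Drug A", "Drug A", "Placebo", "Placebo"],
    "ae_term":    ["Headache", "Nausea", "Rash", "Headache", "Fatigue"],
    "severity":   ["Mild", "Moderate", "Severe", "Mild", "Mild"],
    "serious":    [False, False, True, False, False],
})

# Number of adverse events by treatment arm and severity grade.
by_severity = ae.groupby(["arm", "severity"]).size().unstack(fill_value=0)

# Number of subjects with at least one serious adverse event per arm.
sae_subjects = ae[ae["serious"]].groupby("arm")["subject_id"].nunique()

print(by_severity)
print("Subjects with serious AEs:\n", sae_subjects)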
These analyses help determine the safety of the intervention and inform regulatory decisions regarding its approval. Compliance with regulatory requirements for data management, analysis, submission, and reporting is important to ensure the reliability of the clinical study and to protect the well-being of study participants.

X. EMERGING TRENDS AND TECHNOLOGIES
Emerging trends and technologies are reshaping the clinical trial landscape, transforming how research is conducted, data are collected, and patient outcomes are monitored. Some important clinical trial trends and technologies include:
1. Decentralized clinical trials
2. Electronic informed consent
3. Artificial Intelligence (AI) and Machine Learning (ML)
4. Real-World Data and Real-World Evidence
5. Blockchain technology

A. Use of real-world data and real-world evidence in clinical trial analysis
In the field of healthcare, the use of real-world data (RWD) and real-world evidence (RWE) in clinical trial analysis is a developing trend. Traditional clinical trials are carried out in controlled conditions with stringent inclusion and exclusion criteria, which may not adequately represent the diversity and complexity of real-world patient populations [48]. Data collected from routine clinical practice, such as electronic health records, claims databases, patient registries, wearable devices, and mobile health applications, are referred to as real-world data. Real-world evidence, in turn, is derived by analyzing these data to understand the safety, efficacy, and effectiveness of medical therapies in real-world settings [49].
Some important advantages of incorporating real-world data and evidence include:

Diverse populations: Real-world data cover a more diversified patient population, including a range of demographics, comorbidities, and treatment patterns. This provides a more complete picture of how interventions perform in different patient groups, which can inform personalized healthcare approaches.

Long-term results: Clinical trials frequently have short follow-up periods, whereas real-world data can capture long-term outcomes and assess the durability of therapeutic benefits. This allows a more precise assessment of safety profiles, treatment adherence, and long-term effectiveness.

Safety monitoring: Real-world data help detect rare or long-term adverse events that may not be visible in controlled clinical studies. This allows early detection of warning signs and continual monitoring of drug safety profiles.

Wearables and remote monitoring: Wearable technologies such as fitness trackers, smartwatches, and biosensors allow continuous monitoring of patient data such as vital signs, physical activity, sleep patterns, and medication adherence. Remote monitoring systems improve data collection accuracy, provide real-time insights, and enable remote patient care, decreasing participant burden and enhancing data quality.

Healthcare quality and cost-effectiveness: Real-world expenses and resource use can be taken into account when using real-world evidence to assess the financial impact of interventions. Health policy, value-based healthcare, and reimbursement decisions can all be informed by these data.

Real-world evidence is a powerful tool for supporting clinicians and patients in making informed treatment decisions. It is crucial to note that RWE does not replace clinical trials, but it can frequently add to and complement them in order to advance clinical knowledge [50].

B. Integration of artificial intelligence and machine learning in data analysis
Machine learning (ML), a branch of AI, deals with the algorithms and statistical models that enable computers to learn from data and make predictions. AI is a broad term for machines that imitate human intelligence. In the context of emerging trends, researchers and healthcare professionals can extract relevant insights from large and complex datasets by integrating AI and ML technologies into data analysis. AI and machine learning algorithms are useful for identifying patterns, detecting abnormalities, and making predictions from vast quantities of data. These technologies can help with disease diagnosis, therapy optimization, drug development, and personalized medicine. Healthcare practitioners can improve patient outcomes, increase efficiency, and accelerate medical research by integrating AI and ML into data analysis. The use of AI and machine learning in data analysis provides a number of benefits, including the following:

Large-scale data processing: Large volumes of data, including electronic health records, genetic data, imaging data, and patient-reported outcomes, can be processed and analyzed by AI and ML algorithms. Researchers can discover correlations, trends, and insights that might otherwise go unnoticed with conventional analysis techniques.
Predictive modeling: Predictive models may be developed using AI and machine learning methods to help researchers detect possible risks, forecast patient outcomes, and improve trial design (a minimal sketch follows this list). These models can aid patient stratification, treatment response prediction, and endpoint selection, increasing the effectiveness and success rate of clinical trials.

Data quality assurance: AI and machine learning algorithms can help ensure data quality by finding and repairing errors, highlighting discrepancies, and validating data against established criteria.

Decision support: AI and ML help healthcare professionals make evidence-based decisions by providing real-time insights, treatment recommendations, and risk assessments based on a combination of clinical data, trial results, and medical literature.

Human skill and interpretation remain essential in clinical trial analyses, but AI and ML are effective tools for augmenting decision-making and improving efficiency [51].
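The sketch below, referenced in the list above, shows what a simple predictive model on trial-like data could look like: a logistic regression with cross-validation fitted to entirely synthetic data. The covariates, outcome, and performance figure are illustrative assumptions, not results from any study.

# Illustrative sketch of a predictive model on trial-like data, not a validated pipeline.
# Features, outcome, and data are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.normal(60, 10, n),      # age
    rng.normal(140, 15, n),     # baseline systolic blood pressure
    rng.integers(0, 2, n),      # prior therapy (0/1)
])
# Synthetic "responder" outcome loosely related to baseline blood pressure.
y = (X[:, 1] + rng.normal(0, 10, n) > 140).astype(int)

model = make_pipeline(StandardScaler(), LogisticRegression())
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(f"Cross-validated AUC: {scores.mean():.2f} +/- {scores.std():.2f}")

A model actually used to support trial decisions would additionally require prespecification, independent validation, and regulatory scrutiny.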
C. Cloud-based data management and analytics solutions
Cloud-based data management and analytics systems are gaining recognition in the healthcare industry due to their scalability, flexibility, and cost-effectiveness. These technologies enable healthcare organizations to securely store, manage, and analyze large quantities of data in the cloud, eliminating the need for on-premises infrastructure [52]. Cloud-based platforms provide increased accessibility, allowing researchers, physicians, and other stakeholders to collaborate and access data from any location at any time. Furthermore, cloud-based analytics solutions offer extensive data processing capabilities, enabling efficient analysis and visualization of healthcare data. This trend promotes accessibility and the development of data-driven healthcare applications and services by facilitating seamless data sharing.

Data security and privacy: Cloud providers typically offer advanced security measures to safeguard data from unauthorized access while adhering to industry norms and standards. Cloud systems provide security features such as encryption, access controls, regular backups, and disaster recovery protocols. Data privacy is also protected through stringent data protection procedures and compliance with privacy regulations. Cloud-based data management and analytics solutions improve collaboration and enable organizations to fully utilize their clinical trial data [53].

D. Blockchain technology for data security and integrity
Blockchain is the technology powering Bitcoin, serving as an open, distributed public ledger that records all Bitcoin transactions in a safe and verifiable manner, eliminating the need for a third party to process payments. Blockchain technology produces a permanent record of transactions or data entries: once a data block is added to the blockchain, it becomes almost impossible to change or tamper with. This feature ensures that clinical trial data, such as patient records, informed consent, and study findings, cannot be edited retroactively, enhancing data integrity and auditability [54]. Blockchain uses cryptographic techniques to encrypt data, adding an additional layer of protection. Sensitive patient data can be securely kept on the blockchain, maintaining privacy and limiting access to specific data elements to authorized parties. Smart contracts allow patients to retain control over their data and to express consent for its use. Blockchain technology enables investigators, auditors, and regulators to verify the origin, authenticity, and integrity of data by enhancing data traceability and transparency. Blockchain technology therefore holds promise for data security and integrity in clinical trials [55].
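The tamper evidence described above rests on hash chaining: each block stores a cryptographic hash of its contents together with the hash of the previous block, so altering any earlier record invalidates every later link. The toy Python sketch below illustrates the idea; it is a single-process demonstration with made-up records, not a distributed ledger or a production clinical system.

# Toy sketch of the hash-chaining idea behind blockchain-backed audit trails.
import hashlib
import json

def add_block(chain, record):
    # Link each new block to the hash of the previous block.
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    payload = json.dumps(record, sort_keys=True)
    block_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    chain.append({"record": record, "prev_hash": prev_hash, "hash": block_hash})

def verify(chain):
    # Recompute every hash; any retroactive edit breaks the chain.
    prev_hash = "0" * 64
    for block in chain:
        payload = json.dumps(block["record"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        if block["prev_hash"] != prev_hash or block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True

chain = []
add_block(chain, {"subject": "001", "event": "informed consent recorded"})
add_block(chain, {"subject": "001", "event": "visit 1 data entered"})
print(verify(chain))                       # True

chain[0]["record"]["event"] = "edited"     # simulate tampering with an earlier entry
print(verify(chain))                       # False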
CONCLUSION
Data management and analysis are essential in clinical trials, and several major findings and insights have emerged in this field. First, the use of electronic data capture (EDC) systems has improved data quality and integrity by reducing errors and accelerating data collection processes. Second, the introduction of standardized data formats has improved data exchange and sharing between trial sites and systems. In addition, the use of recent analytics techniques such as machine learning and artificial intelligence has made it easier to analyze huge quantities of information more accurately and efficiently, improving decision-making and supporting the identification of patient subgroups. Furthermore, the incorporation of real-world data from many sources, including electronic health records, has produced valuable insights into patient outcomes and treatment success. The development of robust data governance and compliance structures to safeguard patient confidentiality and adhere to regulatory requirements ensures that maintaining data privacy and security remains a top concern.

Effective data management and analysis in clinical trials have significant implications for the overall success and advancement of medical research:
1. Improved decision making.
2. Effective data management accelerates data collection, monitoring, and reporting processes, reducing manual errors and enhancing trial efficiency.
3. Data management systems allow real-time monitoring of patient safety and adverse events, enabling early intervention and risk minimization. This ensures participant well-being and compliance with ethical standards.
4. Effective data management supports post-trial analysis by helping researchers identify long-term safety profiles and assess outcomes after treatment. Furthermore, well-documented and well-organized data management systems ensure regulatory compliance, making regulatory submissions and approvals easier.
5. The accumulation of high-quality data through effective management and analysis aids in the creation of robust evidence for ongoing research projects and provides input for evidence-based medical practice, allowing continuous improvement of patient care.

Researchers and sponsors might consider the following suggestions to improve data management and analysis practices in clinical trials:
1. Standardize data collection to ensure consistency and accuracy across trial sites.
2. Develop data management plans.
3. Training on best practices for data management, such as data input, quality assurance, and data
protection, should be provided for the research staff. Ensure that suitable resources, such as data management software, statistical tools, and data analysis expertise, are available.
4. Implement data quality assurance measures, such as data validation checks, during data collection to minimize errors and maintain data quality.
5. Utilize advanced analytics techniques to take advantage of the potential of large and complex datasets.
6. Ensure data security and privacy to protect patient confidentiality and comply with data privacy regulations.
7. Conduct regular assessments and audits of data management and analysis practices to identify areas for improvement.
8. Collaborate with data management and analysis professionals, such as biostatisticians and data scientists, to optimize study design, analysis plans, and results interpretation.

In conclusion, effective data management and analysis in clinical trials are essential for informed decision-making, improved trial efficiency, personalized medicine, patient safety, collaboration, regulatory compliance, and advancing evidence-based medicine. Standardizing data collection, investing in training and resources, utilizing advanced analytics, ensuring data security, fostering data sharing, engaging data experts, embracing technology innovations, and regularly evaluating practices are key recommendations to enhance data management and analysis in clinical trials. By implementing these practices, researchers and sponsors can optimize trial outcomes, accelerate medical research, and improve patient care.

References:
[1] Krishnankutty B, Bellary S, Kumar NB, Moodahadu LS. Data management in clinical research: an overview. Indian Journal of Pharmacology. 2012 Mar;44(2):168.
[2] Kala NG, Shruti O, Aishwarya BM. Overview of clinical data management and statistical analysis of bioequivalence study. International Journal of Clinical Trials. 2021 Nov;8(4):323-329.
[3] Clinical Data Management: Top 5 Important Aspects, Trial Expert. https://credevo.com/articles/2021/08/15/clinical-data-management-top-5-important-aspects/
[4] What is an Electronic Data Capture System in clinical trials?, ClinVigilant. https://www.clinvigilant.com/blog/what-is-an-electronic-data-capture-edc-in-clinical-trials/
[5] Raviteja MN, Gupta NV. A review on electronic data management in pharmaceutical industry. Asian Journal of Pharmaceutical and Clinical Research. 2013;6(2):38-42.
[6] Welker JA. Implementation of electronic data capture systems: barriers and solutions. Contemporary Clinical Trials. 2007 May;28(3):329-36.
[7] Mudaranthakam DP, Krebill R, Singh RD, Price C, Thompson J, Gajewski B, Koestler D, Mayo MS. Case Study: Electronic Data Capture System Validation at an Academic Institution. Data Basics: a publication supported by and for the members of The Society for Clinical Data Management, Inc. 2019;25(2):16.
[8] Data Cleaning and Query Management Importance in EDC. https://www.biopharmaservices.com/blog/data-cleaning-and-query-management-importance-inedc/
[9] What Is CDISC and What Are CDISC Data Standards?, Allucent.
https://www.allucent.com/resources/blog/what-cdisc-and-what-are-cdisc-data-standards
[10] Gurugubelli VS, Fang H, Shikany JM, Balkus SV, Rumbut J, Ngo H, Wang H, Allison JJ, Steffen LM. A review of harmonization methods for studying dietary patterns. Smart Health. 2022 Mar;23:100263.
[11] Standards for data and interoperability, GitBook. https://open-data-institute.gitbook.io/data-governance-playbook/play-four-making-data-interoperable/standards-for-data-and-interoperability
[12] Clinical Trial Statistical Analysis: How to Minimize Noise, Cognivia, June 21, 2021. https://cognivia.com/clinical-trial-statistical-analysis-how-to-minimize-noise/
[13] Ramya Sriram, How to develop a statistical analysis plan (SAP) for clinical trials?, 08 February 2021.
https://www.kolabtree.com/blog/how-to-develop-a-statistical-analysis-plan-sap-for-clinical-trials/
[14] Gamble C, Krishan A, Stocken D, et al. Guidelines for the Content of Statistical Analysis Plans in Clinical Trials. JAMA. 2017;318(23):2337–2343.
[15] Nayak, Barun K, and Avijit Hazra. "How to choose the right statistical test?" Indian Journal of Ophthalmology vol. 59,2 (2011): 85-6.
[16] Prabhaker Mishra, Sabaretnam Mayilvaganan, Amit Agarwal, Statistical Methods in Endocrine Surgery Journal Club, World J Endoc Surg 2015;7(1):2123.
[17] Mishra, Prabhaker et al. "Selection of appropriate statistical methods for data analysis." Annals of Cardiac Anaesthesia vol. 22,3 (2019): 297-301.
[18] Objectives and Endpoints, Penn State Eberly College of Science. https://online.stat.psu.edu/stat509/book/export/html/653
[19] Endpoint, U.S. Food and Drug Administration (FDA) Patient-Focused Drug Development Glossary, NCI Metathesaurus. https://toolkit.ncats.nih.gov/glossary/endpoint
[20] James Yang, Clinical trial endpoint selection, memoinoncology. https://memoinoncology.com/clinical-trials/clinical-trial-endpoint-selection/
[21] Suresh, Kp, and S Chandrashekara. "Sample size estimation and power analysis for clinical research studies." Journal of Human Reproductive Sciences vol. 5,1 (2012): 7-13.
[22] Lerman, J. Study design in clinical research: sample size estimation and power analysis. Can J Anaesth 43, 184–191 (1996).
[23] Ayush Singh Rawat, An Overview of Descriptive Analysis, 31 March 2021. https://www.analyticssteps.com/blogs/overview-descriptive-analysis
[24] Barkan, Howard. "Statistics in clinical research: Important considerations." Annals of Cardiac Anaesthesia vol. 18,1 (2015): 74-82.
[25] Barkan, Howard. "Statistics in clinical research: Important considerations." Annals of Cardiac Anaesthesia vol. 18,1 (2015): 74-82.
[26] Hypothesis testing, Scalestatistics. https://www.scalestatistics.com/hypothesis-testing.html
[27] Richard Chin, Bruce Y. Lee, Introduction to Clinical Trial Statistics, in Principles and Practice of Clinical Trial Medicine, Academic Press, 2008, 43-60.
[28] Dr. Deng, On Biostatistics and Clinical Trials: Time to Event end points. http://onbiostatistics.blogspot.com/2015/09/time-to-event-end-points.html?m=1
[29] Time-To-Event Data Analysis, Columbia University Mailman School of Public Health. https://www.publichealth.columbia.edu/research/population-health-methods/time-event-data-analysis
[30] Jennifer Le-Rademacher, Xiaofei Wang, Time-To-Event Data: An Overview and Analysis Considerations, Journal of Thoracic Oncology, Volume 16, Issue 7, 2021, Pages 1067-1074, ISSN 1556-0864.
[31] Singh, Ritesh, and Keshab Mukhopadhyay. "Survival analysis in clinical trials: Basics and must know areas." Perspectives in Clinical Research vol. 2,4 (2011): 145-8.
[32] Frank E. Harrell, Jr. and others, Regression Models in Clinical Studies: Determining Relationships Between Predictors and Response, JNCI: Journal of the National Cancer Institute, Volume 80, Issue 15, 5 October 1988, Pages 1198–1202.
[33] Prashanth Sharma, Different Types of Regression Models, 19 January 2022. https://www.analyticsvidhya.com/blog/2022/01/different-types-of-regression-models/
[34] Curt Meinert, Subgroup analysis in trials, 11 March 2001.
[35] Parpia, S., Morris, T.P., Phillips, M.R. et al.
Sensitivity analysis in clinical trials: three criteria for a valid sensitivity analysis. Eye 36, 2073–2074 (2022).
[36] Martijn W. Heymans, Jos W.R. Twisk, Handling missing data in clinical research, Journal of Clinical Epidemiology, Volume 151, 2022, 185-188, ISSN 0895-4356.
[37] Fanaroff, Alexander C et al. "Methods for safety and endpoint ascertainment: identification of adverse events through scrutiny of negatively adjudicated events." Trials vol. 21,1: 323. 9 Apr. 2020.
[38] Sartor, Oliver, and Susan Halabi. "Independent data monitoring committees: an update and
overview." Urologic Oncology vol. 33,3 (2015): 143-8.
[39] Gassman JJ, Owen WW, Kuntz TE, Martin JP, Amoroso WP. Data quality assurance, monitoring, and reporting. Control Clin Trials. 1995 Apr;16(2 Suppl):104S-136S.
[40] Van den Broeck J, Cunningham SA, Eeckels R, Herbst K. Data cleaning: detecting, diagnosing, and editing data abnormalities. PLoS Med. 2005 Oct;2(10):e267.
[41] Winkler WE. Problems with inliers. Washington (DC): Census Bureau. Research Reports Series RR98. 1998.
[42] Altman DG. Practical statistics for medical research. CRC Press; 1990 Nov 22.
[43] Gardner MJ, Altman DG. Statistics with confidence: confidence intervals and statistical guidelines. British Medical Journal, 1995.
[44] Andrew J. Fletcher, Lionel D. Edwards, Anthony W. Fox, Peter D. Stonier, "Principles and Practice of Pharmaceutical Medicine" (Wiley and Sons, Ltd., 2nd ed., 2007), pp. 88-91.
[45] Society for Clinical Data Management, "Good clinical data management practices" (Society for Clinical Data Management, version 3.0, 2003).
[46] U.S. Department of Health and Human Services, FDA, CDER, CBER, "Guideline for industry: E6 Good Clinical Practice: Consolidated Guidance" (Geneva: International Conference on Harmonization, 1996), accessed January 2009.
[47] Krishnankutty, Binny et al. "Data management in clinical research: An overview." Indian Journal of Pharmacology, vol. 44,2 (2012): 168-72.
[48] Chodankar, Deepa. "Introduction to real-world evidence studies." Perspectives in Clinical Research vol. 12,3 (2021): 171-174.
[49] Real-World Evidence, U.S. Food & Drug Administration. https://www.fda.gov/science-research/science-and-research-special-topics/real-world-evidence
[50] What is real-world evidence in clinical trials?, VIAL. https://vial.com/blog/articles/what-is-real-world-evidence-in-clinical-trials/
[51] Shah, P., Kendall, F., Khozin, S. et al. Artificial intelligence and machine learning in clinical development: a translational perspective. npj Digit. Med. 2, 69 (2019).
[52] Julie Clements, How cloud data management solutions benefit clinical research industry?, 02 January 2018.
[53] Luthria, Gaurav, and Qingbo Wang. "Implementing a Cloud Based Method for Protected Clinical Trial Data Sharing." Pacific Symposium on Biocomputing vol. 25 (2020): 647-658.
[54] de-Melo-Diogo, Marta & Tavares, Jorge & Luís, Ângelo. (2021). Data Security in Clinical Trials Using Blockchain Technology. 10.4018/978-1-7998-7363-1.ch010.
[55] Benchoufi, M., Ravaud, P. Blockchain technology for improving clinical research quality. Trials 18, 335 (2017).