Journal for Clinical Studies, Volume 8 Issue 5
Technology
Introduction
Every clinical trial is a source of multidimensional data,
analysed in order to answer questions presented in
hypotheses on safety, efficacy and other topics. For the
analysis to be reliable and successful, the recorded data
must be of sufficient quality, i.e. complete, correct and
integral. Keeping invalid or incomplete data in a database
may cause incorrect calculation results, leading to invalid
conclusions and wrong decisions. It is not only a matter
of potential consequences for the sponsor but also
ethics. As there are living humans behind the numbers
generated, this issue must not be taken lightly.
Thus, the process of data validation becomes a key
aspect of every trial. Although the process of checking
and cleaning data is usually performed by the data
management team, a close cooperation with biostatistics
may significantly improve the results by introducing both
statistical knowledge and the ability to create specialised,
programmatic tools and advanced queries giving a good
foundation for deeper and faster data investigations.
Reasons and Types of Invalid Data
Invalid data is usually caused by human error.
EDC forms containing fields insufficiently protected by
edit checks increase the chance of errors. Obviously,
prevention is better than cure, and EDC
forms should always be made resistant to errors. Reality,
however, often involves compromises. Text fields allowing
free text to be entered are a good example of that.
Sometimes one has to deal with forms that are already
poorly designed. Things get even worse if the EDC software does not
prevent the entry of incorrect values, but merely displays
alerts to the user. This is not uncommon.
Invalid Results
Results of laboratory examinations are a good
example of what can go wrong. Possible issues include typos,
invalid decimal separators, textual results mixed with numerical ones,
results mixed with manual comments or messages
generated by the system or machine (“sample hemolysis”,
“below assay range”), units entered in many forms,
incorrectly assigned units (e.g. “G/L” confused with
“g/L”), missing lower or upper limits of reference ranges,
switched lower and upper limits of reference ranges,
reference ranges assigned to the wrong gender or age
group, incorrectly assigned flags (high, low,
abnormal), and dates and times entered in a wrong format,
to name just a few. Even data transferred automatically
from a laboratory into EDC software,
through transfer files or a programmatic API, can be
invalid due to technical issues.
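Several of the issues listed above can be caught mechanically. The following is a minimal sketch in Python (chosen here for illustration only; the function names and messages are hypothetical and not taken from any particular EDC system). It separates numeric results from textual ones, accepts either decimal separator, and flags missing or switched reference-range limits.

```python
import re
from typing import Optional, Tuple

# A plain number using either "." or "," as the decimal separator.
_NUMBER = re.compile(r"^[+-]?\d+([.,]\d+)?$")


def parse_result(raw: str) -> Tuple[Optional[float], Optional[str]]:
    """Return (value, issue); exactly one of the two is None."""
    text = raw.strip()
    if not text:
        return None, "empty result"
    if not _NUMBER.match(text):
        # Textual results and machine messages end up here for manual review.
        return None, f"non-numeric result: {text!r}"
    return float(text.replace(",", ".")), None


def check_reference_range(lower: Optional[float],
                          upper: Optional[float]) -> Optional[str]:
    """Flag missing or switched reference-range limits; None means no issue."""
    if lower is None or upper is None:
        return "missing reference-range limit"
    if lower > upper:
        return "switched reference-range limits"
    return None
```

A value such as "4,5" is read as 4.5, while "sample hemolysis" is returned as an issue rather than silently dropped, so it can be queried with the site.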
Multiple units make results incomparable; without
unification, the results cannot simply be included
in the analysis. A simple group-by analysis enumerates all
the entered units and helps to prepare a list of conversion
factors. It is generally a good idea to make all units SI
compliant.
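The group-by step and the conversion list can be sketched as follows. This is an illustration in Python under stated assumptions: the records are invented, the target unit (10^9/L for a cell count) is chosen for the example, and units without a known factor are deliberately left unresolved rather than guessed.

```python
from collections import Counter

# Hypothetical white blood cell counts as (subject, value, unit).
RECORDS = [
    ("S01", 6.2, "10^9/L"),
    ("S02", 7100.0, "/uL"),
    ("S03", 5.4, "G/L"),
    ("S04", 6.8, "10^3/uL"),
    ("S05", 48.0, "%"),
]

# Multipliers that convert each unit to the target unit 10^9/L.
# "%" has no entry on purpose: it cannot be converted without further data.
TO_TARGET = {"10^9/L": 1.0, "G/L": 1.0, "10^3/uL": 1.0, "/uL": 0.001}


def count_units(records):
    """Group-by step: how many results were entered in each unit."""
    return Counter(unit for _, _, unit in records)


def unify(records, factors):
    """Convert what can be converted; return the rest for manual review."""
    converted, unresolved = [], []
    for subject, value, unit in records:
        if unit in factors:
            converted.append((subject, value * factors[unit]))
        else:
            unresolved.append((subject, value, unit))
    return converted, unresolved
```

Calling `count_units(RECORDS)` lists every unit that was entered, which is the starting point for the conversion table; `unify(RECORDS, TO_TARGET)` then returns the comparable values together with the records that still need a decision.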
Missing Observations
While the bad impact of invalid data is obvious, probably
not everyone realises that missing data may affect
statistical computations to no lesser degree. Things get
worse if the data are not missing at random, but rather
follow a pattern. A lower sample size may increase
dispersion in data, affecting values of descriptive statistics
and estimation of errors. Statistical tests lose their power.
Bias in parameter estimation may be introduced as well.
The design of a trial may become unbalanced, which often
leads to confounding. Missing observations may
distort distribution shapes. Assumptions of statistical
methods may be violated, which makes statistical
inference unreliable. Missing classes of observations may
make the analysis impossible to perform or interpret.
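A first, informal look at whether missingness follows a pattern is to compare the share of missing values across groups such as sites or visits. The sketch below is in Python for illustration; the field names are hypothetical, and it is a screening aid rather than a formal test of the missing-data mechanism.

```python
from collections import defaultdict


def missing_rate_by_group(rows, group_key, field):
    """Share of rows with a missing `field`, per value of `group_key`."""
    total = defaultdict(int)
    missing = defaultdict(int)
    for row in rows:
        group = row[group_key]
        total[group] += 1
        if row.get(field) is None:
            missing[group] += 1
    return {group: missing[group] / total[group] for group in total}


# Hypothetical haemoglobin results from two sites.
ROWS = [
    {"site": 1, "hb": 13.9}, {"site": 1, "hb": 14.2},
    {"site": 1, "hb": 12.8}, {"site": 1, "hb": 15.0},
    {"site": 2, "hb": None}, {"site": 2, "hb": 13.1},
    {"site": 2, "hb": None}, {"site": 2, "hb": None},
]
```

A large difference between groups, as between the two sites in this invented example, suggests that the data are not missing at random and deserves a closer look.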
Advanced imputation techniques are commonly in
use; however, they are still only an attempt to fight the
fire. One should not forget that they introduce artificial
data, even if a statistical model says the values are plausible.
Moreover, misguided data imputation may completely distort
the picture of a situation.
Suspicious Observations
Suspicious observations form the next category of issues
that significantly lower the quality of data. An observation
can be considered suspicious for many reasons. Its value
may be too high or too low, acting as an outlier and
significantly affecting the results of an analysis or causing
the analysis to fail entirely. Such values may be
typical of a specific disease (ESR, AlAT) or may indicate a
human mistake, so they should be investigated carefully.
But even values that look normal and lie inside a
normal range may reveal worrying patterns, indicating
a potentially artificial nature of the entered data and
possibly fraud. Investigations entailed by this class of
problems are particularly challenging and subtle.
It is not easy to cope with these problems in a
transparent and formalised world of clinical trials when
they happen. Suspicious observations, rich in outliers,
can really damage calculations, distort results and lead
to wrong conclusions. Even if a solution in the form of a
robust statistical method exists, it is challenging to apply,
due to the fact that hypotheses are usually stated a priori
along with a corresponding and closed set of statistical
methods that will be used.
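During data review, before any pre-specified analysis is run, candidate outliers can be flagged with a robust rule. The sketch below, in Python for illustration, uses the modified z-score based on the median and the median absolute deviation (MAD) with the commonly used cut-off of 3.5. Both the rule and the cut-off are assumptions of this example, not a method prescribed by the article, and a flagged value is only a prompt for investigation, since it may be typical of the disease under study.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Indices of values whose modified z-score exceeds `threshold`."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half of the values equal the median; the score is undefined.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical ESR readings (mm/h); the last one is far from the rest.
ESR = [12, 14, 13, 15, 11, 13, 95]
```

Because the median and the MAD are hardly affected by the extreme value itself, the rule still singles out the last reading, whereas a mean and standard deviation would be pulled towards it.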
Close Cooperation Between Data Management and
Biostatistics Benefits Data Quality
Fraud and Misconduct
Fraud and misconduct, caused intentionally or by
insufficient training, can result in damages which are
often impossible to fix and are very expensive in the end.
One would say that it is far better to have missing rather
than incorrect data. Inappropriate IMP management,
handling or administration procedures, including
accidental switching of drug, placebo or comparator as
well as incorrect examination techniques applied can
damage the data in an unrecoverable manner, because
what is done cannot be undone. The sooner the problem
is detected and eliminated, the better, all the more
because it often requires a long and difficult
investigation to collect all the evidence.
Solutions
After the statistical analysis plan and protocol are prepared
and signed, one cannot simply alter things, especially the
set of statistical methods and procedures, without being
accused of manipulation. This clearly shows how
extremely important it is to ensure data completeness
and correctness long before the database is finally locked
and the analysis starts. Data validation and correction
cannot be completed immediately: the process involves
a lot of additional communication and consumes time and
resources, so postponing it until shortly before
the lock is very risky.
At KCR we make every effort to minimise the risk of
having to deal with invalid and incomplete data later on, and of
allowing poorly trained staff to perform trial tasks. For this purpose
we have introduced close cooperation between data
management and biostatistics. While data management
personnel are typically responsible for preparing well-
designed, CDISC-compliant EDC forms and performing
periodic data reviews, the biostatistics department
provides both statistical support and programmatic tools
for advanced data checking and transformation.
The following kinds of support are currently applied
at KCR: preparation-stage analysis; assisted data
validation; creating tools for unassisted data validation;
writing screening programs for unsolicited, ad-hoc data
review; providing solutions for automated scour analysis;
programming solutions for data exchange between
information systems, and last but not least – training and
mentoring.
Preparation-stage Analysis
Every trial starts with a set of common preliminary steps
that have a critical impact on the data quality. One of
the most important prerequisites is to properly design the
EDC forms. The key thing is to ensure their compliance with
the CDISC CDASH specification. The second step is to secure
input fields with appropriate edit checks to prevent the
user from entering nonsense data. In addition, text inputs
should be encoded with dictionaries whenever possible.
This refers not only to fields intended to be medically
encoded (MedDRA, LOINC, ICD, etc.) but to any field
whose content can be organised in a dictionary
to avoid multiple names for a single thing. For encoded
fields, the option allowing users to enter their own text
should be avoided if possible, as it defeats the purpose of encoding.
All these actions are mostly performed by data
management; however, the programming skills offered by
biostatistics are an excellent opportunity to improve the
process by preparing scripts that query the database
in search of missing rules and checks, and of
violations of certain naming conventions.
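Such a script can be very short. The sketch below is in Python with an in-memory SQLite database, purely for illustration: the two tables, their columns and the upper-case naming rule are hypothetical, since the layout of form metadata depends entirely on the EDC software in use.

```python
import re
import sqlite3

# Hypothetical metadata: one table of form fields, one of edit checks.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE form_field (name TEXT PRIMARY KEY, data_type TEXT);
    CREATE TABLE edit_check (field_name TEXT, rule TEXT);
    INSERT INTO form_field VALUES
        ('LBORRES', 'text'), ('LBDTC', 'date'), ('labUnit', 'text');
    INSERT INTO edit_check VALUES ('LBDTC', 'not in the future');
""")


def fields_without_checks(conn):
    """Fields that have no edit check attached at all."""
    rows = conn.execute("""
        SELECT f.name
        FROM form_field AS f
        LEFT JOIN edit_check AS c ON c.field_name = f.name
        WHERE c.field_name IS NULL
        ORDER BY f.name
    """)
    return [name for (name,) in rows]


def naming_violations(conn, pattern=r"^[A-Z][A-Z0-9]*$"):
    """Field names that do not match the assumed naming convention."""
    rows = conn.execute("SELECT name FROM form_field ORDER BY name")
    return [name for (name,) in rows if not re.match(pattern, name)]
```

Run against real form metadata before the first subject is enrolled, lists like these give data management a concrete set of fields to secure.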
Assisted Data Validation
This kind of support covers analyses done on
request and usually together with personnel
from other departments, like clinicians,
administrators and managers. It is mainly
used for deeper investigations which cover
various aspects of a trial and involve much
more advanced methods than usual.
Various statistical methods are in use, for
example:
•	 an extended set of descriptive
statistics, including robust, both
classic and positional measures
•	 graphical analyses using various
combinations of scatterplots,
boxplots, mosaic plots, histograms
and various types of density plots,
as well as custom plots revealing
specific patterns in data
[Chart 1: An exemplary diagram revealing typical issues found in laboratory data:
missing values, incomplete and missing reference ranges, incorrect units assigned.
Axes: result index (0 to 100) against log-scaled result [x10^9/L]. Legend entries:
entered unit (%, 10^3, 10^9/L, 1000/uL, G/L, x10^3/ul, x10^6/uL), lower and upper
reference-range limits, and missing reference-range end (lower, upper or both).]
•	 analysis of possible outliers done both graphically
and mathematically
•	 analysis of suspicious data by looking for patterns
in coexisting values in view of surrounding
circumstances, involving graphical and mathematical
methods, like decision trees
•	 analysis of randomness in data samples
•	 analysis of patterns in missing data by using
specialised graphs
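As one example of the randomness analysis listed above, the Wald-Wolfowitz runs test checks whether values above and below the median alternate too regularly or cluster too much. The sketch below is in Python for illustration and uses the normal approximation, which is only reasonable for samples that are not very small; the choice of this particular test is an assumption of the example.

```python
from math import sqrt
from statistics import median


def runs_test_z(values):
    """Z statistic of the runs test around the median (normal approximation)."""
    med = median(values)
    signs = [v > med for v in values if v != med]  # ties with the median are dropped
    n1 = sum(signs)            # observations above the median
    n2 = len(signs) - n1       # observations below the median
    if n1 == 0 or n2 == 0:
        raise ValueError("need values on both sides of the median")
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    n = n1 + n2
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    if variance == 0:
        raise ValueError("sample too small for the normal approximation")
    return (runs - expected) / sqrt(variance)
```

A strongly positive statistic means the series switches sides more often than chance would suggest, and a strongly negative one means the values arrive in blocks; either can be worth discussing with the clinical team.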
We have found that graphical methods are especially
useful in communication with clinicians and managers.
Well-designed graphics immediately reveal patterns
and allow the user to grasp a lot of information at once. They
work very well when searching for patterns in missing
data, investigating possible fraud and reviewing
laboratory data.
A good example of such activity is a process of
reviewing results of laboratory tests expressed in various
units. By applying a set of conversion factors between
units, it is possible to unify all values and show them on
a common chart along with reference ranges and other
information. This shows immediately which units were
chosen and if they are valid, whether observations have
incorrect values or if a corresponding reference range
(or one of its ends) is missing. This message is easy to
understand and reduces the need to get through long
tables of numbers.
Assisted Data Validation – Fraud and Misconduct
The detection of potential fraud and misconduct involves
both graphical and statistical methods. At the first stage,
the biostatistics team tries to picture the situation with
simple plots, which are then discussed in a team of
clinicians, managers and other specialists. All doubtful
patterns are examined by statisticians using various simple
and advanced, multidimensional methods. In the end,
the statisticians present findings and recommendations
for decision-making. Such an investigation can reveal
intentional, harmful activity as well as certain
weaknesses of procedures and deficits in training.
Abnormally low or high dispersion in data,
relationships between means and dispersions, highly
skewed distributions (when not expected), departures
from shapes of distribution characterised in a protocol,
unexpected patterns in data like “steps” and “clusters”,
strange relationships between variables, unexpected
patterns in missing data, periodicity in occurrences of
specific issues and many other things can be detected
by well-trained biostatisticians and revealed before
clinicians and managers.
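One of the signals mentioned above, abnormally low dispersion, can be screened for by comparing each site's standard deviation with what is typical across sites. The sketch below is in Python for illustration; the data are invented and the 0.25 ratio is an arbitrary assumption that would need to be justified for a real trial.

```python
from collections import defaultdict
from statistics import median, stdev


def low_dispersion_sites(measurements, ratio=0.25):
    """Sites whose standard deviation is below `ratio` times the median site SD.

    `measurements` is an iterable of (site, value) pairs; sites with fewer
    than two values are skipped because their SD cannot be computed.
    """
    by_site = defaultdict(list)
    for site, value in measurements:
        by_site[site].append(value)
    sds = {site: stdev(vals) for site, vals in by_site.items() if len(vals) >= 2}
    typical = median(sds.values())
    return sorted(site for site, sd in sds.items() if sd < ratio * typical)


# Hypothetical systolic blood pressure readings (mmHg) from three sites.
READINGS = (
    [("A", v) for v in (120, 135, 110, 142, 128)]
    + [("B", v) for v in (118, 140, 125, 131, 109)]
    + [("C", v) for v in (120, 121, 120, 121, 120)]
)
```

A flagged site is not evidence of fraud in itself; it marks where the plots and the discussion with clinicians should start.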
Creation of Tools for Unassisted, Repeatable Data
Validation
The key to success is to perform the data checking as
often as possible. Daily checking is not unusual. On
the other hand, it may become a very time-consuming
process, and frequently involving the biostatistics team
in running the required analyses does not seem to be the
best option. Since many valuable analyses do not
require any statistical advice, we were able to develop a
reporting tool that can be used by the data management
staff alone.
The first step is to create a list of required analyses,
where items are prioritised and grouped by predefined
categories. For each report, a set of parameters and
their default values are determined as well. The next
step refers to technical matters, like the selection of the
technology to be used, choice of a method of accessing
the database, description of a user authorisation process,
shape of a graphical user interface, selection of the
desired output formats, etc. Since long-lasting analyses
slow down the database, its content should be replicated
to another instance or exported to an intermediate
file (XML, CSV, etc.) before the analysis. In order to
save money, the chosen technology should allow the
utilisation of already existing resources, i.e. hardware,
software, statistical programmers and administrators. For
example, if R programmers are already on board, R
should be considered first as the default development
platform, ahead of other technologies (.NET, Java,
PHP, etc.) which would require the hiring of additional
programmers.
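The first step described above, a prioritised and categorised list of reports with parameters and default values, maps naturally onto a small catalogue structure. The sketch below is in Python for illustration; the class, the report names and the parameters are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ReportSpec:
    name: str
    category: str
    priority: int  # lower number = higher priority
    defaults: Dict[str, Any] = field(default_factory=dict)

    def resolve(self, **overrides: Any) -> Dict[str, Any]:
        """Merge user-supplied values over the defaults, rejecting unknown names."""
        unknown = sorted(set(overrides) - set(self.defaults))
        if unknown:
            raise KeyError(f"unknown parameter(s): {unknown}")
        return {**self.defaults, **overrides}


CATALOGUE = [
    ReportSpec("Missing visits", "completeness", 1, {"site": None}),
    ReportSpec("Unit discrepancies", "laboratory", 2, {"test": "ALL"}),
    ReportSpec("Empty mandatory fields", "completeness", 2,
               {"site": None, "form": "ALL"}),
]


def by_category(catalogue: List[ReportSpec]) -> Dict[str, List[str]]:
    """Group report names by category, ordered by priority within each group."""
    grouped: Dict[str, List[str]] = {}
    for spec in sorted(catalogue, key=lambda s: s.priority):
        grouped.setdefault(spec.category, []).append(spec.name)
    return grouped
```

Keeping the defaults next to each report means a data manager can run everything without arguments and override a single parameter, such as the site, only when needed.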
We decided to create the tool as a self-contained,
windows-based application hosted entirely by R.
GNU R is a well-known, powerful, acclaimed
and free statistical package, as well as a high-level
programming language. It is a strong SAS competitor,
used worldwide by millions of users, huge corporations
and organisations, including the FDA. R is an open-source
project, developed by the R Core Team, and supported
by the R Consortium, which consists of companies like
Microsoft, Oracle, IBM and Google.
The contents of the R library address practically every
topic in biostatistics, including clinical research. R is
capable of reading data and producing output in various
formats, including SAS datasets, Microsoft Office and
PDF documents. Several features make R a good candidate for
a reliable programmatic environment: extensive support for
querying numerous kinds of data sources (also via SQL),
implementation of the reproducible research paradigm, three
advanced charting systems, the ability to host embedded user
interfaces and web applications, full portability, understood
as the ability to run without installation on almost every
operating system, and a huge, dynamic community of users.
The created tool is capable of running a wide range of
laboratory data reconciliation as well as trial-specific
analyses. The implemented set of analyses allows for
detection of: missing visits, empty mandatory fields,
inconsistencies in certain data domains, various kinds
of misconduct, discrepancies between the database and
specification in units, normal ranges and flags, missing
laboratory examinations, departures from a schedule
described in the protocol and invalid results, to name
only a few. It has proven its usefulness in everyday
practice. Now it takes only a few minutes for the full set of
analyses and just a few seconds for a single report, when
previously it took long hours to create a corresponding
Excel report manually. By using the tool we were able to
detect serious issues and take corrective action before
they escalated.
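Two of the checks listed above, missing visits and departures from the protocol schedule, can be sketched as follows. The example is in Python for illustration; the visit names, target days and windows are hypothetical, and the sketch does not distinguish visits that are missing from visits that are simply not yet due.

```python
from datetime import date

# Hypothetical schedule: visit -> (target day after baseline, allowed +/- window).
SCHEDULE = {"Day 1": (0, 0), "Day 2": (1, 0), "Week 4": (28, 3)}


def check_visits(baseline, visits, schedule=SCHEDULE):
    """Compare one subject's visit dates with the schedule.

    `visits` maps visit names to dates; returns (visit, finding) pairs.
    """
    findings = []
    for name, (target, window) in schedule.items():
        actual = visits.get(name)
        if actual is None:
            findings.append((name, "MISSING"))
            continue
        deviation = (actual - baseline).days - target
        if abs(deviation) > window:
            findings.append((name, f"off schedule by {deviation:+d} days"))
    return findings
```

For a subject with a baseline of 1 March 2016, no Day 2 visit and a Week 4 visit on 4 April 2016, the function reports Day 2 as missing and Week 4 as six days late.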
Screening Ad Hoc Analyses
The process of writing programs for the final statistical
report is a perfect moment for assessing the quality of
the collected data long before it is analysed. We call the
programs written for this purpose “screening programs” and
use them to check if the data
is clean enough to perform a certain part of the analysis.
Screening analyses are valuable due to the nature
of their creation: while writing the statistical analysis
program, the statistician explores the data extensively by
writing a number of queries and checking the content
of a database in many ways. This often results in useful
queries, which otherwise might never have been requested.
Thanks to the implementation of the reproducible research
paradigm available in R, it is possible to embed
these analyses directly into the main statistical analysis
program.
Automated Scour Analysis
This is an automated extension of the screening data
validation that works in the background and has more of an
“alerting” nature. A program scours the database content
periodically in search of specific issues and reports findings
via email or stores them in an HTML log. The time needed
to complete such an analysis is of low importance, there is
no direct, intended interaction between ordinary users and
the system, and R is not resource-hungry and runs on a wide
range of hardware architectures. Together, these facts make it
possible to implement the tool on simple single-board computers
like the Raspberry Pi. This eliminates the need to buy a new
machine or install new software on an existing, stable server.
An additional small (3.7”) breadboard with an LCD touchscreen
enables limited interaction with the script.
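The core of such a background scan is a loop over a set of named checks followed by a log writer. The sketch below is in Python for illustration; the two checks and the row layout are hypothetical, and scheduling (for example via cron) and email delivery are left out.

```python
import html
from datetime import datetime


def missing_result(row):
    return "result is missing" if row.get("result") is None else None


def limits_switched(row):
    low, high = row.get("low"), row.get("high")
    if low is not None and high is not None and low > high:
        return "reference limits switched"
    return None


CHECKS = {"missing result": missing_result, "switched limits": limits_switched}


def run_checks(rows, checks=CHECKS):
    """Apply every check to every row; return (check, row id, message) tuples."""
    findings = []
    for name, check in checks.items():
        for row in rows:
            message = check(row)
            if message:
                findings.append((name, row["id"], message))
    return findings


def render_html_log(findings, when):
    """Render the findings of one scan as an HTML fragment for the log."""
    stamp = when.isoformat(timespec="seconds")
    items = "\n".join(
        f"<li>{html.escape(str(row_id))}: {html.escape(message)}</li>"
        for _, row_id, message in findings
    )
    return f"<h2>Scan of {stamp}</h2>\n<ul>\n{items}\n</ul>\n"
```

Each scan appends one fragment to the log, so the history of findings stays readable in any browser without extra software on the device.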
[Scheme 1: An overall architecture of a typical reporting system. The EDC
software is accessed either directly, via database interfaces (ODBC/JDBC), or
via export (CSV, XML). Scripts use SQL queries, templates and dictionaries,
are operated through a user interface with a simple data inspector, and produce
HTML and PDF output. The sample output lists laboratory tests (RBC, WBC, ESR,
Hb, B-HCG) by site ID and subject ID, each marked OK, MISSING or N/A for the
Screening, Day 1 and Day 2 visits.]
Data Converters
A data converter is a kind of program which transforms
data from one form to another. Its sole task is to
eliminate the human factor from the process of data
transformation as much as possible.
Transferring results of clinical examinations from an
external laboratory into an EDC database, followed by
additional data integrity checks, is a good example
of such a process. At KCR we build data converters
every time the format of received data needs
to be adjusted. As before, the R statistical package is
used for that purpose, which significantly facilitates
complicated operations on data spread over multiple,
heterogeneous sources. Advanced querying capabilities,
together with the availability of interfaces to numerous
database engines, make the process of transferring data
much simpler than in traditional, general-purpose
programming languages; it can often be done in very few
lines of code.
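A converter of this kind can be sketched as follows. The example is in Python for illustration; the semicolon-delimited input layout, the source column names and the target names in `COLUMN_MAP` are all assumptions made for the example. Rows that fail the integrity check are rejected with a line number instead of being passed on.

```python
import csv
import io

# Hypothetical mapping from laboratory file columns to target field names.
COLUMN_MAP = {
    "PatientNo": "SUBJID",
    "Test": "LBTESTCD",
    "Value": "LBORRES",
    "Unit": "LBORRESU",
}


def convert(source_text):
    """Convert a semicolon-delimited lab file; return (converted, rejected)."""
    reader = csv.DictReader(io.StringIO(source_text), delimiter=";")
    converted, rejected = [], []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        missing = [col for col in COLUMN_MAP if not (row.get(col) or "").strip()]
        if missing:
            rejected.append((line_no, f"empty or absent: {', '.join(missing)}"))
            continue
        out = {target: row[src].strip() for src, target in COLUMN_MAP.items()}
        out["LBORRES"] = out["LBORRES"].replace(",", ".")  # unify the decimal separator
        converted.append(out)
    return converted, rejected
```

The rejected list doubles as a query list for the laboratory, so nothing is corrected by hand along the way.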
Training and Mentoring
Sharing knowledge about issues that can affect
data, as well as emphasising their impact on the
analysis results, is no less important than the analytical
support itself. If people understand why certain matters
are so important, they are more cooperative and follow
the rules more willingly. In order to raise
general awareness of these matters, we decided to
organise a series of courses for non-statisticians. The
audience has demonstrated high interest, which confirms
that our efforts and direction were right.
Summary
Data validation is a process of great importance, having
significant implications for the reliability of the final
data analysis. There are many possible sources of issues,
which makes it really difficult to identify them all and
react quickly enough. From the early stages of a trial to
its very end, at every turn, this is where the programmatic
and statistical support provided by the biostatistics team
comes to the rescue. At KCR, both departments closely
cooperate with each other and have been organised in a
common biometrics unit in order to facilitate the flow of
information.
References
1. Oracle Corporation, “Scaling R to the Enterprise. Using
R for Enterprise-level Performance, Scalability, Ease
of Production Deployment, and Security”, An Oracle
White Paper, July 2016,
http://www.oracle.com/technetwork/database/options/advanced-analytics/r-enterprise/bringing-r-to-the-enterprise-1956618.pdf
2. Olszewski Adrian, “Is R suitable enough for
biostatisticians involved in clinical research and
evidence-based medicine?”, June 15th 2015,
http://r-clinical-research.com
3. Smith David, Microsoft Corporation (formerly
Revolution Analytics), “FDA: R OK for drug trials”,
June 21st 2012,
http://blog.revolutionanalytics.com/2012/06/fda-r-ok.html
4. Smith David, Microsoft Corporation (formerly
Revolution Analytics), “Companies using R in 2014”,
May 23rd 2014,
http://blog.revolutionanalytics.com/2014/05/companies-using-r-in-2014.html
Adrian Olszewski is a Biostatistician in the
Biometrics & Clinical Trial Data Execution
Systems Department at KCR, a contract
research organisation (CRO). Adrian is
involved in delivering informatics and
analytical solutions for medicine, pharmacy
and clinical laboratory diagnostics. He has a
profound knowledge of statistics in the field of evidence-based
medicine, especially in clinical research. Adrian is
responsible for providing comprehensive support for trials
from the early design considerations through the data
analysis, including interim evaluations, to the final
report. Adrian is also involved in various external projects
on broadly understood data analysis and applications of
the R statistical package. Mr Olszewski holds a Master of
Science (MSc) degree in Computer Science.
Email: info@kcrcro.com
Data Management and Analysis in Clinical Trials
ijtsrd
 
thegrowingimportanceofdatacleaning-211202141902.pptx
thegrowingimportanceofdatacleaning-211202141902.pptxthegrowingimportanceofdatacleaning-211202141902.pptx
thegrowingimportanceofdatacleaning-211202141902.pptx
YashaswiniSrinivasan1
 

Journal for Clinical Studies: Close Cooperation Between Data Management and Biostatistics Benefits Data Quality

Journal for Clinical Studies, Volume 8 Issue 5
Technology

Introduction

Every clinical trial is a source of multidimensional data, analysed in order to answer questions presented in hypotheses on safety, efficacy and other topics. For the analysis to be reliable and successful, the recorded data must be of sufficient quality, i.e. complete, correct and integral. Keeping invalid or incomplete data in a database may cause incorrect calculation results, leading to invalid conclusions and wrong decisions. This is not only a matter of potential consequences for the sponsor but also of ethics. As there are living humans behind the numbers generated, this issue must not be taken lightly.

Thus, the process of data validation becomes a key aspect of every trial. Although checking and cleaning data is usually performed by the data management team, close cooperation with biostatistics may significantly improve the results by introducing both statistical knowledge and the ability to create specialised, programmatic tools and advanced queries, giving a good foundation for deeper and faster data investigations.

Reasons and Types of Invalid Data

Invalid data is usually caused by human error. EDC forms containing fields insufficiently protected by edit checks increase the chance of errors. Obviously, prevention is always better than cure, and EDC forms should always be made resistant to errors. Reality, however, often involves compromises. Text fields allowing free text to be entered are a good example of that. Sometimes one has to deal with forms that are already poorly designed. Things get even worse if the EDC software does not prevent the entry of incorrect values, but rather only displays alerts to the user. This is not uncommon.

Invalid Results

Results of laboratory examinations present a good example of what can go wrong.
Possible issues include, to name just a few: typos; invalid decimal separators; textual results mixed with numerical ones; results mixed with manual comments or with messages generated by the system or machine (“sample hemolysis”, “below assay range”); units entered in many forms; incorrectly assigned units (e.g. “G/L” confused with “g/L”); missing lower or upper limits of reference ranges; switched lower and upper limits of reference ranges; incorrect assignments between reference ranges and gender or age; incorrectly assigned flags (high, low, abnormal); and dates and times entered in a wrong format. Even data transferred automatically from a laboratory into EDC software through transfer files or a programmatic API can be invalid due to technical issues.

Multiple units make results incomparable; without unification, they cannot simply be included in the analysis. A simple group-by analysis enumerates all the entered units and helps to prepare a list of conversion factors. It is generally a good idea to make all units SI compliant.

Missing Observations

While the bad impact of invalid data is obvious, probably not everyone realises that missing data may affect statistical computations to no lesser degree. Things get worse if the data are not missing at random but follow a pattern. A smaller sample size may increase dispersion in the data, affecting the values of descriptive statistics and the estimation of errors. Statistical tests lose their power. Bias in parameter estimation may be introduced as well. The design of a trial may become unbalanced, which often leads to confounding. Missing observations may distort distribution shapes. Assumptions of statistical methods may be violated, which makes statistical inference unreliable. Missing classes of observations may make the analysis impossible to perform or interpret. Advanced imputation techniques are in common use; however, they are still only an attempt to fight the fire.
One should not forget that imputation techniques introduce artificial data, even if a statistical model says the imputed values are possible. Moreover, misguided data imputation may completely distort the picture of a situation.

Suspicious Observations

Suspicious observations form the next category of issues that significantly lower the quality of data. An observation can be considered suspicious for many reasons. Its value may be too high or too low, acting as an outlier and significantly affecting the results of an analysis or causing the analysis to fail entirely. Such values may be typical for a specific disease (e.g. ESR, AlAT) or may indicate human error, so they should be investigated carefully. But even values that look quite normal and lie inside the normal range may reveal worrying patterns, indicating the potentially artificial nature of the entered data and possibly fraud. Investigations entailed by this class of problems are particularly challenging and subtle. It is not easy to cope with these problems when they happen in the transparent and formalised world of clinical trials.

Suspicious observations, rich in outliers, can seriously damage calculations, distort results and lead to wrong conclusions. Even if a solution in the form of a robust statistical method exists, it is challenging to apply, because hypotheses are usually stated a priori, along with a corresponding closed set of statistical methods that will be used.
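As an illustration of how very high or very low values might be screened, the following is a minimal sketch of a robust outlier rule based on the median and the median absolute deviation (MAD). It is not a method described in the article: the function name `flag_outliers`, the example ESR values and the conventional cut-off of 3.5 on the modified z-score are all assumptions made for the example. A flagged value is only a candidate for investigation, since it may be typical for the disease in question.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds the threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD) instead of the mean and standard deviation, so a single extreme
    value does not mask itself by inflating the spread estimate.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median: the rule cannot be applied.
        return []
    # 0.6745 makes the MAD comparable to a standard deviation
    # when the data are normally distributed.
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]


# Hypothetical ESR results (mm/h); the value 120 is suspicious.
esr = [12, 15, 9, 14, 11, 13, 120, 10]
print(flag_outliers(esr))  # prints [6]
```

Because the rule relies on positional measures, it stays stable even when the data contain the very outliers it is meant to find. Whether such a screen can be used in a given trial still depends on what the signed analysis plan allows.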
Fraud and Misconduct

Fraud and misconduct, whether intentional or caused by insufficient training, can result in damage that is often impossible to fix and is very expensive in the end. One could say that it is far better to have missing data than incorrect data. Inappropriate IMP management, handling or administration procedures, including accidental switching of drug, placebo or comparator, as well as incorrectly applied examination techniques, can damage the data in an unrecoverable manner, because what is done cannot be undone. The sooner such a problem is detected and eliminated, the better, all the more so because collecting all the evidence often requires a long and difficult investigation.

Solutions

Once the statistical analysis plan and protocol are prepared and signed, one cannot simply alter things, especially the set of statistical methods and procedures, without being accused of manipulation. This clearly shows how extremely important it is to ensure data completeness and correctness long before the database is finally locked and the analysis starts. Data validation and correction is not completed immediately: it involves a lot of additional communication and consumes time and resources, so postponing it until shortly before the lock is very risky.

At KCR we make every effort to minimise the risk of having to deal with invalid and incomplete data later on, and of allowing poorly trained staff to carry out study tasks. For this purpose we have introduced close cooperation between data management and biostatistics. While data management personnel are typically responsible for preparing well-designed, CDISC-compliant EDC forms and performing periodic data reviews, the biostatistics department provides both statistical support and programmatic tools for advanced data checking and transformation.
The following kinds of support are currently applied at KCR: preparation-stage analysis; assisted data validation; creating tools for unassisted data validation; writing screening programs for unsolicited, ad hoc data review; providing solutions for automated scour analysis; programming solutions for data exchange between information systems; and, last but not least, training and mentoring.

Preparation-stage Analysis

Every trial starts with a set of common preliminary steps that have a critical impact on data quality. One of the most important prerequisites is to design the EDC forms properly. The key thing is to ensure their compliance with the CDISC CDASH specification. The second step is to secure input fields with appropriate edit checks to prevent the user from entering nonsense data. In addition, text inputs should be encoded with dictionaries whenever possible. This refers not only to fields intended to be medically encoded (MedDRA, LOINC, ICD, etc.) but to any field whose content can be organised in a dictionary to avoid multiple names for a single thing. For encoded fields, the option allowing users to enter their own text should be avoided if possible, as it defeats the purpose of encoding. All these actions are mostly performed by data management; however, the programming skills offered by biostatistics provide an excellent opportunity to improve the process by preparing scripts that query the database in search of missing rules, missing checks and violations of naming conventions.

Assisted Data Validation

This kind of support covers analyses done on request, usually together with personnel from other departments, such as clinicians, administrators and managers. It is mainly used for deeper investigations which cover various aspects of a trial and involve much more advanced methods than usual.
Various statistical methods are in use, for example:
• an extended set of descriptive statistics, including robust ones, covering both classic and positional measures
• graphical analyses using various combinations of scatterplots, boxplots, mosaic plots, histograms and various types of density plots, as well as custom plots revealing specific patterns in data
• analysis of possible outliers, done both graphically and mathematically
• analysis of suspicious data by looking for patterns in coexisting values in view of surrounding circumstances, involving graphical and mathematical methods such as decision trees
• analysis of randomness in data samples
• analysis of patterns in missing data by using specialised graphs

We have found that graphical methods are especially useful in communication with clinicians and managers. Well-designed graphics immediately reveal patterns and let the viewer grasp a lot of information at once. They work particularly well when searching for patterns in missing data, investigating possible fraud and reviewing laboratory data. A good example of such activity is the review of laboratory test results expressed in various units. By applying a set of conversion factors between units, it is possible to unify all values and show them on a common chart along with reference ranges and other information. Such a chart shows immediately which units were chosen and whether they are valid, whether observations have incorrect values, and whether a corresponding reference range (or one of its ends) is missing. This message is easy to understand and reduces the need to work through long tables of numbers.

Chart 1: An exemplary diagram revealing typical issues found in laboratory data: missing values, incomplete and missing reference ranges, incorrect units assigned. (Horizontal axis: result index, 0 to 100. Vertical axis: log of the result in x10^9/L, from 1x10^-3 to 1x10^+2. Legend: missing reference range end (both, lower, upper); reference range (lower, upper); recorded unit (%, 10^3, 10^9/L, 1000/uL, G/L, x10^3/ul, x10^6/uL).)

Assisted Data Validation: Fraud and Misconduct

The detection of potential fraud and misconduct involves both graphical and statistical methods. At the first stage, the biostatistics team tries to picture the situation with simple plots, which are then discussed in a team of clinicians, managers and other specialists. All doubtful patterns are examined by statisticians using a variety of methods, from simple ones to advanced, multidimensional ones. In the end, the statisticians present findings and recommendations for decision-making. Such an investigation can reveal intentional, harmful activity as well as weaknesses in procedures and deficits in training.
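One simple numerical check that could accompany the first-stage plots is a comparison of dispersion between sites. The sketch below, written in Python for illustration only, flags sites whose standard deviation is several times smaller or larger than the typical (median) site. The site identifiers, the blood pressure readings and the factor of 3 are hypothetical, and the check is an assumption about how such a screen might look, not a description of the authors' method.

```python
from statistics import median, stdev


def sites_with_unusual_spread(results_by_site, ratio=3.0):
    """Return {site: standard deviation} for sites far from the typical spread."""
    # Sites with fewer than two results have no defined standard deviation.
    spreads = {site: stdev(values)
               for site, values in results_by_site.items() if len(values) >= 2}
    typical = median(spreads.values())
    if typical == 0:
        return {}  # nothing to compare against
    return {site: sd for site, sd in spreads.items()
            if sd < typical / ratio or sd > typical * ratio}


# Hypothetical systolic blood pressure readings (mmHg) grouped by site.
readings = {
    "01": [118, 132, 125, 141, 110, 128],
    "02": [121, 135, 115, 129, 144, 112],
    "03": [120, 121, 120, 119, 120, 121],  # suspiciously uniform
    "04": [126, 117, 138, 109, 131, 122],
}
flagged = sites_with_unusual_spread(readings)
print(sorted(flagged))  # ['03']
```

A flagged site is a starting point for discussion with clinicians, not evidence of fraud: a homogeneous population or a different measuring device can produce the same picture.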
Well-trained biostatisticians can detect many warning signs and present them to clinicians and managers: abnormally low or high dispersion in data, relationships between means and dispersions, highly skewed distributions (when not expected), departures from the distribution shapes characterised in a protocol, unexpected patterns in data such as “steps” and “clusters”, strange relationships between variables, unexpected patterns in missing data, periodicity in the occurrence of specific issues and many other things.

Creation of Tools for Unassisted, Repeatable Data Validation

The key to success is to perform data checking as often as possible. Daily checking is not unusual. On the other hand, it may become a very time-consuming process, and frequently involving the biostatistics team in running the required analyses does not seem to be the best option. The fact that many valuable analyses do not require any statistical advice has helped us to develop a reporting tool that can be used by the data management staff alone.

The first step is to create a list of required analyses, in which items are prioritised and grouped by predefined categories. For each report, a set of parameters and their default values is determined as well. The next step concerns technical matters, such as the selection of the technology to be used, the method of accessing the database, the user authorisation process, the shape of the graphical user interface and the desired output formats. Since long-running analyses slow down the database, its content should be replicated to another instance or exported to an intermediate file (XML, CSV, etc.) before the analysis. In order to save money, the chosen technology should allow the utilisation of already existing resources, i.e. hardware, software, statistical programmers and administrators.
In this case, if R programmers are already on board, R should be considered as the default development platform before other technologies (.NET, Java, PHP, etc.), which would require hiring additional programmers. We decided to create the tool as a self-contained, windows-based application hosted entirely by R.

GNU R is a well-known, powerful, acclaimed and free statistical package, as well as a high-level programming language. It is a strong SAS competitor, used worldwide by millions of users, huge corporations and organisations, including the FDA. R is an open-source project developed by the R Core Team and supported by the R Consortium, whose members include companies such as Microsoft, Oracle, IBM and Google. The contents of the R library address practically every topic in biostatistics, including clinical research. R is capable of reading data and producing output in various formats, including SAS datasets, Microsoft Office and PDF documents. Several features make R a good candidate for a reliable programmatic environment: extensive support for querying numerous kinds of data sources (also via SQL), an implementation of the reproducible research paradigm, three advanced charting systems, the ability to host embedded user interfaces and web applications, full portability, understood as the ability to run without installation on almost every operating system, and a huge, dynamic community of users.

The created tool is capable of running a wide range of laboratory data reconciliation checks as well as trial-specific analyses. The implemented set of analyses allows for the detection of missing visits, empty mandatory fields, inconsistencies in certain data domains, various kinds of misconduct, discrepancies between the database and the specification in units, normal ranges and flags, missing laboratory examinations, departures from the schedule described in the protocol and invalid results, to name only a few. The tool has proven its usefulness in everyday practice. The full set of analyses now takes only a few minutes and a single report just a few seconds, whereas creating a corresponding Excel report manually used to take long hours. By using the tool we were able to detect serious issues and take corrective action before the situation became serious.

Screening Ad Hoc Analyses

Writing programs for the final statistical report is a perfect opportunity to assess the quality of the collected data long before analysing them. We call the programs written for this purpose “screening programs” and use them to check whether the data is clean enough to perform a certain part of the analysis. Screening analyses are valuable because of the way they come about: while writing the statistical analysis program, the statistician explores the data extensively, writing a number of queries and checking the content of the database in many ways. This often results in useful queries which might otherwise never have been requested. By using the implementation of the reproducible research paradigm available in R, it is possible to embed these analyses directly into the main statistical analysis program.

Automated Scour Analysis

This is an automated enhancement of the screening data validation which works in the background and has more of an “alerting” nature. A program scours the database content periodically in search of specific issues and reports findings via email or stores them in an HTML log.
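The core of such a scour run can be sketched as two small steps: scan the exported records for a specific issue, then render the findings as an HTML log. The example below is a minimal Python sketch under stated assumptions; the field names, the single rule (empty mandatory fields) and the log layout are hypothetical, and the periodic scheduling and email delivery mentioned above are deliberately left out.

```python
import html
from datetime import datetime

# Hypothetical names of fields that must never be empty.
MANDATORY = ("subject_id", "visit", "test", "result")


def scour(records):
    """Return (row number, field, issue) for every empty mandatory field."""
    findings = []
    for row_no, record in enumerate(records, start=1):
        for field in MANDATORY:
            value = record.get(field)
            # A legitimate numeric zero is not treated as empty.
            if value is None or str(value).strip() == "":
                findings.append((row_no, field, "empty mandatory field"))
    return findings


def to_html_log(findings, checked_at):
    """Render the findings of one run as a small HTML fragment."""
    rows = "".join(
        f"<tr><td>{row}</td><td>{html.escape(field)}</td>"
        f"<td>{html.escape(issue)}</td></tr>"
        for row, field, issue in findings
    )
    return (f"<h1>Scour run {checked_at:%Y-%m-%d %H:%M}</h1>"
            f"<table><tr><th>Row</th><th>Field</th><th>Issue</th></tr>"
            f"{rows}</table>")


# Hypothetical export of three laboratory records.
records = [
    {"subject_id": "3", "visit": "Day 1", "test": "RBC", "result": "4.7"},
    {"subject_id": "3", "visit": "Day 1", "test": "WBC", "result": ""},
    {"subject_id": "4", "visit": None, "test": "ESR", "result": "12"},
]
findings = scour(records)
log = to_html_log(findings, datetime(2016, 9, 29, 16, 0))
print(findings)
```

In a real deployment the function would be triggered by the operating system's scheduler and would read from a replicated database or an export file rather than from an in-memory list.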
Several factors make it possible to implement the tool on a simplified minicomputer such as a Raspberry Pi: the time required to complete such an analysis is of low importance, there is no direct, intended interaction between ordinary users and the system, and R is not resource-consuming and can be deployed on a machine with any architecture. This eliminates the need to buy a new machine or install new software on an existing, stable server. An additional small (3.7”) breadboard with an LCD touchscreen enables limited interaction with the script.

Scheme 1: An overall architecture of a typical reporting system. (Components shown: EDC software; direct access via database interfaces (ODBC/JDBC) using SQL; access via export (CSV, XML); scripts, queries, templates (CSS and HTML) and dictionaries; a user interface with a simple data inspector; HTML and PDF output. The sample report shown in the scheme is reproduced below.)

Site ID   SubjID   Lab Test   Screening   Day 1     Day 2
1         3        RBC        OK          OK        OK
1         3        WBC        OK          MISSING   N/A
1         4        ESR        OK          MISSING   MISSING
1         5        Hb         OK          OK        N/A
2         5        B-HCG      N/A         OK        MISSING
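A report like the sample in Scheme 1 can be derived by comparing the examinations planned in the protocol with the results actually received. The following Python sketch is illustrative only: the data structures, the function name and the visit schedule are assumptions, and the two rows used as input merely mimic the first two rows of the sample report.

```python
VISITS = ("Screening", "Day 1", "Day 2")


def completeness_grid(planned, received):
    """Build a status row (OK / MISSING / N/A) for every subject and test.

    planned:  {(subject, test): set of visits at which the test is scheduled}
    received: set of (subject, test, visit) tuples found in the database
    """
    grid = {}
    for (subject, test), visits in planned.items():
        row = {}
        for visit in VISITS:
            if visit not in visits:
                row[visit] = "N/A"        # not scheduled at this visit
            elif (subject, test, visit) in received:
                row[visit] = "OK"
            else:
                row[visit] = "MISSING"    # scheduled but no result recorded
        grid[(subject, test)] = row
    return grid


# Hypothetical schedule and received results for subject 3.
planned = {
    ("3", "RBC"): {"Screening", "Day 1", "Day 2"},
    ("3", "WBC"): {"Screening", "Day 1"},
}
received = {
    ("3", "RBC", "Screening"), ("3", "RBC", "Day 1"), ("3", "RBC", "Day 2"),
    ("3", "WBC", "Screening"),
}
grid = completeness_grid(planned, received)
for (subject, test), row in grid.items():
    print(subject, test, *(row[v] for v in VISITS))
```

Keeping "not scheduled" (N/A) separate from "scheduled but absent" (MISSING) is the design choice that matters here, because only the latter should lead to a query.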
Data Converters

A data converter is a program which transforms data from one form to another. Its sole task is to eliminate the human factor from the process of data transformation as much as possible. Transferring results of clinical examinations from an external laboratory into an EDC database, followed by additional data integrity checks, is a good example of such a process. At KCR we build data converters whenever the format of received data needs to be adjusted. As before, R is used for that purpose, which significantly facilitates complicated operations on data spread over multiple, heterogeneous sources. Advanced querying capabilities, together with the availability of interfaces to numerous database engines, make the process of transferring data extremely simple in comparison to traditional, high-level programming languages, and it can be done in very few lines of code.

Training and Mentoring

Sharing knowledge about the issues that can affect data, and emphasising their impact on the analysis results, is no less important than the analytical support itself. If people understand why certain matters are so important, they are more cooperative and follow the rules more willingly. In order to raise a better, more general awareness of these matters, we decided to organise a series of courses for non-statisticians. The audience has shown great interest, which confirms that our efforts and direction were right.

Summary

Data validation is a process of great importance, with significant implications for the reliability of the final data analysis. There are many possible sources of issues, which makes it really difficult to identify them all and react quickly enough. This is where the programmatic and statistical support provided by the biostatistics team comes to the rescue, at every turn from the early stages of a trial to its very end.
At KCR, both departments cooperate closely and have been organised in a common biometrics unit in order to facilitate the flow of information.

References

1. Oracle Corporation, “Scaling R to the Enterprise. Using R for Enterprise-level Performance, Scalability, Ease of Production Deployment, and Security”, An Oracle White Paper, July 2016, http://www.oracle.com/technetwork/database/options/advanced-analytics/r-enterprise/bringing-r-to-the-enterprise-1956618.pdf
2. Olszewski Adrian, “Is R suitable enough for biostatisticians involved in clinical research and evidence-based medicine?”, June 15th 2015, http://r-clinical-research.com
3. Smith David, Microsoft Corporation (formerly Revolution Analytics), “FDA: R OK for drug trials”, June 21st 2012, http://blog.revolutionanalytics.com/2012/06/fda-r-ok.html
4. Smith David, Microsoft Corporation (formerly Revolution Analytics), “Companies using R in 2014”, May 23rd 2014, http://blog.revolutionanalytics.com/2014/05/companies-using-r-in-2014.html

Adrian Olszewski is a Biostatistician in the Biometrics & Clinical Trial Data Execution Systems Department at KCR, a contract research organisation (CRO). Adrian is involved in delivering informatics and analytical solutions for medicine, pharmacy and clinical laboratory diagnostics. He has a profound knowledge of statistics in the field of evidence-based medicine, especially in clinical research. Adrian is responsible for providing comprehensive support for trials, from the early design considerations through the data analysis, including interim evaluations, to the final report. Adrian is also involved in various external projects on broadly defined data analysis and applications of the R statistical package. Mr Olszewski holds a Master of Science (MSc) degree in Computer Science.
Email: info@kcrcro.com