How I do it: A Practical Database Management System to Assist Clinical
Research Teams with Data Collecting, Organization, and Reporting
Howard Lee, B.S.1, Julius Chapiro, M.D.1, Rüdiger Schernthaner, M.D.1,
Rafael Duran, M.D.1, Zhijun Wang, M.D., Ph.D.1, Boris Gorodetski, B.S.1,
Jean-François Geschwind, M.D.1, and MingDe Lin, Ph.D.2
1Russell H. Morgan Department of Radiology and Radiological
Science, Division of Vascular and
Interventional Radiology, The Johns Hopkins Hospital, Sheikh
Zayed Tower, Ste 7203, 1800
Orleans St, Baltimore, MD, USA 21287
2U/S Imaging and Interventions (UII), Philips Research North
America, 345 Scarborough Road,
Briarcliff Manor, New York 10510
Introduction
With the growing number of clinical research studies in the
field of interventional oncology, selected patient data is
becoming more difficult to store and organize effectively.
Existing hospital EMR (electronic medical record) systems store
patient data in the form of reports and data tables. Our
institution’s EMR system left our researchers dependent on
time-consuming methods to search for suitable patients for
clinical studies: they had to manually read through the reports
and data tables to filter patients and gather data. For most
studies, spreadsheet programs such as Microsoft Excel®
(Microsoft, Washington, USA) are used as a data repository, in
place of a database, to record and organize patient data for
research. Once the spreadsheet is populated, it is manually
filtered by the study parameters and then exported to
statistical analysis software. For statistical analysis, columns
containing text are translated into binary values (1 or 0) so
that the statistical software can accept them. For example, each
tumor entity is assigned its own column. Patient histology
reports are read manually, and under each tumor entity column
researchers enter a 1 for every patient with that tumor and a 0
for every patient without it.
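The manual 1/0 encoding described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the workflow used in the study: the patient IDs, tumor names, and the function name `encode_tumor_columns` are all hypothetical.

```python
# Hypothetical records: patient ID mapped to the tumor entities
# found in that patient's histology report.
patients = {
    "P001": ["HCC"],
    "P002": ["colorectal metastasis", "HCC"],
    "P003": [],
}


def encode_tumor_columns(records):
    """Return (column names, {patient: [1/0 per column]}).

    One column is created per tumor entity; a patient gets 1 if the
    entity is present and 0 otherwise.
    """
    columns = sorted({tumor for tumors in records.values() for tumor in tumors})
    table = {
        pid: [1 if column in tumors else 0 for column in columns]
        for pid, tumors in records.items()
    }
    return columns, table


columns, table = encode_tumor_columns(patients)
print(columns)
for pid, row in table.items():
    print(pid, row)
```

Deriving the columns from the data, rather than typing them by hand, is what removes the repetitive manual step; the statistical package still receives the same 1/0 matrix.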
© 2014 AUR. All rights reserved.
Correspondence to: Jean-François Geschwind
HHS Public Access
Author manuscript
Acad Radiol. Author manuscript; available in PMC 2016 April
01.
Published in final edited form as:
Acad Radiol. 2015 April ; 22(4): 527–533.
doi:10.1016/j.acra.2014.12.002.
This method of data storage limits both the organization and the
quality of the data. Data input and analysis without a database
run a higher risk of incorrect data entry, unintended patient
exclusion, and duplicate records. Furthermore, data selection
and calculation are time consuming. An alternative could be the
clinical research database that Meineke et al. proposed (1).
However, it is not specific enough for interventional oncology
research and would need additional optimization, for example,
the capability to automatically calculate variables such as
tumor staging systems and to record information about multiple
treatment sessions.
The purpose of this study was to provide a tool that improves
workflow efficiency through the use of a clinical research
database management system (DBMS) optimized for interventional
oncology clinical research.
Materials and Methods
This was a single-institution prospective study. The study was
compliant with the Health
Insurance Portability and Accountability Act (HIPAA) and was
waived by the Institutional
Review Board.
Database and Query Interface Design
The presented database management system has two distinct
parts, the database server and the client interface, illustrated
in Figure 1. The database is run by software (MySQL, Oracle
Corporation, California, USA and phpMyAdmin, The phpMyAdmin
Project, California, USA) on a central computer server within
the department (2, 3). Authorized users are granted access to
this password-protected, encrypted server (HIPAA compliant).
Multiple users can concurrently add, edit, and query data
remotely through a customized graphical user interface (GUI)
built with Microsoft Access® (Microsoft, Washington, USA). Any
data changes are immediately logged for others to see. The
database performs automatic calculations using queries, which
are user-defined search criteria. Queries can be saved, rerun,
and exported to spreadsheets. Queries aid in data analysis and
increase study productivity (4). They are powerful tools for
filtering and sorting datasets. Figure 2 illustrates the query
interface and an example request to the database.
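As a rough illustration of what a saved query does, the sketch below runs the example search (male patients over 40 with HCC) as a parameterized SQL query. It uses Python's built-in `sqlite3` module with an in-memory database purely so that it is self-contained; the system described here runs on MySQL with a Microsoft Access front end. The table name, column names, and sample rows are assumptions, not the study's schema.

```python
import sqlite3

# In-memory stand-in for the departmental server (assumption: the real
# system uses MySQL; the schema below is hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient ("
    "mrn TEXT PRIMARY KEY, age INTEGER, gender TEXT, tumor_type TEXT)"
)
conn.executemany(
    "INSERT INTO patient VALUES (?, ?, ?, ?)",
    [
        ("0001", 62, "m", "HCC"),
        ("0002", 38, "m", "HCC"),
        ("0003", 55, "f", "HCC"),
        ("0004", 71, "m", "ICC"),
    ],
)


def find_patients(conn, min_age, gender, tumor_type):
    """Return the MRNs of patients older than min_age with the given
    gender and tumor type. Parameters are bound, not string-formatted."""
    rows = conn.execute(
        "SELECT mrn FROM patient "
        "WHERE age > ? AND gender = ? AND tumor_type = ? ORDER BY mrn",
        (min_age, gender, tumor_type),
    )
    return [mrn for (mrn,) in rows]


matches = find_patients(conn, 40, "m", "HCC")
print(matches)
```

Because the criteria are parameters, the same query can be saved and rerun with different values as new patients are added.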
Graphical User Interface Design and Utility
In our research environment, the database GUI was created to
facilitate patient data input. This was done with custom,
user-friendly interface forms that contain textboxes and labels
for demographic data, treatment information (e.g. conventional
transarterial chemoembolization (TACE)), tumor types, dates and
types of radiological exams, etc. The GUI is used to view
patient data and allows users to add/edit data (Figure 3). The
database interface is not limited to one form. It can have
multiple forms, shown as tabs, to help group various medical
data. Figure 4 shows an example of multiple tabs for groups of
related data.
Automatic Calculations
Automatic calculations may be run between values, such as
dates. For example, the database may calculate the time between
baseline imaging, follow-up imaging, treatment dates, pre- and
post-treatment dates, date of diagnosis, and patient’s date of
death in relation to a particular treatment or event (e.g.
randomization), which is essential for survival studies. Using
these queries, the database can also calculate the median
overall survival automatically. The database also automatically
calculates clinical scores such as the Child-Pugh score and the
Barcelona Clinic Liver Cancer (BCLC) stage, as shown in Figure 5
(5). For our purposes, the Child-Pugh score and BCLC stage were
calculated using baseline data before a patient’s first
embolization, as is typically done for staging. The illustrated
calculators can be revised as needed. Once patient blood data is
available, queries are run to produce a list of all patients
with their Child-Pugh scores, which researchers can then quickly
retrieve.
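A score calculator of this kind can be sketched as a small function. The cut-offs below are the commonly cited Child-Pugh criteria (bilirubin in mg/dL, albumin in g/dL, INR, ascites, encephalopathy); they are not taken from this paper, and the function name and category labels are hypothetical. The sketch is for illustration only and should be checked against the scoring definition a study actually uses.

```python
def child_pugh(bilirubin_mg_dl, albumin_g_dl, inr, ascites, encephalopathy):
    """Return (score, class) using commonly cited Child-Pugh cut-offs.

    ascites: "none", "mild", or "moderate_severe"
    encephalopathy: "none", "grade_1_2", or "grade_3_4"
    These labels and thresholds are assumptions for this sketch.
    """
    score = 0
    # Each of the five parameters contributes 1 to 3 points.
    score += 1 if bilirubin_mg_dl < 2 else 2 if bilirubin_mg_dl <= 3 else 3
    score += 1 if albumin_g_dl > 3.5 else 2 if albumin_g_dl >= 2.8 else 3
    score += 1 if inr < 1.7 else 2 if inr <= 2.3 else 3
    score += {"none": 1, "mild": 2, "moderate_severe": 3}[ascites]
    score += {"none": 1, "grade_1_2": 2, "grade_3_4": 3}[encephalopathy]
    # 5-6 points: class A, 7-9: class B, 10-15: class C.
    grade = "A" if score <= 6 else "B" if score <= 9 else "C"
    return score, grade


print(child_pugh(1.0, 4.0, 1.1, "none", "none"))
print(child_pugh(2.5, 3.0, 2.0, "mild", "none"))
```

Keeping the thresholds in one function mirrors the point made above that the calculators can be revised as needed: a change to the scoring rule is made in one place and applies to every patient.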
Statistical Output
Another powerful feature of the database is its ability to
provide a first tier of statistical information. Using this GUI,
the user defines the search criteria and runs queries to obtain
immediate statistical information about a particular set of
parameters. With this feature, the database can quickly output
an accurate summary of patient data, for example, how many
patients have colorectal carcinoma and underwent conventional
TACE.
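The colorectal carcinoma example can be expressed as a single counting query. The sketch below again uses an in-memory `sqlite3` database as a stand-in for the MySQL server; the two tables, their columns, and the sample rows are hypothetical. `COUNT(DISTINCT ...)` is used so that a patient with several TACE sessions is counted once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE patient (pid INTEGER PRIMARY KEY, tumor_type TEXT);
    CREATE TABLE treatment (pid INTEGER, treatment_type TEXT);
    INSERT INTO patient VALUES
        (1, 'colorectal carcinoma'), (2, 'colorectal carcinoma'),
        (3, 'HCC'), (4, 'colorectal carcinoma');
    INSERT INTO treatment VALUES
        (1, 'cTACE'), (1, 'cTACE'), (2, 'DEB-TACE'),
        (3, 'cTACE'), (4, 'cTACE');
    """
)


def count_patients(conn, tumor_type, treatment_type):
    """Count distinct patients with the tumor type who received the
    treatment at least once."""
    (n,) = conn.execute(
        "SELECT COUNT(DISTINCT p.pid) "
        "FROM patient p JOIN treatment t ON t.pid = p.pid "
        "WHERE p.tumor_type = ? AND t.treatment_type = ?",
        (tumor_type, treatment_type),
    ).fetchone()
    return n


print(count_patients(conn, "colorectal carcinoma", "cTACE"))
```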
Questionnaire Assessment
A questionnaire (15 questions) was designed and distributed to
21 board-certified interventional radiologists who conduct
clinical research at our academic hospital, including Phase I,
II, and III clinical trials and retrospective studies. The
questionnaire assessed how data is managed in retrospective
studies and how likely respondents would be to use the database.
The questionnaire is shown in Table 1. The purpose of the
questionnaire was 1) to illustrate the general scope of the
problems researchers were having with Excel and data
organization, such as wasted effort working with duplicate
patients and unintentional failure to include available
patients, and 2) to gauge how receptive they would be to a
database system. Using this information, the database system
was constructed. Weekly progress updates with the clinical
research team ensured that the original goals, set out to
address the deficiencies of Excel, were being met.
Results
Questionnaire Results
All 21 interventional radiologists completed the questionnaire.
Self-evaluation results are shown in Figure 6. In data
collection and analysis, over 50% (11/21) spent most of their
time searching, filtering, and/or categorizing data. However,
about 50% (10/21) spent little to no time calculating the data.
67% of respondents (14/21) realized at some point that some
patients who should have been excluded had been erroneously
included and that other patients had been erroneously omitted.
Over 85% (18/21) were very
receptive to using software that
produces group summaries such as totals of each tumor type
with minimal effort, calculates
clinical staging and score systems automatically, and also
allows remote access for multiple
users to add/edit data in a central server with data modification
logs.
Query Interface Output
In Figure 7, the query of male patients, over 40 years old, with
HCC is run. Figure 8 shows a
query result of patients with TACE and Child-Pugh score A
calculated by the database.
Figure 9 illustrates an interval of time between two events as a
query that can be calculated
automatically (e.g. time elapsed between two embolization
procedures). The output of the queries described above is shown
in a structured and concise list, which can be exported for
further study-specific analysis.
Discussion
The main finding of this study is that there is a need for a much
more time efficient and
accurate way to store, retrieve, and analyze patient data for
clinical research studies. The
database management system presented here fulfills these needs.
This was achieved through
the use of automatic calculations, interface forms, queries, etc.
With a personalized
interface, data access, entry, organization, queries, calculations,
and export processes are
seamlessly performed to assist clinical research with data and
statistical analysis.
Furthermore, the database is a unified repository of clinical
research information and a
shared resource among the clinical research team. This allows
for a multi-user level
experience where there can be simultaneous access to the data
and where the efforts of each
individual in adding/appending new information can be used by
the entire team.
With the presented database in use, the effort for clinical
studies can focus on conducting statistical analyses and
interpreting data rather than preparing data for analysis (6).
All retrospective data can be merged into this database,
enabling a centrally maintained and shared resource. Our
clinical research team now has access to a customized database
of patients with a large number of clinical parameters, allowing
a vast combination of queries to form or support study
hypotheses. The user-defined, GUI-connected interface is
invaluable for anyone collecting data, as it facilitates data
entry and minimizes data entry errors.
In previous data collection and analysis, converting spreadsheet
data to binary/numeric format was time consuming and
impractical. The database presented in this study removes the
burden of manually searching, organizing, and calculating data.
Calculations, especially more complex ones such as clinical
staging scores, can now be done automatically. Prior to
implementing the presented database system, a typical Excel
spreadsheet for the clinical studies at our institution would
have over 100 columns. These columns included patient
demographics, repeat treatment dates and types (new columns per
TACE session), and repeated pre-/post-imaging dates and types
(new columns per multi-modular scan). Tracking medical data is
frequently difficult because of the large number of columns in
the spreadsheet. Compared to a typical Excel spreadsheet with
many columns, browsing and adding prospective data through the
database interface presented here is more organized and
practical, with ten defined tabs for data groups ranging from a
patient’s basic information to treatments to survival status.
In addition, the
database interface lists all
repeating treatments and imaging per patient as rows instead of
columns, facilitating comparisons between multiple treatments of
a patient. By combining the database’s statistical output with
automatic calculation queries, reports can be generated for
virtually any parameter. This is helpful not only in radiology
but also for other studies and hospital information systems.
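Storing repeated treatments as rows rather than as extra columns corresponds to a one-to-many table layout. The sketch below shows one way this could look, using `sqlite3` in memory for self-containment; the table and column names are hypothetical and not the study's schema. The `UNIQUE` constraint on the medical record number also illustrates how a database can refuse a duplicate patient, one of the spreadsheet problems described earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE patient (
        pid INTEGER PRIMARY KEY,
        mrn TEXT UNIQUE NOT NULL          -- one row per medical record number
    );
    CREATE TABLE embolization (
        id INTEGER PRIMARY KEY,
        pid INTEGER NOT NULL REFERENCES patient(pid),
        session_date TEXT NOT NULL,       -- ISO date, sorts chronologically
        treatment_type TEXT NOT NULL
    );
    """
)
conn.execute("INSERT INTO patient (pid, mrn) VALUES (1, '0001')")
conn.executemany(
    "INSERT INTO embolization (pid, session_date, treatment_type) "
    "VALUES (?, ?, ?)",
    [(1, "2006-01-10", "cTACE"),
     (1, "2006-03-01", "cTACE"),
     (1, "2006-06-15", "DEB-TACE")],
)


def sessions_for(conn, pid):
    """Each treatment session is a row, however many the patient has had."""
    return conn.execute(
        "SELECT session_date, treatment_type FROM embolization "
        "WHERE pid = ? ORDER BY session_date",
        (pid,),
    ).fetchall()


def duplicate_rejected(conn, mrn):
    """Return True if inserting an already-present MRN is refused."""
    try:
        conn.execute("INSERT INTO patient (mrn) VALUES (?)", (mrn,))
        return False
    except sqlite3.IntegrityError:
        return True


print(sessions_for(conn, 1))
print(duplicate_rejected(conn, "0001"))
```

A fourth session is simply a fourth row; no new spreadsheet columns are needed, and the session count per patient is a plain `COUNT(*)`.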
The database management system in this study has some
limitations. A database system may not be suitable for all kinds
of research teams, and several factors indicate whether a
database is needed. A previous report on data collection
provides applicable examples and guidelines for determining
whether implementing a database is feasible in a given
environment (7). Depending on the environment and context, a
database may not be ready for use right away because it needs
additional testing. Furthermore, the database needs a dedicated
server to host it along with the data. Using the database
interface requires training. Someone who specializes in
databases, such as a database administrator, needs to teach
researchers and other potential users how to use the database
interface and query interface for filtering patients and
obtaining statistics. This is especially important for more
advanced queries and for developing additional GUIs. It should
be noted that
Microsoft Access is being used in this work as a “front-end”
interface that communicates
with the SQL database to query (filter) data, and for
input/appending to existing data. Other
software such as FileMaker Pro (FileMaker, Santa Clara, CA)
and REDCap would serve a
similar function (8). The SQL database is needed so that
multiple users can access the stored data at the same time, to
provide greater security, stability, and performance, and to
serve as a unified repository of clinical research information
that can be shared by the research team (9, 10). Also, the
database administrator must not only construct the database on
a server with input from clinicians and other end users but
also maintain it (11, 12). Typical maintenance includes routine
backups, altering the database structure and interface for new
data types, and updating the database and client software. A
server can be hosted on a PC or online; in either case, all
parties involved can access it locally within the same network
or remotely. Furthermore, databases can be enabled to
communicate with other databases. While the initial setup
effort is high and the learning curve is steep, the database
allows for fluid data entry in an organized fashion, querying
results including calculations, and storing data while
supporting simultaneous user access. With the variety of
research teams and departments, ideally each suitable team
should have their own database.
This is not necessarily only for interventional oncology but also
for any specific area of
research, for example, studies with patients undergoing
ablation, percutaneous abscess
drainage (PAD), etc. These databases can be connected for
interdisciplinary research to
provide a broader scope of data and facilitate data search (13).
Conclusion
The current database implementation and interface allow much
faster and more detailed retrospective analysis of patient
cohorts. In addition, they facilitate data management and
standardized information output for ongoing prospective clinical
trials. The database management system with an interface is a
work-efficient and robust tool that provides a significant edge
over manual retrieval of patient records by filtering data and
assisting statistical analysis in a study-relevant fashion.
Acknowledgments
Funding and support has been provided by NIH/NCI R01
CA160771, P30 CA006973, and Philips Research North
America, Briarcliff Manor, NY, USA.
References
1. Meineke FA, Staubert S, Lobe M, Winter A. A
comprehensive clinical research database based on
CDISC ODM and i2b2. Studies in health technology and
informatics. 2014; 205:1115–9. [PubMed:
25160362]
2. Stobart, S.; Vassileiou, M. MySQL Database and
PHPMyAdmin Installation PHP and MySQL
Manual. Springer; London: 2004. p. 461-73.
3. Kuenz, D. Manage data for free with MySQL. Element K
Journals; 2001. p. 7-10.
4. Coronel, CMS.; Rob, P. Database systems: design,
implementation, and management. 9. Boston,
Massachusetts: Cengage Learning; 2009.
5. Llovet JM, Di Bisceglie AM, Bruix J, et al. Design and
endpoints of clinical trials in hepatocellular
carcinoma. Journal of the National Cancer Institute. 2008;
100(10):698–711. [PubMed: 18477802]
6. Kanas G, Morimoto L, Mowat F, O’Malley C, Fryzek J,
Nordyke R. Use of electronic medical
records in oncology outcomes research. ClinicoEconomics and
outcomes research : CEOR. 2010;
2:1–14. [PubMed: 21935310]
7. Schmier JK, Kane DW, Halpern MT. Practical applications of
usability theory to electronic data
collection for clinical trials. Contemporary clinical trials. 2005;
26(3):376–85. [PubMed: 15911471]
8. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde
JG. Research electronic data capture
(REDCap)—A metadata-driven methodology and workflow
process for providing translational
research informatics support. Journal of Biomedical
Informatics. 42(2):377–81. [PubMed:
18929686]
9. MySQL Database Provides Full Transactional Support.
Worldwide Databases. 2002; 14(11) 0-N/A.
10. Oracle Improves Database Performance with Latest
Development Milestone Release for MySQL 5.7; New Release of the
World’s Most Popular Open Source Database is 2x Faster than
MySQL 5.6 and Over 3x Faster than MySQL 5.5 in Benchmark Tests.
2014.
11. Xie SX, Baek Y, Grossman M, et al. Building an integrated
neurodegenerative disease database at
an academic health center. Alzheimer’s & dementia : the journal
of the Alzheimer’s Association.
2011; 7(4):e84–93.
12. Parkes, D.; Lowman, M.; Andres, C., et al., editors. Pro
Python System Administration: Apress.
2010. Automatic MySQL Database Performance Tuning; p. 329-
48.
13. Piriyapongsa J, Bootchai C, Ngamphiw C, Tongsima S.
microPIR: an integrated database of
microRNA target sites within human promoter sequences. PloS
one. 2012; 7(3):e33888. [PubMed:
22439011]
Figure 1. The Dataflow Chart
This chart shows a general layout of the database server and its
clients. It illustrates how the
database management system performs queries (orange circle)
such as statistical analysis.
Multiple computers are granted access to the database. The blue
rectangles represent the
database management system software. Researchers can utilize
the database client graphical
user interface (GUI) to import data without needing to format.
Researchers also control data
through the GUI. Queries are usually run through the GUI to
provide wanted results. Once
the results are obtained, researchers export the query to a
spreadsheet, illustrated by the
green rectangle.
Figure 2.
This figure illustrates the query interface. In this example
query, a list of male patients over
the age of 40 with hepatocellular carcinoma (HCC) is wanted.
The user inputs search criteria
for age, gender, and tumor type, “>40”, “m”, and “HCC”
respectively. MRN: medical record
number.
Figure 3.
This form illustrates how users input data to the database. The
form is divided into three parts:
(a) Patient Form – Data consists of basic patient information.
Patient Identification (PID) is a unique number generated by the
database to identify each patient. LAST MODIFIED is a timestamp
of when the data was most recently updated or added. MODIFIED BY
is a text box that records who updated/added data. (a1) shows
the total number of patients in the database.
(b) Tumor – Data consists of a patient’s primary and secondary
tumors in the liver. The dropdown allows users to select a tumor
or add new tumor types (e.g. metastatic disease). (b1) shows how
many tumor types the patient has in the liver.
(c) Embolization Procedures – Data consists of intra-arterial
therapy (IAT) sessions. (c1) shows how many IAT sessions a
patient has undergone.
Figure 4.
This figure illustrates the tabbed form, where each group of
related data is shown as an individual tab to assist user
navigation. The display of patient
identification information
and comments are maintained while the user navigates to
different tabs to preserve the scope
and field of view for each patient.
Figure 5.
This form shows a patient’s Child-Pugh score and Barcelona
Clinic Liver Cancer (BCLC)
stage. They are automatically calculated when provided with
pertinent patient data. The
“Calculate” buttons are used to refresh the form should any
patient data value change.
PT/INR: Prothrombin Time/International Normalized Ratio; PS:
Performance Status.
Figure 6.
The self evaluation results are from Table 1.
Figure 7.
This figure illustrates the output of a query for male patients
with hepatocellular carcinoma
(HCC). The interface outputs a list of all patients matching the
search criteria.
Figure 8.
This is the output of a query for patients who had undergone
TACE in 2006 (P_PROC_DATE column) with Child-Pugh Class A, here
labeled as “Classification”. The automatically calculated
Child-Pugh class can also be used for querying.
Figure 9.
The database automatically calculates the days between TACE
sessions for each patient as a
query (red circle). The current treatment “EMBODate,” is
subtracted from the next
treatment, “Next_EMBO.” Empty fields indicate that the patient
has undergone only one
treatment or the session is the latest treatment. Because the
query is saved, double clicking
the query indicated by the red circle refreshes the calculation
for the entire database of
patients.
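The interval calculation in this caption can be sketched as follows. The session dates and patient IDs are made up, and the function name is hypothetical; as in the figure, the last (or only) session of a patient has no next treatment, so its interval is left empty (`None`).

```python
from datetime import date

# Hypothetical embolization dates per patient.
sessions = {
    "P001": [date(2006, 1, 10), date(2006, 3, 1), date(2006, 6, 15)],
    "P002": [date(2006, 2, 20)],
}


def days_to_next_session(dates):
    """Pair each session date with the days until the next session.

    The current treatment date is subtracted from the next one; the
    latest session has no successor and gets None.
    """
    ordered = sorted(dates)
    gaps = [(nxt - cur).days for cur, nxt in zip(ordered, ordered[1:])]
    return list(zip(ordered, gaps + [None]))


for pid, dates in sessions.items():
    for session_date, gap in days_to_next_session(dates):
        print(pid, session_date.isoformat(), gap)
```

Sorting inside the function means the result does not depend on the order in which sessions were entered.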
Table 1
Questionnaire Assessment
Response: Yes | No
Question: I searched and filtered data manually
Example: Sorting and copying relevant data
Question: I inputted formulas and Excel functions to calculate scores, response rates, or statistics in my Excel spreadsheet
Question: I summarized my Excel data in a report
Example: Total number of Child-Pugh A patients
Question: I converted non-binary data (volume measurements, numeric values, occurrence rates of symptoms) into binary data (0/1) by defining a cut-off point to differentiate
Example: Between responder and non-responder to a given therapy for statistical analysis
Question: I have done statistical analysis myself
Question: I unknowingly produced duplicate data that I later found out was already collected by another colleague

Response: 0–20% | 21–40% | 41–60% | 61–80% | 81–100%
Question: From the beginning of data collection to finishing analysis, about what percentage of the total time spent for a single retrospective study did you spend on:
Question: Querying/filtering/categorizing data?
Example: Defining subsets of patients with certain criteria such as patients treated only with cTACE or only with DEB-TACE
Question: Calculating data?
Example: Min, Max, Mean, Sum, Clinical Scores such as Child-Pugh

Response: Very Unlikely | Unlikely | Neutral | Likely | Very Likely
Question: If given the opportunity, how likely will you use software that:
Question: Produces group summaries with minimal effort?
Example: Total number of Child-Pugh A patients
Question: Calculates clinical staging and score systems automatically?
Question: Allows multiple users to add and edit data into the same database so that redundant collection of the same patients by different colleagues can be avoided?
Question: Allows users to track data modifications?
Question: Stores data in a centralized location with remote access?
Sponsored by
Center for Cancer Research
National Cancer Institute
Clinical Data Management
Introduction
• Clinical data management (CDM) consists of various
activities involving the handling of data or information
that is outlined in the protocol to be
collected/analyzed. CDM is a multidisciplinary activity.
• This module will provide an overview of clinical data
management and introduce the CCR’s clinical
research database. By the end of this module, the
participant will be able to:
• Discuss what constitutes data management activities in
clinical
research.
• Describe regulations and guidelines related to data
management
practices.
• Describe what a case report form is and how it is developed.
• Discuss the traditional data capture process.
• Describe how protocols are developed in Cancer Central
Clinical
Database (C3D).
Clinical Data Management
• A multi-disciplinary activity that includes:
• Research nurses
• Clinical data managers
• Investigators
• Support personnel
• Biostatisticians
• Database programmers
• Various activities involving the handling of
information outlined in protocol
Clinical Data Management
Activities
• Data acquisition/collection
• Data abstraction/extraction
• Data processing/coding
• Data analysis
• Data transmission
• Data storage
• Data privacy
• Data QA
Guidelines and Regulations…
• Good Clinical Practice (GCP):
• Trial management; data handling, record
keeping (2.10, 5.5.3 a-d)
• Subject and data confidentiality (2.11; 5.5.3
g)
• Safety reporting (4.11)
• Quality control (4.9.1; 4.9.3; 5.1.3)
• Records and reporting (5.21; 5.22)
• Monitoring (5.5.4)
…Guidelines and Regulations
• 21 CFR Part 11
• Applies to all data (residing at the institutional site and
the sponsor’s site) created in an electronic record that
will be submitted to the FDA
• Scope includes:
• validation of databases
• audit trail for corrections in database
• accounting for legacy systems/databases
• copies of records
• record retention
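The "audit trail for corrections" item can be illustrated with a database trigger that records every change to a value. This is a minimal sketch of the idea only and makes no claim to satisfy 21 CFR Part 11; it uses an in-memory `sqlite3` database, and the table names, column names, and user names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE lab_value (
        id INTEGER PRIMARY KEY,
        subject_id TEXT,
        albumin REAL,
        modified_by TEXT
    );
    CREATE TABLE audit_log (
        id INTEGER PRIMARY KEY,
        row_id INTEGER,
        old_value REAL,
        new_value REAL,
        changed_by TEXT,
        changed_at TEXT DEFAULT CURRENT_TIMESTAMP
    );
    -- Every correction to albumin is copied into audit_log, keeping the
    -- old value, the new value, and who made the change.
    CREATE TRIGGER log_albumin_update
    AFTER UPDATE OF albumin ON lab_value
    BEGIN
        INSERT INTO audit_log (row_id, old_value, new_value, changed_by)
        VALUES (OLD.id, OLD.albumin, NEW.albumin, NEW.modified_by);
    END;
    """
)
conn.execute("INSERT INTO lab_value VALUES (1, 'S-001', 3.1, 'alice')")
conn.execute(
    "UPDATE lab_value SET albumin = 3.4, modified_by = 'bob' WHERE id = 1"
)

audit_rows = conn.execute(
    "SELECT row_id, old_value, new_value, changed_by FROM audit_log"
).fetchall()
print(audit_rows)
```

Putting the logging in a trigger rather than in application code means a correction cannot bypass the log by using a different client.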
Case
Report
Forms
What is a Case Report Form
(CRF)?...
• Data-reporting document used in a clinical
study
• Collects study data in a standardized
format:
• According to the protocol
• Complying with regulatory requirements
• Allowing for efficient analysis
…What is a Case Report Form
(CRF)?
• Allows for efficient and complete data
collection, processing, analysis and
reporting
• Facilitates the exchange of data across
projects and organizations especially
through standardization
• Types: Paper, electronic/web interface
• Accompanied by a completion/instruction
manual
CRF Relationship to Protocol
• Protocol determines what data should be
collected on the CRF
• All data must be collected on the CRF if
specified in the protocol
• Data that will not be analyzed should not
appear on the CRF
General Considerations for CRF
Development…
• Collect data with all users in mind
• Collect data required by the regulatory
agencies
• Collect data outlined in the protocol
• Be clear and concise with your data
questions
…General Considerations for
CRF Development
• Avoid duplication
• Request minimal free text responses
• Collect data in a fashion that:
• allows for the most efficient computerization
• allows similar data to be collected across studies
Elements of a CRF
• The term CRF indicates a single page
• A series of CRF pages makes up a CRF Book
• One CRF book is completed for each subject
enrolled in a study
• Three major parts:
• Header
• Safety related modules
• Efficacy related modules
Header Information
• Key identifying Information
• MUST HAVES
• Study Number
• Site/Center Number
• Subject identification number
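For an electronic CRF, the three must-have header items can be checked before a page is accepted. The sketch below is one hypothetical way to do this; the class name, field names, and sample identifiers are assumptions and do not describe C3D or any particular system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrfHeader:
    """The three key identifying items of a CRF page header."""
    study_number: str
    site_number: str
    subject_id: str


def missing_header_fields(header):
    """Return the names of header fields that are empty or blank."""
    return [
        name for name, value in vars(header).items()
        if not str(value).strip()
    ]


# Made-up identifiers for illustration.
complete = CrfHeader("STUDY-01", "SITE-07", "SUBJ-0042")
incomplete = CrfHeader("STUDY-01", "", "SUBJ-0042")
print(missing_header_fields(complete))
print(missing_header_fields(incomplete))
```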
Safety Modules
• Keep safety analysis requirements of the protocol
in mind
• Follow the general guidelines for CRF development
• Safety Modules include:
• Demographic information
• Adverse Events
• Medical History/Cancer history (e.g., diagnosis, staging)
• Physical Exam, including Vital Signs
• Concomitant/Concurrent Medications/Measures
• Deaths
• Drop outs/off-study reasons
• Eligibility confirmation
Efficacy Modules
• Considered to be “unique” modules and can be
more difficult to develop
• Protocol dictates the elements required in efficacy
modules
• Define
• Key efficacy endpoints of trial (primary and secondary)
• Additional tests to measure efficacy (e.g.: QOL)
• How lesions will be measured (longest diameter, bi-
dimensional, volumetric)
• CR, PR, SD, PD
• Required diagnostics
• Include appropriate baseline measurements
• Repeat same battery of tests
Standard CRFs
• Allows rapid data exchange
• Removes the need for mapping during data
exchange
• Allows for consistent reporting across protocols,
across projects
• Promotes monitoring and investigator staff
efficiency
• Allows merging of data between studies
• Provides increased efficiency in processing and
analysis of clinical data
CRF Development Process…
• Begins as early in the study development
process as possible
• Responsibility for CRF design can vary
between clinical research organizations (e.g.:
CRA, data manager, Research Nurse,
Database Development, Dictionary Coding,
Standards)
• Include all efficacy and safety parameters
specified in the protocol using standard
libraries
…CRF Development Process
• Collect ONLY data required by the protocol
• Work with protocol visit schedule
• Interdisciplinary review is necessary
• Note:
• each organization has its own process for
review/sign-off
• Should include relevant members of the project
team involved in conduct, analysis and reporting of
the trial
Properly Designed CRF
• Allows components or ALL of the CRF
pages to be reused across studies
• Saves time
• Saves money
Poorly Designed CRF
• Poorly designed CRFs will result in data
deficiencies including:
• Data not collected as per protocol
• Collecting unnecessary data (i.e.: data not
required to be collected per protocol)
• Impeding data entry process
• Database requiring modifications throughout
study
Electronic CRFs
• The use of Remote Data
Capture (RDC) is increasing
• In general, the concepts for the design of
electronic CRFs/RDC screens are the
same as covered for paper
• No need to print and distribute paper
CRF Completion
CRF Completion…
• According to GCP Section 4.9.1, the investigator
should ensure the accuracy, completeness,
legibility, and timeliness of the data reported on
the CRFs and in all required reports. This
includes ensuring:
• all sections have been completed, including the
header with identifying items
• all alterations have been properly made
• all adverse events are fully recorded and that for all
serious adverse events, any specific documentation
has been completed
…CRF Completion
• Data is taken from the source documents
(e.g.: medical record) and entered onto the
CRFs by study personnel. This is referred
to as data abstraction.
• Only designated members of the research
staff should be allowed to record and/or
correct data in the CRFs
• Typically this responsibility resides with the
Data Manager/ Research Nurse
Tips: CRF Completion…
1. CRF completion/instruction manual should be
observed to ensure the accuracy,
completeness, legibility, and timeliness of the
data reported to the sponsor
2. Make sure appropriate protocol, investigator
and subject identifying information is included
in the Header (for RDC, may be pre-populated)
3. Ensure data is entered in the correct location or
data field
…Tips: CRF Completion…
4. Use the appropriate units of measurement
(UOM), and be consistent
5. Check to see that data is consistent across
data fields and across CRFs
• E.g.:
• Make sure visit dates match dates on the
laboratory or other procedure reports;
• Make sure the birth date matches the
subject’s age;
6. Use only the abbreviations authorized per
completion/instruction manual
7. Double check your spelling
…Tips: CRF Completion
8. Watch for transcription errors
• E.g.: sodium level should be “135” but is entered as
“153”
9. Do not allow entries to run outside the
indicated data field; this important data might
be missed during data processing
10. Use “comments” section to elaborate on any
information, but keep to a minimum
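Tip 5 (the birth date should match the subject's age) can be checked mechanically. A minimal Python sketch under stated assumptions: the function names `age_on` and `age_matches_birth_date` are hypothetical, and age is assumed to be recorded in completed years as of the visit date.

```python
from datetime import date


def age_on(birth_date: date, on_date: date) -> int:
    """Age in completed years on a given date."""
    years = on_date.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in on_date's year.
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def age_matches_birth_date(birth_date: date, visit_date: date, recorded_age: int) -> bool:
    """True when the age recorded on the CRF agrees with the birth date at the visit."""
    return age_on(birth_date, visit_date) == recorded_age
```

A mismatch does not say which field is wrong; it only flags the pair for review against the source document.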
Timeliness of CRF Completion
• Ideally CRFs should be completed as
soon after the subject’s visit as possible
• Ensures that information can be retrieved
or followed-up on while the visit is still
fresh in the healthcare provider’s mind,
and while the subject and/or the
information is still easily accessible
REMEMBER….
• Data cannot be entered onto a CRF if it is not in
the medical record or for some documents, in
the research record
• If the individual completing the CRF finds
missing or discrepant source data, he/she
should:
• Notify the research nurse or health care provider who
then will provide the data
• If applicable, contact outside source (i.e.: outside lab
or doctor's office)
Common Errors …
• Logical
• date of the second visit is earlier than the first
visit
• Inaccurate information
• source document says one thing, the CRF
says another
• Omissions
• AE is recorded on the CRF but not on the
source document
• Transcription errors
• date errors, 11-2-59 instead of 2-11-59
…Common Errors
• Abbreviations
• unless an approved list of abbreviations is
distributed and utilized, data entry personnel
often misinterpret abbreviations
• Spelling errors
• Illegible entries/“write-overs”
• Writing in margins
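The logical error listed above (a second visit dated earlier than the first) is easy to detect automatically. A minimal Python sketch; the function name `out_of_order_visits` and the visit labels are hypothetical, and visits are assumed to be supplied in protocol order.

```python
from datetime import date
from typing import List, Tuple


def out_of_order_visits(visits: List[Tuple[str, date]]) -> List[str]:
    """Return labels of visits dated earlier than the visit listed before them.

    `visits` is expected in protocol order, e.g. [("Visit 1", d1), ("Visit 2", d2)].
    """
    flagged = []
    for (_, prev_date), (label, this_date) in zip(visits, visits[1:]):
        if this_date < prev_date:
            flagged.append(label)
    return flagged
```

A flagged visit may itself be correct (the earlier visit's date could be the transcription error), so the result is a prompt to check both dates against the source.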
Correcting Paper CRF
Entries…
• If corrections are necessary, make the change
as follows:
• Draw one horizontal line through the error;
• Insert the correct data;
• Initial and date the change;
• DO NOT ERASE, SCRIBBLE OUT, OR USE
CORRECTION FLUID OR ANY OTHER MEANS
WHICH COULD OBSCURE THE ORIGINAL
ENTRY
• These procedures ensure a complete “audit
trail” exists for all entries.
…Correcting Paper CRF Entries
[Figure: sample paper CRF page with corrected entries. The form
instructions legible in the figure read:]
1. Complete each form in black or blue pen to ensure good
photocopies.
2. All dates are to be expressed in day/month/year (dy/mth/yr)
format. To avoid ambiguity, months are to be recorded using a
three-letter abbreviation (i.e., Jan, Feb, Mar, etc.). Years are
to be recorded as four digits (i.e., 1998).
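The date convention on the sample form (day/month/year, three-letter month, four-digit year) can be enforced at data entry. A minimal Python sketch; the function name `parse_crf_date` and the exact `dd/MON/yyyy` pattern with slashes are assumptions made for illustration.

```python
import re
from datetime import date

MONTHS = {"JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
          "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12}

_DATE_RE = re.compile(r"^(\d{1,2})/([A-Za-z]{3})/(\d{4})$")


def parse_crf_date(text: str) -> date:
    """Parse a day/month/year date such as 01/JAN/2005; reject anything else."""
    match = _DATE_RE.match(text.strip())
    if not match:
        raise ValueError(f"not in dd/MON/yyyy format: {text!r}")
    day, month_abbr, year = match.groups()
    month = MONTHS.get(month_abbr.upper())
    if month is None:
        raise ValueError(f"unknown month abbreviation: {month_abbr!r}")
    # date() itself rejects impossible days such as 31/FEB/2005.
    return date(int(year), month, int(day))
```

An all-numeric entry such as 11-2-59 is rejected rather than guessed at, which is the point of the three-letter month rule.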
Electronic Data Collection
Process
• Web-based interface
• Sponsor or site dependent
• Ensures data integrity:
• Controls the ability to delete or alter
previously entered data
• Provides an audit trail for data changes
• Protects the database from being tampered
with
• Ensures data preservation (e.g. automatic
backups)
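One way to picture an audit trail for data changes is an append-only change log kept beside the current values. The sketch below is illustrative Python only and does not describe any particular product; the names `AuditEntry` and `AuditedRecord` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AuditEntry:
    field_name: str
    old_value: Optional[str]
    new_value: str
    changed_by: str
    changed_at: datetime


class AuditedRecord:
    """Holds current field values and an append-only history of every change."""

    def __init__(self) -> None:
        self._values: Dict[str, str] = {}
        self._trail: List[AuditEntry] = []

    def set(self, field_name: str, new_value: str, changed_by: str) -> None:
        old_value = self._values.get(field_name)
        # Record the change before applying it; earlier entries are never edited.
        self._trail.append(AuditEntry(field_name, old_value, new_value,
                                      changed_by, datetime.now(timezone.utc)))
        self._values[field_name] = new_value

    def get(self, field_name: str) -> Optional[str]:
        return self._values.get(field_name)

    def history(self, field_name: str) -> List[AuditEntry]:
        return [e for e in self._trail if e.field_name == field_name]
```

This mirrors the paper rule of striking through an error without obscuring it: the old value stays retrievable, along with who changed it and when.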
Process of Data Transfer to
Sponsor
Traditional (Paper)
Electronic
Traditional Data Transfer…
• CRF Books developed by sponsor and supplied
to the site for completion along with
completion/instruction manual
• Paper CRFs are either 2- or 3-part NCR (No
Carbon Required) paper
• Use a black or blue ballpoint pen for permanency –
and PRESS HARD
• At the time of a monitoring visit, CRFs are
reviewed for adherence to completion
guidelines and verified against source
documents by the Monitor
…Traditional Data Transfer …
• During the monitoring visit, site staff make
required corrections to CRFs
• Verified/corrected CRFs are submitted to
the sponsor, leaving a legible copy of the
CRF at the site
• e.g.: CRA may hand carry completed
CRFs to the sponsor;
• If data is not retrieved at the time of the
monitoring visit, sponsor may want the
CRFs submitted via mail or facsimile
…Traditional Data Transfer
• Sponsor enters the CRF data into a centralized
database (generally done by 2 separate
individuals, called double data entry) and
reviews the data for errors
• If inconsistencies are found, the sponsor
generates data queries (forms may vary slightly
from sponsor to sponsor) and sends to the site
• Site staff investigates these queries and
responds to them either directly on the data
query form or on the CRF. The data correction
is then re-submitted to the sponsor for entry into
their database.
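Double data entry can be pictured as a field-by-field comparison of two independent entries, with each mismatch becoming a candidate data query. A minimal Python sketch; the function name `compare_entries` and the field names in the usage are hypothetical.

```python
from typing import Dict, List, Tuple


def compare_entries(first: Dict[str, str],
                    second: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return (field, first_value, second_value) wherever two entries differ.

    A field missing from one entry is reported with an empty string for that side.
    """
    mismatches = []
    for field in sorted(set(first) | set(second)):
        a = first.get(field, "").strip()
        b = second.get(field, "").strip()
        if a != b:
            mismatches.append((field, a, b))
    return mismatches
```

For example, if one operator keys sodium as 135 and the other as 153, the field is returned with both values so the paper CRF can be rechecked.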
Data Transfer:
Electronic CRF (eCRF)
• Site records data from source documents to the
electronic database or the web interface
• Data periodically electronically transmitted to
Sponsor/CRO or automatically resides in Sponsor
database
• Real-time review of data performed by in house
CRAs
• Less frequent CRA visits
• Electronic queries generated and sent to site
• Database lock
Cancer Central Clinical Database
(C3D)…
• C3D is an integrated clinical trial
information system for the CCR
• System is secure, compliant with
regulatory requirements (21 CFR Part 11)
• System is friendly and flexible for user
…Cancer Central Clinical
Database (C3D)
• Designed to allow integration with the NCI
extramural divisions and the NIH Clinical Center
CRIS (Clinical Research Information System).
• Currently this is being done with labs drawn at the
Clinical Center.
• Oversight is done by the Control and
Configuration Management Group (CCMG)
whose membership has clinical and IT expertise
C3D Overview…
• Based on commercial software produced by the
Oracle Corporation called Oracle Clinical (OC)
• Allows for Remote Data Capture (RDC) so that
local and remote personnel enter and manage
clinical data over a LAN, intranet, telephone line,
or the Internet
• Data can be electronically transferred to
Sponsors (responsibility of DM IT team)
…C3D Overview
• A template set of master CRFs has been
created to collect the data required by
CCR protocols
• Templates are reused and each study will
only use the eCRFs that are appropriate
and required for that study
• Confidentiality statement signed at time of
training
J-Review
• J-Review is a software product that allows us to
get data out of C3D into a variety of reports
• Numerous template reports have been
developed including:
• Adverse event summary
• Demographics
• Drug administration
• Also allows for customized reports
C3D eCRFs Resources
• C3D Data Entry
• Manual for the Completion of the
NCI/CCR/C3D Case Report Forms
• Access to J-review is granted once
training occurs.
https://ccrod.cancer.gov/confluence/display/CCRClinicalIT3/Login
https://ccrod.cancer.gov/confluence/display/CCRClinicalIT3/Training+and+Education
https://octrials-rpt.nci.nih.gov/jreviewwww/sample_default.htm
C3D Protocol Build Process…
• OCD determines if a protocol will be
built in C3D
• Currently the following are built:
• All CTEP-sponsored, non-cooperative
group trials
• All industry-sponsored trials with company
agreement (if not, sponsor will then
provide paper CRFs)
• All internal/non-sponsored interventional
trials
… C3D Protocol Build Process…
…
• Clinical Analyst (CA)
receives protocol from
IRB
• CA identifies standard
eCRFs to be used
• CA develops the eCRF
book and identifies if
new eCRFs are needed
• CA meets with research
team to confirm eCRF
book
[Figure: C3D protocol build workflow diagram. Stages shown:
Protocol Receiving, Initiation Meeting, Requirement
Specification, Forms & Rules Building, Forms & Rules Testing,
Activation Meeting, Report Building, Change Request. Roles
shown: Clinical Analyst, Clinical Programmers, Research Team
(PI, RN, DM), Control & Configurations Management Group (CCMG).
Documents shown: Protocol, Reqs, Sign-off, Build Doc, QC Doc,
Rep Doc, CR Doc.]
… C3D Protocol Build Process…
…
• Clinical Programmers
(CP) build protocol
(eCRFs) in C3D
• Research team tests
the build/enters data
• Modifications made
as needed
• Protocol activated in
C3D by CA/CP
• eCRFs available for
data entry
[Figure: C3D protocol build workflow diagram. Stages shown:
Protocol Receiving, Initiation Meeting, Requirement
Specification, Forms & Rules Building, Forms & Rules Testing,
Modification, Activation, Report Building, Change Request.
Roles shown: Clinical Analyst, Clinical Programmers, Team,
Control & Configurations Management Group (CCMG). Documents
shown: Protocol, Reqs, Sign-off, Build Doc, QC Doc, Rep Doc,
CR Doc.]
… C3D Protocol Build Process…
…
• If a protocol
amendment requires
changes in C3D (e.g.
eligibility criteria),
CA/CP will develop
new eCRF
• Team will review,
sign-off
• CA/CP will activate
new eCRF Book
[Figure: C3D protocol amendment workflow diagram. Stages shown:
Protocol Amendment, Update Requirement Specification, New Forms
& Rules Building, Forms & Rules Testing, Activation of New
Forms, Activation Meeting, Report Building, Change Request.
Roles shown: Clinical Analyst, Clinical Programmers, Team,
Control & Configurations Management Group (CCMG). Documents
shown: Protocol, Reqs, Sign-off, Build Doc, QC Doc, Rep Doc,
CR Doc.]
Training
• There is specific training required for use
of C3D and J-Review.
• See Training Sessions for date, time and
location.
https://ccrod.cancer.gov/confluence/display/CCRClinicalIT3/Training+and+Education
Industry Sponsored Queries
• Sponsor generates questions/queries:
• During/end of a monitoring visit
• After data sent to sponsor and
reviewed/entered in sponsor’s database
• Site corrects CRF:
• During/between monitoring visit
• May need to also sign-off on query form itself
CTEP Sponsored CTMS
Clarification
• These are paper queries generated for
CTEP-sponsored, CTMS-monitored trials
• Sent every Monday by Theradex (contractor
for CTEP)
CTEP Sponsored CDS
Rejection/Notification
• These are electronic data queries for CTEP-
sponsored, CDS-monitored clinical trials
• CDS submitter receives notice
• For studies in C3D, the notification will be sent to the
CCR IT Programmer who transfers the data to CDU
• CCR staff corrects data in the database and
resubmits
• Process occurs until data is loaded correctly in
CDS
Missing Data at Time of Transfer
• Missing data elements
• Source Document (SD) not supporting CRF
• CRF not supporting SD
• Referred to as:
• Discrepancies
• Queries
• Clarifications
• Identified by:
• Sponsor
• Database
Sponsor Queries
• Sponsor generates:
• During/End of a monitoring visit
• After data sent to sponsor and
reviewed/entered in database
• Site corrects CRF:
• During/between monitoring visit
• May need to sign-off on query
Database Discrepancies
• Failure of entered data to pass a validation
check as applied by a database
• Univariate discrepancy – single data
element errors (e.g., not using provided
pick-list, missing data in a field)
• Multivariate discrepancy – multiple data
element errors (e.g., male patient with +
beta HCG)
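The two kinds of discrepancy above can be illustrated with simple validation checks. This is a minimal Python sketch, not the validation logic of any actual database; the function name, the field names (`sex`, `beta_hcg`), the value `"positive"`, and the pick-list values are all assumptions.

```python
from typing import Dict, List, Optional

SEX_PICK_LIST = {"M", "F"}  # assumed pick-list values


def find_discrepancies(record: Dict[str, Optional[str]]) -> List[str]:
    """Apply simple univariate and multivariate validation checks to one record."""
    problems = []

    # Univariate: each check looks at a single data element.
    sex = record.get("sex")
    if not sex:
        problems.append("univariate: sex is missing")
    elif sex not in SEX_PICK_LIST:
        problems.append(f"univariate: sex {sex!r} is not on the pick-list")

    # Multivariate: the check compares two or more data elements.
    if sex == "M" and record.get("beta_hcg") == "positive":
        problems.append("multivariate: male patient with positive beta HCG")

    return problems
```

A multivariate failure means the combination is implausible, not that a specific field is wrong; either element may need correction from the source document.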
Quality Control
According to GCP Section 5.1.3 quality
control should be applied to each stage of
data handling to ensure that all data are
reliable and have been processed correctly.
Assessing the QC/QA Process
• Are staff checking their own work?
• Are staff relying on others to check their work?
• Does the organization have a QA plan for
monitoring protocol adherence and data
collection?
• Are there SOPs related to data management?
• How soon after a visit is a CRF completed?
• Is all data, as defined in the protocol, captured
from the source document to the CRF?
Terminology
• Quality Control
• Quality Assurance
• Quality Improvement
Quality Control (QC)
• Ongoing and concurrent review of subject data
• Typically 100%
• Checking your own work and work of others
• Verify that data collected and abstracted:
• Correctly entered onto CRF
• Able to be found in source document
• Follows regulations and guidelines
• Individual team member level
Quality Assurance (QA)
• Planned, systematic check done at the branch or
organizational level
• Verifies:
• Trial is performed as per the approved plan
• Data generated is accurate
• Identifies problems and trends:
• Retrospective and involves sampling of subjects and
data
• Pulls all the pieces together to gain a picture
(measurement) of compliance
• Ensures staff is compliant with internal and external
regulations/guidelines
QA Activities
• Internal monitoring/audits
• Compile all data components and gain a
measurement of compliance
• Clarification monitoring
• Assess for trends
• Review clarifications responses before they are
submitted to sponsor
• Measure data inconsistencies and trends using
a sampling of the data prior to audits/monitoring
visits
• Summarize QA findings and report to
management
• Identify learning needs
QA Activities for CCR
• The following are examples of QA activities
for the CCR:
• Office of the Clinical Director (OCD)
• Internal monitoring/audits
• Conduct audits upon request for PI-sponsored
studies
• Clarification monitoring
• Data Management Contractor
• Develop QA tools
• Summarize QA findings and report to management,
education and training
• Identify needs
Quality Improvement (QI)
• Result of QC and QA
• Developing a plan includes:
• Identifying root causes of problems
• Intervening to reduce or eliminate these problems
• Taking steps to correct the process(es)
• Identifying trends and areas for improvement
• Identifying solutions:
• Assess work flow and time management activities
• Develop tools for source documentation
• Assess training needs
• Involve appropriate staff in resolution
• Implementing new/updated solution
QI Activities for CCR
• Team Level:
• Based on QC activities: identifying trends
• Based on audit/monitoring visit results
• OCD CCR Level:
• Based on audit/monitoring visit results
• Guide in implementing processes for making
corrective changes
Responsibilities
• Research Team responsibilities
• Research Nurse responsibilities
• Data Manager responsibilities
Research Team
• Ensure that all source data is documented in the
Medical Record/Research Chart with accuracy,
completeness, and consistency
• Ensure the overall quality of the research data is
verifiable and acceptable for sponsor
submissions, publications, etc.
• Review data discrepancy/clarification resolutions
for accuracy, consistency and timely response
Research Nurse….
• Provide accurate and complete source
documentation
• Develop, implement, and maintain a team QC
plan:
• Establish a schedule of QC activities
• Quality check source documentation, data
abstraction, CRF completion
• Quality check of database
• Verify function in database
• Develop team quality improvement plan, as
needed
….Research Nurse
• Lead Team QC meeting:
• Provide administrative updates
• Provide patient updates
• Perform QC on data/resolve issues
• Review query/clarification:
• Assign to Data Manager(s), if appropriate, to
investigate and resolve or resolve yourself
• Review and sign off:
• Follow sponsor SOP
Data Manager….
• Abstract data onto CRFs according to what is
found in the source documents (Medical Record
or Research Chart) and CRF Instruction Manual
• Abstract data in a timely fashion; this includes
entry into the database
• Code Adverse Events accurately utilizing the
appropriate version of CTCAE, as per protocol
….Data Manager
• Apply quality control checks at each stage
of data handling
• Ensure that data elements abstracted are
complete and accurate
• Contact Research Nurse for missing source data
• Resolve discrepant data – ongoing
• Utilize database report tools to assist with QC
activities
Guiding Principles
• Source documents need to be accurate and
complete
• Data abstraction should occur in real time
• QC/QI is the responsibility of every research
team member
• QC/QI should be completed on all protocol data
for all protocols
• QC/QI should be proactive and ongoing
• Each team member should know and
understand the roles and responsibility of each
team member
Resources
• Guidelines for Good Clinical Practice.
International Conference on Harmonisation
(ICH).
• http://www.ich.org
• FDA, Title 21 CFR Part 11
• http://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfcfr/CFRSearch.cfm?CFRPart=11
Evaluation
Please complete the evaluation form and
fax to Elizabeth Ness at 301-496-9020.
For questions, please
contact Elizabeth Ness
301-451-2179
[email protected]
https://ccrod.cancer.gov/confluence/download/attachments/71041052/CDM_Evaluation.pdf
Design Considerations for HIM Related Databases
1. Database design considerations for HIM Professionals are
complex and vary widely when considering such factors as
database purpose, setting, objectives, targeted audience, and
output requirements. Using your past experience in the HIM
industry as a guide, please select two topics (a primary and a
secondary) that are of professional interest to you that will
serve as the basis for developing a database during this
semester. Your selection(s) must contain data elements that
represent the entire continuum of the patient experience:
administrative, clinical, and financial data elements. The
instructor will distribute a Database Design Template/Model
that will assist you with this process. You can modify the
Template/Model to fit your topic.
2. When considering design issues for HIM related databases,
we must remain mindful of the interrelationships between data
elements. In order for an HIM-related database to be useable,
linkages must exist between the tables contained within each
database. Within your database design, which fields do you
plan to use as the key data elements which will enable the
various tables in your database to interact with each other?
3. When considering issues of database design, we are
ultimately concerned with the clinical and regulatory needs of
the intended audience. Please specifically identify the intended
audience for the database that you intend to design in partial
fulfillment of the requirements of this course.
The Sample Diabetes Database (in red) on the subsequent page
is provided as a guide for this exercise, although your own
database design may follow any standard database design
convention.
Sample Database Design: Diabetes
Master Patient Index
Field Name         Data Type   Field Size
PtIdNo             AutoNumber  Integer
PtLast             Text        30
PtFirst            Text        30
MRNO               Number      Double
Gender             Text        1
Race               Text        10
DOB                Date/Time   mm/dd/yy;@
Encounter
Field Name         Data Type   Field Size
EncounterId        AutoNumber  Integer
PtIdNo             Number      Integer
MRNO               Number      Double
DOS                Date/Time   mm/dd/yy;@
Height             Number      Double
Weight             Number      Double
Physician          Text        35
Date Onset         Date/Time   mm/dd/yy;@
Insulin Dependent  Text        1
A1CScore           Number      Double
A1CRating          Text        10,@
DietCompliance     Text        10,@
Neuropathy         Text        10,@
Retinopathy        Text        10,@
BMI                Number      Double
BMIRating          Text        10,@
Insurance
Field Name         Data Type   Field Size
InsuranceId        AutoNumber  Integer
PtIdNo             Number      Integer
InsPlanName        Text        30
InsPlanNo          Text        30
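The linkage between tables in the sample design can be tried out in code. The sketch below uses Python's built-in sqlite3 module as a stand-in for Access, so the data types are only approximate mappings (for instance `INTEGER PRIMARY KEY AUTOINCREMENT` in place of AutoNumber). The Encounter table is cut down to a few columns, and the patient row is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MasterPatientIndex (
    PtIdNo  INTEGER PRIMARY KEY AUTOINCREMENT,  -- plays the role of AutoNumber
    PtLast  TEXT,
    PtFirst TEXT,
    MRNO    INTEGER,
    Gender  TEXT,
    Race    TEXT,
    DOB     TEXT                                -- stored as an ISO date string
);
CREATE TABLE Encounter (
    EncounterId INTEGER PRIMARY KEY AUTOINCREMENT,
    PtIdNo      INTEGER NOT NULL REFERENCES MasterPatientIndex(PtIdNo),
    DOS         TEXT,
    A1CScore    REAL
);
CREATE TABLE Insurance (
    InsuranceId INTEGER PRIMARY KEY AUTOINCREMENT,
    PtIdNo      INTEGER NOT NULL REFERENCES MasterPatientIndex(PtIdNo),
    InsPlanName TEXT,
    InsPlanNo   TEXT
);
""")

cur = conn.execute(
    "INSERT INTO MasterPatientIndex (PtLast, PtFirst, MRNO, Gender) VALUES (?, ?, ?, ?)",
    ("Doe", "Jane", 1001, "F"),  # made-up patient
)
pt_id = cur.lastrowid
conn.execute("INSERT INTO Encounter (PtIdNo, DOS, A1CScore) VALUES (?, ?, ?)",
             (pt_id, "2005-01-03", 7.2))
conn.execute("INSERT INTO Insurance (PtIdNo, InsPlanName, InsPlanNo) VALUES (?, ?, ?)",
             (pt_id, "Example Plan", "EP-1"))

# PtIdNo is the key data element that links the three tables.
rows = conn.execute("""
    SELECT m.PtLast, e.DOS, e.A1CScore, i.InsPlanName
    FROM MasterPatientIndex AS m
    JOIN Encounter AS e ON e.PtIdNo = m.PtIdNo
    JOIN Insurance AS i ON i.PtIdNo = m.PtIdNo
""").fetchall()
```

Because every Encounter and Insurance row carries the patient's PtIdNo, a single join brings administrative, clinical, and financial data elements together for one patient.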
Sample Database Design 1: _________________________
Encounter
Field Name         Data Type   Field Size
EncounterId        AutoNumber  Integer
PtIdNo             Number      Integer
Master Patient Index
Field Name         Data Type   Field Size
PtIdNo             AutoNumber  Integer
PtLast             Text        30
PtFirst            Text        30
MRNO               Number      Double
Gender             Text        1
Race               Text        10
DOB                Date/Time   mm/dd/yy;@
Insurance
Field Name         Data Type   Field Size
InsuranceId        AutoNumber  Integer
PtIdNo             Number      Integer
InsPlanName        Text        30
InsPlanNo          Text        30
Sample Database Design 2: _________________________
Encounter
Field Name         Data Type   Field Size
EncounterId        AutoNumber  Integer
PtIdNo             Number      Integer
Master Patient Index
Field Name         Data Type   Field Size
PtIdNo             AutoNumber  Integer
PtLast             Text        30
PtFirst            Text        30
MRNO               Number      Double
Gender             Text        1
Race               Text        10
DOB                Date/Time   mm/dd/yy;@
Insurance
Field Name         Data Type   Field Size
InsuranceId        AutoNumber  Integer
PtIdNo             Number      Integer
InsPlanName        Text        30
InsPlanNo          Text        30
Department of Health Informatics
Health Information Management Program
BINF 5520 Health Analytics
Creating A Diabetes Tracking Relational Database
Using Microsoft Access
Fundamentals of Creating A
Clinical Tracking Database
Working With Database “Objects”
Tables
Forms
Queries
Reports
Creating a Database to Track Patients With Diabetes
Review of Database Fundamentals
Questions and Answers
How This Presentation Is Organized
Step Number Will Always Be At Top
Command Orientation in Red on Left Side
Screen Shot In Middle
Arrows will focus your attention.
The Four Objects of Microsoft Access
TABLES: The “Containers” That Hold The Data. We must
DESIGN these tables before we can do anything, because they
hold the data!
FORMS: The Forms allow us to display information to users
easily.
QUERIES: The Queries allow us to select data based on specific
criteria.
REPORTS: The Reports allow us to output data, either via
printer or via a file, such as files that are in a PDF or XLS
format.
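The roles of a query (select data on specific criteria) and a report (output data to a file) can be imitated outside Access. The sketch below uses Python's sqlite3 and csv modules as a stand-in; the A1C values and the 8.0 threshold are made up for illustration and are not clinical guidance.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Encounters (EncounterID INTEGER PRIMARY KEY, PtId INTEGER, A1C REAL)"
)
conn.executemany(
    "INSERT INTO Encounters (PtId, A1C) VALUES (?, ?)",
    [(1, 6.1), (2, 8.4), (3, 9.0)],  # made-up values for illustration
)

# Query: select rows that meet a specific criterion.
threshold = 8.0  # hypothetical cut-off, not taken from the slides
selected = conn.execute(
    "SELECT PtId, A1C FROM Encounters WHERE A1C >= ? ORDER BY PtId", (threshold,)
).fetchall()

# Report: output the selected data, here as CSV text that could be saved to a file.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["PtId", "A1C"])
writer.writerows(selected)
report_text = buffer.getvalue()
```

The table holds the data, the query narrows it, and the report formats the result for output; forms, the fourth object, have no counterpart in this sketch.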
The Four Objects of Microsoft Access
[Figure: diagram of a DATABASE and its four objects: TABLES,
QUERIES, REPORTS, FORMS]
The Five Steps of Creating A Relational Database
1. Create the Tables
2. Define The Database Relationship(s)
3. Create The MPI and Encounter Forms
4. Combine the MPI and Encounter Forms Into One Form
5. Start Using The Database!
1. Create the Tables
Master Patient Index (MPI)
Field Name Field Type Field Length
PtId AutoNumber Numeric
PtLast ShortText 30
PtFirst ShortText 30
PtDOB Date MM/DD/YYYY
MRNumber ShortText 12
PtSex ShortText 1
PtRace ShortText 1
And other fields….
Encounters
Field Name Field Type Field Length
EncounterID AutoNumber Numeric
PtId Number Numeric
DateOfService Date MMDDYYYY
Provider ShortText 30
A1C Numeric Decimal,0
BP-Systolic Numeric Decimal,0
BP-Diastolic Numeric Decimal,0
Cholesterol Numeric Decimal,0
Retinopathy Yes/No Yes/No
Neuropathy Yes/No Yes/No
And other fields….
2. Define The Database Relationship(s)
Master Patient Index (MPI)
Field Name Field Type Field Length
PtId AutoNumber Numeric
PtLast ShortText 30
PtFirst ShortText 30
PtDOB Date MM/DD/YYYY
MRNumber ShortText 12
PtSex ShortText 1
PtRace ShortText 1
And other fields….
Encounters
Field Name Field Type Field Length
EncounterID AutoNumber Numeric
PtId Number Numeric
DateOfService Date MMDDYYYY
Provider ShortText 30
A1C Numeric Decimal,0
BP-Systolic Numeric Decimal,0
BP-Diastolic Numeric Decimal,0
Cholesterol Numeric Decimal,0
Retinopathy Yes/No Yes/No
Neuropathy Yes/No Yes/No
And other fields….
3. Create The MPI and Encounter Forms
4. Graft the MPI and Encounter Forms Together
5. Start Using The Database!
Step 1
Step 2
Create / Table Design
Step 3
Create / Table Design
Step 4
Create / Table Design
Step 5
Create / Table Design
Step 6
Create / Table Design
Step 7
Create / Table Design
Step 8
Home
Step 9
Create / Table Design
Step 10
Create / Table Design
Step 11
Create / Table Design
Step 12
Create / Table Design
Step 13
Create / Table Design
Step 14
Create / Table Design
Step 15
Home
Step 16
Database Tools / Relationships
Step 17
Database Tools / Relationships
Step 18
Database Tools / Relationships
Step 19
Database Tools / Relationships
Step 20
Database Tools / Relationships
Step 21
Database Tools / Relationships
Step 22
Database Tools / Relationships
Step 23
Database Tools / Relationships
Step 24
Create / Table Design
Step 25
Create / Table Design
Step 26
Create / Table Design
Step 27
Database Tools / Relationships
Step 28
Home
Step 29
Home
Step 30
Home / Right Click on Encounter / Left Click on Design View
Step 31
Move to Provider Field and go to Tab at Bottom called Lookup
Step 32
In Tab at Bottom called Lookup, Select Combo Box
Step 33
In Row Source option, select lkpProvider Table developed
earlier.
Step 34
We now save the table by selecting Yes.
Step 35
We will now see the two tables and the relationship between the
tables.
Step 36
Design / Relationships / Save / Yes
Step 37
We now see all three tables: MPI, Encounter, and lkpProvider
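The idea behind the lkpProvider lookup table (Provider values come from a maintained list rather than free text) can be approximated outside Access. SQLite has no combo boxes, so this Python sketch uses a foreign key to get a comparable effect; that substitution, the helper name `add_encounter`, and the provider names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE lkpProvider (
    Provider TEXT PRIMARY KEY
);
CREATE TABLE Encounter (
    EncounterID INTEGER PRIMARY KEY,
    PtId        INTEGER,
    Provider    TEXT REFERENCES lkpProvider(Provider)
);
""")
conn.executemany("INSERT INTO lkpProvider (Provider) VALUES (?)",
                 [("Dr. Adams",), ("Dr. Baker",)])  # made-up names


def add_encounter(pt_id: int, provider: str) -> bool:
    """Insert an encounter; return False if the provider is not in the lookup table."""
    try:
        conn.execute("INSERT INTO Encounter (PtId, Provider) VALUES (?, ?)",
                     (pt_id, provider))
        return True
    except sqlite3.IntegrityError:
        return False
```

Either way, the benefit is the same as a pick-list on a CRF: consistent spelling of provider names and fewer free-text discrepancies to resolve later.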
Step 38
Create / Form Wizard
Step 39
Create / Form Wizard
Step 40
Create / Form Wizard
Step 41
Create / Form Wizard
Step 42
Create / Form Wizard
Step 43
Home / Right Click on Form MPI, Left Click on Design View
Step 44
Highlight the four fields at the bottom left side of the screen
and move to upper right.
Step 45
Highlight the four fields at the bottom left side of the screen
and move to upper right.
Step 46
Highlight the four fields at the bottom left side of the screen
and move to upper right.
Step 47
Close the Form MPI and Left Click Yes to save the changes to
the design of the form.
Step 48
Highlight the four fields at the bottom left side of the screen
and move to upper right.
Step 49
Create, Form Wizard, Left Click on Form Encounter, Right
Click on Design View
Step 50
Click the double right arrows (>>) to move from Available to
Selected and click Next.
Step 51
Click Next to display all fields for this form.
Step 52
Indicate that the form should be organized in a Tabular layout
and click Next.
Step 53
Name the form Encounter and click Finish.
Step 54
The form will organize horizontally. You may need to adjust
the width of fields to enhance the readability of the form.
Step 55
Close the form and click Yes to save the changes to the design
of the form Encounter.
Step 56
On the left side of the screen, right click on the Form MPI and
left click on Design View.
Step 57
You will see the large area under the MPI fields. This is where
we will move the Encounter form so that we can simultaneously
see the Patient and all associated encounters.
Step 58
We then left click on the Form Encounter and we position it
under the PtFirst field in the MPI form.
Step 59
We then close Form MPI and we click Yes to save all changes
to the design of this form.
Step 60
We can now double click on the MPI form and we will see how
the two forms have been joined together.
Step 61
The screen below shows you the results of a database that has
been populated. Note that the PtId in the MPI is the same as the
PtId in the Encounter.
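What the combined form shows (one patient together with all encounters that share the same PtId) amounts to grouping Encounter rows by PtId. A minimal Python sketch with made-up rows; the function name `encounters_by_patient` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

patients = [{"PtId": 1, "PtLast": "Doe"}, {"PtId": 2, "PtLast": "Roe"}]
encounters = [
    {"EncounterID": 10, "PtId": 1, "A1C": 7.2},
    {"EncounterID": 11, "PtId": 1, "A1C": 6.9},
    {"EncounterID": 12, "PtId": 2, "A1C": 8.1},
]


def encounters_by_patient(encounter_rows: List[dict]) -> Dict[int, List[dict]]:
    """Group encounter rows under the PtId they share with the MPI record."""
    grouped = defaultdict(list)
    for row in encounter_rows:
        grouped[row["PtId"]].append(row)
    return dict(grouped)


by_patient = encounters_by_patient(encounters)
```

Looking up `by_patient[1]` returns both encounters for the first patient, which is the same view the MPI form with its embedded Encounter form presents on screen.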
case scenario being used for this discussion postABS 300 Week One.docx
 
Case Study #2Alleged improper admission orders resulting in mor.docx
Case Study #2Alleged improper admission orders resulting in mor.docxCase Study #2Alleged improper admission orders resulting in mor.docx
Case Study #2Alleged improper admission orders resulting in mor.docx
 
Case Study 1Denise is a sixteen-year old 11th grade student wh.docx
Case Study 1Denise is a sixteen-year old 11th grade student wh.docxCase Study 1Denise is a sixteen-year old 11th grade student wh.docx
Case Study 1Denise is a sixteen-year old 11th grade student wh.docx
 
Case AssignmentI. First read the following definitions of biodiver.docx
Case AssignmentI. First read the following definitions of biodiver.docxCase AssignmentI. First read the following definitions of biodiver.docx
Case AssignmentI. First read the following definitions of biodiver.docx
 
Case and questions are In the attchmentExtra resources given.H.docx
Case and questions are In the attchmentExtra resources given.H.docxCase and questions are In the attchmentExtra resources given.H.docx
Case and questions are In the attchmentExtra resources given.H.docx
 
Case C Hot GiftsRose Stone moved into an urban ghetto in order .docx
Case C Hot GiftsRose Stone moved into an urban ghetto in order .docxCase C Hot GiftsRose Stone moved into an urban ghetto in order .docx
Case C Hot GiftsRose Stone moved into an urban ghetto in order .docx
 
Case Assignment must be 850 words and use current APA format with a .docx
Case Assignment must be 850 words and use current APA format with a .docxCase Assignment must be 850 words and use current APA format with a .docx
Case Assignment must be 850 words and use current APA format with a .docx
 
Case 7. Handling DisparateInformation for Evaluating TraineesRas.docx
Case 7. Handling DisparateInformation for Evaluating TraineesRas.docxCase 7. Handling DisparateInformation for Evaluating TraineesRas.docx
Case 7. Handling DisparateInformation for Evaluating TraineesRas.docx
 
CASE 1Pre-Internet Development and Web 1.0Assignment OverviewA.docx
CASE 1Pre-Internet Development and Web 1.0Assignment OverviewA.docxCASE 1Pre-Internet Development and Web 1.0Assignment OverviewA.docx
CASE 1Pre-Internet Development and Web 1.0Assignment OverviewA.docx
 
Case 1Student JoshuaAge 9Grade 4thScenarioJoshua att.docx
Case 1Student JoshuaAge 9Grade 4thScenarioJoshua att.docxCase 1Student JoshuaAge 9Grade 4thScenarioJoshua att.docx
Case 1Student JoshuaAge 9Grade 4thScenarioJoshua att.docx
 
Case 2Banyan Tree Discussion Questions1.What are the main fa.docx
Case 2Banyan Tree Discussion Questions1.What are the main fa.docxCase 2Banyan Tree Discussion Questions1.What are the main fa.docx
Case 2Banyan Tree Discussion Questions1.What are the main fa.docx
 
Career Development, Technology and Management Development Please r.docx
Career Development, Technology and Management Development Please r.docxCareer Development, Technology and Management Development Please r.docx
Career Development, Technology and Management Development Please r.docx
 
Case 2 Presented tire impression as evidence of special inter.docx
Case 2 Presented tire impression as evidence of special inter.docxCase 2 Presented tire impression as evidence of special inter.docx
Case 2 Presented tire impression as evidence of special inter.docx
 
Carolina is on vacation with her friends. Fill in the blanks with th.docx
Carolina is on vacation with her friends. Fill in the blanks with th.docxCarolina is on vacation with her friends. Fill in the blanks with th.docx
Carolina is on vacation with her friends. Fill in the blanks with th.docx
 
Carla wants to know how many batches of birdseed she can make with 1.docx
Carla wants to know how many batches of birdseed she can make with 1.docxCarla wants to know how many batches of birdseed she can make with 1.docx
Carla wants to know how many batches of birdseed she can make with 1.docx
 

How I do it A Practical Database Management System to Assist

  • 1. How I do it: A Practical Database Management System to Assist Clinical Research Teams with Data Collecting, Organization, and Reporting Howard Lee, B.S.1, Julius Chapiro, M.D.1, Rüdiger Schernthaner, M.D.1, Rafael Duran, M.D.1, Zhijun Wang, M.D., Ph.D.1, Boris Gorodetski, B.S.1, Jean-François Geschwind, M.D.1, and MingDe Lin, Ph.D.2 Howard Lee: [email protected]; Julius Chapiro: [email protected]; Rüdiger Schernthaner: [email protected]; Rafael Duran: [email protected]; Zhijun Wang: [email protected]; Boris Gorodetski: [email protected]; Jean-François Geschwind: [email protected]; MingDe Lin: [email protected] 1Russell H. Morgan Department of Radiology and Radiological Science, Division of Vascular and Interventional Radiology, The Johns Hopkins Hospital, Sheikh Zayed Tower, Ste 7203, 1800 Orleans St, Baltimore, MD, USA 21287 2U/S Imaging and Interventions (UII), Philips Research North America, 345 Scarborough Road, Briarcliff Manor, New York 10510 Introduction With the growing amount of clinical research studies in the field of interventional oncology,
  • 2. selective patient data is becoming more difficult to store and organize effectively. Existing hospital EMR (electronic medical record) systems store patient data in the form of reports and data tables. Our institution’s EMR system placed our researchers in a position where time consuming methods are needed to search for suitable patients for clinical studies. Researchers had to manually read through the reports and data tables to filter patients and gather data. For most studies, spreadsheet programs such as Microsoft Excel® (Microsoft, Washington, USA) are often used as a data repository similar to a database to record and organize patient data for research. Once the spreadsheet is populated, it is manually filtered by set study parameters and then pushed to statistical analysis software for further analysis. For statistical analysis, columns containing text are translated into binary values (1 or 0) to be in a format acceptable by statistical analysis software. For example, each tumor entity is assigned a new column. Patient histological reports are read manually to assign a 1 or 0 to
  • 3. each tumor entity column, 1 for positive, 0 for negative. Under a tumor entity column, researchers would write a 1 for all patients with the tumor and a 0 for all patients without the tumor. © 2014 AUR. All rights reserved. Correspondence to: Jean-François Geschwind, [email protected] Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. HHS Public Access Author manuscript Acad Radiol. Author manuscript; available in PMC 2016 April 01. Published in final edited form as: Acad Radiol. 2015 April; 22(4): 527–533. doi:10.1016/j.acra.2014.12.002.
  • 5. This method of data storage has limitations in the organization and the quality of the data. Data input and analysis without a database run a higher risk of incorrect data entry, patient exclusion, and duplicate records. Furthermore, data selection and calculation are time consuming. An alternative could be the clinical research database that Meineke et al. proposed (1). However, it is not specific enough for interventional oncology research and would need additional optimization, for example, the capability to
  • 6. automatically calculate various variables such as tumor staging systems and to record information about multiple treatment sessions. The purpose of this study was to provide an improved workflow efficient tool through the use of a clinical research database management system (DBMS) optimized for interventional oncology clinical research. Materials and Methods This was a single-institution prospective study. The study was compliant with the Health Insurance Portability and Accountability Act (HIPAA) and was waived by the Institutional Review Board. Database and Query Interface Design The presented database management system has two distinct parts, the database server and client interface, illustrated in Figure 1. The database is run by software (MySQL, Oracle Corporation, California, USA and phpMyAdmin, The phpMyAdmin Project, California, USA) on a central computer server within the department (2, 3). Authorized users were
  • 7. granted access to this password protected and encrypted secured server (HIPAA compliant). Multiple users concurrently add, edit, and query data remotely through a customized graphical user interface (GUI) utilizing Microsoft Access® (Microsoft, Washington, USA). Any data changes are immediately logged for others to see. The database performed automatic calculations using queries, user-defined search criteria. Queries were saved, rerun, and exported to spreadsheets. Queries aid in data analysis and increase study productivity (4). They are powerful tools for filtering and sorting datasets. Figure 2 illustrates the query interface and an example of request from the database. Graphical User Interface Design and Utility In our research environment, the database GUI was created to facilitate patient data input. This was done by using custom user-friendly interface forms that contain textboxes and labels including demographic data, treatment information (e.g. conventional transarterial chemoembolization (TACE)), tumors types, dates and types of
  • 8. radiological exams, etc. The GUI is used to view patient data and allows users to add/edit data (Figure 3). The database interface is not limited to one form. It can have multiple forms, shown as tabs, to assist grouping various medical data. Figure 4 shows an example of multiple tabs for groups of related data.
  • 10. Automatic Calculations Automatic calculations may be run between values, such as dates. For example, the database may calculate the time between baseline imaging, follow-up imaging, treatment dates, pre- and post-treatment dates, date of diagnosis, and patient’s date of death in relation to a particular treatment or event (e.g. randomization), essential for survival studies. Using these queries, the database can also calculate the median overall survival automatically. The database also automatically calculates clinical scores such as Child-Pugh score and Barcelona Clinic Liver Cancer (BCLC) stage as shown in Figure 5 (5). For our purposes, the Child-Pugh score and BCLC stage were calculated using baseline data before a patient’s first embolization as is typically done for staging. The illustrated calculators can be revised as needed. Once patient blood data is available, queries are run to produce a list of all patients
  • 11. with Child-Pugh scores. Researchers can then quickly retrieve them. Statistical Output Another powerful feature of the database is its ability to provide a first tier of statistical information. Using this GUI, the user defines the search criteria and runs queries to obtain immediate statistical information about a particular set of parameters. With this feature, the database can quickly output an accurate summary of patient data such as, for example, how many patients have colorectal carcinoma and undergo conventional TACE. Questionnaire Assessment A questionnaire (15 questions) was designed and distributed to 21 board-certified interventional radiologists who conduct clinical research at our academic hospital that include Phase I, II, and III clinical trials, and retrospective studies. The questionnaire determined how data is controlled in retrospective studies and the likelihood to use the database. The questionnaire is shown in Table 1. The purpose of the questionnaire was to 1)
  • 12. illustrate the general scope of where researchers were having problems within Excel and data organization, such as wasted effort working with duplicate patients and unintentional failure to include available patients, and 2) to gauge how receptive they would be to a database system. Using this information, the database system was constructed. There were weekly progress updates with the clinical research team to ensure that the original goals set out to address the deficiencies of Excel were being resolved. Results Questionnaire Results All 21 interventional radiologists completed the questionnaire. Self evaluation results are shown in Figure 6. In data collection and analysis, over 50% (11/21) spent most of the time searching, filtering, and/or categorizing data. However, about 50% (10/21) spent little to no time calculating the data. 67% of respondents (14/21) realized at some point that there were erroneously included patients who should have been excluded and there were patients who
  • 13. were erroneously not included. Over 85% (18/21) were very receptive to using software that produces group summaries such as totals of each tumor type with minimal effort, calculates
  • 14. clinical staging and score systems automatically, and also allows remote access for multiple
  • 15. users to add/edit data in a central server with data modification logs. Query Interface Output In Figure 7, the query of male patients, over 40 years old, with HCC is run. Figure 8 shows a query result of patients with TACE and Child-Pugh score A calculated by the database. Figure 9 illustrates an interval of time between two events as a query that can be calculated automatically (e.g. time elapsed between two embolization procedures). The output of the queries as described above is shown in a structured and concise list, which can be exported for further research study specific analysis. Discussion The main finding of this study is that there is a need for a much more time efficient and accurate way to store, retrieve, and analyze patient data for clinical research studies. The database management system presented here fulfills these needs. This was achieved through the use of automatic calculations, interface forms, queries, etc. With a personalized
  • 16. interface, data access, entry, organization, queries, calculations, and export processes are seamlessly performed to assist clinical research with data and statistical analysis. Furthermore, the database is a unified repository of clinical research information and a shared resource among the clinical research team. This allows for a multi-user level experience where there can be simultaneous access to the data and where the efforts of each individual in adding/appending new information can be used by the entire team. With the presented database put into use, the effort for clinical studies can truly focus on conducting various statistical analysis and data interpretation rather than preparing data for analysis (6). All retrospective data can be merged into this database, enabling a centrally maintained and shared resource. Our clinical research team now has access to a customized database of patients with a large number of clinical parameters, allowing a vast combination of queries to form or support study hypotheses. The user defined GUI-connected interface is
  • 17. invaluable for anyone collecting data as it facilitates data entry and minimizes data entry errors. In previous data collection and analysis, converting spreadsheet data to binary/numeric format was time consuming and impractical. The database presented in this study relieves the inconvenience of manually searching, organizing, and calculating data. Processing calculations, especially more complex calculations such as clinical staging scores, can now be done automatically. Prior to implementing the presented database system, a typical Excel spreadsheet for the clinical studies at our institution would have over 100 columns. These columns included patient demographics, repeat treatment dates and types (new columns per TACE session), and repeated pre-/post-imaging dates and types (new columns per multi-modular scan). Tracking medical data is frequently difficult due to the large number of columns in the spreadsheet. Compared to a typical Excel spreadsheet with many columns,
  • 18. browsing and adding prospective data through the database interface presented here is more organized and practical with ten defined tabs for data groups, ranging from a patient’s basic information to treatments to survival status. In addition, the database interface lists all
  • 20. repeating treatments and imaging per patient as rows instead of columns, facilitating comparisons between multiple treatments of a patient. Combining the database’s ability to calculate statistical analysis with automatic calculation queries, reports can be generated with virtually any parameter. This is not only helpful in radiology, but also beneficial for other studies and hospital information systems. The database management system in this study has some limitations. A database system may not be suitable for all kinds of research teams. There are several factors that may illustrate the need of a database. In a previous report on data collection, applicable examples and guidelines were addressed to determine whether or not implementing a database is feasible in the current environment (7). Depending on the environment and context, a database may not be implemented right away as it needs additional testing. Furthermore, the database will need a dedicated server to host the database along with the data. In order to use the database interface, training is required. Someone who specializes in
  • 21. databases, such as a database administrator, needs to teach researchers and other potential users how to use the database interface and query interface for filtering patients and obtaining statistics. This is especially needed in more advanced queries and in developing additional GUIs. It should be noted that Microsoft Access is being used in this work as a “front-end” interface that communicates with the SQL database to query (filter) data, and for input/appending to existing data. Other software such as FileMaker Pro (FileMaker, Santa Clara, CA) and REDCap would serve a similar function (8). The need for the SQL database is so that multiple users can access the stored data at the same time, increased level of security, stability, and performance, and serving as a unified repository of clinical research information that can be shared by the research team (9, 10). Also, the database administrator has to not only construct a database on a server with input from clinicians and other end users, but in addition would need to maintain the database (11, 12). Typical maintenance includes
  • 22. routine backups, altering database structure and interface for new data types, and updating database and client software. A server can be hosted on a PC or online, both of which all parties involved can access in the same network locally or remotely. Furthermore, databases can be enabled to communicate with other databases. While the initial setup and learning curve is high, the database allows for fluid data entry in an organized fashion, querying results including calculations, and storing data while supporting simultaneous user access. With the variety of research teams and departments, ideally each suitable team should have their own database. This is not necessarily only for interventional oncology but also for any specific area of research, for example, studies with patients undergoing ablation, percutaneous abscess drainage (PAD), etc. These databases can be connected for interdisciplinary research to provide a broader scope of data and facilitate data search (13). Conclusion
  • 23. The current database implementation and interface allows a much faster and more detailed retrospective analysis of patient cohorts. In addition, it facilitates data management and a standardized information output for ongoing prospective clinical trials. The database management system with an interface is a work efficient and robust tool that provides a significant edge over manual retrieval of patient records by filtering data and assisting statistical analysis in a study-relevant fashion.
  • 25. Acknowledgments Funding and support has been provided by NIH/NCI R01 CA160771, P30 CA006973, and Philips Research North America, Briarcliff Manor, NY, USA. References 1. Meineke FA, Staubert S, Lobe M, Winter A. A comprehensive clinical research database based on CDISC ODM and i2b2. Studies in health technology and informatics. 2014; 205:1115–9. [PubMed: 25160362] 2. Stobart, S.; Vassileiou, M. MySQL Database and PHPMyAdmin Installation. PHP and MySQL Manual. Springer; London: 2004. p. 461-73. 3. Kuenz, D. Manage data for free with MySQL. Element K Journals; 2001. p. 7-10. 4. Coronel, CMS.; Rob, P. Database systems: design, implementation, and management. 9. Boston, Massachusetts: Cengage Learning; 2009. 5. Llovet JM, Di Bisceglie AM, Bruix J, et al. Design and endpoints of clinical trials in hepatocellular
  • 26. carcinoma. Journal of the National Cancer Institute. 2008; 100(10):698–711. [PubMed: 18477802] 6. Kanas G, Morimoto L, Mowat F, O’Malley C, Fryzek J, Nordyke R. Use of electronic medical records in oncology outcomes research. ClinicoEconomics and outcomes research : CEOR. 2010; 2:1–14. [PubMed: 21935310] 7. Schmier JK, Kane DW, Halpern MT. Practical applications of usability theory to electronic data collection for clinical trials. Contemporary clinical trials. 2005; 26(3):376–85. [PubMed: 15911471] 8. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture (REDCap)—A metadata-driven methodology and workflow process for providing translational research informatics support. Journal of Biomedical Informatics. 42(2):377–81. [PubMed: 18929686] 9. MySQL Database Provides Full Transactional Support. Worldwide Databases. 2002; 14(11) 0-N/A. 10. Oracle Improves Database Performance with Latest Development Milestone Release for MySQL 5.7; New Release of the World’s Most Popular Open Source Database is 2x Faster than MySQL 5.6 and Over 3x Faster than MySQL 5.5 in Benchmark Tests. 2014.
  • 27. 11. Xie SX, Baek Y, Grossman M, et al. Building an integrated neurodegenerative disease database at an academic health center. Alzheimer’s & dementia : the journal of the Alzheimer’s Association. 2011; 7(4):e84–93. 12. Parkes, D.; Lowman, M.; Andres, C., et al., editors. Pro Python System Administration: Apress. 2010. Automatic MySQL Database Performance Tuning; p. 329-48. 13. Piriyapongsa J, Bootchai C, Ngamphiw C, Tongsima S. microPIR: an integrated database of microRNA target sites within human promoter sequences. PloS one. 2012; 7(3):e33888. [PubMed: 22439011]
  • 29. Figure 1. The Dataflow Chart This chart shows a general layout of the database server and its clients. It illustrates how the database management system performs queries (orange circle) such as statistical analysis. Multiple computers are granted access to the database. The blue rectangles represent the database management system software. Researchers can utilize the database client graphical user interface (GUI) to import data without needing to format. Researchers also control data through the GUI. Queries are usually run through the GUI to provide wanted results. Once
  • 31. the results are obtained, researchers export the query to a spreadsheet, illustrated by the green rectangle.
  • 33. Figure 2. This figure illustrates the query interface. In this example query, a list of male patients over the age of 40 with hepatocellular carcinoma (HCC) is wanted. The user inputs search criteria for age, gender, and tumor type, “>40”, “m”, and “HCC” respectively. MRN: medical record number.
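The example query in Figure 2 can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for the MySQL server described in the paper, and the table name `patients`, its columns (`mrn`, `age`, `gender`, `tumor_type`), and the rows are hypothetical and synthetic, not taken from the authors' database.

```python
import sqlite3


def find_patients(conn, min_age, gender, tumor_type):
    """Return (mrn, age) rows for patients older than min_age
    who match the given gender and tumor type."""
    cur = conn.execute(
        "SELECT mrn, age FROM patients "
        "WHERE age > ? AND gender = ? AND tumor_type = ? "
        "ORDER BY mrn",
        (min_age, gender, tumor_type),
    )
    return cur.fetchall()


def build_demo_db():
    """Create an in-memory database with a few synthetic rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE patients ("
        "mrn TEXT PRIMARY KEY, age INTEGER, gender TEXT, tumor_type TEXT)"
    )
    conn.executemany(
        "INSERT INTO patients VALUES (?, ?, ?, ?)",
        [
            ("0001", 62, "m", "HCC"),
            ("0002", 38, "m", "HCC"),  # excluded: not over 40
            ("0003", 55, "f", "HCC"),  # excluded: gender does not match
            ("0004", 71, "m", "CRC"),  # excluded: different tumor type
            ("0005", 47, "m", "HCC"),
        ],
    )
    return conn


conn = build_demo_db()
rows = find_patients(conn, 40, "m", "HCC")
print(rows)
```

Passing the criteria as parameters rather than building the SQL string by hand keeps the saved query reusable with different search values, which mirrors how the paper describes queries being saved and rerun.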
  • 35. Figure 3. This form illustrates how users input data to the database. The form is divided into three parts: (a) Patient Form – Data consists of basic patient information. Patient Identification (PID) is a unique number generated by the database to uniquely identify patients. LAST MODIFIED is a timestamp of when the data was most recently updated or added. MODIFIED BY is a text box that records who updated/added data. (a1) shows the total number of patients in the
  • 36. database. (b) Tumor – Data consists of a patient’s primary and secondary tumors in the liver. The dropdown allows users to select a tumor or add new tumor types (e.g. metastatic disease). (b1) shows how many tumor types the patient has in the liver. (c) Embolization Procedures – Data consists of intra-arterial therapy (IAT) sessions. (c1) shows how many IAT sessions a patient has gone through.
  • 38. Figure 4. This figure illustrates the tabular form where each group of related data is shown as individual tabs to assist user navigation. The display of patient identification information and comments are maintained while the user navigates to different tabs to preserve the scope and field of view for each patient.
  • 40. Figure 5. This form shows a patient’s Child-Pugh score and Barcelona Clinic Liver Cancer (BCLC) stage. They are automatically calculated when provided with pertinent patient data. The “Calculate” buttons are used to refresh the form should any patient data value change. PT/INR: Prothrombin Time/International Normalized Ratio; PS: Performance Status.
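As a rough sketch of what such a calculator might do, the function below computes a Child-Pugh score and class from five inputs. It uses the commonly cited cut-offs (bilirubin and albumin in conventional units, INR for coagulation); the paper does not list the thresholds or field names its form uses, so the function name, parameter names, category labels, and units here are all assumptions. BCLC staging is not covered.

```python
def child_pugh(bilirubin_mg_dl, albumin_g_dl, inr, ascites, encephalopathy):
    """Return (score, class) using commonly cited Child-Pugh cut-offs.

    ascites: "none", "mild", or "moderate_severe" (hypothetical labels)
    encephalopathy: "none", "grade_1_2", or "grade_3_4" (hypothetical labels)
    """
    score = 0
    # Total bilirubin: <2 -> 1 point, 2-3 -> 2 points, >3 -> 3 points
    score += 1 if bilirubin_mg_dl < 2 else 2 if bilirubin_mg_dl <= 3 else 3
    # Serum albumin: >3.5 -> 1 point, 2.8-3.5 -> 2 points, <2.8 -> 3 points
    score += 1 if albumin_g_dl > 3.5 else 2 if albumin_g_dl >= 2.8 else 3
    # INR: <1.7 -> 1 point, 1.7-2.3 -> 2 points, >2.3 -> 3 points
    score += 1 if inr < 1.7 else 2 if inr <= 2.3 else 3
    # Clinical findings are graded categories rather than lab values
    score += {"none": 1, "mild": 2, "moderate_severe": 3}[ascites]
    score += {"none": 1, "grade_1_2": 2, "grade_3_4": 3}[encephalopathy]
    # Class A: 5-6 points, class B: 7-9 points, class C: 10-15 points
    cls = "A" if score <= 6 else "B" if score <= 9 else "C"
    return score, cls


print(child_pugh(1.0, 4.0, 1.1, "none", "none"))
```

Keeping the scoring in one function means the same logic can back both a per-patient "Calculate" button and a batch query over all patients with available blood data.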
  • 42. Figure 6. The self evaluation results are from Table 1.
  • 44. Figure 7. This figure illustrates the output of a query for male patients with hepatocellular carcinoma (HCC). The interface outputs a list of all patients matching the search criteria.
  • 46. Figure 8. This is the output of a query for patients who had undergone TACE in 2006 (P_PROC_DATE column) with Child-Pugh Class A, here labeled as “Classification”. The automatically calculated Child-Pugh Class can be used also for querying.
  • 48. Figure 9. The database automatically calculates the days between TACE sessions for each patient as a query (red circle). The current treatment “EMBODate,” is subtracted from the next treatment, “Next_EMBO.” Empty fields indicate that the patient has undergone only one treatment or the session is the latest treatment. Because the query is saved, double clicking the query indicated by the red circle refreshes the calculation for the entire database of patients.
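The interval calculation described in Figure 9 can be illustrated outside the database as well. The sketch below is an assumption-laden stand-in written in plain Python rather than the saved Access/MySQL query the authors use; the function name and the sample dates are hypothetical. Each session is paired with the next one, and the latest session gets an empty interval, matching the empty fields in the figure.

```python
from datetime import date


def days_to_next_session(session_dates):
    """Pair each session with the following one and the days between them.

    Returns a list of (current, next, days) tuples in chronological order.
    The latest session has no successor, so its next date and interval are None.
    """
    ordered = sorted(session_dates)
    rows = []
    for i, current in enumerate(ordered):
        nxt = ordered[i + 1] if i + 1 < len(ordered) else None
        # Subtract the current treatment date from the next treatment date
        interval = (nxt - current).days if nxt is not None else None
        rows.append((current, nxt, interval))
    return rows


# Synthetic example: three sessions for one patient, entered out of order
sessions = [date(2006, 3, 1), date(2006, 1, 10), date(2006, 6, 15)]
intervals = [days for _, _, days in days_to_next_session(sessions)]
print(intervals)
```

Sorting first means the result does not depend on the order in which sessions were entered, which matters when several users append data for the same patient.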
  • 52. Table 1 Questionnaire Assessment
    Response: Yes / No
    Question: I searched and filtered data manually
      Example: Sorting and copying relevant data
    Question: I inputted formulas and Excel functions to calculate scores, response rates, or statistics in my Excel spreadsheet
    Question: I summarized my Excel data in a report
      Example: Total number of Child Pugh A patients
    Question: I converted non-binary data (volume measurements, numeric values, occurrence rates of symptoms) into binary data (0/1) by defining a cut-off point to differentiate
      Example: Between responder and non-responder to a given therapy for statistical analysis
    Question: I have done statistical analysis myself
    Question: I unknowingly produced duplicate data that I later found out was already collected by another colleague
    Response: 0–20% / 21–40% / 41–60% / 61–80% / 81–100%
    Question: From the beginning of data collection to finishing analysis, about what percentage of the total time spent for a single retrospective study did you spend on:
    Question: Querying/filtering/categorizing data?
  • 53. Example: Defining subsets of patients with certain criteria such as patients treated only with cTACE or only with DEB-TACE
    Question: Calculating data?
      Example: Min, Max, Mean, Sum, Clinical Scores such as Child-Pugh
    Response: Very Unlikely / Unlikely / Neutral / Likely / Very Likely
    Question: If given the opportunity, how likely will you use software that:
    Question: Produces group summaries with minimal effort?
      Example: Total number of Child Pugh A patients
    Question: Calculates clinical staging and score systems automatically?
    Question: Allows multiple users to add and edit data into the same database so that redundant collection of the same patients by different colleagues can be avoided?
    Question: Allows users to track data modifications?
    Question: Stores data in a centralized location with remote access?
    Sponsored by
  • 54. Center for Cancer Research National Cancer Institute Clinical Data Management Introduction • Clinical data management (CDM) consists of various activities involving the handling of data or information that is outlined in the protocol to be collected/analyzed. CDM is a multidisciplinary activity. • This module will provide an overview of clinical data management and introduce the CCR’s clinical research database. By the end of this module, the participant will be able to: • Discuss what constitutes data management activities in clinical research. • Describe regulations and guidelines related to data management practices. • Describe what a case report form is and how it is developed. • Discuss the traditional data capture process. • Describe how protocols are developed in Cancer Central Clinical
  • 55. Database (C3D). Clinical Data Management • A multi-disciplinary activity that includes: • Research nurses • Clinical data managers • Investigators • Support personnel • Biostatisticians • Database programmers • Various activities involving the handling of information outlined in protocol Clinical Data Management Activities • Data acquisition/collection • Data abstraction/extraction • Data processing/coding
  • 56. • Data analysis • Data transmission • Data storage • Data privacy • Data QA Guidelines and Regulations… • Good Clinical Practice (GCP): • Trial management; data handling, record keeping (2.10, 5.5.3 a-d) • Subject and data confidentiality (2.11; 5.5.3 g) • Safety reporting (4.11) • Quality control (4.9.1; 4.9.3; 5.1.3) • Records and reporting (5.21; 5.22) • Monitoring (5.5.4) …Guidelines and Regulations
  • 57. • 21 CFR Part 11 • Applies to all data (residing at the institutional site and the sponsor’s site) created in an electronic record that will be submitted to the FDA • Scope includes: • validation of databases • audit trail for corrections in database • accounting for legacy systems/databases • copies of records • record retention Case Report Forms What is a Case Report Form (CRF)?... • Data-reporting document used in a clinical study
  • 58. • Collects study data in a standardized format: • According to the protocol • Complying with regulatory requirements • Allowing for efficient analysis …What is a Case Report Form (CRF)? • Allows for efficient and complete data collection, processing, analysis and reporting • Facilitates the exchange of data across projects and organizations especially through standardization • Types: Paper, electronic/web interface • Accompanied by a completion/instruction manual
  • 59. CRF Relationship to Protocol • Protocol determines what data should be collected on the CRF • All data must be collected on the CRF if specified in the protocol • Data that will not be analyzed should not appear on the CRF General Considerations for CRF Development… • Collect data with all users in mind • Collect data required by the regulatory
  • 60. agencies • Collect data outlined in the protocol • Be clear and concise with your data questions …General Considerations for CRF Development • Avoid duplication • Request minimal free text responses • Collect data in a fashion that: • allows for the most efficient computerization • similar data to be collected across studies Elements of a CRF • The term CRF indicates a single page
  • 61. • A series of CRF pages makes up a CRF Book • One CRF book is completed for each subject enrolled in a study • Three major parts: • Header • Safety related modules • Efficacy related modules Header Information • Key identifying Information • MUST HAVES • Study Number • Site/Center Number • Subject identification number Safety Modules • Keep safety analysis requirements of the protocol
  • 62. in mind • Follow the general guidelines for CRF development • Safety Modules include: • Demographic information • Adverse Events • Medical History/Cancer history (e.g., diagnosis, staging) • Physical Exam, including Vital Signs • Concomitant/Concurrent Medications/Measures • Deaths • Drop outs/off-study reasons • Eligibility confirmation Efficacy Modules • Considered to be “unique” modules and can be more difficult to develop • Protocol dictates the elements required in efficacy modules • Define • Key efficacy endpoints of trial (primary and secondary) • Additional tests to measure efficacy (e.g., QOL) • How lesions will be measured (longest diameter, bi-dimensional, volumetric) • CR, PR, SD, PD • Required diagnostics
  • 63. • Include appropriate baseline measurements • Repeat same battery of tests Standard CRFs • Allows rapid data exchange • Removes the need for mapping during data exchange • Allows for consistent reporting across protocols, across projects • Promotes monitoring and investigator staff efficiency • Allows merging of data between studies • Provides increased efficiency in processing and analysis of clinical data CRF Development Process… • Begins as soon in the study development process as possible
  • 64. • Responsibility for CRF design can vary between clinical research organizations (e.g.: CRA, data manager, Research Nurse, Database Development, Dictionary Coding, Standards) • Include all efficacy and safety parameters specified in the protocol using standard libraries …CRF Development Process • Collect ONLY data required by the protocol • Work with protocol visit schedule • Interdisciplinary review is necessary • Note: • each organization has its own process for review/sign-off • Should include relevant members of the project team involved in conduct, analysis and reporting of the trial
  • 65. Properly Designed CRF • Allows components or ALL of the CRF pages to be reused across studies • Saves time • Saves money Poorly Designed CRF • Poorly designed CRFs will result in data deficiencies including: • Data not collected as per protocol • Collecting unnecessary data (i.e.: data not required to be collected per protocol) • Impeding data entry process • Database requiring modifications throughout study
  • 66. Electronic CRFs • The use of Remote Data Capture (RDC) is increasing • In general, the concepts for the design of electronic CRFs/RDC screens are the same as covered for paper • No need to print and distribute paper CRF Completion CRF Completion… • According to GCP Section 4.9.1, the investigator should ensure the accuracy, completeness, legibility, and timeliness of the data reported on
  • 67. the CRFs and in all required reports. This includes ensuring: • all sections have been completed, including the header with identifying items • all alterations have been properly made • all adverse events are fully recorded and that for all serious adverse events, any specific documentation has been completed …CRF Completion • Data is taken from the source documents (e.g.: medical record) and entered onto the CRFs by study personnel. This is referred to as data abstraction. • Only designated members of the research staff should be allowed to record and/or correct data in the CRFs
  • 68. • Typically this responsibility resides with the Data Manager/ Research Nurse Tips: CRF Completion… 1. CRF completion/instruction manual should be observed to ensure the accuracy, completeness, legibility, and timeliness of the data reported to the sponsor 2. Make sure appropriate protocol, investigator and subject identifying information is included in the Header (for RDC, may be pre-populated) 3. Ensure data is entered in the correct location or data field
  • 69. …Tips: CRF Completion… 4. Use the appropriate units of measurement (UOM), and be consistent 5. Check to see that data is consistent across data fields and across CRFs • E.g.: • Make sure visit dates match dates on the laboratory or other procedure reports; • Make sure the birth date matches the subject’s age; 6. Use only the abbreviations authorized per completion/instruction manual 7. Double check your spelling …Tips: CRF Completion 8. Watch for transcription errors • E.g.: sodium level should be “135” and entered as “153”
  • 70. 9. Do not allow entries to run outside the indicated data field; this important data might be missed during data processing 10. Use “comments” section to elaborate on any information, but keep to a minimum Timeliness of CRF Completion • Ideally CRFs should be completed as soon after the subject’s visit as possible • Ensures that information can be retrieved or followed-up on while the visit is still fresh in the healthcare provider’s mind, and while the subject and/or the information is still easily accessible REMEMBER….
  • 71. • Data cannot be entered onto a CRF if it is not in the medical record or for some documents, in the research record • If the individual completing the CRF, finds missing or discrepant source data he/she should: • Notify the research nurse or health care provider who then will provide the data • If applicable, contact outside source (i.e.: outside lab or doctor's office) Common Errors … • Logical • date of the second visit is earlier than the first visit • Inaccurate information • source document says one thing, the CRF says another • Omissions
  • 72. • AE is recorded on the CRF but not on the source document • Transcription errors • date errors, 11-2-59 instead of 2-11-59 …Common Errors • Abbreviations • unless an approved list of abbreviations is distributed and utilized, data entry personnel often misinterpret abbreviations • Spelling errors • Illegible entries/”write-overs” • Writing in margins Correcting Paper CRF Entries… • If corrections are necessary, make the change
  • 73. as follows: • Draw one horizontal line through the error; • Insert the correct data; • Initial and date the change; • DO NOT ERASE, SCRIBBLE OUT, OR USE CORRECTION FLUID OR ANY OTHER MEANS WHICH COULD OBSCURE THE ORIGINAL ENTRY • These procedures ensure a complete “audit trail” exists for all entries. [Figure: sample paper CRF with example entries (01/JAN/2005, 05C1234, NIC, 12345678, 03/JAN/1925, 80; form footer NCI EN 9/8/05) and printed instructions: 1. Complete each form in black or blue pen to ensure good photocopies. 2. All dates are to be expressed in day/month/year (dy/mth/yr) format. To avoid ambiguity, months are to be recorded using a three-letter abbreviation (e.g., Jan, Feb, Mar). Years are to be recorded as four digits (e.g., 1998).]
  • 74. …Correcting Paper CRF Entries Electronic Data Collection Process • Web-based interface • Sponsor or site dependent • Ensures data integrity: • Controls the ability to delete or alter previously entered data • Provides an audit trail for data changes • Protects the database from being tampered with • Ensures data preservation (e.g. automatic backups)
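The audit-trail idea listed above (corrections are recorded rather than silently overwriting the earlier entry) can be illustrated with a small sketch. This is not how C3D or any particular electronic data capture product implements it; the class `AuditedRecord`, its method names, and the user ID are hypothetical. The sample values reuse the sodium transcription example from the completion tips (135 entered as 153).

```python
from datetime import datetime, timezone


class AuditedRecord:
    """Hypothetical record wrapper: values are corrected, never silently overwritten."""

    def __init__(self, values):
        self._values = dict(values)
        self._audit = []  # append-only list of change entries

    def get(self, field):
        return self._values[field]

    def correct(self, field, new_value, user, reason):
        # Log the old value first so the original entry stays visible,
        # the electronic analogue of a single line through a paper entry.
        self._audit.append({
            "field": field,
            "old": self._values.get(field),
            "new": new_value,
            "user": user,
            "reason": reason,
            "at": datetime.now(timezone.utc),
        })
        self._values[field] = new_value

    def audit_trail(self):
        # Return copies so callers cannot edit the history.
        return tuple(dict(entry) for entry in self._audit)


# Made-up example: sodium was transcribed as 153 instead of 135.
record = AuditedRecord({"sodium": 153})
record.correct("sodium", 135, user="dm01", reason="transcription error")
trail = record.audit_trail()
print(record.get("sodium"), trail[0]["old"], trail[0]["new"])
```

The design choice worth noting is that the history is only ever appended to, and is handed out as copies, so the original entry, the person making the change, and the time of the change all remain recoverable.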
  • 75. Process of Data Transfer to Sponsor Traditional (Paper) Electronic Traditional Data Transfer… • CRF Books developed by sponsor and supplied to the site for completion along with completion/instruction manual • Paper CRFs are either 2 or 3 part NCR (No Carbon Required paper) • Use a black or blue ballpoint pen for permanency – and PRESS HARD • At the time of a monitoring visit, CRFs are reviewed for adherence to completion guidelines and verified against source documents by the Monitor
  • 76. …Traditional Data Transfer … • During the monitoring visit, site staff make required corrections to CRFs • Verified/corrected CRFs are submitted to the sponsor, leaving a legible copy of the CRF at the site • e.g.: CRA may hand carry completed CRFs to the sponsor; • If data is not retrieved at the time of the monitoring visit, sponsor may want the CRFs submitted via mail or facsimile …Traditional Data Transfer • Sponsor enters the CRF data into a centralized database (generally done by 2 separate individuals, called double data entry) and reviews the data for errors • If inconsistencies are found, the sponsor generates data queries (forms may vary slightly from sponsor to sponsor) and sends to the site • Site staff investigates these queries and
  • 77. responds to them either directly on the data query form or on the CRF. The data correction is then re-submitted to the sponsor for entry into their database. Data Transfer: Electronic CRF (eCRF) • Site records data from source documents to the electronic database or the web interface • Data periodically electronically transmitted to Sponsor/CRO or automatically resides in Sponsor database • Real-time review of data performed by in house CRAs • Less frequent CRA visits • Electronic queries generated and sent to site • Database lock Cancer Central Clinical Database (C3D)…
  • 78. • C3D is an integrated clinical trial information system for the CCR • System is secure, compliant with regulatory requirements (21 CFR Part 11) • System is friendly and flexible for user …Cancer Central Clinical Database (C3D) • Designed to allow integration with the NCI extramural divisions and the NIH Clinical Center CRIS (Clinical Research Information System). • Currently this is being done with labs drawn at the Clinical Center. • Oversight is done by the Control and Configuration Management Group (CCMG) whose membership has clinical and IT expertise
  • 79. C3D Overview… • Based on commercial software produced by the Oracle Corporation called Oracle Clinical (OC) • Allows for Remote Data Capture (RDC) so that local and remote personnel enter and manage clinical data over a LAN, intranet, telephone line, or the Internet • Data can be electronically transferred to Sponsors (responsibility of DM IT team) …C3D Overview • A template set of master CRFs have been created to collect the data required by CCR protocols • Templates are reused and each study will only use the eCRFs that are appropriate and required for that study
  • 80. • Confidentiality statement signed at time of training J-Review • J-Review is a software product that allows us to get data out of C3D into a variety of reports • Numerous template reports have been developed including: • Adverse event summary • Demographics • Drug administration • Also allows for customized reports C3D eCRFs Resources • C3D Data Entry • Manual for the Completion of the NCI/CCR/C3D Case Report Forms
  • 81. • Access to J-Review is granted once training occurs. https://ccrod.cancer.gov/confluence/display/CCRClinicalIT3/Login https://ccrod.cancer.gov/confluence/display/CCRClinicalIT3/Training+and+Education https://octrials-rpt.nci.nih.gov/jreviewwww/sample_default.htm C3D Protocol Build Process… • OCD determines if a protocol will be built in C3D • Currently the following are built: • All CTEP-sponsored, non-cooperative group trials • All industry-sponsored trials with company agreement (if not, sponsor will then provide paper CRFs) • All internal/non-sponsored interventional trials
  • 82. …C3D Protocol Build Process… • Clinical Analyst (CA) receives protocol from IRB • CA identifies standard eCRFs to be used • CA develops the eCRF book and identifies if new eCRFs are needed • CA meets with research team to confirm eCRF book [Figure: C3D protocol build workflow diagram. Step labels: Protocol Receiving, Initiation Meeting, Requirement Specification, Forms & Rules Building, Forms & Rules Testing, Activation Meeting, Report Building, Change Request. Roles: Clinical Analyst, Clinical Programmers, Research Team (PI, RN, DM), Control & Configurations Management Group (CCMG). Documents: Protocol, Reqs, Sign-off, Build Doc, QC Doc, Rep Doc, CR Doc.]
  • 86. …C3D Protocol Build Process… • Clinical Programmers (CP) build protocol (eCRFs) in C3D • Research team tests the build/enters data • Modifications made as needed
  • 87. • Protocol activated in C3D by CA/CP • eCRFs available for data entry [Figure: C3D protocol build workflow diagram, repeated with Modification and Activation steps. Roles: Clinical Analyst, Clinical Programmers, Team, Control & Configurations Management Group (CCMG). Documents: Protocol, Reqs, Sign-off, Build Doc, QC Doc, Rep Doc, CR Doc.]
  • 91. …C3D Protocol Build Process • If a protocol amendment requires changes in C3D (e.g. eligibility criteria), CA/CP will develop new eCRF • Team will review, sign-off • CA/CP will activate new eCRF Book [Figure: C3D protocol amendment workflow diagram. Step labels: Protocol Amendment, Update Requirement Specification, New Forms & Rules Building, Forms & Rules Testing, Activation of New Forms, Activation Meeting, Report Building, Change Request. Roles: Clinical Analyst, Clinical Programmers, Team, Control & Configurations Management Group (CCMG). Documents: Protocol, Reqs, Sign-off, Build Doc, QC Doc, Rep Doc, CR Doc.]
  • 95. Training • There is specific training required for use of C3D and J-Review. • See Training Sessions for date, time and location. https://ccrod.cancer.gov/confluence/display/CCRClinicalIT3/Training+and+Education Industry Sponsored Queries • Sponsor generates questions/queries: • During/end of a monitoring visit • After data sent to sponsor and reviewed/entered in sponsor’s database • Site corrects CRF: • During/between monitoring visit • May need to also sign-off on query form itself
  • 96. CTEP Sponsored CTMS Clarification • These are paper queries generated for CTEP-sponsored, CTMS-monitored trials • Sent every Monday by Theradex (contractor for CTEP) CTEP Sponsored CDS Rejection/Notification • These are electronic data queries for CTEP- sponsored, CDS-monitored clinical trials • CDS submitter receives notice • For studies in C3D, the notification will be sent to the CCR IT Programmer who transfers the data to CDU • CCR staff corrects data in the database and resubmits • Process occurs until data is loaded correctly in CDS
  • 97. Missing Data at Time of Transfer • Missing data elements • Source Document (SD) not supporting CRF • CRF not supporting SD • Referred to as: • Discrepancies • Queries • Clarifications • Identified by: • Sponsor • Database Sponsor Queries • Sponsor generates: • During/End of a monitoring visit • After data sent to sponsor and reviewed/entered in database • Site corrects CRF:
  • 98. • During/between monitoring visit • May need to sign-off on query Database Discrepancies • Failure of entered data to pass a validation check as applied by a database • Univariate discrepancy – single data element errors (e.g., not using provided pick-list, missing data in a field) • Multivariate discrepancy – multiple data element errors (e.g., male patient with + beta HCG) Quality Control According to GCP Section 5.1.3 quality control should be applied to each stage of
  • 99. data handling to ensure that all data are reliable and have been processed correctly. Assessing the QC/QA Process • Are staff checking their own work? • Are staff relying on others to check their work? • Does the organization have a QA plan for monitoring protocol adherence and data collection? • Are there SOPs related to data management? • How soon after a visit is a CRF completed? • Is all data, as defined in the protocol, captured from the source document to the CRF? Terminology • Quality Control
  • 100. • Quality Assurance • Quality Improvement Quality Control (QC) • Ongoing and concurrent review of subject data • Typically 100% • Checking your own work and work of others • Verify that data collected and abstracted: • Correctly entered onto CRF • Able to be found in source document • Follows regulations and guidelines • Individual team member level Quality Assurance (QA) • Planned, systematic check done at the branch or organizational level • Verifies: • Trial is performed as per the approved plan
  • 101. • Data generated is accurate • Identifies problems and trends: • Retrospective and involves sampling of subjects and data • Pulls all the pieces together to gain a picture (measurement) of compliance • Ensures staff is compliant with internal and external regulations/guidelines QA Activities • Internal monitoring/audits • Compile all data components and gain a measurement of compliance • Clarification monitoring • Assess for trends • Review clarifications responses before they are submitted to sponsor • Measure data inconsistencies and trends using a sampling of the data prior to audits/monitoring visits • Summarize QA findings and report to management
  • 102. • Identify learning needs QA Activities for CCR • The following are examples of QA activities for the CCR: • Office of the Clinical Director (OCD) • Internal monitoring/audits • Conduct audits upon request for PI-sponsored studies • Clarification monitoring • Data Management Contractor • Develop QA tools • Summarize QA findings and report to management, education and training • Identify needs Quality Improvement (QI) • Result of QC and QA • Developing a plan includes:
  • 103. • Identifying root causes of problems • Intervening to reduce or eliminate these problems • Taking steps to correct the process(es) • Identifying trends and areas for improvement • Identifying solutions: • Assess work flow and time management activities • Develop tools for source documentation • Assess training needs • Involve appropriate staff in resolution • Implementing new/updated solution QI Activities for CCR • Team Level: • Based on QC activities: identifying trends • Based on audit/monitoring visit results • OCD CCR Level: • Based on audit/monitoring visit results
  • 104. • Guide in implementing processes for making corrective changes Responsibilities • Research Team responsibilities • Research Nurse responsibilities • Data Manager responsibilities Research Team • Ensure that all source data is documented in the Medical Record/Research Chart with accuracy, completeness, and consistency • Ensure the overall quality of the research data is verifiable and acceptable for sponsor submissions, publications, etc. • Review data discrepancy/clarification resolutions for accuracy, consistency and timely response Research Nurse….
  • 105. • Provide accurate and complete source documentation • Develop, implement, and maintain a team QC plan: • Establish a schedule of QC activities • Quality check source documentation, data abstraction, CRFs completion • Quality check of database • Verify function in database • Develop team quality improvement plan, as needed ….Research Nurse • Lead Team QC meeting: • Provide administrative updates • Provide patient updates • Perform QC on data/resolve issues • Review query/clarification:
  • 106. • Assign to Data Manager(s), if appropriate, to investigate and resolve or resolve yourself • Review and sign off: • Follow sponsor SOP Data Manager…. • Abstract data onto CRFs according to what is found in the source documents (Medical Record or Research Chart) and CRF Instruction Manual • Abstract data in a timely fashion, this includes entry into database • Code Adverse Events accurately utilizing the appropriate version of CTCAE, as per protocol ….Data Manager
  • 107. • Apply quality control checks at each stage of data handling • Ensure that data elements abstracted are complete and accurate • Contact Research Nurse for missing source data • Resolve discrepant data – ongoing • Utilize database report tools to assist with QC activities Guiding Principles • Source documents need to be accurate and complete • Data abstraction should occur in real time • QC/QI is the responsibility of every research team member • QC/QI should be completed on all protocol data for all protocols • QC/QI should be proactive and ongoing • Each team member should know and understand the roles and responsibility of each
  • 108. team member Resources • Guidelines for Good Clinical Practice. International Conference on Harmonisation (ICH). • http://www.ich.org • FDA, Title 21 CFR Part 11 • http://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfcfr/CFRSearch.cfm?CFRPart=11 Evaluation Please complete the evaluation form and fax to Elizabeth Ness at 301-496-9020.
  • 109. For questions, please contact Elizabeth Ness 301-451-2179 [email protected]v https://ccrod.cancer.gov/confluence/download/attachments/71041052/CDM_Evaluation.pdf Design Considerations for HIM Related Databases 1. Database design considerations for HIM Professionals are complex and vary widely when considering such factors as database purpose, setting, objectives, targeted audience, and output requirements. Using your past experience in the HIM industry as a guide, please select two topics (a primary and a secondary) that are of professional interest to you that will serve as the basis for developing a database during this semester. Your selection(s) must contain data elements that represent the entire continuum of the patient experience: administrative, clinical, and financial data elements. The instructor will distribute a Database Design Template/Model that will assist you with this process. You can modify the Template/Model to fit your topic. 2. When considering design issues for HIM related databases, we must remain mindful of the interrelationships between data elements. In order for an HIM-related database to be usable, linkages must exist between the tables contained within each database. Within your database design, which fields do you plan to use as the key data elements which will enable the various tables in your database to interact with each other?
  • 110. 3. When considering issues of database design, we are ultimately concerned with the clinical and regulatory needs of the intended audience. Please specifically identify the intended audience for the database that you intend to design in partial fulfillment of the requirements of this course. The Sample Diabetes Database (in red) on the subsequent page is provided as a guide for this exercise, although your own database design may follow any standard database design convention.
  Sample Database Design: Diabetes
  Master Patient Index
    Field Name | Data Type | Field Size
    PtIdNo | AutoNumber | Integer
    PtLast | Text | 30
    PtFirst | Text | 30
    MRNO | Number | Double
    Gender | Text | 1
    Race | Text | 10
    DOB | Date/Time | mm/dd/yy;@
  Encounter
    Field Name | Data Type | Field Size
    EncounterId | AutoNumber | Integer
    PtIdNo | Number | Integer
    MRNO | Number | Double
    DOS | Date/Time | mm/dd/yy;@
    Height | Number | Double
    Weight | Number | Double
    Physician | Text | 35
  • 111. Encounter (continued)
    Date Onset | Date/Time | mm/dd/yy;@
    Insulin Dependent | Text | 1
    A1CScore | Number | Double
    A1CRating | Text | 10,@
    DietCompliance | Text | 10,@
    Neuropathy | Text | 10,@
    Retinopathy | Text | 10,@
    BMI | Number | Double
    BMIRating | Text | 10,@
  Insurance
    Field Name | Data Type | Field Size
    InsuranceId | AutoNumber | Integer
    PtIdNo | Number | Integer
    InsPlanName | Text | 30
    InsPlanNo | Text | 30
  Sample Database Design 1: _________________________
  Encounter
    Field Name | Data Type | Field Size
  • 112. Encounter (continued)
    EncounterId | AutoNumber | Integer
    PtIdNo | Number | Integer
  Master Patient Index
    Field Name | Data Type | Field Size
    PtIdNo | AutoNumber | Integer
    PtLast | Text | 30
    PtFirst | Text | 30
    MRNO | Number | Double
    Gender | Text | 1
    Race | Text | 10
    DOB | Date/Time | mm/dd/yy;@
  Insurance
    Field Name | Data Type | Field Size
    InsuranceId | AutoNumber | Integer
    PtIdNo | Number | Integer
    InsPlanName | Text | 30
    InsPlanNo | Text | 30
  • 113. Sample Database Design 2: _____________________________________
  Encounter
    Field Name | Data Type | Field Size
    EncounterId | AutoNumber | Integer
    PtIdNo | Number | Integer
  Master Patient Index
    Field Name | Data Type | Field Size
    PtIdNo | AutoNumber | Integer
    PtLast | Text | 30
    PtFirst | Text | 30
    MRNO | Number | Double
    Gender | Text | 1
    Race | Text | 10
    DOB | Date/Time | mm/dd/yy;@
  Insurance
    Field Name | Data Type | Field Size
    InsuranceId | AutoNumber | Integer
    PtIdNo | Number | Integer
  • 114. Insurance (continued)
    InsPlanName | Text | 30
    InsPlanNo | Text | 30
  Department of Health Informatics, Health Information Management Program. BINF 5520 Health Analytics. Creating A Diabetes Tracking Relational Database Using Microsoft Access • Fundamentals of Creating A Clinical Tracking Database • Working With Database “Objects”: Tables, Forms, Queries, Reports • Creating a Database to Track Patients With Diabetes • Review of Database Fundamentals • Questions and Answers
  • 115. How This Presentation Is Organized Step Number Will Always Be At Top Command Orientation in Red on Left Side Screen Shot In Middle Arrows will focus your attention. The Four Objects of Microsoft Access TABLES: The “Containers” That Hold The Data. We must DESIGN these tables before we can do anything, because they hold the data ! FORMS: The Forms allow us to display information to users easily. QUERIES: The Queries allow us to select data based on specific criteria. REPORTS: The Reports allow us to output data, either via printer or via a file, such as files that are in a PDF or XLS format. The Four Objects of Microsoft Access TABLES QUERIES REPORTS FORMS DATABASE The Five Steps of Creating A Relational Database 1. Create the Tables 2. Define The Database Relationship(s) 3. Create The MPI and Encounter Forms
  • 116. 4. Combine the MPI and Encounter Forms Into One Form 5. Start Using The Database ! 1. Create the Tables
  Master Patient Index (MPI)
    Field Name | Field Type | Field Length
    PtId | AutoNumber | Numeric
    PtLast | ShortText | 30
    PtFirst | ShortText | 30
    PtDOB | Date | MM/DD/YYYY
    MRNumber | ShortText | 12
    PtSex | ShortText | 1
    PtRace | ShortText | 1
    And other fields….
  Encounters
    Field Name | Field Type | Field Length
    EncounterID | AutoNumber | Numeric
    PtId | Number | Numeric
    DateOfService | Date | MMDDYYYY
    Provider | ShortText | 30
    A1C | Numeric | Decimal,0
    BP-Systolic | Numeric | Decimal,0
    BP-Diastolic | Numeric | Decimal,0
    Cholesterol | Numeric | Decimal,0
    Retinopathy | Yes/No | Yes/No
    Neuropathy | Yes/No | Yes/No
    And other fields….
  2. Define The Database Relationship(s)
  • 117. Master Patient Index (MPI)
    Field Name | Field Type | Field Length
    PtId | AutoNumber | Numeric
    PtLast | ShortText | 30
    PtFirst | ShortText | 30
    PtDOB | Date | MM/DD/YYYY
    MRNumber | ShortText | 12
    PtSex | ShortText | 1
    PtRace | ShortText | 1
    And other fields….
  Encounters
    Field Name | Field Type | Field Length
    EncounterID | AutoNumber | Numeric
    PtId | Number | Numeric
    DateOfService | Date | MMDDYYYY
    Provider | ShortText | 30
    A1C | Numeric | Decimal,0
    BP-Systolic | Numeric | Decimal,0
    BP-Diastolic | Numeric | Decimal,0
    Cholesterol | Numeric | Decimal,0
    Retinopathy | Yes/No | Yes/No
    Neuropathy | Yes/No | Yes/No
    And other fields….
  3. Create The MPI and Encounter Forms 4. Graft the MPI and Encounter Forms Together
  • 118. 5. Start Using The Database ! Step 1 Step 2 Create / Table Design Step 3 Create / Table Design Step 4 Create / Table Design Step 5 Create / Table Design
  • 119. Step 6 Create / Table Design Step 7 Create / Table Design Step 8 Home Step 9 Create / Table Design Step 10 Create / Table Design Step 11 Create / Table Design
  • 120. Step 12 Create / Table Design Step 13 Create / Table Design Step 14 Create / Table Design Step 15 Home Step 16 Database Tools / Relationships Step 17 Database Tools / Relationships
  • 121. Step 18 Database Tools / Relationships Step 19 Database Tools / Relationships Step 20 Database Tools / Relationships Step 21 Database Tools / Relationships Step 22 Database Tools / Relationships Step 23 Database Tools / Relationships
  • 122. Step 24 Create / Table Design Step 25 Create / Table Design Step 26 Create / Table Design Step 27 Database Tools / Relationships Step 28 Home Step 29 Home
  • 123. Step 30: Home / Right Click on Encounter / Left Click on Design View
    Step 31: Move to the Provider field and go to the tab at the bottom called Lookup.
    Step 32: In the Lookup tab at the bottom, select Combo Box.
    Step 33: In the Row Source option, select the lkpProvider table developed earlier.
    Step 34: Save the table by selecting Yes.
    Step 35: We will now see the two tables and the relationship between the tables.
  • 124. Step 36: Design / Relationships / Save / Yes
    Step 37: We now see all three tables: MPI, Encounter, and lkpProvider.
    Step 38: Create / Form Wizard
    Step 39: Create / Form Wizard
    Step 40: Create / Form Wizard
    Step 41: Create / Form Wizard
  • 125. Step 42: Create / Form Wizard
    Step 43: Home / Right Click on Form MPI / Left Click on Design View
    Step 44: Highlight the four fields at the bottom left side of the screen and move them to the upper right.
    Step 45: Highlight the four fields at the bottom left side of the screen and move them to the upper right.
    Step 46: Highlight the four fields at the bottom left side of the screen and move them to the upper right.
  • 126. Step 47: Close the Form MPI and left click Yes to save the changes to the design of the form.
    Step 48: Highlight the four fields at the bottom left side of the screen and move them to the upper right.
    Step 49: Create, Form Wizard, Left Click on Form Encounter, Right Click on Design View
    Step 50: Click the double right arrows (>>) to move the fields from Available to Selected and click Next.
    Step 51: Click Next to display all fields for this form.
  • 127. Step 52: Indicate that the form should be organized in a Tabular layout and click Next.
    Step 53: Name the form Encounter and click Finish.
    Step 54: The form will be laid out horizontally. You may need to adjust the width of the fields to make the form easier to read.
    Step 55: Close the form and click Yes to save the changes to the design of the form Encounter.
    Step 56: On the left side of the screen, right click on the Form MPI and left click on Design View.
    Step 57: You will see the large area under the MPI fields. This is where we will move the Encounter form so that we can see the patient and all associated encounters at the same time.
  • 128. Step 58: Left click on the Form Encounter and position it under the PtFirst field in the MPI form.
    Step 59: Close Form MPI and click Yes to save all changes to the design of this form.
    Step 60: Double click on the MPI form to see how the two forms have been joined together.
    Step 61: The screen below shows you the results of a database that has been populated. Note that the PtId in the MPI is the same as the PtId in the Encounter.
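The combined form shows one patient together with all of that patient's encounters, matched on PtId. A minimal sketch of the same idea outside Access is given below, again using Python's sqlite3 module as a stand-in. Everything specific here is an assumption for illustration: the reduced set of columns, the single `Provider` column in lkpProvider (the original does not list that table's fields), the helper names, and the made-up sample rows. The foreign key to lkpProvider is also stricter than an Access combo box lookup, which by itself only offers a list of choices.

```python
import sqlite3


def make_demo_db():
    """Build a cut-down MPI / Encounter / lkpProvider database in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the relationships
    conn.executescript(
        """
        -- Assumption: one column holding the provider name.
        CREATE TABLE lkpProvider (Provider TEXT PRIMARY KEY);
        CREATE TABLE MPI (
            PtId    INTEGER PRIMARY KEY AUTOINCREMENT,
            PtLast  TEXT,
            PtFirst TEXT
        );
        CREATE TABLE Encounter (
            EncounterID   INTEGER PRIMARY KEY AUTOINCREMENT,
            PtId          INTEGER NOT NULL REFERENCES MPI (PtId),
            DateOfService TEXT,
            Provider      TEXT REFERENCES lkpProvider (Provider),
            A1C           NUMERIC
        );
        """
    )
    return conn


def seed_example(conn):
    """Insert made-up rows (not real patient data) and return the new PtId."""
    with conn:
        conn.execute("INSERT INTO lkpProvider (Provider) VALUES (?)", ("Dr. Example",))
        cur = conn.execute(
            "INSERT INTO MPI (PtLast, PtFirst) VALUES (?, ?)", ("Doe", "Jane")
        )
        pt_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO Encounter (PtId, DateOfService, Provider, A1C) "
            "VALUES (?, ?, ?, ?)",
            [
                (pt_id, "2015-06-15", "Dr. Example", 7),
                (pt_id, "2015-03-02", "Dr. Example", 8),
            ],
        )
    return pt_id


def patient_with_encounters(conn, pt_id):
    """Return the MPI row and its encounters, oldest first.

    This mirrors the MPI form with the Encounter form placed inside it:
    one patient on top, that patient's encounters underneath.
    """
    patient = conn.execute(
        "SELECT PtId, PtLast, PtFirst FROM MPI WHERE PtId = ?", (pt_id,)
    ).fetchone()
    encounters = conn.execute(
        "SELECT EncounterID, PtId, DateOfService, Provider, A1C "
        "FROM Encounter WHERE PtId = ? ORDER BY DateOfService",
        (pt_id,),
    ).fetchall()
    return patient, encounters
```

Calling `patient_with_encounters(conn, seed_example(conn))` on a database from `make_demo_db()` returns the patient row and two encounter rows, each carrying the same PtId as the patient, which is the match the final step points out.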