Lecture series on Clinical Research
# | Title | Date
1 | What research? And what CRC does | Done
2 | Developing research capacity | Done
3 | Governance & Ethics
Methodology & protocol development
IT in Clinical research
Medical writing & publication
8 | Develop data resources
9
11 | Investigator initiated research
12 | Contract Research industry
Information Technology in Clinical Research
Dr. Lim Teck Onn FRCP, M.Stat
Research capacity & competencies in CRC
# | Research competencies | Sources
1 | Governance, Ethics and Compliance | CRC’s core
2 | Subject matter expertise | MOH & CRC
3 | Methodology & protocol development | CRC’s core
4 | Project mgt & Research QA | CRC’s core
5 | Information Technology | Vendors
6 | Data management | Partner
7 | Biostatistics | Partner
8 | Medical writing, Publication & publishing | Partner
9 | Specialized functions: Clinical Lab, Safety, Logistics, etc | Vendors
IT and data are to clinical research what labs, animal houses and tissue repositories are to basic research: mission-critical resources
The data collected by a clinical research project are its justification and purpose. No matter how perfect the protocol or how experienced the investigators, if data cannot be reliably and accurately collected and stored for analysis, the research might as well not have been performed in the first place
Uses of IT in research
Research data collection: “eClinical”
Uses of IT in clinical research
Common IT tools supporting various research processes
# | Research process | Enabling IT tools
1 | Conception & lit review | Internet, Medline
2 | Research design | Sampling, Randomization, Sample size etc
3 | Research conduct | Research system; “eClinical”, Data warehousing
4 | Statistics | SAS, STATA
5 | Medical writing | Endnote
6 | Publication | Desktop publishing, Manuscript Central
7 | Dissemination | PLoS medicine (open access journal), CRC website, Public use dataset (Data download), Data charting tool
8 | Research governance & compliance | Research registration, NMRR (electronic research submission, processing & performance tracking)
Research data collection
Collecting and managing data is a fundamental activity in research. There are basically two ways to go about it:
1. Primary data collection for research: eClinical
Collect data per protocol designed to meet research objective
Traditional paper CRF
IVRS/ Call centre
Electronic data capture (EDC): web-based application
2. Secondary uses of existing health databases: Data warehousing
Issues with data: Unstructured or non-standardized; Duplicates; Often require data in multiple databases; Data distortions (missing value, error)
Difference between research and routine health data
# | Research data, e.g. trial | Routine data, e.g. hospital, lab etc
1 | Data per protocol | Multi-purpose
2 | Dedicated system; usually off the shelf | Huge complex system, often bespoke
3 | Use one-off per project | Ongoing concern
4 | System validation required | Costly luxury
5 | Data QA built-in | Costly luxury
6 | Data professionally managed (MCSE, CCDM, Stats etc) | Costly luxury
7 | Anonymized data | Security concern ++
8 | Cost++ | Cost++
Bottom line | Must deliver high quality data, else you are fired | Data for patient care; must deal with inherent limitation for research use
Uses of IT in research
Managing research data collection: “eClinical”
Accessing & processing routine health data for research: Data warehousing
The data collected by a clinical research project are its justification and purpose. Not surprisingly, IT plays a key role in enhancing the conduct of research
# | Research mgt | Research systems
1 | Project mgt | Clinical Trial Management Systems (CTMS); eInformed Consent; Patient recruitment
2 | Data collection & mgt | eCRF or Electronic Data Capture (EDC); Electronic diaries, Lab & Imaging data; Clinical Data Mgt system (CDMS)
3 | Logistics mgt | Randomisation and Trial Supply Management (RTSM)
Clinical Data Management System (CDMS) is a comprehensive research system to collect, manage, and review trial data
Capture, process and track the CRF and Query form
CRF and Query scanned as images for simultaneous processing and access
Reduce paper work
Reduce the time and cost
Reduce risk of document loss
CDM
How many CRFs have been logged in?
Has the site returned the query form?
What is the turnaround time for a query?
How many CRFs have not yet been scanned?
ClinDataTrack
Generate and track Data Clarification Forms (DCF)
Track and scan CRFs
Central Randomization System
A system that generates the random allocation of treatment to successive patients within a clinical trial.
Often done centrally to prevent randomization being subverted
IVR and web-based technologies provide real-time, 24 x 7 randomisation
Linked to Clinical Data Management System (CDMS) and Clinical Supply system
[Diagram: IVR Central Randomisation. Labels: IVRS, Study Site, Sponsor, Monitor, Clinical Supply, CDMS]
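The central allocation described above can be sketched in Python. This is only an illustration under stated assumptions: the slides do not say which allocation method is used, so a permuted-block design with two arms and a block size of four is assumed here, and the class name `CentralRandomizer` and all of its parameters are hypothetical.

```python
import random


class CentralRandomizer:
    """Hypothetical central randomisation service using permuted blocks."""

    def __init__(self, arms=("A", "B"), block_size=4, seed=None):
        if block_size % len(arms) != 0:
            raise ValueError("block_size must be a multiple of the number of arms")
        self.arms = tuple(arms)
        self.block_size = block_size
        self._rng = random.Random(seed)  # seeded here only to make the sketch repeatable
        self._block = []                 # allocations remaining in the current block
        self.log = []                    # (patient_id, arm) audit trail

    def _new_block(self):
        # Every block contains each arm equally often, in random order.
        block = list(self.arms) * (self.block_size // len(self.arms))
        self._rng.shuffle(block)
        return block

    def allocate(self, patient_id):
        """Return the treatment arm for the next enrolled patient."""
        if not self._block:
            self._block = self._new_block()
        arm = self._block.pop()
        self.log.append((patient_id, arm))
        return arm


randomizer = CentralRandomizer(seed=2024)
allocations = [randomizer.allocate(f"P{i:03d}") for i in range(1, 9)]
print(allocations)
```

Because sites only ever call `allocate` and never see the block contents, the next assignment cannot be read off a local list, which is the point of doing the randomisation centrally.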
Clinical Supply System
A system to plan, monitor and control the entire supply chain process in a clinical trial
Ensures all investigational product can be accounted for at the end of the study
IVR and web-based technologies optimize the supply chain process and provide robust inventory control
[Diagram: IVR Clinical Supply System. Labels: IVR, Inventory Database, Drug Distribution Depot, Study Site]
1. Dispense call
2. Pack number
3. Stock levels fall to trigger level
4. Consignment request
5. Consignment details
6. Shipment
7. Notification call: arrival/damaged packs
8. Update list of available packs
Legend: telephone communication via IVR; automated electronic communication; physical shipment
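The trigger-level resupply loop in the steps above can be sketched as follows. This is a minimal illustration, not the actual system: the `SiteInventory` class, its field names and the example quantities are all hypothetical, and the depot side (steps 5 and 6) is left out.

```python
from dataclasses import dataclass, field


@dataclass
class SiteInventory:
    """Hypothetical per-site stock record held in the inventory database."""
    site_id: str
    trigger_level: int       # request resupply when stock falls to this level
    resupply_quantity: int   # packs asked for in each consignment request
    available_packs: list = field(default_factory=list)
    pending_requests: list = field(default_factory=list)

    def dispense(self):
        """Steps 1-2: a dispense call returns the next pack number."""
        if not self.available_packs:
            raise RuntimeError(f"No packs available at site {self.site_id}")
        pack = self.available_packs.pop(0)
        # Steps 3-4: stock at or below the trigger level raises one consignment request.
        if len(self.available_packs) <= self.trigger_level and not self.pending_requests:
            self.pending_requests.append(self.resupply_quantity)
        return pack

    def receive(self, pack_numbers, damaged=()):
        """Steps 7-8: the arrival call adds undamaged packs to the available list."""
        damaged_set = set(damaged)
        usable = [p for p in pack_numbers if p not in damaged_set]
        self.available_packs.extend(usable)
        self.pending_requests.clear()
        return usable


site = SiteInventory("S01", trigger_level=2, resupply_quantity=4,
                     available_packs=[101, 102, 103, 104])
first = site.dispense()
second = site.dispense()
requested = list(site.pending_requests)   # request raised after the second dispense
usable = site.receive([201, 202, 203, 204], damaged=[203])
print(first, second, requested, usable)
```

Recording damaged packs at the arrival call keeps them out of the dispensable list while still leaving a trace, which supports the accountability requirement at the end of the study.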
Safety Reporting System
A system to capture, store, process, maintain, classify and report adverse event data.
To enable source data providers (SDP) to access & manage their listed patients
To enable other authorized non-SDP users to gain access to data for other purposes (transplant waiting list, research purpose, on-demand data charting etc)
Web applications for CRC projects
# | Patient registries | Patient registries | Patient registries | Health databases
1 | National Renal Registry | National Suicide Registry | M’sian Cardio-Thoracic Registry | Healthcare Establishment & Workforce Survey
2 | National Transplant Registry | Nat. CVD (ACS/PCI) Database | National Neurology Registry | Health professional registers
3 | National Eye Database | Nat. Dermatology Registry | National Chest Registry | National Medicines Use Survey
4 | Malaysian National Neonatal Registry | National Cancer Patient Registry | National Urology Registry | National Medical Device Survey
5 | Malaysian Liver Registry | Hematological Malignancy Reg. | National ORL Registry | National Medical care Survey
6 | Malaysian GI Registry | National OT Register | National Nuclear Medicine Database | JPN National Death Register
7 | National Trauma Database | Malaysian Registry of Intensive Care | National Radiology Registry | Post-Operative Mortality Review
8 | Diabetes Registry of Malaysia | Nat. Inflammatory Arthritis Registry | National O&G Patient Registry | Maternal Mortality Register
9 | National Mental Health Registry | Nat. Orthopedic Reg Malaysia | National Paediatric Mortality Register
IT infrastructure supporting CRC projects
Coordinating centre
NTT Data Centre, Cyberjaya
Connectivity between the coordinating centre and the data centre is via a Virtual Private Network (VPN) over a broadband internet connection.
eClinical: Towards convergence
As more research activities are conducted and more data collected using e-methods, the piecemeal approach of the past is converging into a seamless flow that saves time, streamlines workflow and improves accuracy
Combine solutions, and access data and all functionality through a single interface
Business-process driven solutions: focus on workflow and how to simplify it for users who rely on multiple technology solutions to do their jobs in a single project
Integration rather than homogenization: better to combine the best solutions than to attempt to rebuild a single new monolithic system that does everything but will likely have limitations in key areas.
Computer System Validation (CSV)
“… an ongoing process of establishing documented evidence which provides a high degree of assurance that a computerized system will consistently perform according to its predetermined specifications and quality attributes ”
Lifetime system validation goals
Management control | Controlled GCP work processes using computerized systems
System reliability | Consistent, intended performance of computerised systems
Data integrity | Secure, accurate, and attributable GCP e-data
Auditable quality | Documented evidence for control and quality of e-data and e-system
Typical CSV package items
Validation plan
Application administrative SOPs & application configuration management logs
Change control log, QA audit log, supplier reports & BDG minutes
Needs analysis, RFP, contract, URS, SLAs
User manuals, CVs & training records, dept. SOPs, problem/help logs
Test summary report & updates
Test cases, scripts, data & results logs
Test plan(s) start-up & ongoing
Users’ CSV package summary report
Accessing & processing routine health data for research: Data warehousing
Information needs for our healthcare Population Health Outcomes
Medical technology/ Devices
Quality of care
Clinical measures : BP, Lipid etc
Population illness burden
Disease incidence & prevalence
Healthcare System Resource Inputs Care Process Service Outputs
Where are the data?
# | Data domain | Data sources
1 | Illness burden | Population health survey, Routine health service, Epidemiology research, Disease surveillance, Patient registries
2 | Financing | National Health Account; Health Econ research
3 | Facilities & Services | NHSI
4 | Human resource | Professional register; NHSI
5 | Medicines | NMUS, EMR (THIS, TPC)
6 | Medical Technology | NMDS
7 | Patient Rx | Patient registries, EMR (THIS, TPC)
8 | Healthcare quality | Patient registries, EMR (THIS, TPC), Quality of Care Indicators, Incident reporting
9 | Healthcare activities (in-patient, ambulatory care, surgery) | NHSI, NOTRE, Routine health service, EMR (THIS, TPC)
10 | Health outcomes | Patient registries, Vital registration; Health Outcome research
A data warehouse is a repository of an organization's electronically stored data; it is designed to facilitate analysis and reporting
Health Data sources
Procurement (eg medicine, devices)
Population health surveys
Data warehousing & processing
Data cleaning (missing data & rectifying other data distortion)
Public use data
On demand data charting
Monitoring & Evaluation (M&E)
Routine statistical report
What information value?
The problem is how best to learn from the data captured in our health databases: how to extract potentially useful information from the available data to inform decision-making, solve problems or simply discover new knowledge (research).
Terminology
Data: Raw facts, generally stored as characters, words, symbols or measurements
Information: Processed data. Processing means anything done to the raw data, from formal analysis to explanations supplied by the user.
Knowledge: Information applied to rules, experiences and relationships, with the result that it can be used for decision making or problem solving.
Data mining: The science of searching large bodies of data for interesting and unsuspected patterns and structures
Research: Systematic investigation (usually based on data) to obtain generalizable new knowledge
From data to information to knowledge
[Diagram: Data -> Information -> Knowledge (data processing)]
Is our massive investment in healthcare IT producing the desired information value?
How is it done?
How to extract useful information from available health databases?
Who: Who should be doing what?
How: Getting from data to information?
Who does what?
Elliot: “Where is the knowledge in the information? And where is the wisdom in the knowledge?”
Message: So, don’t leave it to the IT professional alone
Michael Healy, medical statistician, on being asked about poor quality medical research: “The difference between medical and agricultural research is that medical research is done by doctors but agricultural research is not done by farmers.”
Message: Neither can you leave it to doctors
Stalin, Russian dictator: “There are lies, great lies and statistics”
Message: So, you cannot trust the statisticians either
It takes many different skills
1 | ICT pro | Data | How best to employ available technology to manage & secure the data captured in the healthcare system?
2 | Data mgt | Data | How best to extract the required data, and then to “clean” and process the data to enable the generation of information from the data?
3 | Stat | Stats | How best to apply available statistical methods to estimate parameters of interest (information expressed as statistics)?
4 | Doctor | Text + # | How best to interpret and act on the available data and information to manage my patients?
5 | Manager | Report + stat | How best to interpret and act on the available information to manage (financial, HR, operational etc) the healthcare organization?
6 | Researcher | Stats | How best to interpret the statistical information, and obtain generalizable knowledge?
Getting from data to information
1. Data processing
Data cleaning & editing
2. Data analysis
[Diagram: Data -> Information -> Knowledge (data processing)]
Process of finding records that refer to the same entity in one table
Phonetic name comparison
Non phonetic fuzzy matching
Linguistic name analysis
Specialized numeric comparisons such as distance comparison, date/time comparison
User defined rules
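The comparison techniques listed above can be combined in a small de-duplication sketch. It is an illustration only: Soundex stands in for the phonetic comparison, `difflib.SequenceMatcher` for the non-phonetic fuzzy matching, and an exact date-of-birth match for a user-defined rule. The slides do not name the algorithms actually used, and the records, field names and the 0.85 threshold are made up for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Soundex digit for each coded consonant; vowels, y, h and w carry no code.
_SOUNDEX_CODES = {
    ch: digit
    for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6"))
    for ch in letters
}


def soundex(name):
    """Four-character Soundex code, e.g. 'Robert' and 'Rupert' both give 'R163'."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate repeated codes
        digit = _SOUNDEX_CODES.get(ch, "")
        if digit and digit != prev:
            result += digit
        prev = digit
    return (result + "000")[:4]


def name_similarity(a, b):
    """Non-phonetic fuzzy score between 0 and 1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_duplicates(records, threshold=0.85):
    """Return id pairs within one table that probably refer to the same person."""
    pairs = []
    for left, right in combinations(records, 2):
        same_dob = left["dob"] == right["dob"]  # user-defined rule (assumed)
        sounds_alike = soundex(left["name"]) == soundex(right["name"])
        looks_alike = name_similarity(left["name"], right["name"]) >= threshold
        if same_dob and (sounds_alike or looks_alike):
            pairs.append((left["id"], right["id"]))
    return pairs


# Made-up records for illustration only.
records = [
    {"id": 1, "name": "Mohamad Ali", "dob": "1970-01-31"},
    {"id": 2, "name": "Muhammad Ali", "dob": "1970-01-31"},
    {"id": 3, "name": "Tan Mei Ling", "dob": "1982-06-15"},
    {"id": 4, "name": "Muhammad Ali", "dob": "1991-11-02"},
]
duplicates = find_duplicates(records)
print(duplicates)
```

Requiring the date-of-birth rule in addition to a name match keeps two different people who share a common name (records 2 and 4) from being merged.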
“ In order to make coded data available in a setting where a large subset of the information will reside in natural language documents, a technology called natural language understanding is required.
This technology allows a computer system to “read” free-text documents, to convert the language in these documents to predefined concepts and to capture these concepts in a coded form in a medical database ”
By Peter J. Haug
Multiple Data Dictionary
ICD-10, ICD-O (Oncology)
Anatomical Therapeutic Chemical (ATC) classification
Probabilistic matching of text against coded data (ICD-O)
Coding Adverse Event
Text description of reported AE | MedDRA Preferred Term (PT) code + label
Acute myocardial infarction | 10000891 Acute myocardial infarction
Increase in blood pressure | 10005750 Blood pressure increased
Blood pressure increased | 10005750 Blood pressure increased
Fatigue | 10016256 Fatigue
Blood pressure increase (hypertension) | 10020782 Hypertension NOS
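The adverse event coding example above can be sketched as a dictionary lookup. This is a simplified illustration: the five entries are copied from the slide, while the function name `code_adverse_event`, the 0.85 cutoff and the approximate-match fallback are assumptions. The fallback uses plain string similarity from `difflib`, not the probabilistic matching mentioned earlier, and a real coding workflow would send unmatched terms to a human coder.

```python
import difflib

# Verbatim term -> (MedDRA PT code, PT label), copied from the slide's example.
CODING_DICTIONARY = {
    "acute myocardial infarction": ("10000891", "Acute myocardial infarction"),
    "increase in blood pressure": ("10005750", "Blood pressure increased"),
    "blood pressure increased": ("10005750", "Blood pressure increased"),
    "fatigue": ("10016256", "Fatigue"),
    "blood pressure increase (hypertension)": ("10020782", "Hypertension NOS"),
}


def code_adverse_event(verbatim, cutoff=0.85):
    """Return (code, label, method), or None when manual review is needed."""
    key = " ".join(verbatim.lower().split())  # normalise case and whitespace
    if key in CODING_DICTIONARY:
        return (*CODING_DICTIONARY[key], "exact")
    # Assumed fallback: nearest dictionary entry by string similarity.
    close = difflib.get_close_matches(key, list(CODING_DICTIONARY), n=1, cutoff=cutoff)
    if close:
        return (*CODING_DICTIONARY[close[0]], "approximate")
    return None


print(code_adverse_event("  Fatigue "))
print(code_adverse_event("Acute myocardial infraction"))  # misspelt verbatim term
print(code_adverse_event("Headache"))                     # not in this small dictionary
```

Returning the matching method alongside the code lets a reviewer audit every approximate match before the coded data are locked.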
The task of linking together information from one or more data sources that represents the same entity. This technique is used to determine whether two records represent the same real-world entity. (Peter Christen, Tim Churches, 2000)
Uses a similarity-search technique to find similar records (e.g. a misspelt character in a name) and then determines which of them are actual duplicates.
Record Matching with JPN data
Record Matching with NVRS data
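Linking records across two sources can be sketched with a weighted field-agreement score. This is a minimal illustration and not the method of the cited authors or of the matching shown on the slides: the field names, the weights (0.6, 0.3, 0.1), the 0.85 threshold and the example records are all assumptions.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """String similarity between 0 and 1, tolerant of small misspellings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(rec_a, rec_b):
    """Weighted agreement across fields; the weights are illustrative only."""
    score = 0.6 * similarity(rec_a["name"], rec_b["name"])
    score += 0.3 * (rec_a["dob"] == rec_b["dob"])
    score += 0.1 * (rec_a["sex"] == rec_b["sex"])
    return score


def link_records(source_a, source_b, threshold=0.85):
    """For each record in source_a, link its best candidate in source_b if good enough."""
    links = []
    for rec_a in source_a:
        best = max(source_b, key=lambda rec_b: match_score(rec_a, rec_b), default=None)
        if best is not None and match_score(rec_a, best) >= threshold:
            links.append((rec_a["id"], best["id"]))
    return links


# Made-up records for illustration only.
registry = [
    {"id": "R1", "name": "Lim Ah Kow", "dob": "1950-03-12", "sex": "M"},
    {"id": "R2", "name": "Siti Aminah", "dob": "1964-09-30", "sex": "F"},
]
death_register = [
    {"id": "D7", "name": "Lim Ah Kaw", "dob": "1950-03-12", "sex": "M"},
    {"id": "D8", "name": "Siti Aishah", "dob": "1971-02-01", "sex": "F"},
]
links = link_records(registry, death_register)
print(links)
```

Here the one-letter name difference between R1 and D7 is outweighed by agreement on date of birth and sex, while R2 stays unlinked because its best candidate disagrees on date of birth.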
Data cleaning: Process of detecting, diagnosing, and editing faulty data (missing, disallowed, inconsistent, out of range, etc.)
Data editing: Correction of the data shown to be incorrect
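The detection step can be sketched as a set of rule-based checks covering the four fault types named above. This is an illustration under assumptions: the fields, limits and the cross-field rule in `RULES` and `check_record` are invented for the example and are not taken from any CRC database.

```python
# Illustrative validation rules; field names and limits are assumptions.
RULES = {
    "age": {"required": True, "min": 0, "max": 120},
    "sex": {"required": True, "allowed": {"M", "F"}},
    "systolic_bp": {"required": False, "min": 50, "max": 300},
}


def check_record(record, rules=RULES):
    """Return a list of (field, problem) tuples found in one record."""
    problems = []
    for field_name, rule in rules.items():
        value = record.get(field_name)
        if value is None or value == "":
            if rule.get("required"):
                problems.append((field_name, "missing"))
            continue  # nothing further to check on an empty value
        if "allowed" in rule and value not in rule["allowed"]:
            problems.append((field_name, "disallowed"))
        if "min" in rule and value < rule["min"]:
            problems.append((field_name, "out of range"))
        if "max" in rule and value > rule["max"]:
            problems.append((field_name, "out of range"))
    # Cross-field consistency check (illustrative rule).
    if record.get("sex") == "M" and record.get("pregnant") is True:
        problems.append(("pregnant", "inconsistent"))
    return problems


print(check_record({"age": 45, "sex": "F", "systolic_bp": 120}))
print(check_record({"age": 150, "sex": "X", "systolic_bp": None}))
print(check_record({"age": 30, "sex": "M", "pregnant": True}))
```

The checks only flag suspect values. Deciding whether a flagged value is truly wrong (diagnosis) and correcting it (editing) usually requires a query back to the data source.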
Data analysis & reporting
1. Statistical methods
Refer to the presentation on Biostatistics
2. Data mining
The science of searching large bodies of data for interesting and unsuspected patterns and structures
Any computer method of automatic and continuous analysis of data which turns it into useful information (Edwards, Data mining 2002)
The Collaborative Research Experience
Parties: Information Technology professionals; Users (managers & doctors); Research organization, e.g. CRC
DIY by ICT or by the manager/doctor should be history
Privacy & Confidentiality considerations
Definitions
1. Privacy: An individual’s right to control identifiable health information (in a healthcare or research context)
2. Confidentiality: The corresponding duty to protect the privacy right. It comprises legal or ethical duties that vary with the specific relationship, such as between doctor and patient, investigator and subject, or custodian and donor
3. Security: The technological and administrative safeguards or tools that protect identifiable health information from unwarranted access or disclosure
What research guideline says…
Declaration of Helsinki 2008 Paragraph 11 & 23
“It is the duty of physicians who participate in medical research to protect the life, health, dignity, integrity, right to self-determination, privacy and confidentiality of personal information of research subjects”
“Every precaution must be taken to protect the privacy of research subjects and the confidentiality of their personal information and to minimize the impact of the study on their physical, mental and social integrity”
And what the law says…
“The law in all countries assumes that whenever people give personal information to health professionals caring for them, it is confidential as long as it remains personally identifiable”
Medical Act 1971.
Data Protection Act 2009
US has the most comprehensive set of regulations, and even then they are incomplete and inconsistent
Common Rule 45 CFR 46 Protection of human subjects
Health Insurance Portability and Accountability Act ( HIPAA ) 1996. 45 CFR 160/164 Standards for Privacy of Individually Identifiable Health Information
Govt ready to enact privacy and data protection law
Friday, January 24 2003
BY LEE KAR YEAN in Kajang
THE government is prepared to enact a privacy and data protection law to address the growing concern among Internet users about invasion of their privacy via the Web, Second Finance Minister Datuk Jamaluddin Jarjis said.
He said the government was drafting a bill for deliberation by parliament in the wake of concerns raised by consumers and in the interest of building confidence in the electronic market place.
“Any privacy and data protection law enacted cannot be perfect but we have several models that we can look at and improve upon. We need to provide some flexibility for adjustments within online services in line with changes in technology,” Jamaluddin said.
His remarks were contained in a speech read out on his behalf by Deputy Finance Minister Datuk Chan Kong Choy at the conclusion of a seminar on E-Commerce & the Law on Privacy and the launch of a book titled Privacy & Data Protection at Universiti Tenaga Nasional in Kajang yesterday.
Jamaluddin said the government, in consultation with the public and private sectors, had come up with a comprehensive personal data protection law. The legislation would provide the mechanism for collecting, processing and using data held by the public and private sectors.
“It provides the individual a remedy in case of misuse of data. It seeks to protect the individual from unwanted or harmful use of their data. As such, the data privacy regime in Malaysia does not seek to cut off the flow of data but merely to see that they are collected and used in a responsible and accountable manner,” he added.
Jamaluddin said that although the Internet provided conveniences from e-mail to online shopping and access to share market quotes, the potential abuses and invasion of privacy via the Internet were innumerable.
“What is less understood is that the Internet also collects a great deal of information about its users. As to who uses this information and how it is used forms the basis of many Internet privacy concerns,” he said.
Jamaluddin also noted that different countries had adopted different legal methods and instruments to combat the invasion of privacy issue, with the European Union using the regulatory approach and the US adopting the self-regulatory one.
And what the technical guideline says..
“The ethical and legal requirements to protect P&C naturally give rise to considerations of security: the technological & admin safeguards in place in a healthcare institution or research organization to protect identifiable health information from unwarranted access or disclosure”
Malaysian Public sector management of ICT Security handbook. MAMPU 2001
Information Security in research org.
Are you having sleepless nights over this?
Every research database manager’s (and sponsor’s) nightmare
What does it take?
Security policy & procedures
Staff awareness and training
Security audit & ISO to ensure compliance
Technological Mechanisms to ensure Security
Control of external communication links and access
System backup and disaster recovery
An SMS containing an additional password is sent to the user’s mobile phone
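The additional SMS password can be sketched as a single-use, time-limited code checked after the normal login. This is an illustration under assumptions, not a description of the system in use: the `SmsSecondFactor` class, the six-digit format and the five-minute lifetime are hypothetical, and the actual SMS delivery is not shown.

```python
import hmac
import secrets
import time


class SmsSecondFactor:
    """Hypothetical one-time password issuer; SMS delivery itself is not shown."""

    def __init__(self, ttl_seconds=300):
        self.ttl_seconds = ttl_seconds
        self._pending = {}  # username -> (code, expiry timestamp)

    def issue(self, username, now=None):
        """Create a fresh six-digit code; a real system would send it by SMS."""
        now = time.time() if now is None else now
        code = f"{secrets.randbelow(10**6):06d}"
        self._pending[username] = (code, now + self.ttl_seconds)
        return code

    def verify(self, username, submitted, now=None):
        """Accept the code once, and only before it expires."""
        now = time.time() if now is None else now
        # pop() consumes the code, so any attempt (right or wrong) uses it up.
        code, expiry = self._pending.pop(username, (None, 0))
        if code is None or now > expiry:
            return False
        return hmac.compare_digest(code, submitted)  # constant-time comparison


factor = SmsSecondFactor(ttl_seconds=300)
code = factor.issue("dr_lee", now=1000)
print(factor.verify("dr_lee", code, now=1100))  # within the lifetime
print(factor.verify("dr_lee", code, now=1100))  # second use of the same code
```

The code is generated with the `secrets` module rather than `random` because it acts as a credential and so must be unpredictable.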