Assisted Chart Abstraction:
a technique to help while
we wait for Nirvana
The Driver
	

Many entities within Northwestern Medicine (NM) want to
capture data about cancer patients treated at NM.

• Research
• Education
• Operational/Outcomes Analysis
• QA/QC and Process Improvement
• Marketing/Branding/Outreach Assessment
Challenges
	

NM has multiple EHR systems: Epic (NMFF), PowerChart
(NMH) and MOSAIQ (Radiation Oncology).

Not all clinical systems flow into one of the EHRs 	

Not all relevant data is discretely captured during the course of
clinical care. For example, pathology diagnosis is recorded
in a textual document.
Northwestern Medicine Enterprise
Data Warehouse (NMEDW)
	

The NMEDW is:	

A one-stop shop for finding data from 40+ clinical systems, covering 10
years of data and 2.2 million patients (4 billion events!)

Optimized cross-system data marts representing major
biomedical entities and events: patients, providers,
encounters, labs, medications...and more.	

Intelligent structures, data representations, and the ability to
identify and correlate data across patients, events and data
types
Who is requesting
change?
	

The Northwestern Brain Tumor Institute*	

SPORE in Prostate Cancer	

Lynn Sage Breast Cancer Center	

Gastrointestinal Oncology Group	

Many others - typically disease-focused	

* We will focus mainly on use cases and workflows from BTI
What data do they need?
	

Demographics	

Diagnosis	

Treatment	

Disease Progression	

Survival
Old Solution
	

Data coordinator opens up EHR(s) and manually copies data
into a clinical database.	

Newer solution: Data coordinator pulls data from reports run
against the NMEDW and copies/extracts/annotates them
into the clinical database
Command + Tab Model
	


A manually curated database disconnected from EHR data.	

Depends on a data coordinator finding and manually copying
data from the EHR to a clinical database.
[Diagram: EHR → command + tab → Clinical Database]
Command + Tab Model:
Pros
	

Depends on humans: 	

Humans are great at interpreting narrative documentation, where a
significant portion of cancer clinical data (unfortunately) resides.
Command + Tab Model:
Cons
	

Depends on humans: 	

Difficult for a human to be aware of every relevant
medical event of every patient within a cohort.	

Ignores the flux that occurs within EHRs: patient medical
histories merging and splitting.	

Humans get bored with rote copying of discrete data.

Humans quit and get new jobs.
New Solution: 
NBTI Data Capture Tool
	

The NBTI Data Capture Tool automatically pulls (via the
NMEDW) relevant EHR data for each patient.	

Data points discretely captured in the EHR need no further
review.	

Data points captured non-discretely in textual documents are
abstracted via natural language processing (NLP) and
presented to a data coordinator for review/revision.
Why not use reports?
	

Lots of valuable clinical data still resides in narrative
documents. 	

Not all discrete data contained within the EHR(s) has
been normalized into easily queryable structures in
the NMEDW. 	

Today an investigator cannot ask an NMEDW analyst
the question and get a quick result: 	

How many IDH1 negative glioma patients survived
longer than 5 years?
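To make the question concrete, here is a minimal sketch of how it could be answered once diagnosis, IDH1 status and survival dates exist as discrete fields. The record layout, the `GLIOMA_DIAGNOSES` list and the function names are hypothetical and are not taken from the NMEDW or the NBTI tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional

# Hypothetical curated diagnosis values; a real list would come from the
# clinical database.
GLIOMA_DIAGNOSES = {"glioblastoma", "astrocytoma", "oligodendroglioma"}


@dataclass
class PatientRecord:
    diagnosis: str                 # curated pathology diagnosis
    idh1_status: Optional[str]     # "positive", "negative", or None if unknown
    diagnosis_date: date
    death_date: Optional[date]     # None if no death has been recorded
    last_seen_date: date


def years_after(start: date, years: int) -> date:
    """Return the anniversary of `start`, mapping Feb 29 to Feb 28 if needed."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def survived_longer_than(p: PatientRecord, years: int) -> bool:
    """True if the patient is known to have been alive past the anniversary."""
    known_alive_until = p.death_date or p.last_seen_date
    return known_alive_until > years_after(p.diagnosis_date, years)


def count_idh1_negative_long_survivors(
    patients: Iterable[PatientRecord], years: int = 5
) -> int:
    return sum(
        1
        for p in patients
        if p.diagnosis.lower() in GLIOMA_DIAGNOSES
        and p.idh1_status == "negative"
        and survived_longer_than(p, years)
    )
```

The query itself is trivial. The hard part is that `diagnosis` and `idh1_status` usually live in narrative pathology reports, so they have to be abstracted and curated before a query like this can run.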
Waiting for Nirvana
	

NMEDW reports will not obsolete research clinical
databases until:

• Clinical IT optimizes the EHRs to discretely
capture all relevant data points (ain’t happenin’)
• The NMEDW normalizes all EHR data into easily
digestible formats and to reference
terminologies (limited by above step!)
Sources for the first
iteration
	

Epic: support discrete data capture of fundamental treatment/
diagnosis data points.	

Epic/MyChart: embed intake form.	

Cerner: support discrete data capture of pathology data points.
	

Cerner: support explicit association between pathology cases
and surgeries.	

MOSAIQ: support discrete data capture of site, laterality for
radiation therapies.
Building the Foundation
Analyze the Data
	

Started with a list of data elements and sample data from a
neuro-oncologist and a neurosurgeon	

Determined obtainability of each data element:

• Discrete in the EHR and in the EDW.
• Discrete in the EHR but not in the EDW.
• Narrative document in the EHR and in the EDW.
• Narrative document in the EHR but not in the EDW.
Build an EDW 
Data Mart
	

Engaged the NMEDW team to build an NBTI-dedicated data mart and
extract, transform, load (ETL) script:

patients
encounters
medications
surgeries
surgery notes
pathology cases
gamma knifes
radiation therapies
imaging exams
progress notes
labs
Build a Clinical Database
	

Build a clinical database mirroring the structure of the
EDW data mart in a PostgreSQL server 	

Add database structures to allow for the layering of
curated data on top of data imported from the EDW.
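One way to picture the layering is a record that keeps imported values and curated values side by side and prefers the curated value on read. This is a minimal sketch under that assumption; the class and field names are hypothetical and do not describe the actual NBTI schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Mapping


@dataclass
class LayeredRecord:
    # Values refreshed from the EDW on every import.
    imported: Dict[str, Any] = field(default_factory=dict)
    # Values entered or corrected by the data coordinator.
    curated: Dict[str, Any] = field(default_factory=dict)

    def refresh(self, new_import: Mapping[str, Any]) -> None:
        """Replace the imported layer; the curated layer is left untouched."""
        self.imported = dict(new_import)

    def curate(self, name: str, value: Any) -> None:
        """Record a coordinator's value on top of whatever was imported."""
        self.curated[name] = value

    def value(self, name: str) -> Any:
        """Curated value if one exists, otherwise the imported value."""
        if name in self.curated:
            return self.curated[name]
        return self.imported.get(name)
```

Keeping the two layers separate means a nightly re-import can never overwrite a coordinator's work, and the original EDW value stays available for comparison.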
Import Data
	

Expose the data in the EDW data mart as
web services via SQL Reporting Service
reports.	

Automate via cron jobs the pull of data
into the clinical database from the EDW
with shared EDW web service adapter
code.
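A scheduled import of this kind is easiest to operate when it is idempotent, so that a cron job can rerun without creating duplicates. The sketch below assumes each EDW row carries a stable identifier. The `fetch_rows` callable stands in for the web service adapter, the `encounters` table and its columns are hypothetical, and `sqlite3` is used in place of PostgreSQL only so the example runs anywhere.

```python
import sqlite3
from typing import Callable, Iterable, Mapping


def import_encounters(
    conn: sqlite3.Connection,
    fetch_rows: Callable[[], Iterable[Mapping[str, str]]],
) -> int:
    """Pull rows from the (hypothetical) adapter and upsert them by EDW id."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS encounters (
            edw_id TEXT PRIMARY KEY,
            mrn TEXT NOT NULL,
            encounter_date TEXT NOT NULL
        )
        """
    )
    count = 0
    for row in fetch_rows():
        # INSERT OR REPLACE keys on edw_id, so a rerun updates rows in place.
        conn.execute(
            "INSERT OR REPLACE INTO encounters (edw_id, mrn, encounter_date) "
            "VALUES (?, ?, ?)",
            (row["edw_id"], row["mrn"], row["encounter_date"]),
        )
        count += 1
    conn.commit()
    return count
```

A cron entry would then call a small script wrapping `import_encounters` once per night, with one such function per data mart table.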
Patients
	

Inclusion in the NBTI system is determined by a list of
ICD diagnosis codes. Inclusion could alternatively be
determined by consent to a protocol.

Pull from the NMEDW patient name, birth date, MRN(s),
gender, ethnicity, race, death date and last seen date
(across Northwestern Medicine - NM).
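The code-list inclusion rule can be sketched as a simple category match. The two categories below (ICD-9-CM 191 and ICD-10 C71, both "malignant neoplasm of brain") are only illustrative; the actual NBTI code list is not given in these slides, and the function names are hypothetical.

```python
from typing import Iterable, Set

# Illustrative categories only, not the real NBTI inclusion list.
INCLUSION_CATEGORIES: Set[str] = {"191", "C71"}


def category_of(code: str) -> str:
    """Return the part of a dotted ICD code before the decimal point."""
    return code.strip().upper().split(".")[0]


def meets_inclusion(
    diagnosis_codes: Iterable[str],
    categories: Set[str] = INCLUSION_CATEGORIES,
) -> bool:
    """True if any of the patient's diagnosis codes falls in a listed category."""
    return any(category_of(code) in categories for code in diagnosis_codes)
```

The sketch expects codes in dotted form (for example `C71.1`). A consent-based rule would replace `meets_inclusion` with a lookup against protocol enrollment.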
Integrate with Specimen
Inventory Data
	

Prepare data for migration into PathCore's specimen
inventory system BSI2.	

Allow for ad hoc query exploration of specimen
inventory based on clinical data points. 	

Standardizing the structure of clinical data captures
across sites makes this possible.
NLP
	

Build an NLP pipeline to abstract discrete data points
from the flow of narrative documents and textual
fragments.

Use the Stanford NLP library for chunking and sentence
splitting.

Use the LingScope library to parse the negation scope of
sentences.

Use the NCI Metathesaurus for synonym lookups and
attaching codes.
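The shape of such a pipeline (split sentences, find terms, check negation, attach a code) can be sketched in plain Python. Everything here is a stand-in: a regular expression replaces the Stanford sentence splitter, a crude cue list replaces LingScope's negation scope parsing, and a tiny dictionary with made-up placeholder codes replaces the NCI Metathesaurus.

```python
import re
from typing import List, NamedTuple

# Placeholder concept codes, NOT real NCI Metathesaurus identifiers.
SYNONYMS = {
    "glioblastoma multiforme": "CONCEPT_GLIOBLASTOMA",
    "glioblastoma": "CONCEPT_GLIOBLASTOMA",
    "gbm": "CONCEPT_GLIOBLASTOMA",
    "oligodendroglioma": "CONCEPT_OLIGODENDROGLIOMA",
}
# Crude negation cues; any cue earlier in the sentence negates the mention.
NEGATION_CUES = ("no evidence of", "negative for", "without", "no")


class Mention(NamedTuple):
    concept: str
    text: str
    negated: bool
    sentence: str


def split_sentences(text: str) -> List[str]:
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_mentions(text: str) -> List[Mention]:
    mentions = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        claimed = []  # character spans already matched by a longer synonym
        for term in sorted(SYNONYMS, key=len, reverse=True):
            for m in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
                if any(m.start() < end and start < m.end() for start, end in claimed):
                    continue
                claimed.append((m.start(), m.end()))
                prefix = lowered[: m.start()]
                negated = any(
                    re.search(r"\b" + re.escape(cue) + r"\b", prefix)
                    for cue in NEGATION_CUES
                )
                mentions.append(
                    Mention(SYNONYMS[term], sentence[m.start() : m.end()], negated, sentence)
                )
    return mentions
```

Each `Mention` keeps its source sentence so that a data coordinator can see the context behind a suggested value when reviewing it.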
Electronic Intake Form
	

Deploy an electronic intake form that can be filled out by a
patient before or at their first clinic visit.

An email is sent; the form can be filled out in a web browser, on a
tablet or (painfully) on a smartphone.
Navbar
	

Sections
The Results
Biopsy, Surgery and
Pathology Diagnosis
	

Pull from the NMEDW NM biopsies, surgeries, surgical
procedure reports and pathology cases (inside and out). 	

Abstract and allow for the confirmation/revision of surgery
type, site, laterality, pathology diagnosis, grade,
recurrence, anatomical location of primary, cancer staging
and pathology test results.
Present the NLP
Abstractions
	


Present NLP-derived abstractions as queues of work that
the data coordinator needs to confirm or revise.
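A review queue like this can be modeled as a list of suggestions that each start as pending and end as confirmed or revised. The sketch below is one possible shape; the class, field and status names are hypothetical and are not taken from the NBTI tool.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Abstraction:
    field_name: str            # e.g. "pathology diagnosis"
    suggested_value: str       # what the NLP pipeline proposed
    source_sentence: str       # context shown to the coordinator
    status: str = "pending"    # "pending", "confirmed" or "revised"
    final_value: Optional[str] = None

    def confirm(self) -> None:
        """Accept the NLP suggestion as the curated value."""
        self.status = "confirmed"
        self.final_value = self.suggested_value

    def revise(self, corrected_value: str) -> None:
        """Replace the NLP suggestion with the coordinator's value."""
        self.status = "revised"
        self.final_value = corrected_value


def pending_queue(items: Iterable[Abstraction]) -> List[Abstraction]:
    """Return the abstractions that still need a coordinator's decision."""
    return [a for a in items if a.status == "pending"]
```

Keeping the original suggestion next to the final value also makes it possible to measure later how often the NLP output was accepted unchanged.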
Surgeries and Pathology
Cases
Surgery Detail
Pathology Case Detail
With Context
Gamma Knife and Radiation
Therapy
	

Pull from the NMEDW NM radiation therapies. 	

Abstract and allow for the confirmation/revision of
site and laterality.
Intravenous
Chemotherapy
	

Pull from the NMEDW NM intravenous
chemotherapy treatments (from Intellidose and
Epic Beacon).
Labs and Other
Medications
	

Labs

• Pull from the NMEDW NM labs.

Other Medications

• Pull from the NMEDW NM non-intravenous,
prescribed/ordered medications.
• Allow for confirmation/revision of drug, route,
duration, amount, patient parameter and
administered.
Imaging Exams and Clinic
Visits
	

Imaging Exams

• Pull from the NMEDW NM imaging exams.
• Abstract and allow for confirmation/revision of
response/progression declarations and lesion
measurements.

Clinic Visits

• Pull from the NMEDW NM clinic visit notes. Abstract
and allow for confirmation/revision of performance
status declarations and tracking of outside treatments.
Reporting
	

Ad hoc query exploration of data.

Integrate NMH quality metrics.

Generate Kaplan-Meier survival curves against SEER
data on demand.
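For reference, the Kaplan-Meier estimate at time t is the product, over every event time t_i up to t, of (1 - d_i / n_i), where d_i is the number of deaths at t_i and n_i is the number of patients still at risk just before t_i. The sketch below computes that product from follow-up durations and event flags. It is a from-scratch illustration with hypothetical names, not the tool's reporting code, and it does not include the comparison against SEER data.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(
    durations: Sequence[float], events: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each time with at least one death.

    durations: follow-up time per patient (e.g. months since diagnosis).
    events: True if the patient died at that time, False if censored
            (still alive when last seen).
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        deaths = sum(1 for d, e in zip(durations, events) if d == t and e)
        leaving = sum(1 for d in durations if d == t)  # deaths plus censored
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Everyone observed at time t leaves the risk set afterwards.
        at_risk -= leaving
    return curve
```

Censored patients lower the number at risk without lowering the curve, which is why `last seen date` matters as much as `death date` in the data pulled from the NMEDW.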
Reporting
Export
	


Export into Word, Excel or CSV for analysis and
visualization in SAS, SPSS, R, etc.



Using NLP and curation to make clinical data available for research

  • 1. Assisted Chart Abstraction: a technique to help while we wait for Nirvana
  • 2. The Driver Many entities within Northwestern Medicine (NM) want to capture data about cancer patients treated at NM. • Research • Education • Operational/Outcomes Analysis • QA/QC and Process Improvement • Marketing/Branding/Outreach Assessment
  • 3. Challenges NM has multiple EHR systems: Epic (NMFF), PowerChart (NMH) and Mosaiq (Radiation Oncology). Not all clinical systems flow into one of the EHRs. Not all relevant data is discretely captured during the course of clinical care. For example, pathology diagnosis is recorded in a textual document.
  • 4. Northwestern Medicine Enterprise Data Warehouse (NMEDW) The NMEDW is: One stop shop for finding data from 40+ clinical systems, 10 years of data, and 2.2 Million patients (4 billion events!) Optimized cross-system data marts representing major biomedical entities and events: patients, providers, encounters, labs, medications...and more. Intelligent structures, data representations, and the ability to identify and correlate data across patients, events and data types
  • 5. Who is requesting change? The Northwestern Brain Tumor Institute* SPORE in Prostate Cancer Lynn Sage Breast Cancer Center Gastrointestinal Oncology Group Many others - typically disease-focused * We will focus mainly on use cases and workflows from BTI
  • 6. What data do they need? Demographics Diagnosis Treatment Disease Progression Survival
  • 7. Old Solution Data coordinator opens up EHR(s) and manually copies data into a clinical database. Newer solution: Data coordinator pulls data from reports run against the NMEDW and copies/extracts/annotates them into the clinical database
  • 8. Command + Tab Model A manually curated database disconnected from EHR data. Depends on a data coordinator finding and manually copying data from the EHR to a clinical database.
  • 9. EHR
  • 12. Command + Tab Model: Pros Depends on humans: Humans are great at interpreting narrative documentation -where a significant portion of cancer clinical data (unfortunately) resides.
  • 13. Command + Tab Model: Cons Depends on humans: Difficult for a human to be aware of every relevant medical event of every patient within a cohort. Ignores the flux that occurs within EHRs: patient medical histories merging and splitting. Humans get bored with rote copying of discrete data. Humans quit and get new jobs.
  • 14. New Solution: NBTI Data Capture Tool The NBTI Data Capture Tool automatically pulls (via the NMEDW) relevant EHR data for each patient. Data points discretely captured in the EHR need no further review. Data points captured non-discretely in textual documents are abstracted via natural language processing (NLP) and presented to a data coordinator for review/revision.
  • 15. Why not use reports? Lots of valuable clinical data still resides in narrative documents. Not all discrete data contained within the EHR(s) has been normalized into easily queryable structures in the NMEDW. Today an investigator cannot ask an NMEDW analyst the question and get a quick result: How many IDH1 negative glioma patients survived longer than 5 years?
  • 16. Waiting for Nirvana NMEDW reports will not obsolete research clinical databases until: • Clinical IT optimizes the EHRs to discretely capture all relevant data points (ain’t happenin’). • The NMEDW normalizes all EHR data into easily digestible formats and to reference terminologies (limited by above step!).
  • 17. Sources for the first iteration Epic: support discrete data capture of fundamental treatment/diagnosis data points. Epic/MyChart: embed intake form. Cerner: support discrete data capture of pathology data points. Cerner: support explicit association between pathology cases and surgeries. MOSAIQ: support discrete data capture of site, laterality for radiation therapies.
  • 19. Analyze the Data Started with a list of data elements and sample data from a neuro-oncologist and a neurosurgeon. Determined obtainability of each data element: • Discrete in the EHR and in the EDW. • Discrete in the EHR but not in the EDW. • Narrative document in the EHR and in the EDW. • Narrative document in the EHR but not in the EDW.
  • 20. Build an EDW Data Mart Engaged the NMEDW team to build an NBTI-dedicated data mart and extract, transform, load (ETL) script covering: patients, encounters, medications, surgeries, surgery notes, pathology cases, gamma knifes, radiation therapies, imaging exams, progress notes, labs.
  • 21. Build a Clinical Database Build a clinical database mirroring the structure of the EDW data mart in a PostgreSQL server Add database structures to allow for the layering of curated data on top of data imported from the EDW.
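The layering idea on slide 21 can be sketched in a few lines: imported EDW values form a base layer, and any value a data coordinator has curated for the same record and field takes precedence. This is a minimal illustration only; the class names, field names and sample values are hypothetical and are not taken from the NBTI schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImportedValue:
    """A value pulled from the EDW data mart (hypothetical shape)."""
    record_id: str
    field_name: str
    value: Optional[str]


@dataclass
class CuratedValue:
    """A value confirmed or revised by a data coordinator (hypothetical shape)."""
    record_id: str
    field_name: str
    value: Optional[str]
    curator: str


def effective_values(imported, curated):
    """Return {(record_id, field_name): value}; curated rows override imported rows."""
    merged = {(row.record_id, row.field_name): row.value for row in imported}
    for row in curated:
        merged[(row.record_id, row.field_name)] = row.value
    return merged


# Illustrative data: the EDW has the surgery date but no discrete site.
imported = [
    ImportedValue("surgery-1", "site", None),
    ImportedValue("surgery-1", "date", "2013-04-02"),
]
curated = [CuratedValue("surgery-1", "site", "frontal lobe", "coordinator-a")]
merged = effective_values(imported, curated)
```

Keeping the two layers in separate structures means a fresh import can overwrite the base layer without touching anything a coordinator has curated.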
  • 22. Import Data Expose the data in the EDW data mart as web services via SQL Reporting Service reports. Automate via cron jobs the pull of data into the clinical database from the EDW with shared EDW web service adapter code.
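A scheduled import like the one on slide 22 is easiest to operate when it can be re-run without creating duplicates. The sketch below assumes each EDW row carries a stable source identifier; it uses Python's built-in `sqlite3` as a stand-in for the PostgreSQL clinical database, and the `encounters` table, its columns and the sample rows are hypothetical.

```python
import sqlite3


def import_rows(conn, rows):
    """Insert new encounters and refresh existing ones; safe to re-run."""
    inserted = updated = 0
    for row in rows:
        cur = conn.execute(
            "UPDATE encounters SET patient_id = ?, seen_on = ? WHERE source_id = ?",
            (row["patient_id"], row["seen_on"], row["source_id"]),
        )
        if cur.rowcount == 0:
            # No existing row matched the source identifier, so this is new data.
            conn.execute(
                "INSERT INTO encounters (source_id, patient_id, seen_on) VALUES (?, ?, ?)",
                (row["source_id"], row["patient_id"], row["seen_on"]),
            )
            inserted += 1
        else:
            updated += 1
    conn.commit()
    return inserted, updated


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE encounters (source_id TEXT PRIMARY KEY, patient_id TEXT, seen_on TEXT)"
)
# Stand-in for rows returned by an EDW web service call.
batch = [
    {"source_id": "e1", "patient_id": "p1", "seen_on": "2013-01-05"},
    {"source_id": "e2", "patient_id": "p2", "seen_on": "2013-02-11"},
]
first_run = import_rows(conn, batch)
second_run = import_rows(conn, batch)
row_count = conn.execute("SELECT COUNT(*) FROM encounters").fetchone()[0]
```

Running the same batch twice leaves two rows in the table: the first run inserts both, the second only refreshes them.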
  • 23. Patients The criteria for inclusion within the NBTI system are determined by a list of ICD diagnosis codes. Criteria could alternatively be determined by consent to a protocol. Pull from the NMEDW patient name, birth date, MRN(s), gender, ethnicity, race, death date and last seen date (across Northwestern Medicine - NM).
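Cohort inclusion by diagnosis code, as described on slide 23, amounts to a filter over (patient, code) pairs. The sketch below is an assumption-laden example: the prefixes are illustrative only (ICD-9 191.x and ICD-10 C71.x, malignant neoplasm of brain) and are not the actual NBTI inclusion list; the function name and sample MRNs are hypothetical.

```python
# Illustrative prefixes only, not the real NBTI inclusion list.
INCLUSION_PREFIXES = ("191", "C71")


def eligible_patients(diagnoses, prefixes=INCLUSION_PREFIXES):
    """Return the sorted MRNs with at least one diagnosis code starting with a listed prefix."""
    return sorted(
        {mrn for mrn, code in diagnoses if code.upper().startswith(prefixes)}
    )


# (MRN, ICD code) pairs as they might arrive from a diagnosis table.
diagnoses = [
    ("mrn-1", "C71.1"),
    ("mrn-2", "I10"),
    ("mrn-3", "191.9"),
    ("mrn-1", "E11.9"),
]
cohort = eligible_patients(diagnoses)
```

Swapping the criterion to protocol consent would only change the predicate inside the set comprehension.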
  • 24. Integrate with Specimen Inventory Data Prepare data for migration into PathCore's specimen inventory system BSI2. Allow for ad hoc query exploration of specimen inventory based on clinical data points. Standardizing the structure of clinical data captures across sites makes this possible.
  • 25. NLP Build an NLP pipeline to abstract discrete data points from the flow of narrative documents and textual fragments. Use the Stanford NLP library for chunking and sentence splitting. Use the LingScope library to parse the negation scope of sentences. Use the NCI Metathesaurus for synonym lookups and attaching codes.
  • 26. Electronic Intake Form Deploy an electronic intake form that can be filled out by a patient before or at their first clinic visit. An email is sent; the form can be filled out in a web browser, on a tablet or (painfully) on a smartphone.
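The three stages on slide 25 (sentence splitting, negation scope, synonym lookup with codes) can be illustrated with a toy pipeline. This is not the Stanford NLP, LingScope or NCI Metathesaurus API: a regular expression stands in for the sentence splitter, a short cue list stands in for negation-scope parsing, and a small dictionary with made-up concept identifiers stands in for the terminology lookup.

```python
import re

# Hypothetical stand-in for a terminology lookup; the identifiers are made up.
SYNONYMS = {
    "glioblastoma": "CONCEPT:GLIOBLASTOMA",
    "gbm": "CONCEPT:GLIOBLASTOMA",
    "oligodendroglioma": "CONCEPT:OLIGODENDROGLIOMA",
}
# Simplified cue list; a real negation parser resolves scope far more carefully.
NEGATION_CUES = ("no evidence of", "negative for", "without")


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def abstract(text):
    """Return one finding per recognized term, flagged as negated or not."""
    findings = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        for term, concept in SYNONYMS.items():
            match = re.search(r"\b" + re.escape(term) + r"\b", lowered)
            if not match:
                continue
            # Treat the term as negated if a cue appears earlier in the sentence.
            before = lowered[: match.start()]
            negated = any(cue in before for cue in NEGATION_CUES)
            findings.append(
                {"concept": concept, "negated": negated, "sentence": sentence}
            )
    return findings


report = "Sections show glioblastoma, WHO grade IV. No evidence of oligodendroglioma."
findings = abstract(report)
```

Each finding keeps the source sentence so that a reviewer can see the evidence behind the suggestion.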
  • 29. Biopsy, Surgery and Pathology Diagnosis Pull from the NMEDW NM biopsies, surgeries, surgical procedure reports and pathology cases (inside and out). Abstract and allow for the confirmation/revision of surgery type, site, laterality, pathology diagnosis, grade, recurrence, anatomical location of primary, cancer staging and pathology test results.
  • 30. Present the NLP Abstractions Present NLP-derived abstractions as queues of work that the data coordinator needs to confirm or revise.
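A work queue of NLP suggestions, as on slide 30, can be modeled as a list of pending items that a coordinator either confirms or revises. The sketch below is one possible shape, assuming a callback that returns the coordinator's final value; the class, the status names and the sample items are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Abstraction:
    """One NLP-suggested value awaiting review (hypothetical shape)."""
    document_id: str
    field_name: str
    suggested: str
    status: str = "pending"
    final: str = ""


def review(queue, decide):
    """Pop each pending item, record the reviewer's final value and the outcome."""
    done = []
    while queue:
        item = queue.popleft()
        item.final = decide(item)
        item.status = "confirmed" if item.final == item.suggested else "revised"
        done.append(item)
    return done


queue = deque(
    [
        Abstraction("path-1", "diagnosis", "glioblastoma"),
        Abstraction("path-1", "laterality", "left"),
    ]
)
# Stand-in for the coordinator: accept everything except laterality.
decisions = {"laterality": "right"}
reviewed = review(queue, lambda item: decisions.get(item.field_name, item.suggested))
```

Recording whether each item was confirmed or revised also yields a simple running measure of how often the NLP suggestions are accepted.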
  • 35. Gamma Knife Radiation Therapy Pull from the NMEDW NM radiation therapies. Abstract and allow for the confirmation/revision of site and laterality.
  • 36. Intravenous Chemotherapy Pull from the NMEDW NM intravenous chemotherapy treatments (from Intellidose and Epic Beacon).
  • 37. Labs and Other Medications Labs •  Pull from the NMEDW NM labs. Other Medications •  Pull from the NMEDW NM non-intravenous, prescribed/ordered medications. •  Allow for confirmation/revision of drug, route, duration, amount, patient parameter and administered.
  • 38. Imaging Exams and Clinic Visits Imaging Exams • Pull from the NMEDW NM imaging exams. • Abstract and allow for confirmation/revision of response/progression declarations and lesion measurements. Clinic Visits • Pull from the NMEDW NM clinic visit notes. • Abstract and allow for confirmation/revision of performance status declarations and tracking of outside treatments.
  • 39. Reporting Ad-hoc query exploration of data. Integrate NMH quality metrics. Generate Kaplan-Meier survival curves against SEER data on demand.
  • 41. Export Export into Word, Excel, CSV for analysis and visualization by SAS, SPSS, R, etc.
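For the survival curves mentioned on slide 39, the Kaplan-Meier estimator multiplies, at each observed death time, the fraction of at-risk patients who survive it. The sketch below implements that product-limit formula directly on made-up follow-up data; it does not cover the comparison against SEER data, and the function name and inputs are hypothetical.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each time with an observed death.

    durations: follow-up time per patient (for example, months).
    events: True if the patient died at that time, False if censored.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        deaths = sum(1 for d, e in zip(durations, events) if d == t and e)
        leaving = sum(1 for d in durations if d == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        at_risk -= leaving
    return curve


# Made-up follow-up data: five patients, two of them censored.
durations = [6, 12, 12, 18, 24]
events = [True, True, False, True, False]
curve = kaplan_meier(durations, events)
```

With these inputs the estimated survival drops to roughly 0.8, 0.6 and 0.3 at months 6, 12 and 18; the censored patients lower the at-risk count without producing a drop of their own.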