i2b2 Challenge 2009 and Our Participation


 Irena Spasic
 Farzaneh Sarafraz
 Goran Nenadic
Summer 2009
About i2b2

Informatics for Integrating Biology & the Bedside
Related to NIH
3 Shared tasks so far
The task: Medication Extraction

Given
   Discharge reports
Wanted
   Medication mention
   Dose
   Mode of application
   Frequency
   Duration
   Reason
Other
   Event
   Temporal
   Certainty
   List/narrative
Example
Ciprofloxacin 500 mg q.6h. for remaining four 
  doses baby aspirin 81 mg daily , Lasix 40 mg 
  b.i.d. , for three days along with potassium 
  chloride slow release 20 mEq b.i.d. for three 
  days , Motrin 400 mg q.8h. p.r.n. Pain


The patient had received a total of five units 
  of packed red blood cells due to blood loss
Regulations/requirements

Medical requirements
  Drug taken by patient
  No allergies
  No food, water, diet, tobacco, alcohol, illicit drugs
Linguistic requirements
  Reason is the most informative base adjective phrase or the longest base noun phrase
Required output

Event-based annotation
Repeat individual mention for each event
  “Aspirin for headache and for leg pain”
     Aspirin … headache
     Aspirin … leg pain
Semantic-level expectations
  NITROGLYCERIN 1/150 ( 0.4 MG ) 1 TAB SL q5min x 3
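
The event-based requirement above can be sketched as a small expansion step that emits one entry per medication-reason pair. This is only an illustration: the function name `expand_events` and the `m`/`r` keys are assumptions made for this sketch, not the official challenge output format.

```python
from typing import Dict, List, Optional


def expand_events(medication: str, reasons: List[str]) -> List[Dict[str, Optional[str]]]:
    """Repeat the medication mention once per reason (hypothetical entry format)."""
    if not reasons:
        # No reason found: still emit the medication once, with an empty reason slot.
        return [{"m": medication, "r": None}]
    return [{"m": medication, "r": reason} for reason in reasons]


# "Aspirin for headache and for leg pain" yields two separate entries.
for entry in expand_events("Aspirin", ["headache", "leg pain"]):
    print(entry["m"], "...", entry["r"])
```

How the reasons are found in the text is a separate step; this sketch only covers the repetition of the mention.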
Training and test data

Ground Truth, 27 records
  Manually annotated by “PG students”
  Scrutinised by the community
  Relative f-score: ~60%
Unannotated training data: 620
Test data: 260
Our system

Linguistic Preprocessing
  Input: plain ASCII
  Output: XML
Rules
  MinorThird
Template Filling
Preprocessing

Split sentences
  A sentence and paragraph breaker
  NaCTeM: sptoolkit.jar
POS tagging
  A part-of-speech tagger for English
  Tsujii: postagger
Parsing (chunking)
  CFG parser
  Tsujii: chunkparser
Rules

Medication Dictionary (> 1000)
Morphological: medication affix (> 100)
  -bicine, -caine, etc.
Precedes a mode
  Inhaler, supplement, etc.
Medication type
  Cardiac, cardiovascular (~100)
Symptoms (~100)
  Chest discomfort, etc.
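
A minimal sketch of how the dictionary and affix rules on this slide might be combined to flag medication candidates. The two suffixes come from the slide; the dictionary entries are toy examples taken from the Example slide, and the function name `is_medication_candidate` is hypothetical. The actual system used far larger lists and MinorThird rules, which are not reproduced here.

```python
# Toy stand-ins for the real resources (> 1000 dictionary entries, > 100 affixes).
MEDICATION_DICTIONARY = {"aspirin", "lasix", "motrin", "ciprofloxacin"}
MEDICATION_SUFFIXES = ("bicine", "caine")


def is_medication_candidate(token: str) -> bool:
    """Flag a token found in the dictionary or ending in a known medication suffix."""
    word = token.lower().strip(".,;:")
    return word in MEDICATION_DICTIONARY or word.endswith(MEDICATION_SUFFIXES)


for token in ["Lasix", "Lidocaine", "days", "Pain"]:
    print(token, is_medication_candidate(token))
```

A purely morphological rule like this over-generates, which is presumably why the slide pairs it with context rules such as "precedes a mode".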
Word lists and regular expressions

Dosage, mode, frequency
Duration (while, for, etc.)
Reason
  Head
     Diseases
     Symptoms (pain, agitation, etc.) ~20
     Affixes (hyper-, -emia, etc.)
  Modifier (acute, chronic, etc.) <100
Time phrases, Body parts
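
The word-list and regular-expression approach can be illustrated with a few patterns tried against phrases from the Example slide. These patterns are assumptions written for this sketch and cover only a handful of units and abbreviations; they are not the expressions used in the actual system.

```python
import re

# Illustrative patterns only; unit and abbreviation lists are deliberately short.
DOSE_RE = re.compile(r"\b\d+(?:\.\d+)?\s*(?:mg|mEq|mcg|ml|g)\b", re.IGNORECASE)
FREQ_RE = re.compile(r"\bq\.?\d+h\.?|\b[bt]\.i\.d\.|\bdaily\b|\bp\.r\.n\.", re.IGNORECASE)
DURATION_RE = re.compile(r"\bfor\s+\w+\s+(?:days?|weeks?|months?)\b", re.IGNORECASE)


def extract_fields(text: str) -> dict:
    """Return every dosage, frequency and duration match found in the text."""
    return {
        "do": DOSE_RE.findall(text),
        "f": FREQ_RE.findall(text),
        "du": DURATION_RE.findall(text),
    }


print(extract_fields("Lasix 40 mg b.i.d. , for three days"))
print(extract_fields("Motrin 400 mg q.8h. p.r.n. Pain"))
```

The keys `do`, `f` and `du` mirror the field abbreviations that appear in the evaluation slides.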
Producing output

Remove allergies
Remove laboratory results
Merge labels
  <m>INSULIN</m> <m>GLARGINE</m>
  <f>after dialysis</f> on <f>Monday</f>-<f>Wednesday</f>-<f>Friday</f>

Remove negated medications
  “patient instructed not to take Viagra.”

etc.
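
Two of the post-processing steps on this slide, merging labels and removing negated medications, can be sketched as follows. This version merges only same-label spans separated by whitespace and detects negation with a short cue list; the cue list and both function names are hypothetical, and the real system may have handled more cases (for example spans joined by "on" or a hyphen).

```python
import re

# A closing tag, whitespace, then an opening tag with the same label.
ADJACENT_SAME_LABEL = re.compile(r"</(\w+)>(\s+)<\1>")

# Hypothetical cue list; only the first cue is taken from the slide's example.
NEGATION_CUES = ("not to take", "do not take", "no longer taking")


def merge_labels(tagged: str) -> str:
    """Fuse neighbouring spans that carry the same label into a single span."""
    return ADJACENT_SAME_LABEL.sub(r"\2", tagged)


def is_negated(sentence: str, medication: str) -> bool:
    """True if a negation cue occurs before the medication in the sentence."""
    lowered = sentence.lower()
    position = lowered.find(medication.lower())
    if position < 0:
        return False
    return any(cue in lowered[:position] for cue in NEGATION_CUES)


print(merge_labels("<m>INSULIN</m> <m>GLARGINE</m>"))
print(is_negated("patient instructed not to take Viagra.", "Viagra"))
```

Spans with different labels, such as a medication followed by a dosage, are left untouched by `merge_labels`.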
Evaluation process

Small training data (27)
  Organisers
  Community
Gold standard test data (260)
  Annotated by participants
  Merge and tie-break
  Community
Silver data (620)
  Voting
Evaluation on ground truth
inexact  horizontal  system-level   X   0.8776
inexact  horizontal  patient-level  X   0.8928
inexact  vertical    system-level   do  0.9150
inexact  vertical    patient-level  do  0.9160
inexact  vertical    system-level   f   0.9172
inexact  vertical    patient-level  f   0.9197
inexact  vertical    system-level   mo  0.9441
inexact  vertical    patient-level  mo  0.9471
inexact  vertical    system-level   m   0.9544
inexact  vertical    patient-level  m   0.9519
inexact  vertical    system-level   r   0.5260
inexact  vertical    patient-level  r   0.3876
inexact  vertical    system-level   du  0.7958
inexact  vertical    patient-level  du  0.5846
Preliminary evaluation on test data
inexact  horizontal  system-level   X   0.7847
inexact  horizontal  patient-level  X   0.7755
inexact  vertical    system-level   do  0.8267
inexact  vertical    patient-level  do  0.8155
inexact  vertical    system-level   f   0.8349
inexact  vertical    patient-level  f   0.8289
inexact  vertical    system-level   mo  0.8359
inexact  vertical    patient-level  mo  0.8256
inexact  vertical    system-level   m   0.8533
inexact  vertical    patient-level  m   0.8541
inexact  vertical    system-level   r   0.3881
inexact  vertical    patient-level  r   0.3883
inexact  vertical    system-level   du  0.51
inexact  vertical    patient-level  du  0.4969
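
To compare the two evaluations, the system-level scores from the ground-truth and test-data slides can be put side by side and the per-field drop computed. The numbers are copied from the slides; reading the field codes as medication (m), dosage (do), mode (mo), frequency (f), duration (du) and reason (r) is an assumption based on the task slide, and X is presumably the overall horizontal score.

```python
# System-level scores copied from the two evaluation slides.
ground_truth = {"X": 0.8776, "do": 0.9150, "f": 0.9172, "mo": 0.9441,
                "m": 0.9544, "r": 0.5260, "du": 0.7958}
test_data = {"X": 0.7847, "do": 0.8267, "f": 0.8349, "mo": 0.8359,
             "m": 0.8533, "r": 0.3881, "du": 0.51}

# Drop per field when moving from the 27 ground-truth records to the test data.
drops = {field: round(ground_truth[field] - test_data[field], 4)
         for field in ground_truth}

for field, drop in sorted(drops.items(), key=lambda item: item[1], reverse=True):
    print(f"{field:>2}  {drop:.4f}")

largest = max(drops, key=drops.get)
print("largest drop:", largest)
```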
