Extraction of Drug-Drug Interactions from Biomedical Texts

Authors: Isabel Segura-Bedmar, Paloma Martínez, María Herrero-Zazo
SemEval-2013 Task 9: Semantic Evaluation Exercises, International Workshop on Semantic Evaluation, Atlanta, Georgia (June 14-15, 2013)

  1. SemEval-2013 Task 9: Extraction of Drug-Drug Interactions from Biomedical Texts
     Isabel Segura-Bedmar, Paloma Martínez, María Herrero-Zazo
     Universidad Carlos III de Madrid, Spain
  2. Outline
     • Motivation
     • Previous Work: DDIExtraction 2011
     • New in DDIExtraction 2013
     • The DDI corpus
     • Tasks
       • Task 9.1: Drug Name Recognition and Classification
       • Task 9.2: Drug-Drug Interaction Extraction
     • Conclusions
  3. Motivation: What is a Drug-Drug Interaction (DDI)?
     • A DDI occurs when one drug influences the level or the activity of another drug.
     • A DDI can be beneficial, but in most cases DDIs are dangerous for patients and can increase healthcare costs.
     • The medical literature is the most effective source for detecting DDIs.
  4. Information Extraction
     We thank the team at the Humboldt-Universitaet zu Berlin for making available a visualization of the DDI corpus using Stav: http://corpora.informatik.hu-berlin.de/, https://github.com/TsujiiLaboratory/stav
  5. Previous Work: DDIExtraction 2011
     • Automatic extraction of drug-drug interactions from texts.
     • Dataset: a collection of 579 documents from DrugBank.
       • DDIs annotated by a pharmacist.
       • Drugs automatically annotated.
     • F1 ranged between 0.16 and 0.66.
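
For readers unfamiliar with the F1 measure quoted above, here is a minimal sketch of how it is usually derived from counts of correct and incorrect predictions. The function name and the counts are hypothetical and are not taken from any participating system.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from true-positive, false-positive
    and false-negative counts. Empty denominators yield 0.0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical counts, for illustration only.
p, r, f1 = precision_recall_f1(tp=60, fp=20, fn=40)
print(f"P={p:.2f} R={r:.2f} F1={f1:.2f}")
```

Because F1 is a harmonic mean, a system cannot score well by maximizing only precision or only recall.
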
  6. New in SemEval Task 9
     • Task 9.1: Drug Name Recognition and Classification.
     • Task 9.2: DDI Detection and Classification.
     • The DDI corpus:
       • Double the size: 1,025 annotated documents, 18,502 pharmacological substances and 5,028 DDIs.
       • Drugs and DDIs were manually annotated by two pharmacists.
       • Annotation guidelines and inter-annotator agreement are available.
       • Two different text sources: MedLine and DrugBank.
  7. Tasks
     • Task 9.1: Drug Name Recognition and Classification.
     • Task 9.2: Drug-Drug Interaction Extraction.
  8. Task 9.1 - Drug Classification
     • drug type for generic drugs (e.g. heparin, ibuprofen, methotrexate).
     • brand type for trade drugs (e.g. Espidifen, aspirin).
     • group type for groups of drugs (e.g. analgesics, anticoagulants).
     • drug_n type for active substances not approved for human use (e.g. picrotoxin, heroin).
  9. Task 9.1 - Teams
     Team (affiliation): approach
     • LASIGE (Lisbon University): Conditional Random Fields
     • UEM_UC3M (European University, Carlos III University of Madrid): Ontology-based approach
     • UMCC_DLSI (Matanzas University, Alicante University): J48 classifier
     • Uturku (Turku University): SVM classifier (TEES system)
     • WBI (Humboldt University of Berlin): Conditional Random Fields
  10. Task 9.1 Evaluation
      • Recognition (regardless of the type):
        • Exact-boundary matching (EXACT).
        • Partial-boundary matching (PARTIAL).
      • Recognition and classification:
        • Exact-boundary + type matching (STRICT).
        • Partial-boundary + type matching (TYPE).
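
The four matching criteria above can be sketched as simple predicates over character spans. This is only an illustration under stated assumptions: the `Mention` class is hypothetical, offsets are taken to be half-open (`end` is one past the last character), and "partial" is read as "the two spans overlap by at least one character". The official scorer may define partial matches differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """Hypothetical container for one annotated drug mention."""
    start: int   # offset of the first character
    end: int     # offset one past the last character (half-open, an assumption)
    label: str   # "drug", "brand", "group" or "drug_n"


def exact(gold: Mention, pred: Mention) -> bool:
    """EXACT: same boundaries, type ignored."""
    return (gold.start, gold.end) == (pred.start, pred.end)


def partial(gold: Mention, pred: Mention) -> bool:
    """PARTIAL: spans overlap by at least one character, type ignored."""
    return pred.start < gold.end and gold.start < pred.end


def strict(gold: Mention, pred: Mention) -> bool:
    """STRICT: same boundaries and same type."""
    return exact(gold, pred) and gold.label == pred.label


def type_match(gold: Mention, pred: Mention) -> bool:
    """TYPE: overlapping boundaries and same type."""
    return partial(gold, pred) and gold.label == pred.label


# Hypothetical example: gold covers "oral anticoagulants" (offsets 0-19),
# while the prediction covers only "anticoagulants" (offsets 5-19).
gold = Mention(0, 19, "group")
pred = Mention(5, 19, "group")
print(exact(gold, pred), partial(gold, pred), strict(gold, pred), type_match(gold, pred))
```

In this example the prediction counts under PARTIAL and TYPE but not under EXACT or STRICT, which is why the four scores for one system can differ noticeably.
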
  11. Task 9.1 - Overview of the results
      • Groups and non-approved substances are more difficult than drugs and brands:
        • Brand names: short and unique.
        • Generic names: no ambiguity because they are simplified chemical names.
        • Group names can be ambiguous (e.g. anticoagulant, anti-retroviral).
        • Group names: many variants and abbreviations.
  12. Task 9.1 - Overview of the results
      • drug_n was the most difficult type:
        • Very scarce in DrugBank (less than 1%).
        • Less clearly defined in the guidelines.
        • Systems are able to identify these mentions, but fail to classify them.
  13. Tasks
      • Task 9.1: Drug Name Recognition and Classification.
      • Task 9.2: Drug-Drug Interaction Extraction.
  14. Task 9.2: Drug-Drug Interaction (DDI) Extraction
      • Gold annotations for drugs are provided to teams for both the training and test datasets.
  15. Task 9.2: Drug-Drug Interaction (DDI) Extraction
      • Gold annotations for drugs are provided to teams for both the training and test datasets.
      • Detect DDIs and classify them.
  16. Task 9.2: Drug-Drug Interaction (DDI) Extraction
      • Gold annotations for drugs are provided to teams for both the training and test datasets.
      • Detect DDIs and classify them.
      [Figure: annotated example with interactions labeled EFFECT, EFFECT and MECHANISM.]
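
Since gold drug mentions are given, one common way to frame the extraction step is to treat every unordered pair of drug mentions in a sentence as a candidate interaction and let a classifier decide on each pair. That framing is an assumption here: the slides do not prescribe it. A minimal sketch, with the hypothetical helper `candidate_pairs`:

```python
from itertools import combinations
from typing import List, Tuple


def candidate_pairs(drug_mentions: List[str]) -> List[Tuple[str, str]]:
    """Return every unordered pair of drug mentions from one sentence.

    A sentence with n mentions yields n * (n - 1) / 2 candidates.
    """
    return list(combinations(drug_mentions, 2))


# Hypothetical sentence-level input: three gold drug mentions.
mentions = ["heparin", "ibuprofen", "methotrexate"]
for first, second in candidate_pairs(mentions):
    print(first, "<->", second)
```

The number of candidates grows quadratically with the number of mentions, so sentences that list many drugs produce mostly non-interacting pairs.
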
  17. DDI Classification
      • mechanism type for interactions describing the way the interaction occurs.
        Example: Lansoprazole may decrease the absorption of enoxacin.
      • effect type for interactions describing the consequence of the interaction.
        Example: Additive CNS depression may occur when antihistamines are administered with barbiturates.
      • advice type for interactions describing a recommendation or advice.
        Example: Patients taking isoniazid and disulfiram concomitantly should be closely monitored.
      • int type for mentions of interactions without any additional information.
        Example: Clopidogrel interacts with omeprazole.
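
The four labels above also explain why the task reports two scores, one for detection and one for detection plus classification. The sketch below assumes that detection only asks whether a pair interacts at all, while classification additionally requires the predicted type to equal the gold type. The enum and helper names are hypothetical.

```python
from enum import Enum
from typing import Optional


class DDIType(Enum):
    MECHANISM = "mechanism"
    EFFECT = "effect"
    ADVICE = "advice"
    INT = "int"


def is_detection_hit(gold: Optional[DDIType], pred: Optional[DDIType]) -> bool:
    """True when both gold and prediction mark the pair as interacting.
    None stands for 'no interaction'. The type is ignored."""
    return gold is not None and pred is not None


def is_classification_hit(gold: Optional[DDIType], pred: Optional[DDIType]) -> bool:
    """True when the pair interacts and the predicted type equals the gold type."""
    return gold is not None and gold == pred


# Hypothetical prediction: the pair is found, but with the wrong type.
gold, pred = DDIType.EFFECT, DDIType.MECHANISM
print(is_detection_hit(gold, pred), is_classification_hit(gold, pred))
```

Under this reading every classification hit is also a detection hit, so the number of correct classifications can never exceed the number of correct detections.
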
  18. Task 9.2 Teams
      Team (affiliation): approach
      • FBK-irst (FBK-irst, Italy): Hybrid kernel + scope of negations and semantic roles
      • NIL_UCM (Complutense University of Madrid, Spain): SVM classifier
      • SCAI (Fraunhofer SCAI, Germany): SVM classifier
      • UC3M (Carlos III University of Madrid, Spain): Shallow Linguistic Kernel
      • UCOLORADO_SOM (University of Colorado, School of Medicine, USA): SVM classifier
      • Uturku (Turku University, Finland): SVM classifier (TEES system)
      • UWM_TRIADS (University of Wisconsin, USA): Two-stage SVM
      • WBI_DDI (Humboldt University of Berlin): Ensemble of kernels
  19. Task 9.2 - Results
      [Figure: two bar charts on a 0 to 1 scale. DrugBank: 0.827 and 0.676. MedLine: 0.53 and 0.42.]
  20. Task 9.2 - Overview of the results
      • Detection: significant improvement over 2011: 66% F1 (2011) vs. 82% F1 (2013).
      • In DrugBank:
        • The int DDI type is the most difficult (54% F1).
        • The mechanism, effect and advice types show similar F1 (70%).
      • In MedLine, results for the effect and mechanism types are considerably lower due to the complexity of the sentences describing these DDIs.
      • Non-linear kernel-based methods outperform linear SVMs.
  21. Conclusion
      • 13 teams from 7 different countries.
      • In both tasks, the results on DrugBank are considerably better than those on MedLine.
      • Best F1:
        • Task 9.1 (Drug NERC), recognition: DrugBank 90%, MedLine 80%.
        • Task 9.1 (Drug NERC), recognition + classification: DrugBank 87%, MedLine 58%.
        • Task 9.2 (extraction of DDIs), detection: DrugBank 82%, MedLine 53%.
        • Task 9.2 (extraction of DDIs), detection + classification: DrugBank 67%, MedLine 42%.
  22. Conclusion
      • 13 teams from 7 different countries.
      • In both tasks, the results on DrugBank are considerably better than those on MedLine.
      • Task 9.1:
        • Best system (WBI): conditional random field + the training dataset extended with the test dataset for Task 9.2.
        • Most difficult: groups and drug_n.
      • Task 9.2:
        • There is much room for improvement.
  23. Future of the task
      • Include new types of texts:
        • prescription drug documents,
        • health records,
        • texts from social media about DDIs and adverse drug events.
      • No plans for annotating new documents.
      • Goal of the next DDIExtraction:
        • Create a silver standard DDI corpus.
        • Annotate effect, mechanism, drug dosages, etc.
        • Similar to the CALBC challenge.
  24. Acknowledgments
      • This work was supported by the Regional Government of Madrid under the Research Network MA2VICMR [S2009/TIC-1542] and by the Spanish Ministry of Education under the project MULTIMEDICA [TIN2010-20644-C03-01].
      • To all participants, for their efforts; we congratulate them on their interesting work.
      • To the Uturku team, who provided TEES analyses for the training and test datasets.
      • To the WBI team, who made available a visualization of the DDI corpus using Stav.
  25. Thanks!!!
