Temporal information extraction in the general and clinical domain

Published on

30th October 2014
School of Computer Science Research Symposium 2014
The University of Manchester

Published in: Science

  1. School of Computer Science Research Symposium 2014. Temporal information extraction in the general and clinical domain. Michele Filannino (filannim@cs.man.ac.uk). Supervisor: Goran Nenadic. Co-Supervisor: Gavin Brown. Manchester, 30/10/2014.
  2. presentation: Research Symposium 2014 (30/10/2014, Manchester). Research areas: Natural Language Processing, Linguistics, Parallel computing, Machine Learning, Semi-structured data, Statistics, Text Mining
  3. temporal information extraction: Temporal aspects of events provide a natural mechanism for organising information
     ■ source: written texts
     ■ goal: a (machine-understandable) temporal representation of the texts
     ■ easy for people
     ■ hard for machines
  4. linguistic key concepts
     ■ temporal expressions: phrases denoting a temporal entity such as an interval or a time point
       ● 01/05/2014, March 15, the next week, Saturday, at that time, yesterday, 5 o’clock, 3 days, every 4 hours
     ■ events: phrases denoting eventuality and states
       ● inflected verbs and nouns: spoken, deliver, will be published
     ■ links: temporal relation between two phrases
       ● BEFORE, AFTER, INCLUDES, ENDS, DURING, BEGINS
     Source: ISO-TimeML (ISO/TC37/SC 4 N412), rev. 12, 2007
  5. example
     ■ Yesterday, Deutsche Bank released a note saying that China's current economic policies would result in an enormous surge in coal consumption over the next decade.
     Source: CNN news article published on 28th February 2010.
  6. example: temporal expressions
     ■ Yesterday(T), Deutsche Bank released a note saying that China's current economic policies would result in an enormous surge in coal consumption over the next decade(T).
       ● Yesterday: value “2010-02-27”, type DATE
       ● the next decade: value “P10Y”, type DURATION
     Source: CNN news article published on 28th February 2010.
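The normalised values on the slide above can be reproduced with a few lines of Python. This is only a minimal sketch, not the author's normaliser: the function names and the three-word lexicon are hypothetical, and the document creation time is taken from the example (28th February 2010).

```python
from datetime import date, timedelta


def normalise_relative_date(expression: str, creation_date: date) -> str:
    """Resolve a relative date word against the document creation time."""
    # Illustrative lexicon only; a real normaliser covers far more expressions.
    offsets = {"yesterday": -1, "today": 0, "tomorrow": 1}
    key = expression.strip().lower()
    if key not in offsets:
        raise ValueError(f"unsupported expression: {expression!r}")
    return (creation_date + timedelta(days=offsets[key])).isoformat()


def decade_duration(decades: int = 1) -> str:
    """Build an ISO 8601 duration string for a number of decades."""
    return f"P{10 * decades}Y"


creation_time = date(2010, 2, 28)
print(normalise_relative_date("Yesterday", creation_time))  # 2010-02-27
print(decade_duration())  # P10Y
```

Anchoring every relative expression to the creation time is what makes "Yesterday" resolvable at all; the same word in a document dated differently yields a different value.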
  7. example: events
     ■ Yesterday(T), Deutsche Bank released(E) a note saying(E) that China's current economic policies would result(E) in an enormous surge(E) in coal consumption over the next decade(T).
       ● released: class OCCURRENCE
       ● saying: class REPORTING
       ● result: class OCCURRENCE
       ● surge: class OCCURRENCE
     Source: CNN news article published on 28th February 2010.
  8. example: links
     ■ Yesterday(T), Deutsche Bank released(E) a note saying(E) that China's current economic policies would result(E) in an enormous surge(E) in coal consumption over the next decade(T).
     Link labels shown on the slide: is included, is included, after, is included
     Source: CNN news article published on 28th February 2010.
  9. example: ISO-TimeML output
     <TimeML … xsi:noNamespaceSchemaLocation="http://timeml.org/timeMLdocs/TimeML_1.2.1.xsd">
     <DOCID>nyt_20100228_china_pollution</DOCID>
     <DCT><TIMEX3 functionInDocument="CREATION_TIME" tid="t0" type="DATE" value="2010-02-28">2010-02-28</TIMEX3></DCT>
     <TITLE>As Pollution Worsens in China, Solutions Succumb to Infighting</TITLE>
     <TEXT>
     <TIMEX3 tid="t1" type="DATE" value="2010-02-27">Yesterday</TIMEX3>, Deutsche Bank <EVENT class="OCCURRENCE" eid="e1">released</EVENT> a note <EVENT class="REPORTING" eid="e2">saying</EVENT> that China's <TIMEX3 tid="t2" type="DATE" value="PRESENT_REF">current</TIMEX3> economic policies would <EVENT class="OCCURRENCE" eid="e3">result</EVENT> in an enormous <EVENT class="OCCURRENCE" eid="e4">surge</EVENT> in coal <EVENT class="OCCURRENCE" eid="e5">consumption</EVENT> over <TIMEX3 tid="t3" type="DURATION" value="P10Y">the next decade</TIMEX3>.
     </TEXT>
     <TLINK eventInstanceID="ei1" lid="l52" relType="IS_INCLUDED" relatedToTime="t1"/>
     <TLINK eventInstanceID="ei4" lid="l53" relType="IS_INCLUDED" relatedToTime="t2"/>
     <TLINK eventInstanceID="ei2" lid="l54" relType="IS_INCLUDED" relatedToTime="t1"/>
     <TLINK eventInstanceID="ei4" lid="l59" relType="AFTER" relatedToEventInstance="ei1"/>
     </TimeML>
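An annotation like the one above can be read back with Python's standard `xml.etree.ElementTree`. The sketch below uses an abridged copy of the slide's markup: the schema attribute, `DOCID`, `TITLE` and some tags are left out so the snippet parses on its own. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Abridged version of the slide's annotation (assumption: the namespace
# attribute and several elements are omitted to keep the snippet standalone).
TIMEML = """<TimeML>
<DCT><TIMEX3 functionInDocument="CREATION_TIME" tid="t0" type="DATE" value="2010-02-28">2010-02-28</TIMEX3></DCT>
<TEXT><TIMEX3 tid="t1" type="DATE" value="2010-02-27">Yesterday</TIMEX3>, Deutsche Bank <EVENT class="OCCURRENCE" eid="e1">released</EVENT> a note <EVENT class="REPORTING" eid="e2">saying</EVENT> that China's current economic policies would result in an enormous surge in coal consumption over <TIMEX3 tid="t3" type="DURATION" value="P10Y">the next decade</TIMEX3>.</TEXT>
<TLINK eventInstanceID="ei1" lid="l52" relType="IS_INCLUDED" relatedToTime="t1"/>
</TimeML>"""

root = ET.fromstring(TIMEML)

# Temporal expressions: identifier, type, normalised value, surface text.
timexes = [(t.get("tid"), t.get("type"), t.get("value"), t.text)
           for t in root.iter("TIMEX3")]

# Events: identifier, class, surface text.
events = [(e.get("eid"), e.get("class"), e.text) for e in root.iter("EVENT")]

# Links: source event instance, relation, target (a time or another event).
links = [(l.get("eventInstanceID"), l.get("relType"),
          l.get("relatedToTime") or l.get("relatedToEventInstance"))
         for l in root.iter("TLINK")]

for row in timexes + events + links:
    print(row)
```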
  10. visual representation (timeline figure). Utterance time: 28th February 2010. Timeline points: 27 Feb. 2010, now, 2020. Events shown: released, saying, surge.
  11. proposed approach
      ■ Data-driven rather than rule-based
      ■ no pre-existing tools available
      ■ Conditional Random Fields (Linear chain)
        ● sequences of labels, for sequences of samples
  12. proposed approach
      ■ Data-driven rather than rule-based
      ■ no pre-existing tools available
      ■ Conditional Random Fields (Linear chain)
        ● sequences of labels, for sequences of samples (annotation: words)
  13. proposed approach
      ■ Data-driven rather than rule-based
      ■ no pre-existing tools available
      ■ Conditional Random Fields (Linear chain)
        ● sequences of labels, for sequences of samples (annotations: sentences, words)
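To make "sequences of labels, for sequences of samples" concrete, the sketch below turns one tokenised sentence into one label per word, using the BIO scheme that the post-processing slide mentions. The function name, the `TIMEX` label and the token-index span format are assumptions made for illustration; the slides do not show the actual encoding used to train the CRFs.

```python
def bio_encode(tokens, spans, label="TIMEX"):
    """Return one BIO tag per token.

    spans is a list of (start, end) token indices, end exclusive.
    """
    tags = ["O"] * len(tokens)  # O = outside any temporal expression
    for start, end in spans:
        tags[start] = f"B-{label}"  # B = first token of an expression
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"  # I = continuation of the expression
    return tags


tokens = ["coal", "consumption", "over", "the", "next", "decade", "."]
tags = bio_encode(tokens, [(3, 6)])  # "the next decade"
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```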
  14. post-processing pipeline (5x10-fold cross validation)
      ■ Probabilistic correction
      ■ BIO fixer
      ■ Threshold-based label switcher
      Diagram labels: CRFs, PCM, BIO fixer, TbLS, BIO fixer
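The slide does not say what the BIO fixer does internally. One common repair, sketched here as an assumption rather than as the author's method, is to promote an `I-` tag to `B-` whenever it does not continue an expression of the same type. The function name and the `TIMEX` label are hypothetical.

```python
def fix_bio(tags):
    """Repair label sequences in which an I- tag has no matching predecessor."""
    fixed = []
    previous = "O"
    for tag in tags:
        if tag.startswith("I-"):
            label = tag[2:]
            # An I- tag is valid only right after B- or I- of the same label.
            if previous not in (f"B-{label}", f"I-{label}"):
                tag = f"B-{label}"
        fixed.append(tag)
        previous = tag
    return fixed


print(fix_bio(["O", "I-TIMEX", "I-TIMEX", "O", "B-TIMEX", "I-TIMEX"]))
```

A step like this is useful after any module that changes labels one token at a time, since such changes can leave a sequence that is no longer valid BIO.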
  15. TempEval-3 results (slide legend: Rule-based, Machine learning-based)
      Research group                              Identification        Normalisation  Overall
                                                  Prec.  Rec.   F1      accuracy       score
      The University of Heidelberg                0.93   0.88   0.9     0.86           0.776
      US Naval Academy                            0.89   0.91   0.9     0.79           0.71
      The University of Manchester                0.95   0.85   0.9     0.77           0.69
      Stanford University                         0.89   0.91   0.9     0.75           0.674
      AT&T Lab Research                           0.98   0.75   0.85    0.77           0.656
      University of Colorado Boulder              0.94   0.87   0.9     0.72           0.647
      Jadavpur University                         0.93   0.8    0.86    0.74           0.638
      Katholieke Universiteit Leuven              0.93   0.76   0.84    0.75           0.63
      Joint Research Centre European Commission   0.9    0.8    0.85    0.68           0.582
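The columns in the results above are related by simple arithmetic. F1 is the harmonic mean of precision and recall. The rows shown are also consistent with the overall score being F1 multiplied by normalisation accuracy, though the slide does not state this, so treat it as an assumption. A short check on the Manchester row:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values from the University of Manchester row.
manchester_f1 = f1_score(0.95, 0.85)

# Assumption: overall score = identification F1 x normalisation accuracy.
manchester_overall = manchester_f1 * 0.77

print(round(manchester_f1, 2))       # 0.9
print(round(manchester_overall, 2))  # 0.69
```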
  16. model selection: 93 features, 4 models (5x10-fold cross validation)
      ■ M1: morpho-lexical only
      ■ M2: morpho-lexical + syntactic
      ■ M3: morpho-lexical + gazetteers
      ■ M4: morpho-lexical + gazetteers + WordNet
      Source: Filannino, M., and Nenadic, G. ManTIME: Temporal expression extraction with systematic feature type selection and a posteriori label adjustment. Journal of Information processing and Management: Special Issue on Time and Information Retrieval, (2014), Elsevier. (under review)
  17. i2b2 shared Task ‘12
      Clinical narrative: ADMISSION DATE: 2011-02-06; DISCHARGE DATE: 2011-02-08; HISTORY OF PRESENT ILLNESS: Mr. Pohl is a 53 - year-old male with history of alcohol use and hypertension. Blood alcohol level was 383. Agitated in emergency room requiring 4 leather restraints, received 5 mg of Haldol, 2 mg of Ativan. He became hypotensive in the emergency room with a systolic blood pressure in the 80 's and had decreased respiratory rate. He received a normal saline bolus of 2 litres of good blood pressure response. The patient was then admitted to the medical Intensive Care Unit for observation and then transferred to our service on medicine when the blood pressures remained stable overnight...
      Timeline figure: dates 06/02/2011, 07/02/2011, 08/02/2011; rows General, Tests, Treatments, Problems; items shown: admission, discharge, transfer, BAL 383, SBP ~80, stable SBP, Haldol 4mg, Ativan 2mg, Saline bolus 2l, blood pressure medications, hypotensive, decreased respiratory rate, stable, hands tremor improved.
      Source: Kovačević, A., Dehghan, A., Filannino, M., Keane, J. A., and Nenadic, G. Combining rules and machine learning for extraction of temporal expressions and events from clinical narratives. Journal of the American Medical Informatics Association (2013).
  18. clinical data
      ■ disease progression modelling
      ■ analysis of the effectiveness of treatments
      ■ extraction of patient’s clinical pathway
  19. Better software, better research…
  20. temporal footprint: A temporal footprint is a continuous period on the time-line that temporally defines the existence of a particular concept.
      Source: Filannino, M., Nenadic G. Mining temporal footprints from Wikipedia. Proceedings of the First AHA!-Workshop on Information Discovery in Text. (COLING 2014) (Dublin, Ireland, August 2014), ACL.
  21. evaluation
      ■ subjects: people
      ■ lived from 1000 AD to 2014
        ● text from Wikipedia web pages
        ● year of birth and death from DBpedia
      ■ 228,824 people collected
      ■ simple definition of temporal footprint
        ● birth and death dates
  22. results
      ■ Galileo Galilei (1564-1642), prediction: 1556-1654
      Error: 0.204
  23. results
      ■ Computer (1940-today), prediction: 1882-1982
      Source: http://www.cs.man.ac.uk/~filannim/projects/temporal_footprints/
  24. application? Source: http://start.csail.mit.edu/answer.php?query=
  25. temporal intent of queries: Can we predict the temporal intent of search engine’s user queries?
      ■ input: queries & submission date
      ■ output: past, present, future or atemporal
      Source: Filannino, M., Nenadic G. Using machine learning to predict temporal orientation of search engines’ queries in the Temporalia challenge. In Proceedings of the Sixth International Workshop on Evaluating Information Access (EVIA 2014) (Tokyo, Japan, December 2014).
  26. queries’ temporal intent. Source: https://www.google.co.uk/search?q=google+stock+price
  27. queries’ temporal intent. Source: https://www.google.co.uk/search?q=weather+forecast+manchester
  28. results (bar chart). Accuracy axis: 0 to 100. Bars: Full, Intermediate, Minimal, Minimal fixed. Marked on the chart: 1st ranked system. Values shown: 72.33%, 66.33%, 61.33%, 55.00%.
  29. just the tip of the iceberg…
      ■ “I’ve played Tennis for 10 years” vs. “I’ve played Tennis for 4 hours.”
      ■ How can they be ported to different domains?
      ■ How can they be adapted to different languages?
      ■ Is ISO-TimeML enough to cover different domains/languages?
      Figure label: RESEARCH
  30. Thank you.
  31. publications
      ■ Non peer-reviewed:
        ● Filannino, M. Temporal expression normalisation in natural language texts. CoRR abs/1206.2010 (2012).
      ■ Peer-reviewed:
        ● Kovačević, A., Dehghan, A., Filannino, M., Keane, J. A., and Nenadic, G. Combining rules and machine learning for extraction of temporal expressions and events from clinical narratives. Journal of the American Medical Informatics Association (2013).
        ● Filannino, M., Brown, G., and Nenadic, G. ManTIME: Temporal expression identification and normalization in the TempEval-3 challenge. Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), ACL.
        ● Filannino, M., Nenadic G. Mining temporal footprints from Wikipedia. Proceedings of the First AHA!-Workshop on Information Discovery in Text. (COLING 2014) (Dublin, Ireland, August 2014), ACL.
        ● Filannino, M., Nenadic G. Using machine learning to predict temporal orientation of search engines’ queries in the Temporalia challenge. In Proceedings of the Sixth International Workshop on Evaluating Information Access (EVIA 2014) (Tokyo, Japan, December 2014).
      ■ Under review:
        ● Filannino, M., and Nenadic G. ManTIME: Temporal expression extraction with systematic feature type selection and a posteriori label adjustment. Journal of Information processing and Management: Special Issue on Time and Information Retrieval, (2014), Elsevier.
  32. Questions? Contact: filannim@cs.man.ac.uk
