Why Life is Difficult, and What We Might Do About It
Knowledge Discovery And Data Mining Of Free Text Final
1. Patrick Jamieson M.D. Logical Semantics, Inc. Knowledge Discovery and Data Mining of Free Text Radiology Reports
2. Acknowledgements This presentation was made possible by Grant Number 9R44RR024929-02 from the National Center for Research Resources (NCRR), a component of the National Institutes of Health (NIH). Its contents are solely the responsibility of the authors and do not necessarily represent the official view of NCRR or NIH.
4. Josette Jones, Ph.D., Assistant Professor of Informatics – ontology development and organization
6. Why do we need a semantic index? What should it look like?
24. Adenocarcinoma of colon, with hepatic metastasis. Uniqueness of Medical Data Mining, Krzysztof J. Cios and G. William Moore, Artificial Intelligence in Medicine, 2002
26. Two semantically equivalent queries usually will get different results! “Top 5 tv moments” vs. “Top five tv moments”. Tomasz Imielinski, Alessio Signorini, Jinyun Yan, Rutgers
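A minimal sketch of why the two queries on this slide can diverge, assuming a search system that compares literal tokens. The `NUMBER_WORDS` table and both helper functions are hypothetical, introduced only for illustration; they are not taken from the cited work.

```python
# Hypothetical illustration: literal token matching vs. a tiny normalization step.
NUMBER_WORDS = {"one": "1", "two": "2", "three": "3", "four": "4", "five": "5"}

def tokens(query):
    """Lowercase the query and split it on whitespace (literal matching)."""
    return query.lower().split()

def normalize(query):
    """Map spelled-out numbers to digits so equivalent queries share one form."""
    return [NUMBER_WORDS.get(token, token) for token in tokens(query)]

q1 = "Top 5 tv moments"
q2 = "Top five tv moments"

print(tokens(q1) == tokens(q2))        # literal token lists differ
print(normalize(q1) == normalize(q2))  # normalized token lists agree
```

Number words are only one narrow case of equivalence; the sketch does not address synonymy or paraphrase in general.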
28. Semantic Queries using MedLEE “Human errors can be avoided if the person who formulates the query can accurately translate the eligibility criteria into corresponding semantic classes and attributes, and thoroughly considers all possible combinations of semantic classes and attributes.” Li L, Chase HS, Patel CO, Friedman C, Weng C. Comparing ICD9-Encoded Diagnoses and NLP-Processed Discharge Summaries for Clinical Trials Pre-Screening: A Case Study. AMIA Annu Symp Proc. 2008; 2008: 404–408.
39. The left upper lobe is normal? Most NLP systems simply cannot semantically index all the information accurately.
40. Knowledge and Language Understanding Example: “Healing both bone fracture of the distal forearm.” Meaning: There is a healing distal radius fracture. There is a healing distal ulnar fracture.
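The expansion on this slide depends on background knowledge that the forearm has two bones. A rough sketch of that single step, assuming a hand-written lookup table; `FOREARM_BONES` and `expand_both_bone_fracture` are hypothetical names, not part of any system described in the presentation.

```python
# Hypothetical knowledge table: forearm bone -> modifier used in the report phrase.
FOREARM_BONES = {"radius": "radius", "ulna": "ulnar"}

def expand_both_bone_fracture(status, site):
    """Expand a 'both bone' forearm fracture into one statement per bone."""
    return [
        f"There is a {status} {site} {modifier} fracture."
        for modifier in FOREARM_BONES.values()
    ]

for statement in expand_both_bone_fracture("healing", "distal"):
    print(statement)
```

A real system would need such knowledge for every anatomical region and phrase pattern, which is the scaling problem the slide points at.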
41. Language Understanding Another Example “Global atrophy without acute abnormality” Meaning: There is diffuse cerebral atrophy. There is no acute intracranial abnormality.
43. A widely used method for capturing logical assertions is first-order predicate calculus (FOPC).
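As an illustration, the two meanings given earlier for “Global atrophy without acute abnormality” can be written as predicate-style assertions. This is a simplified sketch with ground atoms only (no quantifiers or variables); the predicate and argument names are assumptions chosen for readability, not the presentation's actual vocabulary.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Assertion:
    """A ground atom such as predicate(arg1, arg2), optionally negated."""
    predicate: str
    args: tuple
    negated: bool = False

    def __str__(self):
        body = f"{self.predicate}({', '.join(self.args)})"
        return f"NOT {body}" if self.negated else body

# "Global atrophy without acute abnormality" (hypothetical predicate names)
assertions = [
    Assertion("atrophy", ("cerebrum", "diffuse")),
    Assertion("abnormality", ("intracranial", "acute"), negated=True),
]

for assertion in assertions:
    print(assertion)
```

Keeping negation as an explicit flag makes the "no acute abnormality" reading searchable separately from positive findings.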
45. Natural Language Engineering “The last two decades have been marked by a complete paradigm shift in computational linguistics. Frustrated by the inability of applications based on explicit linguistic knowledge to scale up to real-world needs, and, perhaps more deeply, frustrated with the dominating theories in formal linguistics, we looked instead to corpora that reflect language use as our sources of (implicit) knowledge.” Shuly Wintner, University of Haifa
46. Semantic Resources Underdeveloped “Natural language systems generally need a large number of training examples to train, refine, or test the system.” Friedman C. Semantic text parsing for patient records in medical informatics. Knowledge Management and Data Mining in Biomedicine. Editors: Chen H, Fuller S, Friedman C, Hersh W. Springer, 2005, p. 431.
47. CLEF aims to develop a high-quality, secure and interoperable information repository, derived from operational electronic patient records, to enable access to patient information in support of clinical care and biomedical research. The CLEF gold standard corpus contains 167 clinical documents, chosen from the 565K CLEF corpus.
48. Clinical E-Science (CLEF) Annotation Framework “There is a left lower lobe pulmonary infiltrate.” Arg1: (Condition) “pulmonary infiltrate” Arg2: (Locus) “left lower lobe”. “A standard anteroposterior radiograph shows a tibial shaft fracture.” Has_finding Arg1: (Investigation) “anteroposterior radiograph” Arg2: (Condition) “tibial shaft fracture”.
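The two annotations on this slide can be sketched as typed relation records. The `Entity` and `Relation` classes below are hypothetical and not the CLEF schema itself; the relation name for the first example is left unset because the slide does not give one.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Entity:
    """A typed text span, e.g. (Condition) "pulmonary infiltrate"."""
    type: str
    text: str

@dataclass(frozen=True)
class Relation:
    """A binary relation between two entities; name may be unknown."""
    name: Optional[str]
    arg1: Entity
    arg2: Entity

# First example: the slide lists the arguments but no relation name.
infiltrate = Relation(
    name=None,
    arg1=Entity("Condition", "pulmonary infiltrate"),
    arg2=Entity("Locus", "left lower lobe"),
)

# Second example: the slide names the relation Has_finding.
has_finding = Relation(
    name="Has_finding",
    arg1=Entity("Investigation", "anteroposterior radiograph"),
    arg2=Entity("Condition", "tibial shaft fracture"),
)

print(infiltrate)
print(has_finding)
```

Fixing the number and type of arguments in a record like this is what makes the questions on the following slides checkable.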
50. Are the number and type of arguments defined for these predicates?
51. Do the concepts, which fill the argument slots, adequately cover the domain?
54. Each sentence was reviewed by three physicians for correctness in annotation.
61. Phrasal Synonymy Phrases such as “pelvic calcifications consistent with phleboliths” and “several pelvic phleboliths are present” are semantically equivalent, but produce different predicate-argument structures (PASs) using MetaMap.
62. Concept Representation For example, “gray/white matter differentiation” is defined as the difference in appearance of parts of the brain on a CT scan or MRI (semantic rank 151), but is not in the UMLS or SNOMED CT.