Towards Enhancing Anthrax Vaccine Safety With Natural Language Processing
Herman Tolentino MD¹,², Michael Matters PhD MPH²,³, Wikke Walop PhD⁴,⁵, Barbara Law MD⁴,⁵, Wesley Tong⁶, Deepak Sagaram MBBS⁷, Fang Liu MS⁸, Paul Fontelo MD MPH⁸, Katrin Kohl MD PhD MPH⁹,¹⁰, Daniel Payne PhD MSPH¹
¹ Vaccine Analytic Unit, National Immunization Program, CDC, Atlanta, GA 30333
² Public Health Informatics Fellowship Program, Office of Workforce and Career Development, CDC, Atlanta, GA 30333
³ Division for Heart Disease and Stroke Prevention, National Center for Chronic Disease Prevention and Health Promotion, CDC, Atlanta, GA 30333
⁴ Immunization & Respiratory Infections Division, Centre for Infectious Disease Prevention & Control, Public Health Agency of Canada, Ottawa, Ontario K1A 0K9
⁵ Brighton Collaboration
⁶ Honours Biology and Pharmacology Programme, McMaster University, Hamilton, Ontario L8S 4L8
⁷ The University of Texas Health Science Center at Houston, TX 77030
⁸ Office of High Performance Computing and Communications, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894
⁹ Immunization Safety Office, CDC, Atlanta, GA 30333
ABSTRACT
Detecting vaccine adverse events is an important public health activity that contributes to patient safety. Large amounts of clinical information are locked up in the unstructured free-text components of clinical reports. Reports of adverse events following immunization (AEFI) from surveillance systems contain free text that can be analyzed using natural language processing (NLP). Current advances in computer and information technology allow storage and processing of large amounts of information, including free text. A collaborative workgroup among the Brighton Collaboration (BC), the Public Health Agency of Canada (PHAC), the National Library of Medicine (NLM) and the Vaccine Analytic Unit (VAU) was formed to investigate the use of NLP to (1) extract structured information from the free-text components of AEFI reports, and (2) automate information retrieval and classification in surveillance systems. The outputs are applicable to processing of free text for anthrax vaccine safety.
METHODS
• Collaboration between the Brighton Collaboration (BC), the Public Health Agency of Canada (PHAC), the National Library of Medicine (NLM) and the Vaccine Analytic Unit (VAU). See Figure 1.
• Creation of an AEFI free-text corpus from de-identified adverse event reports from PHAC.
• Development of natural language processing (NLP) and machine learning (ML) algorithms to represent information in AEFI reports using concepts from NLM's Unified Medical Language System (UMLS). Two important steps are needed: (1) spell checking to reduce "noise" from misspelled words and abbreviations; and (2) concept tagging to represent key words in free text with UMLS concepts and map them to AEFI controlled vocabularies (MedDRA, COSTART, WHO-ART).
• Derivation of a semantic distance metric to measure similarity between UMLS concepts and enhance concept tagging.
• Application of the algorithms to correct spelling errors and extract UMLS concepts from free-text reports.
• Validation of machine learning training with test data.
• Adaptation of a clustering algorithm to classify documents based on semantic distance concept groups.
RESULTS
• The NLP steps for spell checking are shown in Figure 1, and those for concept extraction are shown in Figure 2.
• Performance measures for spell checking and concept extraction are shown in Tables 1 and 2.
• Proportions of key words mapped to adverse event controlled vocabularies are shown in Table 3.
• A screenshot of processed (concept-tagged) free text showing UMLS concept tags appears in Figure 2.
CONCLUSIONS
• The matching of terms in free text to concepts is an essential component of human and machine reasoning. When applied to vaccine safety using UMLS concepts, it enables computable and unencumbered representation of terms and eventual extraction of structured information such as vaccine adverse events.
• Two important steps are needed to carry this out: (1) reduction of "noise" from misspelled words and abbreviations, and (2) tagging of key terms from free text with UMLS concepts. Both require the use of natural language processing and machine learning techniques.
• The UMLS provides adequate coverage for mapping clinical terms to concepts. The use of a semantic distance metric enhances the concept extraction process.
• Working in the context of a collaboration (1) ensures that contextual issues are addressed and that appropriate domain expertise is leveraged and efficiently utilized, and (2) demonstrates that the value of collaborative problem-solving in public health knows no boundaries.
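The poster does not spell out how its semantic distance metric is defined. As a minimal illustration of the general idea only, the sketch below measures distance as the number of edges on the shortest path between two concepts in a small is-a hierarchy. The hierarchy, the concept names, and the function names are all hypothetical and are not taken from the UMLS or from the authors' implementation.

```python
from collections import deque

# Hypothetical toy is-a hierarchy (child -> list of parents).
# These entries are illustrative only and are not actual UMLS content.
PARENTS = {
    "fever": ["body temperature change"],
    "chills": ["body temperature change"],
    "body temperature change": ["general symptom"],
    "injection site pain": ["injection site reaction"],
    "injection site swelling": ["injection site reaction"],
    "injection site reaction": ["general symptom"],
    "general symptom": [],
}


def build_graph(parents):
    """Turn the child -> parents mapping into an undirected adjacency map."""
    graph = {node: set() for node in parents}
    for child, node_parents in parents.items():
        for parent in node_parents:
            graph.setdefault(parent, set()).add(child)
            graph[child].add(parent)
    return graph


def semantic_distance(graph, a, b):
    """Return the number of edges on the shortest path between a and b.

    Returns None if either concept is unknown or the two are not connected.
    """
    if a not in graph or b not in graph:
        return None
    queue = deque([(a, 0)])
    seen = {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


GRAPH = build_graph(PARENTS)
print(semantic_distance(GRAPH, "fever", "chills"))
print(semantic_distance(GRAPH, "fever", "injection site pain"))
```

Under this assumed definition, sibling concepts such as "fever" and "chills" are closer to each other than either is to "injection site pain", which is the kind of grouping a clustering step could build on.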
Figure 1. Spell checker process flow showing different steps. Disambiguation involves selection of one correction term from a list of potential candidates obtained from lexical dictionaries.
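The candidate-then-disambiguate flow described in the Figure 1 caption can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' spell checker: the lexicon and its frequency counts are hypothetical, candidates are generated with Python's standard `difflib` module, and disambiguation simply prefers the most similar candidate, breaking ties by frequency.

```python
import difflib

# Hypothetical lexical dictionary mapping each word to a made-up frequency.
LEXICON = {
    "fever": 120,
    "swelling": 95,
    "erythema": 40,
    "injection": 150,
    "headache": 80,
}


def candidates(word, lexicon, n=5, cutoff=0.7):
    """Return up to n lexicon entries that are similar enough to the word."""
    return difflib.get_close_matches(word.lower(), list(lexicon), n=n, cutoff=cutoff)


def disambiguate(word, cands, lexicon):
    """Select one correction: highest similarity first, then highest frequency."""
    if not cands:
        return None
    return max(
        cands,
        key=lambda c: (
            difflib.SequenceMatcher(None, word.lower(), c).ratio(),
            lexicon[c],
        ),
    )


def correct(word, lexicon=LEXICON):
    """Return the word unchanged if no correction is found."""
    lowered = word.lower()
    if lowered in lexicon:
        return lowered
    choice = disambiguate(lowered, candidates(lowered, lexicon), lexicon)
    return choice if choice is not None else word


print(correct("sweling"))
print(correct("fevr"))
```

A production system would likely need a much larger lexicon, abbreviation handling, and context-aware disambiguation; the similarity cutoff of 0.7 here is an arbitrary choice for the example.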
Figure 2. Concept extraction process flow showing snapshot of a concept-tagged AEFI report.
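Concept tagging of the kind shown in Figure 2 can be approximated, in its simplest form, by a longest-match dictionary lookup over the tokens of a report. The sketch below assumes a tiny hand-made term table with placeholder concept identifiers; these are not real UMLS concept identifiers, and the actual system described on the poster relies on NLP and machine learning beyond plain lookup.

```python
import re

# Hypothetical term-to-concept table. The identifiers are placeholders,
# not real UMLS concept identifiers.
CONCEPTS = {
    ("fever",): "CONCEPT_FEVER",
    ("injection", "site", "pain"): "CONCEPT_INJECTION_SITE_PAIN",
    ("pain",): "CONCEPT_PAIN",
    ("headache",): "CONCEPT_HEADACHE",
}
MAX_TERM_LENGTH = max(len(term) for term in CONCEPTS)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tag_concepts(text):
    """Return (matched term, concept id) pairs, preferring the longest match."""
    tokens = tokenize(text)
    tags = []
    i = 0
    while i < len(tokens):
        # Try the longest possible term first, then shorter ones.
        for n in range(min(MAX_TERM_LENGTH, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in CONCEPTS:
                tags.append((" ".join(key), CONCEPTS[key]))
                i += n
                break
        else:
            i += 1  # no term starts at this token
    return tags


print(tag_concepts("Patient reported fever and injection site pain."))
```

Preferring the longest match means "injection site pain" is tagged as one concept rather than as the more general "pain".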
Table 1. Performance measurements for spell checker during training and testing.
Table 2. Performance measurements for concept extraction during training and testing.
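The text of the poster does not state which performance measurements Tables 1 and 2 report. Precision, recall and F1 are common choices for evaluating concept extraction, so the sketch below assumes them; the function name and the sample concept sets are hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Compare predicted items against a gold standard, treating both as sets."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: concepts extracted by a system versus those
# marked by a human annotator for the same report.
system_output = {"fever", "headache", "rash"}
annotated = {"headache", "rash", "swelling", "chills"}
print(precision_recall_f1(system_output, annotated))
```

Computing the same measures separately on the training and test sets would show whether performance holds up on unseen reports.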
Table 3. Proportions of adverse event controlled vocabulary mappings from free-text AEFI reports for training and test data sets.
This research was made possible through a grant by the Oak Ridge Institute for Science and Education (ORISE) to the Centers for Disease Control and Prevention Public Health Informatics Fellowship Program (PHIFP). Special acknowledgments to (1) the Brighton Collaboration for making global research connections possible; (2) the Public Health Agency of Canada for valuable source data inputs; (3) the National Library of Medicine for sharing UMLS expertise; and (4) Herman's mentors: Dan Payne and Mike McNeil.
