Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Text Mining the History of Medicine

1,829 views

Published on

Sophia Ananiadou (Manchester University)
Digital History seminar
Institute of Historical Research
24 March 2015, 5.15pm
John S Cohen Room 203, 2nd floor, IHR, North block, Senate House or live online via the Digital History Seminar blog: http://ihrdighist.blogs.sas.ac.uk/


I will present the results of a collaborative and interdisciplinary project between the National Centre for Text Mining (NaCTeM) and the Centre for the History of Science, Technology and Medicine (CHSTM) at the University of Manchester, demonstrating the capabilities of innovative text mining tools to allow the automatic extraction of information from two historical archives: the British Medical Journal (BMJ) (1840 – present) and the London-area Medical Officer of Health (MOH) reports (1848-1972). NaCTeM’s text mining tools have enriched these historical archives with semantic metadata automatically by extracting terms, named entities and events. The development of a semantic search system focused on the understanding of historical changes in lung diseases since 1840.

Published in: Education
  • Be the first to comment

  • Be the first to like this

Text Mining the History of Medicine

  1. 1. Mining  the  History  of  Medicine   Sophia  Ananiadou   Na-onal  Centre  for  Text  Mining   School  of  Computer  Science   The  University  of  Manchester  
  2. 2. •  Text  Mining  tools  for  enriching  content   •  Mining  the  History  of  Medicine   •  Seman-c  annota-ons:     –  en--es,  events   –  OCR  correc-on   –  Time-­‐sensi-ve  term  inventory  and  search     •  A  prototype  search  system  for  the  history  of  medicine     Outline   2  
  3. 3. raw   (unstructured)   text   part-­‐of-­‐speech   tagging   named  en-ty   recogni-on   event  extrac-on   annotated   (structured)   text   text  processing  pipeline   lexica   ontologies   ….    ……………………I  believe   the  pain  was  seated  in  the   pillars  of  the  diaphragm,  as   it  was  chiefly  occasioned  by   hiccough  ….  ……     Text  Mining  Pipeline     3  
  4. 4. Impact  of  text  mining     •  Discovery  of  concepts  and   named  en--es  (anatomical   en--es,  treatments,   symptoms..)   •  Going  a  step  further:   extrac-ng  rela-onships,   events  from  text  (cause,   affect)   •  Extrac-ng  opinions,  aStudes,   negated  statements,   hypotheses,  contradic-ons….   •  Improves  classifica-on  of   documents   •  Improves  informa-on  access   •  Enables  seman-c  search   •  Enables  ques-on-­‐answering,   sen-ment  analysis,  trend   discovery     Impact  of  text  mining   4  
  5. 5.   •  Na-onal  Centre  for  Text  Mining,  School  of   Computer  Science    www.nactem.ac.uk       •  Centre  for  the  History  of  Science,  Technology  and   Medicine       Mining  the  History  of  Medicine:   an  interdisciplinary  project     AHRC     Digital  Transforma-ons   Big  Data   5  
  6. 6. Mining  the  History  of  Medicine   Aim   Seman7c  search  system   •  Bri-sh  Medical  Journal  (BMJ)    (1840  –  present)  (350K   ar-cles)   •  London  Medical  Officer  of  Health  reports  (MoH)   (1848  –  1972)                 Provide  different  perspec7ves  (e.g.  medical,  public  health)   on  the  treatment  and  preven-on  of  diseases     ü  over  -me   ü  in  different  areas   Text  Mining:  tools,   resources,   infrastructure   6  
  7. 7. Mining  the  History  of  Medicine   Search  historical   archives   OCR   Keyword   search?   BMJ  archive  was  digi-sed  several  years  ago   ü  Word  error  rate  can  be  over  30%   ü  Re-­‐OCRing  -me-­‐consuming  and  costly   MOH  OCRed  more  recently  with  beier  results           Keyword  search  over  huge  textual  archives   inefficient   7  
  8. 8. Mining  the  History  of  Medicine   Solu-ons   ü  Apply  OCR  improvement  techniques     ü  Automa-cally  detect  seman7c  informa-on  in   text   ü  En##es:  condi-ons,  signs  or  symptoms,   therapeu-c  measures   ü  Events:    symptoms  caused  by  lung  disease       ü  Automa-cally  iden-fy  how  the  naming  of   concepts  changes  over  7me   Sophia  Ananiadou   8  
  9. 9. Mining  the  History  of  Medicine   Sema-c   search   system     ü  Automa-c  query  building,  term  sugges-ons     •  historical  variants  unknown  to  the  user   ü  Faceted  search  based  on  ar-cle  metadata   •  Date,  authors,  -tle     ü  Faceted  search  based  on  seman-c  content   •  Combina-ons  of  en--es  and/or  rela-onships     Sophia  Ananiadou   9  
  10. 10. Extrac-ng  seman-c  types,  events   ü  Our  text  mining  tools  learn  from  gold  standard   •  subset  of  documents  from  the  archives  manually  annotated   with  en77es  and  events     ü  Corpus  composi-on   •  25  ar-cles  from  BMJ   •  4  extracts  from  MOH   •  70,000  words   ü  Ar-cles  from  4  key  decades  in  medical  history   •  1850s,  1890s,  1920  and  1960s   •  Capture  language/terminology  changes  over  -me   Sophia  Ananiadou   10  
  11. 11. OCR  Post  Correc-on   •  Re-­‐OCR    BMJ  too  -me  consuming  for  current  project   •  Our  solu-on:    post-­‐correct  original  OCR  output  based  on   spell-­‐checker  output   SpellChecker   %  errors  resolved   ASpell   45%   Hunspell   60%   Word   63%   MAC  OS   58%   Best  open  source   spell  checker   11  
  12. 12. OCR  Post  Correc-on   ü Hunspell  produces  ranked  list  of  spelling  correc-ons   ü However,  tailored  to:     •  Human  errors   •  General  language   Our  approach   Augment  Hunspell  dic-onary  with  OpenMedSpel  dic-onary   Override  default  Hunspell  correc-on  ranking   ü  Rank  sugges-ons  according  to  corpus  frequency   ü  Use  decade-­‐specific  word  frequency  lists     12  
  13. 13. JOHN MILLER, aged 26, glass-blower of Gateshead, was adnmitted August 27th, 1840, unider the care of Sir John Fife, with the usual syptnpoms of stone in the bladder, wlich he lhas laboured under from childhood. At the age of 12 years he was admitted into thtis infirmaryI, buit his frienids wotld niot allow the operation to be performed. Ile took a considerable (luantity of medicine, which greatly relieved him. Since then the symptomt of stone have been constantly presenit, but have not been very urgent. A.- About a month ago, after batlhing, the symptoms became much arggravated-so much so, as to comlpel him to apply again to the hospital for relief. On admission he was sallow and emaciated; laboured iunder symptoms of stone of a high degree, with great irritability of bladder, giviing rise to a frequient desire to inicturate, and occasionally to the in-voluntary discharge of faces from straining. Ile passed a larg,e quantity of mucus with his urine;-pulse small, tongue white, sliglht tlhirst, and appetite good.   BMJ  OCR  Errors  –  Example  from  1840     13  
  14. 14. Basic  Hunspell  Output     JOHN MILLER, aged 26, glass-blower of Gateshead, was admitted August 27th, 1840, under the care of Sir John Fife, with the usual symptoms of stone in the bladder, which he has laboured under from childhood. At the age of 12 years he was admitted into this infirmary, but his friends world into allow the operation to be performed. Ike took a considerable (quantity of medicine, which greatly relieved him. Since then the symptom of stone have been constantly present, but have not been very urgent. A.- About a month ago, after bathing, the symptoms became much aggravated-so much so, as to compel him to apply again to the hospital for relief. On admission he was sallow and emaciated; laboured under symptoms of stone of a high degree, with great irritability of bladder, giving rise to a frequent desire to inaccurate, and occasionally to the involuntary discharge of faces from straining. Ike passed a lag,e quantity of mucus with his urine;-pulse small, tongue white, slight thirst, and appetite good. Correct  changes   made  by  Hunspell     Incorrect  changes   made  by  Hunspell     OCR  Errors  not   detected  by   Hunspell     14  
  15. 15. New  Frequency-­‐driven  Hunspell  Output   JOHN MILLER, aged 26, glass-blower of Gateshead, was admitted August 27th, 1840, under the care of Sir John Fife, with the usual symptoms of stone in the bladder, which he has laboured under from childhood. At the age of 12 years he was admitted into this infirmary, but his friends would not allow the operation to be performed. Lie took a considerable (quantity of medicine, which greatly relieved him. Since then the symptoms of stone have been constantly present, but have not been very urgent. A.- About a month ago, after bathing, the symptoms became much aggravated-so much so, as to compel him to apply again to the hospital for relief. On admission he was sallow and emaciated; laboured under symptoms of stone of a high degree, with great irritability of bladder, giving rise to a frequent desire to inaccurate, and occasionally to the involuntary discharge of faces from straining. Lie passed a large,e quantity of mucus with his urine;-pulse small, tongue white, slight thirst, and appetite good. Correc-ons  only   obtained  using   frequency-­‐driven   method   Correc-ons   obtained  using   both  original  and   frequency-­‐driven   methods   Remaining   errors  with   frequency-­‐driven   methods   15  
  16. 16. Avoiding  Over-­‐Correc-on   Original  Text   Basic  Hunspell   Frequency-­‐ driven  Hunspell     Frequency-­‐driven   Hunspell   augmented  with   OpenMedSpel   an-thrombo-c   anthropometric   anthropometric   an7thrombo7c   pathophysiological   physiological   physiological   pathophysiological     thromboembelism   thrombosis   thrombosis   thromboembolism   warfarin   wayfaring   warfarin   warfarin   transoesophageal   oesophageal   oesophageal   transoesphageal   echocardiography   electroocardiography   oesophageal   echocardiography   intracardiac   intra  cardiac   intracellular   intracardiac   tomogram   mammogram   tomogram   tomogram   •  We  want  to  avoid  erroneous  correc-on  of  words  that  are  already  correct   16  
  17. 17. Extrac-ng  seman-c  types,  events   ü  Our  text  mining  tools  learn  from  gold  standard   •  subset  of  documents  from  the  archives  manually  annotated   with  en77es  and  events     ü  Corpus  composi-on   •  25  ar-cles  from  BMJ   •  4  extracts  from  MOH   •  70,000  words   ü  Ar-cles  from  4  key  decades  in  medical  history   •  1850s,  1890s,  1920  and  1960s   •  Capture  language/terminology  changes  over  -me   17  
  18. 18.  Hymera  corpus:  En-ty  Types   Type   Descrip7on   Examples   Anatomical   An  en-ty  forming  part  of  the   human  body   lung,  lobe,  sputum,  fibroid   Biological_   En7ty     A  living  en-ty  not  part  of  the   human  body     tubercle  bacilli,  mould,  guinea-­‐pig,   flea   Condi7on   Medical  condi-on  or  ailment   phthisis,  bronchi:s,  typhoid   Environmental   Environmental  factor  relevant   to  incidence/preven-on/   control/treatment  of  condi-on   humidity,  high  mountain  climates,   milk,  linen,  drains     Sign_or_   Symptom   Altered  physical  appearance  or   behaviour  as  a  probable  result   of  injury  or  condi-on     cough,  pain,  rise  in  temperature,   swollen   Subject   individual  or  group  of  cases   under  discussion   asthma  pa:ents,  those  with   nega:ve  reac:ons  to  tuberculin   Therapeu7c_   or_   Inves7ga7onal     treatment,  substance,  medium   or  procedure,  prescribed  or   used  in  inves-ga-on   atrophine  sulphate,  generous  diet,   change  of  air,  lobectomy   18  
  19. 19. Hymera  Corpus:  Events   Type   Descrip7on   Par7cipants   Affect   A  en-ty  or  event  is  affected,   infected,  changed  or   transformed,  possibly  by   another  en-ty  or  event     Cause:  The  cause  of  the  affec-on   Target:  The  en-ty  or  event   affected   Subject:    The  medical  subject   affected     19  
  20. 20. Hymera  Corpus:  Events   Type   Descrip7on   Par7cipants   Cause   An  en-ty  or  event  results  in  the   manifesta-on  of  another  en-ty   or  event   Cause:  The  cause  of  the  event   Result:  The  resul-ng  en-ty  or   event   Subject:    The  medical  subject   affected     20  
  21. 21. •  Trained  classifiers  on  Hymera  corpus     •  Used  dic-onaries  to  obtain  domain  specific   informa-on   ü   UMLS  Metathesaurus   ü   MetaMap  (NE  system  for  UMLS  concepts  in  text)     •   Used  seman-c  categories  could  be  mapped  to   Hymera   Extrac-ng  en--es  automa-cally   21  
  22. 22. UMLS  è Corpus  Mappings   MetaMap  Categories   Hymera  category   Anatomical  Abnormality,  Body  Substance   Body  Part,  Organ,  or  Organ  Component   Body  Loca-on  or  Region,  Body  Space  or  Junc-on,  Tissue   Anatomical   Animal,  Mammal,  Cell,  Bacterium,  Organism   Biological_   En-ty   Food,  Chemical  Viewed  Structurally   Element,  Ion,  or  Isotope,  Hazardous  or  Poisonous  Substance   Substance,  Natural  Phenomenon  or  Process   Environmental   Disease  or  Syndrome,  Pathologic  Func-on   Condi-on   Clinical  Drug,  Amino  Acid,  Pep-de,  or  Protein,  Immunologic   Factor,  Organic  Chemical,  Pharmacologic  Substance,   Biologically  Ac-ve  Substance,  Lipid     Therapeu-c_or_   Inves-ga-onal   Sign  or  Symptom,  Finding   Sign_or_   Symptom   Group,  Pa-ent  or  Disabled  Group   Subject   22  
  23. 23. Seman-c  En--es:  Summary     ü  Even  baseline  system  produces  good  performance  on   most  categories  0.8%   ü  Environmental  and  Therapeu-c_or_Inves-ga-onal     ü  Recall  generally  rather  lower  than  precision   ü  Recall  can  be  improved  using  seman-c  informa-on   ü  UMLS  look-­‐up  preferable     ü  MetaMap  slow  and  unreliable   ü  UMLS  lookup  takes  advantage  of  our  NER  built-­‐in   dic-onary  lookup  to  boost  performance     23  
  24. 24. Event  Recogni-on   •  Detects  “trigger”  phrases  around  which   events  are  arranged:  origin     •  Finds  individual  event  par:cipants:  Subj,   Result  and  Cause     •  Combines  trigger-­‐argument  pairs  into   seman-c  structures     hip://www.nactem.ac.uk/EventMine/   24  
  25. 25. Event  Recogni-on  Summary     ü  Complex  task   •  600  Affect  and  200  Causality  events   ü  Diverse  triggers  can  make  recogni-on  a  challenge     ü  Ongoing  work  to  triple  the  size  of  manually   annotated  corpus     •  Different  decades  to  provide  greater  evidence  of   language  usage     ü Extended  corpus  boost  further  results,  wider   coverage  of  seman-cs     25  
  26. 26. Interpre-ng  events     ü Discourse  context  (meta-­‐knowledge)  affects   interpreta-on     •  p21ras  proteins  are  not  ac-vated  by  IL-­‐2     •  Our  results  suggest  that  p21ras  proteins  are  strongly   ac-vated  by  IL-­‐2  in  normal  human  T  lymphocytes     •  Our  results  show  that  p21ras  proteins  are  weakly   ac-vated  by  IL-­‐2  in  normal  human  T  lymphocytes       ü Contradic-ons,  nega-ons,  specula-ons     26  
  27. 27. 27  
  28. 28. Terminological  Issues     ü  Concepts  can  be  expressed  in  many  ways     ü  We  take  into  account    such  variants  to  improve  search   Lexical  Varia7ons   oedema   edema   whooping  cough     whooping-­‐cough   pulmonary  tuberculosis     tuberculosis   respiratory  diseases     diseases  of  the  respiratory  system   Seman7c  Varia7ons   tuberculosis   phthisis   smallpox   variola   Sophia  Ananiadou   28  
  29. 29. Building  a  Time  Sensi-ve   Terminological  Resource     ü  Why  not  using  exis-ng  resources?   •  MeSH,  Disease  Ontology,  UMLS,    Metathesaurus   ü  Used  in  search  systems  to  expand  query  terms     ü  Textual  variants  of  concepts  change  over  7me   •  Exis-ng  resources  tend  to  focus  on   contemporary  terminology   •  Historically-­‐relevant  variants  not  covered  in  a   comprehensive  and  consistent  manner     29  
  30. 30. 0   10   20   30   40   50   60   1840   1844   1848   1852   1856   1860   1864   1868   1872   1876   1880   1884   1888   1892   1896   1900   1904   1908   1912   1916   1920   1924   1928   1932   1936   1940   1944   1948   1952   1956   1960   1964   1968   Phthisis   Tuberculosis   Term  Varia-on  in  BMJ    Phthisis  vs.  Tubercuosis           30  
  31. 31. Term  Varia-on  in  BMJ   Scarlet  Fever  vs.  Scarla:na         0   10   20   30   40   50   60   1840   1844   1848   1852   1856   1860   1864   1868   1872   1876   1880   1884   1888   1892   1896   1900   1904   1908   1912   1916   1920   1924   1928   1932   1936   1940   1944   1948   1952   1956   1960   1964   1968   Scarla-na   Scarlet  Fever   31  
  32. 32. Time  Sensi-ve  Medical  Terminological   Inventory   ü  We  have  created  a  #me  sensi#ve  terminological   inventory  from  BMJ  for  medical  condi#ons   •  Complements  exis-ng  terminological  resources   by  accoun-ng  for  term  variant  usage  over  long   periods  of  7me   •  Extracted  term  variants  from  19th  century  medical   lexica     •  Iden-fied  more  variants  using  distribu:onal   seman:cs  techniques  to  BMJ  archives     32  
  33. 33. Processing  Historical  Lexica   19th  century  medical  lexica  NLM  Archive     Two  lists  of  diseases  considered       Name   Year   #  top-­‐level  entries   Nomenclature  of  diseases   1872   1347   Index  of  diseases,  their   symptoms,  and  treatment     1882   522   33  
  34. 34. •  English  main   term   •  La-n  main  term   •  Synonym  list   •  Sub-­‐type   English  term   •  Sub-­‐type  La-n   term   Synonym  set   Processing  Historical  Lexica  -­‐  example   34  
  35. 35. Processing  Historical  Lexica   ü Link  synonym  set  to  UMLS  using  approximate  string   matching   ü 1,588  synonym  sets  matched     ü 2,422  variants  not  contained  in  UMLS   ü Confirms  lack  of  comprehensive  historical  coverage  in   UMLS   However:   ü Manually-­‐curated  lexica  unlikely  to  account  for  all   “real  usage”  term  varia-on  in  medical  texts   ü Automa-c  extrac-on  of  terms  variants  complements   manually  curated  variants     ü Accounts  for  real  usage     35  
  36. 36. TM  Methods  for  Historical  Variant   Iden-fica-on   ü  Pre-­‐processing     ü  Distribu-onal  similarity  measures     ü  Detec-on  of  historically  relevant  term   variants       36  
  37. 37. Dic-onary   matching   Machine   learning   Pre-­‐Processing  BMJ   We  combined  two   recognisers     ü  Dic-onary  matching  from   19th  century  disease   lexica     ü  Index  of  Diseases   ü  Nomenclature  of  Diseases   ü  Machine  learning  trained   on  the  NCBI  Disease   Corpus     37  
  38. 38.     ü cosine  similarity     scarlet  fever   47366.48   46344.15   30555.26   31896.95     15241.49     11759.56     7407.69     7388.65   …   4608.19   measles   diptheria   fever   cough   death   pox   fatal   small   …   typhoid   scarla#na   11474.87   5436.11   5434.60   2892.64   2413.52   4326.59     875.16   2340.05   …   2071.56   measles   diptheria   fever   cough   death   pox   fatal   small   …   typhoid   Distribu-onal  Similarity  Measures   38  
  39. 39. Distribu-onal  Similarity  Measures   scarlet  fever   scarla-na            0.8434   whooping  cough    0.7415   measles  0.7094   diphtheria            0.6689   acute  pneumonia  0.6501   chicken  pox          0.6407   german  measles    0.6336   diarrhoeal  disease            0.6301   infan-le  diarrhoea          0.6234   diarrhea                0.6208   croup      0.6093   mump        0.6042   typhus  fever        0.5535   typhoid  fever      0.5508   …   39  
  40. 40. Access  “query  builder”  to   start  search  refinement     40  
  41. 41. User  can  refine  search  using   mul-ple  criteria.  Here  we   start  by  searching    for  a   specific  term   41  
  42. 42. User  enters  term  of  interest   and  then  clicks  on  “Refine   Search”       42  
  43. 43. Retrieved  documents   related  terms   User  selects  “pulmonary   tuberculosis”   Current  search  criteria   Frequency  of  “pulmonary   consump-on”  over  -me  
  44. 44. “pulmonary   tuberculosis”  is  added  to   query  builder     composite  -me  graph   (automa-cally  updated)   Terms  related  to  “pulmonary   consump-on”  and  “pulmonary   tuberculosis”   44  
  45. 45. Further  refine  retrieved   documents  by  date   published   45  
  46. 46. Only  retain  documents   published  up  un-l  1950   46  
  47. 47. Further  refine  retrieved   documents  by  date   published   47  
  48. 48. Refine  by  automa-cally   recognised  seman-c   en--es   48  
  49. 49. Sema-c  refinement  can   be  made  according  to   different  en-ty   categories   49  
  50. 50. Here  “Environmental”   en--es  present  in  the   current  document  set,   with  their  counts.  The   user  may  type  into  the   box  to  narrow  the  list  of   en--es  displayed.           50  
  51. 51. New  sema-c  restric-on   added  to  query  builder   display   Number  of  documents   further  reduced     51  
  52. 52. Results  can  also  be   refined  by  searching  for   documents  containing   involving  en--es   52  
  53. 53. User  can  choose   between  the  2   events   recognised  by   our  system   53  
  54. 54. User  (par-ally)   completes  template   according  to  the  types   of  events  they  are   searching  for.  Here,  the   search  is  for  documents   men-oning    causes  of   tuberculosis       54  
  55. 55. Document  display   Bibliographic  informa-on   Document  content   Seman-c  informa-on   55  
  56. 56. Pie  chart  shows  the  distribu-on  of   en-ty  types  in  a  document.   Hovering  the  mouse  over  a  sec-on   reveals  the  en-ty  type  in  ques-on     56  
  57. 57. Number  of   “Environmental”  en--es   in  the  document   “Environmental”  en-ty   instances  in  the  document   “Environmental”  en--es   are  highlighted   57  
  58. 58. Cases  where  specific  causes   of  tuberculosis  are   men-oned  in  the  document   Phrases  iden-fying   Causality  events  are   highlighted   58  
  59. 59. Affect  events  allow  us  to   explore,  e.g.,  whether   therapeu-c  measures   have  a  posi-ve  or   nega-ve  effect   59  
  60. 60. Future  developments       ü  More  en7ty  types:     •  Tools  and  materials  used  in  a  medical  or  laboratory  context   •  Geographical  Loca-ons   •  Organisa-ons  (e.g.  schools,  workhouses,  hospitals,  etc)     •  Temporal  expressions     ü  More  event  types:     •  Associa:ons  between  en--es     •  Ac:ons  carried  out  as  part  of  a  medical  examina-on,   treatment  or  to  resolve  an  issue     •  Rates  -­‐  Descrip-ons  of  a  reported  rate  relevant  to  public   health  (birth,  marriage,  death,    disease,  etc.)  and   comparisons  between  these  rates     60  
  61. 61. Future  Developments  -­‐  Aiributes         Recogni-on  of  aiributes  at  both  the  en-ty  and  event  levels:       ü  En7ty-­‐based  aZributes       •  Age,  sex,  ethnicity,  occupa-on  or  general  health  status  of  medical   subject     •  Frequency  of  treatments   •  Quan--es  or  measures  rela-ng  to  en--es   •  Dosages  of  treatments   •  Intensity  of  a  symptom   ü  Event-­‐based  aZributes  with  discourse  interpreta7on   •  Is  an  event  specified  as  being  definite  or  speculated?     •  If  speculated,  which  level  of  certainty  is  ascribed  to  the  event?     •  What  is  the  source  of  the  informa-on  specified  in  the  event,  e.g.,   does  it  come  from  the  author  or  another  source?           61  
  62. 62. Conclusions   ü Explore  further  archives  using  topic  analysis     ü Automa-c  clustering  of  search  results  according  to   similari-es  of  seman-c  content   ü Expand  types  of  en--es  and  events  and  enrich   further  BMJ   ü Link  with  other  archives     ü Create  a  rich  annota-on  database     62  
  63. 63. •  BMJ  and  Wellcome  Library  for  making  the  archives   available  to  text  mine   •  AHRC  for  funding   •  Team  members  at  NaCTeM  and  CHSTM   •  Paul  Thompson,  Elizabeth  Toon,  Nick  Duvall   •  Jacob  Carter,  George  Kontonatsios,  Riza  Ba-sta-­‐ Navarro     Inves-gators:  S.  Ananiadou,  J.  McNaught  (NaCTeM)                                                    C.  Timmermann,  M.  Worboys  (CHTM)   Acknowledgments   63  

×