LA Semantic Web Meetup, Nov. 5th, 2012

Presentation Transcript

  • The Semantic Web (There and Back Again)
    Cartic Ramakrishnan, Research Scientist, Datapop
    Pablo N. Mendes, Research Associate, Open Knowledge Foundation
    11/5/12
  • Evolution of the Semantic Web
    [Timeline labels: 1945, 1991, + Internet, 2001]
    “I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers.” – Tim Berners-Lee
  • Emergent Knowledge in Public Text
    [Figure: entity graph. Nodes: Nicolas Poussin, Nicolas Flamel, Victor Hugo, Priory of Sion, The Hunchback of Notre Dame, Louvre, Leonardo Da Vinci. Edge labels: painted_by, mentioned_in, member_of, cryptic_motto_of, displayed_at, written_by.]
  • Emergent Knowledge in Biomedical Research Papers
    Dietary fish oils contain Eicosapentaenoic acid.
    Eicosapentaenoic acid reduces Blood viscosity.
    Raynaud’s disease patients have elevated blood viscosity.
    Confirmed by clinical trials.
    Swanson, D. R. (1986). "Fish Oil, Raynaud's Syndrome, and Undiscovered Public Knowledge." Perspectives in Biology and Medicine 30(1): 7-18.
    Magnesium can inhibit Spreading cortical depression.
    Spreading cortical depression may be implicated in Migraine Attacks.
    12 subsequent studies support hypothesis.
    Swanson, D. R. (1988). "Migraine and Magnesium: Eleven Neglected Connections." Perspectives in Biology and Medicine 31(4): 526-557.
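The two Swanson examples above follow the same pattern: facts stated in separate papers chain into a connection that no single paper states. A minimal sketch of that chaining, assuming the slide's facts are stored as (subject, relation, object) triples and treated as an undirected graph; the names `FACTS`, `build_graph` and `connect` are illustrative and not from the talk:

```python
from collections import defaultdict, deque

# Facts transcribed from the slide as (subject, relation, object) triples.
FACTS = [
    ("Dietary fish oils", "contain", "Eicosapentaenoic acid"),
    ("Eicosapentaenoic acid", "reduces", "Blood viscosity"),
    ("Raynaud's disease patients", "have elevated", "Blood viscosity"),
    ("Magnesium", "can inhibit", "Spreading cortical depression"),
    ("Spreading cortical depression", "may be implicated in", "Migraine attacks"),
]


def build_graph(facts):
    """Index every fact in both directions: entity -> [(relation, neighbour)]."""
    graph = defaultdict(list)
    for subj, rel, obj in facts:
        graph[subj].append((rel, obj))
        graph[obj].append((rel, subj))
    return graph


def connect(facts, start, goal):
    """Breadth-first search for a shortest chain of entities from start to goal."""
    graph = build_graph(facts)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for _rel, neighbour in graph[path[-1]]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None  # no chain of known facts links the two entities


print(connect(FACTS, "Dietary fish oils", "Raynaud's disease patients"))
print(connect(FACTS, "Magnesium", "Migraine attacks"))
```

A chain found this way is only a hypothesis; the slide notes that the two originals were later supported by clinical trials and follow-up studies.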
  • Application of Emergent Knowledge in Biology – Drug Repurposing
    [Figure: relationship graph. Nodes: Rosiglitazone, Carboplatin, DNA fragmentation, PPARγ (peroxisome proliferator-activated receptor gamma), Cancer cell death, Metallothionein. Edge labels: induces, activates, induces, downregulates, downregulates.]
    Girnun, G. D., E. Naseri, et al. (2007). Cancer Cell 11(5): 395-406.
  • Research Areas
    • Extracting Factual Knowledge from Biomedical Research Articles
      – Entities – “Carboplatin induces Cell Death”
      – Relations – induces(Carboplatin, Cell Death)
      – Supervised Machine Learning
        • Expensive Training data
    • Discovering Patterns in Factual Knowledge
      – Paths – Carboplatin ??? Rosiglitazone
      – Subgraphs
  • LA-PDFText – Extracting Text From Research Papers
    Ramakrishnan, C., A. Patnia, E. Hovy and G. Burns (2012). "Layout-Aware Text Extraction from Full-text PDF of Scientific Articles." Source Code for Biology and Medicine 7(1): 7.
    http://code.google.com/p/lapdftext/
  • Unsupervised Fact Extraction
    Dallenbach-Hellweg, G. (1976) Fortschr Med 94(5): 256-263.
    Abstract: An excessive endogenous or exogenous stimulation by estrogen induces adenomatous hyperplasia of the endometrium.
    [Figure: dependency parse. Relationship: induces. Subject head (nsubj): stimulation. Object head (dobj): hyperplasia. Other words: An, excessive, endogenous, exogenous, estrogen, adenomatous, endometrium, the. Other dependency labels: det, amod, prep_of, prep_by, conj_or.]
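The slide above reads the relationship and its two argument heads directly off the dependency parse. A minimal sketch of that step, assuming the parse is available as (label, head, dependent) tuples; the pairing of labels with words in `DEPS` is reconstructed from the sentence and the flattened figure, so treat it as an assumption, and `head_triples` and `modifiers` are illustrative names rather than the authors' code:

```python
# Typed dependencies as (label, head, dependent).
# ASSUMPTION: this pairing is reconstructed from the slide, not copied from it.
DEPS = [
    ("nsubj", "induces", "stimulation"),
    ("dobj", "induces", "hyperplasia"),
    ("det", "stimulation", "An"),
    ("amod", "stimulation", "excessive"),
    ("amod", "stimulation", "endogenous"),
    ("conj_or", "endogenous", "exogenous"),
    ("prep_by", "stimulation", "estrogen"),
    ("amod", "hyperplasia", "adenomatous"),
    ("prep_of", "hyperplasia", "endometrium"),
    ("det", "endometrium", "the"),
]


def head_triples(deps):
    """Return (subject head, relationship, object head) for each verb
    that has both an nsubj and a dobj dependent."""
    subjects, objects = {}, {}
    for label, head, dependent in deps:
        if label == "nsubj":
            subjects[head] = dependent
        elif label == "dobj":
            objects[head] = dependent
    return [(subjects[v], v, objects[v]) for v in subjects if v in objects]


def modifiers(deps, head):
    """Adjectival modifiers (amod) attached directly to a head word."""
    return [dep for label, h, dep in deps if h == head and label == "amod"]


print(head_triples(DEPS))              # [('stimulation', 'induces', 'hyperplasia')]
print(modifiers(DEPS, "hyperplasia"))  # ['adenomatous']
```

The head triple alone loses most of the meaning; the modifiers and prepositional attachments are what allow the full entities to be rebuilt around each head.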
  • Resulting Structure (RDF)
    Dallenbach-Hellweg, G. (1976) Fortschr Med 94(5): 256-263.
    Abstract: An excessive endogenous or exogenous stimulation by estrogen induces adenomatous hyperplasia of the endometrium.
    [Figure: RDF graph. Nodes: adenomatous hyperplasia; An excessive endogenous or exogenous stimulation; modified_entity_1; modified_entity_2; composite_entity_1; estrogen; endometrium. Edge labels: hasModifier, hasPart, induces.]
    Cartic Ramakrishnan, Pablo N. Mendes, Shaojun Wang, Amit P. Sheth: Unsupervised Discovery of Compound Entities for Relationship Extraction. EKAW 2008: 146-155
  • Detecting Nested Entities
    Chevy Chase Bank on 5th and 3rd
    Syntactic Dependencies: nn, prep_on
    [[[Chevy Chase]_Person Bank]_Org on 5th and 3rd]_Location
  • Result of Unsupervised Extraction
    Abstracts of ~18 million research articles
    ~200 million parse trees
    Entity Relationship network
    [Figure: the RDF graph from the "Resulting Structure (RDF)" slide, repeated]
    • 137,414,820 triples with named relations
      – Triple “hair-ball”
  • Discovering Patterns in Factual Knowledge
  • Discovering Patterns in Factual Knowledge
    • Finding Paths
      – Exponential no. of paths: information overload
      – Relevance: not all paths are equally relevant
    • Our solution
      – Subgraph detection with fixed node budget
      – Heuristic edge weighting to control relevance
    Cartic Ramakrishnan, William H. Milnor, Matthew Perry, Amit P. Sheth: Discovering informative connection subgraphs in multi-relational graphs. SIGKDD Explorations 7(2): 56-63 (2005)
  • Candidate Subgraph Identification
    • Bidirectional lock-step growth from S and T
      – Next hop based on edge weights
      – Terminate when cut edge limit reached
      – Results in candidate graph
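The growth procedure on this slide can be sketched roughly as follows. This is a simplified illustration, not the algorithm from the cited paper: it assumes lower edge weights are preferred (in line with the later slide that treats weights as resistance), and it stops on a total node budget instead of the slide's cut edge limit, whose exact definition the transcript does not give. The function name `grow_candidate` and the adjacency-dict input format are also assumptions.

```python
import heapq


def grow_candidate(graph, source, target, node_budget):
    """Grow a candidate subgraph from both endpoints in lock-step.

    graph: dict node -> dict neighbour -> weight (undirected, listed both ways).
    ASSUMPTION: lower weight = preferred next hop; stop on a node budget.
    """
    sides = []
    for seed in (source, target):
        frontier = [(w, seed, nbr) for nbr, w in graph[seed].items()]
        heapq.heapify(frontier)
        sides.append({"nodes": {seed}, "frontier": frontier})

    def total():
        return len(sides[0]["nodes"] | sides[1]["nodes"])

    progressed = True
    while progressed and total() < node_budget:
        progressed = False
        for side in sides:  # lock-step: at most one new node per side per round
            if total() >= node_budget:
                break
            while side["frontier"]:
                _w, _u, v = heapq.heappop(side["frontier"])
                if v in side["nodes"]:
                    continue  # this side already reached v
                side["nodes"].add(v)
                for nbr, w in graph[v].items():
                    if nbr not in side["nodes"]:
                        heapq.heappush(side["frontier"], (w, v, nbr))
                progressed = True
                break

    nodes = sides[0]["nodes"] | sides[1]["nodes"]
    edges = {(u, v): w for u in nodes for v, w in graph[u].items() if v in nodes}
    return nodes, edges


toy = {
    "S": {"a": 1.0, "x": 5.0},
    "a": {"S": 1.0, "T": 1.0},
    "x": {"S": 5.0, "y": 5.0},
    "y": {"x": 5.0},
    "T": {"a": 1.0, "z": 4.0},
    "z": {"T": 4.0},
}
nodes, edges = grow_candidate(toy, "S", "T", node_budget=3)
print(sorted(nodes))  # ['S', 'T', 'a']
```

With a budget of three nodes, the toy graph keeps only the low-weight bridge through `a` and never expands toward `x`, `y` or `z`.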
  • Finding Best Subgraphs
    • Candidate Graph
      – Too large to be useful
      – Listing paths = information overload
    • Electrical Circuit
      – Edge weights = resistance
      – +1 volt at source node & ground at target
    • Using Ohm’s and Kirchhoff’s laws
      – find maximum current flow paths through the candidate graph from S to T
    Cartic Ramakrishnan, William H. Milnor, Matthew Perry, Amit P. Sheth: Discovering informative connection subgraphs in multi-relational graphs. SIGKDD Explorations 7(2): 56-63 (2005)
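The circuit analogy above can be made concrete with a small worked sketch. It assumes each undirected edge carries a positive resistance, fixes the source at 1 volt and the target at 0, and solves Kirchhoff's current law at every other node by repeated relaxation (each node's voltage becomes the conductance-weighted average of its neighbours). Following the largest outgoing current at each hop is a simplification chosen for illustration; the cited paper's path selection may differ. `node_voltages` and `greedy_path` are hypothetical names.

```python
def node_voltages(edges, source, target, sweeps=2000):
    """edges: dict (u, v) -> resistance, each undirected edge listed once."""
    conductance = {}
    for (u, v), r in edges.items():
        conductance.setdefault(u, {})[v] = 1.0 / r
        conductance.setdefault(v, {})[u] = 1.0 / r
    volts = {n: 0.0 for n in conductance}
    volts[source] = 1.0  # +1 volt at the source, ground (0 V) at the target
    for _ in range(sweeps):
        for n, nbrs in conductance.items():
            if n in (source, target):
                continue
            # Kirchhoff's current law: net current at an interior node is zero.
            volts[n] = sum(g * volts[m] for m, g in nbrs.items()) / sum(nbrs.values())
    return volts


def greedy_path(edges, volts, source, target):
    """Follow the edge carrying the most current (Ohm's law: I = dV / R)."""
    path = [source]
    while path[-1] != target:
        here = path[-1]
        best, best_current = None, 0.0
        for (u, v), r in edges.items():
            if here in (u, v):
                other = v if u == here else u
                current = (volts[here] - volts[other]) / r
                if current > best_current:
                    best, best_current = other, current
        if best is None:
            return None  # no current leaves this node
        path.append(best)
    return path


# Two parallel routes from S to T: via a (low resistance) and via b (high).
circuit = {("S", "a"): 1.0, ("a", "T"): 1.0, ("S", "b"): 2.0, ("b", "T"): 2.0}
volts = node_voltages(circuit, "S", "T")
print(greedy_path(circuit, volts, "S", "T"))  # ['S', 'a', 'T']
```

Both interior nodes sit at 0.5 V here, but the route through `a` carries 0.5 units of current against 0.25 through `b`, so it is the one reported.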
  • Semi-automated Knowledge Discovery in Biomedicine – How far are we?
    • Trust in extracted facts
      – Extraction errors
      – Poor quality sources
      – No provenance
      – Misleading citations
      – Intentionally misleading research reports
      – Unintentional mistakes in research reports
    • Information overload
  • Building A Web of Linked Entities with DBpedia Spotlight
    Pablo N. Mendes, Research Associate, Open Knowledge Foundation