LA Semantic Web meetup nov5th 2012

Transcript of "LA Semantic Web meetup nov5th 2012"

  1. The Semantic Web (There and Back Again). Cartic Ramakrishnan, Research Scientist, Datapop; Pablo N. Mendes, Research Associate, Open Knowledge Foundation. 11/5/12
  2. Evolution of the Semantic Web. Timeline: 1945, 1991, + Internet, 2001. “I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers.” – Tim Berners-Lee
  3. Emergent Knowledge in Public Text. [Graph figure] Nodes: Nicolas Poussin, Nicolas Flamel, Victor Hugo, Priory of Sion, The Hunchback of Notre Dame, Louvre, Leonardo Da Vinci. Edge labels: painted_by, mentioned_in, member_of, cryptic_motto_of, displayed_at, written_by.
  4. Emergent Knowledge in Biomedical Research Papers
     • Dietary fish oils contain Eicosapentaenoic acid
     • Eicosapentaenoic acid reduces Blood viscosity
     • Raynaud’s disease patients have elevated blood viscosity.
     • Confirmed by clinical trials
     Swanson, D. R. (1986). "Fish Oil, Raynauds Syndrome, and Undiscovered Public Knowledge." Perspectives in Biology and Medicine 30(1): 7-18.
     • Magnesium can inhibit Spreading cortical depression
     • Spreading cortical depression may be implicated in Migraine Attacks
     • 12 subsequent studies support hypothesis
     Swanson, D. R. (1988). "Migraine and Magnesium: Eleven Neglected Connections." Perspectives in Biology and Medicine 31(4): 526-557.
  5. Application of Emergent Knowledge in Biology – Drug Repurposing. [Graph figure] Nodes: Rosiglitazone, Carboplatin, DNA fragmentation, PPARγ (peroxisome proliferator-activated receptor gamma), Cancer cell death, Metallothionein. Edge labels: induces (twice), activates, downregulates (twice). Girnun, G. D., E. Naseri, et al. (2007). Cancer Cell 11(5): 395-406
  6. Research Areas
     • Extracting Factual Knowledge from Biomedical Research Articles
       – Entities – “Carboplatin induces Cell Death”
       – Relations – induces(Carboplatin, Cell Death)
       – Supervised Machine Learning
         • Expensive Training data
     • Discovering Patterns in Factual Knowledge
       – Paths – Carboplatin ??? Rosiglitazone
       – Subgraphs
  7. LA-PDFText – Extracting Text From Research Papers. Ramakrishnan, C., A. Patnia, E. Hovy and G. Burns (2012). "Layout-Aware Text Extraction from Full-text PDF of Scientific Articles." Source Code for Biology and Medicine 7(1): 7. http://code.google.com/p/lapdftext/
  8. LA-PDFText – Extracting Text From Research Papers. Ramakrishnan, C., A. Patnia, E. Hovy and G. Burns (2012). "Layout-Aware Text Extraction from Full-text PDF of Scientific Articles." Source Code for Biology and Medicine 7(1): 7. http://code.google.com/p/lapdftext/
  9. Unsupervised Fact Extraction. Dallenbach-Hellweg, G. (1976) Fortschr Med 94(5): 256-263. Abstract: An excessive endogenous or exogenous stimulation by estrogen induces adenomatous hyperplasia of the endometrium. [Dependency parse figure] Relationship: induces. Subject head (nsubj): stimulation. Object head (dobj): hyperplasia. Other dependency labels shown: det, amod, prep_of, prep_by, conj_or. Dependents shown: An, excessive, endogenous, exogenous, estrogen, adenomatous, endometrium, the.
  10. Resulting Structure (RDF). Dallenbach-Hellweg, G. (1976) Fortschr Med 94(5): 256-263. Abstract: An excessive endogenous or exogenous stimulation by estrogen induces adenomatous hyperplasia of the endometrium. [RDF graph figure] Nodes: adenomatous hyperplasia, An excessive endogenous or exogenous stimulation, estrogen, endometrium, modified_entity_1, modified_entity_2, composite_entity_1. Edge labels: hasModifier, hasPart, induces. Cartic Ramakrishnan, Pablo N. Mendes, Shaojun Wang, Amit P. Sheth: Unsupervised Discovery of Compound Entities for Relationship Extraction. EKAW 2008: 146-155
  11. Detecting Nested Entities. Example phrase: Chevy Chase Bank on 5th and 3rd. Syntactic Dependencies shown: nn, prep_on. Nested annotation: [[[Chevy Chase]_Person Bank]_Org on 5th and 3rd]_Location
  12. Result of Unsupervised Extraction. Stages shown: abstracts of ~18 million research articles; ~200 million parse trees; Entity Relationship network. [RDF graph figure for the estrogen example sentence, with edge labels hasModifier, hasPart, induces]
     • 137,414,820 triples with named relations
       – Triple “hair-ball”
  13. Discovering Patterns in Factual Knowledge
  14. Discovering Patterns in Factual Knowledge
     • Finding Paths
       – Exponential no. of paths: information overload
       – Relevance: not all paths are equally relevant
     • Our solution
       – Subgraph detection with fixed node budget
       – Heuristic edge weighting to control relevance
     Cartic Ramakrishnan, William H. Milnor, Matthew Perry, Amit P. Sheth: Discovering informative connection subgraphs in multi-relational graphs. SIGKDD Explorations 7(2): 56-63 (2005)
  15. Candidate Subgraph Identification
     • Bidirectional lock-step growth from S and T
       – Next hop based on edge weights
       – Terminate when cut edge limit reached
       – Results in candidate graph
  16. Finding Best Subgraphs
     • Candidate Graph
       – Too large to be useful
       – Listing paths = information overload
     • Electrical Circuit
       – Edge weights = resistance
       – +1 volt at source node & ground at target
     • Using Ohm’s and Kirchhoff’s laws
       – find maximum current flow paths through the candidate graph from S to T
     Cartic Ramakrishnan, William H. Milnor, Matthew Perry, Amit P. Sheth: Discovering informative connection subgraphs in multi-relational graphs. SIGKDD Explorations 7(2): 56-63 (2005)
  17. Semi-automated Knowledge Discovery in Biomedicine – How far are we?
     • Trust in extracted facts
       – Extraction errors
       – Poor quality sources
       – No provenance
       – Misleading citations
       – Intentionally misleading research reports
       – Unintentional mistakes in research reports
     • Information overload
  18. Building A Web of Linked Entities with DBpedia Spotlight. Pablo N. Mendes, Research Associate, Open Knowledge Foundation
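
The electrical-circuit analogy on slide 16 can be illustrated with a small sketch. It is not the authors' implementation: the four-node graph, the resistance values, and the function names `solve_voltages` and `edge_currents` are made up for illustration, and the iterative relaxation solver is just one simple way to satisfy Kirchhoff's current law. It only computes per-edge currents; the step that selects the best paths within a node budget is not reproduced here.

```python
from collections import defaultdict


def solve_voltages(edges, source, target, tol=1e-12, max_iter=100_000):
    """Hold source at +1 V and target at 0 V, then relax interior voltages.

    edges maps an undirected pair (u, v) to a resistance (the edge weight).
    At convergence each interior node's voltage is the conductance-weighted
    average of its neighbours' voltages, which is Kirchhoff's current law.
    """
    conductance = defaultdict(dict)
    for (u, v), resistance in edges.items():
        conductance[u][v] = 1.0 / resistance
        conductance[v][u] = 1.0 / resistance

    voltage = {node: 0.0 for node in conductance}
    voltage[source] = 1.0
    interior = [n for n in conductance if n not in (source, target)]

    for _ in range(max_iter):
        largest_change = 0.0
        for node in interior:
            total = sum(conductance[node].values())
            updated = sum(g * voltage[m] for m, g in conductance[node].items()) / total
            largest_change = max(largest_change, abs(updated - voltage[node]))
            voltage[node] = updated
        if largest_change < tol:
            break
    return voltage


def edge_currents(edges, voltage):
    """Ohm's law per edge: I = (V_u - V_v) / R, keyed in the direction of flow."""
    currents = {}
    for (u, v), resistance in edges.items():
        current = (voltage[u] - voltage[v]) / resistance
        currents[(u, v) if current >= 0 else (v, u)] = abs(current)
    return currents


# Hypothetical candidate graph: two parallel routes from S to T.
edges = {("S", "A"): 1.0, ("A", "T"): 1.0, ("S", "B"): 2.0, ("B", "T"): 2.0}
voltage = solve_voltages(edges, "S", "T")
currents = edge_currents(edges, voltage)
ranked = sorted(currents.items(), key=lambda item: item[1], reverse=True)
for edge, amps in ranked:
    print(edge, round(amps, 3))
```

In this toy graph the low-resistance route through A carries twice the current of the route through B, so its edges rank first.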
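
The fact-extraction step on slides 9 and 10 can be sketched in the same spirit. The dependency edges below are typed in by hand as one reading of the flattened parse figure for the estrogen sentence (no parser is called), and `extract_triples` and `modifiers` are hypothetical helper names, not functions from the authors' system. The sketch pairs each verb's `nsubj` and `dobj` heads into a triple and lists the modifiers that would be attached to each head.

```python
# (head, dependency label, dependent), reconstructed by hand from slide 9.
DEPENDENCIES = [
    ("induces", "nsubj", "stimulation"),
    ("induces", "dobj", "hyperplasia"),
    ("stimulation", "det", "An"),
    ("stimulation", "amod", "excessive"),
    ("stimulation", "amod", "endogenous"),
    ("endogenous", "conj_or", "exogenous"),
    ("stimulation", "prep_by", "estrogen"),
    ("hyperplasia", "amod", "adenomatous"),
    ("hyperplasia", "prep_of", "endometrium"),
    ("endometrium", "det", "the"),
]


def extract_triples(deps):
    """Return (subject head, relation, object head) for verbs with both roles."""
    subjects, objects = {}, {}
    for head, label, dependent in deps:
        if label == "nsubj":
            subjects[head] = dependent
        elif label == "dobj":
            objects[head] = dependent
    return [(subjects[verb], verb, objects[verb]) for verb in subjects if verb in objects]


def modifiers(deps, head):
    """Adjectival and prepositional dependents attached directly to a head word."""
    return [
        (label, dependent)
        for h, label, dependent in deps
        if h == head and (label == "amod" or label.startswith("prep_"))
    ]


triples = extract_triples(DEPENDENCIES)
for subject, relation, obj in triples:
    print(f"{relation}({subject}, {obj})")
    print("  subject modifiers:", modifiers(DEPENDENCIES, subject))
    print("  object modifiers:", modifiers(DEPENDENCIES, obj))
```

The modifier lists correspond loosely to the hasModifier and hasPart edges of the RDF structure on slide 10; how the original system maps them to modified and composite entities is not shown in the slides.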