Semantic Textmining

Goals and achievements
BioHackathon 2010
Team members
•   Hammad
•   Matthias
•   Venkata
•   Heiko
•   YAMAMOTO-san
•   Alberto
Original Proposal
• Integration of text mining results
  – Reflect / Whatizit / Medie
  – Results as triplets
     • URI and predicates
  – Implementation with SADI
  – Result presentation using aTag
• Explore relations
• Interfaces
The work done
• Integration of text mining results
   – Reflect / Whatizit / Medie
   – Results as triplets
      • URI and predicates
   – Future BioPython module and REST service
• Explore relations
   – Sesame endpoint
   – Biogateway
   – ARQ for federated queries
• Interfaces
   – Result presentation using aTag
   – Exhibit faceted interface
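A federated query of the kind ARQ can run might be sketched as follows. This is only an illustration: both endpoint URLs are hypothetical placeholders (not the actual Sesame or Biogateway addresses), and the assumption that the second endpoint exposes rdfs:label for the protein URIs is ours, not something stated on the slides.

```python
# Sketch: build a SPARQL query that joins text-mining annotations from one
# endpoint with labels from another, using the SERVICE keyword.
# Both endpoint URLs below are hypothetical placeholders.
TM_ENDPOINT = "http://example.org/sesame/repositories/textmining"
KB_ENDPOINT = "http://example.org/biogateway/sparql"


def federated_query(tm_endpoint, kb_endpoint):
    """Return a SPARQL query string with one SERVICE block per endpoint."""
    return f"""
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?article ?protein ?label WHERE {{
  SERVICE <{tm_endpoint}> {{
    ?item a sioc:Item ;
          sioc:about ?article ;
          sioc:topic ?protein .
  }}
  SERVICE <{kb_endpoint}> {{
    ?protein rdfs:label ?label .
  }}
}}
""".strip()


query = federated_query(TM_ENDPOINT, KB_ENDPOINT)
print(query)
```

The query text can then be handed to any engine that understands SERVICE, ARQ being the one named above.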
http://whatizit.neurocommons.org/
RDF schema for TM

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:sioc="http://rdfs.org/sioc/ns#">
 <rdf:Description>
  <rdf:type rdf:resource="http://rdfs.org/sioc/ns#Item"/>
  <sioc:about rdf:resource="http://www.ncbi.nlm.nih.gov/pubmed/9002550"/>
  <sioc:content>SBMA</sioc:content>
  <sioc:topic rdf:resource="http://purl.uniprot.org/uniprot/P10275"/>
  <rdfs:seeAlso rdf:resource="http://www.ncbi.nlm.nih.gov/pubmed/9002550"/>
 </rdf:Description>
</rdf:RDF>
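A minimal Python sketch of how one text-mining hit could be emitted in this schema, written as N-Triples instead of RDF/XML to keep it short. The function name `annotation_triples` and the blank-node label `hit1` are illustrative assumptions; the properties and URIs are the ones shown in the schema above.

```python
SIOC = "http://rdfs.org/sioc/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def annotation_triples(node_id, pmid, text, topic_uri):
    """Return N-Triples lines describing one hit as a sioc:Item.

    node_id is a hypothetical blank-node label; the schema's
    rdf:Description carries no rdf:about, so a blank node is used.
    """
    subject = f"_:{node_id}"
    article = f"<http://www.ncbi.nlm.nih.gov/pubmed/{pmid}>"
    # Escape only backslashes and double quotes; enough for short terms.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return [
        f"{subject} <{RDF_TYPE}> <{SIOC}Item> .",
        f"{subject} <{SIOC}about> {article} .",
        f'{subject} <{SIOC}content> "{escaped}" .',
        f"{subject} <{SIOC}topic> <{topic_uri}> .",
        f"{subject} <{RDFS_SEE_ALSO}> {article} .",
    ]


lines = annotation_triples(
    "hit1", "9002550", "SBMA", "http://purl.uniprot.org/uniprot/P10275"
)
print("\n".join(lines))
```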
http://reflect.ws
MEDIE and Enju APIs
• MEDIE is an intelligent search
  engine to retrieve biomedical
  correlations from MEDLINE, based
  on indexing by Natural Language
  Processing and Text Mining
  techniques.

• Enju is a syntactic parser for
  English.
Medie XML output
http://www-tsujii.is.s.u-tokyo.ac.jp/medie/dbcls.cgi?pmid=19116711
Enju XML output
http://docman.dbcls.jp/medieconv?pmid=17551671
http://togows.dbcls.jp/entry/pubmed/pmid.ttl

http://www.uniprot.org/uniprot/P12345.rdf
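The URL patterns on these slides can be wrapped in small helpers, sketched below. The function names are illustrative, and reading `pmid` in the TogoWS URL as a placeholder for a PubMed ID is an assumption. Nothing is fetched here, and whether these services still answer at these addresses is not guaranteed.

```python
from urllib.parse import urlencode


def medie_xml_url(pmid):
    """MEDIE XML output for one PubMed ID (pattern from the slide)."""
    return "http://www-tsujii.is.s.u-tokyo.ac.jp/medie/dbcls.cgi?" + urlencode({"pmid": pmid})


def enju_xml_url(pmid):
    """Enju XML output for one PubMed ID (pattern from the slide)."""
    return "http://docman.dbcls.jp/medieconv?" + urlencode({"pmid": pmid})


def togows_pubmed_ttl_url(pmid):
    """TogoWS Turtle entry; assumes 'pmid' in the slide URL is a placeholder."""
    return f"http://togows.dbcls.jp/entry/pubmed/{pmid}.ttl"


def uniprot_rdf_url(accession):
    """UniProt RDF for one accession, e.g. P12345."""
    return f"http://www.uniprot.org/uniprot/{accession}.rdf"


print(medie_xml_url("19116711"))
print(enju_xml_url("17551671"))
print(togows_pubmed_ttl_url("9002550"))
print(uniprot_rdf_url("P12345"))
```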
Workflow
• Whatizit, Reflect, Medie → XML → RDF
• Pubmed → TogoWS → RDF
• Uniprot → RDF
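The XML-to-RDF step of the workflow starts by pulling annotation records out of each tool's XML. A rough sketch follows; the XML shape used here (`document`, `annotation`, `source`, `uri`) is a made-up simplification for illustration and is not the real output format of Whatizit, Reflect or Medie. The sample values reuse the PubMed ID, term and UniProt URI from the RDF schema slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified XML; the real tools each use their own format.
SAMPLE_XML = """
<document pmid="9002550">
  <annotation source="whatizit"
              uri="http://purl.uniprot.org/uniprot/P10275">SBMA</annotation>
</document>
"""


def extract_hits(xml_text):
    """Parse the assumed XML shape into a list of plain dict records."""
    root = ET.fromstring(xml_text.strip())
    pmid = root.get("pmid")
    return [
        {
            "pmid": pmid,
            "source": el.get("source"),
            "text": (el.text or "").strip(),
            "topic": el.get("uri"),
        }
        for el in root.iter("annotation")
    ]


hits = extract_hits(SAMPLE_XML)
print(hits)
```

Each record holds what the RDF schema needs: the article, the matched text and the topic URI.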
• Substance A → Interacts with → Receptor B
• Region C → axonal projections → brain region D
• Region D → aversive stimuli

Interlink these entities with
taxonomies & ontologies
TMOntology
http://hackathon3.dbcls.jp/wiki/TextMining

Textmining activities at BioHackathon 2010