BioNLPSADI

Google Project Page: https://code.google.com/p/bionlp-sadi/

Project Demo Page: https://cbakerlab:8080/p/bionlp-sadi/

Presenter: Ahmad C. Bukhari

• Motivation and Introduction
• Past Research Work
• Proposed Methodology
• System Architecture
• System Design
• Ontology Development
• SADI Service Development
• Demo and Code View
• Experiments and Results
• Conclusion and Future Work
• References

• Scientific literature is the most up-to-date source of information

• Scientific literature production is growing explosively

• The Internet is full of biology-related databases and search engines

• Text is provided by PubMed and OMIM.
• Sequence data is provided by GenBank (DNA) and UniProt (protein).
• Protein structure data is provided by PDB, SCOP, and CATH.

• Thousands of documents are produced weekly: it is impossible to read them all

• Several solutions were developed based on AI techniques

• They lost significance because new terms keep appearing and their mechanisms were static

• NLP emerged as a possible solution in the past decade

• NLP was widely adopted by scientists

• Several NLP-based applications are available on the Internet

• We introduce a semantically rich, interoperable suite of BioNLP services based on the SADI framework.

• Exploits NLP technologies to extract biologically useful information from scientific documents.

• Presents the extracted information so that it is reusable, searchable and interoperable.

• Displays the output in an integrated format, which can lead to better biological systems analysis.

Existing text mining services

Existing text mining services with web services

•U-Compare
•Whatizit
•EBIMED

• The scientific community is looking for a sophisticated solution that can handle biological data interoperability, usability and integration challenges.

• We coupled useful biological NLP techniques with the SADI framework to cope with biological information logistics issues.
• The proposed solution exploits NLP technologies to extract biologically relevant information, with semantic support.

• The proposed solution provides output in a reusable, searchable and interoperable format.

[Figure: architecture diagram showing a User Interaction Layer and the SADI services suite]

[Figure labels: REST, XML, SOAP, or WSDL; XML, RDF, OWL, RDFS; SWS+BNLP; NLP+WS = XML output; tools shown: KLEIO, U-Compare, GENIA, FACTA+, etc.]
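
To make the contrast between plain XML output and RDF output concrete, here is a minimal sketch of how one extracted mention could be written as RDF in N-Triples syntax. The `http://example.org/bionlp/` namespace and the property names `foundIn`, `mentionType` and `surfaceText` are placeholders invented for illustration; they are not the project's ontology or the SADI vocabulary.

```python
# Placeholder namespace; NOT a URI used by the BioNLP-SADI project.
BASE = "http://example.org/bionlp/"


def escape_literal(value):
    """Escape characters that must not appear raw in an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def to_ntriples(doc_id, mentions):
    """Serialize (mention_type, surface_text) pairs as N-Triples lines.

    Property names are hypothetical and exist only for this example.
    """
    lines = []
    for index, (kind, text) in enumerate(mentions, start=1):
        subject = f"<{BASE}annotation/{doc_id}/{index}>"
        lines.append(f"{subject} <{BASE}foundIn> <{BASE}document/{doc_id}> .")
        lines.append(f'{subject} <{BASE}mentionType> "{escape_literal(kind)}" .')
        lines.append(f'{subject} <{BASE}surfaceText> "{escape_literal(text)}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_ntriples("doc1", [("Drug", "Amoxicillin")]))
```

Each annotation gets its own URI, so a second service could add statements about the same annotation without changing the first service's output.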

Deal with Annotation

All document related concepts

Feature Modeling

• mutationFinder
• DrugExtractor (enhanced)
• DrugDrug Interaction (80% complete)
• Drug2Food Interaction (business logic complete)
• Pmid2pdf (enhanced)
• Pdf2ascii (upgraded overall; the existing version had many bugs)
• SADI client-level integration service
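
As a rough illustration of what a mutation-finding service does, the sketch below matches point-mutation mentions written as wild-type residue, position, mutant residue (for example `E6V`). The pattern and the function name `find_point_mutations` are assumptions made for this example; they are not the patterns or API of the mutationFinder service listed above, which would need to cover many more notations.

```python
import re

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical pattern: wild-type residue, 1-based position, mutant residue.
POINT_MUTATION = re.compile(
    rf"\b([{AMINO_ACIDS}])([1-9][0-9]*)([{AMINO_ACIDS}])\b"
)


def find_point_mutations(text):
    """Return (wild_type, position, mutant) tuples in order of appearance."""
    return [
        (match.group(1), int(match.group(2)), match.group(3))
        for match in POINT_MUTATION.finditer(text)
    ]


if __name__ == "__main__":
    # Made-up sentence used only to exercise the pattern.
    print(find_point_mutations("The E6V substitution and the R273H variant were studied."))
```

The word boundaries keep the pattern from firing inside longer tokens such as `H1N1`, at the cost of missing three-letter forms like `Glu6Val`.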

Tools and technologies used
•Java
•Servlet
•RDF
•SPARQL
•JSP
•JSF
•JavaScript
•XHTML
•Several third-party libraries

Demo and Code View

• Show where the drug Amoxicillin (DB01060) has a positive effect against higher serum levels
• Give me the sentences where a mutation and a drug name occur together.
• Extract all the drug names from the text and show me the interactions (if any) among them
• Tell me the foods that interact badly with the drug Cytarabine
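
The second query (a mutation and a drug name in the same sentence) can be sketched as a simple co-occurrence filter. Everything here is a stand-in: the two-entry drug dictionary, the naive sentence splitter and the mutation pattern are assumptions for illustration, whereas the deck describes composing separate SADI services for these steps. The sample text is invented and reports no real finding.

```python
import re

# Toy dictionary; the two names are taken from the example queries.
DRUG_NAMES = {"amoxicillin", "cytarabine"}

# Hypothetical point-mutation pattern such as "R273H".
MUTATION = re.compile(
    r"\b[ACDEFGHIKLMNPQRSTVWY][1-9][0-9]*[ACDEFGHIKLMNPQRSTVWY]\b"
)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def cooccurring_sentences(text):
    """Return sentences that mention both a mutation and a known drug."""
    hits = []
    for sentence in split_sentences(text):
        words = {w.lower() for w in re.findall(r"[A-Za-z]+", sentence)}
        if MUTATION.search(sentence) and words & DRUG_NAMES:
            hits.append(sentence)
    return hits


if __name__ == "__main__":
    # Invented sample text, not taken from any publication.
    sample = (
        "Cytarabine resistance was linked to the R273H mutation. "
        "Amoxicillin was given to controls. "
        "The E6V variant was also screened."
    )
    for sentence in cooccurring_sentences(sample):
        print(sentence)
```

Only the first sample sentence is returned, because the second lacks a mutation and the third lacks a drug name.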

Consolidated output generated by the system

• Proposed a generalized architecture for semantic interoperability and integration among BNLP tools

• Performed several experiments by designing different corpora and choosing different combinations of services

• In most cases, the system generated results that met our requirements
• As future work, we will try to improve the system's performance by refining the algorithms
• A registry feature will be added to give users more freedom to work.

• Finding a topic

• Limited availability of tools

• Development challenges (countless)

• Integration with the web

• Finding a case study (still unresolved)

• E. Gatial, Z. Balogh, M. Ciglan, L. Hluchy, Focused web crawling mechanism based on page relevance, In: Proceedings of (ITAT 2005) Information Technologies Applications and Theory, 2005, pp. 41–45.
• N.F. Noy, D.L. McGuinness, Ontology development 101: a guide to creating your first ontology. http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.htm
• H. Cunningham, Y. Wilks, R.J. Gaizauskas, GATE, a General Architecture for Text Engineering. Computers and Humanities (2002), 1057-1060.
• R. Subhashini, V.J.S. Kumar, Shallow NLP techniques for noun phrase extraction, In: Proceedings of Trendz in Information Sciences & Computing (TISC), 2010, pp. 73-77.
• S. Nasrolahi, M. Nikdast, M. Boroujerdi, The semantic web: a new approach for future world wide web, In: Proceedings of World Academy of Science, Engineering and Technology, 2009, pp. 1149-1154.
• A.C. Bukhari, Y.G. Kim, Exploiting the Heavyweight Ontology with Multi-Agent System Using Vocal Command System: A Case Study on E-Mall, International Journal of Advancements in Computing Technology 3 (2011) 233-241.
• A.C. Bukhari, Y.G. Kim, Ontology-assisted automatic precise information extractor for visually impaired inhabitants, Artificial Intelligence Review (2005) ISSN: 0269-2821.
• D.H. Fudholi, N. Maneerat, R. Varakulsiripunth, Y. Kato, Application of Protégé, SWRL and SQWRL in fuzzy ontology-based menu recommendation, International Symposium on Intelligent Signal Processing and Communication Systems, 2009, pp. 631-634.
• W.A. Baumgartner, K.B. Cohen, L. Fox, G. Acquaah-Mensah, L. Hunter, Manual annotation is not sufficient for curating genomic databases, Bioinformatics 23 (2007) i41-i48.
• J. Laurila, N. Naderi, R. Witte, A. Riazanov, A. Kouznetsov, C.J.O. Baker, Algorithms and semantic infrastructure for mutation impact extraction and grounding, BMC Genomics 11 (Suppl 4) (2010) S24.
