
Semantics for universal search and discovery


WebLib has recently won a prestigious U.S. challenge.gov award with its NLMplus entry, a large scale semantic search and knowledge discovery application that makes innovative use of the National Library of Medicine’s vast collection of biomedical data and services.

The NLMplus app (http://nlmplus.com) combines a number of leading-edge semantic knowledge resources and technologies, such as a biomedical knowledge base, a semantic search engine, a distributed search engine, and a variety of smart content analysis and discovery services. Users can concurrently access 60 NLM databases to find trusted information ranging from consumer health topics to drugs and from news to clinical trials and translational medicine. One of the important innovations of NLMplus is WebLib’s Semantic Search Engine, which typically produces relevant search results with improved precision and recall from 1.6 million PubMed Review articles that are semantically indexed and searched on a WebLib server. The NLMplus application also sends conceptually enhanced user queries to NLM’s PubMed system of more than 21 million citations from the biomedical literature, life science journals, and online books.
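As a rough illustration of what a "conceptually enhanced" query might look like, the sketch below ORs each query phrase with its synonyms, ANDs the resulting groups, and builds a request URL for NCBI's E-utilities ESearch endpoint without sending it. The synonym table, the function names, and the expansion strategy are assumptions made for this example (the synonyms are borrowed from the slide 6 annotation); they are not WebLib's actual query translator.

```python
from urllib.parse import urlencode

# Hypothetical synonym table; the entries are taken from the slide 6 example.
SYNONYMS = {
    "anthracycline": ["anthracyclines", "anthracycline antibiotics"],
    "breast cancer": ["breast tumor", "breast carcinoma"],
}


def expand_query(phrases, synonyms=SYNONYMS):
    """OR each phrase with its synonyms, then AND the groups together."""
    groups = []
    for phrase in phrases:
        terms = [phrase] + synonyms.get(phrase, [])
        quoted = " OR ".join(f'"{t}"' for t in terms)
        groups.append(f"({quoted})")
    return " AND ".join(groups)


def esearch_url(term, retmax=20):
    """Build (but do not send) an E-utilities ESearch request for PubMed."""
    base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    return base + "?" + urlencode({"db": "pubmed", "term": term, "retmax": retmax})


query = expand_query(["anthracycline", "cardiotoxicity", "breast cancer"])
print(query)
print(esearch_url(query))
```

Phrases without an entry in the table pass through unchanged, so the expansion can only broaden each concept group and never drops a term the user typed.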

Providing flexible access to heterogeneous databases is a common challenge in medical libraries, biomedical research institutions and the health care industry. The same is true for non-biomedical content and applications. WebLib’s domain-independent semantic indexing and searching of local databases, combined with universal search and discovery solutions for free and fee-based content, allows organizations of all types to better serve their diverse user communities, including the public, researchers, professionals, and policy and decision makers.

Published in: Health & Medicine, Technology
  • A Meaningful Good Morning to You ALL! I am the last presenter before lunch, so an old Hungarian saying comes to mind: do not stand between a man and his lunch. So I promise to be fast, especially because I decided to cancel the online demo that I was planning to do, as I have seen other presenters struggling with theirs, and I don't want to cause myself too much trouble. This internet connection is faster, so maybe at the end I will show you NLMplus in practice. But I have to tell you that I have a dream: a dream about a conference where there is no problem with the internet connection. My presentation (and online demo) will focus on applying semantic knowledge solutions to the searching of scientific and technical databases.
  • READ the definition on the slide: it is our absolutely non-technical definition (Tamas Doszkocs). In the context of textual databases like PubMed, semantic searching primarily involves identifying and utilizing the concepts implicit in the user’s query and in the scientific literature.
  • READ the NLMplus description on the slide. CLICK on the “What is NLMplus” link. As you can see, NLMplus provides a simple and familiar user interface, much like Google and Bing. TYPE gout in the query box and HIT RETURN: NLMplus performs a universal search across multiple NLM databases in order to retrieve and organize a wide variety of trusted NLM content for the user, e.g. POINT to and READ the search tabs Health Topics, Recent PubMed Reviews, etc. The search results are shown in the left column (CLICK on the first result link, then PAUSE and then CLOSE the browser tab). The search result page illustrates 3 important NLMplus INNOVATIONS. POINT to 1: the All NLM+ Databases search shows the hit counts from 59 NLM databases (CLICK and PAUSE and then CLOSE the All NLM+ Databases browser tab).
  • The Explore and Discover Table displays semantic query associations from WebLib’s Biomedical Knowledge Base (BKB), which contains over 4 million concepts. MOVE the CURSOR to different concepts in the Explore and Discover Table to show definitions and search and query modification options. The Biomedical Knowledge Base is automatically generated from diverse semantic and content sources on the Web, including NLM’s Unified Medical Language System. The Biomedical Knowledge Base allows users to discover health concerns, treatments and procedures, drugs and substances, and alternative and integrative medicine approaches related to the user’s topic. The BKB also facilitates query refinement. MOVE the CURSOR to “colchicine” in the “Drugs and Supplements” column. CLICK and PAUSE and then CLOSE the Explore and Discover Table browser tab. Most importantly, the Biomedical Knowledge Base also powers the Semantic Indexing and Semantic Searching of the PubMed Reviews. CLICK on Recent PubMed Reviews and EXPLAIN: we have downloaded the more than 1.6 million PubMed review articles and semantically indexed them for the NLMplus app. (ALT-TAB back to the PPT presentation.)
  • Here is a typical PubMed Review article. The next slide shows the semantic annotation of the Title using WebLib’s Part-of-Speech Tagger and Noun Phrase Chunker
  • The Tagger/Chunker identified key phrases (TP: title phrase), mapped these key phrases to UMLS concepts, and added semantic information such as synonyms from our Biomedical Knowledge Base. The key phrases and concepts were also normalized in order to handle lexical/morphological/syntactic variants. WebLib’s AZ Dictionary system is used to normalize the key phrases and the concepts for semantic indexing.
  • WebLib Semantic Search Engine uses the Apache Solr enterprise search system with WebLib’s Semantic Search Indexing and the Biomedical Knowledge Base
  • We tested the scalability of WebLib’s Semantic Search Solution by implementing the semantic indexing and semantic searching of more than 1.6 million PubMed Reviews on a small NLMplus test server NOTE : if you are asked about the “ Enhanced PubMed Search ” , then explain that NLMplus also sends semantically enhanced Boolean queries to the PubMed e-utilities search server
  • This WebLib PubMed Reviews Semantic Search produces 40 relevant search results, while NLM’s keyword-based PubMed search engine retrieves over 11,000 articles and 909 review articles for the same query.
  • Most of these articles, however, are not relevant to the query, since they primarily deal with prostate cancer and not with “prostate cancer antigen 3”, a biomarker.
  • Another example with this simple query : “ transient cortical blindness following coronary angiography ” This retrieves 4 relevant review articles in NLMplus
  • PubMed, by contrast, finds only 2 review articles for the query “transient cortical blindness following coronary angiography”. The generally good precision and recall of NLMplus search results can be attributed to the use of NLM UMLS synonyms and other semantic information in the Biomedical Knowledge Base and Semantic Search Engine.
  • GO THROUGH the list of Innovations. We used the following NLM APIs, web services and software tools in this application (the color makes the list unreadable on the slide; however, I put links into it, so you will be able to see those sources if you download the presentation).
  • In addition to gradually expanding the Biomedical Knowledge Base, WebLib has also been working on creating a Web Knowledge Base in order to support meaningful information retrieval applications in all areas of science and technology. The integration of information from heterogeneous databases and the need for improved search results are a common problem in large organizations. We envision many other useful semantic technology apps for INDUSTRY, R&D and GOVERNMENT.
  • Please try NLMplus and give us feedback. We would happily collaborate with you on semantic projects. THANK YOU and have a good lunch!
  • Semantics for universal search and discovery
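
The annotation step described in the notes above (key phrases, UMLS concepts, synonyms, normalized index) might be sketched roughly as follows. This is a toy illustration under stated assumptions: a plain dictionary lookup stands in for WebLib's Part-of-Speech Tagger and Noun Phrase Chunker, the `CONCEPTS` table, `STOP_WORDS` set, and function names are hypothetical, and the normalizer only lowercases, drops stop words, and sorts tokens. It does not reproduce the stemming visible in the slide's `TI_NORMALP` line or the AZ Dictionary system. The CUIs and concept names are copied from slide 6.

```python
import re
from dataclasses import dataclass, field

# Toy concept table (hypothetical structure); CUIs and names copied from slide 6.
CONCEPTS = {
    "anthracycline": ("C0003234", "Anthracycline Antibiotics", ["anthracyclines"]),
    "cardiotoxicity": ("C0876994", "cardiac toxicity", []),
    "breast cancer": ("C0678222", "Breast Carcinoma", ["breast tumor"]),
    "breast cancer treatment": ("C1511300", "Breast Cancer Therapeutic Procedure", []),
}

# Assumed stop-word list, for illustration only.
STOP_WORDS = {"after", "of", "the", "and", "in", "for"}


@dataclass
class Annotation:
    phrase: str
    cui: str
    name: str
    synonyms: list = field(default_factory=list)


def normalize(text):
    """Lowercase, drop stop words, and sort tokens so word order does not matter."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(sorted(t for t in tokens if t not in STOP_WORDS))


def annotate(title, concepts=CONCEPTS):
    """Return an Annotation for every known phrase that occurs in the title."""
    lowered = title.lower()
    found = []
    for phrase, (cui, name, syns) in concepts.items():
        if re.search(rf"\b{re.escape(phrase)}\b", lowered):
            found.append(Annotation(phrase, cui, name, list(syns)))
    return found


title = "anthracycline cardiotoxicity after breast cancer treatment"
print("TI_NORMALP:", normalize(title))
for a in annotate(title):
    print(f"TP: {a.phrase}; CUI:{a.cui}; {a.name}; SY: {', '.join(a.synonyms)}")
```

Sorting the normalized tokens means that two titles with the same content words in a different order map to the same index key, which is one simple way to absorb syntactic variants.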

    1. Next Generation Search and Discovery Tools for the Web. Semantics for Universal Search and Discovery. Endre Jofoldi and Dr. Tamas Doszkocs
    2. What is semantic search? Keywords vs. Concepts. “semantic search is a search or a question or an action that produces meaningful results, even when the retrieved items contain none of the query terms, or the search involves no query text at all”
    3. What is NLMplus? NLMplus is a semantic search and knowledge discovery application designed to tap into the rich content offerings of the National Library of Medicine in all areas of biomedicine and health. See NLMplus
    4. What is NLMplus?
    5. Semantic Annotation of PubMed Reviews via the WebLib Part-of-Speech Tagger and Noun Phrase Chunker
    6. Semantic Annotation of a PubMed Review Title: phrases > key phrases > key concepts > synonyms > normalized semantic indexes
         • 19418823; TI: anthracycline cardiotoxicity after breast cancer treatment
         • TI_NORMALP: anthracycline breast canc cardiotoxicit treat
         • TP: anthracycline; CUI:C0003234; Anthracycline Antibiotics;
             • SY: anthracyclines;
         • TP: cardiotoxicity; CUI:C0876994; cardiac toxicity;
         • TP: breast cancer; CUI:C0678222; Breast Carcinoma;
             • SY: breast tumor;
         • TP: breast cancer treatment;
             • CUI:C1511300; Breast Cancer Therapeutic Procedure;
    7. Semantic Searching of PubMed Reviews. The WebLib Semantic Search Engine: Apache Solr enterprise search engine with WebLib’s Semantic Layer (http://lucene.apache.org/solr/); WebLib Biomedical Knowledge Base and Query Translator (WBQT)
    8. Scaling the Semantic Searching of PubMed Reviews. More than 1.6 million PubMed Reviews are semantically searchable on our test server. Query: prostate cancer antigen 3. Query: transient cortical blindness after coronary angiography
    10. Results from NLM’s PubMed: http://www.ncbi.nlm.nih.gov/pubmed?term=prostate%20cancer%20antigen%203
    12. PubMed Query: transient cortical blindness following coronary angiography
    13. More about NLMplus
         • NLMplus Innovations
         • Universal Search
         • Flexible access to 59 NLM databases
         • Semantic Search Engine
         • Semantic searching of PubMed Reviews
         • Biomedical Knowledge Base
         • powers “Explore and Discover”
         • powers the Semantic Search Engine
         • Intuitive User Interface
         • Wide Spectrum of NLM users
         • Consumers
         • Physicians
         • Biomedical researchers
         • Decision makers
        NLM APIs, web services and software tools utilized by the NLMplus app: Entrez Programming Utilities, UMLS Terminology Services, MedlinePlus Web Service, MetaMap API, SKR Web API. Read more about NLMplus
    14. Potential WebLib Semantic Search and Knowledge Base Applications. Common Problem: accessing heterogeneous databases. Meaningful Solution: semantic technologies
         • Biomedical Knowledge Base and Semantic Search
         • National Library of Medicine
         • National Institutes of Health
         • Department of Health and Human Services
         • U.S. Government Departments and Agencies
         • Health Care
         • Web Knowledge Base and Semantic Search
         • Libraries and Research Organizations
         • Media and Content Companies
         • Educational Institutions
         • Businesses
         • The Web
    15. Contact: Endre Jofoldi, endre.jofoldi@weblib.hu, twitter: EndreJofoldi; Dr. Tamas Doszkocs, [email_address]
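
To make the Solr-based architecture on slide 7 a little more concrete, here is a minimal sketch of how phrase annotations could be flattened into a Solr index document and how a concept-level query could be expressed against Solr's `/select` handler. The field names (`concept_cui`, `concept_text`), the helper functions, and the choice to require every concept with AND are assumptions for illustration; the slides do not describe WebLib's actual schema or semantic layer. Nothing is sent to a server here.

```python
import json
from urllib.parse import urlencode


def to_solr_doc(pmid, title, annotations):
    """Flatten phrase annotations into one document with multi-valued fields."""
    return {
        "id": pmid,
        "title": title,
        # Hypothetical field holding the UMLS concept identifiers.
        "concept_cui": [a["cui"] for a in annotations],
        # Hypothetical field holding each key phrase followed by its synonyms.
        "concept_text": [t for a in annotations for t in [a["phrase"], *a["synonyms"]]],
    }


def select_params(cuis, rows=10):
    """Query-string parameters for Solr's /select handler, requiring every CUI."""
    clause = " AND ".join(cuis)
    return urlencode({"q": f"concept_cui:({clause})", "rows": rows, "wt": "json"})


annotations = [
    {"phrase": "anthracycline", "cui": "C0003234", "synonyms": ["anthracyclines"]},
    {"phrase": "breast cancer", "cui": "C0678222", "synonyms": ["breast tumor"]},
]
doc = to_solr_doc(
    "19418823",
    "anthracycline cardiotoxicity after breast cancer treatment",
    annotations,
)
print(json.dumps([doc], indent=2))
print(select_params(["C0003234", "C0678222"]))
```

Searching on concept identifiers rather than surface words is what lets a document match even when it contains none of the literal query terms, which is the behavior the definition on slide 2 describes.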
