Literature Mining Effectiveness in Today’s Economy


A first-of-its-kind scientific literature alert service that provides the latest manually annotated relationships for your favorite Proteins, Diseases, Drugs and Biological Processes every week. XTractor also lets you search, classify and share your data, and it is absolutely free. It is a great way to stay up to date with the latest findings in your research field.

Published in: Technology

  1. Literature Mining Effectiveness in Today’s Economy
  2. Challenges in Literature Mining
     Publications in PubMed have increased exponentially in the last few years, leading to:
     • An increase in the time required for data extraction
     • Low-precision extraction of relevant facts
     • Difficulty in analyzing the extracted facts
     [Figure: PubMed publication rate, 2008]
  3. More Time-Consuming = More Cost
     To annotate 30 days of data on breast neoplasm = 1 man-day. So to annotate 50,000 abstracts = 500,000 min, or 1,041 man-days, or ~3 man-years, to annotate 1 month of literature findings.
     [Diagram: Increase in Publications, Time, Cost]

     | Keywords searched across PubMed | Dates of addition in PubMed | Number of abstracts | Approx. time taken for manual annotation & extraction (10 min per abstract) |
     |---|---|---|---|
     | Breast Neoplasm | Last 90 days | 621 | 6,210 min, or 103 hrs, or 11 working days |
     | Breast Neoplasm | Last 60 days | 271 | 2,710 min, or 45 hrs, or 5 working days |
     | Breast Neoplasm | Last 30 days | 56 | 560 min, or 9 hrs, or 1 working day |

     It takes at least one working day to extract all the possible relations from 56 abstracts.
     Analysis conducted on 28 Apr 08 on PubMed.
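The arithmetic on this slide can be reproduced with a short sketch. The 10 minutes per abstract is the slide's own figure; the 8-hour working day is an assumption, chosen because it reproduces the 1,041 man-days quoted for 50,000 abstracts. The rounded working-day counts in the table come out slightly lower than an 8-hour day gives, so the table may assume a longer day. The function name `annotation_effort` is made up for illustration.

```python
MINUTES_PER_ABSTRACT = 10  # figure stated on the slide


def annotation_effort(abstracts, hours_per_day=8.0,
                      minutes_per_abstract=MINUTES_PER_ABSTRACT):
    """Return (minutes, hours, working_days) for manual annotation.

    hours_per_day is an assumption; the slide does not state it.
    """
    minutes = abstracts * minutes_per_abstract
    hours = minutes / 60
    working_days = hours / hours_per_day
    return minutes, hours, working_days


if __name__ == "__main__":
    # Abstract counts taken from the slide (PubMed, 28 Apr 08)
    cases = [
        ("Last 30 days", 56),
        ("Last 60 days", 271),
        ("Last 90 days", 621),
        ("One month of literature findings", 50_000),
    ]
    for label, count in cases:
        minutes, hours, days = annotation_effort(count)
        print(f"{label}: {count} abstracts -> {minutes} min, "
              f"{hours:.1f} hrs, {days:.1f} working days")
```

Changing `hours_per_day` shows how sensitive the man-day estimate is to that one assumption, while the minute totals stay fixed.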
  4. Low Precision of NLP vs. Manual
     Analysis of “standard NLP” versus manual curation efforts revealed that 12-35% false-positive picks were found with NLP in comparison to our manual approach.
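The slide does not say how the 12-35% figure was computed. One common reading, sketched here as an assumption, is the share of NLP picks that do not appear in the manually curated reference set (that is, 1 minus precision). The function name and the toy entity pairs below are invented for illustration and are not XTractor data.

```python
def false_positive_share(nlp_picks, manual_picks):
    """Share of NLP picks that are absent from the manual reference set."""
    nlp = set(nlp_picks)
    if not nlp:
        return 0.0
    false_positives = nlp - set(manual_picks)
    return len(false_positives) / len(nlp)


# Toy (entity, concept) pairs, invented for illustration only
manual = {
    ("TP53", "breast neoplasm"),
    ("ERK", "apoptosis"),
    ("VHL", "renal carcinoma"),
}
nlp = {
    ("TP53", "breast neoplasm"),
    ("ERK", "apoptosis"),
    ("MICE", "breast neoplasm"),  # common English word tagged as a protein
    ("RAW", "apoptosis"),         # common English word tagged as a protein
}

share = false_positive_share(nlp, manual)
print(f"false-positive share: {share:.0%}, precision: {1 - share:.0%}")
```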
  5. Low Precision of NLP vs. Manual (contd.)
     NLP errors included instances such as:
     • Common English term mismatches: MICE, PEG, DAMAGE, RAW, which overlap with protein names
     • Common isoform mismatches: p16-INK4 to p14ARF and cd11c to cd11d
     • Common protein mismatches: S1P (sphingosine-1-phosphate) matched to sphingosine-1-phosphate receptor, and ERK to ephrin type-B receptor 2
     • Protein-disease mismatches: VHL protein mismatched to von Hippel-Lindau disease, and progressive multifocal leukoencephalopathy mismatched to PML protein
     • Protein-process mismatches: cell growth tagged to growth factor
     • Protein-drug mismatches: rapamycin tagged to protein Mammalian target of rapamycin
     An ideal solution should ensure high precision on annotations for Proteins, Diseases, Drugs and Biological Processes.
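The first mismatch class above can be illustrated with a minimal sketch of dictionary-based tagging. This is not the "standard NLP" pipeline evaluated on the slides, whose details are not given; the symbol list, function names and example sentence are hypothetical. It shows how a case-insensitive dictionary lookup tags ordinary English words that happen to overlap with protein names.

```python
import re

# Symbols the slide lists as overlapping with common English words,
# plus ERK; this tiny lexicon is illustrative, not a real dictionary.
PROTEIN_SYMBOLS = {"MICE", "PEG", "DAMAGE", "RAW", "ERK"}


def tokenize(text):
    """Split text into simple alphanumeric tokens."""
    return re.findall(r"[A-Za-z0-9-]+", text)


def naive_tag(text):
    """Case-insensitive lookup: tags any token that matches a symbol."""
    return [tok for tok in tokenize(text) if tok.upper() in PROTEIN_SYMBOLS]


def case_sensitive_tag(text):
    """Stricter lookup: the token must match the symbol's exact casing."""
    return [tok for tok in tokenize(text) if tok in PROTEIN_SYMBOLS]


sentence = ("Mice showed DNA damage in the raw data, "
            "while ERK activity was unchanged.")
print(naive_tag(sentence))           # includes ordinary English words
print(case_sensitive_tag(sentence))  # keeps only the upper-case symbol
```

A casing rule only addresses the English-word overlaps. It does not decide which entity a symbol such as ERK or VHL should map to, which is the kind of ambiguity the remaining mismatch classes describe.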
  6. Our Approach to Literature Mining
     • Manual annotation
     • Manual categorization into multiple categories: biomarkers, clinical trials, knockout studies, toxicity, disease mechanisms, pathways, etc.
     The major bottleneck with manual curation, as demonstrated, is the considerable time and cost involved. In XTractor, we have reduced the time involved in the manual annotation effort by significantly cutting down the process steps, which boosts our internal productivity and turnaround time. As a result, we are able to serve you the latest manually annotated scientific facts in almost real time.
     Standard ontologies used:
     • Swiss-Prot for Proteins
     • PubChem for Drugs
     • MeSH for Diseases
     • Gene Ontology (GO) for Biological Processes
     100% expert annotated
  7. Solution = XTractor Premium
     • A platform for discovery, knowledge sharing, analysis and modeling of published biomedical facts
     • The application also comes with the XTractor Knowledgebase, the world’s fastest growing biomedical knowledgebase of "manually" annotated scientific facts
     XTractor Premium:
     • Share your findings
     • Export results
     • Track your research entities
     • Know the competition
     • Hypothesize your findings
     • Ontology-based searching
     • Generate reports
     • Discover newer relationships
  8. XTractor Knowledgebase
     • Biological relationships in the Knowledgebase enable researchers to gain rapid insight into their experimental data, answer complex biological questions and gain deeper insight into their findings
     • Now contains more than 410,000 facts
     • Covers unique Drugs, Diseases, Proteins and Biological Processes
     • 50% of Swiss-Prot proteins, 60% of MeSH diseases, 25% of DrugBank drugs and 24% of GO biological processes
  9. XTractor Knowledgebase Key Features
     • Accuracy: Manually annotated content
     • Semantic Consistency: Standard ontologies followed, including MeSH, GO, Swiss-Prot and PubChem, with protein isoform-based mapping
     • Comprehensiveness: Covers a large percentage of all the major protein, disease and drug databases
     • Up-to-Date & Current: Updated on a weekly basis with the latest information
  10. Solution for Discovery
     Complete solutions for your Drug Discovery data needs
     Areas: Target Validation, Target Discovery, Toxicity, Clinical Trials, Drug Studies
     Content: Disease markers, Target Discovery, Biomarkers, Drug effects, Clinical trials, RNAi studies, Knockout studies, Mutations, Pathways, Disease mechanisms, Biological Process, Target information, Prognosis
  11. XTractor Premium: Search & Analytics
     More insight into scientific data with XTractor search features:
     • Semantic Search
     • Bibliographic Search
     • Summary Search
     • Concept Linking
     • WatchList
  12. What does XTractor answer?
     • Knockout/RNAi studies pertaining to rheumatoid arthritis
     • Drug-toxicity studies in Alzheimer’s patients
     • PK & PD studies of the drug tamoxifen
     • Pathways involving apoptosis and breast cancer
     • Disease prognosis and diagnosis for diabetes type 2
     • Marker/biomarker studies in colon cancer
     • Drugs against colon cancer
     • Route-of-administration studies for insulin
     • Dose-related and clearance studies for doxorubicin
     • Cisplatin clinical trials
     • Major disease classes that are associated with PDGFR
     All this and much more.
  13. XTractor Premium
     The one-stop solution for all your discovery needs
     For free trial access, contact: [email_address]
     Click here to register for a webinar