TERM Identification Tagging and
Extraction
TERMite is a semantic indexing engine that
manages the ambiguity in naming of terms in
scientific text. Analysing raw data at speeds of
up to 1 million words a second, it converts
free-text documents into structured data,
enabling new discovery. With TERMite, your
internal databases, reports and document
management systems become part of a wider
big data ecosystem facilitating business
intelligence, hypothesis generation and
identification of hidden trends and
relationships.
High-Quality Vocabularies
High-performance biomedical text analytics
requires extensive ontologies covering all of
the synonyms and different forms of names for
the same entity. Many existing solutions are
supplied with poor-quality ontologies, taken
directly from public resources with minimal
additional development.
SciBite is different; we believe semantic text
analytics requires an exceptional foundation.
Supporting the TERMite engine is a collection
of more than 80 Vocabularies spanning the Life
Sciences sector. These Vocabularies are
enriched through a unique combination of
automated analysis and expert manual curation
and contain over 20 million synonyms.
Many of our vocabularies are unique to SciBite.
Others originate from public domain sources
but have been enriched many times over. For example, our
human phenotype vocabulary contains over 1.5
million phenotype terms, compared to about
40,000 available in the public domain.
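
As a rough illustration of why synonym coverage matters, the sketch below maps several surface forms of a name to a single entity identifier. It is a toy dictionary lookup in Python, not TERMite's implementation: the vocabulary entries, the identifiers such as `GENE:SHH` and the function name `tag_terms` are all hypothetical.

```python
import re

# Hypothetical toy vocabulary: entity ID -> known synonyms (illustrative only).
VOCAB = {
    "GENE:SHH": ["sonic hedgehog", "SHH"],
    "DRUG:ASPIRIN": ["aspirin", "acetylsalicylic acid"],
}

# Invert the vocabulary so each lower-cased synonym points at its entity ID.
SYNONYM_TO_ID = {
    synonym.lower(): entity_id
    for entity_id, synonyms in VOCAB.items()
    for synonym in synonyms
}

# One alternation pattern, longest synonyms first so multi-word names win.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(s) for s in sorted(SYNONYM_TO_ID, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def tag_terms(text):
    """Return (matched text, entity ID) pairs in order of appearance."""
    return [
        (match.group(0), SYNONYM_TO_ID[match.group(0).lower()])
        for match in _PATTERN.finditer(text)
    ]
```

With a vocabulary like this, "aspirin" and "acetylsalicylic acid" resolve to the same identifier, so a search for either form can find documents that mention the other.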
Enhancing Semantic Search and Discovery
DATASHEET
www.scibite.com @SciBite info@scibite.com
Scientifically Aware System
While the speed and coverage of TERMite
bring value to any organisation, it has
additional capabilities to provide a more
scientifically aware entity extraction solution.
• Ambiguity Detection: Knowing when “GSK”
means “GlaxoSmithKline” and not
“Glycogen Synthase Kinase”, when “Pacific”
means the biotechnology company and not
the ocean, and when “hedgehog” means the
protein, not the spiky animal
• Relevance Detection: Distinguishing
between terms that are “throwaway
mentions” and those that really matter to the
context of the document
• Pattern Detection: Identifying
patterns such as genes causing disease,
toxicities of drugs, associations of phenotypes
with pathways and many more, where the
entities are grouped by type, e.g. Protein or Indication
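
To make the idea of ambiguity detection concrete, here is a minimal sketch of context-based disambiguation in Python. It is an assumption-laden toy, not a description of how TERMite works: the cue-word lists, the sense labels and the function name `disambiguate` are hypothetical, and a simple word-overlap score stands in for whatever method the engine actually uses.

```python
import re

# Hypothetical context cues for each sense of an ambiguous term (illustrative only).
SENSES = {
    "GSK": {
        "COMPANY:GlaxoSmithKline": {"company", "pharmaceutical", "acquired", "shares"},
        "PROTEIN:Glycogen Synthase Kinase": {
            "phosphorylation", "kinase", "inhibitor", "signalling",
        },
    },
}


def disambiguate(term, sentence):
    """Pick the sense whose cue words overlap most with the sentence.

    Returns None when the term is unknown or no cue word is present,
    rather than guessing.
    """
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    best_sense, best_score = None, 0
    for sense, cues in SENSES.get(term, {}).items():
        score = len(cues & words)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense
```

Returning `None` when no cue is found reflects one possible design choice: an unresolved mention is flagged as ambiguous rather than silently assigned to a sense.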
LIVE and Simple to Deploy
Developed in Java, TERMite is a simple API
which can either be run in the end-user
interface or embedded into other applications,
opening up semantic text analytics to a much
wider audience. Setup is simple and can take
just a few minutes.
Use-Cases
Existing customers are using TERMite to:
• Data-mine the entire MEDLINE database
for gene-phenotype-disease correlations
• Analyse grants to discover new trends
• Scan internal documents to find
hidden target-drug-indication relationships
• Investigate disease genetics, biomarker
discovery, drug repurposing, drug toxicity,
competitor intelligence and much more
About SciBite
SciBite provides a flexible environment for
semantic text analytics and data intelligence
for Biopharma, Biotech & beyond through a
collection of applications, platforms and web
services. Built on an entity identification and
extraction engine, SciBite’s capabilities can
unlock the value often missed in raw text.
From instant annotation of simple documents
through to the indexing of enterprise search
systems, contact us now to find out how we
can help you get more from your data.
Enriched Vocabularies Powering TERMite

TERMite DataSheet 2016
