17. In order to support
Literature Based
Discovery
Ontologies
Communities
Annotation
Machine-readable
documents
In a nutshell….
…documents as
interfaces to the Web
of Data….
Biotea
• Machine-readable and
processable documents
• Interactive documents
• Enriched metadata
• Full content management,
document centric
• Social hub
Citagora
-Aggregated search
-Single entry point
-Social hub
-Citation centric
20. RDF4PMC, some results
Makes possible
How similar are two articles?
based on authors, keywords,
abstracts, ontological terms
What articles use this reference in
a section with title “Results”?
Annotations
Metadata +
Content +
References
Annotations
Makes possible
• How similar are two articles?
based on semantic distance
• Which annotation co-occurs
most often with this “YYY”
annotation?
• Which articles include “TERM”
but not this other “TERM”?
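The three annotation questions above can be sketched over plain sets of annotation terms. This is a minimal illustration, not the Biotea implementation: the article ids and terms are placeholders, and Jaccard overlap is used here as a simple stand-in for the semantic distance mentioned on the slide.

```python
from collections import Counter

# Toy annotation sets keyed by hypothetical article ids; the terms are placeholders.
ANNOTATIONS = {
    "article-A": {"microarray", "variance", "gene expression"},
    "article-B": {"microarray", "gene expression", "protein"},
    "article-C": {"protein", "variance"},
}


def jaccard(a, b):
    """Set-overlap similarity in [0, 1]; a stand-in for a real semantic distance."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def co_occurring(term, annotations):
    """Count how often each other term appears in articles annotated with `term`."""
    counts = Counter()
    for terms in annotations.values():
        if term in terms:
            counts.update(terms - {term})
    return counts


def with_but_without(include, exclude, annotations):
    """Ids of articles annotated with `include` but not with `exclude`."""
    return sorted(
        article
        for article, terms in annotations.items()
        if include in terms and exclude not in terms
    )
```

A real semantic distance would also use the ontology hierarchy (for example, shared ancestors of two terms), which set overlap ignores.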
Some numbers, article PMC126253
“Computational method for reducing
variance with Affymetrix
microarrays”
• NCBO
• Annotations: 407
• Topics: 633
• Whatizit
• Annotations: 14
• Topics: 203
Delivering: the platform that makes it possible to build interactive environments for semantic publications
This is an overview of the ontologies, APIs, and web services used in our rdfication process. We start from the XML that PMC provides for its open access articles. We use BIBO, Dublin Core, FOAF, and DOCO to model the paper. We used the RDFReactor tool to generate the Java classes corresponding to BIBO and DOCO; this was not necessary for Dublin Core and FOAF because we do not use them as extensively as BIBO and DOCO. For the references in particular, when the information provided in the XML is incomplete or difficult to process, services such as those provided by NCBI, Mendeley, crossref.org, and the DataCite Metadata Schema Repository, among others, can be used to improve the reference metadata. That enrichment is part of our future work. We also add links to the PubMed space in Bio2RDF and identifiers.org.
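As a rough sketch of what the modelling step produces, the snippet below emits N-Triples for one article using the BIBO, Dublin Core, and DOCO vocabularies. It is not the RDFReactor-based Java pipeline described above. The base URI `http://example.org/pmc/`, the section URI pattern, and the use of `dcterms:hasPart` to attach sections are assumptions made for illustration only.

```python
# Vocabulary namespaces (real); everything under example.org is hypothetical.
BIBO = "http://purl.org/ontology/bibo/"
DCTERMS = "http://purl.org/dc/terms/"
DOCO = "http://purl.org/spar/doco/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def uri(value):
    """Wrap an IRI in N-Triples angle brackets."""
    return f"<{value}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def article_triples(base, pmc_id, title, section_titles):
    """Build (subject, predicate, object) triples for an article and its sections."""
    article = uri(f"{base}{pmc_id}")
    triples = [
        (article, uri(RDF_TYPE), uri(BIBO + "AcademicArticle")),
        (article, uri(DCTERMS + "title"), literal(title)),
    ]
    for index, section_title in enumerate(section_titles, start=1):
        section = uri(f"{base}{pmc_id}/section/{index}")  # assumed URI pattern
        triples += [
            (section, uri(RDF_TYPE), uri(DOCO + "Section")),
            (section, uri(DCTERMS + "title"), literal(section_title)),
            (article, uri(DCTERMS + "hasPart"), section),  # assumed linking property
        ]
    return triples


def to_ntriples(triples):
    """Serialize triples, one N-Triples statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


example = article_triples(
    "http://example.org/pmc/",
    "PMC126253",
    "Computational method for reducing variance with Affymetrix microarrays",
    ["Results"],
)
```

Calling `to_ntriples(example)` yields five statements: two for the article and three for its single section.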
A similar process could be applied to content from other publishers.
Now, let’s see an example that consumes the RDF we have generated so far. In this prototype we focus on human genes, particularly those covered by GeneWiki. We first retrieve the UniProt accession corresponding to the gene name, then look for publications annotated with this accession. This is a first prototype, so we do not yet cover synonyms or related terms; that is part of our future work.
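The two-step lookup (gene name to UniProt accession, then accession to annotated publications) can be sketched as follows. The in-memory dictionaries are assumptions standing in for the services and the RDF store the prototype actually queries, and the article ids are placeholders. As in the prototype, synonyms are not resolved.

```python
# Toy lookup tables. P38398 and P04637 are the UniProt accessions for human
# BRCA1 and TP53; the article ids are placeholders, not real annotations.
GENE_TO_UNIPROT = {"BRCA1": "P38398", "TP53": "P04637"}
ARTICLE_ANNOTATIONS = {
    "article-1": {"P38398"},
    "article-2": {"P04637"},
    "article-3": {"P38398", "P04637"},
}


def publications_for_gene(gene_name, gene_to_uniprot, article_annotations):
    """Return ids of articles annotated with the accession of `gene_name`."""
    accession = gene_to_uniprot.get(gene_name)
    if accession is None:
        return []  # unknown gene name; synonyms are not handled in this sketch
    return sorted(
        article
        for article, accessions in article_annotations.items()
        if accession in accessions
    )
```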
The publication is displayed in an enriched environment whose functionality depends on the type of biological entity the user selects. So far we have included genes, proteins, and chemical entities.