A presentation on Elsevier's perspective on linked data in STM publishing, given 2011-12-06 at the W3C Linked Enterprise Data Patterns Workshop in Cambridge, MA.
Linked Data Standards and Infrastructure for Scientific Publishing (W3C LEDP 2011 Workshop)
1. Linked Data Standards and Infrastructure for Scientific Publishing
Bradley P. Allen
Elsevier Labs
W3C Workshop on Linked Enterprise Data Patterns
6 December 2011
2. The role of linked data in STM publishing
Smart Content Delivery
Better discovery through semantic search & navigation
• Faceted search & browse
• Ontology-driven navigation
• Task-specific results
• Personalized/localized results
• Question answering
Better understanding through analysis and visualization
• Tag clouds
• Heatmaps
• Streamgraphs
• Scatterplots
• Time series
• Animations
New knowledge through aggregation and synthesis
• Topic pages
• Social network maps
• Geolocation maps
• Data mashups
• Text mining reports
Diagram labels:
• Linked data from partners and the Web
• Text
• Tables
• Images
• Entities, scholarly concepts and content relationships
• Scholarly knowledge organization systems
3. Scientific publications as linked data
Diagram labels:
• Linked data: entity record, document, media object
• Acquire; Transform, Enhance, Index, Analyze, Compose; Deliver
4. Elsevier’s approach
• Embrace linked data principles while leveraging our existing content production workflow and infrastructure
– Find the right balance between production/QA and online delivery
• Leverage partners for content enhancement and knowledge organization
– Reuse Web-standard vocabularies, taxonomies, ontologies and entity resources where possible
• Build out linked data design patterns for application development
• Deliver benefits across the complementary use cases of researcher and practitioner
5. Elsevier work to date
• Standards
– RDF named graphs conformant with use-specific XML schemas for production/QA
– Taxonomies in SKOS
• Infrastructure
– Linked Data Repository with CRUD API, Atom feeds for online delivery services
– Virtual Total Warehouse for content repository federation
• Applications
– Semantic search for medical researchers and practitioners
– Lancet, SciVerse app mashups
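The two standards items above (RDF named graphs and taxonomies in SKOS) can be illustrated with a minimal sketch. It is not Elsevier's implementation: the `example.org` namespace, the graph name, and the concepts are all hypothetical, and quads are modelled as plain Python tuples rather than with an RDF library. Only the SKOS namespace and the `skos:broader` property come from the SKOS standard.

```python
from collections import defaultdict

SKOS = "http://www.w3.org/2004/02/skos/core#"   # real SKOS namespace
EX = "http://example.org/taxonomy/"             # hypothetical concept namespace
GRAPH = "http://example.org/graph/taxonomy-v1"  # hypothetical named graph

# Each quad is (subject, predicate, object, graph name).
QUADS = [
    (EX + "cardiology", SKOS + "broader", EX + "medicine", GRAPH),
    (EX + "electrophysiology", SKOS + "broader", EX + "cardiology", GRAPH),
    (EX + "oncology", SKOS + "broader", EX + "medicine", GRAPH),
]


def broader_closure(concept, quads, graph):
    """Return all concepts reachable from `concept` via skos:broader in one named graph."""
    parents = defaultdict(set)
    for s, p, o, g in quads:
        if g == graph and p == SKOS + "broader":
            parents[s].add(o)
    seen, stack = set(), [concept]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

Scoping the lookup to a single graph name is the point of the sketch: the same concept can carry different statements in different named graphs, for example a production graph and a QA graph.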
6. LEDP2011: what we want to discuss
• Easing technology adoption by enterprise IT staff
• Best practices for knowledge organization systems management
• Infrastructure for scholarly linked data publishing
7. Easing technology adoption
• Tools and best practices for URL and namespace management and governance
• Best practices for publishing and consuming linked data that address IT concerns rather than legacy RDF issues
– 2006 vs. later versions of “Four Principles”
– Serialization “impedance mismatch”
– RDF APIs vs. SPARQL
– httpRange-14
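The last item refers to the W3C TAG's httpRange-14 issue: what a server should return for a URI that names a real-world thing rather than a document. One widely used answer is the 303 redirect pattern, sketched below. The `/id/` and `/doc/` path layout is an assumption chosen for illustration, not something the slides specify, and the function only models the response; it is not a working server.

```python
def respond(path, accept="text/html"):
    """Return (status, headers) for a request under the 303 redirect pattern.

    Assumed (hypothetical) URI layout:
      /id/<name>   names a thing (a person, a concept)
      /doc/<name>  names a document describing that thing
    """
    if path.startswith("/id/"):
        # A thing is not a document, so answer 303 See Other instead of 200.
        return 303, {"Location": "/doc/" + path[len("/id/"):]}
    if path.startswith("/doc/"):
        # A document can be served directly; pick a representation from Accept.
        media = "text/turtle" if "text/turtle" in accept else "text/html"
        return 200, {"Content-Type": media, "Vary": "Accept"}
    return 404, {}
```

A request for `/id/author-1` is redirected to `/doc/author-1`, which then returns either HTML or Turtle depending on the `Accept` header.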
8. Best practices for knowledge organization
• Tools and best practices for global/local knowledge organization systems management
• Standards for named entities and registries crucial to accreditation, provenance and trust
– e.g. author identifiers and profiles in ORCID
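As a small illustration of why standardized identifiers help with trust, an ORCID iD carries a check character computed with ISO 7064 MOD 11-2, so a consumer can reject mistyped identifiers before looking them up. The sketch below assumes the 16-character form with optional hyphens; the function names are hypothetical, and it checks only the checksum, not whether the iD is actually registered.

```python
def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if the iD has 16 characters and a matching check character."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15]
```

For example, `is_valid_orcid("0000-0002-1825-0097")` returns `True`, while changing the final character to `8` makes it return `False`.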
9. Infrastructure for scholarly linked data publishing
• Validators for linked data
• Standards supporting scholarly publishing workflows
– Named graphs
– Versioning
– Access & entitlement
• Standards and best practices for annotation of scholarly content
– e.g. CITO, SWAN, SIOC, AO, OAC
• Support for free text search
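To make the first item concrete, here is a deliberately small sketch of what a linked data validator might check. The rule it applies (subjects and predicates must be absolute HTTP(S) IRIs) is an assumption chosen for illustration; a real validator would cover far more, and the function names are hypothetical. Objects are not checked because they may be literals.

```python
from urllib.parse import urlparse


def is_absolute_http_iri(value):
    """Return True if `value` has an http/https scheme and a host."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def validate(triples):
    """Return a list of (index, message) problems; an empty list means all checks passed."""
    problems = []
    for i, (s, p, o) in enumerate(triples):
        if not is_absolute_http_iri(s):
            problems.append((i, "subject is not an absolute HTTP IRI"))
        if not is_absolute_http_iri(p):
            problems.append((i, "predicate is not an absolute HTTP IRI"))
    return problems
```

Returning a list of problems rather than raising on the first one suits a publishing workflow, where a QA step usually wants the full report for a batch.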