Linked Data in Scholarly Communication

B
Bernhard HaslhoferResearcher at University of Vienna
Linked Data in Scholarly
Communication
AAHEP5 Information Provider Summit, Sept. 22nd 2011


Bernhard Haslhofer | Cornell University, Information Science
Overview


                  •      Linked Data Vision and Goals
                  •      Enabling Technologies (by example)
                  •      Publishing and Consuming Linked Data
                  •      The SciLink Project



AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Linked Data in Scholarly Communication
Linked Data in Scholarly Communication
Web Architecture



                                                   HTML                     HTML                HTML


                                                               hyperlinks          hyperlinks




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Web Architecture



                                       HTML                  HTML    HTML   HTML   HTML




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Web Architecture
              •      A set of simple standards
                        •      Uniform document encoding (HTML)
                        •      Uniform global (!) addressing (URI)
                        •      Uniform transportation (HTTP)
              •      Hyperlinks connecting documents
              •      Works pretty well for accessing and
                     exchanging documents

AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
But sometimes we need to access to the
                     underlying (meta)data.




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Linked Data in Scholarly Communication
Linked Data in Scholarly Communication
Web Services and Web APIs




                                      API                  API       API   API   API




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Web Services and Web APIs


                  •      Each Web API has a proprietary interface
                  •      Datasources must be known in advance
                  •      Information entities (papers, authors,
                         subjects, etc.) are not linked



AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Linked Data Vision

                  •      Create a single global data space based on
                         the Architecture of the World Wide Web




                                                 RDF                   RDF               RDF


                                                           RDF links         RDF links




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Linked Data Principles

                  •      Use URIs as names for things
                  •      Use HTTP URIs so that people can look up
                         those names
                  •      When someone looks up a URI, provide useful
                         information, using standards (RDF, SPARQL)
                  •      Include links to other URIs, so that they can
                         discover more things


AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Web of Linked Data
              •      A set of simple standards
                        •      Uniform data model (RDF)
                        •      Uniform global (!) addressing (URI)
                        •      Uniform transportation (HTTP)
              •      RDF links connecting entities
              •      Forms a global dataspace and facilitates
                     accessing and exchanging data

AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Web of Linked Data

                  •      Data publishers
                        •      assign URIs to their information entities
                        •      publish RDF instance data about these entities
                        •      publish schemas / vocabularies on the Web
                        •      link their entities with other entities on the
                               Data Web
                        •      can provide query access to their data

AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Linking Open Data Project
                  •      Open Data: a philosophy, practice, or policy that
                         data are freely available to everyone without
                         restrictions from copyright, patents, a.s.o
                  •      Linked Data: method / best practices for exposing,
                         sharing, and connecting data using URI and RDF
                  •      Linking Open Data: a W3C community project
                         with the goal to extend the Web with a data
                         commons by publishing various open data sets as
                         RDF on the Web and by setting links between data
                         items from different sources

AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Linked Data in Scholarly Communication
Linked Data in Scholarly Communication
Linked Data in the Industry




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Linked Data in Libraries

                  •      Strong uptake in recent years
                        •      Library of Congress (vocabularies)
                        •      German / Swedish / Hungarian National
                               Library (authorities and/or catalogue data)
                        •      Europeana (metadata about ~4M items)
                  •      W3C Library Linked Data Incubator Group


AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Overview


                  •      Linked Data Vision and Goals
                  •      Enabling Technologies (by example)
                  •      Publishing and Consuming Linked Data
                  •      The SciLink Project



AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Uniform Resource Identifiers (URIs)

                  •      Name and identify things (resources)
                  •      Dereferencable HTTP URIs

                                                                     http://springerlink.com/
                                       http://arxiv.org/authors/         author/xyzafbea
                                             Haslhofer_B




                                                                                         http://eprints.cs.univie.ac.at/
                                                                                                creator/xyzafbea
                                   http://arxiv.org/abs/
                                        1106.5178



                                                                        http://eprints.cs.univie.ac.at/
                                                                                     2871




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Resource Description Framework (RDF)

                  •      A data model for representing data on the Web
                  •      Sets of statements (triples) form a graph
                                       http://xmlns.com/foaf/
                                        spec/#term_Person
                                                                       rdf:type

                                                                                  http://arxiv.org/authors/
                                                                                        Haslhofer_B                                Bernhard Haslhofer

                               http://purl.org/ontology/
                                      bibo/Article
                                                                   dcterms:creator
                                                                                                                      foaf:name


                                                                http://arxiv.org/abs/               dcterms:created
                                     rdf:type                        1106.5178

                                                                                                                          2011-06-25T22:55:17
                                                                              dc:title



                                                                                          The Open Annotation
                                                                                        Collaboration (OAC) Model

AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
RDF Serialization (RDF/XML, Turtle, N-
              Triples, RDF/JSON)

                  •      Data formats for serializing RDF graphs
                  •      Used to transfer RDF data between apps


                <http://arxiv.org/abs/1106.5178>
                  dc:title “The Open Annotation Collaboration Model” .

                ...




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
RDF Vocabulary Description Language (RDFS)


                  •      A language for describing the syntax and
                         semantics of vocabularies in a machine
                         understandable way

                                              http://xmlns.com/foaf/   rdf:type          rdfs:Class
                                               spec/#term_Agent



                                                 rdfs:subclassOf       rdf:type           rdf:type



                                              http://xmlns.com/foaf/              http://purl.org/ontology/
                                               spec/#term_Person                         bibo/Article




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
OWL Web Ontology Language
                  •      A more expressive (formal) language for
                         defining the syntax and semantics of
                         vocabularies


                                    owl:DatatypeProperty                owl:ObjectProperty         owl:FunctionalProperty




                                            rdf:type                          rdf:type                     rdf:type



                                   http://xmlns.com/foaf/0.1/        http://xmlns.com/foaf/0.1/   http://xmlns.com/foaf/0.1/
                                              name                           homepage                       gender




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
OWL Web Ontology Language

                  •      Defines properties for linking resources
                                                                            owl:sameAs
                                         http://arxiv.org/authors/                                      http://springerlink.com/
                                               Haslhofer_B                                                  author/xyzafbea
                                                                             owl:sameAs



                                                                           owl:sameAs


                                     http://arxiv.org/abs/
                                          1106.5178
                                                                                                      http://eprints.cs.univie.ac.at/
                                                                                                             creator/xyzafbea

                                                                     owl:sameAs




                                                                                          http://eprints.cs.univie.ac.at/
                                                                                                       2871




NOTE: owl:sameAs is NOT the only linking property.
AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Simple Knowledge Organization System (SKOS)

                  •      A language for describing controlled
                         vocabularies (taxonomies, thesauri, etc)

                                                                                 skos:closeMatch             http://springerlink.com/
                                            http://arxiv.org/subjects/                                      vocab/computer-science
                                                   skos.rdf#cs


                                                                                                                       skos:broadMatch
                                                                                 skos:broadMatch
                                               skos:broader

                                                              http://arxiv.org/subjects/                    http://eprints.cs.univie.ac.at/
                                                                   skos.rdf#cs.DL                               subjects/multimedia




                                          dcterms:subject                                                          dcterms:subject


                                          http://arxiv.org/abs/                                    http://eprints.cs.univie.ac.at/
                                               1106.5178                                                        2871




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
SPARQL
                  •      A query language and protocol for
                         accessing RDF data on the Web

                SELECT DISTINCT ?x

                WHERE { ?x skos:subject <http://arxiv.org/subjects/skos.rdf#cs> }

                LIMIT 10




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Overview


                  •      Linked Data Vision and Goals
                  •      Enabling Technologies (by example)
                  •      Publishing and Consuming Linked Data
                  •      The SciLink Project



AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Information vs. Non-Information Resource

                  •      Information Resource

                        •      web pages, images, pdf documents, etc.

                        •      all their essential characteristics can be conveyed in a message

                        •      e.g., http://arxiv.org/pdf/1106.5178v1

                  •      Non-Information Resource

                        •      other things such as people, this meeting, concepts

                        •      their essence is not information

                        •      e.g., http://arxiv.org/authors/Haslhofer_B



AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Publishing Data
                  •      Distinguish between non-information and
                         information resource
                  •      Sample NIR
                        •      http://arxiv.org/resource/1106.5178
                  •      Sample IRs
                        •      http://arxiv.org/abs/1106.5178 (HTML)
                        •      http://arxiv.org/data/1106.5178 (RDF)

AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Publishing Data
                                                 GET http://arxiv.org/resource/1106.5187
                                                 Accept: application/rdf+xml



                                                 303 See Other
                                                 Location: http://arxiv.org/data/1106.5187



                                                 GET http://arxiv.org/data/1106.5187
                                                 Accept: application/rdf+xml



                                                 200 OK
                                                 ...
                                                 <?xml version="1.0" encoding="utf-8"?>
                                                 <rdf:RDF ...




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Publishing Data
                                                 GET http://arxiv.org/resource/1106.5187
                                                 Accept: application/rdf+xml



                                                 303 See Other
                                                 Location: http://arxiv.org/data/1106.5187

                                                          Oh dear!
                                                 GET http://arxiv.org/data/1106.5187
                                                 Accept: application/rdf+xml



                                                 200 OK
                                                 ...
                                                 <?xml version="1.0" encoding="utf-8"?>
                                                 <rdf:RDF ...




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
RDFa / Microformats / Microdata

                  •      Mechanisms for embedding structured data
                         in Web documents
                  •      Define or use a set of attributes to
                         augmented presentation-oriented (HTML)
                         documents with structured data
                  •      User agents can extract data from Web
                         documents


AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
RDFa Example

<div typeof=”bibo:Article” xmlns:bibo=”http://purl.org/ontology/bibo”>
  <div xmlns:dc=”http://purl.org/dc/elements/1.1/”>

         <h1 property=”dc:title”>The Open Annotation Collaboration (OAC) Model</h1>

         ....

  </div>
</div>




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Microformats Example

              <head profile=”http://www.w3.org/2006/03/hcard”>
              ...
              </head>

              <div class=”vcard”>
                <div class=”fn”>Bernhard Haslhofer</div>
              </div>




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
RDFa vs. Microformats
                         Microformats                                             RDFa
      flat namespace                                                  XML namespaces

      support HTML4, XHTML 1.1, and
                                                                     support for XHTML 1.1, HTML 5
      HTML 5

      use latent HTML attributes                                     introduces new metadata attributes

      vocabulary defined by one
                                                                     open to any RDF-based vocabulary
      organization/community

Also see: http://evan.prodromou.name/RDFa_vs_microformats
AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Microdata
                  •      A very young HTML proposition that
                         extends Microformats and addresses its
                         shortcomings
      <div itemscope itemtype="http://schema.org/Article">
        <span itemprop="name">The Open Annotation Collaboration (OAC) Model</span>
        by <span itemprop="author">Bernhard Haslhofer</span>,
        <span itemprop=”author”>Rainer Simon</span>,
        <span itemprop=”author”>Robert Sanderson</span> and
        <span itemprop=”author”>Herbert Van De Sompel</span>
        This article has been tweeted 1 times and contains 0 user comments.
        <meta itemprop="interactionCount" content="UserTweets:1"/>
        <meta itemprop="interactionCount" content="UserComments:0"/>
      </div>




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Linked Data in Scholarly Communication
Web Data Publishing Approaches
                                                                     Pro            Con

                                                         Separates data /    Technical complexity
         Linked Data                                   presentation markup
        Conneg IR / NIR                                                        Communication
                                                          Web Architecture       overhead

                                                                             Mixes data /
             RDFa /                                    Easy to implement
                                                                         presentation markup
          Microformats /
            Microdata                                      Search Engines
                                                                               Becomes messy

AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Consuming Linked Data

        curl -H “Accept: application/rdf+xml” http://arxiv.org/data/1105.5178



        rapper http://arxiv.org/resource/1105.5178



        ... any RDF library




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Overview


                  •      Linked Data Vision and Goals
                  •      Enabling Technologies (by example)
                  •      Publishing and Consuming Linked Data
                  •      The SciLink Project



AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Motivation
                  •      Current research articles

                        •      are mostly PDFs (a kind of BLOB)

                        •      point to supplemental or related information by
                               textual references, DOIs, or embedded hypertext links




                                                                     HTML   HTML   HTML   HTML   HTML




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Motivation

                  •      But a publication is more:
                        •      data sets, conference presentations
                        •      blog / wiki entries, software libraries, etc
                  •      There is related information on the Web
                         (e.g., Wikipedia)


AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Overall Goal
             Publication Document (PDF)                                                    Aggregation for the Document

                                                                                                                                                   foaf:firstname               Jennifer
                                                                     dbpedia:                         dbpedia:
                                                                     Drosphila                       Category:                          dblp:
               Genome-wide analysis of Notch                                                                                          jmummery                                Mummery-
                                                                                                 Signal_transduction
                 signaling in Drosphila by                                                                                                                                     Widmer
                     transgenic RNAi                                              rdfs:seeAlso                                                                foaf:lastname
                                                                                                               skos:subject        dcterms: creator
                      Jennifer Mummery-Widmer, ....
                                                                                                                                                                              Outer context
             Here we report the use of a library of
                                                                                                                                                                               Inner context
             Drosphila strains...we identified 6 new
             genes involved in asymmetric cell                            ore:
             division.                                                 Aggregation                                                                                    Genome-wide
                                                                                                                                              dc:title                 analysis...
                                                                                          rdf:type
             A total of 20,262 transgenic RNAi lines were                                                         ex:agg1
             screened...Ten flies were recorded in an
             online database...                                                          ore: aggregates                                    ore: aggregates


                                                                           ex:                                                                                          ex:
                                                                        figure4.jpg                      ore: aggregates   ore: aggregates                         nature07936.pdf



                                                                                               ex:                                  ex:
                                                                                            movie1.avi                          database.zip
             Figure 4. An interaction network for Notch                                                                                                                 dctype:
             signalling                                                                                                                                                 Dataset
                                                                       Asymmmetric
                                                                                                 dc:title                                          rdf:type
                                                                        cell division
                http://www.nature.com/nature/journal/
                 vaop/ncurrent/pdf/nature07936.pdf




AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Methodology
                  •      Aggregation/Data Extraction: create aggregations from
                         available information (PDFs, metadata)

                  •      Analysis: combine aggregations and learn about
                         established research and publication practices in different
                         areas (focus on linking)

                  •      Tool design and Implementation

                        •      link discovery

                        •      resource synchronization / link maintenance

                        •      demonstrator applications operating on data graph


AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Aggregation / Data Extraction

                  •      scilink-extract library
                        •      takes PDFs and metadata as input
                        •      creates ORE Aggregations / CSV files
                        •      currently implemented for arXiv.org
                        •      https://github.com/behas/scilink-extract


AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Analysis

                  •      How did HTTP linkage evolve in the past?
                  •      Where in their papers do scholars make
                         use of HTTP links?
                  •      What are the linking to?
                  •      Are the links still operational?
                  •      ....


AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
Status

                  •      Extraction for arxiv.org finished (~ 700K ) PDFs
                  •      We need access to further corpora covering
                         different research areas !!!




                                              Work in Progress

AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
1 of 51

Recommended

The Open Annotation Collaboration (OAC) Model by
The Open Annotation Collaboration (OAC) ModelThe Open Annotation Collaboration (OAC) Model
The Open Annotation Collaboration (OAC) ModelBernhard Haslhofer
1K views28 slides
Ontology Web services for Semantic Applications by
Ontology Web services for Semantic ApplicationsOntology Web services for Semantic Applications
Ontology Web services for Semantic ApplicationsTrish Whetzel
729 views33 slides
Ontology, Semantic Web and DBpedia by
Ontology, Semantic Web and DBpediaOntology, Semantic Web and DBpedia
Ontology, Semantic Web and DBpediaRichard Kuo
850 views23 slides
Sw 08-rif by
Sw 08-rifSw 08-rif
Sw 08-rifSTI Innsbruck
99 views61 slides
Linked Data for Libraries: Experiments between Cornell, Harvard and Stanford by
Linked Data for Libraries: Experiments between Cornell, Harvard and StanfordLinked Data for Libraries: Experiments between Cornell, Harvard and Stanford
Linked Data for Libraries: Experiments between Cornell, Harvard and StanfordSimeon Warner
6.5K views63 slides
Embedding Linked Data Invisibly into Web Pages: Strategies and Workflows for ... by
Embedding Linked Data Invisibly into Web Pages: Strategies and Workflows for ...Embedding Linked Data Invisibly into Web Pages: Strategies and Workflows for ...
Embedding Linked Data Invisibly into Web Pages: Strategies and Workflows for ...National Information Standards Organization (NISO)
57.9K views48 slides

More Related Content

What's hot

Visual Ontology Modeling for Domain Experts and Business Users with metaphactory by
Visual Ontology Modeling for Domain Experts and Business Users with metaphactoryVisual Ontology Modeling for Domain Experts and Business Users with metaphactory
Visual Ontology Modeling for Domain Experts and Business Users with metaphactoryPeter Haase
270 views15 slides
Getting Started with Knowledge Graphs by
Getting Started with Knowledge GraphsGetting Started with Knowledge Graphs
Getting Started with Knowledge GraphsPeter Haase
13.9K views74 slides
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr... by
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...National Information Standards Organization (NISO)
53.4K views88 slides
Towards digitizing scholarly communication by
Towards digitizing scholarly communicationTowards digitizing scholarly communication
Towards digitizing scholarly communicationSören Auer
1.9K views47 slides
NISO/DCMI Webinar: Metadata for Managing Scientific Research Data by
NISO/DCMI Webinar: Metadata for Managing Scientific Research DataNISO/DCMI Webinar: Metadata for Managing Scientific Research Data
NISO/DCMI Webinar: Metadata for Managing Scientific Research DataNational Information Standards Organization (NISO)
62.2K views33 slides

What's hot(20)

Visual Ontology Modeling for Domain Experts and Business Users with metaphactory by Peter Haase
Visual Ontology Modeling for Domain Experts and Business Users with metaphactoryVisual Ontology Modeling for Domain Experts and Business Users with metaphactory
Visual Ontology Modeling for Domain Experts and Business Users with metaphactory
Peter Haase270 views
Getting Started with Knowledge Graphs by Peter Haase
Getting Started with Knowledge GraphsGetting Started with Knowledge Graphs
Getting Started with Knowledge Graphs
Peter Haase13.9K views
Towards digitizing scholarly communication by Sören Auer
Towards digitizing scholarly communicationTowards digitizing scholarly communication
Towards digitizing scholarly communication
Sören Auer1.9K views
IFLA 2012 - OCLC Linked Data round table by Figoblog
IFLA 2012 - OCLC Linked Data round tableIFLA 2012 - OCLC Linked Data round table
IFLA 2012 - OCLC Linked Data round table
Figoblog3.9K views
Sw 06-repository and-sparql by STI Innsbruck
Sw 06-repository and-sparqlSw 06-repository and-sparql
Sw 06-repository and-sparql
STI Innsbruck65 views
Infrastructure crossroads... and the way we walked them in DKPro by openminted_eu
Infrastructure crossroads... and the way we walked them in DKProInfrastructure crossroads... and the way we walked them in DKPro
Infrastructure crossroads... and the way we walked them in DKPro
openminted_eu505 views
Ephedra: efficiently combining RDF data and services using SPARQL federation by Peter Haase
Ephedra: efficiently combining RDF data and services using SPARQL federationEphedra: efficiently combining RDF data and services using SPARQL federation
Ephedra: efficiently combining RDF data and services using SPARQL federation
Peter Haase269 views

Similar to Linked Data in Scholarly Communication

LOD/LAM Presentation by
LOD/LAM PresentationLOD/LAM Presentation
LOD/LAM PresentationHafabe
553 views10 slides
ISWC GoodRelations Tutorial Part 2 by
ISWC GoodRelations Tutorial Part 2ISWC GoodRelations Tutorial Part 2
ISWC GoodRelations Tutorial Part 2Martin Hepp
866 views22 slides
GoodRelations Tutorial Part 2 by
GoodRelations Tutorial Part 2GoodRelations Tutorial Part 2
GoodRelations Tutorial Part 2guestecacad2
1K views22 slides
20110728 datalift-rpi-troy by
20110728 datalift-rpi-troy20110728 datalift-rpi-troy
20110728 datalift-rpi-troyFrançois Scharffe
857 views55 slides
Linked Data by
Linked DataLinked Data
Linked DataAnja Jentzsch
654 views52 slides
Linked data demystified:Practical efforts to transform CONTENTDM metadata int... by
Linked data demystified:Practical efforts to transform CONTENTDM metadata int...Linked data demystified:Practical efforts to transform CONTENTDM metadata int...
Linked data demystified:Practical efforts to transform CONTENTDM metadata int...Cory Lampert
1.2K views34 slides

Similar to Linked Data in Scholarly Communication(20)

LOD/LAM Presentation by Hafabe
LOD/LAM PresentationLOD/LAM Presentation
LOD/LAM Presentation
Hafabe553 views
ISWC GoodRelations Tutorial Part 2 by Martin Hepp
ISWC GoodRelations Tutorial Part 2ISWC GoodRelations Tutorial Part 2
ISWC GoodRelations Tutorial Part 2
Martin Hepp866 views
GoodRelations Tutorial Part 2 by guestecacad2
GoodRelations Tutorial Part 2GoodRelations Tutorial Part 2
GoodRelations Tutorial Part 2
guestecacad21K views
Linked data demystified:Practical efforts to transform CONTENTDM metadata int... by Cory Lampert
Linked data demystified:Practical efforts to transform CONTENTDM metadata int...Linked data demystified:Practical efforts to transform CONTENTDM metadata int...
Linked data demystified:Practical efforts to transform CONTENTDM metadata int...
Cory Lampert1.2K views
Open Annotation: Annotating High Energy Physics on the Web by Robert Sanderson
Open Annotation: Annotating High Energy Physics on the WebOpen Annotation: Annotating High Energy Physics on the Web
Open Annotation: Annotating High Energy Physics on the Web
Robert Sanderson751 views
VRA_2015_CatalogingRoundup_Seneff by Heather Seneff
VRA_2015_CatalogingRoundup_SeneffVRA_2015_CatalogingRoundup_Seneff
VRA_2015_CatalogingRoundup_Seneff
Heather Seneff178 views
[Webinar] Semantic Technologies by Nuxeo
[Webinar] Semantic Technologies[Webinar] Semantic Technologies
[Webinar] Semantic Technologies
Nuxeo1.3K views
Of Cataloging & Context by charper
Of Cataloging & ContextOf Cataloging & Context
Of Cataloging & Context
charper1.1K views
Open Science Days 2014 - Becker - Repositories and Linked Data by Pascal-Nicolas Becker
Open Science Days 2014 - Becker - Repositories and Linked DataOpen Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked Data
Building the New Open Linked Library by Joel Richard
Building the New Open Linked LibraryBuilding the New Open Linked Library
Building the New Open Linked Library
Joel Richard2K views
Linked Data as an enabling framework for resource discovery across libraries,... by Andy Powell
Linked Data as an enabling framework for resource discovery across libraries,...Linked Data as an enabling framework for resource discovery across libraries,...
Linked Data as an enabling framework for resource discovery across libraries,...
Andy Powell1.2K views
Publishing and Using Linked Open Data - Day 1 by Richard Urban
Publishing and Using Linked Open Data - Day 1 Publishing and Using Linked Open Data - Day 1
Publishing and Using Linked Open Data - Day 1
Richard Urban3.2K views
Marc and beyond: 3 Linked Data Choices by Richard Wallis
 Marc and beyond: 3 Linked Data Choices  Marc and beyond: 3 Linked Data Choices
Marc and beyond: 3 Linked Data Choices
Richard Wallis1.4K views
WorldCat, Works, and Schema.org by Richard Wallis
WorldCat, Works, and Schema.orgWorldCat, Works, and Schema.org
WorldCat, Works, and Schema.org
Richard Wallis2.3K views

More from Bernhard Haslhofer

Decentralized Finance (DeFi) - Understanding Risks in an Emerging Financial P... by
Decentralized Finance (DeFi) - Understanding Risks in an Emerging Financial P...Decentralized Finance (DeFi) - Understanding Risks in an Emerging Financial P...
Decentralized Finance (DeFi) - Understanding Risks in an Emerging Financial P...Bernhard Haslhofer
566 views20 slides
Token Systems, Payment Channels, and Corporate Currencies by
Token Systems, Payment Channels, and Corporate CurrenciesToken Systems, Payment Channels, and Corporate Currencies
Token Systems, Payment Channels, and Corporate CurrenciesBernhard Haslhofer
438 views48 slides
Can a blockchain solve the trust problem? by
Can a blockchain solve the trust problem?Can a blockchain solve the trust problem?
Can a blockchain solve the trust problem?Bernhard Haslhofer
1.2K views23 slides
Measurements in Cryptocurrency Networks by
Measurements in Cryptocurrency NetworksMeasurements in Cryptocurrency Networks
Measurements in Cryptocurrency NetworksBernhard Haslhofer
751 views37 slides
Post-Bitcoin Cryptocurrencies, Off-Chain Transaction Channels, and Cryptocur... by
 Post-Bitcoin Cryptocurrencies, Off-Chain Transaction Channels, and Cryptocur... Post-Bitcoin Cryptocurrencies, Off-Chain Transaction Channels, and Cryptocur...
Post-Bitcoin Cryptocurrencies, Off-Chain Transaction Channels, and Cryptocur...Bernhard Haslhofer
926 views57 slides
Insight Into Cryptocurrencies - Methods and Tools for Analyzing Blockchain-ba... by
Insight Into Cryptocurrencies - Methods and Tools for Analyzing Blockchain-ba...Insight Into Cryptocurrencies - Methods and Tools for Analyzing Blockchain-ba...
Insight Into Cryptocurrencies - Methods and Tools for Analyzing Blockchain-ba...Bernhard Haslhofer
422 views22 slides

More from Bernhard Haslhofer(20)

Decentralized Finance (DeFi) - Understanding Risks in an Emerging Financial P... by Bernhard Haslhofer
Decentralized Finance (DeFi) - Understanding Risks in an Emerging Financial P...Decentralized Finance (DeFi) - Understanding Risks in an Emerging Financial P...
Decentralized Finance (DeFi) - Understanding Risks in an Emerging Financial P...
Bernhard Haslhofer566 views
Token Systems, Payment Channels, and Corporate Currencies by Bernhard Haslhofer
Token Systems, Payment Channels, and Corporate CurrenciesToken Systems, Payment Channels, and Corporate Currencies
Token Systems, Payment Channels, and Corporate Currencies
Bernhard Haslhofer438 views
Can a blockchain solve the trust problem? by Bernhard Haslhofer
Can a blockchain solve the trust problem?Can a blockchain solve the trust problem?
Can a blockchain solve the trust problem?
Bernhard Haslhofer1.2K views
Post-Bitcoin Cryptocurrencies, Off-Chain Transaction Channels, and Cryptocur... by Bernhard Haslhofer
 Post-Bitcoin Cryptocurrencies, Off-Chain Transaction Channels, and Cryptocur... Post-Bitcoin Cryptocurrencies, Off-Chain Transaction Channels, and Cryptocur...
Post-Bitcoin Cryptocurrencies, Off-Chain Transaction Channels, and Cryptocur...
Bernhard Haslhofer926 views
Insight Into Cryptocurrencies - Methods and Tools for Analyzing Blockchain-ba... by Bernhard Haslhofer
Insight Into Cryptocurrencies - Methods and Tools for Analyzing Blockchain-ba...Insight Into Cryptocurrencies - Methods and Tools for Analyzing Blockchain-ba...
Insight Into Cryptocurrencies - Methods and Tools for Analyzing Blockchain-ba...
Bernhard Haslhofer422 views
O Bitcoin Where Art Thou? An Introduction to Cryptocurrency Analytics by Bernhard Haslhofer
O Bitcoin Where Art Thou? An Introduction to Cryptocurrency AnalyticsO Bitcoin Where Art Thou? An Introduction to Cryptocurrency Analytics
O Bitcoin Where Art Thou? An Introduction to Cryptocurrency Analytics
Bernhard Haslhofer635 views
Mind the Gap - Data Science Meets Software Engineering by Bernhard Haslhofer
Mind the Gap - Data Science Meets Software EngineeringMind the Gap - Data Science Meets Software Engineering
Mind the Gap - Data Science Meets Software Engineering
Bernhard Haslhofer303 views
GraphSense - Real-time Insight into Virtual Currency Ecosystems by Bernhard Haslhofer
GraphSense - Real-time Insight into Virtual Currency EcosystemsGraphSense - Real-time Insight into Virtual Currency Ecosystems
GraphSense - Real-time Insight into Virtual Currency Ecosystems
Bernhard Haslhofer1.1K views
BITCOIN - De-anonymization and Money Laundering Detection Strategies by Bernhard Haslhofer
BITCOIN - De-anonymization and Money Laundering Detection StrategiesBITCOIN - De-anonymization and Money Laundering Detection Strategies
BITCOIN - De-anonymization and Money Laundering Detection Strategies
Bernhard Haslhofer2.7K views
Bitcoin - Introduction, Technical Aspects and Ongoing Developments by Bernhard Haslhofer
Bitcoin - Introduction, Technical Aspects and Ongoing DevelopmentsBitcoin - Introduction, Technical Aspects and Ongoing Developments
Bitcoin - Introduction, Technical Aspects and Ongoing Developments
Bernhard Haslhofer3.9K views
Maphub und Pelagios: Anwendung von Linked Data in den Digitalen Geisteswissen... by Bernhard Haslhofer
Maphub und Pelagios: Anwendung von Linked Data in den Digitalen Geisteswissen...Maphub und Pelagios: Anwendung von Linked Data in den Digitalen Geisteswissen...
Maphub und Pelagios: Anwendung von Linked Data in den Digitalen Geisteswissen...
Bernhard Haslhofer1.6K views
The value of open data and the OpenGLAM network by Bernhard Haslhofer
The value of open data and the OpenGLAM networkThe value of open data and the OpenGLAM network
The value of open data and the OpenGLAM network
Bernhard Haslhofer835 views
Offene Daten im Kulturbereich - Die pragmatische Perspektive by Bernhard Haslhofer
Offene Daten im Kulturbereich - Die pragmatische PerspektiveOffene Daten im Kulturbereich - Die pragmatische Perspektive
Offene Daten im Kulturbereich - Die pragmatische Perspektive
Bernhard Haslhofer1.1K views
Semantic Tagging for old maps...and other things on the Web by Bernhard Haslhofer
Semantic Tagging for old maps...and other things on the WebSemantic Tagging for old maps...and other things on the Web
Semantic Tagging for old maps...and other things on the Web
Bernhard Haslhofer872 views

Recently uploaded

DISTILLATION.pptx by
DISTILLATION.pptxDISTILLATION.pptx
DISTILLATION.pptxAnupkumar Sharma
82 views47 slides
Gross Anatomy of the Liver by
Gross Anatomy of the LiverGross Anatomy of the Liver
Gross Anatomy of the Liverobaje godwin sunday
100 views12 slides
The Picture Of A Photograph by
The Picture Of A PhotographThe Picture Of A Photograph
The Picture Of A PhotographEvelyn Donaldson
38 views81 slides
INT-244 Topic 6b Confucianism by
INT-244 Topic 6b ConfucianismINT-244 Topic 6b Confucianism
INT-244 Topic 6b ConfucianismS Meyer
51 views77 slides
JRN 362 - Lecture Twenty-Two by
JRN 362 - Lecture Twenty-TwoJRN 362 - Lecture Twenty-Two
JRN 362 - Lecture Twenty-TwoRich Hanley
39 views157 slides
Berry country.pdf by
Berry country.pdfBerry country.pdf
Berry country.pdfMariaKenney3
82 views12 slides

Recently uploaded(20)

INT-244 Topic 6b Confucianism by S Meyer
INT-244 Topic 6b ConfucianismINT-244 Topic 6b Confucianism
INT-244 Topic 6b Confucianism
S Meyer51 views
JRN 362 - Lecture Twenty-Two by Rich Hanley
JRN 362 - Lecture Twenty-TwoJRN 362 - Lecture Twenty-Two
JRN 362 - Lecture Twenty-Two
Rich Hanley39 views
Ask The Expert! Nonprofit Website Tools, Tips, and Technology.pdf by TechSoup
 Ask The Expert! Nonprofit Website Tools, Tips, and Technology.pdf Ask The Expert! Nonprofit Website Tools, Tips, and Technology.pdf
Ask The Expert! Nonprofit Website Tools, Tips, and Technology.pdf
TechSoup 67 views
Introduction to AERO Supply Chain - #BEAERO Trainning program by Guennoun Wajih
Introduction to AERO Supply Chain  - #BEAERO Trainning programIntroduction to AERO Supply Chain  - #BEAERO Trainning program
Introduction to AERO Supply Chain - #BEAERO Trainning program
Guennoun Wajih135 views
GSoC 2024 .pdf by ShabNaz2
GSoC 2024 .pdfGSoC 2024 .pdf
GSoC 2024 .pdf
ShabNaz245 views
Peripheral artery diseases by Dr. Garvit.pptx by garvitnanecha
Peripheral artery diseases by Dr. Garvit.pptxPeripheral artery diseases by Dr. Garvit.pptx
Peripheral artery diseases by Dr. Garvit.pptx
garvitnanecha135 views
UNIT NO 13 ORGANISMS AND POPULATION.pptx by Madhuri Bhande
UNIT NO 13 ORGANISMS AND POPULATION.pptxUNIT NO 13 ORGANISMS AND POPULATION.pptx
UNIT NO 13 ORGANISMS AND POPULATION.pptx
Madhuri Bhande48 views
ANGULARJS.pdf by ArthyR3
ANGULARJS.pdfANGULARJS.pdf
ANGULARJS.pdf
ArthyR354 views

Linked Data in Scholarly Communication

  • 1. Linked Data in Scholarly Communication AAHEP5 Information Provider Summit, Sept. 22nd 2011 Bernhard Haslhofer | Cornell University, Information Science
  • 2. Overview • Linked Data Vision and Goals • Enabling Technologies (by example) • Publishing and Consuming Linked Data • The SciLink Project AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 5. Web Architecture HTML HTML HTML hyperlinks hyperlinks AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 6. Web Architecture HTML HTML HTML HTML HTML AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 7. Web Architecture • A set of simple standards • Uniform document encoding (HTML) • Uniform global (!) addressing (URI) • Uniform transportation (HTTP) • Hyperlinks connecting documents • Works pretty well for accessing and exchanging documents AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 8. But sometimes we need to access to the underlying (meta)data. AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 11. Web Services and Web APIs API API API API API AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 12. Web Services and Web APIs • Each Web API has a proprietary interface • Datasources must be known in advance • Information entities (papers, authors, subjects, etc.) are not linked AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 13. Linked Data Vision • Create a single global data space based on the Architecture of the World Wide Web RDF RDF RDF RDF links RDF links AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 14. Linked Data Principles • Use URIs as names for things • Use HTTP URIs so that people can look up those names • When someone looks up a URI, provide useful information, using standards (RDF, SPARQL) • Include links to other URIs, so that they can discover more things AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 15. Web of Linked Data • A set of simple standards • Uniform data model (RDF) • Uniform global (!) addressing (URI) • Uniform transportation (HTTP) • RDF links connecting entities • Forms a global dataspace and facilitates accessing and exchanging data AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 16. Web of Linked Data • Data publishers • assign URIs to their information entities • publish RDF instance data about these entities • publish schemas / vocabularies on the Web • link their entities with other entities on the Data Web • can provide query access to their data AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 17. Linking Open Data Project • Open Data: a philosophy, practice, or policy that data are freely available to everyone without restrictions from copyright, patents, a.s.o • Linked Data: method / best practices for exposing, sharing, and connecting data using URI and RDF • Linking Open Data: a W3C community project with the goal to extend the Web with a data commons by publishing various open data sets as RDF on the Web and by setting links between data items from different sources AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 20. Linked Data in the Industry AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 21. Linked Data in Libraries • Strong uptake in recent years • Library of Congress (vocabularies) • German / Swedish / Hungarian National Library (authorities and/or catalogue data) • Europeana (metadata about ~4M items) • W3C Library Linked Data Incubator Group AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 22. Overview • Linked Data Vision and Goals • Enabling Technologies (by example) • Publishing and Consuming Linked Data • The SciLink Project AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 23. Uniform Resource Identifiers (URIs) • Name and identify things (resources) • Dereferencable HTTP URIs http://springerlink.com/ http://arxiv.org/authors/ author/xyzafbea Haslhofer_B http://eprints.cs.univie.ac.at/ creator/xyzafbea http://arxiv.org/abs/ 1106.5178 http://eprints.cs.univie.ac.at/ 2871 AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 24. Resource Description Framework (RDF) • A data model for representing data on the Web • Sets of statements (triples) form a graph http://xmlns.com/foaf/ spec/#term_Person rdf:type http://arxiv.org/authors/ Haslhofer_B Bernhard Haslhofer http://purl.org/ontology/ bibo/Article dcterms:creator foaf:name http://arxiv.org/abs/ dcterms:created rdf:type 1106.5178 2011-06-25T22:55:17 dc:title The Open Annotation Collaboration (OAC) Model AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 25. RDF Serialization (RDF/XML, Turtle, N- Triples, RDF/JSON) • Data formats for serializing RDF graphs • Used to transfer RDF data between apps <http://arxiv.org/abs/1106.5178> dc:title “The Open Annotation Collaboration Model” . ... AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 26. RDF Vocabulary Description Language (RDFS) • A language for describing the syntax and semantics of vocabularies in a machine understandable way http://xmlns.com/foaf/ rdf:type rdfs:Class spec/#term_Agent rdfs:subclassOf rdf:type rdf:type http://xmlns.com/foaf/ http://purl.org/ontology/ spec/#term_Person bibo/Article AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 27. OWL Web Ontology Language • A more expressive (formal) language for defining the syntax and semantics of vocabularies owl:DatatypeProperty owl:ObjectProperty owl:FunctionalProperty rdf:type rdf:type rdf:type http://xmlns.com/foaf/0.1/ http://xmlns.com/foaf/0.1/ http://xmlns.com/foaf/0.1/ name homepage gender AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 28. OWL Web Ontology Language • Defines properties for linking resources owl:sameAs http://arxiv.org/authors/ http://springerlink.com/ Haslhofer_B author/xyzafbea owl:sameAs owl:sameAs http://arxiv.org/abs/ 1106.5178 http://eprints.cs.univie.ac.at/ creator/xyzafbea owl:sameAs http://eprints.cs.univie.ac.at/ 2871 NOTE: owl:sameAs is NOT the only linking property. AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 29. Simple Knowledge Organization System (SKOS) • A language for describing controlled vocabularies (taxonomies, thesauri, etc) skos:closeMatch http://springerlink.com/ http://arxiv.org/subjects/ vocab/computer-science skos.rdf#cs skos:broadMatch skos:broadMatch skos:broader http://arxiv.org/subjects/ http://eprints.cs.univie.ac.at/ skos.rdf#cs.DL subjects/multimedia dcterms:subject dcterms:subject http://arxiv.org/abs/ http://eprints.cs.univie.ac.at/ 1106.5178 2871 AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 30. SPARQL • A query language and protocol for accessing RDF data on the Web SELECT DISTINCT ?x WHERE { ?x skos:subject <http://arxiv.org/subjects/skos.rdf#cs> } LIMIT 10 AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 31. Overview • Linked Data Vision and Goals • Enabling Technologies (by example) • Publishing and Consuming Linked Data • The SciLink Project AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 32. Information vs. Non-Information Resource • Information Resource • web pages, images, pdf documents, etc. • all their essential characteristics can be conveyed in a message • e.g., http://arxiv.org/pdf/1106.5178v1 • Non-Information Resource • other things such as people, this meeting, concepts • their essence is not information • e.g., http://arxiv.org/authors/Haslhofer_B AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 33. Publishing Data • Distinguish between non-information and information resource • Sample NIR • http://arxiv.org/resource/1106.5178 • Sample IRs • http://arxiv.org/abs/1106.5178 (HTML) • http://arxiv.org/data/1106.5178 (RDF) AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 34. Publishing Data GET http://arxiv.org/resource/1106.5187 Accept: application/rdf+xml 303 See Other Location: http://arxiv.org/data/1106.5187 GET http://arxiv.org/data/1106.5187 Accept: application/rdf+xml 200 OK ... <?xml version="1.0" encoding="utf-8"?> <rdf:RDF ... AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 35. Publishing Data GET http://arxiv.org/resource/1106.5187 Accept: application/rdf+xml 303 See Other Location: http://arxiv.org/data/1106.5187 Oh dear! GET http://arxiv.org/data/1106.5187 Accept: application/rdf+xml 200 OK ... <?xml version="1.0" encoding="utf-8"?> <rdf:RDF ... AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 36. RDFa / Microformats / Microdata • Mechanisms for embedding structured data in Web documents • Define or use a set of attributes to augmented presentation-oriented (HTML) documents with structured data • User agents can extract data from Web documents AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 37. RDFa Example <div typeof=”bibo:Article” xmlns:bibo=”http://purl.org/ontology/bibo”> <div xmlns:dc=”http://purl.org/dc/elements/1.1/”> <h1 property=”dc:title”>The Open Annotation Collaboration (OAC) Model</h1> .... </div> </div> AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 38. Microformats Example <head profile=”http://www.w3.org/2006/03/hcard”> ... </head> <div class=”vcard”> <div class=”fn”>Bernhard Haslhofer</div> </div> AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 39. RDFa vs. Microformats Microformats RDFa flat namespace XML namespaces support HTML4, XHTML 1.1, and support for XHTML 1.1, HTML 5 HTML 5 use latent HTML attributes introduces new metadata attributes vocabulary defined by one open to any RDF-based vocabulary organization/community Also see: http://evan.prodromou.name/RDFa_vs_microformats AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 40. Microdata • A very young HTML proposition that extends Microformats and addresses its shortcomings <div itemscope itemtype="http://schema.org/Article">   <span itemprop="name">The Open Annotation Collaboration (OAC) Model</span>   by <span itemprop="author">Bernhard Haslhofer</span>, <span itemprop=”author”>Rainer Simon</span>, <span itemprop=”author”>Robert Sanderson</span> and <span itemprop=”author”>Herbert Van De Sompel</span>   This article has been tweeted 1 times and contains 0 user comments.   <meta itemprop="interactionCount" content="UserTweets:1"/>   <meta itemprop="interactionCount" content="UserComments:0"/> </div> AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 42. Web Data Publishing Approaches Pro Con Separates data / Technical complexity Linked Data presentation markup Conneg IR / NIR Communication Web Architecture overhead Mixes data / RDFa / Easy to implement presentation markup Microformats / Microdata Search Engines Becomes messy AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 43. Consuming Linked Data curl -H “Accept: application/rdf+xml” http://arxiv.org/data/1105.5178 rapper http://arxiv.org/resource/1105.5178 ... any RDF library AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 44. Overview • Linked Data Vision and Goals • Enabling Technologies (by example) • Publishing and Consuming Linked Data • The SciLink Project AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 45. Motivation • Current research articles • are mostly PDFs (a kind of BLOB) • point to supplemental or related information by textual references, DOIs, or embedded hypertext links HTML HTML HTML HTML HTML AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 46. Motivation • But a publication is more: • data sets, conference presentations • blog / wiki entries, software libraries, etc • There is related information on the Web (e.g., Wikipedia) AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 47. Overall Goal Publication Document (PDF) Aggregation for the Document foaf:firstname Jennifer dbpedia: dbpedia: Drosphila Category: dblp: Genome-wide analysis of Notch jmummery Mummery- Signal_transduction signaling in Drosphila by Widmer transgenic RNAi rdfs:seeAlso foaf:lastname skos:subject dcterms: creator Jennifer Mummery-Widmer, .... Outer context Here we report the use of a library of Inner context Drosphila strains...we identified 6 new genes involved in asymmetric cell ore: division. Aggregation Genome-wide dc:title analysis... rdf:type A total of 20,262 transgenic RNAi lines were ex:agg1 screened...Ten flies were recorded in an online database... ore: aggregates ore: aggregates ex: ex: figure4.jpg ore: aggregates ore: aggregates nature07936.pdf ex: ex: movie1.avi database.zip Figure 4. An interaction network for Notch dctype: signalling Dataset Asymmmetric dc:title rdf:type cell division http://www.nature.com/nature/journal/ vaop/ncurrent/pdf/nature07936.pdf AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 48. Methodology • Aggregation/Data Extraction: create aggregations from available information (PDFs, metadata) • Analysis: combine aggregations and learn about established research and publication practices in different areas (focus on linking) • Tool design and Implementation • link discovery • resource synchronization / link maintenance • demonstrator applications operating on data graph AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 49. Aggregation / Data Extraction • scilink-extract library • takes PDFs and metadata as input • creates ORE Aggregations / CSV files • currently implemented for arXiv.org • https://github.com/behas/scilink-extract AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 50. Analysis • How did HTTP linkage evolve in the past? • Where in their papers do scholars make use of HTTP links? • What are the linking to? • Are the links still operational? • .... AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011
  • 51. Status • Extraction for arxiv.org finished (~ 700K ) PDFs • We need access to further corpora covering different research areas !!! Work in Progress AAHEP5 Information Provider Summit | Ithaca, NY | Sept. 22nd, 2011