Simple Knowledge Organization System (SKOS) in the Context of Semantic Web Deployment, Library of Congress, May 2008


    Notes on slide 1

    Let me start by saying that I am here to promote SKOS. Having said that, I would like to be as objective as I can about both the technical and the business arguments for using SKOS, and for buying in to the Semantic Web technology ecosystem. I’d like to do that because I want to use this presentation as a kind of sanity check. Libraries such as LOC, which have invested over a long period of time in knowledge organisation, have always been acknowledged as primary stakeholders in the development of SKOS, and SKOS has always been about providing a means for libraries to extract and share more value from their knowledge organisation systems. So I’d very much like to know whether what I say today makes sense from your point of view. This is especially relevant as SKOS nears completion: the main technical work on SKOS will effectively be complete by the end of June.

    Simple Knowledge Organization System (SKOS) in the Context of Semantic Web Deployment, Library of Congress, May 2008 - Presentation Transcript

    1. The Simple Knowledge Organization System (SKOS) in the context of Semantic Web Deployment Alistair Miles http://purl.org/net/aliman Library of Congress May 2008
    2. THE FUTURE OF THE WEB
      • Testimony of Sir Timothy Berners-Lee CSAIL Decentralized Information Group Massachusetts Institute of Technology
      • Before the United States House of Representatives Committee on Energy and Commerce Subcommittee on Telecommunications and the Internet
      • http://dig.csail.mit.edu/2007/03/01-ushouse-future-of-the-web.html
    3. I. Foundations of the Web
      • “ The success of the World Wide Web, itself built on the open Internet, has depended on three critical factors:
      • unlimited links from any part of the Web to any other;
      • open technical standards as the basis for continued growth of innovation applications; and
      • separation of network layers , enabling independent innovation for network transport, routing and information applications.”
    4. A. Universal linking: Anyone can connect to anyone...
      • “ In simple terms, the Web has grown because it's easy to write a Web page and easy to link to other pages.”
      • “ What makes it easy to create links ... is that there is no limit to the number of pages or number of links possible on the Web.”
      • “ Adding a Web page requires no coordination with any central authority , and has an extremely low, often zero, additional cost.”
      • “ Adding a page provides content, but adding a link provides the organization, structure and endorsement to information on the Web which turn the content as a whole into something of great value.”
      • “ The universality and flexibility of the Web's linking architecture has a unique capacity to break down boundaries of distance, language, and domains of knowledge.”
      • “ These traditional barriers fall away because the cost and complexity of a link is unaffected by most boundaries that divide other media.”
      • “ The Web's ability to allow people to forge links is why we refer to it as an abstract information space , rather than simply a network.”
    5. II. Looking forward
      • “ First, the Web will get better and better at helping us to manage, integrate, and analyze data .”
      • “ Today, the Web is quite effective at helping us to publish and discover documents, but the individual information elements within those documents ... cannot be handled directly as data.”
      • “ Today you can see the data with your browser, but can't get other computer programs to manipulate or analyze it without going through a lot of manual effort yourself.”
      • “ As this problem is solved, we can expect that Web as a whole to look more like a large database or spreadsheet , rather than just a set of linked documents.”
    6. A. Data Integration
      • “ Locked within all of this data is the key to knowledge about how to cure diseases, create business value, and govern our world more effectively.”
      • “ The good news is that a number of technical innovations...
      • ... (RDF which is to data what HTML is to documents, and the Web Ontology Language (OWL) which allows us to express how data sources connect together) ...
      • ... along with more openness in information sharing practices are moving the World Wide Web toward what we call the Semantic Web .”
      • “ Progress toward better data integration will happen through use of the key piece of technology that made the World Wide Web so successful: the link .”
      • “ The power of the Web today, including the ability to find the pages we're looking for , derives from the fact that documents are put on the Web in standard form, and then linked together .”
      • “ The Semantic Web will enable better data integration by allowing everyone who puts individual items of data on the Web to link them with other pieces of data using standard formats.”
    7. DATA WEBS FOR E-SCIENCE
    8. http://purl.org/net/aliman
    9. FlyWeb Project
      • Fruit flies ( Drosophila ...)
      • Model organism
      • Extensive body of genetic research
      • Much of that knowledge is in journal papers
      • Recognised value of research data
      • Establish public databases
        • E.g. FlyBase
        • Centrally curated
    10. http://purl.org/net/aliman
    11. Data Webs
      • Link data resources
      • Ask questions that no single data resource can answer
      • What’s the easiest, cheapest, most scalable way to achieve this?
      • Agile approach, add value incrementally, return value early and often...
    12. Data Web Layer Cake
      • Level 0 – Any Data Resources in the Web
      • Level 1 – SPARQL End-points
      • Level 2 – SPARQL End-points (Schema Alignment)
      • Level 3 – SPARQL End-points (Integrated Data)
      • Application labels: Vertical Web Apps, Web 2 Mash-ups, SPARQL Mash-ups, SPARQL Mash-ups, ???
    13. Example Application
      • [insert screenshot of mashup]
    14. Future, self-publishing
      • As publishing data on the Web becomes easier...
      • ...more research groups will publish their own data...
      • ...rich network of data resources...
      • ...challenging traditional view of scholarly life cycle & value chain ... value grid...
    15. SKOS
    16. Potted History
      • SKOS 2001 (pre-alpha)
        • Thesaurus Interchange Format (TIF), LIMBER Project
      • SKOS 2003 (alpha)
        • Semantic Web Advanced Development for Europe (SWAD-Europe)
      • SKOS 2005 (beta)
        • W3C Semantic Web Best Practices and Deployment Working Group (SWBPD)
      • SKOS 2008 (W3C Recommendation)
        • W3C Semantic Web Deployment Working Group (SWD)
    17. http://www.w3.org/2007/Talks/1211-whit-tbl/
    18. Layers in the Web
      • http://www.w3.org/2007/Talks/1211-whit-tbl/#(23)
      • Third layer is network (graph) of connections beyond documents...
      • ... people, organisations, genes, proteins, concepts ...
      • Represent these connections (data) in the (Semantic) Web
    19. KOS e.g. LCSH
      • Can be viewed as a network of interconnected concepts
      • Represent LCSH as data in the Web
        • Make those concepts and their interconnections part of the Web
    20. http://purl.org/net/aliman
    21. http://purl.org/net/aliman
    22. http://purl.org/net/aliman
    23. http://purl.org/net/aliman
    24. Publishing KOS in the Web?
      • Use RDF
        • Basic framework for data in the Web – resources, literals, links... (“graphs” of data)
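The idea of a "graph" of resources, literals and links can be sketched with plain Python sets of (subject, predicate, object) triples. This is only an illustration under stated assumptions: the record and concept URIs are hypothetical, literals are simplified to plain strings, and a real deployment would use an RDF library rather than raw tuples. The two property URIs are the published Dublin Core Terms and SKOS identifiers.

```python
# Published property URIs (Dublin Core Terms and SKOS).
DCTERMS_SUBJECT = "http://purl.org/dc/terms/subject"
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"

# Hypothetical URIs, invented for this example only.
RECORD = "http://catalogue.example.org/record/42"
CONCEPT = "http://subjects.example.org/baseball-in-art"

# Two independently published graphs, each a set of triples.
catalogue_graph = {(RECORD, DCTERMS_SUBJECT, CONCEPT)}
subject_graph = {(CONCEPT, SKOS_PREF_LABEL, "Baseball in art")}

# Merging graphs is set union; shared URIs join the data together.
merged = catalogue_graph | subject_graph


def objects(graph, subject, predicate):
    """Return all objects of triples matching the subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def subject_labels(graph, record):
    """Follow record -> concept -> label links within one graph."""
    labels = []
    for concept in objects(graph, record, DCTERMS_SUBJECT):
        labels.extend(objects(graph, concept, SKOS_PREF_LABEL))
    return labels


print(subject_labels(catalogue_graph, RECORD))  # no label known yet
print(subject_labels(merged, RECORD))           # label found via the link
```

Neither graph alone can answer "what is this record about, in words?"; the merged graph can, because both sources name the concept with the same URI.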
    25. Publishing KOS in the Web?
      • Use SKOS
      • Standard set of...
        • Resource types (Classes)
        • Link types (Properties)
      • ... For representing KOS as RDF data
      • (N.B. Because we use URIs as names for classes and properties, we call this an RDF vocabulary)
    26. SKOS Resource Types (Classes)
      • skos:Concept
        • E.g. Baseball in art
      • skos:ConceptScheme
        • E.g. LCSH itself
    27. SKOS Link Types (Properties)
      • For labeling concepts
        • skos:prefLabel, skos:altLabel, skos:hiddenLabel
      • For documenting concepts
        • skos:note, skos:scopeNote, skos:definition, skos:editorialNote...
      • For linking concepts
        • skos:broader, skos:narrower, skos:related
      • http://inkdroid.org/bzr/lcsh/docs/slides/
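The SKOS classes and properties listed on slides 26 and 27 could be used roughly as follows. This is a minimal sketch, not the LCSH conversion itself: the base URI is hypothetical, the "broader" link from "Baseball in art" to "Art" is invented for illustration and is not taken from LCSH, and skos:inScheme is a SKOS property not listed on the slides. Output is written as N-Triples using only the standard library.

```python
# Published W3C namespace URIs.
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical base URI; these are not real LCSH identifiers.
BASE = "http://example.org/subjects/"


def uri(value):
    """Tag a value as a URI so it can be told apart from a literal."""
    return ("uri", value)


def literal(text, lang):
    """Tag a value as a plain literal with a language tag."""
    return ("literal", text, lang)


def describe_concept(local_id, pref_label, alt_labels=(), broader=(), scheme=None):
    """Return a set of triples describing one skos:Concept."""
    subject = uri(BASE + local_id)
    triples = {
        (subject, uri(RDF + "type"), uri(SKOS + "Concept")),
        (subject, uri(SKOS + "prefLabel"), literal(pref_label, "en")),
    }
    for label in alt_labels:
        triples.add((subject, uri(SKOS + "altLabel"), literal(label, "en")))
    for parent in broader:
        triples.add((subject, uri(SKOS + "broader"), uri(BASE + parent)))
    if scheme is not None:
        triples.add((subject, uri(SKOS + "inScheme"), uri(BASE + scheme)))
    return triples


def term(t):
    """Render one tagged term in N-Triples syntax."""
    if t[0] == "uri":
        return "<%s>" % t[1]
    escaped = t[1].replace("\\", "\\\\").replace('"', '\\"')
    return '"%s"@%s' % (escaped, t[2])


def to_ntriples(triples):
    """Serialise triples as N-Triples lines, sorted for stable output."""
    lines = ["%s %s %s ." % tuple(term(t) for t in triple) for triple in triples]
    return "\n".join(sorted(lines))


graph = describe_concept("baseball-in-art", "Baseball in art",
                         broader=["art"], scheme="scheme")
graph |= describe_concept("art", "Art", scheme="scheme")
graph.add((uri(BASE + "scheme"), uri(RDF + "type"), uri(SKOS + "ConceptScheme")))
print(to_ntriples(graph))
```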
    28. http://purl.org/net/aliman
    29. http://purl.org/net/aliman
    30. Publishing LCSH in the Web
      • Project LCSH into RDF (i.e. create an RDF representation)
      • Publish it in the Web as linked data
        • http://lcsh.info
      • Ed Summers, Clay Redding, Dan Krech, Antoine Isaac
    31. Scope of SKOS
      • SKOS will be an all-encompassing standard for the lossless representation and exchange of all varieties of knowledge organisation system ... ?
        • No
      • http://lists.w3.org/Archives/Public/public-swd-wg/2008Feb/0116.html -- Antoine Isaac
      • “ ...the things that we aim at representing are very diverse: some classification schemes use ‘codes’ and refer to ‘classes’, thesauri have ‘terms’ and so on.”
      • “ Yet, it happens, looking at the way these things are used now and will be in the near future (with more and more links established between them), that (i) some standardisation has to take place, and that (ii) this standardisation can be actually grounded on some observed practical similarities ( http://www.w3.org/TR/skos-ucr/ )”
      • “ Our aim is not to replace the original objects in their initial context of use, but to allow to port them to a shared space , based on a simplified model , enabling wider re-use and better interoperability.”
    32. Lessons from the Web
      • Less is more ...
        • E.g. REST over SOAP
      • SKOS should capture a small amount of common ground ... Just enough to enable KOS’s valuable concepts and connections to be deployed in the Web and be linked to/from
      • N.B. SKOS is infinitely extensible!
        • Easy to mix & match
        • Easy to refine
    33. THE VALUE OF LINKS
    34. The value of links
      • The Web showed that links between documents are really useful
      • Google’s PageRank showed that the structure of the network means something (and is worth something!)
      • Social networking Web sites showed how much we value other kinds of links
    35. Linked Metadata
      • You’ve got LCSH in the Web, what next?
      • ... Linked metadata...?
      • http://inkdroid.org/bzr/lcsh/docs/slides/
    36. http://purl.org/net/aliman
    37. http://purl.org/net/aliman
      • [insert demo, show how links change topology of information space]
    38. http://purl.org/net/aliman
    39. Value Proposition
      • Links are paths to the discovery of information
      • Links can be exploited in useful (and surprising) ways
      • Well-established KOS like LCSH can be hubs in the network of linked metadata, bridging ...
      • (On the Semantic Web, LCSH should get very high semantic pagerank !)
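The "hub" intuition behind the pagerank remark can be illustrated with ordinary PageRank computed by power iteration. This is a rough sketch of the standard algorithm, not a "semantic" variant and not anything the talk specifies: the node names, the toy link graph and the damping value of 0.85 are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain PageRank by power iteration over a dict of node -> linked nodes."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node (no outgoing links): spread its rank evenly.
                share = damping * rank[node] / len(nodes)
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical metadata records that all link to one subject heading.
toy_links = {
    "record:1": ["subject:baseball-in-art"],
    "record:2": ["subject:baseball-in-art"],
    "record:3": ["subject:baseball-in-art", "record:1"],
}

ranks = pagerank(toy_links)
for node in sorted(ranks, key=ranks.get, reverse=True):
    print(node, round(ranks[node], 3))
```

In this toy graph the subject heading is linked from every record, so it ends up with the highest rank, which is the sense in which a widely used KOS could act as a hub.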
    40. USING URIS
    41. Why use URIs?
      • Identifier management
      • Data discovery
    42. Identifier management
      • Referring to things
      • In a database, each table has a primary key
      • What happens when you try to combine data from 2 databases?
        • Identifier clashes (ambiguous reference)
        • Identifier aliases (co-reference)
      • Clashes hurt precision, give you nonsense
      • Aliases hurt recall, miss important results/links
    43. URIs & identifier management
      • URIs are like a single, global pool of identifiers – one world-wide primary key
      • Can claim ownership of parts of “URI space”
      • Even though we’re all using the same primary key, there is a mechanism for avoiding URI clashes
      • Can join data from multiple sources with confidence
      • But ... doesn’t solve the alias problem, still need to find co-references
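The clash and alias problems from slides 42 and 43 can be made concrete with two small dictionaries standing in for databases. Everything here is hypothetical: the records, the integer keys and the two URI prefixes are invented, and matching aliases by identical label is only a naive heuristic used to show that the alias problem remains after URIs are introduced.

```python
# Two hypothetical databases, each with its own local integer primary keys.
db_a = {1: "Drosophila melanogaster", 2: "Baseball in art"}
db_b = {1: "Library of Congress", 3: "Baseball in art"}

# Clash: the same local key refers to different things in each source.
clashes = sorted(k for k in db_a.keys() & db_b.keys() if db_a[k] != db_b[k])

# Assumed URI prefixes; each owner controls its own part of "URI space".
NS_A = "http://a.example.org/id/"
NS_B = "http://b.example.org/id/"


def qualify(db, namespace):
    """Turn local keys into globally unique URIs."""
    return {namespace + str(key): value for key, value in db.items()}


# With URIs as keys the merge is safe: no key can collide.
merged = {**qualify(db_a, NS_A), **qualify(db_b, NS_B)}


def find_aliases(records):
    """Group identifiers that share a label (a naive co-reference check)."""
    by_label = {}
    for identifier, label in records.items():
        by_label.setdefault(label, []).append(identifier)
    return {label: sorted(ids) for label, ids in by_label.items() if len(ids) > 1}


print(clashes)               # local keys that clash
print(len(merged))           # all four records survive the merge
print(find_aliases(merged))  # two URIs still name the same thing
```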
    44. Data discovery
      • Problem with distributed data ... How do you find everything that’s “out there”?
      • Two general approaches:
        • Centralised
        • Decentralised
    45. Centralised discovery
      • Someone somewhere keeps a “catalogue” of everything
      • Everyone “knows” where that catalogue is
      • New sources “tell” the catalogue about themselves (a.k.a. “register” themselves)
      • E.g. Gas maintenance
      • Works well at small-medium scales
        • E.g. FlyWeb
      • Rely on networks outside the Web (e.g. Knowing the right people)
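The centralised model above amounts to a shared catalogue that sources register with and clients search. A minimal in-memory sketch follows; the class name, method names and endpoint URLs are all hypothetical, and a real registry would of course be a network service rather than a Python object.

```python
class Registry:
    """A toy central catalogue of data sources (illustrative only)."""

    def __init__(self):
        self._sources = {}

    def register(self, endpoint_url, description):
        """A new source tells the catalogue about itself."""
        self._sources[endpoint_url] = description

    def find(self, keyword):
        """Clients ask the catalogue which sources mention a keyword."""
        needle = keyword.lower()
        return sorted(url for url, text in self._sources.items()
                      if needle in text.lower())


registry = Registry()
# Hypothetical endpoints; none of these URLs are real services.
registry.register("http://genes.example.org/sparql", "Fly gene expression data")
registry.register("http://images.example.org/sparql", "Fly embryo images")
registry.register("http://subjects.example.org/sparql", "Subject headings")

print(registry.find("fly"))
```

The weakness the slide points to is visible here: a source that nobody registers is simply invisible to every client.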
    46. FlyWeb Project
      • [small number of large data resources]
    47. Decentralised discovery
      • Data in one source refers to data in another (using a URI)
      • Data from the other source can be retrieved directly, by “de-referencing” the URI via the Web
      • So given one data source, you can “follow your nose” and retrieve data from all linked sources ...
      • ... without needing a central catalogue or registry, just the Web
      • Works well up to Web-scale
        • E.g. FOAF
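The "follow your nose" steps above can be sketched as a breadth-first traversal. To keep the example self-contained, a dictionary stands in for the Web and the dereference function only looks URIs up in it; a real crawler would issue HTTP requests and parse the RDF that comes back. All URIs are hypothetical.

```python
from collections import deque

# Stand-in for the Web: each URI maps to the URIs its data refers to.
FAKE_WEB = {
    "http://alice.example.org/foaf#me": ["http://bob.example.org/foaf#me"],
    "http://bob.example.org/foaf#me": [
        "http://carol.example.org/foaf#me",
        "http://alice.example.org/foaf#me",
    ],
    "http://carol.example.org/foaf#me": [],
    "http://dave.example.org/foaf#me": ["http://alice.example.org/foaf#me"],
}


def dereference(uri):
    """Pretend to retrieve a representation and return the URIs it links to."""
    return FAKE_WEB.get(uri, [])


def follow_your_nose(start, max_sources=100):
    """Visit every source reachable from the start URI, breadth first."""
    visited = []
    queued = {start}
    queue = deque([start])
    while queue and len(visited) < max_sources:
        uri = queue.popleft()
        visited.append(uri)
        for linked in dereference(uri):
            if linked not in queued:
                queued.add(linked)
                queue.append(linked)
    return visited


for uri in follow_your_nose("http://alice.example.org/foaf#me"):
    print(uri)
```

Starting from Alice, the traversal reaches Bob and Carol without any registry. Dave links to Alice but nobody links to Dave, so he is never found: decentralised discovery only reaches sources that something already points at.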
    48. Dereferenceable?
      • For some URIs, can retrieve a “representation” of the “resource” via the Web
      • (N.B. “resource” = “thing”)
    49. FOAF
      • Very large number of relatively small data resources
    50. Why use URIs?
      • Identifier management
        • From 2 to 2 billion data sources, always a problem
      • Data discovery
        • Ability to “de-reference” a URI opens possibility for decentralisation
        • Ability to “de-reference” is also useful in centralised models (e.g. Registries can harvest)
    51. SEMANTIC WEB DEPLOYMENT
    52. W3C SWD WG
      • SKOS
      • RDFa
      • Recipes for publishing RDF (linked data)
      • Vocabulary management
    53. W3C Semantic Web Activity
      • Semantic Web Deployment
      • Data Access (DAWG)
        • SPARQL query language, SPARQL protocol
      • GRDDL
      • OWL 2
      • SWHCLSIG
      • SWEO
    54. SUMMARY
    55. Suggestions
      • Linked KOS
        • Project LCSH into RDF (SKOS) – done
        • Publish LCSH as linked data in the Web – done
        • Publish SPARQL endpoint for LCSH
      • Linked metadata
        • Project LOC metadata into RDF
        • Publish LOC metadata as linked data in the Web
          • With links to LCSH & LCC
        • Publish SPARQL endpoint for LOC metadata
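As a sketch of what a client of such a SPARQL endpoint might do, the code below builds a query for the concepts narrower than a given concept and encodes it as a SPARQL Protocol GET request, which carries the query text in a "query" parameter. The endpoint and concept URIs are hypothetical, no request is actually sent, and the concept URI is interpolated without validation, which a real client should not do.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint and concept URI, for illustration only.
ENDPOINT = "http://example.org/sparql"
CONCEPT = "http://example.org/subjects/art"


def narrower_concepts_query(concept_uri):
    """Build a SPARQL query for concepts whose skos:broader is the given one."""
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?narrower ?label WHERE {\n"
        "  ?narrower skos:broader <%s> ;\n"
        "            skos:prefLabel ?label .\n"
        "}" % concept_uri
    )


def request_url(endpoint, query):
    """Encode the query as a SPARQL Protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


query = narrower_concepts_query(CONCEPT)
url = request_url(ENDPOINT, query)
print(query)
print(url)

# Decoding the URL again recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == query)
```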
    56. Bibliographic Information as RDF?
      • Projecting LCSH into RDF ... SKOS is standard vocabulary of resource & link types
      • Projecting LOC metadata into RDF ... Which vocabulary to use???
      • Challenge – diversity of bibliographic information!
    57. RDA -> RDF
      • Joint DCMI/RDA task force
      • Seed funding to develop initial prototype RDF vocabularies for bibliographic information
      • Based on FRBR and data model implicit in RDA
      • Early stages
      • http://dublincore.org/dcmirdataskgroup/
      • Karen Coyle
    58. Thanks
      • STFC Rutherford Appleton Lab
        • Brian Matthews, Michael Wilson, Juan Bicarregui
      • Oxford Image Bioinformatics Research Group
        • David Shotton, Graham Klyne, Jun Zhao
      • W3C Semantic Web Deployment WG
      • Members of [email_address]
      • Comments on SKOS -> [email_address]
