The Semantic Web and Libraries in the United States: Experimentation and Achievements


This presentation reflects the paper titled "The Semantic Web and Libraries in the United States: Experimentation and Achievements," published in the proceedings of the 75th IFLA General Conference and Assembly, Satellite Meeting: Emerging Trends in Technology: Libraries between Web 2.0, Semantic Web and Search Technology, 8/19-20/2009, in Florence, Italy, presented by Sharon Yang, Rider University, Yanyi Lee, Wagner College, and Amanda Xu, St. John's University. The full paper is available at: http://www.ifla2009satelliteflorence.it/meeting3/program/assets/SharonYang.pdf



Usage Rights

© All Rights Reserved

  • SIMILE – Semantic Interoperability of Metadata and Information in Unlike Environments
    • SIMILE add-ons: http://www.answers.com/topic/simile-2 and http://simile.mit.edu/wiki/Category:Project
    • Semantic search: http://wiki.dspace.org/index.php/User:Kotsomit and http://repository.upatras.gr/dspace/help/index.html#semantic
    • SKOS: http://wiki.dspace.org/index.php/User:Christophe.Dupriez and http://www.windmusic.org/dspace/handle/68502/86407
  • Fedora and related projects
    • 1. Fedora – Flexible Extensible Digital Object Repository Architecture. See http://www.fedora-commons.org/confluence/dashboard.action and http://wiki.nsdl.org/index.php/Community:NCore/NCore_in_Fedora_3.0#figure_1
    • 2. NSDL’s NCore Repository Architecture – implements Fedora 2.2 instances and a middleware Web application providing an NCore-centric API and model of interaction, e.g. OAI-PMH access to DC metadata in the repository
    • 3. DuraCloud Project: http://expertvoices.nsdl.org/duraspace/2009/07/15/library-of-congress-and-duracloud-launch-pilot-program-using-cloud-technologies-to-test-perpetual-access-to-digital-content-service-is-part-of-national-digital-information-infrastructure-and-preserva/

Presentation Transcript

  • THE SEMANTIC WEB AND LIBRARIES IN THE UNITED STATES: EXPERIMENTATION AND ACHIEVEMENTS 75th IFLA General Conference and Assembly Satellite Meeting: Emerging Trends in Technology: Libraries between Web 2.0, Semantic Web and Search Technology 8/19-20/2009, Florence, Italy Sharon Yang, Associate Professor/Librarian, Rider University Yanyi Lee, Systems Librarian, Wagner College Amanda Xu, Assistant Professor/Librarian, St. John’s University
  • Overview
    • Introduction to Semantic Web
    • Library’s Role in Semantic Web
    • Semantic Web Development in the U.S. Libraries
    • Library Semantic Web Development in the United States and Europe: A Comparison
    • Conclusion
    • Questions & Answers
  • Introduction
    • Semantic Web: a vision by Tim Berners-Lee; an extension of the current Web; a Web of data for machine processing; smart applications connecting people, communities, things, and knowledge on the Web
    • Semantic Web technologies (loosely a.k.a. Semantic Stack)
      • URIs – naming things and info resources, and providing means to access info about them over the Web
      • Meta-languages, e.g. RDF, RDFS, OWL, SPARQL, etc. and their serialization in RDF/XML, N-Triples, N3, Turtle, RDF/JSON, etc.
      • Logic for reasoning, e.g. making inferences based on classes, subclasses, properties, sub-properties, domains, ranges, and logical operations (union, intersection, etc.) in RDFS; and value restrictions, cardinality, transitivity, equivalence, and logical operations in OWL
      • Proof for validation
      • Trust via digital signatures and other knowledge, e.g. recommendation, rating, and certifications
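The triple model, its N-Triples serialization, and the subclass reasoning listed on this slide can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not a real RDF toolkit. The `http://example.org/` namespace, the class names, and the functions `to_ntriples` and `rdfs_closure` are assumptions made for the example; only the `rdf:type` and `rdfs:subClassOf` URIs are the standard ones.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Standard RDF/RDFS property URIs.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/"  # hypothetical namespace for this sketch

# A tiny graph: two subclass statements and one typed resource.
triples: Set[Triple] = {
    (EX + "Novel", RDFS_SUBCLASS, EX + "Book"),
    (EX + "Book", RDFS_SUBCLASS, EX + "Work"),
    (EX + "moby-dick", RDF_TYPE, EX + "Novel"),
}


def to_ntriples(graph: Set[Triple]) -> str:
    """Serialize URI-only triples as N-Triples (literals are not handled)."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(graph))


def rdfs_closure(graph: Set[Triple]) -> Set[Triple]:
    """Apply two RDFS rules until nothing new is inferred:
    subClassOf is transitive, and rdf:type propagates to superclasses."""
    inferred = set(graph)
    changed = True
    while changed:
        new = set()
        for s, p, o in inferred:
            if p != RDFS_SUBCLASS:
                continue
            for s2, p2, o2 in inferred:
                if p2 == RDFS_SUBCLASS and s2 == o:
                    new.add((s, RDFS_SUBCLASS, o2))  # transitivity
                if p2 == RDF_TYPE and o2 == s:
                    new.add((s2, RDF_TYPE, o))  # type propagation
        changed = not new <= inferred
        inferred |= new
    return inferred


closure = rdfs_closure(triples)
print(to_ntriples(closure))
```

A production system would use an RDF library and a reasoner rather than hand-written rules; the sketch only shows why explicit class hierarchies let a machine derive statements that were never asserted.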
  • Introduction (Continued) Semantic Web Architecture/Stack 1
  • Introduction (Continued)
    • Extended SW for community-based vocabularies
      • SKOS (Simple Knowledge Organization System) – expressing concept organization systems such as thesauri, taxonomies, and controlled vocabularies in RDF/RDFS, OWL
      • FOAF (Friend of a Friend) – describing people, links between them, things they create or do, and their relationships in RDF/RDFS, OWL
      • SIOC (Semantically-Interlinked Online Communities) – expressing information from online community sites, e.g. blogs, forums, and mailing lists
      • Dublin Core – a set of meta-data elements for cross-domain info description
      • Semantic Wiki/Semantic MediaWiki
      • Encoding structured information into HTML pages using Microformats, and mapping it to RDF using GRDDL
      • Encoding and retrieving RDF attributes from HTML page using RDFa
  • Introduction (Continued)
    • Semantic Web Aware Tools
      • Browser level
        • Google’s RDFa support 1 , Yahoo Pipes 2 , Firefox extension for Piggy Bank 3 , FOAFfox 4 , Zotero 5 , etc.
      • Desktop level
        • Twinkle in Jena 6 , and ARC in XAMPP for SPARQL 7
      • Ontology editors
        • Protégé 8 and TopBraid 9
      • RDF store
        • Oracle 11’s RDF database 10 , Vulcan’s Knoodl 11
      • Semantic searching
        • Hakia 12 , Freebase Parallax 13 , Semantic Engines’ SenseBot 14
      • SW framework enablers
        • Drupal 15 , Semsol’s ARC and Trice 16 , Twine 17 , Jena 18 , Sesame 19 , Ontotext’s OWLIM 20
  • Library’s Role in Semantic Web
    • Phase 1 – Semantifying Thesaurus/Mappings/Services
      • Weaving semantic wikis and Semantic MediaWiki
      • Translating LC controlled vocabularies and authority control for named entities, thesauri from domain specific societies and institutions into RDF/RDFS, OWL, SKOS with URIs assigned according to ‘Linked Data Design Principles (Berners-Lee, 2007)’
      • Converting semantic-aware data sets in MARCXML, RDFa and DBpedia into RDF triples in conformance with content models defined in FRBR, RDA, Dublin Core, and registering them with global metadata registries
      • Extensive use of URIs for individual data elements in bibs, authority, etc.
      • Explicit correlation and referencing between LCSH terms, and LCC and DDC numbers
      • Creating built-in links to thesauri with MS Office tools
      • Supporting SPARQL endpoints for querying, analyzing, and federation of data sets
      • Developing business rules and knowledge bases for web services for info extraction, identity resolution, filtering, semantic annotation and search
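Correlating subject terms with class numbers and querying the result, as listed above, comes down to matching triple patterns against a graph, which is what a SPARQL basic graph pattern does. The sketch below is a toy stand-in for a SPARQL endpoint, assuming an in-memory list of triples. The `ex:` property name `ex:classNumber`, the concept identifiers, the sample values, and the functions `match` and `query` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Bindings = Dict[str, str]

# Illustrative data only: two concepts, each with a label and a class number.
GRAPH: List[Triple] = [
    ("ex:c1", "skos:prefLabel", "Libraries"),
    ("ex:c1", "ex:classNumber", "Z"),
    ("ex:c2", "skos:prefLabel", "Mathematics"),
    ("ex:c2", "ex:classNumber", "QA"),
]


def match(pattern: Triple, triple: Triple, bindings: Bindings) -> Optional[Bindings]:
    """Try to unify one pattern with one triple; terms starting with '?' are variables."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if result.setdefault(term, value) != value:
                return None  # variable already bound to a different value
        elif term != value:
            return None  # constant does not match
    return result


def query(graph: List[Triple], patterns: List[Triple]) -> List[Bindings]:
    """Join the patterns one by one, as in a SPARQL basic graph pattern."""
    solutions: List[Bindings] = [{}]
    for pattern in patterns:
        extended = []
        for bindings in solutions:
            for triple in graph:
                merged = match(pattern, triple, bindings)
                if merged is not None:
                    extended.append(merged)
        solutions = extended
    return solutions


# Roughly: SELECT ?label ?num WHERE { ?c skos:prefLabel ?label . ?c ex:classNumber ?num }
results = query(GRAPH, [("?c", "skos:prefLabel", "?label"),
                        ("?c", "ex:classNumber", "?num")])
for row in results:
    print(row["?label"], row["?num"])
```

The shared variable `?c` is what performs the correlation: a label and a class number are returned together only when both hang off the same concept URI.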
  • Library’s Role in Semantic Web (Continued)
    • Phase 2 – Exposing collections
      • Bibliographic data discovery, sharing, and reuse, e.g. Talis Connected Commons 1
      • Authority data discovery, sharing, and reuse, e.g. LC authorities & Vocabularies 2 , OCLC’s Faceted Application of Subject Terminology (FAST) 3 , Multilingual Access to Subjects (MACS) 4
      • Exploratory search, e.g. OCLC WorldCat Local 5 , Scriblio 6 , Endeca 7 , and Solr-based search tools, e.g. VuFind 8 , Primo 9 , and Blacklight 10
      • Social networking, e.g. FRBR Blog 11 , LibGuides’ Widget 12 , Academia.edu 13 , Scientific Collaboration Framework 14
      • Preservation and archiving, e.g. DSpace 15 , Fedora Commons 16 , Greenstone 17 , arXiv 18 , OAI-PMH 19 , Cheshire III 20 , e-Prints.org 21 and OCLC’s ArchiveGrid 22
      • Distributed content management systems, e.g. LC’s MIC 23 , Drupal Core and Site Vocabulary 24 , YouTube 25 , Flickr 26 , Vimeo 27 , and Blinkx 28
      • Info dissemination, e.g. email alerts from LinkedIn 29 & visualization
      • Intelligent info analysis and decision support
  • Semantic Web Development in the U.S. Libraries
    • Most Semantic Web projects in U.S. libraries:
      • By national libraries or big organizations such as LC, NLM, NAL, OCLC, and DCMI
      • In the process of creating Semantic Web tools and infrastructures, e.g. exposing collections
      • In the area of converting MARC records and controlled vocabularies/thesauri into URIs and RDF/XML
      • Semantic Web technologies are slowly and steadily incorporated into digital library management systems
  • Library of Congress Semantic Web Projects
    • Participated in the creation of the W3C standard on SKOS and SKOS Primer
    • LCSH/SKOS project (several small projects)
      • LCCN Permalink project (persistent URLs from LC bibs completed, and authorities in planning)
      • LCSH in RDF/XML using SKOS (entire subject authority records downloadable at http://id.loc.gov/authorities)
      • Supporting SKOS in RDA, MARC, PREMIS, and METS (experimental stage)
      • Maintaining registries for SKOS, and related standards & data elements (on-going)
      • Correlating DDC, LCC/LCSH in Classification Web (on-going)
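To show what a subject heading expressed in RDF/XML with SKOS looks like in practice, the sketch below parses a small hand-written record with the Python standard library. The record, its `http://example.org/subjects/...` URIs, and the function `parse_concepts` are assumptions for illustration and are not taken from the LC data set. The parser handles only this one RDF/XML shape; real RDF/XML allows many equivalent forms, so a proper RDF parser is the safer choice for production use.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"

# Hand-written sample record (hypothetical URIs), not actual LC data.
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/subjects/c1">
    <skos:prefLabel xml:lang="en">Semantic Web</skos:prefLabel>
    <skos:altLabel xml:lang="en">Web of data</skos:altLabel>
    <skos:broader rdf:resource="http://example.org/subjects/c0"/>
  </skos:Concept>
</rdf:RDF>
"""


def parse_concepts(rdf_xml: str) -> Dict[str, Dict[str, List[str]]]:
    """Collect labels and broader links for each skos:Concept element."""
    root = ET.fromstring(rdf_xml.strip())
    concepts: Dict[str, Dict[str, List[str]]] = {}
    for node in root.findall(f"{{{SKOS_NS}}}Concept"):
        uri = node.get(f"{{{RDF_NS}}}about")
        concepts[uri] = {
            "prefLabel": [e.text for e in node.findall(f"{{{SKOS_NS}}}prefLabel")],
            "altLabel": [e.text for e in node.findall(f"{{{SKOS_NS}}}altLabel")],
            "broader": [e.get(f"{{{RDF_NS}}}resource")
                        for e in node.findall(f"{{{SKOS_NS}}}broader")],
        }
    return concepts


concepts = parse_concepts(SAMPLE)
for uri, data in concepts.items():
    print(uri, data["prefLabel"], "broader:", data["broader"])
```

The point of the structure is that the heading, its variant forms, and its place in the hierarchy are all addressed by URI, so other data sets can link to the concept rather than copy the string.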
  • Library of Congress Semantic Web Projects (Continued)
    • Other initiatives influenced by Semantic Web
      • Developing content and format guidelines for submission of ONIX data to CIP program, and retooling ECIP program (completed)
      • Enabling OAI-PMH Retrieval from VIAF (Virtual International Authority File) (completed)
      • Making all access points in LC ILS under authority control (on-going)
      • Enabling RDA in MARCXML, MODS, and MADS (testing stage)
      • Linking TOCs, publisher descriptions, contributor-supplied biographies, sample pages, reading guides, reviews and user-added data to Bibs (on-going)
    • Joining JSC, DCMI, CNL, BNL, NLA and others on making RDA records readily adaptable to Semantic Web (planned)
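OAI-PMH retrieval, mentioned above, works through plain HTTP requests whose query string carries one of six protocol verbs plus arguments such as `metadataPrefix`. The sketch below only builds such a request URL and does not contact any server. The base URL `http://example.org/oai` and the function `oai_request_url` are hypothetical; the verb names and the `oai_dc` metadata prefix come from the OAI-PMH 2.0 protocol.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

# The six verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def oai_request_url(base_url: str, verb: str,
                    arguments: Optional[Dict[str, str]] = None) -> str:
    """Build an OAI-PMH GET request URL; raises ValueError for unknown verbs."""
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    params = {"verb": verb}
    params.update(arguments or {})
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository address, used only to show the URL shape.
url = oai_request_url("http://example.org/oai", "ListRecords",
                      {"metadataPrefix": "oai_dc"})
print(url)
```

Arguments are passed as a dictionary rather than keyword arguments because the protocol's `from` argument is a reserved word in Python.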
  • Library’s Role in Semantic Web (Continued)
    • Phase 3 – Future Semantic Web
      • Web of Linked Data 1 & 2
        • DBpedia 3
        • GeoNames 4
        • Other public data in Cloud Computing, e.g. Amazon Web Services 5
        • LibraryThing 6
      • Trust layer
        • Semantifying Web services for reviews and rating of library related application services, e.g. journal ranking and acceptance rates 7 , CiteULike 8 , ISI Web of Science
  • OCLC Semantic Web Projects
    • FRBR-izing projects
      • Developed FRBR work set algorithms 1 and xISBN Web Services 2
      • FictionFinder (2.8 million records in WorldCat) 3 & 4
      • WorldCat Identities (20 million identities) 5
    • PREMIS Data Dictionary 6 & 7
      • Sponsored PREMIS working group
      • Developed common data model for metadata preservation, and implementation strategies (Release 2.0)
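The idea behind FRBR work sets is to cluster bibliographic records that describe the same work. The sketch below illustrates that idea with a deliberately simplified author/title key. It is not OCLC's published algorithm, which this transcript does not describe; the sample records and the functions `normalize`, `work_key`, and `group_into_work_sets` are assumptions made for the example.

```python
import re
import unicodedata
from collections import defaultdict
from typing import Dict, List


def normalize(text: str) -> str:
    """Lowercase, strip diacritics and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]", " ", no_marks.lower())
    return " ".join(cleaned.split())


def work_key(record: Dict[str, str]) -> str:
    """Simplified key: normalized author plus normalized title."""
    return f"{normalize(record['author'])}/{normalize(record['title'])}"


def group_into_work_sets(records: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Map each work key to the ids of the records that share it."""
    work_sets: Dict[str, List[str]] = defaultdict(list)
    for record in records:
        work_sets[work_key(record)].append(record["id"])
    return dict(work_sets)


# Hypothetical records: r1 and r2 differ only in punctuation.
RECORDS = [
    {"id": "r1", "author": "Melville, Herman", "title": "Moby Dick"},
    {"id": "r2", "author": "Melville, Herman.", "title": "Moby-Dick"},
    {"id": "r3", "author": "Austen, Jane", "title": "Emma"},
]

work_sets = group_into_work_sets(RECORDS)
for key, ids in work_sets.items():
    print(key, ids)
```

A key this simple would miss translations, variant titles, and name changes, which is why authority data matters for real work-set clustering.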
  • OCLC Semantic Web Projects (Continued)
    • ECHODEP at UIUC, partnered with OCLC and others, and funded under LC’s NDIIPP 1
        • “Phase I (2004-2007) - Web archiving tool development, repository evaluation, interoperability tool development for METS, and long term semantic preservation research”
        • “Phase II (2008-2009) – expansion of repository architecture; semantic archiving for preservation of meaning and structure; auto metadata extraction and creation, and user evaluation; data format risk assessment based on INFORM methodology”
  • Other U.S. Semantic Web Projects
    • Semantic Web Features in DSpace
      • MIT SIMILE 1 Add-ons to enhance inter-operability among digital assets, schemata/vocabularies/ontologies, metadata and services with RDF-based tools for Open Source
        • Longwell (faceted browser for visualizing RDF data set)
        • Piggy Bank (Firefox extension for website scrapers & mash-ups)
        • RDFizer (Converting structured data into RDF, e.g. JPEG, MARC, MODS, OAI-PMH, OCW, BibTeX)
        • Welkin (Graph-based RDF visualizer) & Gadget (Data graph viewer for XML)
        • Timeline (Temporal data visualizer)
      • Semantic Search of DSpace 2 contents by HPCLab, Univ. of Patras using OWL 2.0 API to support DL Reasoner & DL-query Tab of Protégé 4.0
      • SKOS controlled vocabulary list in DSpace 3 , e.g. WindMusic.org 4
  • Other U.S. Semantic Web Projects (Continued)
    • Semantic Web Features in Fedora
      • Joint Project of Fedora Commons, Cornell Univ., and Univ. of Virginia
      • Digital Asset Management (DAM) architecture
      • Mulgara 1 – RDF database for the Web front end of the Fedora repository, with SPARQL and OTM (Object to Triple Mapper) endpoints and SWRL support by Revelytix
      • NSDL’s NCore Repository Architecture – implements a Fedora-based digital repository, the NCore model, an API, and a set of middleware Web applications for a collaborative, fully functional semantic digital library 2
    • Semantic Web Features in NSDL 3
      • Providing metadata registry for controlled vocabularies deployed in SKOS, and for ARC SPARQL + Endpoint
    • DuraSpace 4
      • Open source initiatives supported by DSpace and Fedora to develop synergistic technologies, services and programs to increase the interoperability of the two platforms, including Mulgara implementation
      • DuraCloud project, funded by LC’s NDIIPP with participation from NYPL and the Biodiversity Heritage Library, aiming to help organizations take advantage of cloud technologies in providing access to digital materials 5
  • A Comparison
    • Semantic Projects by European Libraries
      • More aggressive and bold
      • More enthusiasm
      • Aims at delivering digital and semantic library applications
      • More visible to the public
      • Frequent conferences on digital libraries and semantic web
    • Semantic Web Projects by U.S. Libraries
      • Creating semantic tools and infrastructure
      • Converting controlled vocabularies and laying ground work
      • Less visible to the public
      • Domain specific applications
      • Cautious and slow
  • Conclusion
    • Semantic-aware tools have been increasingly embedded in current Web applications with different levels of semantic support from browsers to desktops, from clients to servers, and servers to servers
    • IT vendors started to view the capability of combining the Web of Data as an opportunity to move from current HTML-based Web to the Web of Linked Data, especially big vendors such as Google, Yahoo, Oracle and others
    • The vast amount of library data, e.g. bibliographic data, authority data, controlled vocabularies, non-MARC oriented meta-data and classification schemes, has been standardized for representation, controlled for quality and entity resolution, and cross-mapped for interoperability and reuse
    • The library-related Web of Data is ready to be exposed to Semantic Web application development.
  • References
    • Allemang, D. & Hendler, J. (2008). Semantic Web for the Working Ontologist : Effective Modeling in RDFS and OWL. Boston, MA: Morgan Kaufmann Publisher/Elsevier
    • Berners-Lee, T. (2000). Semantic Web – XML2000. In W3C Website. Retrieved Aug. 24, 2009, from http://www.w3.org/2000/Talks/1206-xml2k-tbl/slide10-0.html
    • Berners-Lee, T. (2007). Linked Data. In W3C Website. Retrieved Aug. 24, 2009, from http://www.w3.org/DesignIssues/LinkedData.html
    • Dupriez, C. (2009). Integrating SKOS Thesauri and Authority Lists in Dspace. In wiki.dspace.org. Retrieved Aug. 24, 2009, from http://wiki.dspace.org/index.php/User:Christophe.Dupriez
    • FedoraCommons Dashboard. Retrieved Aug. 24, 2009, from http://www.fedora-commons.org/confluence/dashboard.action
    • Harper, C.A., & Tillett, B.B. (2007). Library of Congress Controlled Vocabularies and Their Application to the Semantic Web. Cataloging & Classification Quarterly, 43(3/4), 47-68
    • Koutsomitropoulos, D. (2009). Semantic Search Facility for Dspace: Overview. In wiki.dspace.org. Retrieved Aug. 24, 2009, from http://wiki.dspace.org/index.php/User:Kotsomit
    • Malmsten, M. (2008). Making a Library Catalogue Part of the Semantic Web. In Berlin Proc. Int’l Conf. on Dublin Core and Metadata Applications. Retrieved Aug. 24, 2009, from http://dcpapers.dublincore.org/ojs/pubs/article/view/927/923
    • Marcum, D.B. (2008). Response to On the Record: Report of the Library of Congress Working Group on the Future of Bibliographic Control. Retrieved Aug. 24, 2009, from http://www.loc.gov/bibliographic-future/news/LCWGResponse-Marcum-Final-061008.pdf
  • References (Continued)
    • Morris, C.M. (2009). LC and DuraCloud Launch Pilot Program … In DuraSpace Blog. Retrieved Aug. 24, 2009, from http://expertvoices.nsdl.org/duraspace/2009/07/15/library-of-congress-and-duracloud-launch-pilot-program-using-cloud-technologies-to-test-perpetual-access-to-digital-content-service-is-part-of-national-digital-information-infrastructure-and-preserva/
    • SIMILE Overview. Retrieved Aug. 24, 2009, from http://web.mit.edu/dspace-dev/www/simile/resources/overview.html
    • SIMILE Projects. Retrieved Aug. 24, 2009, from http://simile.mit.edu/wiki/Category:Project
    • Summers, E., Isaac, A., Redding, C., & Krech, D. (2008). LCSH, SKOS, and Linked Data. Proceedings of the International Conference on Dublin Core and Meta-data Applications. Retrieved Aug. 24, 2009, from http://dcpapers.dublincore.org/ojs/pubs/article/view/916/912
    • Yang, S., Lee, Y.Y., & Xu, A. (2009). Semantic Web and Libraries in the United States: Experimentation and Achievements. Proceedings of Emerging Trends in Technology: Libraries between Web 2.0, Semantic Web and Search Technology: IFLA 2009 Milan, Italy, Satellite Meetings in Florence. [CD-ROM]. [Firenze, Italia]: Ente Cassa di Risparmio di Firenze
  • Questions & Answers