Digital Library Applications Of Social Networking Jeju Intl Conference Presentation Transcript
Digital Library Applications of Social Networking Dr. Myungdae Cho Library School SungKyunKwan University firstname.lastname@example.org
1. Social Networking & Information Fluency From PIM (Personal Information Management) level - Is Memex incarnated? - more than hyperlink … To “Sociality” by Links and Tags -> Inter-subjectivity - a thought in one user links to many thoughts in the internet community -> Principle of Emergence
Sociality (Messaging, Blogging, Streaming media)
Machine, Human and Sociality in Information Discovery Forms Inter-subjectivity Ontology (gives Subjective Path) In other words: Top down + Bottom up
Another view of Machine, Human and Sociality in Information Discovery: RDF vocabularies (or Ontology); User-Created Metadata; Linked Data (semantically organized data); Mapped data from existing DB (such as MARC)
2. Social Networking in Libraries Social networking could enable librarians and patrons not only to interact, but to share and change resources dynamically in an electronic medium.
Why do libraries care about social networking sites? The next big thing after Google is social networking. (From “As Facebook takes off, MySpace strikes back”, Kirkpatrick, David. Fortune. Sept. 19, 2007)
2.1 Existing Library Applications of Social Networking: Librarything in libraries; Delicious in libraries; Mashup … could be an application of Linked Data
2.1.1 Librarything in libraries http://www.librarything.com/ “Personalized desire from individual’s needs” “Cataloguing thru Social Networking” LibraryThing is a prominent social cataloging web application for storing and sharing personal library catalogs and book lists.
Librarything in libraries LibraryThing helps you create a library-quality catalog of your books. LibraryThing connects people based on the books they share.
3. How to lift existing metadata to a semantic level: Mapping (MARC21 -> DC, MARC -> FRBR, etc.); Open Sources (Open API); Linked Data
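The MARC21 -> DC mapping mentioned above can be sketched as a simple lookup table. This is only an illustration: the record layout (a dict from MARC tag to a list of values) and the sample record are assumptions made for this sketch, real MARC fields carry indicators and subfields that are ignored here, and the tag-to-element pairs are a small subset in the spirit of the Library of Congress MARC to Dublin Core crosswalk.

```python
# Simplified crosswalk: MARC21 tag -> unqualified Dublin Core element.
# Only a handful of tags are covered; subfields and indicators are ignored.
MARC_TO_DC = {
    "020": "identifier",  # ISBN
    "100": "creator",     # main entry, personal name
    "245": "title",       # title statement
    "260": "publisher",   # publication information
    "650": "subject",     # topical subject heading
}


def marc_to_dc(marc_record):
    """Map a {tag: [values]} record to a {dc_element: [values]} record."""
    dc_record = {}
    for tag, values in marc_record.items():
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue  # tag has no mapping in this sketch, so it is dropped
        dc_record.setdefault(element, []).extend(values)
    return dc_record


# Hypothetical sample record, invented for illustration only.
sample_marc = {
    "100": ["Example, Author"],
    "245": ["An example title"],
    "650": ["Libraries", "Social networks"],
    "999": ["local field with no Dublin Core mapping"],
}

print(marc_to_dc(sample_marc))
```

The unmapped 999 field is silently lost, which shows the usual cost of a crosswalk: anything the target format cannot express is dropped.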
3.1 Open Source Open Source Social Platforms: 10 of the Best 10 open source software platforms http://mashable.com/2007/07/25/open-source-social-platforms/ www.programmableweb.com SungKyunKwan University: Use of Open API http://lib.skku.edu/index.ax
3.2 Linked Data “Oh my goodness, the original web of documents was just the tip of iceberg.” ( Sir Tim Berners Lee, July 2008)
What is it?
Closed containers of data Information systems, such as library catalogs, have been, and still are, for the greatest part closed containers of data, or “silos” without connections between them, inaccessible to Web architecture (no URL, no links), with a few exceptions. (Tim Berners Lee) free from the capsules of the catalog
Linked Data Linked Data is a methodology for providing meanings and relationships between things anywhere on the web, using URIs for identifying, RDF for describing and HTTP for publishing
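As a minimal sketch of "URIs for identifying, RDF for describing", the snippet below models statements as (subject, predicate, object) triples and serializes them in an N-Triples-like form. The example.org URIs, the `Literal` helper and the sample values are assumptions made for illustration; only the Dublin Core Terms namespace is a real vocabulary. A production system would more likely use an RDF library.

```python
DCT = "http://purl.org/dc/terms/"       # Dublin Core Terms namespace
BOOK = "http://example.org/book/1"      # hypothetical URI identifying a book
AUTHOR = "http://example.org/person/1"  # hypothetical URI identifying a person


class Literal(str):
    """Marks a plain text value, as opposed to a URI."""


def term(value):
    """Render one triple component: quoted literal or <URI>."""
    if isinstance(value, Literal):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return f"<{value}>"


def to_ntriples(triples):
    """Serialize (subject, predicate, object) triples, one statement per line."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


triples = [
    (BOOK, DCT + "title", Literal("An example title")),
    (BOOK, DCT + "creator", AUTHOR),  # a link to another thing, not a string
]

print(to_ntriples(triples))
```

The second triple is what makes the data "linked": the creator is another URI that can itself be described and looked up, rather than a name trapped inside one record.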
Two valuable notions from the library community
Collocation (1876, Charles Cutter): resources with the same or related content are grouped together.
Disambiguation: clarification that follows from the removal of ambiguity.
Collocations through Linked Data Wiki: http://www.wikipedia.org/ vs dbpedia : http://dbpedia.org/About WorldCat: http://www.worldcat.org/ vs Fictionfinder (FRBR model): http://fictionfinder.oclc.org
RDF identifiers as disambiguation http://rdf.freebase.com/?freebaseid http://rdf.freebase.com/ns/en.blade_runner
RDF identifiers as disambiguation: annotation, disambiguation process
RDF identifiers as disambiguation
Another disambiguation: dereferenceable URIs
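A dereferenceable URI is one that can be looked up over HTTP to obtain a description of the thing it names. One common way to ask for machine-readable data instead of an HTML page is content negotiation through the `Accept` header. The sketch below only builds such a request and does not send it; the example.org URI is hypothetical, and whether a given server honours the header is an assumption that has to be checked per service.

```python
import urllib.request


def build_rdf_request(uri, media_type="application/rdf+xml"):
    """Build (but do not send) an HTTP GET request asking for an RDF description."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


# Hypothetical identifier, used for illustration only.
request = build_rdf_request("http://example.org/resource/blade_runner")

print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))

# Sending it would be: urllib.request.urlopen(request)
# That step is left out so the sketch runs without network access.
```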
In summary so far: Paradigm shift in the WWW
4. Library’s role in the Semantic Web
Phase 1: Semantifying MARC, thesauri, etc. Translating LC controlled vocabularies and authority control for named entities, and thesauri from domain-specific societies and institutions, into RDF/RDFS, OWL and SKOS, with URIs assigned according to the ‘Linked Data Design Principles’ (TBL, 2007)
Phase 2: Authority data discovery, sharing, and reuse, e.g., LC Authorities & Vocabularies, OCLC’s Faceted Application of Subject Terminology (FAST), etc.
Phase 3: Into the Semantic Web: Web of Linked Data (DBpedia, GeoNames, LibraryThing)
Case: OCLC Semantic Web Projects
Developed FRBR work-set algorithms and xISBN Web Services
WorldCat Identifiers (20 million identifiers)
CV: Why establish controlled vocabularies? Control values that occur in metadata; reduce ambiguity; control synonyms; make documentation available for reuse; validate terms (by subject heading / LCSH); establish formal relationships among values where appropriate. (Controlled vocabularies: ALA program on Linked Data, ALA Annual 2009)
Types of Controlled Vocabularies used in metadata standards: lists of enumerated values; code lists (e.g. language, country); taxonomies; formal thesauri; locally controlled enumerated lists. (Controlled vocabularies: ALA program on Linked Data, ALA Annual 2009)
Thesauri A thesaurus is a controlled vocabulary with multiple types of relationships. Example: Rice; UF Paddy; BT Cereals; BT Plant products; NT Brown rice; RT Rice straw
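The Rice entry above can be held in a small data structure, and the reciprocal entries that a thesaurus conventionally carries (BT mirrors NT, RT is symmetric, UF mirrors USE) can be derived from it. This is a sketch under those conventions; the function name and dict layout are choices made for illustration.

```python
# Reciprocal of each relationship type: broader/narrower mirror each other,
# related is symmetric, and "used for" is answered by a "use" reference.
INVERSE = {"BT": "NT", "NT": "BT", "RT": "RT", "UF": "USE"}

# The Rice entry from the slide.
rice = {
    "UF": ["Paddy"],
    "BT": ["Cereals", "Plant products"],
    "NT": ["Brown rice"],
    "RT": ["Rice straw"],
}


def reciprocal_entries(term, relations):
    """For each target of `term`, build the entry that points back to `term`."""
    entries = {}
    for relation, targets in relations.items():
        for target in targets:
            entries.setdefault(target, {}).setdefault(INVERSE[relation], []).append(term)
    return entries


for target, entry in reciprocal_entries("Rice", rice).items():
    print(target, entry)
```

Deriving the reverse entries mechanically is one way to keep a vocabulary consistent: a term listed as narrower somewhere always shows its broader term in return.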
Standards maintained at LC contain controlled vocabularies: LCSH/NAF; Thesaurus of Graphic Materials; ISO 639-2 (language codes); MARC (including code lists); MODS; METS; PREMIS; MIX (XML schema for NISO Z39.87, technical metadata for digital still images); … and some others
Representing information about controlled vocabulary values: data elements in metadata formats, e.g. MARC Authority format; XML schemas (sometimes as enumeration values); RDF/XML and RDFS (Resource Description Framework); SKOS; MADS (Metadata Authority Description Schema)
Reasons for developing a web service for vocabularies: facilitate the development and maintenance process for vocabularies; make controlled lists “openly” available; provide comprehensive information about controlled values; experiment with semantic web technologies and linked data; expose vocabularies to wider communities
Popular RDF Vocabularies People + Organisations: FOAF, hCard, Relationship, Resume; Places: Geonames, Geo; Events: RDFCalendar; Social Media: SIOC, Review; Topics + Tags: SKOS, MOAT, HolyGoat; eCommerce: GoodRelations, CC Licensing; More...: Scovo, DOAP, Recipes, Measurements, ...
SKOS: “Simple Knowledge Organisation System(s)”. A Semantic Web standard called Simple Knowledge Organization System (SKOS) defines the organization of terms into thesaurus form, with broader and narrower terms and alternate terms, including alternate language entries. Simple, extensible, machine-understandable representation for “concept schemes”: thesauri; classification schemes; taxonomies; subject headings; other types of ‘controlled vocabulary’… Disadvantage: unusual concept schemes don’t fit into SKOS (original structure too complex)
A Method to Convert Thesauri to SKOS
Case 1: Original XML data file: http://www.esd.org.uk/standards/ipsv/ipsv.xml; Original XML Schema file: http://www.esd.org.uk/standards/xmlschemas/taxonomy-v3.0.xsd; Conversion program: convertipsv.pl (contains instructions for usage); Resulting RDF: ipsv/rdf/ipsv.rdf; SKOS Core schema: http://www.w3.org/2004/02/skos/core/history/2005-10-14 (version used for this paper, for latest version see here); Additional IPSV schema: ipsv/ipsv1-eswc06.rdf
Case 2: Partial original data files: gtaa/SampleOfGTAA.zip; Conversion program: gtaa/GTAAtoSKOSinstanceRDFSv6.pl; Resulting RDF: gtaa/GTAAinstancesSKOSv7.rdf; SKOS Core schema: http://www.w3.org/2004/02/skos/core/history/2005-10-14 (version used for this paper, for latest version see here); Additional GTAA schema: gtaa/GTAAskosModelRDFSv4.rdfs
Converting into a SKOS graph: Identify, Describe, Publish
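The three steps (Identify, Describe, Publish) can be sketched on the Rice thesaurus entry shown earlier in the talk. This is not the conversion program referenced in the cases above; it is a minimal illustration that assumes a hypothetical base URI (example.org), English labels, and labels that contain no quote characters. The SKOS property names (`prefLabel`, `altLabel`, `broader`, `narrower`, `related`) come from the SKOS Core vocabulary.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/thesaurus/"  # hypothetical base URI for minted concepts

# Thesaurus relationship code -> SKOS property (UF is handled as altLabel below).
RELATION_TO_SKOS = {"BT": "broader", "NT": "narrower", "RT": "related"}


def identify(label):
    """Step 1 (Identify): mint a URI for a term from its label."""
    return BASE + label.strip().lower().replace(" ", "-")


def describe(label, relations):
    """Step 2 (Describe): express the term as (subject, predicate, object) triples."""
    subject = f"<{identify(label)}>"
    triples = [
        (subject, f"<{RDF_TYPE}>", f"<{SKOS}Concept>"),
        (subject, f"<{SKOS}prefLabel>", f'"{label}"@en'),
    ]
    for alternate in relations.get("UF", []):
        triples.append((subject, f"<{SKOS}altLabel>", f'"{alternate}"@en'))
    for relation, prop in RELATION_TO_SKOS.items():
        for target in relations.get(relation, []):
            triples.append((subject, f"<{SKOS}{prop}>", f"<{identify(target)}>"))
    return triples


def publish(triples):
    """Step 3 (Publish): serialize as N-Triples text, ready to be served over HTTP."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


rice = {
    "UF": ["Paddy"],
    "BT": ["Cereals", "Plant products"],
    "NT": ["Brown rice"],
    "RT": ["Rice straw"],
}

print(publish(describe("Rice", rice)))
```

Minting URIs from labels is the weakest part of this sketch: two terms with the same label would collide, so a real conversion would normally reuse the thesaurus's own term identifiers.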
SKOS use cases 4: Agricultural Information Management Standards (AIMS) http://aims.fao.org/en/search/google/cow?query=cow&cx=011162950886884224513:ennli7xeebg&cof=FORID:11&sitesearch=&hl=en&ie=utf-8&oe=utf-8&lr=lang_en
FRBR conceptual model Coyle (2008) advocates the FRBR conceptual model as part of a semantic model, saying “Since FRBR is about entities and relationships, it seems to be perfectly positioned as the first step in the transformation of library data to the semantic web.”
FRBR Expression of Core FRBR Concepts in RDF http://vocab.org/frbr/core.html This vocabulary is an expression in RDF of the concepts and relations described in the IFLA report on the Functional Requirements for Bibliographic Records (FRBR).
FRBR as an RDF vocabulary FRBR is a complete data model that is a new way of looking at our data, not just taking existing records and identifying work relationships. With FRBR as a type of RDF vocabulary, the entities and relationships in FRBR are identifiable, linkable, usable, and reusable, and everything can be matched up.
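A rough sketch of what "identifiable and linkable" FRBR entities look like as triples follows. It assumes the namespace of the vocab.org FRBR Core vocabulary and three of its properties (`realization` from Work to Expression, `embodiment` from Expression to Manifestation, `exemplar` from Manifestation to Item); these names should be checked against the published vocabulary before use. All example.org URIs are hypothetical.

```python
FRBR = "http://purl.org/vocab/frbr/core#"  # assumed namespace of the FRBR Core vocabulary
EX = "http://example.org/"                 # hypothetical URIs for the entities below

triples = {
    (EX + "work/1", FRBR + "realization", EX + "expression/1"),
    (EX + "expression/1", FRBR + "embodiment", EX + "manifestation/1"),
    (EX + "expression/1", FRBR + "embodiment", EX + "manifestation/2"),
    (EX + "manifestation/1", FRBR + "exemplar", EX + "item/1"),
}


def objects(graph, subject, predicate):
    """All objects of the given subject and predicate, in sorted order."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def manifestations_of_work(graph, work):
    """Follow Work -> Expression -> Manifestation links."""
    found = []
    for expression in objects(graph, work, FRBR + "realization"):
        found.extend(objects(graph, expression, FRBR + "embodiment"))
    return found


print(manifestations_of_work(triples, EX + "work/1"))
```

Because every entity has its own URI, editions and copies can be collocated under one work by following links, without comparing the text of whole records.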
RDA (Resource Description and Access): the new cataloging rules, replacing AACR2. RDA -> RDF: Joint DCMI/RDA task force; seed funding to develop initial prototype RDF vocabularies for bibliographic information; based on FRBR and the data model implicit in RDA; early stage year. http://dublincore.org/dcmirdataskgroup/ Karen Coyle
Library-related Linked Data projects A brief and incomplete list of some library-related Linked Data projects: RDF BookMashup: integration of Web 2.0 data sources like Amazon, Google or Yahoo into the Semantic Web; Library of Congress Authorities: exposing LoC Authorities and Vocabularies to the web using URIs; DBpedia: exposing structured data from Wikipedia to the web; LIBRIS: Linked Data interface to the Swedish LIBRIS Union catalog; Scriblio+Wordpress+Triplify: “A social, semantic OPAC Union Catalogue”
Language of Interoperability Universal identifiers (URIs): like the written word, for “connecting the dots”. Abstract syntax (RDF triples): sentence grammar, the foundation of syntactic interoperability. Vocabularies: words and concepts, the foundation of semantic interoperability. Platform for compatible domain models: Application Profiles. Human-understandable, machine-processable.
5. Proposed Models for Libraries with Linked Data
A publisher provides basic information about a book (e.g., using ONIX)
The National Library adds bibliographic and authority control
A local library adds holding information
Some nice guy out there adds links from, say, Wikipedia
A library’s IT staff creates a Web page where I can find all related information regarding this book: links to related books from the same author or on the same subject, the author’s bio from Wikipedia, comments from other portals.
=> Instead of following links between HTML pages, Linked Data browsers enable users to navigate between different data sources by following RDF links. How about user-created metadata?
Advantages over other methods
No crosswalk/mapping: each party uses its own metadata format, and all triples can be aggregated
No data redundancy: each party creates only the data it needs, and retrieves already existing information
No harvesting: the data is available directly on the Web
No branding issue: the URIs make it possible to trace the original data whatever its origin
No software-specific developments: everything relies on open standards such as RDF and SPARQL … no need to learn a new protocol or query language
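The proposed model, in which each party publishes its own triples about the same book URI and "all triples can be aggregated", can be sketched with plain set union. Everything specific here is an assumption for illustration: the example.org URIs, the `heldBy` predicate and the sample values are invented, while the Dublin Core Terms and `rdfs:seeAlso` URIs are real vocabulary terms.

```python
DCT = "http://purl.org/dc/terms/"
SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"
HELD_BY = "http://example.org/vocab/heldBy"  # hypothetical holdings predicate
BOOK = "http://example.org/book/1"           # hypothetical shared URI for the book

# Each party states only what it knows, all about the same URI.
publisher = {(BOOK, DCT + "title", "An example title")}
national_library = {
    (BOOK, DCT + "title", "An example title"),  # same statement as the publisher's
    (BOOK, DCT + "creator", "http://example.org/person/1"),
    (BOOK, DCT + "subject", "http://example.org/subject/libraries"),
}
local_library = {(BOOK, HELD_BY, "http://example.org/library/main-branch")}
volunteer = {(BOOK, SEE_ALSO, "http://example.org/wiki/An_example_title")}


def aggregate(*sources):
    """Merge triple sets; identical statements collapse into one."""
    merged = set()
    for source in sources:
        merged |= source
    return merged


def describe(graph, subject):
    """Everything known about one URI, as sorted (predicate, object) pairs."""
    return sorted((p, o) for s, p, o in graph if s == subject)


graph = aggregate(publisher, national_library, local_library, volunteer)
for predicate, value in describe(graph, BOOK):
    print(predicate, value)
```

No crosswalk is needed because the merge works on triples rather than on record formats, and the repeated title statement disappears on its own, which is the "no data redundancy" point in miniature.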