Intro to the semantic web (for libraries)

Includes overview of the semantic web, FRBR, RDA, and BIBFRAME through the lens of cataloging by robin fay @georgiawebgurl

  • For each part a URI.
  • URIs are kind of like a hook – they allow us to connect things together.
  • Elaine Svenonius (?) posited that “navigate” (finding works related to a given work by generalization, association, and aggregation…) should be added, but it is not official.
      Find: resources corresponding to the user’s search criteria
      Identify: confirm that the resource described corresponds to that sought, or distinguish between more than one resource with similar characteristics
      Select: a resource appropriate to the user’s needs
      Obtain: acquire or access the resource (RDA chap. 4 (7 pp.) on acquisition and access, includes URL)
  • This is one of RDA’s characteristics that mean it can work better in a linked data environment; many of these vocabularies are registered on the web, in an online registry of both RDA element sets and values (vocabularies), at http://metadataregistry.org/
  • What a FRBRized catalog should give us is better searching tools and the ability to see editions more easily and to see related titles in different media (e.g., it becomes easier to find the work “Dracula” regardless of its physical format, its manifestation). Since FRBR is a semantic-web-friendly data model, it will also enable us to have better, more robust, more semantic-web-like search tools (like our catalogs). FRBR also influenced RDA and FRSAD (Functional Requirements for Subject Authority Data).

    1. Introduction to the Semantic Web. Robin Fay, @georgiawebgurl, 2013
    2. Objectives. By the end of this session, you’ll have an understanding of the basic principles and terminology of:
       – the Semantic Web
       – linked data
       – library data in the semantic web space
         • BIBFRAME
         • RDA
         • FRBR
    3. The web as we know it (and think of it) links together documents (HTML, PDF, dynamic documents created from databases, etc.).
    4. In brief, types of metadata: descriptive, structural, administrative.
       • Many forms of metadata include elements of each of these; however, this depends on the schema.
       • A schema is a set of rules covering the elements and requirements for coding. Examples of common schemas in the library world include Dublin Core, TEI, EAD, and others. Examples of schemas in the semantic web include Dublin Core, FOAF (Friend of a Friend), and many others.
    5. • Much of library metadata is highly structured and created by trained professionals. In the library world, MARC has been a long-term standard. While it can be rigid, its structured nature can make it easier to crosswalk and harvest into other databases.
       • SEO (Search Engine Optimization) is a common term in the web world; SEO experts assign descriptive and administrative (usually copyright) metadata to websites, and their goal is generally higher search rankings. Given that search engine algorithms change regularly, SEO is a highly dynamic field, which can lead to inconsistencies in metadata application, making it harder for databases and search engines to harvest.
       • In a nutshell, most library metadata has rules and standards; metadata in the web world is often (but not always) more flexible. The Semantic Web will need to manage (and make sense of!) all of these types of metadata.
    6. At its core, the semantic web comprises:
       o a set of design principles,
       o collaborative working groups,
       o and a variety of enabling technologies.
       • Some elements of the semantic web are expressed as prospective future possibilities that are yet to be implemented or realized, AND
       • other elements of the semantic web are expressed in formal specifications. (Wikipedia, 2009)
       Robin Fay, robinfay.net 2009/10
    7. • Semantic web and web metadata frequently come from outside of the library community, working in parallel or, sometimes, at odds.
       • Metadata in libraries encompasses a wide variety; one of the most common metadata schemas is MARC.
       • MARC is formatted using ISBD punctuation; the content of what goes into a record is controlled by our cataloging rules (such as RDA). RDA can be applied using different metadata schemas, although for now many libraries are still in a MARC-based world.
    9. • RDF = Resource Description Framework
       • RDFS = Resource Description Framework Schema
       • OWL = Web Ontology Language
       • URI = Uniform Resource Identifier (think unique number, URLs)
       Many terms associated with the Semantic Web are used in or based upon the information architecture, database, information science, and library science fields: controlled vocabularies, structural elements, etc.
    10. RDF = Resource Description Framework
        • is a general-purpose language for representing information on the Web (a metadata data model)
        • is a W3C specification
        • is a conceptual description
        • is based upon making statements about web resources (triples)
        • is commonly serialized as XML (RDF/XML), although RDF itself is a data model rather than an XML format
        • We can express RDA in RDF
        • Think sentence structure: subject, predicate (verb), object
        • My dog eats dogfood.
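The subject, predicate, object structure above can be sketched in a few lines of Python. This is only an illustration: the `http://example.org/` namespace, the `Triple` class, and the `to_ntriples` helper are assumptions made for the example, not part of RDF or of any library. The output follows the general shape of the N-Triples serialization (URIs in angle brackets, plain literals in quotes).

```python
from typing import NamedTuple

# Hypothetical namespace, used only for illustration.
EX = "http://example.org/"


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate (verb), object."""
    subject: str
    predicate: str
    obj: str


# "My dog eats dogfood." expressed as a triple of URIs,
# plus a second statement whose object is a plain text literal.
triples = [
    Triple(EX + "my-dog", EX + "eats", EX + "dogfood"),
    Triple(EX + "my-dog", EX + "name", "Rex"),
]


def to_ntriples(t: Triple) -> str:
    """Render a triple in an N-Triples-like line.

    Simplification: any object starting with "http" is treated as a URI,
    everything else as a plain string literal.
    """
    obj = f"<{t.obj}>" if t.obj.startswith("http") else f'"{t.obj}"'
    return f"<{t.subject}> <{t.predicate}> {obj} ."


for t in triples:
    print(to_ntriples(t))
```

Each printed line is one complete statement, which is what makes triples easy to merge from different sources.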
    11. So, we have the framework, but how do we apply it? RDFS = Resource Description Framework Schema
        o A schema is:
          – an outline: a schematic or preliminary plan
          – a structure described in a formal language supported by the database management system; in a relational database (such as MySQL), the schema defines the tables, the fields in each table, and the relationships between fields and tables
          – a description of the structure and rules a document must satisfy for an XML document type
          http://tinyurl.com/yj442vr (define: schema, Google)
        o Dublin Core is a schema
    12. Very simple Dublin Core in RDF. We can combine schemas.
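As a rough sketch of what very simple Dublin Core in RDF can look like, the snippet below builds a small RDF/XML description with Python's standard library. The resource URI and the title and creator values are made up for the example, and `dublin_core_record` is a hypothetical helper; the two namespace URIs are the published RDF and Dublin Core element namespaces. Using the `rdf:` and `dc:` vocabularies in one record is one small instance of combining schemas.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to use the conventional prefixes when serializing.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def dublin_core_record(about: str, title: str, creator: str) -> str:
    """Return a minimal RDF/XML description using two Dublin Core elements."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about}
    )
    ET.SubElement(desc, f"{{{DC_NS}}}title").text = title
    ET.SubElement(desc, f"{{{DC_NS}}}creator").text = creator
    return ET.tostring(root, encoding="unicode")


# Hypothetical resource URI and values, for illustration only.
record_xml = dublin_core_record(
    "http://example.org/book/dracula", "Dracula", "Stoker, Bram"
)
print(record_xml)
```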
    13. OWL = Web Ontology Language
        • invented to link ontologies, which are classification systems
        • attempts to define objects and their relationships
        • different “flavors”
        • “interpreted as a set of "individuals" and a set of "property assertions" which relate these individuals to each other” (Wikipedia, 2009)
        • not a requirement
        • Sounds familiar to catalogers, right?
    15. Because this is data driven, we can query it using SPARQL, a standard query language. We’ll talk more about SPARQL later…
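To give a feel for what querying triples means, here is a toy pattern matcher in Python. It is not SPARQL and not a SPARQL engine; it only mimics the core idea of a single triple pattern in which terms starting with `?` are variables. The `ex:` identifiers and the `match` function are assumptions made for this sketch.

```python
# Toy data; the "ex:" identifiers are hypothetical.
triples = [
    ("ex:BramStoker", "ex:authorOf", "ex:Dracula"),
    ("ex:BramStoker", "ex:authorOf", "ex:TheJewelOfSevenStars"),
    ("ex:MaryShelley", "ex:authorOf", "ex:Frankenstein"),
]

# Roughly the idea behind this SPARQL query (prefix declarations omitted):
#   SELECT ?work WHERE { ex:BramStoker ex:authorOf ?work }


def match(pattern, data):
    """Yield variable bindings for one triple pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    for triple in data:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value both times.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield bindings


works = [b["?work"] for b in match(("ex:BramStoker", "ex:authorOf", "?work"), triples)]
print(works)
```

A real SPARQL endpoint does far more (joins across several patterns, filters, federated sources), but the basic move of matching a pattern with blanks against stored triples is the same.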
    16. • Linked data is: “about using the Web to connect related data that wasn’t previously linked, or using the Web to lower the barriers to linking data.”
        • Think: related, series records, authority files
        • Libraries already link data.
        • Projects such as the NYT Linked Open Data project and the Virtual International Authority File (VIAF) project are resources of controlled vocabularies.
        • Verified digital identity accounts such as OpenID and claimID help to differentiate names.
    17. What is linked data and open data?
        o Linked data is about reusing data
        o We already do some linked data in our library catalogs and even in our daily lives
        o The link in a bibliographic record (like an authority record link) is linking data behavior
        o A link that we share with our friends on Facebook is linked data (of sorts)
        • Linked data is a link to a record/data/content that can then be utilized in some way
        • Open data is data that is available to be used in some way with no barriers to access (licensing, etc.)
    18. Basic principles of linked data
        It keeps us from having to re-enter or copy information, making our data:
        • reusable
        • easy to correct (correct one record instead of multiples)
        • efficient
        • and potentially useful to others
        It can build relationships in different ways, allowing us to create temporary collections (a user could organize their search results in a way that makes sense to them) or more permanent ones (collocating ALL works by a particular author more easily; pulling together photographs more easily)
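The "correct one record instead of multiples" point can be sketched with plain Python dictionaries. The record structure, the `n1` identifier, and the `display` helper are hypothetical and stand in for an authority file and the bibliographic records that link to it.

```python
# Hypothetical authority file: one record per name, keyed by an identifier.
authorities = {"n1": {"preferred_name": "Stoker, Bram"}}

# Bibliographic records link to the authority record instead of copying the name.
bibs = [
    {"title": "Dracula", "creator_id": "n1"},
    {"title": "The Lair of the White Worm", "creator_id": "n1"},
]


def display(bib: dict) -> str:
    """Build a display line by following the link to the authority record."""
    name = authorities[bib["creator_id"]]["preferred_name"]
    return f'{bib["title"]} / {name}'


# One correction in the authority record...
authorities["n1"]["preferred_name"] = "Stoker, Bram, 1847-1912"

# ...shows up in every linked bibliographic record.
displays = [display(b) for b in bibs]
for line in displays:
    print(line)
```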
    19. • Advantages (reusable data, potential to provide and build relationships, discoverability)
        • How library data fits into linked data
          o FRBR (a bibliographic FRAMEWORK which is more semantic by nature); RDA (metadata rules which are not tied to an encoding format such as MARC but can work with semantic web standards like XML); IRs; and CMSs like Drupal which have semantic web capabilities
        • RDA expressed as RDFa
    20. The RDF triple: conceptual examples. Predicate/verb: “same as”, “author of”.
    21. Tim Berners-Lee’s Four Rules
        1. Use URIs as names for things
        2. Use HTTP URIs so that people can look up those names
        3. When someone looks up a URI, provide useful information, using the standards
        4. Include links to other URIs, so they can discover more things
        URI = Uniform Resource Identifier
    22. What can linked data do for libraries?
        • URIs create methods for classifying that can be used (linked to!) by others
        • The Library of Congress has released LCSH as linked data, and OCLC has a modified version of LCSH called FAST as linked data
        • Linked data is flexible enough to express entity-relationship models such as FRBR/FRAD
        • Different databases (ILS, ERMS, IRs, local databases, etc.) can share data, which is potentially more consistent data, allowing for collocation across resources and allowing users to easily find resources regardless of source
    23. Our data in a semantic viewpoint. SOURCE: Getting triples from records: the role of ISBD, http://www.slideshare.net/scottishlibraries/isbd-record2triples
    24. Our data in a semantic view. “Bib”: record ID as subject; field role and relationship; can map to a record, such as via VIAF. SOURCE: Getting triples from records: the role of ISBD, http://www.slideshare.net/scottishlibraries/isbd-record2triples
    25. How cataloging is changing: a changing library and WEB landscape
        • Automation and new technologies
        • The web has changed
        • Large-scale bibliographic databases
        • Cooperative cataloging
        • Administrative desire to decrease costs
        • Greater variety of media in library collections (electronic!)
        • User expectations and needs
        • FRBR is our data model: semantic web friendly!
    26. • FRBR will give us a way to group things in different ways, building relationships between data by WEMI (Work, Expression, Manifestation, Item)
        • WEMI is a hierarchy from the abstract to the actual thing owned by a library (the, well… item!)
        • Work and Expression can be somewhat conceptual, with lots of discussion going on; however, you can loosely think of Work as a concept or idea which is Expressed (think the act of creation; performance) onto/into a physical format (can be digital), aka a Manifestation, of which the library has a copy (Item).
    27. Entity-Relationship Model (new way of storing & organizing data)
        • Database design model
        • Entity: a thing with an identity
          – Entities have attributes (characteristics)
        • Relationships
          – Between different entities at different levels
        • Provides for organization of records in a database
          – “clustering”
        • Conceptual model of abstract concepts
    28. FRBR and RDA properties element sets
    29. “User Tasks”: How do catalog users
        • Find
        • Identify
        • Select
        • Obtain
        … the resources they want?
    30. Group 1 Entities (WEMI Hierarchy)
        • Work: a distinct intellectual or artistic creation
        • Expression: intellectual or artistic realization of a work
        • Manifestation: physical embodiment of an expression of a work
        • Item: single exemplar of a manifestation
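The WEMI hierarchy can be pictured as nested data structures. The sketch below uses Python dataclasses; the class fields (`language`, `carrier`, `barcode`), the sample values, and the `items_for` helper are assumptions chosen for illustration and are not taken from FRBR or RDA element sets.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """Single exemplar of a manifestation (the copy a library owns)."""
    barcode: str


@dataclass
class Manifestation:
    """Physical (or digital) embodiment of an expression."""
    carrier: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """Intellectual or artistic realization of a work."""
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """A distinct intellectual or artistic creation."""
    title: str
    expressions: List[Expression] = field(default_factory=list)


def items_for(work: Work) -> List[Item]:
    """Collect every item under a work, whatever its expression or format."""
    return [
        item
        for expression in work.expressions
        for manifestation in expression.manifestations
        for item in manifestation.items
    ]


# Hypothetical holdings for one work.
dracula = Work(
    "Dracula",
    [
        Expression("English", [
            Manifestation("print volume", [Item("0001"), Item("0002")]),
            Manifestation("online resource", [Item("e-0001")]),
        ]),
        Expression("French", [Manifestation("print volume", [])]),
    ],
)

print([item.barcode for item in items_for(dracula)])
```

Walking down from the work is what lets a catalog show all copies of "Dracula" together regardless of format.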
    31. The FRBR Entity Relationship Model
    32. RDA Controlled Vocabularies
        Closed: content type, media type, carrier type, mode of issuance ... and more.
        Open: frequency, type of recording, language of expression, form of musical notation, relationship designators (app. I-K) ... and more.
    33. RDA, FRBR, and MARC
        RDA is our set of metadata rules to describe our content.
        FRBR is our semantic-web-friendly data model.
        Currently we use MARC to format our data, but we need something better.
        Linked data can be the mechanism, but what about the actual records?
    34. RDA, FRBR, and MARC
        • Bibliographic records are structured in MARC (an encoding format rather than a programming language). MARC (MAchine-Readable Cataloging) and AACR2 have been working together a long time, which means that compromises and workarounds have sometimes been made. This will be true for RDA, too.
        • MARC is a mixture of controlled access points (series, name authority, and subject headings) and free text (e.g., contents notes). This provides flexibility and structure, but: more free text = less precision in searching = more work for systems to return relevant results
    36. RDA, FRBR, and MARC
        • MARC existed before AACR2. MARC was developed in the 1960s, before most digital technology existed; the web as we know it, ebooks, and Google did not exist.
        • Most current catalog systems use MARC, but there are other metadata schemas and encoding formats.
        • Although many systems have not fully utilized all of the fields and functionalities of MARC, it is reaching the end of its lifespan.
        • The next generation (nexgen) systems cannot develop as only MARC based; we need more.
    37. RDA, FRBR, and MARC
        • Our future systems will probably not use MARC, but some kind of semantic-web-friendly schema.
        • Currently, the Library of Congress has started a project called the Bibliographic Framework Transition Initiative
        • Why?
        • We need something that is more flexible, not flat in file structure, yet works with a semantic framework.
        • We need something that works better with different metadata schemas.
        • This new framework will provide us with enormous functionality in our catalogs and allow us to fully use RDA. It will allow us to move forward into the semantic web world.
    38. Linking data in catalogs
        • We have some relationships within our library catalog via the bibliographic data: bib-holding-item (a way to keep all of the parts of a particular thing together)
        • Bib to authority: series, subject headings (a bib record having linking field(s) to other record(s))
        • Authority records: records not visible to the public, but which provide the linking points to our bib records and guide the user through variations of the name or title, etc.
    39. Resources
        • LODLAM: http://lodlam.net/
        • LODLAM Challenge: http://summit2013.lodlam.net/
        • LODLAM Zotero Group (webliography of good stuff): https://www.zotero.org/groups/lod-lam
        • GLAMLOD: https://groups.google.com/group/glamlod
        • LC Bibliographic Framework Transition Initiative: http://www.loc.gov/marc/transition/
        • LITA library linked data interest group: http://connect.ala.org/node/142470
        • Use Case Tool: http://obd.jisc.ac.uk/navigate
        • Getting triples from records: the role of ISBD: http://www.slideshare.net/scottishlibraries/isbd-record2triples
        • FRBR Display Tool: http://www.loc.gov/marc/marc-functional-analysis/tool.html
        • Understanding FRBR: http://www.loc.gov/cds/downloads/FRBR.PDF
        • More materials at http://www.delicious.com/georgiawebgurl/metadata_presentation_como
        Making the Digital Connection: Linked Data and Libraries
    40. Resources
        • RDA Toolkit (online): http://www.rdatoolkit.org
        • LC PCC PS (free .pdf downloads): http://www.loc.gov/catdir/cpso/RDAtest/rda_LC-PCC PS.html
