Applications of XML in Libraries for Electronic Resources
Karen A. Coombs
University of Houston
librarywebchic@gmail.com

Preconference presentation for ER&L 2009

XML formats you might see or use in libraries

• MARCXML
• MARCXML holdings
• ISO/FDIS 20775 - Holdings schema
• OpenURL XML formats
   • XML Metadata Format for Books (info:ofi/fmt:xml:xsd:book)
   • XML Metadata Format for Journals (info:ofi/fmt:xml:xsd:journal)
• Digital Library standards
   • Dublin Core
   • MODS
   • METS

MARCXML

• XML version of a MARC record
   • Uses fields, subfields and indicators
• Very complex and often difficult to work with
• Typical output of most APIs for library catalogs
• Difficult to interpret if you don’t know MARC
• OCLC Bibliographic Formats and Standards - http://www.oclc.org/bibformats/default.htm

<?xml version="1.0"?>
<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim">
  <totalResults xmlns="http://a9.com/-/spec/opensearch/1.1/">1</totalResults>
  <startIndex xmlns="http://a9.com/-/spec/opensearch/1.1/">1</startIndex>
  <itemsPerPage xmlns="http://a9.com/-/spec/opensearch/1.1/">10</itemsPerPage>
  <record xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xmlns="http://www.loc.gov/MARC21/slim"
          xmlns:marc="http://www.loc.gov/MARC21/slim"
          xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
    <leader>04957cam a22004698a 4500</leader>
    <controlfield tag="001">ocm61260129</controlfield>
    <controlfield tag="003">OCoLC</controlfield>
    <controlfield tag="005">20080604173055.0</controlfield>
    <controlfield tag="008">050712s2005 caua b 000 0 eng</controlfield>
    <datafield tag="020" ind1=" " ind2=" ">
      <subfield code="a">0596007655 (pbk.)</subfield>
    </datafield>
    <datafield tag="035" ind1=" " ind2=" ">
      <subfield code="a">(OCoLC)61260129</subfield>
    </datafield>
    <datafield tag="050" ind1=" " ind2="4">
      <subfield code="a">QA76.9.D26</subfield>
      <subfield code="b">M67 2005</subfield>
    </datafield>
    <datafield tag="100" ind1="1" ind2=" ">
      <subfield code="a">Morville, Peter.</subfield>
    </datafield>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">Ambient findability /</subfield>
      <subfield code="c">Peter Morville.</subfield>
    </datafield>
  </record>
</marc:collection>

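
Because MARCXML addresses everything by tag and subfield code, pulling out a title or author means selecting the right datafield and subfield rather than reading a named element. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module and a trimmed copy of the record above; the helper names (subfields, summarize) are made up for illustration and are not part of any catalog API.

```python
import xml.etree.ElementTree as ET

MARC_URI = "http://www.loc.gov/MARC21/slim"
MARC_NS = {"marc": MARC_URI}

# Trimmed copy of the record shown on the slide.
SAMPLE = """<?xml version="1.0"?>
<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim">
  <record xmlns="http://www.loc.gov/MARC21/slim">
    <controlfield tag="001">ocm61260129</controlfield>
    <datafield tag="100" ind1="1" ind2=" ">
      <subfield code="a">Morville, Peter.</subfield>
    </datafield>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">Ambient findability /</subfield>
      <subfield code="c">Peter Morville.</subfield>
    </datafield>
  </record>
</marc:collection>"""


def subfields(record, tag, code):
    """Return the text of every subfield `code` in datafields with `tag`."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [sf.text for sf in record.findall(path, MARC_NS)]


def summarize(xml_text):
    """Build one small dict per MARC record in the collection."""
    root = ET.fromstring(xml_text)
    summaries = []
    for record in root.iter(f"{{{MARC_URI}}}record"):
        summaries.append({
            "id": record.findtext("marc:controlfield[@tag='001']",
                                  namespaces=MARC_NS),
            "author": subfields(record, "100", "a"),
            "title": subfields(record, "245", "a"),
        })
    return summaries


if __name__ == "__main__":
    for summary in summarize(SAMPLE):
        print(summary)
```

The namespace map matters here: the record element uses a default namespace while the collection uses the marc: prefix, and both resolve to the same URI.
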
MARCXML Holdings

• MARC format for holdings
• Most relevant for serials/journals
• Limited number of important fields
   • 856 - Electronic Location and Access
   • 853 - Captions and Pattern information
   • 863 - Enumeration and Chronology
   • 866 - Textual Statement of Holdings

<?xml version="1.0" encoding="UTF-8" ?>
<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
  <marc:record>
    <marc:leader>00381nam a2200133 45e0</marc:leader>
    <marc:controlfield tag="001">mfhd1</marc:controlfield>
    <marc:controlfield tag="008">8312304g####8###2001aaeng0831017</marc:controlfield>
    <marc:datafield tag="852" ind1=" " ind2=" ">
      <marc:subfield code="a">CSf</marc:subfield>
      <marc:subfield code="b">Sci</marc:subfield>
      <marc:subfield code="t">2</marc:subfield>
    </marc:datafield>
    <marc:datafield tag="853" ind1="1" ind2="0">
      <marc:subfield code="8">1</marc:subfield>
      <marc:subfield code="a">v.</marc:subfield>
      <marc:subfield code="b">no.</marc:subfield>
      <marc:subfield code="u">12</marc:subfield>
      <marc:subfield code="v">r</marc:subfield>
      <marc:subfield code="i">(year)</marc:subfield>
      <marc:subfield code="j">(month)</marc:subfield>
      <marc:subfield code="w">m</marc:subfield>
      <marc:subfield code="x">01</marc:subfield>
    </marc:datafield>
    <marc:datafield tag="863" ind1="4" ind2="0">
      <marc:subfield code="8">1.2</marc:subfield>
      <marc:subfield code="a">22</marc:subfield>
      <marc:subfield code="b">1-6</marc:subfield>
      <marc:subfield code="i">1982</marc:subfield>
      <marc:subfield code="j">01-06</marc:subfield>
    </marc:datafield>
  </marc:record>
</marc:collection>

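
In the holdings record above, the 853 field carries the captions (v., no.) and the 863 field carries the matching values under the same subfield codes. A rough sketch of combining the two into a display string follows. It assumes Python's standard library, reads only the first 853 and 863 in the record (real records link multiple pairs through subfield 8), and drops captions in parentheses on the assumption that they are not meant for display; holdings_statement and subfield_map are illustrative names.

```python
import xml.etree.ElementTree as ET

NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Trimmed copy of the holdings record shown on the slide.
SAMPLE = """<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:record>
    <marc:datafield tag="853" ind1="1" ind2="0">
      <marc:subfield code="8">1</marc:subfield>
      <marc:subfield code="a">v.</marc:subfield>
      <marc:subfield code="b">no.</marc:subfield>
      <marc:subfield code="i">(year)</marc:subfield>
      <marc:subfield code="j">(month)</marc:subfield>
    </marc:datafield>
    <marc:datafield tag="863" ind1="4" ind2="0">
      <marc:subfield code="8">1.2</marc:subfield>
      <marc:subfield code="a">22</marc:subfield>
      <marc:subfield code="b">1-6</marc:subfield>
      <marc:subfield code="i">1982</marc:subfield>
      <marc:subfield code="j">01-06</marc:subfield>
    </marc:datafield>
  </marc:record>
</marc:collection>"""


def subfield_map(root, tag):
    """Map subfield code -> text for the first datafield with `tag`."""
    field = root.find(f".//marc:datafield[@tag='{tag}']", NS)
    if field is None:
        return {}
    return {sf.get("code"): (sf.text or "")
            for sf in field.findall("marc:subfield", NS)}


def holdings_statement(xml_text):
    """Pair 853 captions with 863 values for a few common subfields."""
    root = ET.fromstring(xml_text)
    captions = subfield_map(root, "853")
    values = subfield_map(root, "863")
    parts = []
    for code in ("a", "b", "i", "j"):
        if code not in values:
            continue
        caption = captions.get(code, "")
        if caption.startswith("("):
            # Assumption: parenthesized captions are suppressed in display.
            caption = ""
        parts.append(caption + values[code])
    return " ".join(parts)


if __name__ == "__main__":
    print(holdings_statement(SAMPLE))
```
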
ISO/FDIS 20775

• Standard for transmitting holdings information
• Also contains information about the library with the holdings
• Being used by OCLC in WorldCat API
• Can contain information about complex serial holdings
• Can contain information about availability, availability policy, conditions and charges

<holding>
  <institutionIdentifier>
    <value>CZP</value>
    <typeOrSource>
      <pointer>http://worldcat.org/registry/institutions/</pointer>
    </typeOrSource>
  </institutionIdentifier>
  <physicalLocation>Peninsula Library System</physicalLocation>
  <physicalAddress>
    <text>San Mateo, CA 94403 United States</text>
  </physicalAddress>
  <electronicAddress>
    <text>http://www.worldcat.org/wcpa/oclc/8114241?page=frame&amp;url=http%3A%2F%2Fcatalog.plsinfo.org%2Fsearch%2Fi0380641135&amp;title=Peninsula+Library+System&amp;linktype=opac&amp;detail=CZP%3APeninsula+Library+System%3APublic&amp;qt=affiliate&amp;ai=wcapi</text>
  </electronicAddress>
  <holdingSimple>
    <copiesSummary>
      <copiesCount>1</copiesCount>
    </copiesSummary>
  </holdingSimple>
</holding>

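
Compared with MARCXML, the holdings schema uses descriptive element names, so the same kind of lookup becomes a plain path. This is a minimal sketch, assuming Python's standard library and an un-namespaced fragment like the one on the slide (a live response may declare a namespace, in which case the paths would need a prefix map); parse_holding is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the holding shown on the slide.
SAMPLE = """<holding>
  <institutionIdentifier>
    <value>CZP</value>
  </institutionIdentifier>
  <physicalLocation>Peninsula Library System</physicalLocation>
  <physicalAddress>
    <text>San Mateo, CA 94403 United States</text>
  </physicalAddress>
  <holdingSimple>
    <copiesSummary>
      <copiesCount>1</copiesCount>
    </copiesSummary>
  </holdingSimple>
</holding>"""


def parse_holding(xml_text):
    """Pull the institution, location, and copy count out of one holding."""
    holding = ET.fromstring(xml_text)
    count = holding.findtext("holdingSimple/copiesSummary/copiesCount")
    return {
        "institution": holding.findtext("institutionIdentifier/value"),
        "location": holding.findtext("physicalLocation"),
        "address": holding.findtext("physicalAddress/text"),
        "copies": int(count) if count is not None else None,
    }


if __name__ == "__main__":
    print(parse_holding(SAMPLE))
```
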
OpenURL XML formats

• Normally we think of OpenURL as a set of key/value pairs
  http://www.crossref.org/openurl?url_ver=Z39.88-2004&req_dat=username:password&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Isolation of a common receptor for coxsackie B&rft.jtitle=Science&rft.aulast=Bergelson&rft.auinit=J&rft.date=1997&rft.volume=275&rft.spage=1320&rft.epage=1323
• It doesn’t have to be. Newer versions allow you to send the metadata as XML rather than a set of key/value pairs

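
The key/value form is easy to get wrong by hand, because spaces, colons, and slashes in the values need percent-encoding. A small sketch of building such a link with Python's urllib.parse follows. The function name build_openurl and the citation dictionary are illustrative, the resolver base URL is taken from the slide, and the req_dat credentials parameter is left out on purpose.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_openurl(base, citation):
    """Return a key/value OpenURL for a journal article citation.

    `citation` maps bare field names (atitle, jtitle, ...) to values;
    each one is sent as an rft.* key.
    """
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    }
    params.update({f"rft.{key}": value for key, value in citation.items()})
    # urlencode percent-encodes reserved characters and turns spaces into "+".
    return f"{base}?{urlencode(params)}"


CITATION = {
    "atitle": "Isolation of a common receptor for coxsackie B",
    "jtitle": "Science",
    "aulast": "Bergelson",
    "auinit": "J",
    "date": "1997",
    "volume": "275",
    "spage": "1320",
    "epage": "1323",
}

if __name__ == "__main__":
    url = build_openurl("http://www.crossref.org/openurl", CITATION)
    print(url)
    # Round-trip check: decode the query string back into key/value pairs.
    print(parse_qs(urlsplit(url).query)["rft.atitle"])
```
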
Digital Library Standards for Metadata

• There are lots of different types of metadata for digital objects
   • Descriptive
   • Structural
   • Administrative
   • Technical
• Different types of metadata = different standards
   • Dublin Core, MODS - Descriptive
   • METS - Structural, Administrative
   • PREMIS - Administrative
   • MIX - Technical

Dublin Core

• Two different element sets: Simple and Qualified
  • Simple
     • 15 elements
     • Extremely simplistic
     • dc namespace
  • Qualified
     • Includes all the elements in Simple Dublin Core plus additional elements that act as refinements
        • description -> abstract
     • Still fairly simple but better granularity
     • dcterms namespace

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<records xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <record>
    <dc:creator>Morville, Peter.</dc:creator>
    <dc:date>2005</dc:date>
    <dc:description>Includes bibliographical references and index.</dc:description>
    <dc:description>How do you find your way in an age of information overload? How can you
      filter streams of complex information to pull out only what you want? Why does it matter
      how information is structured when Google seems to magically bring up the right answer to
      your questions? What does it mean to be "findable" in this day and age? This eye-opening
      new book examines the convergence of information and connectivity. Written by Peter
      Morville, author of the groundbreaking Information Architecture for the World Wide Web,
      the book defines our current age as a state of unlimited findability. In other words,
      anyone can find anything at any time.</dc:description>
    <dc:format>xiv, 188 : ill. (some col.) ; 23 cm.</dc:format>
    <dc:identifier>0596007655 (pbk.)</dc:identifier>
    <dc:identifier>9780596007652 (pbk.)</dc:identifier>
    <dc:language xsi:type="http://purl.org/dc/terms/ISO639-2">eng</dc:language>
    <dc:publisher>O'Reilly</dc:publisher>
    <dc:subject xsi:type="http://purl.org/dc/terms/DDC">005.72</dc:subject>
    <dc:subject xsi:type="http://purl.org/dc/terms/LCC">QA76.9.D26 M67 2005</dc:subject>
    <dc:subject xsi:type="http://purl.org/dc/terms/LCSH">Database searching.</dc:subject>
    <dc:subject xsi:type="http://purl.org/dc/terms/NLM">TK 5105.888 M892a 2005</dc:subject>
    <dc:title>Ambient findability </dc:title>
    <dc:type>Text</dc:type>
  </record>
</records>

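
In the record above the element names say what each value is, and the xsi:type attribute says which scheme a subject or language value comes from. A short sketch of reading both, assuming Python's standard library and a trimmed sample that declares the xsi namespace (a namespace-aware parser rejects the xsi:type attributes if that prefix is undeclared); dc_fields is an illustrative name.

```python
import xml.etree.ElementTree as ET

DC_URI = "http://purl.org/dc/elements/1.1/"
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Trimmed copy of the record shown on the slide.
SAMPLE = """<records xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <record>
    <dc:creator>Morville, Peter.</dc:creator>
    <dc:subject xsi:type="http://purl.org/dc/terms/DDC">005.72</dc:subject>
    <dc:subject xsi:type="http://purl.org/dc/terms/LCSH">Database searching.</dc:subject>
    <dc:title>Ambient findability </dc:title>
  </record>
</records>"""


def dc_fields(xml_text):
    """Group the first record's dc:* values by element name.

    Each value is a (scheme, text) pair; scheme is the last path segment
    of the xsi:type URI, or None when the element has no xsi:type.
    """
    record = ET.fromstring(xml_text).find("record")
    fields = {}
    for element in record:
        if not element.tag.startswith("{" + DC_URI + "}"):
            continue
        name = element.tag.split("}", 1)[1]
        scheme_uri = element.get(XSI_TYPE)
        scheme = scheme_uri.rsplit("/", 1)[-1] if scheme_uri else None
        fields.setdefault(name, []).append((scheme, (element.text or "").strip()))
    return fields


if __name__ == "__main__":
    for name, values in dc_fields(SAMPLE).items():
        print(name, values)
```
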
METS

• Metadata Encoding and Transmission Standard
• Used for digital objects to “wrap-up” all metadata elements
   • Can include other metadata schemes
• Provides structural metadata
   • what files are part of the objects
   • what is their purpose

MODS

• Metadata Object Description Schema
• Advantages
  • Richer description than Dublin Core
  • Element names more user-friendly than MARCXML
  • Better separation of data and presentation than MARC and actual datatyping of elements
• Typically used for describing digital library content but MARCXML can be converted to MODS

XML from the Internet also useful to Libraries

• Feeds
  • Standard formats for syndicating content
  • RSS
     • title, description, link, author, pubDate
  • Atom
     • title, summary, link, modified, dc:date

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Library Hi Tech </title>
    <link>http://www.emeraldinsight.com/0737-8831.htm</link>
    <description> Table of Contents from the most recently published issues of Library Hi Tech</description>
    <language>en-us</language>
    <copyright>2009 Emerald Group Publishing Ltd.</copyright>
    <image>
      <title>Library Hi Tech </title>
      <url>http://www.emeraldinsight.com/info/pics/journals/lht-cover-xix.gif</url>
      <width>120</width>
      <height>157</height>
    </image>
    <item>
      <title>Accessing information in a parliamentary environment: is the OPAC dead? : Table of Contents</title>
      <link/>
      <description> &lt;B&gt;Abstract:&lt;/B&gt;&lt;BR/&gt; &lt;B&gt;Purpose&lt;/B&gt; - Access to
        library collections in an era where users want to "get" rather than "find" offers particular
        challenges. This article explores users' needs for bibliographic records in a primarily full
        text environment.&lt;B&gt;Design/methodology/approach&lt;/B&gt; - The paper describes access
        to parliamentary and library information from the Australian Parliament. It then outlines the
        approach taken to develop and implement a new search system, ParlInfo, which applied a
        repository and search system that provides integrated access to bibliographic and full text
        information. The system was launched in September 2008 and offers facets, alerts, RSS feeds
        and other Web 2.0 functionality to offer both the Australian public and Parliamentary Network
        users to access to library collections and parliamentary collections.
        &lt;B&gt;Findings&lt;/B&gt; -.</description>
      <author>Ms. Roxanne Missingham, Ms. Rina Brettell, Ms. Shirley White, Dr. Sarah Miskin</author>
      <pubDate>Sun Jan 18 14:15:05 GMT 2009</pubDate>
    </item>
  </channel>
</rss>

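
A table-of-contents feed like the one above usually needs a little cleanup before it can go on a web page: the description holds escaped HTML, the link may be empty, and several authors share one element. A minimal sketch of that cleanup, assuming Python's standard library and a trimmed copy of the feed; rss_items is an illustrative name, and splitting authors on commas is an assumption that happens to fit this feed, not a rule of RSS.

```python
import re
import xml.etree.ElementTree as ET

TAG_RE = re.compile(r"<[^>]+>")

# Trimmed copy of the feed shown on the slide.
SAMPLE = """<rss version="2.0">
  <channel>
    <title>Library Hi Tech </title>
    <item>
      <title>Accessing information in a parliamentary environment: is the OPAC dead? : Table of Contents</title>
      <link/>
      <description> &lt;B&gt;Purpose&lt;/B&gt; - Access to library collections in an era where users want to "get" rather than "find" offers particular challenges.</description>
      <author>Ms. Roxanne Missingham, Ms. Rina Brettell, Ms. Shirley White, Dr. Sarah Miskin</author>
      <pubDate>Sun Jan 18 14:15:05 GMT 2009</pubDate>
    </item>
  </channel>
</rss>"""


def rss_items(xml_text):
    """Return one cleaned-up dict per <item> in an RSS 2.0 feed."""
    channel = ET.fromstring(xml_text).find("channel")
    items = []
    for item in channel.findall("item"):
        # The parser has already turned &lt;B&gt; back into <B>; strip the
        # tags and collapse whitespace to get plain text.
        description = item.findtext("description", default="")
        summary = " ".join(TAG_RE.sub(" ", description).split())
        authors = [name.strip()
                   for name in item.findtext("author", default="").split(",")
                   if name.strip()]
        items.append({
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link") or None,  # <link/> becomes None
            "authors": authors,
            "summary": summary,
            "pubDate": item.findtext("pubDate"),
        })
    return items


if __name__ == "__main__":
    for entry in rss_items(SAMPLE):
        print(entry["title"])
        print(entry["authors"])
```

The pubDate is kept as a string here because the value in this feed does not follow the RFC 822 layout that RSS 2.0 specifies, so any date parsing would need to be tolerant.
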
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://purl.org/atom/ns#"
      xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      version="0.3">
  <title>Geological Magazine - Current Issue</title>
  <link rel="alternate" href="http://journals.cambridge.org/action/displayJournal?jid=GEO" />
  <info>Geological Magazine, Volume 146 Issue 01 Geological Magazine , established in 1864, is one
    of the oldest and best-known periodicals in earth sciences. It publishes original scientific
    papers covering the complete spectrum of geological topics, with high quality illustrations.
    Its worldwide circulation and high production values, combined with Rapid Communications and
    Book Review sections keep the journal at the forefront of the field. This journal is included
    in the Cambridge Journals open access initiative, Cambridge Open Option. Offer readers
    unrestricted online access to your work, click here for more details.</info>
  <entry>
    <title>Volume 146 Issue 01</title>
    <link rel="alternate" href="http://journals.cambridge.org/action/displayIssue?jid=GEO&amp;volumeId=146&amp;issueId=01" />
    <modified>2009-01-01T00:00:00Z</modified>
    <summary type="text/plain" mode="xml">Geological Magazine, Volume 146 Issue 01 Geological
      Magazine , established in 1864, is one of the oldest and best-known periodicals in earth
      sciences. It publishes original scientific papers covering the complete spectrum of
      geological topics, with high quality illustrations. Its worldwide circulation and high
      production values, combined with Rapid Communications and Book Review sections keep the
      journal at the forefront of the field. This journal is included in the Cambridge Journals
      open access initiative, Cambridge Open Option. Offer readers unrestricted online access to
      your work, click here for more details.</summary>
    <dc:date>2009-01-01T00:00:00Z</dc:date>
  </entry>
</feed>

Sources for data in XML format

• Syndicated Table of Content feeds
   • From Publisher websites - Emerald
   • From ticTOCs project - http://www.tictocs.ac.uk
• WorldCat API
• Evergreen Catalogs (Georgia Pines, University of Prince Edward Island)
• xISSN services
• Serials Solutions API

WorldCat API

• Service Levels
   • Default - limited set of indexes and limits; limited bibliographic data returned
   • Full - all indexes available in WorldCat; full bibliographic data
• Search formats
   • OpenSearch
   • SRU
• Response formats
   • OpenSearch
      • RSS
      • Atom
   • SRU
      • MARCXML
      • Dublin Core

SRU Query to WorldCat Search API

• Can search by ISSN or other fields, full MARC records can be returned

 http://worldcat.org/webservices/catalog/search/sru?query=srw.in+all+%221041-7915%22&version=1.1&operation=searchRetrieve&wskey=key&recordSchema=info%3Asrw%2Fschema%2F1%2Fmarcxml&maximumRecords=10&startRecord=1&recordPacking=xml&servicelevel=default&sortKeys=relevance&resultSetTTL=300&recordXPath=

• query - srw query
  Use SRU Explain Service (http://worldcat.org/webservices/catalog/) to help construct your query
• wskey - API key

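
The long SRU URL above is mostly percent-encoding around a short query. A sketch of assembling it from readable parts follows, assuming Python's urllib.parse; the endpoint and parameter names are copied from the slide, the function name sru_issn_url is illustrative, "YOUR_KEY" is a placeholder, and no request is actually sent.

```python
from urllib.parse import urlencode

# Endpoint as shown on the slide.
SRU_ENDPOINT = "http://worldcat.org/webservices/catalog/search/sru"


def sru_issn_url(issn, wskey, max_records=10):
    """Build an SRU searchRetrieve URL that looks up an ISSN as MARCXML."""
    params = {
        "query": f'srw.in all "{issn}"',
        "version": "1.1",
        "operation": "searchRetrieve",
        "wskey": wskey,
        "recordSchema": "info:srw/schema/1/marcxml",
        "maximumRecords": max_records,
        "startRecord": 1,
        "recordPacking": "xml",
        "servicelevel": "default",
    }
    # urlencode turns spaces into "+", quotes into %22, ":" and "/" into
    # %3A and %2F, which reproduces the encoded form on the slide.
    return f"{SRU_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(sru_issn_url("1041-7915", "YOUR_KEY"))
```
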
An OpenSearch Query to WorldCat Search API

• Can only search by keywords and the data returned isn’t particularly useful when dealing with serials/journals

 http://worldcat.org/webservices/catalog/search/worldcat/opensearch?q=computers%20in%20libraries&format=atom&wskey=key

• q - your query
  This is very simple; it really can’t be anything but a keyword search
• format - the format you want results returned in, Atom or RSS
• wskey - WorldCat Search API key

xISSN Service

• Several types of Requests
   • getForms - returns a list of ISSNs and their production form information in the same group as the requested ISSN.
      • Form is ONIX production form code
      • JB (Printed serial), JC (Serial distributed electronically by carrier), JD (Electronic serial distributed online), MA (Microform)
   • getEditions - returns a list of ISSNs in the same group as the requested ISSN.
      • form, oclcnum, peerreview, publisher, rawcoverage, title
   • getHistory - returns a list of ISSNs in the same group as the requested ISSN, as well as ISSNs for preceding/succeeding groups
   • getMetadata - returns metadata about the requested ISSN
• xISSN History Visualization Tool - generates a chart showing the history of a journal with a given ISSN

<rsp stat="ok">
   <group rel="this">
     <issn form="JD" oclcnum="57136697 222024701 34298537" rawcoverage="Vol. 1, no. 1 (July 3, 1880)-v. 3, no. 82 (Mar. 4, 1882); [New ser.] Vol. 1, no. 1 (Feb. 9, 1883)-v. 23, no. 581 (Mar. 23, 1894); [2nd ser.] v. 1, no. 1 (Jan. 4, 1895)-" title="Science" publisher="New York, N.Y. : s.n" peerreview="Y">1095-9203</issn>
     <issn form="JB" oclcnum="53849218 237823594 77943117 182894935 1644869 248155486 213776464 225979457 231016675 183350662 70737295 145332150 191712526 222180991 264687537 9292560 5582807 27118932 173731846 241455726 174295239 32917481 181820410 5933538 7648838 19698903" rawcoverage="Vol. 1, no. 1 (July 3, 1880)-v. 3, no. 82 (Mar. 4, 1882); [New ser.] Vol. 1, no. 1 (Feb. 9, 1883)-v. 23, no. 581 (Mar. 23, 1894); [2nd ser.] v. 1, no. 1 (Jan. 4, 1895)-" title="Science" publisher="New York, N.Y. : s.n" peerreview="Y">0036-8075</issn>
   </group>
</rsp>

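
The response above packs most of its information into attributes on each issn element, so finding, say, the online edition of a print journal means reading the form code. A minimal sketch, assuming Python's standard library and an un-namespaced response like the one on the slide (a live response may declare a namespace); the editions function is an illustrative name, and the oclcnum lists in the sample are trimmed.

```python
import xml.etree.ElementTree as ET

# ONIX production form codes as listed on the xISSN slide.
FORMS = {
    "JB": "Printed serial",
    "JC": "Serial distributed electronically by carrier",
    "JD": "Electronic serial distributed online",
    "MA": "Microform",
}

# Trimmed copy of the response shown on the slide.
SAMPLE = """<rsp stat="ok">
  <group rel="this">
    <issn form="JD" oclcnum="57136697 222024701 34298537" title="Science" peerreview="Y">1095-9203</issn>
    <issn form="JB" oclcnum="53849218 237823594 77943117" title="Science" peerreview="Y">0036-8075</issn>
  </group>
</rsp>"""


def editions(xml_text):
    """List every ISSN in the response with a readable form label."""
    root = ET.fromstring(xml_text)
    if root.get("stat") != "ok":
        raise ValueError(f"xISSN request failed: stat={root.get('stat')!r}")
    results = []
    for element in root.iter("issn"):
        code = element.get("form")
        results.append({
            "issn": (element.text or "").strip(),
            "form": FORMS.get(code, code),
            "title": element.get("title"),
            "oclcnums": element.get("oclcnum", "").split(),
            "peer_reviewed": element.get("peerreview") == "Y",
        })
    return results


if __name__ == "__main__":
    for edition in editions(SAMPLE):
        print(edition["issn"], "-", edition["form"])
```
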
Serials Solutions API

• Proprietary APIs
• Available for customers only
• API for 360 Link (OpenURL)
   • Serials Solutions provides other APIs depending on which of their products you subscribe to
• SFX OpenURL resolver also has an API

Query to Serials Solutions 360 Link XML API

  http://<client identifier>.openurl.xml.serialssolutions.com/openurlxml?version=1.0&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rfr_id=info%3Asid%2Fsersol%3ARefinerQuery&url_ver=Z39.88-2004&rft_id=info%3Adoi%2F10.1037%2F0003-066X.59.1.29

• Standard OpenURL elements are passed
• In this case the DOI is providing the majority of the info

Other XML standards of interest

• COUNTER and SUSHI - http://www.niso.org/schemas/sushi/
  Data can be transmitted in XML format
• ONIX
   • For Books - http://www.editeur.org/onix.html
   • For Serials - http://www.editeur.org/onixserials.html
      • Actually a set of formats
      • Much more complex than the books standard

Possible Applications

• Integrate journal table of contents into web pages
• Provide users with latest articles in their field by creating an aggregated feed of important journals in a given field
• Provide better interfaces for resource discovery
• Display print journal holdings in-line with e-journal holdings
• Check for other versions/iterations of a journal during OpenURL resolution (xISSN)
• Show users relationships between journals and title changes over time

Possible Applications

• Provide links to journal table of contents
   • Use WorldCat API to search ISSN and retrieve 856
• Manipulate usage statistics information outside an ERM
• Show most popular journals, databases, ebooks to users
• Provide better interface for ILL staff to see holdings and loan rule information for e-resources
• Better display of cross-references between print and electronic journal holdings for users

Further Resources

• Auto-Populating an ILL form with the Serial Solutions Link Resolver API - http://journal.code4lib.org/articles/108
• Dublin Core - http://dublincore.org/
• ISO/FDIS 20775 - Holdings schema - http://www.loc.gov/standards/iso20775/
• MARC Holdings - http://www.loc.gov/marc/holdings/echdhome.html
• MARCXML - http://www.loc.gov/standards/marcxml/
• MODS - http://www.loc.gov/standards/mods/
• METS - http://www.loc.gov/standards/mets/
• OCLC Developer’s Network - http://worldcat.org/devnet/wiki/Main_Page
• WorldCat Search API URI Evaluator - http://worldcat.org/webservices/catalog/evaluator.html
• xISSN Web Services Documentation - http://xissn.worldcat.org/xissnadmin/doc/api.htm
