Xml Overview

XML

Published in: Technology, News & Politics

  1. 1. XML for Catalogers in 2009: Emerging Technologies, Tools, and Trends Kevin Reiss [email_address] Systems Librarian Office of Library Services City University of New York AJL-NYMA's 2009 Cataloging Workshop 4/22/2009
  2. 2. Outline <ul><li>XML Basics
  3. 3. XML and MARC
  4. 4. XML Formats
  5. 5. Usage Scenarios
  6. 6. XML Tools
  7. 7. Experimentation & Questions </li></ul>
  8. 8. Purpose <ul><li>I'm not here to teach you how to catalog in XML
  9. 9. Give a basic understanding of XML syntax
  10. 10. Put XML in the context of library work, specifically cataloging
  11. 11. Highlight usage scenarios for XML
  12. 12. Discuss tools for editing XML </li></ul>
  13. 13. XML Basics <ul><li>Extensible Markup Language
  14. 14. World Wide Web Consortium (W3C) Standard </li><ul><li>Officially a Recommendation </li></ul><li>First published as a Recommendation in 1998
  15. 15. SGML for the Web </li><ul><li>Standard Generalized Markup Language </li></ul><li>Came out of the text-encoding community </li><ul><li>Software Documentation ( DocBook )
  16. 16. Literary Texts ( TEI ) </li></ul></ul>
  17. 17. XML is: <ul>So useful it has outlived its own hype. It is ubiquitous within most modern applications and on the web. It isn't even cool any longer. </ul>
  18. 18. Future Proof Your Data “Data Outlasts Code” Ian Davis – Code4lib 2009 <ul>How many of you have lived through an ILS migration? </ul>
  19. 19. XML is: The best data format we currently have for dealing with this issue, since MARC is, in some respects, becoming a liability where modern software is concerned.
  20. 20. XML is also: <ul><li>Machine-readable
  21. 21. Human-readable
  22. 22. Platform Independent
  23. 23. Verbose
  24. 24. Unicode-compliant
  25. 25. Used in data-centric applications
  26. 26. Used in document-centric applications
  27. 27. Editable by any editor that can handle plain-text files </li></ul>
  28. 28. XML is a meta-language <ul><li>“Self describing Data”
  29. 29. Machine-readable semantic data
  30. 30. You define your application vocabulary </li><ul><li>XML applications are defined with a schema
  31. 31. Example (X)HTML is an XML application </li></ul><li>Adhere to a few simple rules </li><ul><li>Hierarchy
  32. 32. Nested Tags
  33. 33. Quoted attributes </li></ul></ul>
  34. 34. Two Approaches to Markup <ul><li>Descriptive </li><ul><li><h1>Page Title</h1> <p>Paragraph one.</p> <p>Paragraph two.</p> </li></ul><li>Procedural </li><ul><li><font size="12">Page Title</font><br/><br/> <font size="6">Paragraph one.</font><br/><br/> <font size="6">Paragraph two.</font> </li></ul></ul>
  35. 35. Similar Display/Different Approaches
  36. 36. Descriptive Markup <ul><li>Seeks to separate content from presentation
  37. 37. Which of the previous code snippets succeeds?
  38. 38. Descriptive markup makes data </li><ul><li>More portable
  39. 39. Easier to repurpose and share </li></ul><li>In many ways MARC is a partially descriptive, partially procedural markup language </li><ul><li>Field/subfield definitions and validation rules
  40. 40. ISBD Punctuation </li></ul></ul>
  41. 41. 090 |a ML410 .S18 |b J3 2007 24500 |a J. B. Sancho : |b compositor pioner de Califòrnia = compositor pionero de California : pioneer composer of California / |c William J. Summers ... [et. al.] ; ed. Antoni Pizà. 250 |a 1a ed. 260 |a Palma : |b Universitat de les Illes Balears, |c c2007. 300 |a 366 p. : |b ill., music ; |c 30 cm. + |e 1 CD-ROM. 500 |a Parallel text in Catalan, Spanish, and English. 504 |a Includes bibliographical references and thematic catalogue of the works of J. B. Sancho. 500 |a CD-ROM contains Artaserse facsimiles; transcriptions of Misa de los ángeles, Gloria, and Misa del sol; and audio recordings of Misa de los ángeles and Gloria de la Misa en sol. 590 |a At GC, CD-ROMs shelved at Circulation Desk under call no.: CD-ROM 54 50500 |t Sancho : l'eminent músic de l'Alta Califòrnia / |r William J. Summers -- |t Juan Bautista Sancho : a la recerca dels orígens del primer compositor de Califòrnia i de 'estil musical primitiu de les missions / |r Craig H. Russell -- |t Els Sanzo d'Artà / |r Antoni Gili -- |t Catàleg temàtic / |r William J. Summers. 650 0 |a Composers |z California |x Biography. 60010 |a Sancho, Juan Bautista, |d 1772-1830. 60010 |a Sancho, Juan Bautista, |d 1772-1830 |v Thematic catalogs. 7001 |a Pizà, Antoni. 7001 |a Summers, William John. 7001 |a Russell, Craig H. 7001 |a Gili Ferrer, Antonio. Procedural or Descriptive?
  42. 42. Basic XML Syntax <ul><li>Files end in .xml
  43. 43. Individual XML documents are “instances”
  44. 44. Documents must adhere to a nested hierarchy
  45. 45. Start with an optional XML declaration
  46. 46. <?xml version="1.0" encoding="UTF-8"?> </li><ul><li>Declares XML version used
  47. 47. Declares the character encoding </li></ul></ul>
  48. 48. The Root Element <ul><li>Every document instance has only one
  49. 49. All other elements nest within this one
  50. 50. For example every XHTML Document has only one “<html>” Tag
  51. 51. Start <tag>
  52. 52. End </tag> </li></ul>
  53. 53. Web Page Source
  54. 54. Elements <ul><li>Sometimes called “tags”
  55. 55. Can contain other elements and text
  56. 56. Must have a start tag (<tag>) and a matching end tag (</tag>)
  57. 57. Sometimes elements are “empty”
  58. 58. These must also be “closed” </li><ul><li><empty attribute="stuff"/>
  59. 59. The image element in XHTML, <img src="mypicture.jpeg"/>, is a good example </li></ul></ul>
  60. 60. Elements in MODS <subject xmlns:xlink="http://www.w3.org/1999/xlink" authority="lcsh"> <topic>City and town life</topic> <topic>Fiction</topic> </subject>
  61. 61. Attributes <ul><li>Attached to a specific element
  62. 62. Must be quoted, e.g., myattribute="my attribute content"
  63. 63. Order is not important when attached to a given element
  64. 64. HTML Example </li><ul><li><a href="http://www.google.com" title="Go to Google">Visit Google</a> </li></ul><li>MARCXML Example </li></ul><datafield tag="245" ind1="1" ind2="0"> <subfield code="a">Ulysses</subfield> <subfield code="c">[by] James Joyce.</subfield> </datafield>
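To see how software reads attributes, here is a minimal Python sketch using the standard library's ElementTree parser on the MARCXML field shown above. The MARCXML namespace is left out for brevity, which is a simplification for illustration.

```python
import xml.etree.ElementTree as ET

# The MARCXML datafield from the slide (namespace omitted for brevity).
MARCXML_FIELD = """
<datafield tag="245" ind1="1" ind2="0">
  <subfield code="a">Ulysses</subfield>
  <subfield code="c">[by] James Joyce.</subfield>
</datafield>
"""

field = ET.fromstring(MARCXML_FIELD)

# Attributes are looked up by name, so the order they were written in
# does not matter.
tag = field.get("tag")
indicators = (field.get("ind1"), field.get("ind2"))

# Map each subfield code (an attribute) to the element's text content.
subfields = {sf.get("code"): sf.text for sf in field.findall("subfield")}

print(tag, indicators, subfields)
```

Each attribute is fetched by name from the element it is attached to, which is why attribute order carries no meaning.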
  65. 65. Entities <ul><li>Five reserved special characters – XML general entities </li><ul><li>& - &amp;
  66. 66. > - &gt;
  67. 67. < - &lt;
  68. 68. ' - &apos;
  69. 69. " - &quot; </li></ul><li>Example <equation>2 &lt; 5</equation>
  70. 70. Authoring software should escape these for you </li></ul>
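As a rough illustration of what authoring software does behind the scenes, this Python sketch escapes the reserved characters with the standard library's `xml.sax.saxutils` helpers. The sample strings and the `note` element are made up for the example.

```python
from xml.sax.saxutils import escape, quoteattr, unescape

raw_text = "2 < 5 & 5 > 2"

# escape() replaces &, < and > with their entity references.
escaped = escape(raw_text)
print(f"<equation>{escaped}</equation>")

# quoteattr() adds the surrounding quotes for an attribute value and
# escapes whatever would clash with them.
attr = quoteattr('He said "hi"')
print(f"<note text={attr}/>")

# unescape() reverses the process when the data is read back.
round_trip = unescape(escaped)
print(round_trip)
```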
  71. 71. Parsing XML <ul><li>Nearly every programming language and operating system supports parsing XML
  72. 72. Most web browsers include an XML parser
  73. 73. Two levels of XML checking </li><ul><li>Well-formedness - “weak checking”
  74. 74. Validation - “strict checking”
  75. 75. An instance is valid when it adheres to the rules defined in a specific schema </li></ul></ul>
  76. 76. Well-Formedness <ul><li>“Weak” check
  77. 77. Checks for adherence to basic XML syntax </li><ul><li>Root element
  78. 78. Nested Data
  79. 79. Attribute syntax
  80. 80. Entity escaping </li></ul><li>Ensures a piece of software can parse the data
  81. 81. Test this with your web browser </li></ul>
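Besides a web browser, any XML parser can run the well-formedness check. The sketch below uses Python's built-in ElementTree parser; the function name `is_well_formed` and the sample documents are invented for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


good = "<record><title>Ulysses</title></record>"
bad_nesting = "<record><title>Ulysses</record></title>"  # tags overlap
bad_attribute = "<record id=42/>"                        # unquoted attribute
bad_entity = "<equation>2 < 5</equation>"                # unescaped <
two_roots = "<a/><b/>"                                   # more than one root

for name, text in [("good", good), ("bad_nesting", bad_nesting),
                   ("bad_attribute", bad_attribute),
                   ("bad_entity", bad_entity), ("two_roots", two_roots)]:
    print(name, is_well_formed(text))
```

Each failing sample breaks one of the four rules listed above: root element, nesting, attribute syntax, and entity escaping. This is only the "weak" check; it says nothing about validity against a schema.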
  82. 82. Well-formed XML Document
  83. 84. What is a Valid XML Document? <ul><li>Validity makes XML more than just a structured data format
  84. 85. Validity is enforced by a “schema” that defines a particular XML application
  85. 86. Schemas contain: </li><ul><li>Element/Attribute definitions
  86. 87. Content model definitions </li><ul><li>i.e. element order and number </li></ul><li>Data validation rules </li><ul><li>Enumerated values
  87. 88. Patterns, i.e. dates, MARCXML leader field </li></ul></ul></ul>
  88. 89. XML Schemas <ul><li>Schemas define the semantics/structure of your application
  89. 90. Could be called “strict checking”
  90. 91. Most major XML applications have some sort of schema
  91. 92. Data or document modeling work is done here
  92. 93. A schema supports “guided” editing
  93. 94. In practice schemas are most useful during </li><ul><li>Authoring phase
  94. 95. Data migration </li></ul></ul>
  95. 96. Types of Schemas <ul><li>DTDs – Document Type Definitions </li><ul><li>Older form, derived from SGML
  96. 97. Non XML syntax </li></ul><li>XML Schemas </li><ul><li>W3C Standard
  97. 98. Expressed in XML
  98. 99. Most database-like </li></ul><li>RELAX NG Schemas </li><ul><li>Most flexible
  99. 100. Expressed in XML or non-XML </li></ul></ul>
  100. 101. Guided Editing using Oxygen
  101. 102. Data Validation Example <ul><li>Consider the MARCXML Schema
  102. 103. MARC Leader Field validation rule </li></ul>[\d ]{5}[\dA-Za-z ]{1}[\dA-Za-z]{1}[\dA-Za-z ]{3}(2| )(2| )[\d ]{5}[\dA-Za-z ]{3}(4500|    )
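The leader rule can be tried out directly. The Python sketch below assumes the pattern reads `\d` wherever the slide text shows a bare `d`, and that the final alternative is four spaces, so that every match is exactly 24 characters long. The sample leader is the one from the Ulysses record later in the deck, with its internal spacing restored by the same assumption.

```python
import re

# Leader pattern as reconstructed from the slide (see assumptions above).
LEADER_PATTERN = re.compile(
    r"[\d ]{5}[\dA-Za-z ]{1}[\dA-Za-z]{1}[\dA-Za-z ]{3}"
    r"(2| )(2| )[\d ]{5}[\dA-Za-z ]{3}(4500|    )"
)


def is_valid_leader(leader: str) -> bool:
    # Schema patterns must match the whole value, hence fullmatch().
    return LEADER_PATTERN.fullmatch(leader) is not None


sample = "02805cam  2200589   4500"      # 24 characters
collapsed = "02805cam 2200589 4500"      # spacing lost in transit

print(len(sample), is_valid_leader(sample))
print(len(collapsed), is_valid_leader(collapsed))
```

A leader whose blanks have been collapsed fails the check, which is the kind of data error a schema catches before the record reaches other software.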
  103. 104. Creating XML <ul><li>Any text editor
  104. 105. Special purpose tools </li><ul><li>General purpose XML editors
  105. 106. Tools for a specific XML application </li></ul><li>As an export format – check your ILS
  106. 107. Can I catalog in XML? </li><ul><li>Yes
  107. 108. Many of you already do, see OCLC Connexion </li></ul></ul>
  108. 109. Why is XML useful to Software? <ul><li>Well-formed or valid documents make the content predictable and accessible
  109. 110. Parser and schema carry out data-checking for you
  110. 111. Very easy to manipulate for programmers
  111. 112. Multi-language support via Unicode
  112. 113. Parses to “tree” data structure </li><ul><li>Think of an XML document instance as an organization chart
  113. 114. Consists of nodes </li></ul></ul>
  114. 115. Sample Instance
  115. 116. Document Parsed as a Tree
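The organization-chart idea can be sketched in a few lines of Python. The sample instance below is a made-up, MODS-like fragment (not the one pictured on the slide), and `outline` is a hypothetical helper that prints one indented line per node.

```python
import xml.etree.ElementTree as ET

# A small, invented instance loosely modeled on MODS.
SAMPLE = """
<mods>
  <titleInfo><title>Ulysses</title></titleInfo>
  <name><namePart>Joyce, James</namePart></name>
  <genre>Psychological fiction</genre>
</mods>
"""


def outline(element, depth=0):
    """Return one indented line per element, like an organization chart."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(SAMPLE))
print("\n".join(tree_lines))
```

The root element sits at the top and every other element hangs beneath exactly one parent, which is what makes the structure predictable for software.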
  116. 117. Manipulated via DOM <ul><li>DOM - Document Object Model </li><ul><li>Common XML Processing
  117. 118. supported by most programming languages </li></ul><li>Typical DOM pseudo code:
  118. 119. list = xml->GetAllElements(“genre”)
  119. 120. foreach genre in list: </li><ul><li>if (isTextNode(genre.firstChildNode)) </li><ul><li>print “Genre is ” + genre.firstChildNode </li></ul></ul></ul>
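For comparison, here is roughly the same loop as runnable Python, using the standard library's `xml.dom.minidom`. The sample document is invented, and the empty `genre` element is included only to show why the text-node check matters.

```python
from xml.dom import Node, minidom

SAMPLE = (
    "<mods>"
    "<genre>Psychological fiction</genre>"
    "<genre>Domestic fiction</genre>"
    "<genre/>"
    "</mods>"
)

doc = minidom.parseString(SAMPLE)

genres = []
for genre in doc.getElementsByTagName("genre"):
    first = genre.firstChild
    # An empty element has no child nodes, so check before reading text.
    if first is not None and first.nodeType == Node.TEXT_NODE:
        genres.append(first.data)
        print("Genre is " + first.data)
```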
  120. 121. Auxiliary XML Standards <ul><li>Many other W3C Standards </li><ul><li>XML Namespaces
  121. 122. XML Transformations (XSLT)
  122. 123. XLink (linking within and between XML documents)
  123. 124. XQuery, XPath (query syntax for XML documents) </li></ul><li>Transformations are the most useful and relevant for catalogers
  124. 125. XML Namespaces a close second </li><ul><li>Allow you to mix different XML applications within the same document instance </li></ul></ul>
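A small Python sketch of mixing two XML applications in one instance follows. The Dublin Core and MODS namespace URIs are the published ones; the `record` wrapper element is hypothetical and exists only to hold both vocabularies.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
MODS = "http://www.loc.gov/mods/v3"

# One instance, two vocabularies, kept apart by their namespace prefixes.
MIXED = f"""
<record xmlns:dc="{DC}" xmlns:mods="{MODS}">
  <dc:title>Ulysses</dc:title>
  <mods:genre>Psychological fiction</mods:genre>
</record>
"""

root = ET.fromstring(MIXED)
ns = {"dc": DC, "mods": MODS}

dc_title = root.find("dc:title", ns).text
mods_genre = root.find("mods:genre", ns).text

# Internally the parser stores each name as {namespace-uri}local-name.
expanded = [child.tag for child in root]
print(dc_title, mods_genre, expanded)
```

The prefix is only shorthand; software identifies an element by the namespace URI plus the local name.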
  125. 126. XSLT <ul><li>Extensible Stylesheet Language Transformations
  126. 127. W3C Standard
  127. 128. Written in XML
  128. 129. Convert an XML instance into: </li><ul><li>Another XML instance
  129. 130. Another text format (.csv, .txt)
  130. 131. Most commonly takes XML to XHTML </li></ul></ul>
  131. 132. XSLT is a series of Templates <ul><li>Convert Dublin Core fields to their MARCXML Equivalents
  132. 133. Language => 546
  133. 134. Publisher => 260 </li></ul>
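The template idea can be illustrated without XSLT. The sketch below is a plain-Python stand-in, not a stylesheet: each entry in `TEMPLATES` plays the role of one template. The MARC tags come from the slide; the subfield codes, the `metadata` wrapper, and the sample values are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# One "template" per Dublin Core element: (MARC tag, subfield code).
# Tags are from the slide; the subfield codes are an assumption.
TEMPLATES = {
    "language": ("546", "a"),
    "publisher": ("260", "b"),
}


def dc_to_marcxml(dc_root):
    record = ET.Element("record")
    for child in dc_root:
        local_name = child.tag.split("}")[-1]  # drop the {namespace} part
        if local_name not in TEMPLATES:
            continue  # no template matches, so the element is skipped
        tag, code = TEMPLATES[local_name]
        field = ET.SubElement(
            record, "datafield", {"tag": tag, "ind1": " ", "ind2": " "}
        )
        ET.SubElement(field, "subfield", {"code": code}).text = child.text
    return record


DC_RECORD = f"""
<metadata xmlns:dc="{DC_NS}">
  <dc:title>Ulysses</dc:title>
  <dc:publisher>Random House</dc:publisher>
  <dc:language>English</dc:language>
</metadata>
"""

marc = dc_to_marcxml(ET.fromstring(DC_RECORD))
print(ET.tostring(marc, encoding="unicode"))
```

A real XSLT stylesheet expresses the same mapping declaratively, with one template rule per source element.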
  134. 135. Crosswalks <ul><li>Most common XSLT application in library world
  135. 136. The Library of Congress publishes a series of stylesheets for common cataloging XML formats </li><ul><li>MODS -> MARCXML
  136. 137. MARCXML -> MODS
  137. 138. MARCXML -> Dublin Core
  138. 139. MARCXML -> HTML
  139. 140. Character set conversions (MARC8 -> UTF8) </li></ul><li>Programming languages support </li><ul><li>RAW MARC => MARCXML </li></ul></ul>
  140. 141. Can MARC and XML Co-Exist? I find it telling that the first step to designing any system around data currently in MARC, is that I have to take the data out of MARC, correct it for inconsistencies, massage it to make it more straightforward — just so that the information is useful within non-library systems. Terry Reese - 2007 Creator of MarcEdit
  141. 142. MARC and the future <ul><li>The Current MARC format is: </li><ul><li>The primary store for library metadata
  142. 143. A very large part of the collective intellectual effort of our profession
  143. 144. Built on design paradigms far different from those of most modern software
  144. 145. Not very interoperable outside of OCLC/ILS land
  145. 146. A factor in isolating library data
  146. 147. Supported by only a small number of software tools </li></ul></ul>
  147. 148. 02805cam 2200589 450000100070000000500170000700800410002403500210006590600450008601000170013104000420014804300120019005000230020205101140022505101580033905100850049705101080058205101230069005100830081305101060089608200130100210000300101524500310104526000370107630000500111350000250116350000300118850000340121850000600125265000330131265100310134565000290137665000250140565000220143065500330145265500280148565500280151371000610154195200310160295200630163398500360169698500350173299100680176799100510183599100480188699100500193499100510198499100510203599100500208699100500213609900290218627430020010920145828.0891030s1934 nyuag 000 1 eng 9(DLC) 34002348 a7bcbccorignewd1eocipf19gy-gencatlg a 34002348 aDLCcDLCdDLCdOCoLCdDLCdOCoLCdDLC ae-ie---00aPR6019.O9bU4 1934 aPR6019.O9bU4 1934bcAnother impression. &quot;Second printing, January 1934&quot;--P. [v]. Copyright deposit (#70193). aMicrofilmb75965 PRcMicrofilm of preceding impression. Washington, D.C. : Library of Congress, Photoduplication Service, 1979. 1 microfilm reel ; 35 mm. aPR6019.O9bU4 1934ccAnother impression. &quot;Fourth printing, January 1934&quot;--P. [v] aPR6019.O9bU4 1934c Copy 2cCopy 2 of the preceding impression. Gift of Willard L. Hart, Mar. 17, 1952. aPR6019.O9bU4 1934dcAnother impression. &quot;Fifth printing, February 1934--P. [v] Purchase, Mar. 20, 1934 (DLC #452310). aPR6019.O9bU4 1934ecAnother impression. &quot;Seventh printing, Nov. 1934&quot;--P. [v] aPR6019.ObU4 1934e Copy 2cCopy 2 of the preceding impression. Purchase, Feb. 25, 1936 (DLC #494329).00a823/.9121 aJoyce, James,d1882-1941.10aUlyssesc[by] James Joyce. a[New York,bRandom house,c1934] axvii, 767, [1] p.billus. (music)c21 1/2 cm. aTitle on two leaves. a&quot;First American edition.&quot; aLC copy has dust jacket.5DLC aSource: Gift of Herman Finkelstein, Dec. 30, 1980.5DLC 0aCity and town lifevFiction. 0aDublin (Ireland)vFiction. 0aMarried peoplevFiction. 0aJewish menvFiction. 0aArtistsvFiction. 
7aPsychological fiction.2lcsh 7aDomestic fiction.2lcsh 7aEpic literature.2gsafd2 aHerman Finkelstein Collection (Library of Congress)5DLC aForm AACR 2: vj05 04-26-89 aCopy 1 of 4th printing missing in inventory: vj05 04-26-89 ararebk/finkerbcfqr05 02-19-93 ararebk/rbcerbcfqr05 02-19-93 bc-RareBookhPR6019.O9iU4 1934tCopy 1mFinkelstein CollwBOOKS bc-RareBookhPR6019.O9iU4 1934btCopy 1wBOOKS bc-MicRRhMicrofilmi75965 PRtCopy 1wBOOKS bc-GenCollhPR6019.O9iU4 1934ctCopy 1wBOOKS bc-RareBookhPR6019.O9iU4 1934ctCopy 2wBOOKS bc-RareBookhPR6019.O9iU4 1934dtCopy 1wBOOKS bc-GenCollhPR6019.O9iU4 1934etCopy 1wBOOKS bc-RareBookhPR6019.OiU4 1934etCopy 2wBOOKS ajoyce-ulysses-1072275128 From Another Computing Era
  148. 149. MARC | MARCXML
  149. 150. Meaningful Output in Browser
  150. 151. MARC => MARCXML <ul><li>This step requires programming
  151. 152. Utilize Perl Programming to parse MARC to MARCXML
  152. 153. PHP also has a MARC library
  153. 154. These have internal crosswalks that produce a MARCXML representation </li></ul>
  154. 155. MARC => MARCXML <datafield tag="245" ind1="1" ind2="0"> <subfield code="a">Ulysses</subfield> <subfield code="c">[by] James Joyce.</subfield> </datafield> <datafield tag="260" ind1=" " ind2=" "> <subfield code="a">[New York,</subfield> <subfield code="b">Random house,</subfield> <subfield code="c">1934]</subfield> </datafield>
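The wrapping step itself is small. This Python sketch covers only that step: it assumes a MARC library has already split the record into tags, indicators, and subfields (the `fields` list is hand-written here), and it writes them out under the MARCXML namespace. The `marc` prefix and the function name `to_marcxml` are choices made for the example.

```python
import xml.etree.ElementTree as ET

MARCXML_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARCXML_NS)

# Fields as a MARC library might hand them over: tag, ind1, ind2, subfields.
fields = [
    ("245", "1", "0", [("a", "Ulysses"), ("c", "[by] James Joyce.")]),
    ("260", " ", " ", [("a", "[New York,"), ("b", "Random house,"),
                       ("c", "1934]")]),
]


def to_marcxml(parsed_fields):
    record = ET.Element(f"{{{MARCXML_NS}}}record")
    for tag, ind1, ind2, subfields in parsed_fields:
        datafield = ET.SubElement(
            record,
            f"{{{MARCXML_NS}}}datafield",
            {"tag": tag, "ind1": ind1, "ind2": ind2},
        )
        for code, value in subfields:
            sub = ET.SubElement(
                datafield, f"{{{MARCXML_NS}}}subfield", {"code": code}
            )
            sub.text = value
    return record


record = to_marcxml(fields)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Nothing about the data changes in this step: tags, indicators, subfield codes, and ISBD punctuation are carried over as they are.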
  155. 156. Tough Example <ul><li>24500 |a J. B. Sancho : |b compositor pioner de Califòrnia = compositor pionero de California : pioneer composer of California / |c William J. Summers ... [et. al.] ; ed. Antoni Pizà.
  156. 157. Converting this to MARCXML will not necessarily make it easier for a piece of software to digest
  157. 158. MARCXML essentially keeps MARC as it is and puts it into a parsable XML wrapper </li></ul>
  158. 159. Other XML Formats <ul><li>MARC-Derivatives </li><ul><li>MODS (The Semantic or Readable MARC)
  159. 160. MARCXML </li></ul><li>Dublin Core </li><ul><li>MARCXML's little brother </li></ul><li>EAD
  160. 161. TEI
  161. 162. XHTML
  162. 163. RSS/ Atom
  163. 164. RDF </li></ul>
  164. 165. Data v. Document Centric <ul><li>Data Centric </li><ul><li>Database export formats
  165. 166. Spreadsheet export formats
  166. 167. Metadata
  167. 168. Most cataloging formats fall into this category </li></ul><li>Document Centric </li><ul><li>Encoding full-text resources
  168. 169. Mixed content </li></ul></ul>
  169. 170. MODS <ul><li>Metadata Object Description Schema </li><ul><li>http://www.loc.gov/standards/mods/ </li></ul><li>The “semantic” or “descriptive” XML MARC Surrogate
  170. 171. Inconsistent support </li><ul><li>ILS Systems
  171. 172. Institutional Repositories </li></ul></ul>
  172. 173. MADS <ul><li>Metadata Authority Description Schema </li><ul><li>http://www.loc.gov/standards/mads/ </li></ul></ul><mads .....><authority> <topic authority="lcsh">Computer programming</topic> </authority> <related type="broader"> <topic>Computers</topic> </related> <related type="narrower"> <topic>Programming languages</topic> </related> <related type="other"> <topic>Systems Analysis</topic> </related> </mads>
  173. 174. Dublin Core <ul><li>Popular simple metadata format
  174. 175. 15 basic elements
  175. 176. key=>value pairs </li><ul><li>Title =
  176. 177. Publisher =
  177. 178. DC Element Name = </li></ul><li>Qualified vocabulary available
  178. 179. Default format for OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) </li></ul>
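The key=>value nature of Dublin Core shows up clearly in code. In this Python sketch a list of pairs is written out as `dc:` elements and read back into a dictionary. The `metadata` wrapper element and the helper names are hypothetical; the sample values are adapted from the Ulysses record.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

pairs = [
    ("title", "Ulysses"),
    ("creator", "Joyce, James, 1882-1941"),
    ("publisher", "Random House"),
    ("date", "1934"),
    ("subject", "City and town life"),
    ("subject", "Dublin (Ireland)"),
]


def build_dc(element_pairs):
    """Write each (name, value) pair as one Dublin Core element."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in element_pairs:
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return root


def read_dc(root):
    """Collect values by element name; elements may repeat."""
    result = {}
    for child in root:
        name = child.tag.split("}")[-1]
        result.setdefault(name, []).append(child.text)
    return result


xml_text = ET.tostring(build_dc(pairs), encoding="unicode")
parsed = read_dc(ET.fromstring(xml_text))
print(parsed)
```

Because any element may repeat, values are kept in lists rather than as single strings.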
  179. 180. EAD <ul><li>Encoded Archival Description
  180. 181. Archival Finding Aids
  181. 182. One of the oldest XML formats
  182. 183. Straddles the data and document-centric worlds
  183. 184. Crosswalks available in MarcEdit and other places </li></ul>
  184. 185. TEI <ul><li>Text Encoding Initiative
  185. 186. Designed to encode any kind of text
  186. 187. Humanities Computing Initiative
  187. 188. Support in the special collections community
  188. 189. Intellectually rich XML application
  189. 190. Many dialects ranging from: </li><ul><li>Basic descriptive encoding of a text's structure
  190. 191. Detailed linguistic analysis </li></ul></ul>
  191. 192. XHTML <ul><li>Extensible HTML
  192. 193. HTML that conforms to XML rules
  193. 194. Has become ubiquitous on the web
  194. 195. Used in conjunction with Cascading Style Sheets </li><ul><li>XHTML provides the content
  195. 196. CSS controls how it displays </li></ul><li>If your Content Management System (CMS) doesn't use XHTML you are in trouble </li></ul>
  196. 197. RSS Syndication <ul><li>Really Simple Syndication
  197. 198. An instance of RSS is known as a feed
  198. 199. Users can subscribe to a particular RSS feed
  199. 200. New additions to the feed are pushed out
  200. 201. RSS feeds are easily incorporated into web pages
  201. 202. Most web portals (e.g., your Yahoo or Google account) are built around RSS feeds
  202. 203. In a catalog </li></ul>
  203. 204. RSS within a Catalog
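Reading a feed takes very little code, which is part of why RSS spread so widely. The Python sketch below parses an invented RSS 2.0 feed of new catalog titles; the feed content and the `catalog.example.org` URLs are placeholders, not a real catalog.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed such as a catalog might publish for new titles.
FEED = """<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>New titles</title>
    <link>http://catalog.example.org/new</link>
    <description>Recently added items</description>
    <item>
      <title>Ulysses</title>
      <link>http://catalog.example.org/record/1</link>
    </item>
    <item>
      <title>J. B. Sancho</title>
      <link>http://catalog.example.org/record/2</link>
    </item>
  </channel>
</rss>
"""

root = ET.fromstring(FEED.encode("utf-8"))

channel_title = root.findtext("channel/title")
items = [
    (item.findtext("title"), item.findtext("link"))
    for item in root.iter("item")
]

print(channel_title)
for title, link in items:
    print(f"{title}: {link}")
```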
  204. 205. RSS and Repositories <ul><li>Emerging area of functionality for RSS
  205. 206. RSS can be used as an export protocol to a repository, i.e., to turn a tool into something like Connexion for institutional repositories
  206. 207. Any content creation tool could send items to a repository
  207. 208. SWORD (Simple Web-service Offering Repository Deposit)
  208. 209. Uses Atom, a syndication format closely related to RSS, to accomplish this
  209. 210. http://www.swordapp.org/ </li></ul>
  210. 211. RDF <ul><li>Resource Description Framework
  211. 212. Semantic Web Technology
  212. 213. Linked Data using URI(L)s
  213. 214. Machine Readable semantics a level above what XML provides
  214. 215. RDF fragment of Project Gutenberg data </li></ul>
  215. 216. Sample RDF Assertion describing a Person taken from RDF Primer
  216. 217. RDA and XML <ul><li>Some crosswalks in the works
  217. 218. XML versions of RDA will likely be produced in RDF
  218. 219. Early Example - Using Library of Congress MARC data </li><ul><li>http://code.google.com/p/code4rda/wiki/MilestoneOne </li></ul></ul>
  219. 220. RDA in RDF/XML
  220. 221. XML Usage Scenarios <ul><li>Web Interfaces (AJAX)
  221. 222. Data processing (ILS go-between)
  222. 223. Crosswalks (MARCXML=>All of the Above)
  223. 224. Metadata Harvesting (OAI-PMH)
  224. 225. Full-text Indexing </li></ul>
  225. 226. AJAX – XML Behind the Scenes
  226. 227. ILS Go-between Format <ul><li>OCLC Connexion </li><ul><li>Connexion records are actually created in MARCXML
  227. 228. Get converted to MARC for export </li></ul><li>ILS Example - Aleph </li><ul><li>Notices
  228. 229. Reports
  229. 230. Customizable XSL stylesheets to format the XML produced by these transactions </li></ul></ul>
  230. 231. Crosswalks <ul><li>Library of Congress </li><ul><li>Various MARCXML crosswalks </li></ul><li>Other formats </li><ul><li>EAD => MARCXML
  231. 232. Anything to Dublin Core </li></ul></ul>
  232. 233. OAI - PMH <ul><li>Open Archives Initiative – Protocol for Metadata Harvesting
  233. 234. Dublin Core is the default format here
  234. 235. Expose information about digital collections/repository content to the wider world
  235. 236. Participants in METRO grants have data available via OAI in XML </li><ul><li>Collection List </li></ul></ul>
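An OAI-PMH harvest starts with an ordinary web request. This Python sketch only builds the request URL, using the protocol's `ListRecords` verb and the `oai_dc` metadata prefix; the base URL is a hypothetical repository, and no network call is made.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "http://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional: limit the harvest to one set (collection).
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


all_records = list_records_url(BASE_URL)
one_collection = list_records_url(BASE_URL, set_spec="maps")

print(all_records)
print(one_collection)
```

The repository answers with an XML document containing one Dublin Core record per item, which can then be parsed like any other XML instance.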
  236. 237. OAI Metadata Example with Dublin Core
  237. 238. Indexing XML <ul><li>There are numerous full-text indexing tools for XML, some utilized by ILS systems
  238. 239. Parse XML into their own indexing format </li><ul><li>Solr (actually uses its own XML format)
  239. 240. Lucene </li></ul><li>Native XML Indexers </li><ul><li>eXist
  240. 241. Ex Libris' Primo </li><ul><li>Catalog Records are converted to OAI-PMH Dublin Core and then indexed </li></ul></ul></ul>
  241. 242. MarcEdit <ul><li>Simplest tool to integrate into existing library workflows; freely downloadable
  242. 243. Direct MARC Support
  243. 244. Global Editing of MARC Data
  244. 245. Crosswalk utilities
  245. 246. Most useful for: </li><ul><li>Special Collections Work
  246. 247. Electronic MARC Record Processing </li></ul></ul>
  247. 248. MarcEdit Crosswalk Options
  248. 249. Harvest OAI Data
  249. 250. End of OAI Harvest in MarcEdit
  250. 251. Specialty Editors <ul><li>Archivists' Toolkit </li><ul><li>Useful for EAD
  251. 252. Also has MARC support </li></ul><li>Oxygen </li><ul><li>Most useful low-cost option for: </li><ul><li>Special Collections work
  252. 253. Document-centric work
  253. 254. General authoring XML </li></ul></ul></ul>
  254. 255. Oxygen <ul><li>Low-cost
  255. 256. Complete XML Management Solution
  256. 257. Supports all types of XML Schema
  257. 258. XSLT Support w/debugger
  258. 259. Many academic users </li></ul>
  259. 260. XML Aware Editing in Oxygen
  260. 262. XML and Programming Languages <ul><li>Strong native XML support in nearly all major programming languages
  261. 263. Familiar data structure to programmers </li><ul><li>Remember the tree structure? </li></ul><li>Internationalization support via Unicode
  262. 264. Library data has a better chance of strong support in XML than not in XML </li></ul>
  263. 265. MARC and Programming Languages <ul><li>Full Support by a small number of software vendors
  264. 266. Perl, PHP, Python, and Ruby all have MARC libraries, with varying levels of support
  265. 267. Marc tools in these languages are typically: </li><ul><li>Specialty modules
  266. 268. maintained by a small, but dedicated group of programmers
  267. 269. Not part of most languages' “standard” distribution </li></ul></ul>
  268. 270. For Future Reference <ul><li>A classic introduction to basic XML concepts from the TEI: “A Gentle Introduction to XML”
  269. 271. Terry Reese's Weblog
  270. 272. Watch for how RDA interacts with XML
  271. 273. Eric Lease Morgan's workshop for those with a more technical bent: “XML in Libraries” </li></ul>
  272. 274. Conclusion <ul><li>XML is just a tool
  273. 275. It is a useful one
  274. 276. The intellectual work of cataloging will still be the same
  275. 277. Relying on the MARC format as our primary data store is becoming problematic </li></ul>
