Pen to Pixel: Bringing Appropriate Technologies to Digital Manuscript Philology
 


Digital representation of medieval manuscripts and their key elements – ranging from beautiful illuminations to ancient hidden diagrams and texts – poses significant challenges for the application of appropriate technologies that are efficient and useful to scholars. While users and institutions tend to focus on the technologies and their technical capabilities, one of the most significant elements in the development of digital representations of manuscripts is the ability to share and archive digital data for philology, scholarship, and preservation research and analysis. Large datasets need to be created and archived with clear storage and access procedures to ensure data integrity and full knowledge of the digital content. Only with common standards, work processes and access can advanced digitization technologies be used for the study of medieval manuscripts in libraries. These technologies are being used in institutions ranging from the ancient library of St. Catherine’s Monastery in the Sinai to the Library of Congress, the Walters Art Museum and the University of Pennsylvania Library in the United States. Wherever they are located, each is grappling with the challenges of collecting and preserving digital information from medieval manuscripts and codices for future generations.

These libraries use advanced camera systems to capture high-resolution images of manuscripts. Some of these institutions are also conducting spectral imaging studies of manuscripts with advanced collection and digital processing to reveal erased information – such as the earliest copies of Archimedes’ diagrams and treatises – without damaging the upper layer of text and artwork. These technologies yield large collections of quality digital images for access and study, but the data that become the digital counterpart must be effectively stored, managed and preserved to be truly useful for study. Integrating complex sets of digital images and hosting them on the Web for global users poses its own set of challenges.

  • 1. Title Slide: Meeting the Challenge: Digitizing Islamic Manuscripts at the Walters
  • Data: all core data (images, transcriptions, metadata, checksums). Documents: internal and external documentation. ResearchContrib: important data that is not integrated with the core data set (conservation information, special or experimental images). Supplemental: source files for other core data files; folio-by-folio transcriptions are derived from work-length transcriptions (Floating Bodies, Method).

Presentation Transcript

  • Pen to Pixel: Bringing Appropriate Technologies to Digital Manuscript Philology. Michael B. Toth, R. B. Toth Associates, rbtoth.com. http://www.thedigitalwalters.org/ On behalf of the Walters Art Museum Digitization Team, especially: Lynley Herbert, Ariel Tabritha, Diane Bockrath, Kimber Wiegand, Doug Emery. Supported by the US National Endowment for the Humanities.
  • Walters Art Museum W.562, 2b, Koran, 9th century AH / 15th century CE. Walters Art Museum, Baltimore, Maryland: Digital Imaging System
  • St. Catherine’s Monastery, Sinai: Spectral Imaging System
  • US Library of Congress: Spectral Imaging System
  • Advanced Digitization
  • Applied Science & Technology
  • …to Manuscript Studies
  • Manuscript Studies 20th Century and prior
  • Manuscript Studies 21st Century
  • Obscured Information
  • Illuminated Manuscripts
  • Digital Manuscript Challenges: “…an ultimate challenge to creators and users of digital tools wishing to produce useful and reliable digital counterparts to these medieval sources of knowledge and testimonies of intellectual creativity.”
    • Complex, Changing Technical Climate
    • Range of Digital Products & Formats
    • Need for Integrity of Entire Data Set
    • Demand for Continual & Faster Access
    • User Repurposing of Content
    • Restrictions on Access and Use
  • Simplicity of Data: 1. Access to data • By People • By Machines 2. Licensing • Global Storage & Access
  • Walters Online Manuscripts
  • The Digital Walters http://www.thedigitalwalters.org/
  • Islamic Manuscripts of the Walters Art Museum: A Digital Resource (2008 to 2011)
  • Parchment to Pixel: Creating a Digital Resource of Medieval Manuscripts (2010 to 2012)
  • The Digital Walters: Over 10 Terabytes of Data ... and growing!
                                 Islamic    Parchment to Pixel      Total
    No. of Manuscripts               172                   107        279
    No. of TEI Descriptions          170                    37        207
    Distinct Images               46,857                34,084     80,941
    Image Files                  187,266               134,698    321,964
    Data Size                    5.99 TB               4.09 TB   10.08 TB
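The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below only uses the numbers printed on the slide; the variable and function names are illustrative and not part of the project's tooling.

```python
# Figures copied from the slide; column order: Islamic, Parchment to Pixel, Total.
COUNTS = {
    "manuscripts": (172, 107, 279),
    "tei_descriptions": (170, 37, 207),
    "distinct_images": (46_857, 34_084, 80_941),
    "image_files": (187_266, 134_698, 321_964),
}
DATA_SIZE_TB = (5.99, 4.09, 10.08)


def totals_consistent():
    """Check that each Total column equals the sum of the two project columns."""
    counts_ok = all(a + b == total for a, b, total in COUNTS.values())
    # Round to two decimals to sidestep floating-point noise in the TB sum.
    size_ok = round(DATA_SIZE_TB[0] + DATA_SIZE_TB[1], 2) == DATA_SIZE_TB[2]
    return counts_ok and size_ok


def files_per_image():
    """Return image files per distinct image for each column."""
    return tuple(
        files / images
        for files, images in zip(COUNTS["image_files"], COUNTS["distinct_images"])
    )


if __name__ == "__main__":
    print("totals consistent:", totals_consistent())
    print("files per distinct image:", [round(r, 2) for r in files_per_image()])
```

Each ratio comes out close to four files per distinct image. That would be consistent with one archival TIFF plus the three derivatives (300 PPI, SAP, thumbnail) named on the metadata slide, though the slides do not state this explicitly.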
  • Data & Metadata
    • Long-term data set viability beyond the lifetime of current technologies
      – Adherence to existing broadly accepted standards
      – Simple, flat metadata records
    • Integration of metadata with images, supporting data and scholarly products
  • Cataloging & Metadata
    • Metadata Integrated with Digital Object
      – Adherence to broadly accepted standards
      – Simple, flat metadata records
    • Persistent Identifiers
    • Accepted Standards
      – Standardized Vocabularies
      – Metadata Schema
      – XML to support conversion to other formats (e.g. MARC, MODS, EAD)
    • Documentation & Preserve Standards
  • Data Integrity
    • Image
    • XML Metadata
    • TEI Catalog
    • License
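One common way to protect the integrity of the images, XML metadata, TEI catalog and license files is a checksum manifest. The slides mention checksums as part of the core data but do not say which algorithm or manifest format is used, so the following is a minimal sketch under stated assumptions: SHA-256 is assumed, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks to cope with large TIFFs.

    The SHA-256 default is an assumption; the slides do not name an algorithm.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path (relative to root, POSIX style) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_checksum(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Return the sorted relative paths that are missing or have changed."""
    root = Path(root)
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or file_checksum(target) != expected:
            problems.append(rel_path)
    return sorted(problems)
```

A manifest built when the data set is archived can be re-verified after every transfer or media migration; an empty list from `verify_manifest` means every listed file is still present and unchanged.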
  • Standardize
    • Cataloging
    • Metadata
    • File Format
    • Imaging and Color
    • Resolution or Fidelity
    • Vocabulary and Geographic Names
      • Foreign Language and English
    • Intellectual Property
    • Storage
    • Quality and Quality Control
    • Others
  • Preservation & Access. Owner of Archimedes Palimpsest:
    • Preserve data in “flat files”
      – Do not tailor data for Web interfaces
    • Host data on “spinning disks”
      – Did not want digital product to end up on media that could become obsolete, with limited access
    • Make broadly available on Internet
      – Do not place restrictions on use
  • Data Layout Access ReadMe Data WaltersManuscripts Technical ReadMe Supplemental Access Other Books
  • Digital Walters File Structure
  • Cataloging Information
    • Manuscript level: all information that applies to the manuscript as a whole, including an abstract and the physical dimensions and features of the manuscript, such as size, extent, collation, and binding.
    • Manuscript item level: all information that applies to the intellectual divisions of the book, including the titles of works, rubrics, incipits, colophons, and layout information about the written surface.
    • Manuscript piece level: all information for the items imaged (i.e., binding pieces, flyleaves, and folios), including item name, folio number, and, for illuminated pieces, detailed descriptions of the artwork.
  • Dublin Core Metadata Initiative Element Set
  • Manuscript DCMI Elements
    • Identifier: the shelf mark for manuscripts (e.g., W.582), and the image serial number for images (e.g., W582_000001)
    • Creator: always the Walters Art Museum
    • Contributor: one entry for each project participant responsible for the creation of the manuscript’s data set
    • Date: the date of web page or image creation
    • Title: the title of the manuscript (e.g., “Walters Ms. W.579, Prayer”)
    • Description: a description of the manuscript or image
    • Source: source of the object used to create the image or image collection
    • Type: Image for individual images; Collection for all images of a manuscript
    • Format: image/tiff for images, text/html for a manuscript web page
    • Subject: keywords describing the manuscript or imaged folio
    • Rights: license and usage terms
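The element list above maps naturally onto a simple, flat Dublin Core record. The following is a minimal sketch, not the Walters' actual serialization: the `record` wrapper element, the function names, and the serial-number pattern (inferred from the single example W.582 → W582_000001) are assumptions, and the contributor values are placeholders. Only the Dublin Core 1.1 namespace URI is standard.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core 1.1 element namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The eleven elements listed on the slide, in slide order.
DC_ELEMENTS = (
    "identifier", "creator", "contributor", "date", "title", "description",
    "source", "type", "format", "subject", "rights",
)


def image_serial(shelf_mark, number):
    """Build an image serial number such as W582_000001 from W.582 and 1.

    Assumed pattern: drop the dot, then an underscore and a six-digit counter.
    """
    return f"{shelf_mark.replace('.', '')}_{number:06d}"


def dublin_core_record(fields):
    """Return a flat <record> element; list values yield repeated elements."""
    record = ET.Element("record")  # hypothetical wrapper element
    for name in DC_ELEMENTS:
        values = fields.get(name, [])
        if isinstance(values, str):
            values = [values]
        for value in values:
            ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return record


if __name__ == "__main__":
    example = dublin_core_record({
        "identifier": image_serial("W.582", 1),
        "creator": "Walters Art Museum",
        "contributor": ["Project participant A", "Project participant B"],
        "type": "Image",
        "format": "image/tiff",
    })
    print(ET.tostring(example, encoding="unicode"))
```

Keeping the record flat, with repeated elements instead of nested structures, follows the "simple, flat metadata records" principle stated earlier in the deck.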
  • License and use: UPDATED! 6 February 2013. All Walters manuscript images and descriptions provided here are licensed for use under the Creative Commons Attribution-Share Alike 3.0 Unported License and the GNU Free Documentation License. You are free to download and use the images and descriptions on this website under the licenses named above. You do not need to apply to the Walters prior to using the images. We ask only that you cite the source of the images as the Walters Art Museum. Additionally, we request that a copy of any work created using these materials be sent to the Curator of Manuscripts and Rare Books at the Walters Art Museum, 600 N. Charles Street, Baltimore, MD 21201, mss-curator@thewalters.org. Note these terms mark a change from our previous license, which placed a noncommercial restriction on the use of these materials. The noncommercial restriction no longer applies, and this license supersedes the previously advertised license, and replaces that found in many of the archival TIFF image headers. This change follows the Walters Art Museum’s licensing policy. More information on the Walters’ intellectual property policy can be found on the Walters website: http://art.thewalters.org/license/.
  • Metadata XML Information
    • /manuscript: top-level container of metadata for a manuscript’s images
    • /manuscript/image_object: description of the manuscript, primarily Dublin Core metadata, with the number of images captured in the imageCount element
    • /manuscript/images: container for the manuscript’s image data
    • /manuscript/images/image: information about a single capture and its derivatives, including:
      – /manuscript/images/image/index: the order of the image in the set, beginning with 0
      – /manuscript/images/image/image_subject: the folio number or name of the piece imaged
    • /manuscript/images/image/capture: detailed information about the image’s capture extracted from the imaging software database
    • /manuscript/images/image/masterDerivation: description of how the archival TIFF image was generated from the camera raw file, including cropping and color correction information
    • /manuscript/images/image/jhoveData: XML output of the JHOVE utility run on the archival TIFF file
    • /manuscript/images/image/derivative: three elements containing cropping and scaling information needed to generate the 300 PPI, SAP, and thumbnail files from the archival TIFF
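The path layout above can be read with any standard XML parser. The following is a minimal sketch using Python's `xml.etree.ElementTree`. The sample document is hypothetical: it uses only element names from the slide, assumes `index` and `image_subject` are child elements (they could equally be attributes in the real files), and its values are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; element names follow the slide, values are invented.
SAMPLE = """\
<manuscript>
  <image_object>
    <imageCount>2</imageCount>
  </image_object>
  <images>
    <image>
      <index>0</index>
      <image_subject>Upper board outside</image_subject>
    </image>
    <image>
      <index>1</index>
      <image_subject>fol. 1a</image_subject>
    </image>
  </images>
</manuscript>
"""


def list_images(xml_text):
    """Return (index, image_subject) pairs in capture order.

    Raises ValueError if the indexes are not exactly 0..imageCount-1, which
    catches missing, duplicated or miscounted image entries.
    """
    root = ET.fromstring(xml_text)  # root is the <manuscript> element
    declared = int(root.findtext("image_object/imageCount"))
    images = sorted(
        (int(img.findtext("index")), img.findtext("image_subject"))
        for img in root.findall("images/image")
    )
    indexes = [index for index, _ in images]
    if indexes != list(range(declared)):
        raise ValueError(
            f"imageCount is {declared} but image indexes are {indexes}"
        )
    return images


if __name__ == "__main__":
    for index, subject in list_images(SAMPLE):
        print(index, subject)
```

Comparing the declared `imageCount` against the indexes actually present is one simple automated check of the kind the quality-control slides call for.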
  • XML Model (diagram): /manuscript contains /image_object and /manuscript/images; /manuscript/images contains repeated /image elements, each with a /capture element
  • Preserve Standards
  • Standard Workflows for Data Management
    • Transfer & archive digital data for research and analysis by the curatorial, scholarly, preservation and imaging communities
    • Clear access procedures
      – Ensuring data integrity for digital storage repositories
      – Preventing introduction of mislabeled and incorrect metadata
  • Quality Control
    • Data Quality
      – Automate data handling to avoid error
      – Audit trail for manual data manipulation
    • Quality Management
      – Implement processes for quality review
      – Verification and Validation
    • Documentation
      – Define metrics & quality goals
  • Data Management System
    • Internal Digital Asset Management System
      – Internal Server
        • Image Files
        • Catalog Data
    • Access Infrastructure
    • Security
    • Backup
      – Internet Systems Consortium
  • IDR Access Model (diagram), Johns Hopkins: Metadata Application; Metadata (METS); Agent, Event, Request; Preservation Metadata: Implementation Strategies (PREMIS); Digital Representation (e.g. TIFF Image); Dublin Core Metadata Initiative (DCMI); TEI
  • Preservation of the Data. Preservation Heresy: The Digital information is closer to the original than the Artifact itself. “I don’t use the parchment. The parchment is gone! As far as the scholars are concerned, there is no parchment. You only work from digital images on the laptop – that’s the only thing that matters for the reading.” – Dr. Reviel Netz, 14 Jan WYPR
  • What Will Happen to the Data? “There’s a big technical issue that has me worried. The information on the Net is not all simple text. It’s structured, whether it’s Microsoft Word documents or PDFs. That means the information is only really accessible if you understand how to interpret the bits. What happens when files are there and we don’t know how to interpret them anymore? “If you have a CD but the form isn’t known anymore. I have 5 1/4-in. diskettes, but nothing to read them. Even 3 1/2-in. diskette readers are becoming hard to come by. The physical source media change. We may lose the ability to read them.” Vint Cerf, Google Internet Evangelist, recipient of US Presidential Medal of Freedom, and basic architecture of the Internet. July 30, 2007 (Computerworld)
  • Digital Preservation: Impermanence of Digitized Data
    • Dynamic technology, media and formats
      • Rapid obsolescence
      • Regular reformatting required
    • Ensure utility of data
    • Broad distribution to service providers
    • Standardized formats & encoding
  • License: All artworks in the photographs are in public domain due to age. The photographs of two-dimensional objects are also in the public domain. Photographs of three-dimensional objects and all descriptions have been released under the Creative Commons Attribution-Share Alike 3.0 Unported License and the GNU Free Documentation License. You are free to download and use the images and descriptions on this website under the licenses named above, but if you desire digital images at a higher resolution, for scholarly or commercial publication, please contact our photo services department.
  • Trusted Digital Repository
    • Compliance with the Reference Model for an Open Archival Information System (OAIS)
    • Administrative responsibility
    • Organizational viability
    • Financial sustainability
    • Technological and procedural suitability
    • System security
    • Procedural accountability
  • Future Opportunities. Michael B. Toth, R. B. Toth Associates, rbtoth.com