Image Databases in Practice



  1. 1. Metadata for Asset Management Peter B. Hirtle Co-Director Cornell Institute for Digital Collections
  2. 2. Problem: Imaging projects produce many digital files
  3. 4. Problem redux… <ul><li>How do you locate, manage, and display scanned images? </li></ul>
  4. 5. One possible answer: <ul><li>Put identifying information into the file header </li></ul><ul><li>Problems with this approach </li></ul><ul><ul><li>Hard to search and retrieve </li></ul></ul><ul><ul><li>May change over time </li></ul></ul><ul><ul><li>May not be able to migrate data </li></ul></ul>
  5. 6. Second approach <ul><li>Use an image management system to manage images: </li></ul><ul><li>A software application (often a database) used for organizing, managing, and providing access to digital media </li></ul>
  6. 7. Image management system <ul><li>Provides tools for searching </li></ul><ul><ul><li>(Descriptive metadata) </li></ul></ul><ul><li>Provides public and internal links to the images </li></ul><ul><ul><li>(Structural metadata) </li></ul></ul><ul><li>Provides the control elements needed for short- and long-term access </li></ul><ul><ul><li>(Administrative metadata) </li></ul></ul>
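The three kinds of metadata on this slide can be pictured as three parts of one image record. The following is only a minimal sketch: every class and field name (`ImageRecord`, `subjects`, `master_file`, and so on) is a hypothetical choice made for illustration and is not taken from any of the standards or products named in this presentation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DescriptiveMetadata:
    """What the image shows; this is what searching works against."""
    title: str
    creator: str = ""
    subjects: List[str] = field(default_factory=list)


@dataclass
class StructuralMetadata:
    """Links to the image files and their place in a larger object."""
    master_file: str
    derivative_files: List[str] = field(default_factory=list)
    sequence: Optional[int] = None  # e.g. page order within a volume


@dataclass
class AdministrativeMetadata:
    """Control elements needed for short- and long-term access."""
    capture_date: str = ""
    file_format: str = ""
    rights: str = ""


@dataclass
class ImageRecord:
    """One managed image: an identifier plus the three metadata groups."""
    identifier: str
    descriptive: DescriptiveMetadata
    structural: StructuralMetadata
    administrative: AdministrativeMetadata


def search_by_subject(records, term):
    """Return records whose subject list contains the term (case-insensitive)."""
    wanted = term.lower()
    return [
        r for r in records
        if wanted in (s.lower() for s in r.descriptive.subjects)
    ]
```

Keeping the three groups separate means that the descriptive fields can be revised for a new standard without touching the file links or the access controls.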
  7. 8. Metadata for image management <ul><li>No single accepted standard for each type of metadata </li></ul><ul><ul><li>Descriptive metadata </li></ul></ul><ul><ul><ul><li>MARC, DC, MOA2, EAD, VRA, Open Archives Initiative </li></ul></ul></ul><ul><ul><li>Structural metadata </li></ul></ul><ul><ul><ul><li>LC RFPs, MOA2, DOIs </li></ul></ul></ul><ul><ul><li>Administrative metadata </li></ul></ul><ul><ul><ul><li>DIG 35, NISO draft standard, MOA2, in-process preservation standards such as CEDARS </li></ul></ul></ul>
  8. 9. Key concept: metadata is seldom fixed <ul><li>You will be massaging the metadata throughout the life of the project </li></ul><ul><ul><li>To conform to emerging standards </li></ul></ul><ul><ul><li>To adjust to new technical environments </li></ul></ul><ul><ul><li>To add functionality </li></ul></ul>Once you start a digital project, you are committed to it for life
  9. 10. So where do you get an image management solution? <ul><li>No single off-the-shelf solution </li></ul><ul><li>Solutions vary according to: </li></ul><ul><ul><li>complexity </li></ul></ul><ul><ul><li>performance </li></ul></ul><ul><ul><li>cost </li></ul></ul>
  10. 11. What is the “ideal solution”…? <ul><li>Dependent upon your needs: </li></ul><ul><ul><li>size of database </li></ul></ul><ul><ul><li>expected demand for images </li></ul></ul><ul><ul><li>volatility of the data </li></ul></ul><ul><ul><li>available technical resources </li></ul></ul>
  11. 12. Other elements to consider.... <ul><li>Access to a controlled thesaurus </li></ul><ul><li>Flexibility in database design </li></ul><ul><li>The expected life-span of the data </li></ul><ul><li>If permanent, the potential for migration </li></ul><ul><ul><li>Adherence to database standards </li></ul></ul><ul><ul><li>Adherence to data content standards </li></ul></ul>
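Access to a controlled thesaurus usually means that subject terms typed by a cataloger are checked against an approved list before they are stored. A minimal sketch of that check follows; the `THESAURUS` table and the `normalize_terms` function are invented for illustration and do not represent any particular thesaurus or product.

```python
# Hypothetical controlled vocabulary: variant spelling -> preferred term.
THESAURUS = {
    "barn": "Barns",
    "barns": "Barns",
    "farm buildings": "Farm buildings",
    "portraits": "Portraits",
    "portrait photographs": "Portraits",
}


def normalize_terms(raw_terms, thesaurus=THESAURUS):
    """Split raw terms into (accepted preferred terms, rejected terms).

    Lookup ignores case and extra whitespace. Duplicates that map to the
    same preferred term are stored once, in first-seen order.
    """
    accepted, rejected = [], []
    for term in raw_terms:
        key = " ".join(term.lower().split())
        preferred = thesaurus.get(key)
        if preferred is None:
            rejected.append(term)  # keep the original for cataloger review
        elif preferred not in accepted:
            accepted.append(preferred)
    return accepted, rejected
```

Rejected terms are returned rather than silently dropped, so a cataloger can decide whether to correct them or propose them as new vocabulary.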
  12. 13. Three classes of solutions <ul><li>Generic database applications </li></ul><ul><ul><li>Desktop </li></ul></ul><ul><ul><li>Client/server </li></ul></ul><ul><li>Specialized image management programs </li></ul><ul><li>SGML-based solutions </li></ul>
  13. 14. Generic database applications <ul><li>Most common desktop programs </li></ul><ul><ul><li>MS Access, FileMaker Pro </li></ul></ul><ul><li>Client/server applications </li></ul><ul><ul><li>Oracle, Informix (including Illustra), 4th Dimension, object-oriented applications </li></ul></ul>
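With a generic database, the table design is left to you. The sketch below uses Python's built-in `sqlite3` module as a stand-in for the desktop and client/server products listed above. The table names, column names, and file paths are assumptions made for illustration, not a recommended or standard schema.

```python
import sqlite3

# Hypothetical two-table design: one row per image, one row per file.
SCHEMA = """
CREATE TABLE image (
    image_id  TEXT PRIMARY KEY,
    title     TEXT NOT NULL,
    creator   TEXT,
    rights    TEXT
);
CREATE TABLE image_file (
    file_path TEXT PRIMARY KEY,
    image_id  TEXT NOT NULL REFERENCES image(image_id),
    role      TEXT NOT NULL,   -- 'master', 'access', 'thumbnail'
    format    TEXT
);
"""


def build_demo_db():
    """Create an in-memory database holding one sample image and its files."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO image VALUES (?, ?, ?, ?)",
        ("img0001", "Barn, north elevation", "Unknown", "Public domain"),
    )
    conn.executemany(
        "INSERT INTO image_file VALUES (?, ?, ?, ?)",
        [
            ("masters/img0001.tif", "img0001", "master", "TIFF"),
            ("access/img0001.jpg", "img0001", "access", "JPEG"),
        ],
    )
    conn.commit()
    return conn


def files_for_title(conn, word):
    """Return the paths of all files whose image title contains the word."""
    rows = conn.execute(
        "SELECT f.file_path FROM image i "
        "JOIN image_file f ON f.image_id = i.image_id "
        "WHERE i.title LIKE ? ORDER BY f.file_path",
        (f"%{word}%",),
    )
    return [row[0] for row in rows]
```

Separating the image description from its files lets one description point at a master scan and any number of derivatives.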
  14. 15. Demo Here
  15. 16. Advantages to desktop programs <ul><li>Low initial cost for desktop programs </li></ul><ul><li>Desktop programs are relatively easy to program and use </li></ul><ul><li>Simple data import and export </li></ul><ul><li>Growing 3rd-party market of add-ons (especially web tools) </li></ul>
  16. 17. Disadvantages <ul><li>Desktop solutions limited in size </li></ul><ul><ul><li>(< 10,000?) </li></ul></ul><ul><li>Few standardized data structures </li></ul><ul><li>Web interfaces require customization </li></ul><ul><li>High costs of programming </li></ul><ul><ul><li>explicit with large applications </li></ul></ul><ul><ul><li>hidden but real with desktop </li></ul></ul>
  17. 18. Specialized image management programs <ul><li>“Desktop” examples: </li></ul><ul><ul><li>Canto’s Cumulus </li></ul></ul><ul><ul><li>ImageAXS </li></ul></ul><ul><ul><li>Portfolio (formerly Fetch) </li></ul></ul><ul><ul><li>Content (shown here) </li></ul></ul>
  18. 19. Advantages <ul><li>Pre-defined data structure </li></ul><ul><li>Built-in links to images </li></ul><ul><li>Some are cross-platform </li></ul><ul><li>Some have built-in links to the web </li></ul><ul><li>Overall, less programming expertise required </li></ul>
  19. 20. Disadvantages <ul><li>Fixed data structure </li></ul><ul><li>Proprietary database structures </li></ul><ul><li>Limited customization possible </li></ul><ul><li>Web access is primarily via scripts </li></ul>
  20. 21. Larger client/server image management programs <ul><li>Library software </li></ul><ul><li>Museum-oriented programs </li></ul><ul><li>Document management programs </li></ul><ul><li>Digital library solutions </li></ul><ul><li>Other programs for newspaper photos, stock photos, multimedia asset management, etc. </li></ul>
  21. 22. Library systems <ul><li>Image-enabled library catalogs include </li></ul><ul><ul><li>VTLS </li></ul></ul><ul><ul><li>CARL </li></ul></ul><ul><ul><li>OCLC SiteSearch </li></ul></ul><ul><ul><li>Endeavor’s Voyager and ENCOMPASS </li></ul></ul><ul><ul><li>RLG has a system in development </li></ul></ul><ul><li>All library systems will head in this direction </li></ul>
  22. 23. Advantages <ul><li>Ready links between catalog and digital images </li></ul><ul><li>Built on common data structures </li></ul><ul><ul><li>MARC or Dublin Core </li></ul></ul><ul><li>Increased likelihood they will exploit library-specific metadata </li></ul><ul><li>Greater possibility for shared resources </li></ul>
  23. 24. Disadvantages <ul><li>Poor integration between images and text </li></ul><ul><li>No common repository standard </li></ul><ul><li>No shared standard for utilizing metadata </li></ul><ul><li>Administrative hurdles </li></ul><ul><ul><li>Do digital imaging and Library Systems talk to each other? </li></ul></ul>
  24. 25. SGML and XML-based systems <ul><li>A new approach: using metadata encoded with SGML or XML </li></ul><ul><li>Based on document type definitions (DTD) </li></ul><ul><li>Examples: </li></ul><ul><ul><li>Photographs using EAD: California Heritage project </li></ul></ul><ul><ul><li>Text using Ebind (electronic binding DTD) </li></ul></ul><ul><ul><li>Agora’s complete management system </li></ul></ul>
  25. 26. Why consider SGML? <ul><li>Based on an international standard </li></ul><ul><li>DTDs may themselves become standard </li></ul><ul><ul><li>Example: MOA2 </li></ul></ul><ul><li>May be more appropriate for text-oriented description </li></ul><ul><li>Links to other SGML or XML-encoded resources are possible </li></ul>
  26. 27. Disadvantages to SGML <ul><li>Little native client support for SGML </li></ul><ul><li>SGML engines may not be as powerful as relational databases </li></ul><ul><li>XML databases are just being developed </li></ul><ul><li>Native SGML software tends to be expensive </li></ul><ul><li>Often it is easier to store data in a database, and write it out with SGML or XML tags for exchange or export </li></ul>
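The last point on this slide, storing data in a database and writing it out with tags only for exchange, can be sketched with Python's standard library. The example assumes a hypothetical `image` table and a hypothetical `records`/`record` wrapper. Only the Dublin Core element namespace is a real, published identifier, and Dublin Core is used here simply as one possible export vocabulary.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Published namespace for the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def make_demo_db():
    """Build a small in-memory table to export (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE image (image_id TEXT, title TEXT, creator TEXT)")
    conn.executemany(
        "INSERT INTO image VALUES (?, ?, ?)",
        [
            ("img0001", "Barn & silo", "Unknown"),
            ("img0002", "Portrait of a farmer", None),
        ],
    )
    conn.commit()
    return conn


def export_records(conn):
    """Serialize every row of the image table as XML with Dublin Core tags."""
    root = ET.Element("records")  # hypothetical wrapper element
    rows = conn.execute(
        "SELECT image_id, title, creator FROM image ORDER BY image_id"
    )
    for image_id, title, creator in rows:
        record = ET.SubElement(root, "record")
        fields = (("identifier", image_id), ("title", title), ("creator", creator))
        for name, value in fields:
            if value:  # skip empty columns instead of writing empty tags
                ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")
```

The database stays the working copy, and the XML is regenerated whenever the exchange format changes, which fits the earlier point that metadata is seldom fixed.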
  27. 28. Summary <ul><li>No single imagebase package is likely to meet all your needs </li></ul><ul><li>Plan on continuously modifying databases, interfaces, and metadata </li></ul><ul><li>Monitor closely the work developing image database standards in the area of greatest interest to you </li></ul><ul><li>Avoid if possible the hidden costs of internal development </li></ul>