  1. 1. Archiving and Cataloging Digital Photographs. Maurizio Agelli, CRS4. September 20th 2012, 5.30pm, Aula Magna, Facoltà di Architettura, Via Corte d'Appello, Cagliari
  2. 2. Point de vue du Gras, Nicéphore Niépce, 1826 (from Wikimedia Commons)
  3. 3. Boulevard du Temple, Louis Daguerre, 1838 (from Wikimedia Commons)
  4. 4. The first photograph was taken less than 200 years ago ... How many photos have ever been taken?
  5. 5. Number of photos ever shot (up to 2011): ~3.5 x 10^12. 500 to 800 billion taken in 2011. [source: Observatoire des Professions de l'Image] [source: Jonathan Good, 2011]
  6. 6. Presentation Outline
      1) Archiving as part of the photographic workflow
      2) Describing photographs: metadata
      3) Organizing images in catalogs
      4) Ensuring long-term storage: backup and migration
      5) An overview of image archiving tools
      6) A Digital Asset Management platform developed at CRS4
  7. 7. -1- Archiving as part of the photographic workflow
  8. 8. Photo Archive: a collection of images kept in secure, long-term storage. (Photo by Seeweb, CC BY-SA 2.0; photo by M. Agelli, CC BY-SA 2.0)
  9. 9. Building a digital photo archive involves many decisions ...
      ○ What to archive?
      ○ File formats
      ○ Metadata
      ○ File naming
      ○ Catalog organization
      ○ Folder structure
      ○ Backup policies
      ○ Archiving platform
      ○ Migration policies
      ... which strongly depend on the photographic workflow
  10. 10. A general workflow
      No single workflow suits all photographers and all clients [UPDIG].
      Workflow decisions are determined by production volume, turnaround, image quality requirements, regulations, costs, etc.
      Stages: Capture -> Ingestion -> Working -> Publishing, all feeding the Archive
  11. 11. A general workflow, more in detail
      ○ Capture (camera): all camera-related stuff
      ○ Ingestion (computer), focus on volume and speed: image transfer, file renaming, add bulk metadata
      ○ Working (computer), focus on quality: image editing, batch editing, metadata editing
      ○ Publishing (computer): export images, format conversion, create derivative work, print images, publish to web
      ○ Archive (Digital Asset Management Platform): store, search, organize, ...
  12. 12. File formats / 1
      RAW
      ○ Many RAW formats (>200). Proprietary, undocumented.
      ○ Encodes values from camera sensor, before demosaicing (12-16 bit/pixel, 1 color/pixel).
      ○ Lossless. May be compressed.
      DNG (DIGITAL NEGATIVE)
      ○ Open standard, created by Adobe.
      ○ Targeted to replace RAW, but still limited adoption by the industry.
      TIFF
      ○ Open standard. 8, 16, 32 bit RGB.
      ○ Lossless, big file size!
      ○ Possible PSD replacement (supports layers).
      [Diagram: camera sensor -> RAW (DNG); in-camera processing -> JPEG (TIFF); film scanner -> TIFF, JPEG (DNG)]
  13. 13. File formats / 2
      JPEG
      ○ Open standard
      ○ Compressed, lossy
      ○ 8 bit RGB: suitable for displaying, not good for editing
      JPEG 2000
      ○ Better compression than JPEG (wavelet transform vs. cosine transform)
      ○ 8, 16 bit RGB
      ○ Lossless / lossy
      ○ Many extra features: regions of interest, progressive decoding, multi-resolution decoding
      Example: 6 Mpixel image (Nikon D40)
      Format | Encoding                   | File size
      TIFF   | 48 bit/pixel, uncompressed | ~35 MB
      NEF    | 12 bit/pixel, compressed   | ~5.3 MB
      DNG    | 12 bit/pixel, compressed   | ~5 MB
      JPEG   | 90% quality                | ~0.6 MB
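The ~35 MB figure for the uncompressed TIFF can be checked with simple arithmetic. The sketch below assumes a 3008 x 2000 pixel frame (an assumption for a ~6 Mpixel camera, not stated on the slide) and ignores file headers and embedded metadata. `uncompressed_size_mb` is a hypothetical helper name.

```python
def uncompressed_size_mb(width: int, height: int, bits_per_pixel: int) -> float:
    """Return the size of the raw pixel payload in megabytes (10**6 bytes)."""
    total_bits = width * height * bits_per_pixel
    return total_bits / 8 / 1_000_000


# Assumed frame size for a ~6 Mpixel camera (not given in the source).
WIDTH, HEIGHT = 3008, 2000

tiff_mb = uncompressed_size_mb(WIDTH, HEIGHT, 48)  # 16 bit x 3 color channels
raw_mb = uncompressed_size_mb(WIDTH, HEIGHT, 12)   # 1 color sample per pixel

print(f"TIFF, 48 bit/pixel, uncompressed: {tiff_mb:.1f} MB")
print(f"Sensor data, 12 bit/pixel, before compression: {raw_mb:.1f} MB")
```

Under these assumptions the TIFF payload is about 36 MB (roughly 34 MiB), in line with the ~35 MB quoted. The 12-bit sensor data is about 9 MB before compression, which is plausible given the ~5 MB compressed NEF and DNG files.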
  14. 14. File formats and image editing
      Parametric Image Editing
      ○ Image data are not modified. Source file is preserved.
      ○ Editing is saved as a list of rules which are applied at rendering time.
      ○ e.g. Lightroom, Aperture
      Raster Image Editing
      ○ Image pixels are modified.
      ○ A new file containing the edited image shall be saved in order to preserve the original.
      ○ e.g. Photoshop, Picture Window Pro
      [Diagram: CAMERA -> RAW -> PARAMETRIC EDITING (RAW or DNG) -> EXPORT (JPG, or TIFF or DNG) -> RASTER EDITING (TIFF or DNG) -> EXPORT (JPG)]
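The parametric idea (edits stored as rules, applied only at rendering time) can be illustrated with a toy sketch. This is not how Lightroom or Aperture are implemented; the pixel model (a list of 8-bit grey values) and the names `brighten`, `invert` and `render` are assumptions made for illustration.

```python
from typing import Callable, List

Pixel = int                      # one 8-bit grey value, for simplicity
Rule = Callable[[Pixel], Pixel]  # an edit rule maps a pixel to a pixel


def brighten(amount: int) -> Rule:
    """Rule that adds `amount` to a pixel, clipped to the 8-bit range."""
    return lambda p: max(0, min(255, p + amount))


def invert() -> Rule:
    """Rule that produces the negative of a pixel."""
    return lambda p: 255 - p


def render(source: List[Pixel], rules: List[Rule]) -> List[Pixel]:
    """Apply the stored rules in order; the source list is never modified."""
    out = list(source)
    for rule in rules:
        out = [rule(p) for p in out]
    return out


original = [0, 100, 250]          # stands in for the untouched source file
edits = [brighten(10), invert()]  # the saved list of rules

rendered = render(original, edits)
print(rendered, original)
```

A raster editor would instead overwrite the pixel values themselves, which is why the edited image has to be saved as a new file to keep the original.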
  16. 16. Which files to archive? ORIGINAL FILES, MASTER FILES and DERIVATIVE FILES go to the Archive. (Workflow: Capture, Ingestion, Working, Publishing)
  17. 17. -2- Metadata
  18. 18. The importance of metadata. "An image is worth 1000 words", but there are questions which only words can answer: When was it shot? ... and where? Who are those people? Who took this photograph? Can I use it freely? (Photo by Maurizio Agelli, CC BY-SA 2.0)
  19. 19. Metadata: information about content. (Photo by M. Agelli, CC BY-SA 2.0)
  20. 20. A more precise definition. METADATA: "Structured encoded data that describe characteristics of information-bearing entities to aid in the identification, discovery, assessment, and management of the described entities" [source: American Library Association]
  21. 21. Photo by anyjazz65 [ CC BY-NC 2.0 ] Image metadata is nothing new ...
  22. 22. Where can digital image metadata be written?
      ○ inside the image file (metadata + image data)
      ○ in a sidecar file (image data in one file, metadata in another)
      ○ in a database
      ○ in an online registry
      ○ in the file name, e.g. d40-20120920-DSC_0153-edited.jpg (camera, date, id, derived)
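The file-name example can be read back mechanically. A minimal sketch, assuming the four fields are always separated by hyphens in the order camera, date (YYYYMMDD), id, and an optional "derived" tag; `NameMetadata` and `parse_name` are hypothetical names.

```python
from datetime import date, datetime
from pathlib import PurePath
from typing import NamedTuple, Optional


class NameMetadata(NamedTuple):
    camera: str
    shot_on: date
    image_id: str
    derived: Optional[str]  # None for a file that is not a derivative


def parse_name(filename: str) -> NameMetadata:
    """Split 'camera-YYYYMMDD-id[-derived].ext' into its metadata fields."""
    parts = PurePath(filename).stem.split("-")
    if len(parts) not in (3, 4):
        raise ValueError(f"unexpected file name layout: {filename!r}")
    camera, day, image_id = parts[:3]
    derived = parts[3] if len(parts) == 4 else None
    shot_on = datetime.strptime(day, "%Y%m%d").date()
    return NameMetadata(camera, shot_on, image_id, derived)


meta = parse_name("d40-20120920-DSC_0153-edited.jpg")
print(meta)
```

Such a scheme only works if no field contains a hyphen itself, which is one reason file names carry only a small part of the metadata.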
  23. 23. Image metadata standards: PLUS, IPTC, XMP, DICOM, EXIF, Dublin Core, MPEG-7, Creative Commons
  24. 24. IPTC IIM and EXIF
      IPTC IIM (Information Interchange Model)
      ○ Created in 1991 by the International Press Telecommunications Council
      ○ Adobe defined the mechanism for embedding IPTC IIM metadata in image files (1994)
      ○ Driven by NEWS INDUSTRY
      ○ Focused on high-level properties (description, geo location, ...)
      ○ Cannot be extended
      EXIF (Exchangeable Image File Format)
      ○ Created in 1995 by Japan Electronic Industries Development Association
      ○ Driven by CAMERA MANUFACTURERS
      ○ Focused on low-level properties (camera settings, geo coordinates, date/time, ...)
      ○ Cannot be extended
      [Diagram: image file containing EXIF, IPTC IIM and Image Data]
  25. 25. XMP (Extensible Metadata Platform)
      Open standard, created by Adobe
      ○ defines a data model and a serialization model (RDF/XML)
      ○ also covers video, audio, text
      ○ structured as a set of schemas
      ○ can be extended with new metadata schemas
      ○ multi-lingual qualifiers
      ○ can be serialized and stored in most file formats (not in RAW!)
      ○ it is widely supported by the industry
      [Diagram: image file containing Legacy Metadata (EXIF, IPTC IIM), XMP and Image Data. XMP schemas shown: Dublin Core, XMP Basic, Rights, Media Mng, Photoshop, Camera RAW, EXIF, IPTC Core, IPTC Extens., ...]
  26. 26. A timeline of image standards
      [Timeline figure, 1986 to 2012. Events shown: TIFF (first release), JPEG (first release), IPTC IIM, IPTC Headers (Adobe), EXIF (first release), XMP (first release), Kodak Photo CD, first DSLR (Kodak DCS-100), professional DSLRs, consumer DSLRs]
  27. 27. A quick look inside XMP
      >200 properties + all EXIF and IPTC properties
      ○ TITLE (dc:title)
      ○ DESCRIPTION (dc:description)
      ○ DESCRIPTION WRITER (photoshop:CaptionWriter)
      ○ RATING (xmp:Rating)
      ○ KEYWORDS (dc:subject)
      ○ GEO COORDINATES (exif:GPSLatitude, exif:GPSLongitude)
      ○ LOCATION (photoshop:Country, photoshop:State, photoshop:City, ...)
      ○ AUTHOR (dc:creator, exif:Artist)
      ○ RIGHTS (xmp:Rights)
      ○ ...
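Because XMP is serialized as RDF/XML, a few of these properties can be read with a generic XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a hand-written sample packet. The packet content and the helper name `read_xmp` are invented for illustration, and real packets embedded in image files first have to be located and extracted, which is not shown.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict

# Namespace URIs for the RDF, Dublin Core and XMP Basic schemas.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "xmp": "http://ns.adobe.com/xap/1.0/",
}

# Hand-written sample packet (illustrative values, not from a real file).
SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:xmp="http://ns.adobe.com/xap/1.0/"
      xmp:Rating="4">
   <dc:title><rdf:Alt>
     <rdf:li xml:lang="x-default">Harbour at dusk</rdf:li>
   </rdf:Alt></dc:title>
   <dc:subject><rdf:Bag>
     <rdf:li>harbour</rdf:li><rdf:li>Cagliari</rdf:li>
   </rdf:Bag></dc:subject>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def read_xmp(packet: str) -> Dict[str, Any]:
    """Extract title, keywords and rating from an XMP packet string."""
    root = ET.fromstring(packet)
    desc = root.find(".//rdf:Description", NS)
    if desc is None:
        return {"title": None, "keywords": [], "rating": None}
    title = desc.find("dc:title/rdf:Alt/rdf:li", NS)
    keywords = [li.text for li in desc.findall("dc:subject/rdf:Bag/rdf:li", NS)]
    rating = desc.get(f"{{{NS['xmp']}}}Rating")  # stored here as an attribute
    return {
        "title": title.text if title is not None else None,
        "keywords": keywords,
        "rating": int(rating) if rating is not None else None,
    }


info = read_xmp(SAMPLE)
print(info)
```

The sketch only handles the attribute form of xmp:Rating and a single rdf:Description; a dedicated XMP library is the safer choice for real archives.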
  28. 28. A quick look inside XMP: Date/Time Metadata
      ○ The original painting (~1507): Iptc4xmpExt:AODateCreated
      ○ An ancient postcard (1925): photoshop:DateCreated
      ○ The digital representation of the postcard (2008): xmp:CreateDate
      ○ The archived image (metadata last edited in 2012): xmp:MetadataDate
  29. 29. Extending XMP: Creative Commons
      ○ CC provides a legal and technical infrastructure to help people share knowledge and creativity.
      ○ CC defines a set of properties that allow authors to specify under which conditions their content can be distributed and used.
      ○ CC recommends XMP for embedding CC properties inside resources.
      (Photo by Creative Commons, CC BY 3.0)
  30. 30. Extending XMP: PLUS (Picture Licensing Universal System)
      Non-profit organization whose mission is to simplify and facilitate the communication and management of image rights.
      PLUS Registry
      ○ unique ids for creators, right holders, images, ...
      ○ access to rights information and other metadata
      PLUS License Data Format (LDF)
      ○ metadata schema for embedding image license
      ○ 88 properties
      ○ dedicated XMP PLUS namespace
  31. 31. Extending XMP: PRISM (Publishing Requirements for Industry Standard Metadata)
      Defined by IDEAlliance, a global community of content and media creators.
      PRISM Metadata for Images provides information about:
      ○ objects pictured (manufacturer, model, description, ...)
      ○ slideshows (sequences of images)
      ○ shooting info (viewpoint, season, visual technique, ...)
      PRISM Advertising Metadata provides information about the usage of the image in an advertising campaign.
      PRISM defines dedicated XMP namespaces: pmi and pam
  32. 32. Extending XMP: Area Tagging
      Metadata Working Group
      ○ XMP-MP Schema for face tags
      ○ adopted by Picasa
      Microsoft has created a new XMP schema for tagging people
  33. 33. Handling Social Tagging: a research issue. 140 billion photos in Facebook (up to 2011) [source: Jonathan Good, 2011]
  34. 34. -3- Organizing images in catalogs
  35. 35. catalog
      ○ noun: list of the contents of a library or a group of libraries, arranged according to any of various systems
      ○ verb: 1. to make an itemized list of; 2. to classify (a book or publication, for example) according to a categorical system
      (Picture by Henry Trotter, 2005. Source: Wikimedia Commons)
  36. 36. Photo Cataloging Software
      Prime goals of Photo Cataloging Software:
      ○ provide a secure, long-term storage
      ○ find the images when you need them
      ○ interoperate with other tools of the same ecosystem (in the present, as well as future)
      "An ecosystem is made up of many parts that must not only coexist but also work with each other to survive. When all the elements work in concert, the system can thrive." (Peter Krogh, The DAM Book)
      Photo Cataloging Software falls into the broad domain of Digital Asset Management. Let's try grabbing some definitions ...
  37. 37. Digital Asset Management: a term open to many definitions ...
      ○ "a way of keeping an overview of your digital files and make sure they don't get lost or altered unintentionally" [J. Jacobsen, T. Schlenker, L. Edwards, Implementing a DAM System, Elsevier]
      ○ "the protocol for downloading, renaming, backing up, rating, grouping, archiving, optimizing, maintaining, thinning, and exporting files" [P. Krogh, The DAM Book, O'Reilly]
      ○ "a complete toolbox to the author, publisher, and the end users of the media to efficiently utilize the assets" [D. Austerberry, Digital Asset Management 2nd edition, Focal Press]
      ... and whose scope goes beyond the domain of photography: Enterprise Content Management, Creative Industries, Digital Libraries, Publishing
  38. 38. Core functionalities of a photo catalog / DAM software (we will use these two terms interchangeably)
      ○ Import images
      ○ Harvest metadata
      ○ Manage metadata in a database (+ index for search)
      ○ Synchronize metadata
      ○ Export images
      ○ Organize photos with hierarchical keywords
      ○ Manage original, master and derivative files as different renditions of the same item
      Extra functionalities such as file rename, raw converter, editor, publishing tools may be provided too.
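  39. 39. Harvesting and synchronizing metadata
      [Diagram: image files (EXIF, IPTC IIM, XMP, Image Data) are imported into and exported from the Image Storage. "Harvest metadata" copies the embedded metadata into the Database, which feeds the User Interface; "Synchronize metadata" keeps the Database and the files in agreement]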
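The harvesting step can be sketched as copying already-extracted metadata into an indexed database. The example below uses Python's standard `sqlite3` module with an in-memory database. The table layout and the function names `open_catalog`, `harvest` and `search` are assumptions for illustration, not the schema of any particular cataloging product, and writing changes back to the files (synchronization) is not shown.

```python
import sqlite3
from typing import Any, Dict, List


def open_catalog() -> sqlite3.Connection:
    """Create an in-memory catalog with an index on keywords for search."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE image (id INTEGER PRIMARY KEY, path TEXT UNIQUE, title TEXT);
        CREATE TABLE keyword (image_id INTEGER REFERENCES image(id), word TEXT);
        CREATE INDEX keyword_word ON keyword(word);
        """
    )
    return db


def harvest(db: sqlite3.Connection, path: str, metadata: Dict[str, Any]) -> None:
    """Store metadata that was previously read from the file at `path`."""
    cur = db.execute(
        "INSERT INTO image (path, title) VALUES (?, ?)",
        (path, metadata.get("title")),
    )
    db.executemany(
        "INSERT INTO keyword (image_id, word) VALUES (?, ?)",
        [(cur.lastrowid, word) for word in metadata.get("keywords", [])],
    )


def search(db: sqlite3.Connection, word: str) -> List[str]:
    """Return the paths of all images tagged with `word`."""
    rows = db.execute(
        "SELECT image.path FROM image "
        "JOIN keyword ON keyword.image_id = image.id "
        "WHERE keyword.word = ? ORDER BY image.path",
        (word,),
    )
    return [row[0] for row in rows]


catalog = open_catalog()
harvest(catalog, "a.jpg", {"title": "Harbour", "keywords": ["sea", "boat"]})
harvest(catalog, "b.jpg", {"keywords": ["sea"]})
print(search(catalog, "sea"))
```

Keeping the metadata in a database makes search fast, but it also creates a second copy that can drift from the files, which is why synchronization is listed as a separate function.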
  40. 40. Hierarchical keywords
      ○ typically mapped to dc:subject
      ○ no semantic rules for describing the hierarchy; special characters are used, e.g.: Organizations|Industry|ACME
      (Photo by Isabelle Palatin, CC BY-SA 2.0)
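Since the hierarchy is only encoded by a separator character, an application has to rebuild the tree itself. A minimal sketch, assuming "|" is the separator (other tools may use a different character); `split_keyword` and `build_tree` are hypothetical names, and the second keyword in the example is invented.

```python
from typing import Dict, Iterable, List

Tree = Dict[str, "Tree"]  # each keyword maps to its sub-keywords


def split_keyword(keyword: str, sep: str = "|") -> List[str]:
    """Split a flat keyword string into its hierarchy levels."""
    return [part.strip() for part in keyword.split(sep) if part.strip()]


def build_tree(keywords: Iterable[str], sep: str = "|") -> Tree:
    """Merge flat hierarchical keywords into one nested dictionary."""
    root: Tree = {}
    for keyword in keywords:
        node = root
        for level in split_keyword(keyword, sep):
            node = node.setdefault(level, {})
    return root


tree = build_tree([
    "Organizations|Industry|ACME",
    "Organizations|Research|CRS4",  # invented second keyword
])
print(tree)
```

Two applications that disagree on the separator will not see the same hierarchy, which is one source of trouble when keyword hierarchies are moved between tools.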
  41. 41. Renditions / Version sets
      ○ Different files related to the same image shall, under certain circumstances, be managed as a single item.
      ○ Covered by XMP-MM (Media Management)
      ○ Cataloging applications provide different solutions (e.g. stacking, version sets)
      [Diagram: import -> Image Storage holding ORIGINAL, MASTER (edited), DERIVATIVES ... -> export; 1 item, N renditions]
  42. 42. -4- Ensuring long-term storage: backup and migration
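The "1 item, N renditions" idea maps naturally onto a small data structure. The sketch below is an illustration only; the `Role` and `Item` names and the file paths are invented, and it does not reproduce the XMP Media Management schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Role(Enum):
    ORIGINAL = "original"
    MASTER = "master"
    DERIVATIVE = "derivative"


@dataclass
class Item:
    """One catalog item grouping every file that shows the same image."""
    item_id: str
    renditions: Dict[Role, List[str]] = field(default_factory=dict)

    def add(self, role: Role, path: str) -> None:
        self.renditions.setdefault(role, []).append(path)

    def paths(self) -> List[str]:
        """All files of the item, originals first, derivatives last."""
        return [p for role in Role for p in self.renditions.get(role, [])]


item = Item("DSC_0153")
item.add(Role.DERIVATIVE, "web/DSC_0153-small.jpg")
item.add(Role.ORIGINAL, "originals/DSC_0153.nef")
item.add(Role.MASTER, "masters/DSC_0153-edited.tif")
print(item.paths())
```

Searching, rating and keywording then operate on the item, while export picks the rendition that fits the purpose.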
  43. 43. There are many causes of data loss: lightning, transfer errors, disk / hardware failure, theft, floods, loss, viruses, fire, human errors. (Photo by Lucina M, CC BY-NC 2.0)
  44. 44. Which files to backup: Catalog (DB), Original Files, Working Files, Master Files, Derivative Files
  45. 45. A possible backup strategy for single user workflow
      1 PRIMARY STORAGE
      2 ON-LINE BACKUP (e.g. NAS)
      3 OFF-LINE BACKUP
      4 OFF-SITE BACKUP: additional copy on a remote NAS
      5 CLOUD BACKUP: additional copy on a cloud service (Amazon S3, Elephant Drive, Symform, ...)
      Notes from the diagram: copies are made with rsync (*); copy to optical storage (ORIGINALS, MASTERS, DERIVATIVES); storage media are swapped at every backup.
      (*) deleting files on the receiving side shall be disabled for ORIGINALS, MASTERS and DERIVATIVES
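The footnote about rsync can be made concrete with a small command builder. The sketch assumes that mirroring with deletion is acceptable for everything except originals, masters and derivatives (the slide only states the protected categories); the function name `rsync_command`, the category labels and the paths are hypothetical. `--archive` and `--delete` are standard rsync options.

```python
from typing import List

# Categories whose files must never be removed from the backup.
PROTECTED = {"originals", "masters", "derivatives"}


def rsync_command(kind: str, source: str, target: str) -> List[str]:
    """Build the rsync argument list for one category of files."""
    cmd = ["rsync", "--archive"]
    if kind not in PROTECTED:
        # Assumption: other categories may be mirrored exactly,
        # so files deleted at the source are deleted on the backup too.
        cmd.append("--delete")
    return cmd + [source, target]


# Hypothetical paths; the trailing slash copies the directory contents.
for kind in ("originals", "working"):
    print(" ".join(rsync_command(kind, f"/photos/{kind}/", f"/mnt/nas/{kind}/")))
```

Leaving out `--delete` means an accidental deletion on the primary storage is not propagated to the backup, at the cost of a backup that can only grow.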
  46. 46. Migration
      "Currently there are no permanent solutions for storing digital content. No media lasts forever, and file formats become obsolete. Migration must be considered as a necessary part of every storage strategy."
      ○ file formats can become obsolete (just think what is happening to Kodak Photo CD ...)
      ○ storage evolves (higher capacity, higher speed, ...)
      ○ solution:
          ○ monitoring the storage process
          ○ conversion to newer and safer formats (e.g. DNG)
          ○ periodical replacement of storage devices
  47. 47. -5- An overview of image archiving tools and services
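One possible way to monitor stored files is to keep a checksum manifest and re-verify it periodically, for example before and after moving an archive to new storage devices. This is an illustrative approach, not one prescribed by the slides; `file_digest`, `build_manifest` and `changed_files` are hypothetical helper names built on Python's standard `hashlib` and `pathlib`.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file below `root` (by relative path) to its digest."""
    return {
        str(path.relative_to(root)): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root: Path, manifest: Dict[str, str]) -> List[str]:
    """List files that are missing or whose content no longer matches."""
    damaged = []
    for name, digest in manifest.items():
        path = root / name
        if not path.is_file() or file_digest(path) != digest:
            damaged.append(name)
    return damaged
```

A typical use would be to call `build_manifest` once after ingestion, store the result next to the catalog, and run `changed_files` against each backup copy on a schedule.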
  49. 49. Image management applications: examples
      [Diagram of application categories: ingestion tool, image culling / image browser, RAW processor, parametric image editor, raster image editor, special purpose editor, DAM application (photo catalog), publishing tools, dedicated printing software, scanner software]
      Examples shown: ImageIngester Pro, Fast Picture Viewer, Bridge, Adobe Camera Raw, Lightroom, Aperture, Bibble Pro, Photoshop, Picture Window Pro, Photomatix, IDImager, Qimage, Quad Tone RIP, Vuescan, Silverfast
  50. 50. A few photo cataloging applications
      Product | Notes | Platforms | Cost (EUR)
      Adobe Lightroom 4 | includes Adobe Camera RAW, many export features | WIN / MAC | 130
      Photo Supreme (formerly known as IDIMAGER) | very powerful catalog explorer, multiuser DB | WIN / MAC | 80
      Phase One Media Pro (formerly known as Expression Media, formerly as iView) | - | WIN / MAC | ~85
      Apple Aperture 3 | - | MAC | 63
      Corel AfterShot Pro (formerly known as Bibble Pro) | - | WIN / MAC | ~50
      Digikam Software Collection 3 | RAW processing based on dcraw, rendition support from version 2 | Linux | free
      Picasa 3.9 | - | WIN / MAC | free
      PicaJet | basic editing, multiuser DB | WIN | ~50
      Common features:
      ○ parametric editor, with possibility to use an external editor
      ○ XMP support (with some issues when exporting/importing keyword hierarchies)
      ○ some kind of rendition support
      ○ trial period (typically 30 days)
  51. 51. Multi-user photo management
      ○ commercial
          ○ Daminion
          ○ Canto Cumulus
          ○ Celum
      ○ open-source
          ○ ZenPhoto (GPL)
          ○ Montala Resource Space (BSD)
          ○ Gallery (GPL)
          ○ Razuna (AGPL)
          ○ NotreDAM (GPL3)
  52. 52. -6- NotreDAM: an open-source DAM platform developed at CRS4
