Managing image data for aquatic sciences - the best practices presentation



An introduction to some best-practice concepts for digital images, to assist aquatic biologists in their analyses. Given at BIO in February 2012 to accompany DFO Technical Report 2962.

  • compare to octopus and isopod images: as long as you know what to look for and where, the search will succeed

    1. 1. Claude Nozères Science Branch, Québec Region Fisheries and Oceans Canada Maurice Lamontagne Institute
    2. 2. Overview 1. introduction: the guide (Tech. Rep. 2962) 2. image data: what is it about? 3. captures: preparations 4. metadata: why all the bother? 5. workflows: recipes for work 6. exports: archives & publishing 7. trends: comments on new tech. 8. questions: findings on the tour so far afternoon: software demos & discussions
    3. 3. 1. Introduction: background  Personal experiences – taking digital photos of aquatic life since 2001  needed to document prey samples for marine mammals, and film wasn’t doing a good job  became aware of mixed information among users ○ frustrations were common when using either consumer or industrial tools ○ by sharing experiences, our work may become easier, and better quality image data is produced
    4. 4. Introduction: guide & tour  DFO’s National Image Data Management (NIDM) Working Group  fall 2010: began a ‘best practices guide’ to assist employees with their imaging work  mid-Dec. 2011: published the first full version: Nozères. Tech. Rep. 2962, now online (WAVES)  Jan-Feb. 2012: tour of regions to introduce guide ○ the hope is that each site will then do a follow-up, with advanced workshops, for their needs
    5. 5. Objectives: this talk 1. to introduce a sample of common, but perhaps misunderstood, concepts in image data 2. to learn about your experiences with gear and software so we can share this with others in DFO note: will also try to include latest information, not in the guide Headline: happy marine biologist Keywords: scene, joke, smurf Location: Belle-Isle Category: personal
    6. 6. 2. Image data – basic types  still image data (photo)  huge availability in consumer devices  well-established for industry & science but finicky?  moving image data (video)*  often consumer-oriented (family videos)  industrial applications: pricey and finicky?  information (metadata) 2008-08-07 12:02:09...  data about the image data *note: video is not discussed in this brief introduction – see guide for information
    7. 7. Why talk of files as ‘data’? just another pretty picture or an aquatic species observation? information clearly visible subjects Keywords: harbour seal, rock Location: Sainte-Luce Date: Sept. 8, 2009
    8. 8. Image data: perceptions  ‘If really so useful, we should all be doing it!’  may end up generating stacks of fuzzy, dateless, unknown files = frustration  ‘I have enough science data to deal with’  images as data may not be taken seriously  ‘I don’t have time for more requirements!’  learning about image data may be viewed as a time-waster instead of a work-saver
    9. 9. 3. Capturing image data  camera settings  format (file type)  quality (lossy compression)  size (dimensions....)  special topic: geotagging (GPS data)
    10. 10. Camera settings: file formats  JPG (8-bit) is default, or only option for many  good, but ‘baked’ (limited for image editing)  RAW (10, 12, 14-bit) for advanced cameras  require post-processing with RAW software  sometimes capture both ‘RAW+JPG’ ○ view JPG right away, store RAW for later edits  TIF is occasionally available (8 or 16-bit)  microscope & tethered cameras, scanners  good choice for image analysis (16-bit)  ‘baked’ like JPG: harder to correct for whiteness
    11. 11. Camera settings: ‘quality’ (for JPG) Lossy compression: how much detail is to be discarded in JPG? Select quality: ‘basic, good, fine, v.fine’ = low to high quality Lossless compression: no data loss, no need to set ‘quality’ (RAW, TIF)
    12. 12. Settings: channels & bits  Channels  Grayscale has 1 channel (black)  RGB (for screen) has 3: Red, Green, Blue  CMYK (for print) has 4: Cyan, Magenta, Yellow, blacK  Bits: levels, or gradation ‘steps’ (in each channel)  1-bit = 2^1 = 2 values, on/off, black or white (like a fax)  8-bit = 2^8 = 256 values for tones (gray or colour images)  10, 12, 14, 16-bit = 1,024 to 65,536 tone levels  note: most monitors only display in 8-bit  even if you can’t see it, the data is there for analysis
    13. 13. Channels x Levels (tones)  8-bit: 256 steps, black = 0, white = 255  16-bit: 65,536 steps, black = 0, white = 65,535
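The relation between bit depth and tone levels can be checked with a few lines of Python. This is only an illustrative sketch; the function names are hypothetical and not from the guide. It counts values from 0, so the brightest value an n-bit channel can store is 2^n - 1.

```python
def tone_levels(bits: int) -> int:
    """Number of distinct tone values one channel can hold at this bit depth."""
    return 2 ** bits


def white_value(bits: int) -> int:
    """Largest stored value (pure white); black is stored as 0."""
    return tone_levels(bits) - 1


def bits_per_pixel(channels: int, bits_per_channel: int) -> int:
    """Total bits per pixel, e.g. the '24-bit' quoted for colour scanners."""
    return channels * bits_per_channel


if __name__ == "__main__":
    for bits in (1, 8, 10, 12, 14, 16):
        print(f"{bits:>2}-bit: {tone_levels(bits):>6} levels, white = {white_value(bits)}")
    print("RGB at 8 bits per channel:", bits_per_pixel(3, 8), "bits per pixel")
```

The same arithmetic explains scanner labels: 3 channels at 8 or 16 bits each give 24 or 48 bits per pixel.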
    14. 14. Why more bits matter  high-bit RAW & TIFF files have more tones  important in image analysis for feature (subject) discrimination, like plankton in a water sample ○ 16-bit grayscale may be preferred over 8-bit colour  the extra information enables powerful software editing (recover detail in light and dark areas) ○ JPG 8-bit can also be edited, but less dramatically ○ TIF at 8-bit has the same limits (16-bit allows more) ○ RAW is >8-bit (e.g., 10-14) Note: colour scanners may refer to 24-bit or 48-bit (3x8 or 3x16)
    15. 15. Settings: white balance  auto-white balance may be accurate, but sometimes better when set to conditions:  sunny, cloudy, shade, incandescent, fluorescent  JPG & TIF are ‘processed’ files with their ‘whiteness’ (white balance) set at capture  like a ‘Polaroid’ instant photo: limited edits  RAW has metadata suggesting the setting, but is not fixed: can redo after capture  similar to a film negative: ‘reprocess it’
    16. 16. Camera settings: white balance  default capture (RAW file, under fluorescent lights) vs. file corrected for white  Background should be white – clicked on it with a correction tool and white balance was adjusted
    17. 17. Camera settings – size ...file size, image size (resolution), image re-sizing (pixel numbers), pixel density, sensor size, sensor photosites, photostitching...
    18. 18. Camera settings – size  Image resolution  web or small: good for onscreen viewing  large (2 to 5 MP): good for regular prints  full-size (usually about 8 to 16 MP): archives  note: RAW is usually a full-size capture  Why choose less than ‘full-size’?  digital zoom (like cropping) sometimes handy  situations when a large image is a burden ○ documenting labels, geotagging, emailing ○ caution: set back to full-size afterwards (figure labels: web; 2 MP 1600x1200; 5 MP 2600x1900)
    19. 19. Size: pixels vs. files  Settings for size (or resolution) are about image dimensions, i.e. how many megapixels (MP), not the computer file size in kilobytes or megabytes (KB, MB)  blank test image: 1600 (across) x 1200 (high) pixels = 2 MP, but file size will vary by format & compression  JPG with high compression = small file (68 KB)  TIFF with no compression = large file (5800 KB) ○ TIF with lossless compression of this image = (70 KB)
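A rough sketch of why the uncompressed TIFF comes out near 5800 KB: the pixel data alone for a 1600 x 1200 image is width x height x channels x bytes per channel. The sketch assumes 3 channels (RGB) at 8 bits each, which the slide does not state, and the function names are hypothetical. Real files add a small header, and compressed sizes depend entirely on image content.

```python
def megapixels(width: int, height: int) -> float:
    """Image dimensions expressed in megapixels."""
    return width * height / 1_000_000


def uncompressed_bytes(width: int, height: int,
                       channels: int = 3, bits_per_channel: int = 8) -> int:
    """Size of the raw pixel data only (no header, no compression)."""
    return width * height * channels * bits_per_channel // 8


if __name__ == "__main__":
    w, h = 1600, 1200
    print(f"{w}x{h} = {megapixels(w, h):.2f} MP")
    print(f"8-bit RGB pixel data:  {uncompressed_bytes(w, h):,} bytes")
    print(f"16-bit RGB pixel data: {uncompressed_bytes(w, h, bits_per_channel=16):,} bytes")
```

The pixel count stays the same in every format; only the bytes on disk change.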
    20. 20. Size: dimensions vs. density  on the computer: resizing is increasing or decreasing the number of pixels (dimensions): 800x600 smaller (fewer pixels), 1600x1200 (original), 3600x2400 upsized (more pixels)  but sometimes we say we ‘resize’ for print  really just setting pixel density (dots per inch: dpi)  image size (number of pixels) has not changed  smaller dots: 300 dpi (print viewing); larger dots: 72 dpi (screen viewing)
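To illustrate that setting the density does not change the pixel count, the sketch below computes the printed size of the same 1600 x 1200 image at two densities. The function names are hypothetical; the only relation used is inches = pixels / dpi.

```python
def print_size_inches(width_px: int, height_px: int, dpi: float) -> tuple[float, float]:
    """Physical size when the same pixels are laid out at a given density."""
    return width_px / dpi, height_px / dpi


def pixels_needed(inches: float, dpi: float) -> int:
    """Pixels required along one edge to print that edge at the given density."""
    return round(inches * dpi)


if __name__ == "__main__":
    for dpi in (300, 72):
        w_in, h_in = print_size_inches(1600, 1200, dpi)
        print(f"1600x1200 at {dpi} dpi -> {w_in:.1f} x {h_in:.1f} inches")
    print("pixels for a 6-inch edge at 300 dpi:", pixels_needed(6, 300))
```

The image is 1600 x 1200 pixels in both cases; only the size of each dot on paper differs.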
    21. 21. Of sensor sizes & megapixels  Sensor size: physical dimensions (mm)  SLR cameras have large sensors  compact cameras have tiny image sensors  Photosites : density of sites on the sensor  two cameras may have the same resolution, but the 12 megapixels of the SLR are over a much wider area (the larger sensor) than the 12 megapixels on a small-sensor compact
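The difference between two 12-megapixel cameras can be made concrete by estimating the spacing between photosites. This sketch assumes square photosites filling the whole sensor, and the sensor dimensions are illustrative assumptions that do not come from the slides: 36 x 24 mm for a full-frame SLR and roughly 6.17 x 4.55 mm for a typical small-sensor compact.

```python
import math


def approx_pitch_um(sensor_w_mm: float, sensor_h_mm: float, megapixels: float) -> float:
    """Approximate photosite spacing in micrometres.

    Area per photosite in square micrometres is
    (w_mm * h_mm * 1e6) / (megapixels * 1e6) = w_mm * h_mm / megapixels.
    """
    return math.sqrt(sensor_w_mm * sensor_h_mm / megapixels)


if __name__ == "__main__":
    # Assumed sensor dimensions, for illustration only.
    full_frame = approx_pitch_um(36.0, 24.0, 12)
    compact = approx_pitch_um(6.17, 4.55, 12)
    print(f"full-frame, 12 MP: about {full_frame:.1f} um per photosite")
    print(f"compact, 12 MP:    about {compact:.1f} um per photosite")
```

Under these assumptions each photosite on the larger sensor is several times wider, so it gathers more light for the same resolution.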
    22. 22. Sensor sizes  Medium format & full-frame 35mm  niche markets ($$)  slow development  smaller sensors: most are versatile  extremely competitive  intense development (figure labels: 20-80 MP; 12-24 MP; 12-24 MP; 5-16 MP; new Canon G1X; new Nikon 1; common)
    23. 23. Sensor sizes  35 mm & Medium format (& larger) are useful in aerial surveys (e.g., marine mammals)  extreme level of fine, clean detail & tones  great for distances; macro work is trickier, bulky  most biology work is done with compacts or smaller (APS) SLRs (simpler, easier to use)  compacts for macro work: many can do 0-10 cm  getting ‘pretty good’ results: use software processing to beat physical limits, reduce noise ○ not ‘fakery’ but sometimes undesired (see example later)
    24. 24. Capture tips: boost image size ‘photostitching’
    25. 25. Capture tips Consult examples: online image galleries optional: have a colour card (or something white) in view, to correct for white balance contrasting background (white piece of plastic) - gray and black also good ruler or object in view for scale
    26. 26. Capture: Geolocation  some cameras have internal GPS to embed coordinates & correct time zone date  mostly in still cameras, but also some video (rare) ○ note: smartphones geotag both photos & videos  other cameras can have their images tagged with external data using, for example:  1) geotagged image at same location (e.g. smartphone)  2) GPS track and timestamp of image ○ note: image file must have correct clock time ○ tip: take a photo of the time on a GPS screen, then examine that photo’s capture time info. to determine correction/adjustment for camera clock
    27. 27. Geolocation – image tagging Smartphone map (shows AIS) Smartphone photo (tagged with GPS) Camera with telephoto lens (but no GPS)
    28. 28. Geolocation – image tagging Smartphone photo (tagged with GPS) Keywords: ship, transport Location: Sainte-Flavie Category: personal load into the geotagging software the tagged photo with untagged photos taken from the same location SLR zoom photo (geotagged w/phone image)
    29. 29. Geolocation – GPS track sync  record a GPS track log on an external device while taking camera images  later, download images and the GPS track into geotagging software  the capture time of the photo will be used to determine its position at that time on the GPS track (‘sync’)  embeds the coordinates into image file  NOTE: this is an example of image data information (metadata), and not about image quality
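The track sync described above can be sketched in a few lines. This is not how any particular geotagging program works internally; it is a minimal illustration that assumes a track of (time, latitude, longitude) points sorted by time, uses simple linear interpolation between the two nearest points, and applies a camera clock correction obtained from a photo of the GPS screen. All names and sample coordinates are hypothetical.

```python
from bisect import bisect_left
from datetime import datetime


def camera_clock_offset(gps_time_shown: datetime, photo_capture_time: datetime):
    """Correction to add to camera times, from a photo of the GPS clock."""
    return gps_time_shown - photo_capture_time


def position_at(track, when: datetime):
    """Interpolate (lat, lon) at time `when` on a time-sorted GPS track."""
    times = [t for t, _, _ in track]
    if not times[0] <= when <= times[-1]:
        raise ValueError("capture time falls outside the GPS track")
    i = bisect_left(times, when)          # first track point at or after `when`
    if times[i] == when:
        return track[i][1], track[i][2]
    (t0, lat0, lon0), (t1, lat1, lon1) = track[i - 1], track[i]
    f = (when - t0) / (t1 - t0)           # fraction of the way from t0 to t1
    return lat0 + f * (lat1 - lat0), lon0 + f * (lon1 - lon0)


if __name__ == "__main__":
    track = [
        (datetime(2012, 2, 1, 12, 0, 0), 48.60, -68.20),
        (datetime(2012, 2, 1, 12, 1, 0), 48.61, -68.18),
    ]
    # The GPS screen showed 12:00:30, but the camera stamped that photo 11:58:30.
    offset = camera_clock_offset(datetime(2012, 2, 1, 12, 0, 30),
                                 datetime(2012, 2, 1, 11, 58, 30))
    corrected = datetime(2012, 2, 1, 11, 58, 30) + offset
    print(position_at(track, corrected))
```

If the camera clock is wrong and no correction is applied, every photo is placed at the wrong point on the track, which is why the clock check matters.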
    30. 30. 4. Image (file) metadata tags  why the fuss over metadata?  we may do ‘tagging’ in order to be able to locate, use, and credit the image files using the tags  where is the image metadata?  camera files have well-known, standard places to store this special text information  other image data, or non-standard information, may be entered in catalog files in a database system  do I need to manually add all these tags?  some are automatically included by the camera, such as date, time, camera model (and GPS, if available)
    31. 31. Metadata tags: suggestions Common fields for tagging images: Filename: unique name (e.g., date-####.JPG) Title: name for photo (but often for ID #) Headline: short phrase about content Description: more info. about content Keywords: species name, subject Location: place or station name Creator: photographer’s name
    32. 32. Tag example Filename: 20111014_IMG_1387.JPG useful, but not often done Title (catalog no.): 9682 Headline (quick describe): Arctic isopods Description/Caption (text on paper label): Hand-collected Mesidotea sabini from Causeway at low tide, held in an aquarium for one day Keywords: Saduria sabini Location: Frobisher Bay site 9 Creator: Claude Nozères
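As a small illustration of the tag example, the sketch below builds a date-prefixed filename of the same form and reports which of the suggested fields are still empty. The helper names and the dictionary layout are hypothetical and are not tied to any cataloguing tool. The field names follow the suggestions slide.

```python
from datetime import date

# Field names taken from the 'Metadata tags: suggestions' slide.
SUGGESTED_FIELDS = ("Filename", "Title", "Headline", "Description",
                    "Keywords", "Location", "Creator")


def dated_filename(capture_date: date, camera_name: str) -> str:
    """Prefix the camera's own filename with the capture date (YYYYMMDD)."""
    return f"{capture_date:%Y%m%d}_{camera_name}"


def missing_fields(tags: dict) -> list:
    """Suggested fields that are absent or empty in a tag record."""
    return [field for field in SUGGESTED_FIELDS if not tags.get(field)]


if __name__ == "__main__":
    tags = {
        "Filename": dated_filename(date(2011, 10, 14), "IMG_1387.JPG"),
        "Title": "9682",
        "Headline": "Arctic isopods",
        "Keywords": "Saduria sabini",
        "Location": "Frobisher Bay site 9",
    }
    print(tags["Filename"])
    print("still to fill in:", missing_fields(tags))
```

A check like this is one way to stay consistent across a project, whatever fields a group decides to require.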
    33. 33. Good: added metadata tags can be as you like Bad: added metadata tags can be as you like Try to follow examples of others, e.g. IPTC, MWG, the DAM book (some rules exist, but most are open-ended) Example: Creator: unknown Posted on blogs since 2010 Was able to find it using the visible text in a Google search How would you tag this image? Title? Caption? Keyword? Make sure your metadata makes sense to users
    34. 34. Title: Rappahannock River,.... Description: (a literary quotation ?) Source: Mike Ashenfelder, 2011
    35. 35. Metadata: retaining & reading  Older or simpler software may be unaware  strip away camera metadata (capture date, etc.)  Not all image browsing software plays fair  Apple, Microsoft, and Google are all competing to make easy-to-use, popular tools  sometimes do hidden & proprietary processing ‘for your benefit’ (automatically), which may be to the detriment of ‘industry-standard’ metadata tags  recent examples: face recognition (all), geotagging (Windows Live), stripping of current tags (IPTC) with retired fields (Apple Aperture, iPhoto)
    36. 36. Metadata: summary for use  basic fields are easily read by most  advanced fields may be handy in projects  custom fields are available, but make sure your users are aware of their existence key lessons: 1) adopt a style and be consistent 2) let your users know what to expect 3) be vigilant for software behavior
    37. 37. 5. Image data workflows  can we do editing and tagging without worrying about how it works?  people want ‘recipes’, or workflows  see guide no. 2962 for some examples  image data protocol examples ○ case studies for different work scenarios in aquatic sciences  image data software examples ○ practical examples using software tools
    38. 38. Guide workflows: for discussion the guide is not a fixed set of rules  rather it is a list of suggestions from recent work  which ways of working may be easier (& better) than others?  source: XKCD
    39. 39. Image Data work it doesn’t have to hurt when using the right tools (adapted from The Oatmeal)
    40. 40. Claude has Image Data work it doesn’t have to hurt when using the right tools (adapted from The Oatmeal)
    41. 41. Quote overheard yesterday* “How do you love Photoshop? Like someone loves their wife,...or their cousin...or?” “I love Photoshop like people love their kids – no way to get rid of it, so I have to love it” *Macworld Podcast – Less than Perfect: App Design
    42. 42. Image data work: a tale of 2 tools  Adobe Photoshop (PS)....20+ years  classic tool for editing and....everything! ○ most folks only use it for a few tasks  Adobe Lightroom (LR)....5+ years  revolutionary workflow tool, now matured ○ ‘95%’ of my photo work is now done inside LR ○ extra functions available with shareware plugins Newsflash! Jan. 2012 – LR Public Beta 4: video editing, geotagging, photobooks
    43. 43. Image data work: managing tools  Browsers: ‘find’ your images on a workstation  Windows Explorer (default – very limited)  Google Picasa (easy, basic, free)  Photoshop Bridge (full browser & metadata editor)  Photoshop Elements Organizer (new: object searching)  Cataloger & image editor  Adobe Lightroom (workstation, not network use)  Catalogers  Phase One Media Pro (workstation; free catalog reader)  Daminion, Canto Cumulus (network/server) demonstrations this afternoon (bring your laptop)
    44. 44. 6. Exporting: final work stages After capturing, tagging, editing images, we want to:  store the originals & edits (archiving)  distribute copies (publishing)
    45. 45. Exporting: archives  Ideally, this is about final edits in best quality, with metadata tags, stored securely in multiple locations and on multiple media  This is an area that NIDM is working on: how to consolidate and preserve. Large projects are likely in good shape, but smaller ones may need advice  3-2-1 approach is recommended (Krogh)  have 3 copies (original & 2 backups in rotation)  store on 2 kinds of media (hard drive, DVD)  keep 1 off-site (not all stored in the same place)
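The 3-2-1 rule is simple enough to check mechanically. The sketch below is a hypothetical illustration only, not a backup tool: it takes a hand-written inventory of copies and lists which of the three conditions are not met.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveCopy:
    label: str      # free-text name for the copy
    medium: str     # e.g. "hard drive", "DVD"
    offsite: bool   # stored away from the main workplace?


def check_321(copies) -> list:
    """Return the 3-2-1 conditions that this set of copies fails."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than 3 copies")
    if len({c.medium for c in copies}) < 2:
        problems.append("fewer than 2 kinds of media")
    if not any(c.offsite for c in copies):
        problems.append("no off-site copy")
    return problems


if __name__ == "__main__":
    inventory = [
        ArchiveCopy("working original", "hard drive", offsite=False),
        ArchiveCopy("backup A", "hard drive", offsite=False),
        ArchiveCopy("backup B", "DVD", offsite=True),
    ]
    print(check_321(inventory) or "3-2-1 satisfied")
```

An empty result means all three conditions hold for the inventory as written; it says nothing about whether the copies are readable, which still needs periodic checking.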
    46. 46. Exporting: galleries & print  may send re-sized versions:  an 800-pixel, 72 dpi JPG is fine for web galleries, and especially for email  more pixels, at 150-300 dpi, for print (the density is important for clear prints)  for public viewing on web, review the file metadata & edit if desired  location, names, comments may be seen  edit in DAM (Bridge, MediaPro, LR)
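For the web export, the new pixel dimensions follow from scaling the longer edge to the target size while keeping the aspect ratio. The sketch below only does the arithmetic and assumes that '800 pixel' refers to the longer edge, which the slide does not specify. The function name is hypothetical, and the actual resampling would be done by the export tool.

```python
def fit_long_edge(width: int, height: int, long_edge: int = 800) -> tuple[int, int]:
    """Dimensions after shrinking so the longer edge equals `long_edge`.

    Images already smaller than the target are left unchanged (no upsizing).
    """
    scale = min(1.0, long_edge / max(width, height))
    return round(width * scale), round(height * scale)


if __name__ == "__main__":
    for size in ((1600, 1200), (2600, 1900), (640, 480)):
        print(size, "->", fit_long_edge(*size))
```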
    47. 47. Publishing – web CaRMS (Canadian Register of Marine Species) - online taxonomic resource with editors - also has a user-added image gallery - see Kennedy et al. Tech. Report - note: camera metadata is visible (figure labels: added on website; camera metadata)
    48. 48. Publishing – web  DFO has several image gallery projects  Coast Guard, SLGO, CaRMS, CMB, others?  Groups may join a large, existing gallery  Flickr is very popular and does some metadata  used by EOL, BHL, GBIF
    49. 49. 7. Trends: new camera types  before, chose either a digicam or an SLR  small device & average images, or big rig & great images  was a demand for quality and compact at same time  ‘mirrorless interchangeable lens’: MILC  Panasonic, Olympus, Sony, Nikon, Pentax  2011: new disruptive trends in compacts  ‘retro-style’: Olympus Pen, Fujifilm X100, X10...  ‘ultra-modern’ camera phones: iPhone 4S  2012: light field (Lytro) – ‘refocus anytime’
    50. 50. tiny-sensor fixed compact large-sensor fixed compact light field (Lytro) large-sensor MILC
    51. 51. New tech: changing the game  editing software  new camera types  high-sensitivity sensors (lowlight)  solid-state memory (‘flash’)  cheap hard drives  network storage (‘cloud’, e.g., Dropbox)  tablets & tactile displays (iPad, Cintiq)  not just fashion: new types may lead to better image data and much improved workflow (easier & faster)
    52. 52. New tech: science benefits  lowlight sensors: reduce need to carry lights  fewer noisy, blurry (slow shutter) shots  compacts: easier to carry & use  capture events more often in the field  SSD: insensitive to ship vibration, magnets  use on underwater towsleds, aerial surveys  large drives: save all, do backups  don’t bother to delete or waste $$ time reviewing  cloud services: share files with colleagues  don’t burden email with huge attachments  tablets: field guides, rapid data entry & review
    53. 53. Newer is not always better  Late 2011 DPReview test: indoor photo w/flash  Pentax had a long line of WP cameras, but recent models are not good indoors  Sony & Panasonic are new entrants, but are giving much better files (figure labels: clean detail; mushy when indoors; clean detail)
    54. 54. Teleost Aug. 2011 new Pentax Optio when used indoors: mushy photo—hard to identify Canon Powershot: clean detail, easier to identify small organisms
    55. 55. Resources – websites  The Luminous Landscape – practical opinions  The DAM Book forum – “real DAM answers”  – best practices & workflows  JISC Digital Media – advice & examples  Digital Photography Review  WHOI HabCam – underwater photo  SERPENT project – underwater video  CaRMS Photogallery – species images
    56. 56. Resources – books  The DAM Book, 2nd edition, Krogh  Photoshop CS5 and Lightroom 3: A Photographer’s Handbook, Laskevitch  Adobe Photoshop Lightroom 3: the missing FAQ, Brampton  Photographic Multishot Techniques, Steinhoff & Steinhoff  The VueScan Bible, Steinhoff  On Digital Photography, Johnson
    57. 57. Resources – documents (PDF)  GBIF Community Site: Best Practices Manuals  Federal Agencies Digitization Guidelines Initiative (FADGI), Still Image Working Group  Metadata Working Group (MWG)  IPTC Image Metadata Handbook  Establishing best practices for marine biological data, Seeley et al. 2008, COWRIE  CaRMS photogallery user guide, Kennedy et al. 2011. DFO Tech. Rep. 2933
    58. 58. Resources – software utilities  Ingestamatic, Photographer’s Toolbox, JFriedl’s Lightroom Goodies, Photo Mechanic, DVMP, CatDV, RoboGEO, Cineform, Clipwrap, Helicon Focus, CDFinder, CDWinder, NIS Elements  freeware: Picasa, ImageJ, VLC, VARS, ExifTool, IrfanView, Zooscan, Shotwell, Handbrake, Contour Storyteller, MPEG Streamclip
    59. 59. Obj. 2: learning – Sault Ste. Marie  Otolith microscopy w/Image Pro (5 MP)  good file naming, 3-2-1 storage; might try tagging  Scanning historical slides of activities (size?..)  all notes are entered in filename – need to rethink this  Underwater video for lamprey control (volume?)  proprietary DVR: take video feed over RCA & capture  Underwater dam inspection using a 2 m pole  want live view & record; suggest using 2 different cameras  Photo folder on local server (8 GB)  do temp. catalog to browse, then do perm. catalog
    60. 60. Obj. 2: learning – Nanaimo  Otoliths: want to overlay 2 images, & dots  need 3rd party tools (Photoshop, ImageJ)  Import prior analyses (keywords into LR)  LR plugin (Syncomatic: based on filenames)  Reading catalog without full software  LR: not usually. ExMedia/Media Pro: Yes  Can we use alternative ingestion (import) tools?  Yes, Photo Mechanic, Ingestamatic may be useful for high volume, batch file entry (e.g., marine mammal surveys)  Easy way to get started and using tools like LR?  Various resources – our guide is an example, but we still need a forum or other place to post experiences and tips
    61. 61. Obj. 2: learning – Burlington Q’s  does DFO have a site licence for this software?  how to distribute a catalog on the network?  can I use custom annotation fields in a catalog?  what kind of scanner to archive histology slides?  Flowcam produces a composite of plankton shots in sample: how to manage?  Nikon imaging microscope produces custom files – how to manage?  How to transfer hierarchical folder names into annotation fields?  ...and more!
    62. 62. Obj. 2: learning – St. Andrews  geomatics & video lab: of screens & mice  had a quality monitor, but not great for viewing charts or when using a mouse and keyboard to trace habitats at same time as viewing images  solution: use the right display for different work: 1) HDTV for video (1920x1080 pixels) 2) 27in NEC for photos (2560 x1600 pixels) 3) 24in tactile display (Wacom Cintiq) for tracing habitat classifications—more efficient
    63. 63. Obj. 2: learning – St. Andrews  reusing legacy & custom equipment  big HDV camcorder, with $30K UW housing ○ don’t want to buy a new camera & $$$ housing  solution: HDMI video out to flash memory cards  result: instant digital video (no tape playback to import), and higher quality (original video capture, not compressed to fit HDV tape)
    64. 64. Obj. 2: learning – St. John’s  need a place to obtain and learn more  want workshops, website forums....CMB?...  exchange files with remote fisher. observers  receive and send feedback on species ID images  cloud computing seen as a solution (Dropbox)  have to enable software updates  older software versions (>3 yrs.) are not aware of current metadata and image file standards  want access to image files for regional guides  other regions may do ID books, want to do it here