Recent advances in visual information retrieval marques klu june 2010

Technical colloquium at Klagenfurt University, June 11, 2010.


  1. Oge Marques, Florida Atlantic University, Boca Raton, FL, USA
  2. VIR is a highly interdisciplinary field, but … [Diagram labels: Image and Video Processing; (Multimedia) Database Systems; Information Retrieval; Machine Learning; Visual Information Retrieval; Computer Vision; Visual Data Modeling and Representation; Human Visual Perception; Data Mining] Klagenfurt - June 2010
  3. There are many things that I believe… but cannot prove
  4. The “big mismatch”
  5. • Part I
       ◦ 10 years after the “end of the early years”
         – Where are we now?
     • Part II
       ◦ Medical image retrieval
         – Challenges and opportunities
     • Part III
       ◦ Where is VIR headed?
         – Advice for young researchers
  6. It’s been 10 years since the “end of the early years” [Smeulders et al., 2000]
     ◦ Are the challenges from 2000 still relevant?
     ◦ Are the directions and guidelines from 2000 still appropriate?
  7. Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]:
     ◦ Driving forces
       – “[…] content-based image retrieval (CBIR) will continue to grow in every direction: new audiences, new purposes, new styles of use, new modes of interaction, larger data sets, and new methods to solve the problems.”
  8. • Yes, we have seen many new audiences, new purposes, new styles of use, and new modes of interaction emerge.
     • Each of these usually requires new methods to solve the problems that it brings.
     • However, not many researchers see them as a driving force (as they should).
  9. Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]:
     ◦ Heritage of computer vision
       – “An important obstacle to overcome […] is to realize that image retrieval does not entail solving the general image understanding problem.”
  10. I’m afraid I have bad news…
      ◦ Computer vision hasn’t made much progress during the past 10 years.
      ◦ Some classical problems (including image understanding) remain unresolved.
      ◦ Similarly, CBIR from a pure computer vision perspective didn’t work too well either.
  11. Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]:
      ◦ Influence on computer vision
        – “[…] CBIR offers a different look at traditional computer vision problems: large data sets, no reliance on strong segmentation, and revitalized interest in color image processing and invariance.”
  12. • The adoption of large data sets became standard practice in computer vision (see Torralba’s work).
      • No reliance on strong segmentation (still unresolved) → new areas of research, e.g., automatic region-of-interest (ROI) extraction and region-based image retrieval (RBIR).
      • Color image processing and color descriptors became incredibly popular, useful, and (to some degree) effective.
      • Invariance is still a huge problem
        ◦ But it’s cheaper than ever to have multiple views.
  13. Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]:
      ◦ Similarity and learning
        – “We make a pledge for the importance of human-based similarity rather than general similarity. Also, the connection between image semantics, image data, and query context will have to be made clearer in the future.”
        – “[…] in order to bring semantics to the user, learning is inevitable.”
  14. • The authors were pointing in the right direction (human in the loop, role of context, benefits from learning, …)
      • However:
        ◦ Similarity is a tough problem to crack and model.
          – Even the understanding of how humans judge image similarity is very limited.
        ◦ Machine learning is almost inevitable…
          – … but sometimes it can be abused.
  15. Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]:
      ◦ Interaction
        – Better visualization options, more control to the user, ability to provide feedback […]
  16. • Significant progress on visualization interfaces and devices.
      • Relevance feedback: still a very tricky tradeoff (effort vs. perceived benefit), but more popular than ever (rating, thumbs up/down, etc.)
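To make the relevance feedback idea on slide 16 concrete, here is a minimal sketch of how thumbs up/down votes could refine a query. It uses the classic Rocchio update, which is an assumption chosen for illustration and is not named in the talk. The feature vectors, the function name `rocchio_update`, and the weights `alpha`, `beta`, `gamma` are all hypothetical.

```python
def rocchio_update(query, liked, disliked, alpha=1.0, beta=0.75, gamma=0.15):
    """Move a query vector toward liked items and away from disliked ones.

    query, and every vector in liked/disliked, is a list of floats of equal length.
    The weights are conventional illustrative defaults, not values from the talk.
    """
    dims = len(query)

    def centroid(vectors):
        # Mean vector; an empty set of votes contributes nothing.
        if not vectors:
            return [0.0] * dims
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]

    pos = centroid(liked)
    neg = centroid(disliked)
    return [alpha * query[i] + beta * pos[i] - gamma * neg[i] for i in range(dims)]


# Toy 2-D feature space: the user liked two images and disliked one.
query = [0.5, 0.5]
liked = [[1.0, 0.0], [0.8, 0.2]]
disliked = [[0.0, 1.0]]
new_query = rocchio_update(query, liked, disliked)
```

The effort vs. benefit tradeoff shows up directly: each extra vote only nudges the centroid, so a handful of ratings may change the ranking very little.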
  17. Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]:
      ◦ Need for databases
        – “The connection between CBIR and database research is likely to increase in the future. […] problems like the definition of suitable query languages, efficient search in high dimensional feature space, search in the presence of changing similarity measures are largely unsolved […]”
  18. Very little progress
      ◦ Image search and retrieval has benefited much more from document information retrieval than from database research.
  19. Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]:
      ◦ The problem of evaluation
        – CBIR could use a reference standard against which new algorithms could be evaluated (similar to TREC in the field of text retrieval).
        – “A comprehensive and publicly available collection of images, sorted by class and retrieval purposes, together with a protocol to standardize experimental practices, will be instrumental in the next phase of CBIR.”
  20. Significant progress on benchmarks, standardized datasets, etc.
      ◦ ImageCLEF
      ◦ Pascal VOC Challenge
      ◦ MSRA dataset
      ◦ SIMPLIcity dataset
      ◦ UCID dataset and ground truth (GT)
      ◦ Accio / SIVAL dataset and GT
      ◦ Caltech 101, Caltech 256
      ◦ LabelMe
  21. Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]:
      ◦ Semantic gap and other sources
        – “A critical point in the advancement of CBIR is the semantic gap, where the meaning of an image is rarely self-evident. […] One way to resolve the semantic gap comes from sources outside the image by integrating other sources of information about the image in the query.”
  22. • The semantic gap problem has not been solved (and maybe never will be…)
      • What are the alternatives?
        1. Treat visual similarity and semantic relatedness differently
           – Examples: Alipr, Google similarity search, etc.
        2. Improve both (text-based and visual) search methods independently
        3. Trust the user
           – CFIR, collaborative filtering, crowdsourcing, games.
  23. • Challenges
        ◦ We’re entering a new country…
          – How much can we bring?
          – Do we speak the language?
          – Do we know their culture?
          – Do they understand us and where we come from?
      • Opportunities
        ◦ They use images (extensively)
        ◦ They have expert knowledge
        ◦ Domains are narrow (almost by definition)
        ◦ Fewer clients, but potentially more $$
  24. An overview of the challenges
      ◦ Different terminology
      ◦ Standards (e.g., DICOM)
      ◦ Modality dependencies
      ◦ Equipment dependencies
      ◦ Privacy issues
      ◦ Proprietary data
      ◦ A tough sell?
  25. Be prepared for:
      ◦ New acronyms
        – CBMIR (Content-Based Medical Image Retrieval)
        – PACS (Picture Archiving and Communication System)
        – DICOM (Digital Imaging and Communications in Medicine)
        – HIS (Hospital Information Systems)
        – RIS (Radiological Information Systems)
      ◦ New phrases
        – Imaging informatics
      ◦ Lots of technical medical terms
  26. DICOM (http://medical.nema.org/)
      ◦ Global IT standard, created in 1993, used in virtually all hospitals worldwide.
      ◦ Designed to ensure the interoperability of different systems and manage related workflow.
      ◦ Will be required by all EHR systems that include imaging information as an integral part of the patient record.
      ◦ 750+ technical and medical experts participate in 20+ active DICOM working groups.
      ◦ Standard is updated 4-5 times per year.
      ◦ Many available tools! (see http://www.idoimaging.com/)
  27. The IRMA code [Lehmann et al., 2003]
      ◦ 4 axes with 3 to 4 positions, each in {0,...,9,a,...,z}, where "0" denotes "unspecified" to determine the end of a path along an axis.
        – Technical code (T) describes the imaging modality
        – Directional code (D) models body orientations
        – Anatomical code (A) refers to the body region examined
        – Biological code (B) describes the biological system examined.
  28. The IRMA code [Lehmann et al., 2003]
      ◦ The entire code results in a character string of <14 characters (IRMA: TTTT – DDD – AAA – BBB).
      ◦ Example: “x-ray, projection radiography, analog, high energy – sagittal, left lateral decubitus, inspiration – chest, lung – respiratory system, lung”
      Source: [Lehmann et al., 2003]
  29. The IRMA code [Lehmann et al., 2003]
      ◦ The companion tool…
      Source: [Lehmann et al., 2004]
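The structure of the IRMA code described on slides 27 and 28 can be sketched as a small parser. This is only an illustration built from the slide text (four axes, TTTT-DDD-AAA-BBB, positions in 0-9 or a-z, "0" ending the path along an axis); it is not the IRMA project's own tooling. The function name `parse_irma` and the sample string are hypothetical, and no meaning is claimed for the sample's individual digits.

```python
import re

# Axis letter, descriptive name, and number of positions, as listed on the slides.
AXES = (
    ("T", "technical", 4),
    ("D", "directional", 3),
    ("A", "anatomical", 3),
    ("B", "biological", 3),
)
_POSITIONS = re.compile(r"^[0-9a-z]+$")


def parse_irma(code):
    """Split an IRMA-style string into its four axes and hierarchical paths."""
    parts = [p.strip() for p in code.lower().split("-")]
    if len(parts) != len(AXES):
        raise ValueError("expected 4 axes separated by '-'")
    parsed = {}
    for part, (letter, name, length) in zip(parts, AXES):
        if len(part) != length or not _POSITIONS.match(part):
            raise ValueError(f"axis {letter} must be {length} characters in 0-9 or a-z")
        # "0" means "unspecified": everything before the first 0 is the path.
        specified = part.split("0", 1)[0]
        path = [specified[:i] for i in range(1, len(specified) + 1)]
        parsed[name] = {"code": part, "path": path}
    return parsed


# Hypothetical sample string, used only to exercise the parser.
example = parse_irma("1121-127-720-500")
```

Each `path` lists the successively more specific prefixes along one axis, so two images can be compared at whatever level of detail both codes specify.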
  30. • Most current retrieval systems in clinical use rely on text keywords such as DICOM header information to perform retrieval.
      • CBIR has been widely researched in a variety of domains and provides an intuitive and expressive method for querying visual data using features, e.g., color, shape, and texture.
      • Current CBIR systems:
        ◦ are not easily integrated into the healthcare environment;
        ◦ have not been widely evaluated using a large dataset; and
        ◦ lack the ability to perform relevance feedback to refine retrieval results.
      Source: [Hsu et al., 2009]
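As a rough illustration of querying visual data by a color feature, the sketch below ranks images by the similarity of their quantized RGB histograms. Histogram intersection is an assumption picked for simplicity, not a method attributed to [Hsu et al., 2009]. Images are represented as plain lists of (r, g, b) tuples, and all names and pixel values are hypothetical.

```python
def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram with `bins` levels per channel."""
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each 0-255 channel value to a bin index in 0..bins-1.
        idx = ((r * bins // 256) * bins + (g * bins // 256)) * bins + (b * bins // 256)
        hist[idx] += 1.0
    total = len(pixels)
    return [h / total for h in hist] if total else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for two normalized histograms; 1 means identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def rank_by_color(query_pixels, database, bins=4):
    """Return database image names ordered from most to least similar."""
    q = color_histogram(query_pixels, bins)
    scored = [
        (histogram_intersection(q, color_histogram(pixels, bins)), name)
        for name, pixels in database.items()
    ]
    return [name for score, name in sorted(scored, key=lambda t: t[0], reverse=True)]


# Tiny made-up "images": a mostly red query against a red and a blue image.
query = [(250, 10, 10)] * 3 + [(10, 10, 250)]
database = {
    "ocean": [(5, 5, 240)] * 4,
    "sunset": [(240, 20, 20)] * 4,
}
ranking = rank_by_color(query, database)
```

A global histogram like this ignores shape, texture, and spatial layout, which is one reason a single low-level feature rarely suffices in medical settings.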
  31. CBMIR is still a relatively small dot on the map of the medical imaging community.
      Source: Program of SPIE Medical Imaging 2010 Multiconference
  32. New gaps!
      ◦ Just when you thought the semantic gap was your only problem…
      Source: [Deserno, Antani, and Long, 2009]
  33. • USA
        ◦ NIH (National Institutes of Health)
          – NIBIB - National Institute of Biomedical Imaging and Bioengineering
          – NCI - National Cancer Institute
          – NLM - National Library of Medicine
        ◦ Several universities and hospitals
      • Europe
        ◦ Aachen University (Germany)
        ◦ Geneva University (Switzerland)
      • Big companies (Siemens, GE, etc.)
  34. IRMA (Image Retrieval in Medical Applications)
      ◦ Aachen University (Germany)
        – http://ganymed.imib.rwth-aachen.de/irma/
      ◦ 3 online demos:
        – IRMA Query demo: allows the evaluation of CBIR on several databases.
        – IRMA Extended Query Refinement demo: CBIR from the IRMA database (a subset of 10,000 images).
        – Spine Pathology and Image Retrieval System (SPIRS), designed by the NLM/NIH (USA): holds information on ~17,000 spine x-rays.
  35. MedGIFT (GNU Image Finding Tool)
      ◦ Geneva University (Switzerland)
        – http://www.sim.hcuge.ch/medgift/
      ◦ Large effort, including projects such as:
        – Talisman (lung image retrieval)
        – Case-based fracture image retrieval system
        – Onco-Media: medical image retrieval + grid computing
        – ImageCLEF: evaluation and validation
        – medSearch
  36. WebMIRS
      ◦ NIH / NLM (USA)
        – http://archive.nlm.nih.gov/proj/webmirs/index.php
      ◦ Query by text + navigation by categories
      ◦ Uses datasets and related x-ray images from the National Health and Nutrition Examination Survey (NHANES)
  37. SPIRS (Spine Pathology & Image Retrieval System): Web-based image retrieval system for large biomedical databases
      ◦ NIH / UCLA (USA)
      ◦ Great case study on highly specialized CBMIR
      Source: [Hsu et al., 2009]
  38. National Biomedical Imaging Archive (NBIA)
      ◦ NCI / NIH (USA)
        – https://imaging.nci.nih.gov/
      ◦ Search based on metadata (DICOM fields)
      ◦ 3 search options:
        – Simple
        – Advanced
        – Dynamic
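Metadata-based search of the kind slide 38 describes can be approximated with a filter over header fields. The sketch below assumes the DICOM headers have already been read into plain dictionaries keyed by standard attribute keywords (`Modality`, `BodyPartExamined`, `StudyDescription`); the `file` key, the sample records, and the function names are hypothetical, and this is not how NBIA itself is implemented.

```python
def matches(record, criteria):
    """True if every criterion is a case-insensitive substring of the field value."""
    for keyword, wanted in criteria.items():
        value = str(record.get(keyword, "")).lower()
        if wanted.lower() not in value:
            return False
    return True


def search_headers(records, **criteria):
    """Return the records whose header fields satisfy all criteria."""
    return [record for record in records if matches(record, criteria)]


# Made-up header records standing in for parsed DICOM files.
records = [
    {"file": "img001.dcm", "Modality": "CT", "BodyPartExamined": "CHEST",
     "StudyDescription": "Chest CT with contrast"},
    {"file": "img002.dcm", "Modality": "MR", "BodyPartExamined": "BRAIN",
     "StudyDescription": "Brain MRI"},
    {"file": "img003.dcm", "Modality": "CR", "BodyPartExamined": "CHEST",
     "StudyDescription": "Chest x-ray, two views"},
]
hits = search_headers(records, Modality="CT", BodyPartExamined="chest")
```

Because the search never looks at pixel data, its quality depends entirely on how completely and consistently the header fields were filled in.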
  39. ARRS GoldMiner
      ◦ American Roentgen Ray Society (USA)
        – http://goldminer.arrs.org/
      ◦ Query by text
      ◦ Results can be filtered by:
        – Modality
        – Age
        – Sex
  40. Yottalook Images
      ◦ iVirtuoso (USA)
        – http://www.yottalook.com/
      ◦ Developed and maintained by four radiologists
      ◦ Query by text
      ◦ Claims to use 4 “core technologies”:
        – “natural query analysis”
        – “semantic ontology”
        – “relevance algorithm”
        – a specialized content delivery system that provides high-yield content based on the search term.
  41. ImageCLEF Medical Image Retrieval 2010
      http://www.imageclef.org/2010/medical
      ◦ Data set: 77,000 images from articles published in Radiology and RadioGraphics, including the text of the captions and a link to the HTML of the full-text articles.
      ◦ 3 types of tasks:
        – Modality classification: given an image, return its modality (MR, CT, XR, etc.)
        – Ad-hoc retrieval: classic medical retrieval task, with 3 “flavors”: textual, mixed, and semantic queries
        – Case-based retrieval: retrieve cases including images that might best suit the provided case description.
  42. Better user interfaces, which are responsive, highly interactive, and capable of supporting relevance feedback.
      ◦ In other words, address the “Performance Gap Category” and the “Usability Gap Category”.
  43. • New applications of CBMIR, including:
        ◦ Teaching
        ◦ Research
        ◦ Diagnosis
        ◦ PACS and Electronic Patient Records
      • CBMIR evaluation using medical experts
      • Integration of local and global features
  44. New descriptors
      ◦ Example: the Fuzzy Rule Based Compact Composite Descriptor (CCD), which includes global image features capturing both brightness and texture characteristics in a 1D histogram [Chatzichristofis & Boutalis, 2009]
  45. Partial match schemes
      Source: [Hsu et al., 2009]
  46. New devices (e.g., iPad)
  47. Advice for [young] researchers
      ◦ In this last part, I’ve compiled bits and pieces of advice that I believe might help researchers who are entering the field.
      ◦ They focus on research avenues that I personally consider to be the most promising.
  48. LOOK…
      ◦ at yourself (how do you search for images and videos?)
      ◦ around (related areas and how they have grown)
      ◦ at Google (and other major players)
  49. • Which sites do you use?
        ◦ Why?
      • Which search options do you use?
        ◦ What do you do when the returned results aren’t good?
      • What is the single most useful feature that you wish those sites had?
      • What are your intentions and how do you express them?
  50. • Semi-automatic image annotation
      • Tag recommendation systems
      • Story annotation engines
      • Content-based image filtering
      • Copyright detection
      • Watermark detection
        ◦ and many more
  51. • Google Similarity Search (VisualRank) [Jing & Baluja, 2008]
      • Google Goggles (mobile visual search)
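The idea behind VisualRank, as commonly summarized, is to apply PageRank-style ranking to a graph whose edges carry visual similarity between images. The sketch below is a simplified reading of that idea, not the implementation from [Jing & Baluja, 2008]: it runs power iteration on a column-normalized similarity matrix with uniform damping. The matrix values, the function name, and the damping factor of 0.85 are illustrative assumptions.

```python
def visual_rank(similarity, damping=0.85, iterations=50):
    """PageRank-style scores for images, given a square similarity matrix.

    similarity[i][j] is a non-negative visual similarity between images i and j.
    """
    n = len(similarity)
    # Column sums are used to turn each column into a probability distribution.
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new_rank = []
        for i in range(n):
            incoming = sum(
                similarity[i][j] / col_sums[j] * rank[j]
                for j in range(n)
                if col_sums[j] > 0
            )
            new_rank.append(damping * incoming + (1.0 - damping) / n)
        rank = new_rank
    return rank


# Made-up similarities: images 0-2 resemble each other, image 3 is an outlier.
similarity = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]
scores = visual_rank(similarity)
```

Images that many other images resemble accumulate score, so an outlier that matched the text query by accident tends to sink in the ranking.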
  52. THINK…
      ◦ mobile devices
      ◦ new devices and services
      ◦ social networks
      ◦ games
  53. • Google Goggles understands narrow-domain search and retrieval
      • Several other apps for iPhone, iPad, and Android (e.g., kooaba and Fetch!)
  54. • Flickr (b. 2004)
      • YouTube (b. 2005)
      • Flip video cameras (b. 2006)
      • iPhone (b. 2007)
      • iPad (b. 2010)
  55. The Web 2.0 has brought about:
      ◦ New data sources
      ◦ New usage patterns
      ◦ New understanding about the users, their needs, habits, preferences
      ◦ New opportunities
      ◦ Lots of metadata!
      ◦ A chance to experience a true paradigm shift
        – Before: image annotation is tedious, labor-intensive, expensive
        – After: image annotation is fun!
  56. ◦ Google Image Labeler
      ◦ Games with a purpose (GWAP):
        – The ESP Game
        – Squigl
        – Matchin
  57. UNDERSTAND…
      ◦ human intentions
      ◦ human emotions
      ◦ user’s preferences and needs
  58. CREATE…
      ◦ better interfaces
      ◦ better user experience
      ◦ new business opportunities (added value)
  59. Image Genius (sponsored by FAU / will become startup)
  60. Fully functional online prototype of a medical image retrieval system (MEDIX) with DICOM capabilities
  61. Unsupervised ROI extraction from an image (by Gustavo B. Borba, UTFPR, Brazil)
  62. I believe (but cannot prove…) that successful VIR solutions will:
      • combine content-based image retrieval (CBIR) with metadata (high-level semantic-based image retrieval)
      • only be truly successful in narrow domains
      • include the user in the loop
        – Relevance Feedback (RF)
        – Collaborative efforts (tagging, rating, annotating)
      • provide friendly, intuitive interfaces
      • incorporate results and insights from cognitive science, particularly human visual attention, perception, and memory
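One simple way to picture combining CBIR with metadata, as slide 62 suggests, is late fusion: score each image separately by visual content and by text or metadata, then blend the two scores. The weighted sum below is an assumption made for illustration; the talk does not prescribe a fusion method, and the scores, names, and `weight` parameter are hypothetical.

```python
def fuse_scores(content_scores, text_scores, weight=0.5):
    """Rank image names by a weighted sum of content and text scores.

    Both inputs map image name -> score in [0, 1]; a missing score counts as 0.
    `weight` is the share given to the content-based score.
    """
    names = set(content_scores) | set(text_scores)
    fused = {
        name: weight * content_scores.get(name, 0.0)
        + (1.0 - weight) * text_scores.get(name, 0.0)
        for name in names
    }
    # Highest fused score first; ties are broken alphabetically for stability.
    return sorted(fused, key=lambda name: (-fused[name], name))


# Made-up scores: "a" looks right but has no metadata, "b" is decent on both.
content = {"a": 0.9, "b": 0.4, "c": 0.1}
text = {"b": 0.9, "c": 0.6}
ranking = fuse_scores(content, text)
```

In a narrow domain the weight could be tuned per query type, for example leaning on metadata when the header fields are reliable and on content when they are sparse.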
  63. • “Image search and retrieval” is not a problem, but rather a collection of related problems that look like one.
      • There is a great need for good solutions to specific problems.
      • 10 years after “the end of the early years”, research in visual information retrieval still has many open problems, challenges, and opportunities.
  64. Questions? omarques@fau.edu
