A brief introduction to extracting information from images
Jonathon Hare
University of Southampton

Contents
  • What can images tell us?
  • How are images represented in digital computers?
  • How do we extract information from images?
  • Examples of some different extraction techniques
  • Analogies with text
  • Free software!

Images can… the main roles of images in the communications process
Attract attention and make documents more appealing
Convey opinions and emotional messages
Convey information for documenting a claim
Representation and Understanding: how a computer “sees”

Digital image representation
   87  91  85 ...
   86  86  81 ...
   88  85  84 ...
  ... ... ... ...

  137 145 144 ...
  153 150 137 ...
  148 139 123 ...
  ... ... ... ...

   89  91  89 ...
   84  88  90 ...
   88  87  90 ...
  ... ... ... ...

Understanding an image

Feature extraction: f(x)
  • Feature extraction is the process of extracting “descriptors” from an image.
  • Descriptors describe some aspect of the image content.
  • Typically, a descriptor is a numerical vector called a “feature vector”; however, other forms of descriptor are possible.
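To make the idea of f(x) concrete, here is a minimal Python sketch that stores a tiny image patch as grids of numbers and maps it to a feature vector. It assumes that the three grids on the representation slide are three channels of the same patch (the slide does not label them), and the names pixel_at and mean_value are illustrative only.

```python
# Three 3x3 grids copied from the representation slide ("..." parts omitted).
# ASSUMPTION: each grid is one channel of the same small image patch.
channel_a = [[87, 91, 85], [86, 86, 81], [88, 85, 84]]
channel_b = [[137, 145, 144], [153, 150, 137], [148, 139, 123]]
channel_c = [[89, 91, 89], [84, 88, 90], [88, 87, 90]]


def pixel_at(row, col):
    """Return the three channel values stored for one pixel."""
    return (channel_a[row][col], channel_b[row][col], channel_c[row][col])


def mean_value(channel):
    """Average of all values in one channel."""
    values = [v for row in channel for v in row]
    return sum(values) / len(values)


# A very simple f(x): one mean value per channel, giving a 3-element
# numerical feature vector that describes the patch as a whole.
feature_vector = [mean_value(c) for c in (channel_a, channel_b, channel_c)]
print(pixel_at(0, 0), feature_vector)
```

Real descriptors are far richer than a per-channel mean, but they share this shape: an image goes in and a vector of numbers comes out.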
Image feature morphology
  Higher-level features
    • Directly interpretable by humans
    • e.g. the number of faces in the image
    • Either hand-crafted or trained with machine learning techniques
  Lower-level features
    • Much more abstract; convey a notion of the image content
    • e.g. the colour distribution of the image
Example High-Level Features: faces, composition & photoshop disasters

High-level features: face detection
  • The detection of faces in an image is a very useful feature for inferring information about an image
  • Face detection is the first step of face recognition
  • The most popular face detection algorithm is the “Viola-Jones” detector
      • Conceptually simple
      • Uses machine learning; requires training (slow)
      • Very fast detection

Viola-Jones face detection
  • Bank of filters
  • Consider all possible position, scale and type parameters (very large numbers of features)
  • For each feature create a simple (weak) binary classifier (a stump)
  • Use AdaBoost to select the informative features
  P. Viola, M. Jones, Robust Real-Time Face Detection, IJCV, Vol. 57(2), 2004. (first version appeared at CVPR 2001)
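The building blocks listed above can be sketched in a few lines of Python. The sketch below shows an integral image (the summed-area table the Viola-Jones paper uses to evaluate rectangle filters quickly), one horizontal two-rectangle filter, and a decision stump. It is an illustration under simplifying assumptions, not the authors' implementation: the function names are made up, only one filter type is shown, and the AdaBoost selection step is omitted.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Sum of the left half minus sum of the right half of a window."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))


def stump(feature_value, threshold, polarity=1):
    """Weak binary classifier: a single threshold on a single feature."""
    return 1 if polarity * feature_value < polarity * threshold else 0


# Dark left half, bright right half.
patch = [[10, 10, 200, 200],
         [10, 10, 200, 200]]
ii = integral_image(patch)
value = two_rect_feature(ii, 0, 0, 2, 4)
print(value, stump(value, threshold=0))
```

Because every rectangle sum costs four lookups regardless of its size, the very large number of position, scale and type combinations stays cheap to evaluate at detection time.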
High-level features: Composition
  • Photographers use the “rule of thirds” to improve the composition of their photos.
  • The basic idea is to place main subjects at roughly one-third of the horizontal or vertical dimension of the photograph.

High-level features: Composition
  • It is possible to design features that detect whether a photograph is composed according to the rule of thirds
  • Figure panels: image; saliency map; segments + saliency map
  • Feature terms: distance to closest power-point; area of segment * saliency of segment
  Che-Hua Yeh, Yuan-Chen Ho, Brian A. Barsky, and Ming Ouhyoung. "Personalized Photograph Ranking and Selection System". In ACM Multimedia 2010, pages 211–220, October 2010.
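As a rough illustration of such a feature, the Python sketch below computes the distance from a segment centroid to the closest power-point (the four intersections of the one-third lines) and averages it over segments, weighted by area * saliency. How the cited paper actually combines these terms is not stated on the slide, so the weighted average, the normalisation by the image diagonal, and all function names are assumptions.

```python
import math


def power_points(width, height):
    """The four intersections of the horizontal and vertical third lines."""
    xs = (width / 3, 2 * width / 3)
    ys = (height / 3, 2 * height / 3)
    return [(x, y) for x in xs for y in ys]


def distance_to_closest_power_point(cx, cy, width, height):
    """Distance from (cx, cy) to the nearest power-point, as a fraction of the diagonal."""
    diagonal = math.hypot(width, height)
    return min(math.hypot(cx - px, cy - py)
               for px, py in power_points(width, height)) / diagonal


def thirds_score(segments, width, height):
    """Weighted mean distance; lower means closer to the rule of thirds.

    segments: list of (centroid_x, centroid_y, area, saliency) tuples.
    ASSUMPTION: the weight of each segment is area * saliency.
    """
    weights = [area * saliency for _, _, area, saliency in segments]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * distance_to_closest_power_point(cx, cy, width, height)
               for w, (cx, cy, _, _) in zip(weights, segments)) / total


# A salient subject on a power-point and a faint one in the image centre.
print(thirds_score([(100, 100, 50, 0.9), (150, 150, 20, 0.1)], 300, 300))
```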
High-level features: Tampering
  • A political advertisement for George W. Bush
  • Automatic cloning detection (“copy-move” forgery)
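The core idea behind copy-move detection can be sketched as a search for image blocks that occur more than once. The Python example below is a deliberately naive version that only finds exact duplicates in a greyscale grid; it is not the method used to produce the slide, and practical detectors compare robust block features so that cloned regions survive compression and retouching. The function name and parameters are hypothetical.

```python
from collections import defaultdict


def find_cloned_blocks(img, block=2, min_shift=2):
    """Return pairs of top-left positions whose block x block patches are identical.

    min_shift ignores pairs that overlap or sit right next to each other.
    """
    h, w = len(img), len(img[0])
    seen = defaultdict(list)
    for y in range(h - block + 1):
        for x in range(w - block + 1):
            patch = tuple(tuple(img[y + dy][x:x + block]) for dy in range(block))
            seen[patch].append((y, x))
    pairs = []
    for positions in seen.values():
        for i in range(len(positions)):
            for j in range(i + 1, len(positions)):
                (y1, x1), (y2, x2) = positions[i], positions[j]
                if max(abs(y1 - y2), abs(x1 - x2)) >= min_shift:
                    pairs.append((positions[i], positions[j]))
    return pairs


# A 4x6 image of distinct values with the top-left 2x2 patch cloned
# into the bottom-right corner.
img = [[1, 2, 3, 4, 5, 6],
       [7, 8, 9, 10, 11, 12],
       [13, 14, 15, 16, 1, 2],
       [19, 20, 21, 22, 7, 8]]
print(find_cloned_blocks(img))
```

Flat regions such as sky produce many identical blocks, so a real system also needs to filter matches, for example by requiring many block pairs that share the same displacement.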
Example Low-Level Features: colour histograms, segments and SIFT

Low-level features: Global
  • Global features describe the content of an entire image
  • One of the simplest global features is the “Global RGB Colour Histogram”
  • Quantise each pixel into one of a fixed number of discrete colours, then count the pixels of each colour to build a histogram.
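A global RGB colour histogram can be sketched in Python as follows. The choice of 4 bins per channel (64 colours in total) and the normalisation to proportions are assumptions made for illustration; the slide does not fix either.

```python
def rgb_histogram(pixels, bins_per_channel=4):
    """Quantise 8-bit (r, g, b) pixels and return a normalised histogram."""
    hist = [0] * bins_per_channel ** 3
    for r, g, b in pixels:
        # Map each 0..255 value to a bin index 0..bins_per_channel-1.
        rq = r * bins_per_channel // 256
        gq = g * bins_per_channel // 256
        bq = b * bins_per_channel // 256
        hist[(rq * bins_per_channel + gq) * bins_per_channel + bq] += 1
    total = sum(hist)
    return [count / total for count in hist] if total else hist


# Two reddish pixels, one blue pixel and one black pixel.
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (0, 0, 0)]
hist = rgb_histogram(pixels)
print(len(hist), hist[48], hist[3], hist[0])
```

The histogram has the same length for every image, which is what makes it easy to compare, but it discards all information about where in the image each colour occurs.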
Low-level features: Local
  • Global features are useful for some tasks, but in many cases are not powerful enough
  • Local features attempt to overcome this by breaking the image into smaller parts from which to extract features
  • Three primary techniques for splitting up the image:
      • segmentation
      • salient regions & interest points
      • grids & blocks
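Of the three techniques, grids and blocks are the simplest to sketch. The Python example below splits a greyscale image into a rows x cols grid and extracts one value per block; the mean intensity used here is a stand-in for whatever descriptor is really wanted, and the function name is illustrative.

```python
def block_means(img, rows, cols):
    """Mean intensity of each block in a rows x cols grid, in row-major order.

    Assumes the image has at least `rows` rows and `cols` columns,
    so that no block is empty.
    """
    h, w = len(img), len(img[0])
    features = []
    for r in range(rows):
        y0, y1 = r * h // rows, (r + 1) * h // rows
        for c in range(cols):
            x0, x1 = c * w // cols, (c + 1) * w // cols
            values = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            features.append(sum(values) / len(values))
    return features


img = [[0, 0, 10, 10],
       [0, 0, 10, 10],
       [20, 20, 30, 30],
       [20, 20, 30, 30]]
print(block_means(img, 2, 2))
```

Unlike a global feature, the result keeps a coarse record of where in the image each value came from.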
Salient interest regions
  • Salient interest regions and their associated features are currently the most popular way of describing image content.
  • Extracting image features using interest regions is a two-part process:
      1. Find regions
      2. Extract a feature to describe the region properties
  • Typically, the resultant image feature will have a variable length, dependent on the number of regions
Salient interest region location
  • Important regions exhibit:
      • Repeatability
      • Saliency
  • Corners and blobs have these qualities
  • Detectable using various techniques:
      • Difference of Gaussian - blobs
      • Harris corner detector - corners
      • MSER - blobs
  • Figure labels: corners; blobs
Salient interest region descriptors
  • Good region descriptors exhibit:
      • Resilience to image transforms
      • Compactness
  • Different descriptors emphasise different image characteristics:
      • Pixel intensities, colour, texture, edges etc.
  • Common descriptors include:
      • SIFT: histogram of edge orientation
      • Shape context: histogram of edge location
SIFT: Scale Invariant Feature Transform
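The phrase “histogram of edge orientation” can be illustrated with a short Python sketch that accumulates gradient directions over a greyscale patch, weighted by gradient magnitude. This is only the central idea: full SIFT also selects a scale and dominant orientation, divides the region into a grid of sub-histograms and applies Gaussian weighting, none of which is shown here. The function name and the choice of 8 bins are assumptions.

```python
import math


def orientation_histogram(patch, bins=8):
    """Magnitude-weighted histogram of gradient directions, L2-normalised."""
    h, w = len(patch), len(patch[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central differences approximate the image gradient.
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(dx, dy)
            if magnitude == 0:
                continue
            angle = math.atan2(dy, dx) % (2 * math.pi)
            index = int(angle / (2 * math.pi) * bins) % bins
            hist[index] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm else hist


# A vertical edge: intensity increases from left to right.
patch = [[0, 0, 10, 10] for _ in range(4)]
print(orientation_histogram(patch))
```

Normalising the histogram makes the descriptor less sensitive to overall changes in contrast, which is one form of the resilience to image transforms mentioned above.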
Analogies with text: introducing the visual bag-of-words

Bags of Visual Words
  • In the computer vision community over recent years it has become popular to model the content of an image in a similar way to a “bag-of-terms” in textual document analysis.

BoVW using local features
  • Features are localised by a robust region detector and described by a local descriptor such as SIFT.
  • A vocabulary of exemplar feature-vectors is learnt, traditionally through k-means clustering.
  • Local descriptors can then be quantised to discrete visual terms by finding the closest exemplar in the vocabulary.
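The pipeline described above can be sketched in plain Python: learn a vocabulary with k-means, then quantise each local descriptor to its closest exemplar and count the resulting visual terms. The toy 2-dimensional descriptors, the initialisation from the first k descriptors and the fixed iteration count are all simplifying assumptions; real systems cluster large numbers of high-dimensional descriptors.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vector, vocabulary):
    """Index of the closest exemplar, i.e. the visual term id."""
    return min(range(len(vocabulary)),
               key=lambda i: squared_distance(vector, vocabulary[i]))


def kmeans(descriptors, k, iterations=10):
    """Plain k-means; ASSUMPTION: initialised from the first k descriptors."""
    centroids = [list(d) for d in descriptors[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest(d, centroids)].append(d)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def bag_of_visual_words(descriptors, vocabulary):
    """Histogram of visual term occurrences for one image."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[nearest(d, vocabulary)] += 1
    return hist


training = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
vocabulary = kmeans(training, k=2)
print(bag_of_visual_words([(0, 0), (1, 1), (9, 9)], vocabulary))
```

The output plays the same role as a term-frequency vector for a text document: however many regions the image had, it is now a fixed-length count over the vocabulary.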
Applications of BoVW
  • BoVW models have many applications:
      • Auto-annotation and object recognition
      • Concept classification
      • Large-scale indexing

Open-source tools for image analysis and indexing: introducing OpenIMAJ & ImageTerrier

http://www.openimaj.org
  • Open-source (BSD Licence) libraries and tools for multimedia (image, video, sound) analysis and information extraction
  • Implemented in Java; use with any JVM language
  • Implementations of all the techniques mentioned in this tutorial
  • Scalability of extraction using Hadoop with the included tools

http://www.imageterrier.org
  • Extension to the Terrier retrieval system to allow indexing of images
  • Collections and documents that read data produced from image feature extractors
  • New indexers and supporting classes to make compressed augmented inverted indices for visual term data
  • New distance measures implemented as WeightingModels
  • Geometric re-ranking implemented as DocumentScoreModifiers
  • Command-line tools for indexing and searching
  • Freely available under the Mozilla Licence
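To show why visual terms suit a text retrieval engine, here is a minimal Python sketch of an inverted index over visual term ids. It does not use or reproduce the ImageTerrier API: the function names are hypothetical, the index is neither compressed nor augmented with extra per-term data, and the scoring (summing the smaller of the query and document counts for each shared term) is an assumed stand-in for a real weighting model.

```python
from collections import Counter, defaultdict


def build_inverted_index(documents):
    """Map each visual term to the images containing it.

    documents: {image_id: [visual term ids]}
    returns:   {term: {image_id: occurrence count}}
    """
    index = defaultdict(dict)
    for image_id, terms in documents.items():
        for term, count in Counter(terms).items():
            index[term][image_id] = count
    return index


def search(index, query_terms):
    """Rank images by how many visual terms they share with the query."""
    scores = Counter()
    for term, query_count in Counter(query_terms).items():
        for image_id, doc_count in index.get(term, {}).items():
            scores[image_id] += min(query_count, doc_count)
    return scores.most_common()


documents = {"a.jpg": [1, 1, 2], "b.jpg": [2, 3], "c.jpg": [4]}
index = build_inverted_index(documents)
print(search(index, [1, 2]))
```

Only the postings lists for terms that appear in the query are visited, which is what lets this structure scale to large collections.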

Editor's Notes

  • #22 Reuters got in some trouble because of image manipulation recently, and this resulted in a backlash in the press. There is a blog “photoshop disasters” with many examples of tampering; here are just a few...
  • #23 This is a case of tampering in an image published by Reuters and later withdrawn. The image depicts Beirut after an Israeli air strike. The tampering makes the scene look worse than it perhaps was; the use of the clone tool is quite evident, however.
    August 2006: This photograph by Adnan Hajj, a Lebanese photographer, showed thick black smoke rising above buildings in the Lebanese capital after an Israeli air raid. The Reuters news agency initially published this photograph on their web site and then withdrew it when it became evident that the original image had been manipulated to show more and darker smoke. "Hajj has denied deliberately attempting to manipulate the image, saying that he was trying to remove dust marks and that he made mistakes due to the bad lighting conditions he was working under", said Moira Whittle, the head of public relations for Reuters. "This represents a serious breach of Reuters' standards and we shall not be accepting or using pictures taken by him." A second photograph by Hajj was also determined to have been doctored.
    The picture on the left was created around 1864. It is supposed to depict Ulysses S. Grant in front of his troops not far from here, at City Point, Virginia. Unfortunately, this is an example of an early forgery: the rider on the horse is actually Major General McCook. McCook and his horse have been superimposed on an image of Confederate prisoners at Fisher's Hill, with Grant’s head on top!
    circa 1864: This print purports to be of General Ulysses S. Grant in front of his troops at City Point, Virginia, during the American Civil War. Some very nice detective work by researchers at the Library of Congress revealed that this print is a composite of three separate prints: (1) the head in this photo is taken from a portrait of Grant; (2) the horse and body are those of Major General Alexander M. McCook; and (3) the background is of Confederate prisoners captured at the battle of Fisher's Hill, VA.
  • #24 So, images can be tampered with, but is there any way to detect this automatically? There is a whole research field based around the idea of forensic techniques. Here are two examples of the kind of automatic forensic processing that is possible. Cloning parts of images to hide something is common. In this case the original picture showed George Bush at a lectern. Automatic analysis is able to detect the manipulations.