Information from Pixels
Dave Snowdon
@davesnowdon
https://github.com/davesnowdon/ljc-information-from-pixels
http://www.slideshare.net/DaveSnowdon1/information-from-pixels
Summary
• Why? What?
• Range operations and colour spaces
• Kernels & convolution
• Object detection
• Contours
• Conclusion
Why me?
• Social robotics developer
• Social robots need to handle unstructured
environments
• Vision is the most versatile way of sensing the
environment
Most general purpose
sensor
Machine vision
• Tracking movement: Dyson 360, Google Tango
• Recognising people, biometric security
• Recognising medication
• Image search
• …
Why this is hard
https://adeshpande3.github.io/adeshpande3.github.io/A-Beginner's-Guide-To-Understanding-Convolutional-Neural-Networks/
Why this is hard
• Colour reproduction, lighting & white balance
• Perspective & rotation effects
• Noise
• Different scales
Rotations & perspective
OpenCV
The good news
• Open source
• Tried and tested
• Large collections of algorithms
• Language bindings for C, Python & Java
• Runs on pretty much anything (Linux, Mac,
Windows, Android, iOS, Raspberry Pi)
The less good news
• Native code
• Java API is a bit clunky
• Not much structure
• Not the new shiny
Range operations &
colour spaces
RGB
https://en.wikipedia.org/wiki/RGB_color_model#/media/File:RGB_color_solid_cube.png
HSV
https://upload.wikimedia.org/wikipedia/commons/a/a0/Hsl-hsv_models.svg
L*a*b* / CIELAB
https://gurus.pyimagesearch.com/wp-content/uploads/2015/03/color_spaces_lab_axis.jpg
Blob detection
Get an image
• From a Java image
• From video / webcam
org.opencv.videoio.VideoCapture
• From file
import org.opencv.core.Mat;
import org.opencv.imgcodecs.Imgcodecs;
Mat image = Imgcodecs.imread(filename);
org.opencv.core.Mat
new Mat(numRows, numColumns, CvType.CV_8UC3);
• Dense multi-dimensional matrix
• Variants with int, double, byte values
• Implements basic matrix operations
B G R B G R B G R B G R
B G R B G R B G R B G R
B G R B G R B G R B G R
Blur the image
Imgproc.GaussianBlur(image, result,
new Size(kernelSize, kernelSize),
0.0);
Convert to HSV
Imgproc.cvtColor(input, hsv,
Imgproc.COLOR_BGR2HSV);
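To make the conversion less of a black box, here is a plain-Java sketch of what a BGR to HSV conversion does for a single pixel. It does not call OpenCV. The class and method names are made up for illustration, and it assumes OpenCV's 8-bit convention where hue is halved to fit in a byte (H in 0..179, S and V in 0..255).

```java
// Plain-Java sketch (no OpenCV needed) of converting one 8-bit BGR pixel to HSV.
// Assumption: OpenCV's 8-bit convention, H in [0, 179], S and V in [0, 255].
class HsvSketch {

    // Arguments are in B, G, R order to match OpenCV's default channel layout.
    static int[] bgrToHsv(int b, int g, int r) {
        int max = Math.max(r, Math.max(g, b));
        int min = Math.min(r, Math.min(g, b));
        int delta = max - min;

        double hueDegrees;
        if (delta == 0) {
            hueDegrees = 0.0; // grey pixel: hue is undefined, use 0
        } else if (max == r) {
            hueDegrees = 60.0 * (g - b) / delta;
        } else if (max == g) {
            hueDegrees = 120.0 + 60.0 * (b - r) / delta;
        } else {
            hueDegrees = 240.0 + 60.0 * (r - g) / delta;
        }
        if (hueDegrees < 0) {
            hueDegrees += 360.0;
        }

        int h = (int) Math.round(hueDegrees / 2.0) % 180; // halve so it fits in a byte
        int s = (max == 0) ? 0 : (int) Math.round(255.0 * delta / max);
        int v = max;
        return new int[] {h, s, v};
    }

    public static void main(String[] args) {
        int[] hsv = bgrToHsv(0, 255, 255); // a yellow pixel in B, G, R order
        System.out.println("H=" + hsv[0] + " S=" + hsv[1] + " V=" + hsv[2]);
    }
}
```

Because hue is a single channel here, a colour such as "yellow-ish" becomes one narrow hue band rather than a region spread over all three RGB channels.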
Select only pixels in range
Core.inRange(image, low, high, result);
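The range test itself is simple. The following plain-Java sketch mimics it on a flat list of HSV pixels: a pixel is set to 255 in the mask only when every channel lies between the low and high bounds (inclusive). The names and the sample bounds are illustrative assumptions, not values from the talk.

```java
// Plain-Java sketch of an inRange-style mask over a list of {h, s, v} pixels.
class InRangeSketch {

    // Returns 255 for pixels whose channels all lie within [low, high], else 0.
    static int[] inRange(int[][] pixels, int[] low, int[] high) {
        int[] mask = new int[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            boolean inside = true;
            for (int c = 0; c < low.length; c++) {
                if (pixels[i][c] < low[c] || pixels[i][c] > high[c]) {
                    inside = false;
                }
            }
            mask[i] = inside ? 255 : 0;
        }
        return mask;
    }

    public static void main(String[] args) {
        int[][] pixels = {{30, 200, 200}, {90, 200, 200}, {30, 10, 200}};
        int[] low = {20, 100, 100};  // hypothetical lower bound for a yellow object
        int[] high = {40, 255, 255}; // hypothetical upper bound
        System.out.println(java.util.Arrays.toString(inRange(pixels, low, high)));
    }
}
```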
Erode & Dilate
final Mat se = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE,
new Size(kernelSize, kernelSize));
// erode to remove small specks
Imgproc.erode(image, result, se, new Point(-1, -1), numIterations);
// dilate to restore large areas and remove gaps
Imgproc.dilate(image, result, se, new Point(-1, -1), numIterations);
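As a rough illustration of why erode followed by dilate cleans up a mask, here is a plain-Java sketch on a boolean image. It is a simplification under stated assumptions: it uses a fixed 3x3 square structuring element rather than the elliptical one above, and it ignores neighbours that fall outside the image.

```java
// Plain-Java sketch of erosion and dilation with a 3x3 square structuring element.
class MorphologySketch {

    // A pixel survives erosion only if ALL in-bounds neighbours are set.
    static boolean[][] erode(boolean[][] img) {
        return apply(img, true);
    }

    // A pixel is set by dilation if ANY in-bounds neighbour is set.
    static boolean[][] dilate(boolean[][] img) {
        return apply(img, false);
    }

    private static boolean[][] apply(boolean[][] img, boolean requireAll) {
        int rows = img.length;
        int cols = img[0].length;
        boolean[][] out = new boolean[rows][cols];
        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                boolean all = true;
                boolean any = false;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        int ny = y + dy;
                        int nx = x + dx;
                        if (ny < 0 || ny >= rows || nx < 0 || nx >= cols) {
                            continue; // assumption: pixels outside the image are ignored
                        }
                        if (img[ny][nx]) {
                            any = true;
                        } else {
                            all = false;
                        }
                    }
                }
                out[y][x] = requireAll ? all : any;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        boolean[][] img = new boolean[6][6];
        for (int y = 1; y <= 3; y++) {
            for (int x = 1; x <= 3; x++) {
                img[y][x] = true; // a 3x3 blob
            }
        }
        img[5][5] = true; // an isolated speck of noise
        boolean[][] opened = dilate(erode(img));
        System.out.println("speck kept: " + opened[5][5] + ", blob kept: " + opened[2][2]);
    }
}
```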
Find contours
Imgproc.findContours(image, contours, new Mat(),
Imgproc.RETR_EXTERNAL,
Imgproc.CHAIN_APPROX_SIMPLE);
Find largest contour
contours.stream()
.max((c1, c2) ->
Double.compare(Imgproc.contourArea(c1),
Imgproc.contourArea(c2)))
.get();
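The same "pick the biggest outline" idea can be tried without OpenCV. In this sketch a contour is just an array of {x, y} points and the shoelace formula stands in for Imgproc.contourArea. All names are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Plain-Java sketch: choose the contour with the largest enclosed area.
class LargestContourSketch {

    // Shoelace formula for the area of a closed polygon given as {x, y} points.
    static double polygonArea(double[][] pts) {
        double sum = 0.0;
        for (int i = 0; i < pts.length; i++) {
            double[] p = pts[i];
            double[] q = pts[(i + 1) % pts.length];
            sum += p[0] * q[1] - q[0] * p[1];
        }
        return Math.abs(sum) / 2.0;
    }

    // Fails loudly on an empty list instead of calling get() on an empty Optional.
    static double[][] largest(List<double[][]> contours) {
        return contours.stream()
                .max(Comparator.comparingDouble(LargestContourSketch::polygonArea))
                .orElseThrow(() -> new IllegalArgumentException("no contours found"));
    }

    public static void main(String[] args) {
        double[][] square = {{0, 0}, {1, 0}, {1, 1}, {0, 1}};
        double[][] triangle = {{0, 0}, {4, 0}, {0, 3}};
        double[][] best = largest(Arrays.asList(square, triangle));
        System.out.println("largest area: " + polygonArea(best));
    }
}
```

In real code the mask may contain no contours at all, so handling the empty case explicitly is usually worth the extra line.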
Draw contour (for demo)
Imgproc.circle(image, centre, 5, CENTRE_COLOUR, 2);
Imgproc.drawContours(image,
Arrays.asList(contour), 0, OUTLINE_COLOUR, 2);
Output image
• Don’t always need to
• Grab region of interest
Mat roi = mat.submat(rect);
• Convert to java image
BufferedImage javaImage = Util.matrixToImage(mat);
Util.displayImage(command, javaImage);
• Write to file
Imgcodecs.imwrite(filename, mat);
Built-in blob detection
• OpenCV has built-in blob detection:
SimpleBlobDetector
• Blob detection by colour may not work
• Blog post: https://www.learnopencv.com/blob-detection-using-opencv-python-c/
Kernels & Convolution
Convolution
//developer.apple.com/library/mac/documentation/Performance/Conceptual/vImage/ConvolutionOperations/ConvolutionOperations
Example kernels
Gaussian
Example kernels
Laplacian
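For readers who want to see the mechanics, this plain-Java sketch slides a kernel over a grayscale image and sums the weighted neighbours. Two assumptions to flag: like OpenCV's filter2D it does not flip the kernel (identical to true convolution for the symmetric kernels shown here), and it handles borders by repeating the edge pixel, which is just one of several possible choices. The kernel values are the common 3x3 Gaussian and Laplacian approximations.

```java
// Plain-Java sketch of applying a small kernel to a grayscale image.
class ConvolutionSketch {

    static final double[][] GAUSSIAN_3X3 = {
        {1 / 16.0, 2 / 16.0, 1 / 16.0},
        {2 / 16.0, 4 / 16.0, 2 / 16.0},
        {1 / 16.0, 2 / 16.0, 1 / 16.0}
    };

    static final double[][] LAPLACIAN_3X3 = {
        {0, 1, 0},
        {1, -4, 1},
        {0, 1, 0}
    };

    static double[][] filter(double[][] image, double[][] kernel) {
        int rows = image.length;
        int cols = image[0].length;
        int kr = kernel.length / 2;    // kernel centre row
        int kc = kernel[0].length / 2; // kernel centre column
        double[][] out = new double[rows][cols];
        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                double sum = 0.0;
                for (int ky = 0; ky < kernel.length; ky++) {
                    for (int kx = 0; kx < kernel[0].length; kx++) {
                        // assumption: clamp coordinates so edge pixels are repeated
                        int iy = Math.min(Math.max(y + ky - kr, 0), rows - 1);
                        int ix = Math.min(Math.max(x + kx - kc, 0), cols - 1);
                        sum += kernel[ky][kx] * image[iy][ix];
                    }
                }
                out[y][x] = sum;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] image = new double[5][5];
        image[2][2] = 16.0; // a single bright pixel
        System.out.println("blurred centre:   " + filter(image, GAUSSIAN_3X3)[2][2]);
        System.out.println("laplacian centre: " + filter(image, LAPLACIAN_3X3)[2][2]);
    }
}
```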
Detecting blurred
images
Detecting blurred images
• Want to discard images that are unlikely to be of
use
• The more blurred an image is the fewer sharp
edges will be found
• What happens to the Laplacian of an image as it’s
blurred…
Input image
Grayscale + Laplacian
3x3 Gaussian kernel
5x5 Gaussian kernel
7x7 Gaussian kernel
13x13 kernel
19x19 kernel
Variance of the Laplacian
Code
// apply Laplacian to grayscale copy of image
Imgproc.Laplacian(gray, laplacian, CvType.CV_64F);
// determine variance
MatOfDouble mean = new MatOfDouble();
MatOfDouble stddev = new MatOfDouble();
Core.meanStdDev(laplacian, mean, stddev);
double sd = stddev.toList().get(0);
double var = sd * sd;
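The effect can be reproduced on a toy image without OpenCV. The sketch below builds a sharp vertical step edge, blurs it with a 3x3 box filter (a stand-in for the Gaussian blur, chosen for simplicity), and compares the variance of the Laplacian before and after. Class, method and kernel names are illustrative assumptions, and borders are handled by repeating the edge pixel.

```java
// Plain-Java sketch: variance of the Laplacian drops when an image is blurred.
class BlurScoreSketch {

    static final double[][] LAPLACIAN = {{0, 1, 0}, {1, -4, 1}, {0, 1, 0}};
    static final double[][] BOX_BLUR = {
        {1 / 9.0, 1 / 9.0, 1 / 9.0},
        {1 / 9.0, 1 / 9.0, 1 / 9.0},
        {1 / 9.0, 1 / 9.0, 1 / 9.0}
    };

    // Applies a 3x3 kernel, repeating edge pixels at the border.
    static double[][] filter(double[][] image, double[][] kernel) {
        int rows = image.length;
        int cols = image[0].length;
        double[][] out = new double[rows][cols];
        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                double sum = 0.0;
                for (int ky = -1; ky <= 1; ky++) {
                    for (int kx = -1; kx <= 1; kx++) {
                        int iy = Math.min(Math.max(y + ky, 0), rows - 1);
                        int ix = Math.min(Math.max(x + kx, 0), cols - 1);
                        sum += kernel[ky + 1][kx + 1] * image[iy][ix];
                    }
                }
                out[y][x] = sum;
            }
        }
        return out;
    }

    // Population variance over all pixels (the square of a population std dev).
    static double variance(double[][] m) {
        int n = m.length * m[0].length;
        double mean = 0.0;
        for (double[] row : m) {
            for (double v : row) {
                mean += v / n;
            }
        }
        double var = 0.0;
        for (double[] row : m) {
            for (double v : row) {
                var += (v - mean) * (v - mean) / n;
            }
        }
        return var;
    }

    static double laplacianVariance(double[][] gray) {
        return variance(filter(gray, LAPLACIAN));
    }

    // Test image: left half black (0), right half white (255).
    static double[][] stepEdge(int rows, int cols) {
        double[][] img = new double[rows][cols];
        for (int y = 0; y < rows; y++) {
            for (int x = cols / 2; x < cols; x++) {
                img[y][x] = 255.0;
            }
        }
        return img;
    }

    public static void main(String[] args) {
        double[][] sharp = stepEdge(8, 8);
        double[][] blurred = filter(sharp, BOX_BLUR);
        System.out.println("sharp:   " + laplacianVariance(sharp));
        System.out.println("blurred: " + laplacianVariance(blurred));
    }
}
```

A practical blur check then reduces to comparing this score against a threshold tuned on sample images from the camera in question.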
Line following
Detect the line
What we want to do
Kernel to detect vertical lines
-1 2 -1
Mat kernel = new Mat(1, 3, CvType.CV_64F);
double[] kernel_values = {-1.0, 2.0, -1.0};
kernel.put(0, 0, kernel_values);
Convolve image with kernel
Imgproc.filter2D(gray, convolved, -1, kernel);
Threshold
Imgproc.threshold(convolved, thresh, 45.0, 255, Imgproc.
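To see why the -1 2 -1 kernel picks out a thin vertical line, the sketch below applies it to a single image row in plain Java and thresholds the response at 45. Assumptions: responses are clamped to 0..255 as they would be in an 8-bit result, the border repeats the edge pixel, and a plain binary threshold is used because the threshold type is cut off on the slide.

```java
// Plain-Java sketch: 1x3 line kernel followed by a binary threshold on one row.
class LineKernelSketch {

    static int[] detect(int[] row, double threshold) {
        int[] mask = new int[row.length];
        for (int x = 0; x < row.length; x++) {
            int left = row[Math.max(x - 1, 0)];
            int right = row[Math.min(x + 1, row.length - 1)];
            int response = -left + 2 * row[x] - right;
            // clamp to the 8-bit range, so negative responses become 0
            int saturated = Math.max(0, Math.min(255, response));
            // assumption: binary threshold, above the threshold becomes 255
            mask[x] = saturated > threshold ? 255 : 0;
        }
        return mask;
    }

    public static void main(String[] args) {
        int[] row = {10, 10, 10, 200, 10, 10, 10}; // one bright pixel: the line
        System.out.println(java.util.Arrays.toString(detect(row, 45.0)));
    }
}
```

The kernel responds strongly where a pixel is brighter than both horizontal neighbours and gives zero on flat regions, which is what makes the line stand out after thresholding.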
Result
Object detection
Sliding window
http://www.pyimagesearch.com/2015/03/23/sliding-windows-for-object-detection-with-python-and-opencv/
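A sliding window search is essentially three nested loops: over scales, rows and columns. This sketch only enumerates the candidate rectangles. The detector that would be run on each one is left out, and the parameter names, step size and scale factors are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch: enumerate sliding-window rectangles at several scales.
class SlidingWindowSketch {

    // Each result is {x, y, width, height}. Larger scales use a larger window,
    // which would be shrunk to the detector's base size before classification.
    static List<int[]> windows(int imageW, int imageH, int baseW, int baseH,
                               int step, double[] scales) {
        if (step <= 0) {
            throw new IllegalArgumentException("step must be positive");
        }
        List<int[]> result = new ArrayList<>();
        for (double scale : scales) {
            int w = (int) Math.round(baseW * scale);
            int h = (int) Math.round(baseH * scale);
            for (int y = 0; y + h <= imageH; y += step) {
                for (int x = 0; x + w <= imageW; x += step) {
                    result.add(new int[] {x, y, w, h});
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<int[]> all = windows(100, 100, 50, 50, 25, new double[] {1.0, 2.0});
        System.out.println("candidate windows: " + all.size());
    }
}
```

Even this tiny example shows how quickly the number of candidates grows, which is the motivation for making each per-window test as cheap as possible.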
Haar features
Boosting
• Train all features on every training example
• For each feature find the best threshold which
distinguishes positive from negative
• Select features with minimum error rate
• Final classifier is weighted sum of these weak
classifiers
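The last bullet can be sketched in a few lines. Each weak classifier below is a threshold test on one feature value with a polarity that says which side counts as positive, and the final decision follows the Viola-Jones rule of accepting when the weighted votes reach half of the total weight. Training is not shown: the thresholds and weights are made-up numbers, and the class names are hypothetical.

```java
// Plain-Java sketch: a strong classifier as a weighted sum of weak threshold tests.
class BoostedClassifierSketch {

    static final class Stump {
        final int featureIndex;
        final double threshold;
        final int polarity; // +1: fire when value < threshold, -1: fire when value > threshold
        final double weight;

        Stump(int featureIndex, double threshold, int polarity, double weight) {
            this.featureIndex = featureIndex;
            this.threshold = threshold;
            this.polarity = polarity;
            this.weight = weight;
        }

        int vote(double[] features) {
            return polarity * features[featureIndex] < polarity * threshold ? 1 : 0;
        }
    }

    // Accept when the weighted votes reach half of the total weight.
    static boolean classify(Stump[] stumps, double[] features) {
        double score = 0.0;
        double total = 0.0;
        for (Stump s : stumps) {
            score += s.weight * s.vote(features);
            total += s.weight;
        }
        return score >= 0.5 * total;
    }

    // Hypothetical, hand-picked weak classifiers for demonstration only.
    static Stump[] demoStumps() {
        return new Stump[] {
            new Stump(0, 0.5, -1, 2.0),
            new Stump(1, 0.3, +1, 1.0),
            new Stump(2, 0.0, -1, 0.5)
        };
    }

    public static void main(String[] args) {
        System.out.println(classify(demoStumps(), new double[] {0.9, 0.8, -1.0}));
        System.out.println(classify(demoStumps(), new double[] {0.1, 0.1, 1.0}));
    }
}
```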
Cascade
• Hugely expensive to compute all features on every
window location
• Group features into different stages with smaller
number of features
• Only proceed to next stage when previous stage
passes
• In Viola-Jones paper as few as 10 features out of
6000 might be evaluated per window
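The early-exit behaviour is the whole point of the cascade, and it is easy to sketch. Here each stage is simply a predicate over a window's feature values, and evaluation stops at the first stage that rejects. The stages are hypothetical stand-ins: in a real cascade each one would be a small boosted classifier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Plain-Java sketch: staged evaluation that stops at the first rejecting stage.
class CascadeSketch {

    // Returns the index of the first stage that rejects the window,
    // or -1 if every stage passes (the window is accepted).
    static int firstRejectingStage(List<Predicate<double[]>> stages, double[] window) {
        for (int i = 0; i < stages.size(); i++) {
            if (!stages.get(i).test(window)) {
                return i; // later, more expensive stages are never evaluated
            }
        }
        return -1;
    }

    // Hypothetical stages that look at progressively more feature values.
    static List<Predicate<double[]>> demoStages() {
        List<Predicate<double[]>> stages = new ArrayList<>();
        stages.add(w -> w[0] > 0.5);
        stages.add(w -> w[0] + w[1] > 1.2);
        stages.add(w -> w[0] + w[1] + w[2] > 2.0);
        return stages;
    }

    public static void main(String[] args) {
        System.out.println(firstRejectingStage(demoStages(), new double[] {0.1, 0.9, 0.9}));
        System.out.println(firstRejectingStage(demoStages(), new double[] {0.9, 0.9, 0.9}));
    }
}
```

Most windows in a typical image contain nothing of interest, so they are rejected by the first cheap stage and cost very little.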
Pre-trained classifiers
• front face
• profile face
• Full body
• Upper body
• Lower body
• Left & right eyes (one classifier each for left & right)
• Smile
• Front cat face
• Russian license plate
Using a classifier
// create classifier object from XML definition
final CascadeClassifier faceClassifier =
new CascadeClassifier(classifierFilename);
// apply classifier to get list of matching regions
final MatOfRect mor = new MatOfRect();
faceClassifier.detectMultiScale(image, mor);
List<Rect> result = mor.toList();
Front face detection
Training your own classifier
How to train
• Create sample vectors from text files listing +ve & -ve images
• opencv_createsamples -info positives.txt -num 68 -w 60 -h 98 -vec
nao.vec
• Train
• Haar: opencv_traincascade -data classifier -vec samples.vec -bg
negatives.txt -numStages 20 -minHitRate 0.999 -maxFalseAlarmRate 0.5
-numPos 1000 -numNeg 600 -w 60 -h 98 -mode ALL -precalcValBufSize
1024 -precalcIdxBufSize 1024
• LBP: opencv_traincascade -data classifier.lbp -vec samples.vec -bg
negatives.txt -numStages 20 -minHitRate 0.999 -maxFalseAlarmRate 0.5
-numPos 1000 -numNeg 600 -w 60 -h 98 -featureType LBP
-precalcValBufSize 1024 -precalcIdxBufSize 1024
Training docs & tutorials
• http://docs.opencv.org/trunk/dc/d88/tutorial_traincascade.html
• http://coding-robin.de/2013/07/22/train-your-own-opencv-haar-classifier.html
Results
More uses for
contours
Detecting geometric shapes
Find contours
// use Canny edge detector on blurred grayscale image
Imgproc.Canny(blurred, edges, 75, 200);
// find contours
Imgproc.findContours(edges, contours, new Mat(),
Imgproc.RETR_EXTERNAL,
Imgproc.CHAIN_APPROX_SIMPLE);
Type conversion
// need to convert the contour from a MatOfPoint to
// MatOfPoint2f
final MatOfPoint2f m2f = new MatOfPoint2f();
m2f.fromList(contour.toList());
Approximate shapes
// approximate contour polygon with 1% or less difference in
// perimeter
double perimeter = Imgproc.arcLength(m2f, true);
MatOfPoint2f approx = new MatOfPoint2f();
Imgproc.approxPolyDP(m2f, approx, 0.01 * perimeter,
true);
// check number of line segments
int numSides = approx.toList().size();
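Once the number of sides is known, naming the shape is a lookup. The sketch below is an illustration only: the labels and the "many sides means roughly circular" cut-off are assumptions, and closedPerimeter simply mirrors what arcLength measures for a closed polygon so the 1% tolerance can be computed without OpenCV.

```java
// Plain-Java sketch: perimeter-based tolerance and a vertex-count shape label.
class ShapeSketch {

    // Perimeter of a closed polygon given as {x, y} points.
    static double closedPerimeter(double[][] pts) {
        double total = 0.0;
        for (int i = 0; i < pts.length; i++) {
            double[] p = pts[i];
            double[] q = pts[(i + 1) % pts.length];
            total += Math.hypot(q[0] - p[0], q[1] - p[1]);
        }
        return total;
    }

    // Hypothetical labels keyed on the number of approximated sides.
    static String shapeName(int numSides) {
        switch (numSides) {
            case 3:
                return "triangle";
            case 4:
                return "quadrilateral";
            case 5:
                return "pentagon";
            case 6:
                return "hexagon";
            default:
                return numSides > 6 ? "roughly circular" : "unknown";
        }
    }

    public static void main(String[] args) {
        double[][] triangle = {{0, 0}, {4, 0}, {0, 3}};
        double epsilon = 0.01 * closedPerimeter(triangle); // the 1% tolerance
        System.out.println(shapeName(triangle.length) + ", epsilon=" + epsilon);
    }
}
```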
More information
• OpenCV docs: http://docs.opencv.org/3.1.0/
• Useful blogs:
• http://www.pyimagesearch.com
• https://www.learnopencv.com
• https://opencv-java-tutorials.readthedocs.io/en/latest/
• Code for examples:
https://github.com/davesnowdon/ljc-information-from-pixels
Summary
• Colour spaces: RGB, HSV, L*a*b*
• masking images using colour ranges
• Finding outline of objects using contours
• Convolution
• Using cascade classifiers to detect objects


Editor's Notes

  • #3 Aim is to give a basic intro to machine vision and some of the basic techniques using Java. Not going to talk about deep learning: there are plenty of other introductions to that. Not going to talk about photogrammetry: it is a very specialised subject.
  • #6 Google, Apple and Facebook use machine vision to recognise faces in order to match photos in photo albums
  • #14 - Additive model - red, green & blue light added together to produce different colours - creating colours is unintuitive - device dependent, no guarantee colours will look the same on different devices - sRGB standard colour space produced by HP & Microsoft in 1996 to allow reproduction - Should still use colour calibration for accurate results
  • #15 - defines a colour using values for hue, saturation & value - often easier than RGB when creating colours - device dependent
  • #16 - distances in other colour spaces don't correspond to how perceptually different colours look to humans - device independent (often used when converting from RGB to CMYK)
  • #17 - Most machine vision operates on grayscale images - Sometimes colour can be useful if you know the colour and aren't able to train an object detector
  • #19 http://docs.opencv.org/2.4/modules/core/doc/basic_structures.html#mat
  • #20 Blurring the image smooths it out and helps remove noise. Using a sigma of zero causes sigma to be computed from the kernel width & height.
  • #23
    // erode to remove small specks
    // A foreground pixel in the input image will be kept only if ALL pixels inside the structuring element are > 0. Otherwise, the pixels are set to 0 (i.e. background).
    final Mat se = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(kernelSize, kernelSize));
    Imgproc.erode(image, result, se, new Point(-1, -1), numIterations);
    // dilate to restore large areas and remove gaps
    // Dilations, just as an erosion, also utilize structuring elements: a center pixel p of the structuring element is set to white if ANY pixel in the structuring element is > 0.
    Imgproc.dilate(image, result, se, new Point(-1, -1), numIterations);
  • #30 You may have heard the term convolutional neural networks in conjunction with deep learning. It’s a similar idea: convolutional layers apply a kernel or filter to the layer’s input. The difference is that the filter is learned as part of the training process. In CNNs a convolutional layer is typically followed by a pooling layer which allows a fixed output vector size and a degree of position independence.
  • #42 The vertical scale is a log scale, so the drop-off after even modest blurring is substantial
  • #44 - line following at speed is a classic robot competition - variants with obstacles/hurdles
  • #52 Finding an object in an image is done by sliding a window across the image and checking whether the area under the window is an object of interest. Differences in scale are handled by using a larger window and scaling down to the detector window size.
  • #53 Each feature is a single value obtained by subtracting the sum of pixels under the white rectangle from the sum of pixels under the black rectangle.
  • #59 Need a lot of images. I had 70 positive images and 1000 negative examples