With cheap cameras becoming ubiquitous, the camera has become arguably the most
important sensor for many applications.
However, extracting usable information from the images cameras produce is
non-trivial. There have been many published successes in recent years using deep
learning (multi-layered convolutional neural networks), but it’s not always
necessary to apply such techniques to get useful results.
This talk will focus on “classical” machine vision using Java and the OpenCV
library. We’ll start with a quick refresher on how image data is represented,
then cover topics such as determining whether an image is blurred (and therefore
unusable), and explore a number of techniques such as shape and face detection.
3. Why me?
• Social robotics developer
• Social robots need to handle unstructured environments
• Vision is the most versatile way of sensing the environment
10. The good news
• Open source
• Tried and tested
• Large collection of algorithms
• Language bindings for C, Python & Java
• Runs on pretty much anything (Linux, Mac, Windows, Android, iOS, Raspberry Pi)
11. The less good news
• Native code
• Java API is a bit clunky
• Not much structure
• Not the new shiny
17. Get an image
• From a Java image
• From video / webcam
org.opencv.videoio.VideoCapture
• From file
import org.opencv.core.Mat;
import org.opencv.imgcodecs.Imgcodecs;
Mat image = Imgcodecs.imread(filename);
18. org.opencv.core.Mat
new Mat(numRows, numColumns, CvType.CV_8UC3);
• Dense multi-dimensional matrix
• Variants with int, double, byte values
• Implements basic matrix operations
B G R B G R B G R B G R
B G R B G R B G R B G R
B G R B G R B G R B G R
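The interleaved BGR layout shown above means a pixel's bytes can be found with simple arithmetic. A minimal pure-Java sketch (the `bgrIndex` helper is illustrative, not an OpenCV API) of the flat byte offset of row/column/channel in a continuous CV_8UC3 Mat:

```java
public class BgrIndex {
    // Flat byte offset of (row, col, channel) in a continuous
    // 3-channel, 8-bit image stored B,G,R interleaved.
    // channel: 0 = B, 1 = G, 2 = R
    static int bgrIndex(int row, int col, int channel, int numColumns) {
        return (row * numColumns + col) * 3 + channel;
    }

    public static void main(String[] args) {
        int cols = 4; // matches the 4-pixel-wide diagram above
        // First pixel's blue byte is at offset 0
        System.out.println(bgrIndex(0, 0, 0, cols)); // 0
        // Second row, first pixel, green byte: (1*4 + 0)*3 + 1
        System.out.println(bgrIndex(1, 0, 1, cols)); // 13
        // Third pixel in first row, red byte: (0*4 + 2)*3 + 2
        System.out.println(bgrIndex(0, 2, 2, cols)); // 8
    }
}
```

In practice you rarely index bytes by hand — `Mat.get(row, col)` and `Mat.put(row, col, data)` do this for you — but this is what the layout implies.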
22. Erode & Dilate
final Mat se = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE,
    new Size(kernelSize, kernelSize));
Imgproc.erode(image, result, se, new Point(-1, -1), numIterations);
Imgproc.dilate(image, result, se, new Point(-1, -1), numIterations);
26. Output image
• Don’t always need to
• Grab region of interest
Mat roi = mat.submat(rect); // rect is an org.opencv.core.Rect
• Convert to java image
BufferedImage javaImage = Util.matrixToImage(mat);
Util.displayImage(command, javaImage);
• Write to file
Imgcodecs.imwrite(filename, mat);
27. Built-in blob detection
• OpenCV has built-in blob detection:
SimpleBlobDetector
• Blob detection by colour may not work
• Blog post: https://www.learnopencv.com/blob-detection-using-opencv-python-c/
33. Detecting blurred images
• Want to discard images that are unlikely to be of use
• The more blurred an image is, the fewer sharp edges will be found
• What happens to the Laplacian of an image as it’s blurred…
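The variance of the Laplacian is a common blur measure: few sharp edges means low variance. A toy pure-Java sketch (the arrays, numbers and `laplacianVariance` helper are illustrative, not OpenCV API):

```java
public class BlurCheck {
    // Apply the 3x3 Laplacian kernel to the interior pixels of a
    // grayscale image and return the variance of the responses.
    // Low variance suggests few sharp edges, i.e. a blurred image.
    static double laplacianVariance(double[][] img) {
        int rows = img.length, cols = img[0].length;
        int n = (rows - 2) * (cols - 2);
        double[] resp = new double[n];
        int k = 0;
        double sum = 0;
        for (int r = 1; r < rows - 1; r++) {
            for (int c = 1; c < cols - 1; c++) {
                double v = img[r - 1][c] + img[r + 1][c] + img[r][c - 1]
                         + img[r][c + 1] - 4 * img[r][c];
                resp[k++] = v;
                sum += v;
            }
        }
        double mean = sum / n, var = 0;
        for (double v : resp) var += (v - mean) * (v - mean);
        return var / n;
    }

    public static void main(String[] args) {
        // A sharp vertical edge vs the same region after smoothing
        double[][] sharp = {
            {0, 0, 255, 255},
            {0, 0, 255, 255},
            {0, 0, 255, 255},
            {0, 0, 255, 255},
        };
        double[][] blurred = {
            {96, 112, 144, 160},
            {96, 112, 144, 160},
            {96, 112, 144, 160},
            {96, 112, 144, 160},
        };
        System.out.println(laplacianVariance(sharp) > laplacianVariance(blurred)); // true
    }
}
```

With OpenCV itself you would use `Imgproc.Laplacian` followed by `Core.meanStdDev` and square the standard deviation.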
53. Boosting
• Train all features on every training example
• For each feature, find the best threshold that distinguishes positive from negative
• Select features with minimum error rate
• Final classifier is a weighted sum of these weak classifiers
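The per-feature threshold search above can be sketched in pure Java (a toy version of one boosting round; the values and the `bestThreshold` helper are illustrative, not OpenCV API):

```java
import java.util.Arrays;

public class BestStump {
    // For one feature, scan candidate thresholds and return the one
    // with the fewest misclassifications, assuming positive examples
    // score above the threshold.
    static double bestThreshold(double[] values, boolean[] isPositive) {
        double[] candidates = values.clone();
        Arrays.sort(candidates);
        double best = candidates[0];
        int bestErrors = Integer.MAX_VALUE;
        for (double t : candidates) {
            int errors = 0;
            for (int i = 0; i < values.length; i++) {
                boolean predicted = values[i] > t;
                if (predicted != isPositive[i]) errors++;
            }
            if (errors < bestErrors) {
                bestErrors = errors;
                best = t;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // feature values for 6 training windows, with their labels
        double[] values = {0.1, 0.4, 0.35, 0.8, 0.9, 0.7};
        boolean[] labels = {false, false, false, true, true, true};
        System.out.println(bestThreshold(values, labels)); // 0.4
    }
}
```

Real AdaBoost also weights examples between rounds so later features focus on the hard cases; that weighting is omitted here for brevity.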
54. Cascade
• Hugely expensive to compute all features on every window location
• Group features into different stages with smaller number of features
• Only proceed to next stage when previous stage passes
• In Viola-Jones paper as few as 10 features out of 6000 might be evaluated per window
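The early-rejection structure is what makes the cascade cheap: most windows fail the first stage and never pay for the rest. A toy pure-Java sketch (each "stage" here is just a threshold on mean brightness; real stages combine many Haar features):

```java
public class CascadeSketch {
    static int stagesEvaluated;

    // Run the window through each stage in turn; reject as soon as
    // any stage fails, skipping all later (more expensive) stages.
    static boolean detect(double[] window, double[] stageThresholds) {
        stagesEvaluated = 0;
        for (double t : stageThresholds) {
            stagesEvaluated++;
            double mean = 0;
            for (double v : window) mean += v;
            mean /= window.length;
            if (mean <= t) return false; // early rejection
        }
        return true; // survived every stage
    }

    public static void main(String[] args) {
        double[] thresholds = {0.1, 0.3, 0.5}; // increasingly strict stages
        System.out.println(detect(new double[]{0.05, 0.05}, thresholds)); // false
        System.out.println(stagesEvaluated); // 1 — rejected at the first stage
        System.out.println(detect(new double[]{0.9, 0.9}, thresholds)); // true
        System.out.println(stagesEvaluated); // 3 — all stages passed
    }
}
```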
55. Pre-trained classifiers
• Front face
• Profile face
• Full body
• Upper body
• Lower body
• Left & right eyes (one classifier each for left & right)
• Smile
• Front cat face
• Russian license plate
56. Using a classifier
// create classifier object from XML definition
final CascadeClassifier faceClassifier =
new CascadeClassifier(classifierFilename);
// apply classifier to get list of matching regions
final MatOfRect mor = new MatOfRect();
faceClassifier.detectMultiScale(image, mor);
List<Rect> result = mor.toList();
64. Find contours
// use Canny edge detector on blurred grayscale image
Imgproc.Canny(blurred, edges, 75, 200);
// find contours in the edge image
Imgproc.findContours(edges, contours, new Mat(),
    Imgproc.RETR_EXTERNAL,
    Imgproc.CHAIN_APPROX_SIMPLE);
65. Type conversion
// need to convert the contour from a MatOfPoint to a MatOfPoint2f
final MatOfPoint2f m2f = new MatOfPoint2f();
m2f.fromList(contour.toList());
66. Approximate shapes
// approximate contour polygon with 1% or less difference in perimeter
double perimeter = Imgproc.arcLength(m2f, true);
MatOfPoint2f approx = new MatOfPoint2f();
Imgproc.approxPolyDP(m2f, approx, 0.01 * perimeter, true);
// check number of line segments
int numSides = approx.toList().size();
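The side count can then be mapped to a shape name; a minimal sketch (the naming scheme and `name` helper are my own, not from the talk's repo):

```java
public class ShapeName {
    // Name a contour from the number of sides in its polygon
    // approximation; many sides is treated as a circle.
    static String name(int numSides) {
        switch (numSides) {
            case 3: return "triangle";
            case 4: return "rectangle"; // could check aspect ratio for "square"
            case 5: return "pentagon";
            default: return numSides > 5 ? "circle" : "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(name(4)); // rectangle
        System.out.println(name(8)); // circle
    }
}
```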
67. More information
• OpenCV docs: http://docs.opencv.org/3.1.0/
• Useful blogs:
• http://www.pyimagesearch.com
• https://www.learnopencv.com
• https://opencv-java-tutorials.readthedocs.io/en/latest/
• Code for examples:
https://github.com/davesnowdon/ljc-information-from-pixels
68. Summary
• Colour spaces: RGB, HSV, L*a*b*
• Masking images using colour ranges
• Finding outline of objects using contours
• Convolution
• Using cascade classifiers to detect objects
Editor's Notes
Aim is to give a basic intro to machine vision and some of the basic techniques using java
Not going to talk about deep learning - plenty of other introductions to that
Not going to talk about photogrammetry - very specialised subject
Google, Apple & Facebook use machine vision to recognise faces and match photos in photo albums
- Additive model
- red, green & blue light added together to produce different colours
- creating colours unintuitive
- device dependent, no guarantee colours will look same on different devices
- sRGB standard colour space produced by HP & Microsoft in 1996 to allow reproduction
- Should still use colour calibration for accurate results
- defines a colour using values for hue, saturation & value
- often easier than RGB when creating colours
- device dependent
- distances in other colour spaces don't correspond to how perceptually different colours look to humans
- device independent (often used when converting from RGB to CMYK)
- Most machine vision operates on grayscale images
- Sometimes colour can be useful if you know the colour and aren't able to train an object detector
Blurring the image smoothes it out and helps remove noise
Using a sigma of zero causes it to be computed from the width & height
// erode to remove small specks
// A foreground pixel in the input image will be kept only if ALL pixels inside the structuring element are > 0. Otherwise, the pixels are set to 0 (i.e. background).
final Mat se = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(kernelSize, kernelSize));
Imgproc.erode(image, result, se, new Point(-1, -1), numIterations);
// dilate to restore large areas and remove gaps
// Dilations, just as an erosion, also utilize structuring elements — a center pixel p of the structuring element is set to white if ANY pixel in the structuring element is > 0.
Imgproc.dilate(image, result, se, new Point(-1, -1), numIterations);
You may have heard the term convolutional neural networks in conjunction with deep learning. It’s a similar idea: convolutional layers apply a kernel or filter to the layer’s input. The difference is that the filter is learned as part of the training process. In CNNs a convolutional layer is typically followed by a pooling layer which allows a fixed output vector size and a degree of position independence.
The vertical scale is a log scale, so the drop-off after even modest blurring is substantial
- line following at speed is a classic robot competition
- variants with obstacles/hurdles
Finding an object in an image done by sliding a window across the image checking whether the area under the window is an object of interest.
Differences in scale handled by using a larger window and scaling down to the detector window size.
Each feature is a single value obtained by subtracting the sum of pixels under the white rectangle from the sum of pixels under the black rectangle.
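These rectangle sums are cheap because of the integral image: once it is built, any rectangle sum is four array lookups regardless of size. A pure-Java sketch (the tiny 2x4 image and helper names are illustrative):

```java
public class HaarFeature {
    // integral[r][c] = sum of all pixels above and left of (r, c),
    // padded with a zero row/column so rectangle sums need no bounds checks.
    static long[][] integralImage(int[][] img) {
        int rows = img.length, cols = img[0].length;
        long[][] ii = new long[rows + 1][cols + 1];
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                ii[r + 1][c + 1] = img[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c];
        return ii;
    }

    // Sum of the rectangle with top-left (r, c), height h, width w:
    // four lookups however large the rectangle is.
    static long rectSum(long[][] ii, int r, int c, int h, int w) {
        return ii[r + h][c + w] - ii[r][c + w] - ii[r + h][c] + ii[r][c];
    }

    public static void main(String[] args) {
        int[][] img = {
            {10, 10, 90, 90},
            {10, 10, 90, 90},
        };
        long[][] ii = integralImage(img);
        // Two-rectangle edge feature: black (left) half minus white (right) half
        long black = rectSum(ii, 0, 0, 2, 2); // 40
        long white = rectSum(ii, 0, 2, 2, 2); // 360
        System.out.println(black - white); // -320: a strong vertical edge response
    }
}
```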
Need a lot of images.
I had 70 positive images and 1000 negative examples