Computer Vision Introduction


Published in: Technology, Education

  1. 1. Computer Vision – Intro Images are taken from: Computer Vision: Algorithms and Applications / Richard Szeliski
  2. 2. Time line
  3. 3. Standard Computer Vision Tasks
  4. 4. Open CV OpenCV (Open Source Computer Vision Library) is an open-source, BSD-licensed library that includes several hundred computer vision algorithms.
  5. 5. Open CV – hard facts • OpenCV is released under a BSD license. • Free for both academic and commercial use. • C++, C, Python and Java interfaces. • Supports Windows, Linux, Mac OS, iOS and Android. • Written in optimized C/C++. • Can take advantage of multi-core processing. • Downloads exceeding 6 million. • Latest version 2.4.6
  6. 6. Open CV – intro (1/2) OpenCV has a modular structure, which means that the package includes several shared or static libraries. The following modules are available: • core - a compact module defining basic data structures, including the dense multi-dimensional array Mat and basic functions used by all other modules. • imgproc - an image processing module that includes linear and non-linear image filtering, geometrical image transformations (resize, affine and perspective warping, generic table-based remapping), color space conversion, histograms, and so on. • video - a video analysis module that includes motion estimation, background subtraction, and object tracking algorithms. • calib3d - basic multiple-view geometry algorithms, single and stereo camera calibration, object pose estimation, stereo correspondence algorithms, and elements of 3D reconstruction.
  7. 7. Open CV – intro (2/2) • features2d - salient feature detectors, descriptors, and descriptor matchers. • objdetect - detection of objects and instances of the predefined classes (for example, faces, eyes, mugs, people, cars, and so on). • highgui - an easy-to-use interface to video capturing, image and video codecs, as well as simple UI capabilities. • gpu - GPU-accelerated algorithms from different OpenCV modules. • ... some other helper modules, such as FLANN and Google test wrappers, Python bindings, and others.
  8. 8. Android programming - steps Minimum skills: Java for Android / Objective-C for iOS; OpenCV; C++ for native code. Minimum installation for Android (we will learn and apply later today): Eclipse IDE; Android ADT; OpenCV; OpenCV C++/native; Simulator.
  9. 9. Canny Edge Detector • void Canny(InputArray image, OutputArray edges, double threshold1, double threshold2, int apertureSize=3, bool L2gradient=false ) • Parameters: image – single-channel 8-bit input image. edges – output edge map; it has the same size and type as image. threshold1 – first threshold for the hysteresis procedure. threshold2 – second threshold for the hysteresis procedure. apertureSize – aperture size for the Sobel() operator. L2gradient – a flag indicating whether the more accurate $L_2$ norm $\sqrt{(dI/dx)^2 + (dI/dy)^2}$ should be used to calculate the image gradient magnitude (L2gradient=true), or whether the default $L_1$ norm $|dI/dx| + |dI/dy|$ is enough (L2gradient=false).
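The effect of the L2gradient flag can be illustrated without OpenCV. The following is only a minimal sketch: `gradientMagnitude` is a hypothetical helper (not an OpenCV function), and `dx` and `dy` stand for the Sobel derivatives dI/dx and dI/dy at a single pixel.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of OpenCV: gradient magnitude at one pixel
// computed from its horizontal and vertical derivatives.
double gradientMagnitude(double dx, double dy, bool L2gradient) {
    assert(std::isfinite(dx) && std::isfinite(dy));
    if (L2gradient) {
        return std::sqrt(dx * dx + dy * dy);  // L2 norm: sqrt(dx^2 + dy^2)
    }
    return std::fabs(dx) + std::fabs(dy);     // L1 norm: |dx| + |dy|
}
```

For dx = 3 and dy = 4 the L2 norm gives 5 and the L1 norm gives 7. The L1 norm avoids the square root but overestimates the magnitude on diagonal gradients, so the same thresholds behave differently under the two settings.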
  10. 10. Canny Edge Detector - code
      Mat src, src_gray;                 // assume src and src_gray are already read
      Mat dst, detected_edges;
      int edgeThresh = 1;
      int lowThreshold = 1;
      int const max_lowThreshold = 100;
      int ratio = 3;                     // assumed upper:lower threshold ratio
      int kernel_size = 3;
      const char* window_name = "Edge Map";
      /// Reduce noise with a 3x3 kernel
      blur( src_gray, detected_edges, Size(3,3) );
      /// Canny detector: the upper threshold must differ from the lower one for hysteresis to have any effect
      Canny( detected_edges, detected_edges, lowThreshold, lowThreshold*ratio, kernel_size );
      /// Using Canny's output as a mask, we display our result
      dst.create( src.size(), src.type() );
      dst = Scalar::all(0);
      src.copyTo( dst, detected_edges );
      imshow( window_name, dst );
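The role of the two thresholds passed to Canny can be shown on a single row of gradient magnitudes. This is a simplified 1-D sketch under stated assumptions, not OpenCV's implementation: `hysteresis1D` is a hypothetical name, and "connected" is reduced to "adjacent in the row".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1-D illustration of hysteresis thresholding (not OpenCV code).
// A sample is kept if it is strong (> high), or if it is weak (> low) and is
// linked to a strong sample through a run of adjacent weak samples.
std::vector<bool> hysteresis1D(const std::vector<double>& mag,
                               double low, double high) {
    assert(low <= high);
    const std::size_t n = mag.size();
    std::vector<bool> edge(n, false);
    // Left-to-right pass: strong samples seed edges, weak ones extend them.
    for (std::size_t i = 0; i < n; ++i) {
        if (mag[i] > high || (mag[i] > low && i > 0 && edge[i - 1])) {
            edge[i] = true;
        }
    }
    // Right-to-left pass: extend edges to weak samples left of an edge.
    for (std::size_t i = n; i-- > 0;) {
        if (!edge[i] && mag[i] > low && i + 1 < n && edge[i + 1]) {
            edge[i] = true;
        }
    }
    return edge;
}
```

With magnitudes {10, 40, 90, 40, 10, 40, 10}, low = 30 and high = 80, the two samples of 40 next to the 90 are kept while the isolated 40 is dropped. When low equals high no sample can be weak, so the procedure reduces to a single threshold.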
  11. 11. Hough Transform • void HoughLines(InputArray image, OutputArray lines, double rho, double theta, int threshold, double srn=0, double stn=0 ) • Parameters: image – 8-bit, single-channel binary source image. lines – Output vector of lines; each line is represented by a two-element vector (rho, theta). rho – Distance resolution of the accumulator in pixels. theta – Angle resolution of the accumulator in radians. threshold – Accumulator threshold parameter; only lines that receive more than threshold votes are returned. srn – For the multi-scale Hough transform, it is a divisor for the distance resolution rho. stn – For the multi-scale Hough transform, it is a divisor for the angle resolution theta.
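How rho, theta and the accumulator interact can be sketched in plain C++. This is a minimal illustration of the standard Hough voting scheme, not OpenCV's code: `HoughPeak` and `houghPeak` are hypothetical names, and `maxRho` (the largest distance the accumulator covers) is an assumption of this sketch. It returns only the best cell, whereas HoughLines reports every cell above the threshold.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

struct HoughPeak { double rho; double theta; int votes; };

// Hypothetical sketch: every edge point (x, y) votes for all lines
// rho = x*cos(theta) + y*sin(theta) that pass through it.
HoughPeak houghPeak(const std::vector<std::pair<int, int>>& points,
                    double rhoStep, double thetaStep, double maxRho) {
    assert(rhoStep > 0 && thetaStep > 0 && maxRho > 0);
    const double pi = std::acos(-1.0);
    const int numTheta = static_cast<int>(std::lround(pi / thetaStep));
    const int rhoOffset = static_cast<int>(std::lround(maxRho / rhoStep));
    const int numRho = 2 * rhoOffset + 1;  // bins for -maxRho .. +maxRho
    std::vector<int> acc(static_cast<std::size_t>(numTheta) * numRho, 0);

    for (const auto& p : points) {
        for (int t = 0; t < numTheta; ++t) {
            const double theta = t * thetaStep;
            const double rho = p.first * std::cos(theta) + p.second * std::sin(theta);
            const int r = static_cast<int>(std::lround(rho / rhoStep)) + rhoOffset;
            if (r >= 0 && r < numRho) {
                ++acc[static_cast<std::size_t>(t) * numRho + r];
            }
        }
    }

    // Pick the accumulator cell with the most votes (first one on ties).
    HoughPeak best{0.0, 0.0, 0};
    for (int t = 0; t < numTheta; ++t) {
        for (int r = 0; r < numRho; ++r) {
            const int v = acc[static_cast<std::size_t>(t) * numRho + r];
            if (v > best.votes) {
                best = HoughPeak{(r - rhoOffset) * rhoStep, t * thetaStep, v};
            }
        }
    }
    return best;
}
```

Ten points on the vertical line x = 5, voted with a 1-pixel and 1-degree resolution, produce a peak at rho = 5, theta = 0 with ten votes. Coarser rho or theta steps merge neighbouring lines into one cell; finer steps spread the votes of a noisy line over several cells.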
  12. 12. Hough Transform - code
      // Assume src is already read as an 8-bit single-channel image
      Mat dst, cdst;
      Canny(src, dst, 50, 200, 3);
      cvtColor(dst, cdst, CV_GRAY2BGR);
      vector<Vec2f> lines;
      HoughLines(dst, lines, 1, CV_PI/180, 100, 0, 0 );
      // Draw the lines
      for( size_t i = 0; i < lines.size(); i++ )
      {
        float rho = lines[i][0], theta = lines[i][1];
        Point pt1, pt2;
        double a = cos(theta), b = sin(theta);
        double x0 = a*rho, y0 = b*rho;    // point on the line closest to the origin
        pt1.x = cvRound(x0 + 1000*(-b));  // step 1000 pixels along the line in both directions
        pt1.y = cvRound(y0 + 1000*(a));
        pt2.x = cvRound(x0 - 1000*(-b));
        pt2.y = cvRound(y0 - 1000*(a));
        line( cdst, pt1, pt2, Scalar(0,0,255), 3, CV_AA);
      }
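The arithmetic in the drawing loop converts a line in (rho, theta) form into two points far enough apart to cross the image. The sketch below isolates that step. `Pt`, `Segment` and `lineEndpoints` are hypothetical names, and `std::lround` stands in for `cvRound`; the two may differ on exact .5 values.

```cpp
#include <cassert>
#include <cmath>

struct Pt { long x; long y; };
struct Segment { Pt a; Pt b; };

// Hypothetical helper mirroring the drawing loop: returns two points on the
// line rho = x*cos(theta) + y*sin(theta), halfLength pixels on either side
// of the point of the line that is closest to the origin.
Segment lineEndpoints(double rho, double theta, double halfLength) {
    assert(halfLength > 0);
    const double a = std::cos(theta), b = std::sin(theta);
    const double x0 = a * rho, y0 = b * rho;  // foot of the normal from the origin
    // (-b, a) is a unit vector along the line, perpendicular to the normal (a, b).
    Segment s;
    s.a = Pt{std::lround(x0 + halfLength * (-b)), std::lround(y0 + halfLength * a)};
    s.b = Pt{std::lround(x0 - halfLength * (-b)), std::lround(y0 - halfLength * a)};
    return s;
}
```

For rho = 10 and theta = 0 the line is the vertical x = 10, and the helper returns (10, 1000) and (10, -1000). The half length of 1000 is simply a value larger than typical image dimensions, so the drawn segment gets clipped by the image border.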
  13. 13. Cascade classifier • void CascadeClassifier::detectMultiScale(const Mat& image, vector<Rect>& objects, double scaleFactor=1.1, int minNeighbors=3, int flags=0, Size minSize=Size(), Size maxSize=Size()) • Parameters: cascade – Haar classifier cascade (OpenCV 1.x API only). It can be loaded from XML or YAML file using Load(). image – Matrix of the type CV_8U containing an image where objects are detected. objects – Vector of rectangles where each rectangle contains the detected object. scaleFactor – Parameter specifying how much the image size is reduced at each image scale. minNeighbors – Parameter specifying how many neighbors each candidate rectangle should have to retain it. flags – Parameter with the same meaning for an old cascade as in the function cvHaarDetectObjects. It is not used for a new cascade. minSize – Minimum possible object size. Objects smaller than that are ignored. maxSize – Maximum possible object size. Objects larger than that are ignored.
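The influence of scaleFactor on the amount of work can be estimated with a small sketch. It assumes, as a simplification, that the detector enlarges its search window geometrically by scaleFactor until the window no longer fits in the image; `detectionScales` is a hypothetical helper, not an OpenCV function, and minSize/maxSize are ignored here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch (not OpenCV's implementation): list the relative scales
// 1, f, f^2, ... at which a square window of windowSide pixels still fits
// into a square image of imageSide pixels.
std::vector<double> detectionScales(int imageSide, int windowSide,
                                    double scaleFactor) {
    assert(imageSide > 0 && windowSide > 0 && scaleFactor > 1.0);
    std::vector<double> scales;
    for (double s = 1.0; windowSide * s <= imageSide; s *= scaleFactor) {
        scales.push_back(s);
    }
    return scales;
}
```

Under these assumptions, a 24-pixel window in a 240-pixel image is evaluated at 25 scales with scaleFactor = 1.1 but only at 6 scales with scaleFactor = 1.5. A smaller factor is slower but less likely to skip the scale at which an object matches.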
  14. 14. Cascade classifier - code
      String face_cascade_name = "haarcascade_frontalface_alt.xml";
      CascadeClassifier face_cascade;
      // Load the cascades; load() returns false if the file cannot be read
      face_cascade.load( face_cascade_name );
      eyes_cascade.load( eyes_cascade_name );   // assumes eyes_cascade and eyes_cascade_name are declared like the face cascade
      // Assume frame already holds a BGR image
      Mat frame_gray;
      cvtColor( frame, frame_gray, CV_BGR2GRAY );
      equalizeHist( frame_gray, frame_gray );
      // Detect faces
      std::vector<Rect> faces;
      face_cascade.detectMultiScale( frame_gray, faces, 1.1, 2, 0|CV_HAAR_SCALE_IMAGE, Size(30, 30) );
      // Draw an ellipse around each detected face
      for( size_t i = 0; i < faces.size(); i++ )
      {
        Point center( faces[i].x + faces[i].width*0.5, faces[i].y + faces[i].height*0.5 );
        ellipse( frame, center, Size( faces[i].width*0.5, faces[i].height*0.5), 0, 0, 360, Scalar( 255, 0, 255 ), 4, 8, 0 );
      }
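The detector works on a single-channel image, which is why the frame is first converted to grayscale. A per-pixel sketch of that conversion is given below, using the usual luma weights Y = 0.299 R + 0.587 G + 0.114 B. `bgrToGray` is a hypothetical helper; OpenCV's own conversion uses fixed-point arithmetic, so its result may differ by one gray level.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not an OpenCV function: convert one BGR pixel to a
// gray level with the weights Y = 0.299*R + 0.587*G + 0.114*B.
unsigned char bgrToGray(unsigned char b, unsigned char g, unsigned char r) {
    const double y = 0.299 * r + 0.587 * g + 0.114 * b;
    const long rounded = std::lround(y);
    assert(rounded >= 0 && rounded <= 255);  // weights sum to 1, so this holds
    return static_cast<unsigned char>(rounded);
}
```

Note the channel order: OpenCV stores color images as blue, green, red, so the first argument is blue. Pure green (0, 255, 0) maps to a much brighter gray than pure blue (255, 0, 0), reflecting the larger weight of the green channel.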