
Computer Vision Introduction


Published in: Technology, Education

  1. Computer Vision – Intro. Images are taken from: Computer Vision: Algorithms and Applications / Richard Szeliski
  2. Timeline
  3. Standard Computer Vision Tasks
  4. OpenCV. OpenCV (Open Source Computer Vision Library) is an open-source, BSD-licensed library that includes several hundred computer vision algorithms.
  5. OpenCV – hard facts
      • OpenCV is released under a BSD license.
      • Free for both academic and commercial use.
      • C++, C, Python and Java interfaces.
      • Supports Windows, Linux, Mac OS, iOS and Android.
      • Written in optimized C/C++.
      • Can take advantage of multi-core processing.
      • Downloads exceeding 6 million.
      • Latest version: 2.4.6.
  6. OpenCV – intro (1/2)
      OpenCV has a modular structure, which means that the package includes several shared or static libraries. The following modules are available:
      • core - a compact module defining basic data structures, including the dense multi-dimensional array Mat, and basic functions used by all other modules.
      • imgproc - an image processing module that includes linear and non-linear image filtering, geometrical image transformations (resize, affine and perspective warping, generic table-based remapping), color space conversion, histograms, and so on.
      • video - a video analysis module that includes motion estimation, background subtraction, and object tracking algorithms.
      • calib3d - basic multiple-view geometry algorithms, single and stereo camera calibration, object pose estimation, stereo correspondence algorithms, and elements of 3D reconstruction.
  7. OpenCV – intro (2/2)
      • features2d - salient feature detectors, descriptors, and descriptor matchers.
      • objdetect - detection of objects and instances of the predefined classes (for example, faces, eyes, mugs, people, cars, and so on).
      • highgui - an easy-to-use interface to video capturing, image and video codecs, as well as simple UI capabilities.
      • gpu - GPU-accelerated algorithms from different OpenCV modules.
      • ... some other helper modules, such as FLANN and Google test wrappers, Python bindings, and others.
  8. Android programming - steps
      Minimum skills:
      • Java for Android / Objective-C for iOS
      • OpenCV
      • C++ for native code
      Minimum installation for Android (we will learn and apply later today):
      • Eclipse IDE
      • Android ADT
      • OpenCV
      • OpenCV C++/native
      • Simulator
  9. Canny Edge Detector
      • void Canny(InputArray image, OutputArray edges, double threshold1, double threshold2, int apertureSize=3, bool L2gradient=false)
      • Parameters:
          image – single-channel 8-bit input image.
          edges – output edge map; it has the same size and type as image.
          threshold1 – first threshold for the hysteresis procedure.
          threshold2 – second threshold for the hysteresis procedure.
          apertureSize – aperture size for the Sobel() operator.
          L2gradient – a flag indicating whether the more accurate $L_2$ norm $=\sqrt{(dI/dx)^2 + (dI/dy)^2}$ should be used to calculate the image gradient magnitude (L2gradient=true), or whether the default $L_1$ norm $=|dI/dx|+|dI/dy|$ is enough (L2gradient=false).
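The two gradient norms selected by L2gradient can be compared with a small standalone sketch. It uses no OpenCV; the function names are hypothetical and only illustrate the two formulas given above.

```cpp
#include <cmath>

// Hypothetical helpers (not part of OpenCV) illustrating the two norms.

// L1 norm: |dI/dx| + |dI/dy|. Cheaper, used when L2gradient=false.
double gradientMagnitudeL1(double dx, double dy) {
    return std::fabs(dx) + std::fabs(dy);
}

// L2 norm: sqrt((dI/dx)^2 + (dI/dy)^2). More accurate, used when L2gradient=true.
double gradientMagnitudeL2(double dx, double dy) {
    return std::sqrt(dx * dx + dy * dy);
}
```

For a gradient of (3, -4) the L1 norm gives 7 and the L2 norm gives 5, so the same pair of thresholds can behave differently depending on the flag.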
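  10. Canny Edge Detector - code
      // Fragment from inside a function; requires <opencv2/opencv.hpp> and "using namespace cv;".
      Mat src, src_gray;   // assume src and src_gray are already loaded
      Mat dst, detected_edges;
      int edgeThresh = 1;
      int lowThreshold = 1;
      int const max_lowThreshold = 100;
      int ratio = 3;       // assumed upper:lower threshold ratio (Canny recommended between 2:1 and 3:1)
      int kernel_size = 3;
      const char* window_name = "Edge Map";

      /// Reduce noise with a 3x3 kernel
      blur( src_gray, detected_edges, Size(3,3) );

      /// Canny detector: the upper threshold must exceed the lower one for hysteresis to have an effect
      Canny( detected_edges, detected_edges, lowThreshold, lowThreshold*ratio, kernel_size );

      /// Using Canny's output as a mask, we display our result
      dst.create( src.size(), src.type() );
      dst = Scalar::all(0);
      src.copyTo( dst, detected_edges );
      imshow( window_name, dst );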
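What the two hysteresis thresholds do can be shown with a simplified one-dimensional sketch. This is not OpenCV's implementation: it works on a 1-D row of gradient magnitudes instead of a 2-D image, the function name is hypothetical, and the strict ">" comparisons are an assumption. A sample is kept if it is strong (above the high threshold), or weak (above the low threshold) and connected to a strong sample through other weak samples.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 1-D illustration of hysteresis thresholding.
std::vector<bool> hysteresis1D(const std::vector<double>& mag,
                               double low, double high) {
    const std::size_t n = mag.size();
    std::vector<bool> edge(n, false);

    // Forward pass: mark strong samples, and weak samples whose
    // left neighbour is already marked.
    for (std::size_t i = 0; i < n; ++i) {
        if (mag[i] > high) {
            edge[i] = true;
        } else if (mag[i] > low && i > 0 && edge[i - 1]) {
            edge[i] = true;
        }
    }

    // Backward pass: mark weak samples whose right neighbour is marked.
    for (std::size_t i = n; i-- > 0;) {
        if (!edge[i] && mag[i] > low && i + 1 < n && edge[i + 1]) {
            edge[i] = true;
        }
    }
    return edge;
}
```

With magnitudes {10, 40, 90, 40, 10, 40, 10}, low = 30 and high = 80, the two weak samples next to the strong one are kept, while the isolated weak sample is discarded. If both thresholds are equal, there are no weak samples and the procedure reduces to plain thresholding.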
  11. Hough Transform
      • void HoughLines(InputArray image, OutputArray lines, double rho, double theta, int threshold, double srn=0, double stn=0)
      • Parameters:
          image – 8-bit, single-channel binary source image.
          lines – output vector of lines; each line is represented by a pair (rho, theta).
          rho – distance resolution of the accumulator in pixels.
          theta – angle resolution of the accumulator in radians.
          threshold – accumulator threshold parameter; only lines that receive enough votes are returned.
          srn – for the multi-scale Hough transform, a divisor for the distance resolution rho.
          stn – for the multi-scale Hough transform, a divisor for the angle resolution theta.
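How rho, theta and threshold interact can be sketched with a minimal standard Hough accumulator. This is an illustration under stated assumptions, not OpenCV's implementation: the names Pixel and houghLinesSketch are hypothetical, the multi-scale variant (srn, stn) is omitted, and results are not sorted by vote count. Each edge pixel votes for every line rho = x·cos(theta) + y·sin(theta) passing through it.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

struct Pixel { int x; int y; };  // hypothetical helper type

// Returns (rho, theta) for every accumulator cell with at least `threshold` votes.
std::vector<std::pair<double, double> > houghLinesSketch(
        const std::vector<Pixel>& edgePixels, int width, int height,
        double rhoStep, double thetaStep, int threshold) {
    assert(rhoStep > 0.0 && thetaStep > 0.0);
    const double kPi = 3.14159265358979323846;
    const int numAngle = static_cast<int>(std::lround(kPi / thetaStep));
    const double maxRho = width + height;  // safe bound on |rho|
    const int numRho = static_cast<int>(std::lround(2.0 * maxRho / rhoStep)) + 1;

    std::vector<int> acc(static_cast<std::size_t>(numAngle) * numRho, 0);

    // Voting: one vote per (pixel, angle) pair.
    for (std::size_t i = 0; i < edgePixels.size(); ++i) {
        for (int t = 0; t < numAngle; ++t) {
            const double theta = t * thetaStep;
            const double rho = edgePixels[i].x * std::cos(theta)
                             + edgePixels[i].y * std::sin(theta);
            const int r = static_cast<int>(std::lround((rho + maxRho) / rhoStep));
            ++acc[static_cast<std::size_t>(t) * numRho + r];
        }
    }

    // Keep the cells that reached the vote threshold.
    std::vector<std::pair<double, double> > lines;
    for (int t = 0; t < numAngle; ++t) {
        for (int r = 0; r < numRho; ++r) {
            if (acc[static_cast<std::size_t>(t) * numRho + r] >= threshold) {
                lines.push_back(std::make_pair(r * rhoStep - maxRho, t * thetaStep));
            }
        }
    }
    return lines;
}
```

A smaller rho or theta gives a finer accumulator (more cells, fewer votes per cell), while a higher threshold keeps only lines supported by more edge pixels.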
  12. Hough Transform - code
      // Assume src is already loaded; CV_GRAY2BGR and CV_AA are OpenCV 2.4 constants.
      Mat dst, cdst;
      Canny(src, dst, 50, 200, 3);
      cvtColor(dst, cdst, CV_GRAY2BGR);

      vector<Vec2f> lines;
      HoughLines(dst, lines, 1, CV_PI/180, 100, 0, 0 );

      // Draw the lines
      for( size_t i = 0; i < lines.size(); i++ )
      {
          float rho = lines[i][0], theta = lines[i][1];
          Point pt1, pt2;
          double a = cos(theta), b = sin(theta);
          double x0 = a*rho, y0 = b*rho;   // point on the line closest to the origin
          // Step 1000 pixels along the line direction (-b, a) in both directions
          pt1.x = cvRound(x0 + 1000*(-b));
          pt1.y = cvRound(y0 + 1000*(a));
          pt2.x = cvRound(x0 - 1000*(-b));
          pt2.y = cvRound(y0 - 1000*(a));
          line( cdst, pt1, pt2, Scalar(0,0,255), 3, CV_AA);
      }
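The endpoint arithmetic in the drawing loop can be checked in isolation. The sketch below reproduces it without OpenCV; Point2i, Segment and lineToSegment are hypothetical stand-ins, and std::lround is used in place of cvRound (the two may differ on exact .5 values).

```cpp
#include <cmath>

struct Point2i { int x; int y; };        // hypothetical stand-in for cv::Point
struct Segment { Point2i p1; Point2i p2; };

// Converts a line in (rho, theta) form into a drawable segment that
// extends halfLength pixels in each direction from the point closest to the origin.
Segment lineToSegment(double rho, double theta, double halfLength) {
    const double a = std::cos(theta), b = std::sin(theta);
    const double x0 = a * rho, y0 = b * rho;  // foot of the normal from the origin
    Segment s;
    // (-b, a) is the unit direction vector of the line.
    s.p1.x = static_cast<int>(std::lround(x0 + halfLength * (-b)));
    s.p1.y = static_cast<int>(std::lround(y0 + halfLength * a));
    s.p2.x = static_cast<int>(std::lround(x0 - halfLength * (-b)));
    s.p2.y = static_cast<int>(std::lround(y0 - halfLength * a));
    return s;
}
```

For rho = 10 and theta = 0 the result is the vertical segment from (10, 1000) to (10, -1000), that is, the line x = 10.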
  13. Cascade classifier
      • void CascadeClassifier::detectMultiScale(const Mat& image, vector<Rect>& objects, double scaleFactor=1.1, int minNeighbors=3, int flags=0, Size minSize=Size(), Size maxSize=Size())
      • Parameters:
          cascade – Haar classifier cascade (OpenCV 1.x API only). It can be loaded from an XML or YAML file using Load().
          image – matrix of the type CV_8U containing an image where objects are detected.
          objects – vector of rectangles where each rectangle contains the detected object.
          scaleFactor – parameter specifying how much the image size is reduced at each image scale.
          minNeighbors – parameter specifying how many neighbors each candidate rectangle should have to retain it.
          flags – parameter with the same meaning for an old cascade as in the function cvHaarDetectObjects. It is not used for a new cascade.
          minSize – minimum possible object size. Objects smaller than that are ignored.
          maxSize – maximum possible object size. Objects larger than that are ignored.
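The effect of scaleFactor, minSize and maxSize can be pictured as a geometric progression of search-window sizes. The sketch below is a simplification and not OpenCV's code: it assumes square windows, the function name windowSides and the baseSide parameter (the side of the window the cascade was trained on) are hypothetical, and the image-border checks of the real detector are left out.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Lists the window side lengths a multi-scale detector would try,
// growing by scaleFactor at each step and limited to [minSide, maxSide].
std::vector<int> windowSides(int baseSide, double scaleFactor,
                             int minSide, int maxSide) {
    assert(baseSide > 0 && scaleFactor > 1.0);
    std::vector<int> sides;
    for (double factor = 1.0; ; factor *= scaleFactor) {
        const int side = static_cast<int>(std::lround(baseSide * factor));
        if (side > maxSide) break;                     // larger objects are ignored
        if (side >= minSide) sides.push_back(side);    // smaller objects are ignored
    }
    return sides;
}
```

A scaleFactor close to 1 (such as the default 1.1) produces many closely spaced scales, which is slower but less likely to miss an object; a larger factor is faster and coarser.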
  14. Cascade classifier - code
      // Fragment from inside a function; assume frame is already captured.
      // CV_BGR2GRAY and CV_HAAR_SCALE_IMAGE are OpenCV 2.4 constants.
      String face_cascade_name = "haarcascade_frontalface_alt.xml";
      CascadeClassifier face_cascade;
      std::vector<Rect> faces;

      // Load the cascade; load() returns false if the file cannot be read
      if( !face_cascade.load( face_cascade_name ) ) { /* handle the error */ }

      // Convert to grayscale and equalize the histogram to improve contrast
      Mat frame_gray;
      cvtColor( frame, frame_gray, CV_BGR2GRAY );
      equalizeHist( frame_gray, frame_gray );

      // Detect faces
      face_cascade.detectMultiScale( frame_gray, faces, 1.1, 2, 0|CV_HAAR_SCALE_IMAGE, Size(30, 30) );

      // Draw an ellipse around each detected face
      for( size_t i = 0; i < faces.size(); i++ )
      {
          Point center( cvRound(faces[i].x + faces[i].width*0.5), cvRound(faces[i].y + faces[i].height*0.5) );
          ellipse( frame, center, Size( cvRound(faces[i].width*0.5), cvRound(faces[i].height*0.5) ), 0, 0, 360, Scalar( 255, 0, 255 ), 4, 8, 0 );
      }