Computer Vision with Android (Google GTUG Bangkok 2011)

Ideas and techniques for a new generation of Android mobile applications based on Computer Vision principles.
A presentation given at the Google GTUG BootCamp 2011 in Bangkok, at Kasetsart University.

Published in: Technology, Art & Photos

  • YUV is a basic color model used in analog color TV broadcasting. YUV is compatible with older B/W infrastructure and devices. Y (luma) is kept at high resolution; Cb and Cr (chroma components) can be compressed and treated independently.
  • The structuring element B is swept over the image A. Each time the origin of the structuring element touches a binary 1-pixel, the entire translated structuring element shape is ORed to the output image, which has been initialized to all zeros.
  • Turn on pixels in the clipped image that are black in the mask

    1. Computer Vision with Android
        Ideas and techniques for a new generation of mobile applications
        A talk by Andrea Gagliardi La Gala
        andrea.lagala@gmail.com
        Bangkok GTUG BootCamp 2011
        15 January 2011 at Kasetsart University
    2. Who am I?
        Andrea Gagliardi La Gala, Italian
        Based in South East Asia for 6 years
        In Thailand for 3 years
        Mobile Solutions Architect
        Develop business apps for mobile and Android
        Integrate mobile apps into enterprise systems
        R&D in Computer Vision and Artificial Intelligence
    3. In this presentation
        Computer Vision techniques for Android
        Live demos and case-studies
        The Android Camera API
        Introduction to real-time image processing
        Develop ideas for new mobile applications
            Augmented Reality
            Driver Assistance and Safety
            Ambient Monitoring
            …and many more, depending on your creativity!
        Feel free to interrupt me, this is an open discussion
    4. Mobile capabilities
        Powerful processors
            Qualcomm Snapdragon (1 GHz)
            Samsung Hummingbird (1 GHz)
            NVIDIA Tegra 2 (1 GHz, dual core)
        Lots of sensors
            GPS / Positioning
            Accelerometer / Gyroscope
            NFC, Proximity, Light intensity
            …
        Ubiquitous connectivity
            GSM / GPRS / 3G / 4G
            SMS / MMS
            Wi-Fi
            Bluetooth
        Advanced cameras
            Up to 8 MP
            Good optical quality
            Sensors are improving
            …
    14. What is Computer Vision?
        The goal of Computer Vision:
            Make useful decisions about real physical objects and scenes based on sensed images
        Sense the physical world through the camera (2D image)
            Shape, illumination, spatial relationships
        Understanding of the 3D world
            Geometry, texture, motion, identity of objects
        Algorithms to:
            Process image information
            Construct descriptions of the world and its objects
    15. Computer Vision on Android: A quick demo
        AndAR (Android ARToolKit)
        http://code.google.com/p/andar/
        Java-based software library that enables AR on Android
        Marker recognition within the image
        Display of 3D models
        Images source: ARToolworks Inc.
    16. A real case-study: Optical Recognition of Passports
        Diagram labels:
            Remote server
            Agent with optical scanner
            End-user
            Integration of end-user data into remote solution
            On-board local database
    17. Live Demo
        Scan image
            ID card placed in front of scanner
            Real-time, touch-less recognition of ID card
        Recognize information
            Background removal
            Distortion correction
            Detection of information
            Image enhancement
            Automatic extraction of graphical data (e.g. picture)
            Extraction of text from image (OCR)
        Import data
    24. How it is done
    25. Acquire video frames with Android Camera API
        Camera.PreviewCallback
        Diagram participants: Application, Camera, Callback, Android OS
        Diagram steps:
            callback = new Callback()
            camera = Camera.open()
            camera.setPreviewCallback(callback)
            onPreviewFrame(byte[] data, Camera camera)
            handle the image
        Diagram regions: Your code, Android OS
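    26. Sample code
        // Uses android.hardware.Camera; PreviewCallback is an interface and must be implemented
        Camera.PreviewCallback callback = new Camera.PreviewCallback() {
            public void onPreviewFrame(byte[] data, Camera camera) { /* handle the image */ }
        };
        Camera camera = Camera.open();
        camera.setPreviewCallback(callback);  // instance method, called on the opened camera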
    27. The YCBCR video format
        YCBCR is a version of the YUV color space
        YUV is used in analog color TV broadcasting
        YCBCR is used in digital video applications
        YCBCR components:
            Y = luma (or luminance, or intensity)
            CB = blue chroma difference
            CR = red chroma difference
        The human visual system is more sensitive to luma than to the chromaticity values
        Figure panels: Y, CB, CR
        Images source: Wikipedia.
    28. YCBCR to RGB color space
        Translate to RGB by yourself: (formula image, not included in this transcript)
        Or let Android do it:
            android.graphics.YuvImage class
            compressToJpeg() method
            Available in Android 2.2+
        Image source: Wikipedia.
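The "translate by yourself" route can be sketched in plain Java. The slide's own formula is not part of this transcript, so the coefficients below are an assumption: they are the widely used full-range (JFIF-style) BT.601 values, and the class and method names are hypothetical.

```java
/** Hypothetical helper: converts one full-range (JFIF-style) YCbCr pixel to packed ARGB. */
final class YCbCrToRgb {

    /** Clamps a value to the valid 8-bit range 0..255. */
    static int clamp(long v) {
        return (int) Math.max(0, Math.min(255, v));
    }

    /** y, cb and cr are unsigned 8-bit values (0..255); returns 0xAARRGGBB with full alpha. */
    static int toArgb(int y, int cb, int cr) {
        double u = cb - 128.0;  // blue chroma difference, centered on zero
        double v = cr - 128.0;  // red chroma difference, centered on zero
        int r = clamp(Math.round(y + 1.402 * v));
        int g = clamp(Math.round(y - 0.344136 * u - 0.714136 * v));
        int b = clamp(Math.round(y + 1.772 * u));
        return 0xFF000000 | (r << 16) | (g << 8) | b;
    }
}
```

With neutral chroma (CB = CR = 128) the three channels all equal Y, which is why the luma plane alone already gives a gray-scale image. A camera that delivers video-range values would need different scaling.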
    29. Gray-scale image from YCBCR
        Gray-scale images (8-bit depth) are:
            Smaller in size
            Faster to process
        A gray-scale image can be obtained straight from YCBCR without further processing
        Byte array returned by onPreviewFrame() (8-bit values):
            Y Y Y Y Y Y Y Y … CB CR CB CR …
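A minimal sketch of that shortcut, assuming the buffer layout shown on the slide: the first width × height bytes are the luma samples and the chroma bytes follow. The class name `LumaPlane` is hypothetical, and the frame size has to come from the camera's preview settings.

```java
/** Hypothetical helper: reads the gray-scale (luma) plane from a preview buffer. */
final class LumaPlane {

    /** Returns the first width*height bytes of data as unsigned gray values (0..255). */
    static int[] toGray(byte[] data, int width, int height) {
        int n = width * height;
        if (data.length < n) {
            throw new IllegalArgumentException("buffer is shorter than the luma plane");
        }
        int[] gray = new int[n];
        for (int i = 0; i < n; i++) {
            gray[i] = data[i] & 0xFF;  // Java bytes are signed; mask to get 0..255
        }
        return gray;
    }
}
```

The chroma bytes are simply ignored, so no color conversion is needed for the processing steps that follow.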
    30. Image edge detection
        Edge detection is a fundamental tool in image processing
        Many algorithms in scientific literature (Sobel, Canny, etc.)
        Gray-scale image as input
        Smooth image to eliminate noise
        Analyze pixel neighborhood
        Detect edges
        Figure panels: Vertical edges, Horizontal edges
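As an illustration of the neighborhood analysis, here is a small Sobel sketch. The talk does not say which operator the passport demo used, so Sobel is only one of the options the slide names. The class name and the `gray[y][x]` layout are assumptions, and a separate smoothing pass is left out.

```java
/** Hypothetical helper: 3x3 Sobel responses on a gray-scale image stored as gray[y][x]. */
final class SobelEdges {
    // Horizontal intensity change: responds to vertical edges.
    private static final int[][] KX = {{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}};
    // Vertical intensity change: responds to horizontal edges.
    private static final int[][] KY = {{-1, -2, -1}, {0, 0, 0}, {1, 2, 1}};

    /** Absolute response of a 3x3 kernel at every interior pixel; border pixels stay 0. */
    static int[][] filter(int[][] gray, int[][] k) {
        int h = gray.length, w = gray[0].length;
        int[][] out = new int[h][w];
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                int sum = 0;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        sum += k[dy + 1][dx + 1] * gray[y + dy][x + dx];
                    }
                }
                out[y][x] = Math.abs(sum);
            }
        }
        return out;
    }

    static int[][] verticalEdges(int[][] gray) { return filter(gray, KX); }

    static int[][] horizontalEdges(int[][] gray) { return filter(gray, KY); }
}
```

Thresholding the responses (for example, keeping values above a chosen limit) would turn them into the binary edge images that the later steps work on.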
    31. Binary image morphology
        Binary morphology refers to the shape of a region in a binary image
        2 basic operators:
            Dilation
            Erosion
        Dilation figure: Structuring element B, Input image A, Output image
        Example: Vertical edges, Enhanced vertical edges
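Dilation as described in the speaker notes (sweep B over A and OR the translated element into an all-zero output wherever its origin meets a 1-pixel) can be sketched as follows. The class name, the `boolean[y][x]` representation and the explicit origin parameters are assumptions.

```java
/** Hypothetical helper: binary dilation of image a by structuring element b. */
final class BinaryDilation {

    /**
     * a: input image (true = 1-pixel); b: structuring element;
     * (originY, originX): position of b's origin inside b.
     */
    static boolean[][] dilate(boolean[][] a, boolean[][] b, int originY, int originX) {
        int h = a.length, w = a[0].length;
        boolean[][] out = new boolean[h][w];  // starts as all zeros (false)
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (!a[y][x]) {
                    continue;  // the origin only "stamps" on 1-pixels
                }
                // OR the translated structuring element into the output, clipped to the image.
                for (int by = 0; by < b.length; by++) {
                    for (int bx = 0; bx < b[by].length; bx++) {
                        int ty = y + by - originY, tx = x + bx - originX;
                        if (b[by][bx] && ty >= 0 && ty < h && tx >= 0 && tx < w) {
                            out[ty][tx] = true;
                        }
                    }
                }
            }
        }
        return out;
    }
}
```

A tall, narrow structuring element would thicken and join vertical edge fragments, which may be how the "enhanced vertical edges" picture was produced. The transcript does not say which element was used.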
    32. Histogram analysis
        Count the number of black pixels per column (vertical histogram)
        Count the number of black pixels per row (horizontal histogram)
        Analyze histograms
        Derive features within image
        Figure: vertical histogram, with a peak and a valley marked
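The two counts can be sketched like this, assuming the binary image is held as `black[y][x]` with `true` for a black pixel. The class and method names are hypothetical.

```java
/** Hypothetical helper: projection histograms of a binary image stored as black[y][x]. */
final class ProjectionHistogram {

    /** Vertical histogram: number of black pixels in each column. */
    static int[] vertical(boolean[][] black) {
        int[] counts = new int[black[0].length];
        for (boolean[] row : black) {
            for (int x = 0; x < row.length; x++) {
                if (row[x]) {
                    counts[x]++;
                }
            }
        }
        return counts;
    }

    /** Horizontal histogram: number of black pixels in each row. */
    static int[] horizontal(boolean[][] black) {
        int[] counts = new int[black.length];
        for (int y = 0; y < black.length; y++) {
            for (boolean p : black[y]) {
                if (p) {
                    counts[y]++;
                }
            }
        }
        return counts;
    }
}
```

Runs of high counts (peaks) and low counts (valleys) in these arrays are the features the slide refers to. They can be used to locate, for instance, the borders of a document within the frame.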
    33. Clipping and binarization
        Clip original image based on histogram analysis
        Binarization = conversion of image to black and white pixels only
        Many algorithms in scientific literature (Otsu, Sauvola, etc.)
        Adaptive binarization algorithms handle different pixel intensity values locally
        Classify pixels into background (white) and foreground (black)
        Figure panels: Clipped image, Binarized image
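As one concrete instance, the sketch below implements Otsu's method, which picks a single global threshold by maximizing the between-class variance. The talk does not say which algorithm the demo used, and Otsu is not adaptive in the local sense that Sauvola is. The class name and the flat `int[]` gray buffer (values 0..255) are assumptions.

```java
/** Hypothetical helper: global binarization with Otsu's threshold. */
final class OtsuBinarizer {

    /** Returns the threshold t (0..255) that maximizes the between-class variance. */
    static int threshold(int[] gray) {
        int[] hist = new int[256];
        for (int v : gray) {
            hist[v]++;
        }
        double sumAll = 0;
        for (int i = 0; i < 256; i++) {
            sumAll += (double) i * hist[i];
        }
        long total = gray.length;
        long weightDark = 0;   // number of pixels with value <= t
        double sumDark = 0;    // sum of their values
        double best = -1;
        int bestT = 0;
        for (int t = 0; t < 256; t++) {
            weightDark += hist[t];
            sumDark += (double) t * hist[t];
            if (weightDark == 0) {
                continue;      // no dark pixels yet
            }
            long weightLight = total - weightDark;
            if (weightLight == 0) {
                break;         // every pixel is already in the dark class
            }
            double meanDark = sumDark / weightDark;
            double meanLight = (sumAll - sumDark) / weightLight;
            double diff = meanDark - meanLight;
            double between = (double) weightDark * weightLight * diff * diff;
            if (between > best) {
                best = between;
                bestT = t;
            }
        }
        return bestT;
    }

    /** true = foreground (black): pixels at or below the threshold. */
    static boolean[] binarize(int[] gray, int t) {
        boolean[] fg = new boolean[gray.length];
        for (int i = 0; i < gray.length; i++) {
            fg[i] = gray[i] <= t;
        }
        return fg;
    }
}
```

A global threshold like this tends to struggle under uneven lighting, which is the case the locally adaptive algorithms on the slide are meant to handle.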
    34. Connected components analysis
        Use binary morphology to bring regions of foreground closer
        Identify connected components
        Connected component = set of pixels that form a connected group
        Connected components can be:
            Counted
            Filtered
            Sorted
            Labeled
            Measured
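A minimal labeling sketch using a breadth-first flood fill. The choice of 4-connectivity, the class name and the `fg[y][x]` representation are assumptions; the talk does not specify them.

```java
/** Hypothetical helper: labels 4-connected foreground regions of fg[y][x]. */
final class ConnectedComponents {

    /** Returns a label image: 0 = background, 1..n = component id. */
    static int[][] label(boolean[][] fg) {
        int h = fg.length, w = fg[0].length;
        int[][] labels = new int[h][w];
        int next = 0;
        int[] dy = {-1, 1, 0, 0};
        int[] dx = {0, 0, -1, 1};
        java.util.ArrayDeque<int[]> queue = new java.util.ArrayDeque<>();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (!fg[y][x] || labels[y][x] != 0) {
                    continue;  // background, or already part of a component
                }
                labels[y][x] = ++next;  // start a new component here
                queue.add(new int[] {y, x});
                while (!queue.isEmpty()) {
                    int[] p = queue.poll();
                    for (int i = 0; i < 4; i++) {
                        int ny = p[0] + dy[i], nx = p[1] + dx[i];
                        if (ny >= 0 && ny < h && nx >= 0 && nx < w
                                && fg[ny][nx] && labels[ny][nx] == 0) {
                            labels[ny][nx] = next;
                            queue.add(new int[] {ny, nx});
                        }
                    }
                }
            }
        }
        return labels;
    }

    /** Measures components: sizes[id] = pixel count (index 0 counts the background). */
    static int[] sizes(int[][] labels) {
        int max = 0;
        for (int[] row : labels) {
            for (int v : row) {
                max = Math.max(max, v);
            }
        }
        int[] sizes = new int[max + 1];
        for (int[] row : labels) {
            for (int v : row) {
                sizes[v]++;
            }
        }
        return sizes;
    }
}
```

Counting is then `sizes.length - 1`, and filtering or sorting can work on the `sizes` array. Bounding boxes could be collected in the same pass if position matters.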
    35. Masking ROIs (regions of interest)
        Measure connected components
        Filter components by size and position
        Detect regions of interest (ROI)
        Use ROIs to mask the clipped image
        Finally! We got the passport data!
        Figure panels: Client picture; Client data to be passed to the OCR for text recognition
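The filter-and-mask step might look like the sketch below. It assumes a label image where 0 is background and each positive value identifies one component. It filters by pixel count only, leaving out the position test the slide mentions, and it paints everything outside the ROIs white. All names are hypothetical.

```java
/** Hypothetical helper: builds an ROI mask from labeled components and applies it. */
final class RoiMask {

    /** Marks the pixels of every component whose pixel count lies in [minSize, maxSize]. */
    static boolean[][] fromLabels(int[][] labels, int minSize, int maxSize) {
        int max = 0;
        for (int[] row : labels) {
            for (int v : row) {
                max = Math.max(max, v);
            }
        }
        int[] size = new int[max + 1];
        for (int[] row : labels) {
            for (int v : row) {
                size[v]++;
            }
        }
        boolean[][] roi = new boolean[labels.length][labels[0].length];
        for (int y = 0; y < labels.length; y++) {
            for (int x = 0; x < labels[y].length; x++) {
                int id = labels[y][x];
                roi[y][x] = id != 0 && size[id] >= minSize && size[id] <= maxSize;
            }
        }
        return roi;
    }

    /** Keeps clipped[y][x] inside the ROI; every other pixel becomes white (255). */
    static int[][] apply(int[][] clipped, boolean[][] roi) {
        int[][] out = new int[clipped.length][clipped[0].length];
        for (int y = 0; y < clipped.length; y++) {
            for (int x = 0; x < clipped[y].length; x++) {
                out[y][x] = roi[y][x] ? clipped[y][x] : 255;
            }
        }
        return out;
    }
}
```

The masked image then contains only the candidate regions, which is the form in which text areas would be handed to an OCR engine.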
    36. Can we make it fast? Yes, with the Android NDK!
        C/C++ compiler
        STL libraries included since Android 2.3 (NDK r5)
        Embed C/C++ code into Java .apk
        JNI layer
        Image source: Google Inc.
    37. Augmented Reality Applications: Driver Assistance
        Images source: Opel cars.
        Video source: YouTube.
    38. Safety Applications: Driver Safety Monitor
        Images source: Opel cars.
        Video source: YouTube.
    39. Security Applications: Ambient Monitoring
        People counting and tracking
        Motion vectors
        Direction predictions
        Send alarms via SMS or network
        Video source: YouTube.
    40. Get started on Computer Vision for Android
        Computer Vision, L. Shapiro, G. C. Stockman, Prentice Hall, 2001
        Digital Image Processing, R. C. Gonzalez, R. E. Woods, Prentice Hall, 2007
        Learning OpenCV: Computer Vision with the OpenCV Library, G. Bradski, A. Kaehler, O’Reilly, 2008
        MATLAB (MathWorks Inc.)
            Image Processing Toolbox
        OpenCV C++ library:
            Sponsored by Intel
            BSD licence, free, open-source
            Android port: http://code.google.com/p/android-opencv/
    41. OpenCV library overview
    42. Do you have any questions?
        Q&A
    43. Unleash your creativity!
        Andrea Gagliardi La Gala
        andrea.lagala@gmail.com
