Slides: Perspective click-and-drag area selections in pictures

Slides for the talk at MVA 2013:
Perspective click-and-drag area selections in pictures
(best practical paper award)

Transcript of "Slides: Perspective click-and-drag area selections in pictures"

  1. Perspective click-and-drag area selections in pictures
     Frank NIELSEN, www.informationgeometry.org
     Sony Computer Science Laboratories, Inc.
     Machine Vision Applications (MVA), 21st May 2013
     © 2013 Frank Nielsen
  2. Traditional click-and-drag rectangular selection
     → Fails for selecting parts in photos: [figure]
  3. Traditional click-and-drag rectangular selection
     → Fails for selecting parts in photos:
     Cannot capture “New” without part of “Court”.
     Man-made environments: many perspectively slanted planar parts.
  4. Perspective click’n’drag
     Intelligent UI (= computer vision + human-computer interface)
     → Image “parsing” of perspective rectangles (automatic/semi-automatic/manual)
  5. Video demonstrations
     Perspective click-and-drag + perspective copy/paste/swap
  6. Perspective click’n’drag: Outline
     1. Preprocessing: detect & structure perspective parts
        1.1 Quad detector:
            ◮ Image segmentation
            ◮ Outer contour quad fitting
            ◮ Quad recognition
        1.2 Quad homography tree
     2. Interactive user interface: perspective quad selection based on click-and-drag UI (= diagonal selection)
     3. Application example: interactive image editing (swap)
  7. Preprocessing workflow [figure]
  8. Quad detection: Sobel/Hough transform
     How to detect convex quads in images?
     Indoor robotics [6]: using vanishing points.
     Limitations of the Hough transform [8] on the Sobel image: combinatorial line arrangement, O(n^4)...
     → good for a limited number of detected lines (blackboard detection [8], name card detection, etc.)
  9. Quad detection: Image segmentation (SRM)
     → Fast Statistical Region Merging [4] (SRM)
     Source code in Java™, Matlab®, Python®, C, etc.
  10. Quad detection: Image segmentation (SRM) [figure]
  11. Quad detector
      ◮ For each segmented region, consider its exterior contour C (polygon),
      ◮ Compute the contour diameter, P1P3,
      ◮ Compute the uppermost P2 and bottommost P4 extremal points,
      ◮ Calculate the symmetric Hausdorff distance between the quad Q = (P1, P2, P3, P4) and the contour C,
      ◮ Accept the region as a quad when the distance falls below a prescribed threshold.
      All quads are convex and clockwise oriented.
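The acceptance test on this slide can be sketched as follows. This is a minimal sketch rather than the talk's implementation: it assumes the contour is available as a list of densely sampled points, approximates the quad boundary by sampling its edges, and uses hypothetical names (`QuadFit`, `samplePolygon`, `acceptAsQuad`).

```java
import java.util.ArrayList;
import java.util.List;

class QuadFit {
    /** Directed Hausdorff distance: largest distance from a point of a to its nearest point of b. */
    static double directed(List<double[]> a, List<double[]> b) {
        double worst = 0.0;
        for (double[] p : a) {
            double nearest = Double.POSITIVE_INFINITY;
            for (double[] q : b) {
                nearest = Math.min(nearest, Math.hypot(p[0] - q[0], p[1] - q[1]));
            }
            worst = Math.max(worst, nearest);
        }
        return worst;
    }

    /** Symmetric Hausdorff distance between two sampled point sets. */
    static double symmetricHausdorff(List<double[]> a, List<double[]> b) {
        return Math.max(directed(a, b), directed(b, a));
    }

    /** Samples the boundary of the closed polygon v with n points per edge (assumed sampling scheme). */
    static List<double[]> samplePolygon(double[][] v, int n) {
        List<double[]> pts = new ArrayList<>();
        for (int i = 0; i < v.length; i++) {
            double[] s = v[i], e = v[(i + 1) % v.length];
            for (int k = 0; k < n; k++) {
                double t = (double) k / n;
                pts.add(new double[] { s[0] + t * (e[0] - s[0]), s[1] + t * (e[1] - s[1]) });
            }
        }
        return pts;
    }

    /** Accepts the region when the quad and its contour are closer than the threshold. */
    static boolean acceptAsQuad(double[][] quad, List<double[]> contour, double threshold) {
        return symmetricHausdorff(samplePolygon(quad, 50), contour) < threshold;
    }
}
```

The brute-force nearest-point search is quadratic in the number of samples, which is acceptable for a sketch but would likely need a spatial index for long contours.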
  12. Quad detection: Image segmentation (SRM)
      ... any closed-contour image segmentation,
      → run at different scales (e.g., parameter Q in SRM).
      Alternatively, one can also use mean-shift [9], normalized cuts [7], etc.
      Why? To increase the chance that quads are detected for some parameter setting.
      → We end up with a quad soup.
  13. Multi-segmentation
      Increases the chance of recognizing quads, but yields a quad soup.
      [figure: segmentations for Q = 128, Q = 10, Q = 0.3, Q = 0.25]
  14. Nested convex quad hierarchy
      ◮ From a quad soup, sort the quads in decreasing order of their area in a priority queue.
      ◮ Add the image boundary quad Q0 as the root of the quad tree Q.
      ◮ Greedy selection: add a quad of the queue if and only if it is fully contained in another quad of Q.
      ◮ When adding a quad Q_i, compute the homographies [2] H_i and H_i^{-1} of the quad to the unit square.
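A minimal sketch of the greedy construction described on this slide. The class and method names are hypothetical, and two points are assumptions rather than statements from the slides: each accepted quad is attached to the deepest node that fully contains it, and the handling of partially overlapping quads is left out. The homography computation per node is also omitted here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class QuadTreeSketch {
    /** A convex quad with clockwise corners (image coordinates, y pointing down). */
    static class Quad {
        final double[][] c;                          // four corners {x, y}
        final List<Quad> children = new ArrayList<>();
        Quad(double[][] corners) { c = corners; }

        double area() {                              // shoelace formula
            double s = 0.0;
            for (int i = 0; i < 4; i++) {
                double[] a = c[i], b = c[(i + 1) % 4];
                s += a[0] * b[1] - b[0] * a[1];
            }
            return 0.5 * Math.abs(s);
        }

        boolean contains(double x, double y) {       // same sign test as the CW predicate
            for (int i = 0; i < 4; i++) {
                double[] a = c[i], b = c[(i + 1) % 4];
                double det = (a[0] - x) * (b[1] - y) - (b[0] - x) * (a[1] - y);
                if (det < 0.0) return false;
            }
            return true;
        }

        boolean contains(Quad q) {                   // corner test suffices because this quad is convex
            for (double[] p : q.c) if (!contains(p[0], p[1])) return false;
            return true;
        }
    }

    /** Inserts the quads of the soup under root, largest area first. */
    static Quad build(Quad root, List<Quad> soup) {
        PriorityQueue<Quad> queue =
            new PriorityQueue<>((a, b) -> Double.compare(b.area(), a.area()));
        queue.addAll(soup);
        while (!queue.isEmpty()) {
            Quad q = queue.poll();
            if (!root.contains(q)) continue;         // not nested in the tree: discard
            Quad node = root;
            boolean descended = true;
            while (descended) {                      // walk down to the deepest containing quad
                descended = false;
                for (Quad child : node.children) {
                    if (child.contains(q)) { node = child; descended = true; break; }
                }
            }
            node.children.add(q);
        }
        return root;
    }
}
```

Processing quads by decreasing area means a container is always inserted before the quads nested inside it, so a single pass over the queue is enough.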
  15. Do not explicitly unwarp perspective rectangles
      Many existing systems first unwarp...
      [figure panels: source, segmented, unwarped]
      Mobile cell phone signage recognition [5], AR systems, etc.
  16. Perspective click’n’drag: User interaction
      Perspective sub-rectangle selection:
      Click on a corner p1 and drag the opposite corner p3.
      Find the deepest quad Q in the quad hierarchy Q that contains both points p1 and p3.
      [figure: perspective dragging vs. regular dragging. H maps the quad to the unit square, where $\tilde p'_1 = H \tilde p_1$ and $\tilde p'_3 = H \tilde p_3$; regular dragging in the unit square gives the two remaining corners $p'_2$ and $p'_4$, which $H^{-1}$ maps back to the image ($p_2 \leftarrow p'_2$, $p_4 \leftarrow p'_4$).]
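The corner construction on this slide can be sketched as below, assuming H is the 3x3 homography that maps the containing quad to the unit square. The class `PerspectiveDrag` and its methods are hypothetical names; the matrix inverse is computed with the adjugate formula.

```java
class PerspectiveDrag {
    /** Applies the 3x3 homography h to (x, y) and returns inhomogeneous coordinates. */
    static double[] apply(double[][] h, double x, double y) {
        double w = h[2][0] * x + h[2][1] * y + h[2][2];
        return new double[] {
            (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w };
    }

    /** Inverse of a 3x3 matrix (adjugate divided by the determinant). */
    static double[][] inverse(double[][] m) {
        double a = m[0][0], b = m[0][1], c = m[0][2];
        double d = m[1][0], e = m[1][1], f = m[1][2];
        double g = m[2][0], h = m[2][1], i = m[2][2];
        double det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g);
        return new double[][] {
            { (e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det },
            { (f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det },
            { (d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det } };
    }

    /** Returns the corners p1, p2, p3, p4 of the perspective rectangle with diagonal p1-p3. */
    static double[][] select(double[][] h, double[] p1, double[] p3) {
        double[] q1 = apply(h, p1[0], p1[1]);        // p1' = H p1, in the unit square
        double[] q3 = apply(h, p3[0], p3[1]);        // p3' = H p3, in the unit square
        double[][] hInv = inverse(h);
        double[] p2 = apply(hInv, q3[0], q1[1]);     // p2 <- p2' (axis-aligned corner)
        double[] p4 = apply(hInv, q1[0], q3[1]);     // p4 <- p4' (axis-aligned corner)
        return new double[][] { p1, p2, p3, p4 };
    }
}
```

Only four points are mapped per mouse event, so the selection outline can be refreshed while dragging without unwarping any pixels.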
  17. Some examples of perspective click-and-drag selections
      Regular vs. perspective rectangle UI selection [figure]
  18. Implementation details: Primitives on convex quads
      By convention, order quads clockwise.
      Positive determinant for the two quad-induced triangles:
      \[ \det = \begin{vmatrix} x_1 - x_3 & x_2 - x_3 \\ y_1 - y_3 & y_2 - y_3 \end{vmatrix} \]
      ◮ Predicate p ∈ Q = (p1, p2, p3, p4)?: two triangle queries, p ∈ (p1, p2, p3) and p ∈ (p3, p4, p1); p lies in Q when either query succeeds.
      ◮ Area of a quad: sum, over the two quad triangles, of one half of the absolute value of the triangle determinant.
  19. In class Quadrangle

          // Area of the triangle (p1, p2, p3)
          double area(Feature p1, Feature p2, Feature p3) {
              double res = (p1.x - p3.x) * (p2.y - p1.y) - (p1.x - p2.x) * (p3.y - p1.y);
              return 0.5 * Math.abs(res); // half of determinant
          }

          // Area of the quad: sum of its two triangles
          double area() {
              return area(p1, p2, p3) + area(p1, p3, p4);
          }

          //
          // Clockwise or aligned order predicate
          //
          boolean CW(Feature a, Feature b, Feature c) {
              double det = (a.x - c.x) * (b.y - c.y) - (b.x - c.x) * (a.y - c.y);
              return det >= 0.0;
          }

          // Determine if a pixel falls inside the quadrangle or not
          boolean inside(int x, int y) {
              Feature p = new Feature(x, y, 1.0);
              return CW(p1, p2, p) && CW(p2, p3, p) && CW(p3, p4, p) && CW(p4, p1, p);
          }
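  20. Homography estimation
      Projective geometry, homogeneous and inhomogeneous coordinates.
      \[ \tilde p'_i = \begin{pmatrix} \tilde x'_i \\ \tilde y'_i \\ w'_i \end{pmatrix}
         = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix}
           \begin{pmatrix} \tilde x_i \\ \tilde y_i \\ w_i \end{pmatrix} = H \tilde p_i, \]
      \[ w'_i = h_{31} \tilde x_i + h_{32} \tilde y_i + h_{33} w_i. \]
      Inhomogeneous coordinates $x'_i = \tilde x'_i / w'_i$, $y'_i = \tilde y'_i / w'_i$:
      \[ x'_i = \frac{h_{11} \tilde x_i + h_{12} \tilde y_i + h_{13} w_i}{h_{31} \tilde x_i + h_{32} \tilde y_i + h_{33} w_i}, \qquad
         y'_i = \frac{h_{21} \tilde x_i + h_{22} \tilde y_i + h_{23} w_i}{h_{31} \tilde x_i + h_{32} \tilde y_i + h_{33} w_i}. \]
      With $w_i = 1$ (so $\tilde x_i = x_i$ and $\tilde y_i = y_i$):
      \[ x'_i (h_{31} x_i + h_{32} y_i + h_{33}) = h_{11} x_i + h_{12} y_i + h_{13}, \]
      \[ y'_i (h_{31} x_i + h_{32} y_i + h_{33}) = h_{21} x_i + h_{22} y_i + h_{23}. \]
      These two equations per point pair form the block matrix $A_i$. Solve for $A_i h = 0$.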
  21. Homography estimation using inhomogeneous system
      Assume $h_{33} \neq 0$ (and set $h_{33} = 1$).
      \[ \begin{pmatrix}
         x_1 & y_1 & 1 & 0 & 0 & 0 & -x'_1 x_1 & -x'_1 y_1 \\
         0 & 0 & 0 & x_1 & y_1 & 1 & -y'_1 x_1 & -y'_1 y_1 \\
         x_2 & y_2 & 1 & 0 & 0 & 0 & -x'_2 x_2 & -x'_2 y_2 \\
         0 & 0 & 0 & x_2 & y_2 & 1 & -y'_2 x_2 & -y'_2 y_2 \\
         x_3 & y_3 & 1 & 0 & 0 & 0 & -x'_3 x_3 & -x'_3 y_3 \\
         0 & 0 & 0 & x_3 & y_3 & 1 & -y'_3 x_3 & -y'_3 y_3 \\
         x_4 & y_4 & 1 & 0 & 0 & 0 & -x'_4 x_4 & -x'_4 y_4 \\
         0 & 0 & 0 & x_4 & y_4 & 1 & -y'_4 x_4 & -y'_4 y_4
         \end{pmatrix}
         \times
         \begin{pmatrix} h_{11} \\ h_{12} \\ h_{13} \\ h_{21} \\ h_{22} \\ h_{23} \\ h_{31} \\ h_{32} \end{pmatrix}
         =
         \begin{pmatrix} x'_1 \\ y'_1 \\ x'_2 \\ y'_2 \\ x'_3 \\ y'_3 \\ x'_4 \\ y'_4 \end{pmatrix} \]
      Linear system written $B h' = b$, where $h'$ is the vector of the eight unknowns. For four pairs, $h' = B^{-1} b$.
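A minimal sketch of solving the 8x8 system Bh′ = b for four point pairs, using Gauss-Jordan elimination with partial pivoting instead of an explicit inverse. The class name and the array layout (`src[i] = {x_i, y_i}`, `dst[i] = {x'_i, y'_i}`) are assumptions made for illustration.

```java
class HomographyFromFourPoints {
    /** Estimates H (with h33 = 1) such that dst[i] ~ H src[i] for four point pairs. */
    static double[][] estimate(double[][] src, double[][] dst) {
        double[][] m = new double[8][];              // augmented matrix [B | b]
        for (int i = 0; i < 4; i++) {
            double x = src[i][0], y = src[i][1], xp = dst[i][0], yp = dst[i][1];
            m[2 * i]     = new double[] { x, y, 1, 0, 0, 0, -xp * x, -xp * y, xp };
            m[2 * i + 1] = new double[] { 0, 0, 0, x, y, 1, -yp * x, -yp * y, yp };
        }
        for (int col = 0; col < 8; col++) {
            int pivot = col;                         // partial pivoting: largest entry in this column
            for (int r = col + 1; r < 8; r++) {
                if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) pivot = r;
            }
            double[] tmp = m[col]; m[col] = m[pivot]; m[pivot] = tmp;
            if (Math.abs(m[col][col]) < 1e-12) {
                throw new IllegalArgumentException("degenerate point configuration");
            }
            for (int r = 0; r < 8; r++) {            // eliminate the column from every other row
                if (r == col) continue;
                double f = m[r][col] / m[col][col];
                for (int k = col; k < 9; k++) m[r][k] -= f * m[col][k];
            }
        }
        double[] h = new double[8];
        for (int i = 0; i < 8; i++) h[i] = m[i][8] / m[i][i];
        return new double[][] { { h[0], h[1], h[2] }, { h[3], h[4], h[5] }, { h[6], h[7], 1.0 } };
    }

    /** Applies H to (x, y) and returns inhomogeneous coordinates. */
    static double[] apply(double[][] h, double x, double y) {
        double w = h[2][0] * x + h[2][1] * y + h[2][2];
        return new double[] {
            (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w };
    }
}
```

For the quad-to-unit-square case, `dst` would be the unit square corners listed in the same clockwise order as the quad corners in `src`.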
  22. Homography estimation using the normalized DLT
      \[ A = U D V^\top = \sum_{i=1}^{9} \lambda_i u_i v_i^\top, \]
      where A stacks the blocks $A_i$.
      The solution is the right singular vector of A corresponding to the smallest singular value (the last column vector $v_9$ of V).
      When $\lambda_9 = 0$, the system is exactly determined. When $\lambda_9 > 0$, the system is over-determined and $\lambda_9$ is an indicator of the goodness of fit of the solution $h = v_9$.
      In practice, this estimation procedure is highly unstable numerically [2]. Points first need to be normalized so that their centroid defines the origin and the diameter is set to $\sqrt{2}$.
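A minimal sketch of the point normalization step. The slide states the scale in terms of the diameter; this sketch assumes the common variant from [2] in which the mean distance of the points to the origin is scaled to √2. The class and method names are hypothetical.

```java
class PointNormalization {
    /** Returns the 3x3 similarity T: centroid moved to the origin, mean distance scaled to sqrt(2). */
    static double[][] normalizer(double[][] pts) {
        double cx = 0.0, cy = 0.0;
        for (double[] p : pts) { cx += p[0]; cy += p[1]; }
        cx /= pts.length;
        cy /= pts.length;
        double mean = 0.0;                           // mean distance to the centroid
        for (double[] p : pts) mean += Math.hypot(p[0] - cx, p[1] - cy);
        mean /= pts.length;
        double s = Math.sqrt(2.0) / mean;
        return new double[][] { { s, 0, -s * cx }, { 0, s, -s * cy }, { 0, 0, 1 } };
    }

    /** Applies the similarity t to every point and returns the normalized copies. */
    static double[][] transform(double[][] t, double[][] pts) {
        double[][] out = new double[pts.length][2];
        for (int i = 0; i < pts.length; i++) {
            out[i][0] = t[0][0] * pts[i][0] + t[0][2];
            out[i][1] = t[1][1] * pts[i][1] + t[1][2];
        }
        return out;
    }
}
```

With T and T′ the normalizers of the source and target points and Ĥ the homography estimated on normalized points, the homography in the original coordinates is recovered as H = T′⁻¹ Ĥ T.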
  23. Image editing: Selection swaps
      H12 from Q1 to Q2 by composition:
      \[ H_{12} = H_1^{-1} H_2, \qquad H_{21} = H_{12}^{-1} = H_2^{-1} H_1 \]
      → backward pixel mapping [3] (avoids holes)
      [figure: forward mapping SRC → DEST (H) vs. backward mapping DEST → SRC (H^{-1})]
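Backward pixel mapping can be sketched as follows: every destination pixel is pulled from the source through H⁻¹, so no destination pixel is left unassigned. This is a minimal sketch under assumptions not taken from the slides: images are plain `int[row][column]` arrays, sampling is nearest-neighbour, and the names `BackwardWarp` and `warp` are hypothetical.

```java
class BackwardWarp {
    /** Applies the 3x3 homography h to (x, y) and returns inhomogeneous coordinates. */
    static double[] apply(double[][] h, double x, double y) {
        double w = h[2][0] * x + h[2][1] * y + h[2][2];
        return new double[] {
            (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w };
    }

    /**
     * Fills an outH x outW destination image by mapping each destination pixel
     * back into src with hInv (DEST -> SRC); pixels falling outside src get background.
     */
    static int[][] warp(int[][] src, double[][] hInv, int outW, int outH, int background) {
        int[][] dst = new int[outH][outW];
        for (int y = 0; y < outH; y++) {
            for (int x = 0; x < outW; x++) {
                double[] p = apply(hInv, x, y);
                int u = (int) Math.round(p[0]);      // nearest source column
                int v = (int) Math.round(p[1]);      // nearest source row
                boolean insideSrc = v >= 0 && v < src.length && u >= 0 && u < src[0].length;
                dst[y][x] = insideSrc ? src[v][u] : background;
            }
        }
        return dst;
    }
}
```

Forward mapping would instead loop over source pixels and write to rounded destination positions, which can leave destination pixels unwritten when the quad is magnified.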
  24. Image editing: Selection swaps [figure]
  25. Image editing: Selection swaps [figure]
  26. Image editing: Selection swaps [figure]
  27. Image editing: Selection swaps [figure]
  28. Perspective Click-and-Drag UI: Conclusion
      ◮ Simple UI system relying on computer vision.
      ◮ Extend to other input formats: stereo pairs, RGBZ images, etc.
      ◮ Implemented using processing.org (2500+ lines)
      Ongoing work:
      ◮ Rely on efficient quad detection: extensive benchmarking (BSDS500, Corel, ImageNet, etc. databases)
      ◮ Extend to various perspectively slanted shapes (like ball → ellipsoids, etc.)
      ◮ Robust multiple quad-to-square homography estimations [1]?
      www.informationgeometry.org
  29. Bibliographic references I
      [1] Anders Eriksson and Anton van den Hengel. Optimization on the manifold of multiple homographies. pages 24 –249, 2009.
      [2] R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, ISBN: 0521540518, second edition, 2004.
      [3] Frank Nielsen. Visual Computing: Geometry, Graphics, and Vision. Charles River Media / Thomson Delmar Learning, 2005.
      [4] Richard Nock and Frank Nielsen. Statistical region merging. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(11):1452–1458, 2004.
      [5] Michael Rohs and Christof Roduner. Camera phones with pen input as annotation devices. In Pervasive workshop on Pervasive Mobile Interaction Devices (PERMID), pages 23–26, Munich, Germany, 2005.
      [6] David Shaw and Nick Barnes. Perspective rectangle detection. Proceedings of the Workshop of the Application of, pages 1–152, 2006.
      [7] Jianbo Shi and Jitendra Malik. Normalized cuts and image segmentation. In Proceedings of the 1997 Conference on Computer Vision and Pattern Recognition (CVPR ’97), pages 731–737, Washington, DC, USA, 1997. IEEE Computer Society.
      [8] Zhengyou Zhang and Li-wei He. Whiteboard scanning and image enhancement. Digital Signal Processing, 17(2):414–432, 2007.
  30. Bibliographic references II
      [9] Huiyu Zhou, Xun Wang, and Gerald Schaefer. Mean shift and its application in image segmentation. In Halina Kwasnicka and Lakhmi Jain, editors, Innovations in Intelligent Image Analysis, volume 339 of Studies in Computational Intelligence, pages 291–312. Springer Berlin / Heidelberg, 2011.