Keynote from ISUVR'10


Published on

The slides from my keynote address to the International Symposium on Ubiquitous Virtual Reality, 2010.

Makes the case for image-based modelling as a tool for 3D user-created content, and argues that user-created content is critical to AR and ubiquitous AR particularly.

Published in: Technology, Art & Photos
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Keynote from ISUVR'10

  1. 1. Image-based modelling for augmented reality<br />Anton van den Hengel<br />Director, Australian Centre for Visual technologies<br />Professor, Adelaide University, South Australia<br />Director, PunchCard Visual Technologies<br />
  2. 2. 3D Modelling for AR<br />AR needs models<br />AR is about the interaction between the real and the synthetic<br />3D modelling isn’t much fun<br />Even with the best interfaces invented<br />3D Studio Max?<br />Blender?<br />
  3. 3. User-created content<br />2D UCC has changed the face of the web<br />Blogs, Wikis, Social networking sites, Advertising, Fanfiction, News Sites, Trip planners, Mobile Photos & Videos, Customer review sites, Forums, Experience and photo sharing sites, Audio, Video games, Maps and location systems and such, but more<br />Associated Content,,, Brickfish, CreateDebate, Dailymotion, Deviant Art, Demotix, Digg, eBay, Eventful, Fark, Epinions, Facebook, Filemobile, Flickr, Forelinksters, Friends Reunited, GiantBomb,, HubPages, InfoBarrel, iStockphoto,, JayCut, Mahalo, Metacafe,, MySpace, Newgrounds, Orkut, OpenStreetMap, Picasa, Photobucket, PhoneZoo, Revver, Scribd, Second Life, Shutterstock, Shvoong, Skyrock, Squidoo, TripAdvisor, The Politicus, TypePad, Twitter, Urban Dictionary, Veoh, Vimeo, Widgetbox, Wigix, Wikia, WikiMapia, Wikinvest, Wikipedia,, WordPress, Yelp, YouTube, YoYoGames, Zooppa<br />
  4. 4. User-created content for AR<br />
  5. 5. Google-created content for AR<br />
  6. 6. UCC for AR<br />Just using images is a good start<br />But limits interactions to 2D<br />Flexible AR requires 3D models<br />Ubiquitous AR requires UCC<br />Flexible ubiquitous AR requires 3D UCC<br />
  7. 7. 3D UCC<br />3D has been limited by the lack of good UCC tools<br />This is true for AR<br />But also VR, 3D TV, Second Life, Google Earth, Little Big Planet, 3D PDF, Adobe Premier, Unreal Tournament, Playstation, SGML, ...<br />
  8. 8. 3D UCC<br />AR particularly needs to model the real world<br />Images are a good source of 3D information<br />Easily accessible<br />They’re typically captured anyway<br />Almost everything has a camera attached<br />Humans are very good at interpreting them<br />Can AR be ubiquitous without UCC?<br />
  9. 9. Image-based 3D UCC<br />The image is the interface<br />People can’t help but see images in 3D<br />Most image sets embody 3D<br />Powerful way to model real objects<br />Varying levels of interaction<br />Varying types of models<br />Helps even in modelling imaginary objects<br />
  10. 10. Image-based modelling for AR<br />AR is largely about interactive images<br />Any other mode of interaction adds complexity<br />The majority of the content is real<br />3D modelling from images seems a natural fit with AR<br />
  11. 11. Image-based 3D modelling<br />Automatic<br />Very detailed models of everything<br />But it’s getting better<br />Interactive<br />Means you can specify<br />What you want to model<br />What kind of model you want<br />
  12. 12. Videotrace<br />Interactive image-based modelling<br />A familiar interface<br />Image-based interactions<br />The image is the interface<br />Generates low polygon count models with textures<br />
  13. 13. Videotrace<br />
  14. 14. Input<br />
  15. 15. Modelling<br />
  16. 16. Results<br />
  17. 17. Another example<br />
  18. 18. Modeling<br />
  19. 19. Results<br />
  20. 20. Interactive 3D modelling<br />3D modelling is critical to all sorts of application<br />Special effects, but also mining, architecture, defence, urban planning, …<br />People are getting more visually sophisticated<br />More 3D data is being generated<br />More cameras, but also scanners etc<br />The interfaces of modelling programs are usually very hard to fathom<br />
  21. 21. Low polygon-count models<br />Insert your own objects into a game<br />Model an environment for AR<br />Put your house into Google Earth<br />Video editing<br />Cut and paste between sequences<br />Remove someone from your home videos<br />
  22. 22. Put your truck into a game<br />
  23. 23. Put your truck into a game<br />
  24. 24. Modelling for special effects<br />
  25. 25. Video editing requires models<br />
  26. 26. Video editing requires models<br />
  27. 27. Modelling architecture<br />
  28. 28. Modeling for virtual environments<br />
  29. 29. Modeling for virtual environments<br />
  30. 30. Modeling for virtual environments<br />
  31. 31. Modeling for virtual environments<br />
  32. 32. The process<br />Capture and import the video<br />Run video through the camera tracker<br />Performs structure and motion analysis <br />Interact with the system to generate and edit the model<br />Export to your application<br />
  33. 33. The approach<br />Pre-compute where possible<br />Structure from motion (camera tracking)<br />Superpixels<br />Then interact<br />Interactions allow user to exploit precomputed results<br />
  34. 34. Structure from motion<br />Camera tracking<br />Calculates<br />Reconstructed point cloud<br />Camera parameters<br />Location<br />Orientation<br />Intrinsics (eg. Focal length)<br />Informs interaction interpretation process<br />
  35. 35. Structure from motion<br />
  36. 36. Interactions<br />Straight lines<br />Closed sets of lines define planar polygons<br />Curves<br />For planar shapes with curved edges<br />For NURBS surfaces<br />Mirroring<br />Duplicates existing geometry<br />Extrusion<br />Dense meshing<br />
  37. 37. Fitting planar faces<br />User specifies boundary<br />Boundary specifies infinitely many planes<br />Fitting similar to pre-emptive RANSAC<br />Generate bounded plane hypotheses from point cloud<br />Eliminate hypotheses that fail a series of tests<br />Run simplest / most robust tests first<br />Generally 3d tests before 2d tests<br />
  38. 38. Line of sight<br />Object points<br />Fitting planar faces<br />Image plane<br />
  39. 39. Hierarchical RANSAC<br />Generate bounded plane hypotheses<br />Tests<br />Support from point cloud<br />Reprojects within new image boundaries<br />Constraints on relative edge length and face size<br />Colour histogram matching on faces<br />Colour matching on edge projections<br />Reprojection is not self-occluding<br />
  40. 40. 2D Curves<br />
  41. 41. 3D Curves<br />
  42. 42. Mirroring<br />
  43. 43. Extrusion<br />
  44. 44. Dense surface reconstruction<br />
  45. 45. Live modelling<br />
  46. 46. Live modelling<br />Most geometry cannot be modelled beforehand<br />You can’t tell where it will be<br />Modelling the whole world won’t work<br />Need to generate models in-situ<br />While you’re there<br />
  47. 47. Live modelling in AR<br />Using VideoTrace to model geometry from live video<br />To insert elsewhere in the world<br />So real objects can occlude synthetic geometry<br />
  48. 48. Live modelling for AR<br />The camera tracking is performed live using SLAM<br />Simultaneous Localisation and mapping<br />Markerless video tracking<br />No prior model of the space<br />Using PTAM<br />Parallel Tracking and Mapping<br />Klein and Murray<br />
  49. 49. PTAM<br />
  50. 50. Videotrace - Live<br />
  51. 51. Occlusion<br />Low polygon count models?<br />Needed for efficiency<br />Not accurate enough for occlusion calculations<br />SLAM errors also prevent direct occlusion modelling<br />
  52. 52. Occlusion boundary refinement<br />The model of the foreground object is projected into the image<br />Using the PTAM-estimated camera parameters<br />But there is always some misalignment<br />Solve using a live segmentation of the real object from the video<br />
  53. 53. Occlusion boundary refinement<br />Lay out nodes of a graph around the projected boundary<br />Set foreground and background probabilities per node from colour model<br />Set link weights from edge strength<br />Segment using max-flow algorithm<br />At frame rate<br />
  54. 54. Occlusion boundary refinement<br />
  55. 55. Occlusion boundary refinement<br />Graph cut means that model doesn’t need to be accurate<br />Very low polygon counts<br />Very simple modelling process<br />More complex objects possible<br />
  56. 56. Occlusion boundary refinement<br />Graph cut gives a hard segmentation<br />Fix with an alpha matte<br />Blends between foreground and synthetic object<br />Fixes some holes in the cut<br />
  57. 57. Live modelling for AR<br />
  58. 58. AR modelling for other purposes<br />
  59. 59. Minimal interaction AR modelling<br />Use the camera as the modelling tool<br />The user only specifies the object, the rest is done with the camera<br />Projective texturing<br />Some compensation for Visual Hull<br />
  60. 60. Silhouette modelling<br />
  61. 61. Minimal interaction modelling<br />
  62. 62. How to get Videotrace<br />It’s available on free beta test<br />Just register at<br />They will email you a link<br />It’s a real beta<br />Hopefully the final version will be free too<br />
  63. 63. What’s next?<br />New interactions, applications and data sources<br />Interactive SFM, Better SLAM<br />Videoshop<br />