Inference and perception are important, and so are the intent and goal of the photo. Just as the camera put photorealistic art out of business, maybe this new art form will put the traditional camera out of business, because we won't really care about a photo that is merely a recording of light; we will want a form that captures a meaningful subset of the visual experience. Examples include multiperspective photos and Photosynth.
Many think all this is for CP
A stereo pair is a simple example of coded photography. So are many decomposition problems, such as direct/global and diffuse/specular.
Comparisons
SIGGRAPH 2008 Class: Computational Photography. Debevec: Illumination as Computing / Scene & Performance Capture, August 2008.
Nayar et al. used high-frequency illumination patterns to quickly separate “direct” and “global” components. Basically, the global components stay the same as you phase-shift high-frequency illumination on the scene, while the direct components appear and disappear at different pixels. Taking the minimum value over a sequence of phase shifts yields the global component, multiplied by the fill ratio of the patterns; the maximum minus the minimum yields the direct component.
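As a rough illustration of the min/max rule in the note above (a sketch, not the authors' code), the per-pixel arithmetic could look like this in Python. The function name `separate_direct_global`, the `fill_ratio` parameter, and the tiny synthetic scene are assumptions made for this example. It also assumes that unlit pattern pixels emit no light and that every pixel is lit in at least one shift and unlit in at least one other.

```python
def separate_direct_global(frames, fill_ratio=0.5):
    """Split pattern-shifted images into direct and global components.

    frames: list of images (each a list of rows of floats), one per phase
    shift of the high-frequency pattern.
    fill_ratio: fraction of pattern pixels that are lit (assumed known).
    """
    if not frames:
        raise ValueError("need at least one frame")
    if not 0.0 < fill_ratio < 1.0:
        raise ValueError("fill_ratio must be strictly between 0 and 1")
    height, width = len(frames[0]), len(frames[0][0])
    direct = [[0.0] * width for _ in range(height)]
    global_part = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            samples = [frame[y][x] for frame in frames]
            brightest, darkest = max(samples), min(samples)
            # Minimum = global component scaled by the fill ratio.
            global_part[y][x] = darkest / fill_ratio
            # Maximum minus minimum = direct component.
            direct[y][x] = brightest - darkest
    return direct, global_part


# Synthetic 1x2 scene with made-up values, used only to exercise the function.
true_direct = [[0.6, 0.1]]
true_global = [[0.4, 0.3]]
fill = 0.5
# Frame 0 lights pixel 0 only; frame 1 is the shifted pattern and lights pixel 1 only.
frames = [
    [[true_direct[0][0] + fill * true_global[0][0], fill * true_global[0][1]]],
    [[fill * true_global[0][0], true_direct[0][1] + fill * true_global[0][1]]],
]
direct, global_part = separate_direct_global(frames, fill_ratio=fill)
print(direct, global_part)
```

On real captures the pattern has to be fine enough that the global component varies slowly compared to it; otherwise the minimum no longer isolates the global light.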
Maybe all the consumer photographer wants is a black box with a big red button: no optics, sensors, or flash. If I am standing in the middle of Times Square and need to take a photo, do I really need a fancy camera?
The camera can trawl Flickr and retrieve a photo taken at roughly the same position and at the same time of day. Maybe all the consumer wants is a blind camera.
Reversibly encode all the information in this otherwise blurred photo
The glint out of focus shows the unusual pattern.
Since we are adapting LCD technology we can fit a BiDi screen into laptops and mobile devices.
Recall that one of our inspirations was this new class of optical multi-touch device. At the top you can see a prototype that Sharp Microelectronics has published. These devices are basically arrays of naked phototransistors. Like a document scanner, they are able to capture a sharp image of objects in contact with the surface of the screen. But as objects move away from the screen, without any focusing optics, the images captured by this device are blurred.
Our observation is that by moving the sensor plane a small distance from the LCD in an optical multitouch device, we enable mask-based light-field capture. We use the LCD screen to display the desired masks, multiplexing between images displayed for the user and masks displayed to create a virtual camera array. I’ll explain more about the virtual camera array in a moment, but suffice to say that once we have measurements from the array we can extract depth.
This device would of course support multi-touch on-screen interaction, but because it can measure the distance to objects in the scene a user’s hands can be tracked in a volume in front of the screen, without gloves or other fiducials.
Thus the ideal BiDi screen consists of a normal LCD panel separated by a small distance from a bare sensor array. This format creates a single device that spatially collocates a display and capture surface.
So here is a preview of our quantitative results. I’ll explain this in more detail later on, but you can see we’re able to accurately distinguish the depth of a set of resolution targets. We show above a portion of the views from our virtual cameras, a synthetically refocused image, and the depth map derived from it.
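To make the refocusing idea concrete, here is a minimal 1D sketch of shift-and-add refocusing over a few virtual camera views, with the best-aligned shift taken as the disparity (which maps to depth once the geometry is calibrated). This is one common approach, not necessarily the pipeline used for the results above. The function names, the circular shift, the three camera offsets, and the toy scene are all assumptions made for illustration.

```python
def shift(row, k):
    """Circularly shift a 1D signal right by k samples (left if k is negative)."""
    n = len(row)
    k %= n
    return row[n - k:] + row[:n - k]


def refocus(views, offsets, disparity):
    """Shift-and-add refocus; returns (refocused signal, mismatch across views)."""
    # Undo the per-camera shift that a point at this disparity would produce.
    aligned = [shift(view, -disparity * u) for view, u in zip(views, offsets)]
    columns = list(zip(*aligned))
    image = [sum(column) / len(column) for column in columns]
    # Sum of squared deviations from the mean: zero when all views agree.
    mismatch = sum((value - mean) ** 2
                   for column, mean in zip(columns, image)
                   for value in column)
    return image, mismatch


def estimate_disparity(views, offsets, candidates):
    """Pick the candidate disparity whose refocused views agree best."""
    return min(candidates, key=lambda d: refocus(views, offsets, d)[1])


# Toy scene: a single bright point, seen by three virtual cameras.
scene = [0, 0, 0, 1, 0, 0, 0, 0]
offsets = [-1, 0, 1]   # hypothetical camera positions along one axis
true_disparity = 2     # pixels of shift per unit of camera offset
views = [shift(scene, true_disparity * u) for u in offsets]

best = estimate_disparity(views, offsets, candidates=range(4))
refocused, _ = refocus(views, offsets, best)
print(best, refocused)
```

A per-pixel depth map would apply the same agreement (or a contrast) measure inside small local windows rather than over the whole signal.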
CPUs and computers don’t mimic the human brain, and robots don’t mimic human activities. Should the hardware for visual computing, that is, cameras and capture devices, mimic the human eye? Even if we decide to use a successful biological vision system as a basis, we have a range of choices: from single-chambered to compound eyes, and from shadow-based to refractive to reflective optics. So the goal of my group at the Media Lab is to explore new designs and develop software algorithms that exploit these designs.
Raskar Next Billion Cameras Siggraph 2009 - Presentation Transcript
Camera Culture Ramesh Raskar Alyosha Efros Ramesh Raskar Steve Seitz Siggraph 2009 Curated Course Next Billion Cameras http://raskar.scripts.mit.edu/nextbillioncameras/
A. Introduction‐‐5 minutes
B. Cameras of the future ( Raskar , 30 minutes) * Form factors, Modalities and Interaction * Enabling Visual Social Computing
C. Reconstructing the World ( Seitz , 30 minutes) * Photo tourism and beyond * Image‐based modeling and rendering on a massive scale * Scene summarization
D. Understanding a Billion Photos ( Efros , 30 minutes) * What will the photos depict? * Photos as visual content for computer graphics * Solving computer vision
E. Discussion‐‐10 minutes
Next Billion Cameras
Alexei (Alyosha) Efros [CMU]
Assistant professor at the Robotics Institute and the Computer Science Department at Carnegie Mellon University.
His research is in the area of computer vision and computer graphics, especially at the intersection of the two. He is particularly interested in using data-driven techniques to tackle problems which are very hard to model parametrically but where large quantities of data are readily available. Alyosha received his PhD in 2003 from UC Berkeley and spent the following year as a post-doctoral fellow in Oxford, England. Alyosha is a recipient of the NSF CAREER award (2006), the Sloan Fellowship (2008), the Guggenheim Fellowship (2008), and the Okawa Grant (2008).
http://www.cs.cmu.edu/~efros/
Ramesh Raskar [MIT]
Associate Professor at the MIT Media Lab and heads the Camera Culture research group.
The group focuses on creating a new class of imaging platforms to better capture and share the visual experience. This research involves developing novel cameras with unusual optical elements, programmable illumination, digital wavelength control, and femtosecond analysis of light transport, as well as tools to decompose pixels into perceptually meaningful components.
Raskar is a recipient of the Alfred P. Sloan Research Fellowship (2009), the TR100 Award (2004), and the Global Indus Technovator Award (2003). He holds 35 US patents and has received four Mitsubishi Electric Invention Awards. He is currently co-authoring, with Jack Tumblin, a book on computational photography.
http://www.media.mit.edu/~raskar
Steve Seitz [U-Washington]
Professor in the Department of Computer Science and Engineering at the University of Washington.
He received his Ph.D. in computer sciences at the University of Wisconsin, Madison in 1997. He was twice awarded the David Marr Prize for the best paper at the International Conference on Computer Vision, and has received an NSF CAREER Award, an ONR Young Investigator Award, and an Alfred P. Sloan Fellowship. His work on Photo Tourism (joint with Noah Snavely and Rick Szeliski) formed the basis of Microsoft's Photosynth technology. Professor Seitz is interested in problems in computer vision and computer graphics. His current research focuses on capturing the structure, appearance, and behavior of the real world from digital imagery.
http://www.cs.washington.edu/homes/seitz/
Where are the ‘cameras’?
Camera Culture Ramesh Raskar Alyosha Efros Ramesh Raskar Steve Seitz Siggraph 2009 Course Next Billion Cameras http://raskar.info/photo/
Camera Culture Ramesh Raskar Alyosha Efros Ramesh Raskar Steve Seitz Siggraph 2009 Course Next 100 Billion Cameras http://raskar.info/photo/
Key Message
Cameras will not look like anything today
Emerging optics, illumination, novel sensors
Visual Experience will differ from viewfinder
Photos will be ‘computed’
Remarkable post-capture control
Crowdsource the photo collection
Exploit priors and online collections
Visual Essence will dominate
Superior Metadata tagging for effective sharing
Fusion with non-visual data
Can you look around a corner ?
Can you decode a 5 micron feature from 3 meters away with an ordinary camera ?
Convert LCD into a big flat camera? Beyond Multi-touch
Pantheon
How do we move through a space?
What is ‘interesting’ here?
Record what you ‘feel’ not what you ‘see’
http://research.cens.ucla.edu/areas/2007/Urban_Sensing/ (Erin Brockovich)^n
Community Photo Collections U of Washington/Microsoft: Photosynth
Beyond Visible Spectrum Cedip RedShift
Trust in Images From Hany Farid
Trust in Images From Hany Farid LA Times March’03
Cameras in Developing Countries http://news.bbc.co.uk/2/hi/south_asia/7147796.stm Community news program run by village women
Vision thru tongue http://www.pbs.org/kcet/wiredscience/story/97-mixed_feelings.html Solutions for the Visually Challenged http://www.seeingwithsound.com/
New Topics in Imaging Research
Imaging Devices, Modern Optics and Lenses
Emerging Sensor Technologies
Mobile Photography
Visual Social Computing and Citizen Journalism
Imaging Beyond Visible Spectrum
Computational Imaging in Sciences (Medical)
Trust in Visual Media
Solutions for Visually Challenged
Cameras in Developing Countries
Social Stability, Commerce and Governance
Future Products and Business Models
Traditional Photography: Lens, Detector, Pixels, Image. Mimics Human Eye for a Single Snapshot: Single View, Single Instant, Fixed Dynamic Range and Depth of Field for given Illumination in a Static World. Courtesy: Shree Nayar
[Diagram: Computational Photography. Computational Illumination (Light Sources, Modulators, Generalized Optics; 4D Incident Lighting). Scene: 8D Ray Modulator. Computational Camera (Generalized Optics: 4D Ray Bender; Generalized Sensor: up to 4D Ray Sampler; Processing: Ray Reconstruction; 4D Light Field). Display: Recreate 4D Lightfield.]
Computational Photography [Raskar and Tumblin]
Epsilon Photography
Low-level vision: Pixels
Multi-photos by perturbing camera parameters
HDR, panorama, …
‘Ultimate camera’
Coded Photography
Mid-Level Cues:
Regions, Edges, Motion, Direct/global
Single/few snapshot
Reversible encoding of data
Additional sensors/optics/illum
‘Scene analysis’
Essence Photography
High-level understanding
Not mimic human eye
Beyond single view/illum
‘New artform’
captures a machine-readable representation of our world to hyper-realistically synthesize the essence of our visual experience.
[Chart: Computational Photography aims to make progress on both axes. Axis "Goal and Experience": Low Level, Mid Level, High Level, Hyper realism. Other axis labels: Raw; Angle, spectrum aware; Non-visual Data, GPS; Metadata; Priors; Comprehensive; 8D reflectance field. Regions: Digital, Epsilon, Coded, Essence. Plotted items: Camera Array; HDR, FoV; Focal stack; Decomposition problems; Depth; Spectrum; LightFields; Human Stereo Vision; Transient Imaging; Virtual Object Insertion; Relighting; Augmented Human Experience; Material editing from single photo; Scene completion from photos; Motion Magnification; Phototourism.]
2nd International Conference on Computational Photography. Papers due November 2, 2009. http://cameraculture.media.mit.edu/iccp10
Ramesh Raskar and Jack Tumblin
Book Publishers: A K Peters
Siggraph 2009 booth: 20% off
Booth #2527
ComputationalPhotography.org
Meet the Authors
Thursday at 2pm-2:30pm
Computational Photography [Raskar and Tumblin]
Epsilon Photography
Low-level vision: Pixels
Multi-photos by perturbing camera parameters
HDR, panorama, …
‘Ultimate camera’
Coded Photography
Single/few snapshot
Reversible encoding of data
Additional sensors/optics/illum
‘Scene analysis’: (Consumer software?)
Essence Photography
Beyond single view/illum
Not mimic human eye
‘New art form’
Epsilon Photography
Dynamic range
Exposure bracketing [Mann-Picard, Debevec]
Wider FoV
Stitching a panorama
Depth of field
Fusion of photos with limited DoF [Agrawala04]
Noise
Flash/no-flash image pairs
Frame rate
Triggering multiple cameras [Wilburn04]
Dynamic Range. Goal: High Dynamic Range. [Images: Short Exposure, Long Exposure]
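A hedged sketch of how a short and a long exposure might be fused into one high dynamic range estimate: each pixel value is divided by its exposure time and the estimates are averaged with weights that distrust black and saturated samples. This assumes a linear sensor with values in [0, 1]; the methods cited on the slide also recover the camera response curve, which this sketch does not attempt. The names `hat_weight`, `merge_exposures`, and `capture`, and the sample radiances, are made up for illustration.

```python
def hat_weight(value):
    """Trust mid-range values most; weight is 0 at 0.0 (black) and 1.0 (saturated)."""
    return max(0.0, 1.0 - abs(2.0 * value - 1.0))


def merge_exposures(images, exposure_times):
    """Merge bracketed exposures (flat lists of values in [0, 1]) into relative radiance."""
    shortest = exposure_times.index(min(exposure_times))
    merged = []
    for samples in zip(*images):  # one tuple of samples per pixel
        weights = [hat_weight(v) for v in samples]
        total = sum(weights)
        if total > 0.0:
            estimate = sum(w * v / t for w, v, t
                           in zip(weights, samples, exposure_times)) / total
        else:
            # Every sample is black or saturated: fall back to the shortest exposure.
            estimate = samples[shortest] / exposure_times[shortest]
        merged.append(estimate)
    return merged


def capture(radiance, exposure_time):
    """Simulate a linear sensor that clips at 1.0 (assumption for this demo)."""
    return [min(1.0, r * exposure_time) for r in radiance]


# Two-pixel scene: one dim pixel, one that saturates in the long exposure.
radiance = [0.5, 40.0]
exposure_times = [0.01, 1.0]   # short and long exposure, arbitrary units
images = [capture(radiance, t) for t in exposure_times]
hdr = merge_exposures(images, exposure_times)
print(hdr)
```

The long exposure dominates the estimate for the dim pixel, while the bright pixel is recovered from the short exposure alone because its long-exposure sample is clipped.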
High frequency illumination, Global/direct illumination [Nayar06]
Glare decomposition [Talvala07, Raskar08]
Coded Sensor
Gradient camera [Tumblin05]
"Fast Separation of Direct and Global Components of a Scene using High Frequency Illumination," S.K. Nayar, G. Krishnan, M. D. Grossberg, R. Raskar, ACM Trans. on Graphics (also Proc. of ACM SIGGRAPH), Jul, 2006.
Computational Photography [Raskar and Tumblin]
Epsilon Photography
Multiphotos by varying camera parameters
HDR, panorama
‘Ultimate camera’: (Photo-editor)
Coded Photography
Single/few snapshot
Reversible encoding of data
Additional sensors/optics/illum
‘Scene analysis’: (Next software?)
Essence Photography
High-level understanding
Not mimic human eye
Beyond single view/illum
‘New artform’
Blind Camera: Sascha Pohflepp, University of the Arts, Berlin, 2006
Data-driven enhancement of facial attractiveness [Leyvand et al 2008]
Deblurring [Fergus et al 2006, Several 2008 and 2009 papers]
Scene Completion Using Millions of Photographs Hays and Efros, Siggraph 2007
Community Photo Collections U of Washington/Microsoft: Photosynth
Can you look around a corner? Kirmani, Hutchinson, Davis, Raskar 2009. Accepted for ICCV 2009, Oct 2009 in Kyoto. Impulse Response of a Scene
Femtosecond laser as light source; picosecond detector array as camera
Coded Aperture Camera: The aperture of a 100 mm lens is modified. Insert a coded mask with a chosen binary pattern. The rest of the camera is unmodified.
In Focus Photo LED
Out of Focus Photo: Open Aperture
Out of Focus Photo: Coded Aperture
Captured Blurred Photo
Refocused on Person
Smart Barcode size : 3mm x 3mm
Ordinary Camera: Distance 3 meter
Computational Probes: Long Distance Bar-codes. Mohan, Woo, Smithwick, Hiura, Raskar. Accepted as Siggraph 2009 paper
Bokode
Barcode markers that assist machines in understanding the real world
Bokode: imperceptible visual tags for camera based interaction from a distance. Ankit Mohan, Grace Woo, Shinsaku Hiura, Quinn Smithwick, Ramesh Raskar. Camera Culture Group, MIT Media Lab
Defocus blur of Bokode
Image greatly magnified. Simplified Ray Diagram
Our Prototypes
street-view tagging
BiDi Screen: Converting LCD Screen = large Camera for 3D Interactive HCI and Video Conferencing. Matthew Hirsch, Henry Holtzman, Doug Lanman, Ramesh Raskar
Beyond Multi-touch: Mobile Laptops Mobile
Light Sensing Pixels in LCD: display with embedded optical sensors. Sharp Microelectronics Optical Multi-touch Prototype
Design Overview: display with embedded optical sensors. [Diagram labels: LCD, displaying mask; optical sensor array; ~2.5 cm; ~50 cm]
Beyond Multi-touch: Hover Interaction
Seamless transition of multitouch to gesture
Thin package, LCD
Design Vision: Collocated Capture and Display. [Diagram labels: Object; Bare Sensor; Spatial Light Modulator]
Touch + Hover using Depth Sensing LCD Sensor
Overview: Sensing Depth from Array of Virtual Cameras in LCD
Next Billion Cameras
Visual Social Computing
Computational Photography
Digital
Epsilon
Coded
Essence
Beyond Traditional Imaging
Looking around a corner
LCDs as virtual cameras
Computational probes (bokode)
Camera Culture Group, MIT Media Lab. Ramesh Raskar, http://raskar.info. Cameras of the Future. [Chart repeated: Digital, Epsilon, Coded, Essence; Computational Photography aims to make progress on both axes]
Next Billion Cameras. [Chart repeated: Digital, Epsilon, Coded, Essence; Computational Photography aims to make progress on both axes]
A. Cameras of the future ( Raskar , 30 minutes) * Enabling Visual Social Computing * Computational Photography * Beyond Traditional Imaging
B. Reconstructing the World ( Seitz , 30 minutes) * Photo tourism and beyond * Image‐based modeling and rendering on a massive scale * Scene summarization
C. Understanding a Billion Photos ( Efros , 30 minutes) * What will the photos depict? * Photos as visual content for computer graphics * Solving computer vision