SIGGRAPH 2014 Course on Computational Cameras and Displays (part 4), Matthew O'Toole
Recent advances in both computational photography and displays have given rise to a new generation of computational devices. Computational cameras and displays provide a visual experience that goes beyond the capabilities of traditional systems by adding computational power to optics, lights, and sensors. These devices are breaking new ground in the consumer market, including lightfield cameras that redefine our understanding of pictures (Lytro), displays for visualizing 3D/4D content without special eyewear (Nintendo 3DS), motion-sensing devices that use light coded in space or time to detect motion and position (Kinect, Leap Motion), and a movement toward ubiquitous computing with wearable cameras and displays (Google Glass).
This short (1.5 hour) course serves as an introduction to the key ideas and an overview of the latest work in computational cameras, displays, and light transport.
SIGGRAPH 2014 Course on Computational Cameras and Displays (part 3), Matthew O'Toole
Recent advances in both computational photography and displays have given rise to a new generation of computational devices. Computational cameras and displays provide a visual experience that goes beyond the capabilities of traditional systems by adding computational power to optics, lights, and sensors. These devices are breaking new ground in the consumer market, including lightfield cameras that redefine our understanding of pictures (Lytro), displays for visualizing 3D/4D content without special eyewear (Nintendo 3DS), motion-sensing devices that use light coded in space or time to detect motion and position (Kinect, Leap Motion), and a movement toward ubiquitous computing with wearable cameras and displays (Google Glass).
This short (1.5 hour) course serves as an introduction to the key ideas and an overview of the latest work in computational cameras, displays, and light transport.
Optical Computing for Fast Light Transport Analysis, Matthew O'Toole
Optical Computing for Fast Light Transport Analysis
Matthew O'Toole and Kiriakos N. Kutulakos. ACM SIGGRAPH Asia, 2010.
We present a general framework for analyzing the transport matrix of a real-world scene at full resolution, without capturing many photos. The key idea is to use projectors and cameras to directly acquire eigenvectors and the Krylov subspace of the unknown transport matrix. To do this, we implement Krylov subspace methods partially in optics, by treating the scene as a black box subroutine that enables optical computation of arbitrary matrix-vector products. We describe two methods—optical Arnoldi to acquire a low-rank approximation of the transport matrix for relighting; and optical GMRES to invert light transport. Our experiments suggest that good-quality relighting and transport inversion are possible from a few dozen low-dynamic range photos, even for scenes with complex shadows, caustics, and other challenging lighting effects.
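The optical Arnoldi idea can be sketched numerically: below, a synthetic low-rank matrix stands in for the scene's transport matrix, and an ordinary function stands in for the project-and-photograph step. All names, sizes, and data are illustrative assumptions, not the paper's code.

```python
import numpy as np

# Toy stand-in for the optical black box: in the real system, "applying T"
# means projecting a pattern p and photographing the scene's response.
rng = np.random.default_rng(0)
U = rng.standard_normal((64, 4))
T = U @ U.T  # synthetic symmetric, rank-4 "transport matrix"

def optical_matvec(p):
    """One projector pattern in, one photo out: p -> T @ p."""
    return T @ p

def arnoldi(matvec, n, k):
    """Build an orthonormal Krylov basis Q and Hessenberg H with k matvecs."""
    Q = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    q = rng.standard_normal(n)
    Q[:, 0] = q / np.linalg.norm(q)
    for j in range(k):
        v = matvec(Q[:, j])
        for i in range(j + 1):          # orthogonalize against earlier basis
            H[i, j] = Q[:, i] @ v
            v -= H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(v)
        if H[j + 1, j] < 1e-9:          # Krylov subspace became invariant
            return Q[:, :j + 1], H[:j + 1, :j + 1]
        Q[:, j + 1] = v / H[j + 1, j]
    return Q[:, :k], H[:k, :k]

# Low-rank approximation of T from a handful of "photos" (matvecs):
Q, H = arnoldi(optical_matvec, 64, 8)
T_approx = Q @ H @ Q.T
err = np.linalg.norm(T - T_approx) / np.linalg.norm(T)
```

Because the synthetic transport matrix has rank 4, a few Arnoldi steps already capture it almost exactly, which mirrors the paper's point that few photos suffice for relighting-quality approximations.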
Evaluation of geometrical parameters of buildings from SAR images, Federico Ariu
The aim of this study is to develop a technique to support the retrieval of building heights from SAR images.
In particular, an automatic method has been developed to evaluate the orientation angle of buildings with respect to the ground projection of the radar platform's flight line.
The starting point was the output of an algorithm that identifies and extracts from SAR images the double-scattering contribution, whose intensity is linked to the geometrical and electromagnetic parameters of the buildings.
In SAR images, the double-scattering contributions appear as misaligned dots, so before evaluating the slope of the double-scattering line, a linear regression was performed on the coordinates of the pixels belonging to that line.
Given the possible presence of outliers, the absolute deviation was used instead of the standard deviation.
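The robust-fit step can be illustrated with a least-absolute-deviations line fit computed by iteratively reweighted least squares, a standard L1 technique in which outliers receive small weights. This sketch is illustrative, not the study's actual implementation.

```python
import numpy as np

def fit_line_lad(x, y, iters=50, eps=1e-8):
    """Least-absolute-deviations line fit via iteratively reweighted least
    squares: rows are scaled by 1/sqrt(|residual|), giving each point an
    effective weight of 1/|residual|, so wild outliers barely pull the line."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    A = np.column_stack([x, np.ones_like(x)])
    w = np.ones_like(y)
    for _ in range(iters):
        Aw = A * w[:, None]                       # row-scaled design matrix
        slope, intercept = np.linalg.lstsq(Aw, y * w, rcond=None)[0]
        r = y - (slope * x + intercept)           # residuals of current fit
        w = 1.0 / np.sqrt(np.abs(r) + eps)        # downweight large residuals
    return slope, intercept

# Points on y = 2x + 1 with two gross outliers, as might arise from
# misdetected double-scattering pixels:
x = np.arange(20.0)
y = 2 * x + 1
y[3] += 50
y[15] -= 40
slope, intercept = fit_line_lad(x, y)
```

An ordinary least-squares fit on the same data would be visibly tilted by the two outliers, while the L1 fit recovers the underlying line.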
A Steered-Response Power (SRP) based Framework for Sound Source Localization using Microphone Arrays in Reverberant Rooms for Enhancement of Speech Intelligibility
Primal-Dual Coding to Probe Light Transport
Matthew O'Toole, Ramesh Raskar, and Kiriakos N. Kutulakos. ACM SIGGRAPH, 2012.
Abstract:
We present primal-dual coding, a photography technique that enables direct fine-grain control over which light paths contribute to a photo. We achieve this by projecting a sequence of patterns onto the scene while the sensor is exposed to light. At the same time, a second sequence of patterns, derived from the first and applied in lockstep, modulates the light received at individual sensor pixels. We show that photography in this regime is equivalent to a matrix probing operation in which the elements of the scene's transport matrix are individually re-scaled and then mapped to the photo. This makes it possible to directly acquire photos in which specific light transport paths have been blocked, attenuated or enhanced. We show captured photos for several scenes with challenging light transport effects, including specular inter-reflections, caustics, diffuse inter-reflections and volumetric scattering. A key feature of primal-dual coding is that it operates almost exclusively in the optical domain: our results consist of directly-acquired, unprocessed RAW photos or differences between them.
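The matrix-probing equivalence can be checked numerically: for primal (projector) patterns p_i and dual (sensor-mask) patterns q_i applied in lockstep, the integrated photo equals the row sums of W elementwise-multiplied with the transport matrix T, where W = sum_i outer(q_i, p_i). The matrix sizes and random patterns below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cam, n_proj, n_pat = 16, 12, 30      # hypothetical sizes
T = rng.random((n_cam, n_proj))        # synthetic transport matrix
P = rng.random((n_pat, n_proj))        # primal (projector) patterns
Q = rng.random((n_pat, n_cam))         # dual (sensor mask) patterns

# What the sensor integrates: each projected pattern's response T @ p_i is
# modulated per-pixel by the matching mask q_i, then summed over the sequence.
photo = sum(Q[i] * (T @ P[i]) for i in range(n_pat))

# Equivalent matrix probing: elements of T are individually rescaled by W
# and summed along each camera row.
W = sum(np.outer(Q[i], P[i]) for i in range(n_pat))
probed = (W * T).sum(axis=1)

assert np.allclose(photo, probed)
```

Choosing the pattern sequences so that W is (approximately) an indicator over a set of light paths is what lets specific transport components be blocked, attenuated, or enhanced directly in the optical domain.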
Landuse Classification from Satellite Imagery using Deep Learning, DataWorks Summit
With the abundance of remote-sensing satellite imagery, a wide range of insights can be derived from it. One such use is determining whether land is used for agricultural or non-agricultural purposes.
In this talk, we'll look at combining Sentinel-2 satellite imagery with OpenStreetMap labels to classify land use as agricultural or non-agricultural.
Sentinel-2 data has a 10-meter resolution in the RGB bands and is well suited for land-use classification. Using these two datasets, many different machine learning tasks can be performed, such as segmenting images into two classes (farmland and non-farmland) or the more challenging task of identifying the crop type cultivated on each field.
We'll look at training convolutional neural networks (CNNs) built with Apache MXNet for land-use classification, covering the deep learning architectures considered for this use case along with the appropriate metrics.
We'll leverage streaming pipelines built on Apache Flink and Apache NiFi for model training and inference. Developers will come away with a better understanding of how to analyze satellite imagery and of the different deep learning architectures, along with their pros and cons, for land-use analysis. Suneel Marthi and Chris Olivier, Software Development Engineers, Amazon Web Services
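Not from the talk, but as a minimal illustration of the classic hand-crafted vegetation feature such CNN models compete with: NDVI computed from Sentinel-2 red (B4) and near-infrared (B8) reflectances. The reflectance values below are synthetic.

```python
import numpy as np

def ndvi(red, nir, eps=1e-6):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red).
    Vegetated (e.g. agricultural) pixels score high because chlorophyll
    absorbs red light while leaf structure reflects near-infrared."""
    red = np.asarray(red, float)
    nir = np.asarray(nir, float)
    return (nir - red) / (nir + red + eps)

# Tiny synthetic reflectance patches (left column vegetated, right not):
red = np.array([[0.05, 0.30], [0.06, 0.28]])   # band B4
nir = np.array([[0.45, 0.32], [0.50, 0.30]])   # band B8
mask = ndvi(red, nir) > 0.4                    # crude vegetation mask
```

A threshold on NDVI is a reasonable baseline to compare a learned classifier against.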
Corisco is a method for monocular camera orientation estimation in anthropic environments using edgels. This is my doctoral defense presentation, updated and translated to English.
3D Shape and Indirect Appearance by Structured Light Transport, Matthew O'Toole
3D Shape and Indirect Appearance by Structured Light Transport
Matthew O'Toole, John Mather, and Kiriakos N. Kutulakos. CVPR, 2014.
Abstract:
We consider the problem of deliberately manipulating the direct and indirect light flowing through a time-varying, fully-general scene in order to simplify its visual analysis. Our approach rests on a crucial link between stereo geometry and light transport: while direct light always obeys the epipolar geometry of a projector-camera pair, indirect light overwhelmingly does not. We show that it is possible to turn this observation into an imaging method that analyzes light transport in real time in the optical domain, prior to acquisition. This yields three key abilities that we demonstrate in an experimental camera prototype: (1) producing a live indirect-only video stream for any scene, regardless of geometric or photometric complexity; (2) capturing images that make existing structured-light shape recovery algorithms robust to indirect transport; and (3) turning them into one-shot methods for dynamic 3D shape capture.
SkyStitch: a Cooperative Multi-UAV-based Real-time Video Surveillance System ..., Kitsukawa Yuki
Presentation slides from when I introduced this paper in the Advanced Pattern and Video Information Processing course.
Xiangyun Meng, Wei Wang, and Ben Leong. 2015. SkyStitch: A Cooperative Multi-UAV-based Real-time Video Surveillance System with Stitching. In Proceedings of the 23rd ACM international conference on Multimedia (MM '15). ACM, New York, NY, USA, 261-270. DOI=http://dx.doi.org/10.1145/2733373.2806225
Enhanced Spreadsheet Computing with Finite-Domain Constraint Satisfaction, ijpla
The spreadsheet application is among the most widely used computing tools in modern society. It provides great usability and usefulness, and it easily enables a non-programmer to perform programming-like tasks in a visual, tabular "pen and paper" approach. However, due to its mono-directional dataflow, spreadsheets are mostly limited to bookkeeping-like applications. This paper shows how the spreadsheet computing paradigm is extended to break through this limitation for solving constraint satisfaction problems. We present an enhanced spreadsheet system where finite-domain constraint solving is well supported in a visual environment. A spreadsheet-specific constraint language is constructed for general users to specify constraints among data cells in a declarative and scalable way. The new spreadsheet system significantly simplifies the development of many constraint-based applications using a visual tabular interface. Examples are given to illustrate the usability and usefulness of the extended spreadsheet paradigm.
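As a minimal sketch of what finite-domain constraint solving over spreadsheet cells looks like conceptually: cells have finite domains, and constraints are declared among them. The cell names, domains, and brute-force solver below are invented for illustration; the paper's actual constraint language and solver differ.

```python
from itertools import product

# Hypothetical mini-model of spreadsheet cells with finite domains:
domains = {
    "A1": range(1, 10),
    "A2": range(1, 10),
    "A3": range(1, 10),
}
# Declarative constraints among cells:
constraints = [
    lambda c: c["A1"] + c["A2"] + c["A3"] == 15,  # column-sum constraint
    lambda c: c["A1"] < c["A2"] < c["A3"],        # strictly increasing
]

def solve(domains, constraints):
    """Enumerate the cross-product of cell domains; a real finite-domain
    solver would propagate and prune domains instead of brute-forcing."""
    names = list(domains)
    for values in product(*domains.values()):
        cells = dict(zip(names, values))
        if all(con(cells) for con in constraints):
            yield cells

solutions = list(solve(domains, constraints))
```

The declarative style is the point: the user states relations among cells rather than a computation order, which is exactly what the mono-directional spreadsheet dataflow cannot express.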
Presentation made by Prof. Adriano Camps (Universitat Politècnica de Catalunya) at ICMARS 2010 (India, 16-December-2010) on the MIRAS instrument aboard ESA's SMOS mission.
13. Chopping
Differential signals: fast switching of detectors between source and blank sky; analyze the difference signals. E.g. 45" switching at 4 Hz for SHARC.
Problems:
- Differencing noise (2x observing time)
- Insensitivity to certain spatial components
- Duty cycle
- Striping (imperfect sky removal)
Observing Strategies for Imaging Arrays SPIE 2008 -- Marseille
15. The Array Imaging Challenge
- High background
- Unstable detectors
- Faint signals
- Large data volumes (100 to 10,000 pixels at 10 to 100 Hz readout)
Goal: do at least as well as chopping techniques...
16. Introducing CRUSH...
Comprehensive Reduction Utility for SHARC-2 (PhD thesis, Caltech 2006).
Also used for LABOCA, SABOCA, ASZCA, ArTeMiS, PolKa, GISMO...
Offspring: sharcsolve (C. D. Dowell), BoA (F. Schuller, A. Beelen et al.)
40K lines of Java code (and growing...)
Fast (~1 GB/min on 4-core HT CPUs), with low overheads.
Future: more instruments, interferometry, other high-background applications...
http://www.submm.caltech.edu/~sharc/crush (2003 -- now)
18. The Measurement Equation

x = F b + n,  solved via the normal equations  (Aᵀ A) x = Aᵀ b

Pieces: the model parameters that we want to solve for, the noise n, the measurements, and the design matrix A, with A_ij = dF_i / σ_j. We need to know the weights, gains, and flags before we can invert for the signals.

Data model (channel c, time t; map index (x, y) set by the scanning strategy):

data = (source gain) · (source signal) + Σ_k (signal gain) · (correlated signal)_k + noise
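The normal-equations step (AᵀA) x = Aᵀ b can be sketched with synthetic data; the sizes, noise level, and unit weights below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n_data, n_par = 200, 5
A = rng.standard_normal((n_data, n_par))             # design matrix (synthetic)
x_true = rng.standard_normal(n_par)                  # model parameters
b = A @ x_true + 0.01 * rng.standard_normal(n_data)  # noisy measurements

# Normal equations: (A^T A) x = A^T b.  In production code a QR-based
# solver (np.linalg.lstsq) is better conditioned than forming A^T A.
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
```

With many more measurements than parameters, the recovered x_hat matches the true parameters to within the noise level.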
19. One term at a time...
Incremental solutions: estimate the correlated-signal increment and the gain increment from the residuals with a maximum-likelihood estimator. (Other statistical estimators can be used too.)
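One plausible form of the maximum-likelihood increment for a correlated signal is the inverse-variance-weighted projection of the residuals onto the channel gains; this sketch is my formulation for illustration, not the CRUSH code, and all data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(3)
n_ch, n_t = 50, 1000
gains = 1.0 + 0.1 * rng.standard_normal(n_ch)     # per-channel gains g_c
weights = np.full(n_ch, 1.0)                      # w_c = 1 / sigma_c^2
C_true = np.cumsum(rng.standard_normal(n_t))      # correlated (sky-like) signal
resid = np.outer(gains, C_true) + 0.1 * rng.standard_normal((n_ch, n_t))

# Maximum-likelihood estimate of the correlated signal C_t from residuals:
#   C_t = sum_c w_c g_c r_ct / sum_c w_c g_c^2
num = (weights[:, None] * gains[:, None] * resid).sum(axis=0)
den = (weights * gains ** 2).sum()
C_hat = num / den

resid -= np.outer(gains, C_hat)   # subtract the estimated mode from residuals
```

Iterating such estimates term by term (correlated modes, gains, source) is the incremental solution strategy the slide describes.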
27. LABOCA (870 um), SHARC-2 (350 um)
Typical further steps:
- Decorrelate instrumental signals
- Remove sky gradients
- Channel flagging by gain
- Flag noisy pixels
- Despiking
- Noise whitening
28. Direct maximum-likelihood vs. clipped model (>1 Jy) vs. iterated with clipped model: the Orion Molecular Cloud (OMC-1). SABOCA (350 um), optical and near-infrared; GISMO 2-mm camera.
29. Cassiopeia A (supernova remnant): 74 GHz (Kassim et al. 1995), 3.7 mm (BIMA + single dish), 1.4 GHz (VLA), 2 mm (filtered above 45"), 2 mm (GISMO with CRUSH), 3.6-8 micron (Spitzer).
33. Data Reduction Summary
- Works well (better than SVD or PCA)...
- Fast (~1 GB/min on a modern PC)
- Distributable (for cluster computing)
- Linear computing requirement
- Low overheads
- Lets the astronomer decide what's best...
38. Noise Resistance
Spectral noise locations: stationary noise (in time and in space) is characterized by its power spectrum of independent components. (Figure: projections of a spectral cube.)
47. Simulations
(Figure: simulated scan patterns for 32 x 32 and 16 x 16 pixel arrays, aiming to cover the same area at an average scanning speed of 1 pixel/frame, i.e. 1 position/frame; compares pattern size and "speed".)
48. Spectral Moments
- m0: the fraction of phase-space volume occupied by a point source observed with the pattern.
- m1: resistance against canonical 1/f noise (electronics).
- m2: resistance against 1/f² noise (atmosphere + temperature fluctuations).
- m1, m2: also large-scale sensitivity indicators...
53. Billiard Scan
a.k.a. 'PONG' and 'box-scan'. Used for SHARC-2 large-field mapping since 2003 (Borys & Dowell).
- Irrational x and y frequencies lead to non-repeating, open patterns.
- Rational x and y frequencies lead to closed patterns.
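The closed-vs-open behavior follows from the periods of the two sinusoids; a minimal sketch (frequencies in arbitrary units, this is an illustration rather than any instrument's scan generator):

```python
import math

def lissajous(fx, fy, t):
    """Position of a Lissajous ('billiard'-like) scan at time t."""
    return math.sin(2 * math.pi * fx * t), math.sin(2 * math.pi * fy * t)

# Rational frequency ratio: the pattern closes.  With fx/fy = 3/2 both
# arguments advance by whole cycles (3 and 2) over a common period T = 1,
# so the curve retraces itself exactly.
x0, y0 = lissajous(3.0, 2.0, 0.123)
x1, y1 = lissajous(3.0, 2.0, 0.123 + 1.0)
assert abs(x1 - x0) < 1e-9 and abs(y1 - y0) < 1e-9

# Irrational ratio (e.g. fx/fy = sqrt(2)): there is no common period, so
# the curve never repeats and keeps filling the field instead.
```

This is why irrational frequency ratios are preferred for mapping: the open pattern accumulates uniform, cross-linked coverage rather than burning the same closed track.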
57. Large Fields
What is the best strategy for fields larger than the FoV? All at once, or little by little?
The answer does not depend on the field size; it depends entirely on the pattern chosen!
58. Conclusions
I. Recipes for designing better patterns.
II. Rankings:
(1) Random
(2) Lissajous, Billiard, Spirals
(3) Cross-Linked OTF
III. Evaluate your own pattern at http://www.submm.caltech.edu/~sharc/scanning
66. Mapping (nearest-pixel algorithm)
Put the signal from channel c at time t into map pixel (x, y), accumulating the map-pixel increment and the map-pixel variance.
For Gaussian telescope beams, 2.5 or more pixels per FWHM are required...
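A minimal sketch of nearest-pixel mapping with inverse-variance weighting (the function name and interface are illustrative, not the CRUSH implementation):

```python
import numpy as np

def make_map(x, y, signal, weight, shape):
    """Nearest-pixel binning: each sample (channel c, time t) lands in map
    pixel (round(x), round(y)).  Samples combine with inverse-variance
    weights w, so map = sum(w*s)/sum(w) and the pixel variance is 1/sum(w)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    signal, weight = np.asarray(signal, float), np.asarray(weight, float)
    num = np.zeros(shape)
    den = np.zeros(shape)
    ix = np.clip(np.rint(x).astype(int), 0, shape[0] - 1)
    iy = np.clip(np.rint(y).astype(int), 0, shape[1] - 1)
    np.add.at(num, (ix, iy), weight * signal)   # map-pixel increment
    np.add.at(den, (ix, iy), weight)            # accumulated weight
    with np.errstate(invalid="ignore", divide="ignore"):
        m = num / den          # weighted mean per map pixel
        var = 1.0 / den        # map-pixel variance
    return m, var
```

Pixels that receive no samples come out as NaN/inf, which downstream code would flag as unobserved.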
67. Sensitivity to Large Scales
Spectral tapering (convolution theorem): S(x) ∗ P(x) ⟷ S(f) × P(f), i.e. convolving the source structure with the point-source response in the map domain multiplies (tapers) the source's spatial-frequency spectrum.
S: source structure; P: point-source spectrum.
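The convolution theorem behind the tapering can be verified directly on synthetic 1-D signals; the circular convolution below is computed by brute force so the identity is demonstrated rather than assumed.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 64
S = rng.standard_normal(N)   # source structure S(x)
P = rng.standard_normal(N)   # point-source response P(x)

# Brute-force circular convolution in the map domain:
conv = np.array([sum(S[k] * P[(n - k) % N] for k in range(N))
                 for n in range(N)])

# Convolution theorem: FFT of the convolution equals the product of FFTs,
# so a finite response P tapers the source spectrum S(f) multiplicatively.
lhs = np.fft.fft(conv)
rhs = np.fft.fft(S) * np.fft.fft(P)
assert np.allclose(lhs, rhs)
```

In mapping terms: wherever P(f) is small, structure at that spatial frequency is suppressed, which is exactly the large-scale sensitivity loss the slide refers to.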
68. What's Wrong with Staring?
Two noise regimes:
- Detector-noise limited: σ_det > σ_bg
- Heavily background limited: σ_det << σ_bg
Dark-frame calibration cost: when the dark-frame calibration time must match the on-source time, staring carries a 4x overhead (twice the total time, and the differencing doubles the noise variance); when the calibration time can be much shorter than the on-source time, the overhead is small.
Instrument classes: ground-based sub-mm cameras; space-based and airborne sub-mm and far-infrared instrumentation; optical/IR cameras.