Millions of people worldwide need glasses or contact lenses to see or read properly. We introduce a computational display technology that predistorts the presented content for an observer, so that the target image is perceived without the need for eyewear. We demonstrate a low-cost prototype that can correct myopia, hyperopia, astigmatism, and even higher-order aberrations that are difficult to correct with glasses.
Vision-correcting Displays @ SIGGRAPH 2014
1. Eyeglasses-free Display:
Towards Correcting Visual Aberrations with
Computational Light Field Displays
Fu-Chung Huang (1,+), Gordon Wetzstein (2,#), Brian A. Barsky (1), Ramesh Raskar (2)
(1) University of California, Berkeley
(2) MIT Media Lab
(+) now at Microsoft
(#) now at Stanford University
17. A timeline of vision correction:
- 934 B.C.: Nimrud lens (?)
- 9th century: reading stone
- 1284: Salvino D'Armato (eyeglasses)
- 1508: contact lens concept
- 1760: Benjamin Franklin
- 1880: August Mueller (contact lenses)
- 1983: PRK and LASIK
- now
19. Prior Work
Projector Precompensation
- Brown et al. [2006]
- Zhang and Nayar [2006]
- Oyamada et al. [2007]
- Grosse et al. [2010]
Computational Displays
- Lanman et al. [2010]
- Wetzstein et al. [2012]
- Maimone et al. [2013]
- Hirsch et al. [2014]
- Akeley et al. [2004]
Computational Vision Correction
- Alonso and Barreto [2003]
- Yellott and Yellott [2007]
- Huang et al. [2012]
- Pamplona et al. [2012]
- Ji et al. [2014]
- Huang and Barsky [2011]
50. Comparison (photographs show a conventional display next to the three approaches):

                      multilayer display      light field display       light field display
                      [Huang et al. 2012]     [Pamplona et al. 2012]    (proposed method)
Method                Inverse prefiltering    Direct ray tracing        Prefiltered light field
Spatial resolution    Very High               Very Low                  High
Image contrast        Very Low                Full (100%)               High
Building cost         High                    Very High                 Very Low
51. Shortcomings
• Contrast and brightness loss
– Content-dependent
• Resolution loss
– 3-to-1 (Droid DNA), 5-to-1 (iPhone)
– about 150 PPI
• Computation
– GPU, Mobile
• Calibration
– Eye-tracking
– Off-axis optimization
52. Future Work
• Higher Resolution & Large Display
– e.g. tensor displays
• Multi-way correction
• Other applications
– AR/VR, 3D, Cryptography
• Theoretical analysis
– Higher order aberrations
58. [Diagram: light field of a conventional display (in-focus), shown in the spatial domain (x, u) and the frequency domain (ω_x, ω_u); there are no angular variations, only spatial energy.]
59. [Diagram: the same display seen through the eye's pupil. In the spatial domain, the pupil function is a rect() in u, applied by multiplication (just spreading); in the frequency domain, the pupil response is a sinc() in ω_u, applied by convolution.]
Imagine you are near-sighted: when the display is shown outside your focal range, everything appears blurred.
We built a parallax-barrier light field display using a printed pinhole mask on an iPod touch.
The construction is quite thin, just a few millimeters,
and we use computation to correct the vision problem!
This does not require the observer to wear eyeglasses.
We did not change the optics of the eye or of the display;
once the display has been built, all the correction is done through computation!
So let me introduce where you can use this kind of technology…
So, this is a technology that has a very broad impact…..
But let me be more specific about who can benefit from it.
It is estimated that about 25% of people in the US are far-sighted.
And since the ability to accommodate decreases over time,
at the age of 40 the share of people with presbyopia is about 43%,
and the number increases to almost 70% by the age of 80.
This is an inevitable aging process that everyone has to face:
eventually we all need a pair of reading glasses.
In the meantime, a recent study shows that myopia in the US has increased to 41%,
which is also high,
but the number in some Asian countries is approaching a staggering 60%~90%.
Although these conditions can be corrected with eyeglasses,
they are not always convenient. (click)
And there are also people with higher-order aberrations (HOA), for whom the perceived blur is irregular
and very difficult to correct.
So maybe we need a new solution,
Before that, let me briefly review what options people have so far.
The earliest form of correction is the reading stone, or just a magnifier.
(click) But the first real revolution was eyeglasses.
(click) The second category is contact lenses, though it took a long time for them to become what they are today.
(click) Finally, the third category is refractive surgery, which has become quite popular these days.
In this paper, we introduce the fourth option, which is computation-based correction.
(click) This eyeglasses-free vision-correcting display
modifies the content shown on the screen according to your prescription.
(click) When paired with standard eye-tracking technology,
the display can generate a sharp percept for the user's eye.
This capability is unlocked by solving a deconvolution problem in the light field domain.
So let me first introduce some prior work that leads to this idea
The idea of pre-correction is not new.
Early work pre-compensates for a defocused projector, such that the projected image becomes sharp.
(click) We are also inspired by the concept of computational display,
where modifying the optical components, together with the added computation, gives us more degrees of freedom.
(click) Finally, correcting vision through computation has existed for quite some time,
but modern approaches using computational display can give better results.
So let me briefly describe how to build a display that can potentially correct your vision
Let’s consider a 1D scan line of the watch face, its diagram is shown on the right
We know the blur can be modeled as a convolution;
(click) in the frequency domain, the operation becomes a pointwise multiplication of the spatial frequency spectra.
A simple idea of prefiltering is to invert the kernel in the frequency domain,
multiply with the signal spectrum,
(click) and then transform it back to the spatial domain.
This is a very simple idea, and can be implemented in just 3 lines of code
The result looks very strange.
But the ringing artifacts will be canceled out by a final stage of the blur, which happens inside the eye.
Unfortunately the perceived image has very low contrast.
This is because inverting the kernel, or deconvolution, is a well-known ill-posed problem.
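As a minimal sketch of this idea, here is a 1D inverse prefilter in NumPy. The scan line, the 9-pixel box kernel, and the small regularizer eps are all illustrative choices of mine, not the paper's; without eps the pure inverse blows up near the kernel's spectral nulls, which is exactly the ill-posedness mentioned above.

```python
import numpy as np

# 1D "scan line" of the target image and a defocus kernel (box blur).
n = 256
signal = np.zeros(n)
signal[96:160] = 1.0                         # bright band on the scan line
kernel = np.zeros(n)
kernel[:9] = 1.0 / 9.0                       # 9-pixel box blur
kernel = np.roll(kernel, -4)                 # center it (zero phase)

# Naive inverse prefiltering, with a small Wiener-style regularizer
# (eps), because the pure inverse explodes near the kernel's nulls.
K = np.fft.fft(kernel)
eps = 1e-2
prefiltered = np.real(np.fft.ifft(np.fft.fft(signal) * np.conj(K)
                                  / (np.abs(K) ** 2 + eps)))

# The final blur inside the eye then cancels most of the ringing:
perceived = np.real(np.fft.ifft(np.fft.fft(prefiltered) * K))
blurred = np.real(np.fft.ifft(np.fft.fft(signal) * K))  # no prefiltering
```

The perceived result is much closer to the target than the plainly blurred one, but the frequencies near the kernel's nulls are attenuated, which is the contrast loss discussed in the talk.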
So how can we solve the problem?
Our prior work uses a multilayer display,
(click)such that the point spread functions at different layers have different sizes
In the frequency domain, they give different response, thus we have more degrees of freedom.
(click) Here are photographs comparing with a conventional display.
The multilayer display generates a higher-contrast and sharper image.
But the contrast is still not perfect, so let’s look at another approach.
Pamplona et al. constructed a high-angular-resolution light field display.
(click) When shown to a patient with blurred vision, the display can generate a sharp percept.
(click) In their solution, all the light rays hitting the same retinal cell
are assigned the same value.
(click) But since they require all 49 views to enter the eye,
the hardware construction can be very challenging, and the spatial resolution is low.
So there must be some way to have both high contrast and high resolution image.
And actually it’s quite simple:
We use a light field display and inversely prefilter the 4D space,
and you can see that the prefiltered images inside each view have amplified frequencies.
So let me briefly describe how that is done.
Let's say you have a simple setup where the rays leaving the display are focused on the retina.
To obtain the perceived image I,
(click) we have to consider the retinal light field L, where the rays are expressed using the spatial location x and the angular direction u.
(click) A retinal image pixel is obtained by integrating rays along the angular direction.
Finally, since some rays are blocked by the pupil, we multiply by a binary aperture function A
that limits the integration.
This gives us the basic projection equation.
Now let's consider the case where the display is out of focus:
(click) since the rays converge at a plane in front of the retina,
the retinal light field is now sheared.
Integrating along the angular direction turns the formulation into a convolution:
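As a hedged sketch of why this holds (my notation, not the paper's exact derivation): let l be the displayed scan line, diffuse so that it has no angular variation, and let s denote the shear induced by the defocus. Substituting t = x + s u in the projection integral gives

```latex
I(x) = \int A(u)\, l(x + s\,u)\,\mathrm{d}u
     = \frac{1}{|s|}\int A\!\left(\frac{t - x}{s}\right) l(t)\,\mathrm{d}t
     = (l \ast k)(x),
\qquad k(v) = \frac{1}{|s|}\, A\!\left(-\frac{v}{s}\right)
```

so the blur kernel k is just the pupil aperture rescaled by the amount of defocus.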
There is some interesting frequency-domain analysis; please refer to the paper.
But basically we know that the inversion in the frequency domain is ill-posed,
since we don't have enough degrees of freedom.
So using a light field display allows us to control the angular variation, thus we have more flexibility.
(click) so let’s go back to look again at the projection equation
(click) Since we know the exact geometry, we can always express the integral using the display light field
So the question is: how can we solve for the display light field such that the eye will see a sharp target image?
Well,
the first step is to discretize the integral equation
into a matrix-vector multiplication.
(click) The projection matrix P can be obtained by sampling the light transport.
And P times the light field gives the target image
A very simple way to solve for the prefiltered display light field is to move the projection matrix to the other side of the equation, i.e., invert it.
And that's it.
The solution is almost just that simple.
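A toy version of this step in NumPy; the sparse P below (2 rays per retinal pixel, with an arbitrary shear offset) is a stand-in I made up, whereas in the paper P comes from sampling the actual light transport:

```python
import numpy as np

# Toy projection matrix P: each retinal pixel integrates 2 light
# field rays, the second one displaced by the defocus shear.
n_pix, n_rays = 64, 128                 # retinal pixels, light field rays
rng = np.random.default_rng(0)
P = np.zeros((n_pix, n_rays))
for i in range(n_pix):
    P[i, (2 * i) % n_rays] = 0.5        # first ray hitting pixel i
    P[i, (2 * i + 3) % n_rays] = 0.5    # second ray (sheared offset)

target = rng.random(n_pix)              # desired retinal image

# Solve P @ l = target for the prefiltered display light field l
# (plain least squares stands in for the constrained solver).
l, *_ = np.linalg.lstsq(P, target, rcond=None)

retinal = P @ l                         # what the eye integrates
```

Because this toy P has full row rank, the least-squares solve reproduces the target exactly; the real system additionally constrains the light field to displayable (non-negative) values.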
But we might ask ourselves: is there any condition that allows us to solve this problem in such a simple way?
Remember that we said
using the light field display gives more degree of freedoms
So does that mean we can solve it with just two rays per pixel?
Or do we need 3 rays?
Or maybe 4?
To answer these questions,
(click) we can look at the properties of the projection matrix P.
Since we want to invert the matrix, it is a good idea to look at its condition number.
Changing the number of rays entering the pupil and varying the blur size in screen pixels,
we found that the decrease of the condition number slows down
(click) once the angular sampling rate is higher than 2,
so we decided that 2 rays per pixel is good enough to implement on a regular display.
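The diagnostic itself is easy to reproduce on a toy model. Everything below (the circulant construction, the sizes, the way the blur is split across views) is my own illustrative stand-in for the paper's sampled light transport; only the idea of sweeping the angular sampling rate and inspecting cond(P) comes from the talk:

```python
import numpy as np

def toy_projection(n, blur_px, rays_per_pixel):
    """Toy circulant projection matrix for a 1D display: the total
    retinal blur of blur_px pixels is split across rays_per_pixel
    angular views, each view contributing a narrower, staggered box."""
    w = max(blur_px // rays_per_pixel, 1)     # per-view footprint
    blocks = []
    for a in range(rays_per_pixel):
        C = np.zeros((n, n))
        for i in range(n):
            for k in range(w):
                C[i, (i + a * w + k) % n] += 1.0 / (rays_per_pixel * w)
        blocks.append(C)
    return np.hstack(blocks)                  # one block per angular view

# Condition number of P as the angular sampling rate grows.
for rays in (1, 2, 3, 4):
    P = toy_projection(n=25, blur_px=12, rays_per_pixel=rays)
    print(rays, "rays/pixel -> cond(P) =", np.linalg.cond(P))
```

The printed numbers are only illustrative; the paper's plot of cond(P) over rays per pixel and blur size is what motivates the 2-rays-per-pixel choice.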
This number gives us a guideline for the hardware construction,
so let me show you the experiments and results. (click)
We simulate a hyperopic eye using a camera.
On the right is the direct comparison of a battery with our correction.
And here is the video I showed you earlier.(click)
Here we compare with the prior light field solution by pamplona et al.
On the same low resolution hardware, we can still manage to create a sharp image.
The hardware construction is quite straightforward, and we will show how to build it in a few seconds
First we show a light field image,
When the display is out of focus, everything is blurred.
Putting a pinhole array mask on top, a sharp image is revealed after some alignment.
We also evaluate the standard Michelson contrast.
The multilayer prefiltering gives sharp correction, but contrast is very low.
(click) On the other hand, Pamplona et al. have full contrast,
but on the same low-resolution prototype their results are still quite blurry.
(click) Our method gives a good balance between sharpness and contrast.
(click) We also test with the perceptual metric HDR-VDP2,
and our method has a low probability of detected differences.
Finally, we also show that it is possible to correct higher order aberrations.
You can see the spherical aberration is quite different from the defocus term,
and both are difficult to correct.
(click) The coma is even more difficult, with a cone-shaped PSF.
(click) Finally, not all higher-order terms are difficult;
for example, trefoil has 3 lobes and preserves more high-frequency information,
making the prefiltering easier.
Our method can also account for slight deviations in the lateral and axial position of the eye.
This is done by stacking multiple projection matrices into the linear system
and solving an over-constrained optimization.
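The stacking step can be sketched in a few lines of NumPy. The random matrices below are hypothetical placeholders for the per-eye-position projection matrices, which in practice come from resampling the light transport:

```python
import numpy as np

rng = np.random.default_rng(1)
n_pix, n_rays = 32, 64
target = rng.random(n_pix)

# Hypothetical projection matrices for a few nearby eye positions.
P_list = [rng.random((n_pix, n_rays)) for _ in range(3)]

# Stack them so one display light field must serve every position,
# then solve the over-constrained system in the least-squares sense.
A = np.vstack(P_list)                      # (3 * n_pix) x n_rays
b = np.concatenate([target] * len(P_list))
l, *_ = np.linalg.lstsq(A, b, rcond=None)

# Per-position error of the single compromise light field.
errors = [np.linalg.norm(P @ l - target) for P in P_list]
```

No single light field satisfies all positions exactly, so the solution trades a little accuracy at each position for robustness to movement.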
On the top right is the prefiltered light field with over-constrained lateral movement.
Since the light field display also has repetitive viewing zones,
it can be viewed with two eyes separated by 65 mm.
Finally, to summarize the features and problems of the different approaches:
the first, the multilayer display, deconvolves the optical blur of the eye;
its resolution is higher, but the contrast is low,
and building a multilayer display is non-trivial.
(click) Pamplona et al. require a super-high-resolution display,
so its spatial resolution is quite low, even though it provides full image contrast;
building such a high-resolution display is extremely difficult and expensive.
(click) The last technology prefilters the 4D light field, so it has both higher resolution and higher contrast, and the cost is just a few dollars.
But of course, we still have some problems.
First, the prefiltering is content-dependent: if the image has a lot of high frequencies, the contrast and brightness will be lower.
(click) Second, our parallax barrier implementation introduces some resolution loss, and the highest resolution we built is about 150 PPI.
(click) Processing the light field requires a lot of computation, but we think it shouldn't be too hard to port to the GPU.
(click) Finally, the vision correction requires a good calibration between the eye and the display.
Fortunately, Amazon just announced a very cool Fire Phone and SDK to do that,
and in the paper we also implement an off-axis optimization to deal with the problem.
However, there is still future work to be done.
First, we anticipate higher resolution on a larger display, and we believe this could be achieved using the tensor display architecture.
(click) It would also be interesting to provide multi-way correction for a shared display.
(click) Extending the prefiltering to other applications like AR/VR, 3D content, or even cryptography could be very interesting.
(click) Finally, we believe a more thorough theoretical analysis is needed for higher-order aberrations.
But even though I just showed you how to solve for the light field with a simple equation,
the frequency analysis is actually more interesting than the equation itself.
Let's first look at the frequency-domain light field:
for an in-focus diffuse object, its spectrum is simply a line.
But remember there is a pupil aperture blocking the light:
in the spatial domain this is a multiplication with the pupil function,
(click) and in the frequency domain it is a convolution with the pupil response, which is a sinc function.
Since the original light field spectrum is only a line,
(click) the convolution just spreads the line vertically.
The final step to obtain the image is the integration of light rays over the angular direction.
If you are familiar with computed tomography, the integration is a slicing in the frequency domain,
so the target image spectrum is the slice along the spatial frequency axis.
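In symbols, and up to scale factors, this is the Fourier slice theorem applied to the aperture-weighted light field (a hedged sketch in the notation used earlier):

```latex
I(x) = \int A(u)\, L(x,u)\,\mathrm{d}u
\quad\Longrightarrow\quad
\hat{I}(\omega_x) = \widehat{A L}\,(\omega_x,\, \omega_u = 0)
```

i.e. the image spectrum is the slice of the aperture-convolved light field spectrum along the axis ω_u = 0.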
For an out-of-focus diffuse object, its representation is sheared,
and so is its frequency spectrum.
If you just slice along the axis, there will be nothing left except the DC term.
But the spreading due to the pupil aperture delivers some information to the slicing axis.
Unfortunately, not all spatial frequencies are preserved, since the spreading comes from a sinc function.
In the multilayer prefiltering case, the missing frequencies are covered by the other layer, so all spatial frequencies can still be preserved.
Finally, the light field display is a bit different:
you get a full convolution with the entire frequency spectrum, but you also have much more flexibility.