This document provides an overview of various retinal imaging techniques and applications of deep learning to retinal image analysis. It discusses fundamentals of eye anatomy and image formation, as well as common retinal diseases. Imaging modalities covered include fundus photography, optical coherence tomography (OCT), and multispectral imaging. The document also explores nonlinear optical properties of the eye and applications of techniques like third harmonic generation microscopy. Finally, it touches on topics relevant to deep learning like data sources, annotation, data augmentation, network architectures, and hardware optimization. The goal is to introduce key concepts from different disciplines to facilitate communication across fields involved in retinal image analysis.
Shallow introduction for Deep Learning Retinal Image Analysis
1. Table of Contents
Imaging Techniques & Eye
Image Quality
AI-enhanced Retinal Imaging
Data Engineering
Retinal Data Sources
Labels and Required Data Quantity
Active Learning
Data Pre-processing
Data Augmentation
AI Frameworks
CNN Architectures
CNN Components
CNN: Domain-specific issues
Sparse ConvNets
Compressed Sensing
Feature Extraction & Understanding
Transfer Learning
Network optimization & Hardware
Retinal image analysis: Shallow introduction for deep learning
Petteri Teikari, PhD
http://petteri-teikari.com/
version Thu 6 July 2017
2. Introduction
●
Purpose is to introduce 'something about everything' to make
communication easier between people from different disciplines.
– Namely, to help biologists, clinicians, engineers, data scientists,
statisticians and physicists understand each other in this practically
multidisciplinary problem, rather than keeping them all in their own silos.
●
Strive for a 'systems engineering' solution where the whole pipeline is
intelligent rather than just the individual components.
●
The presentation itself is quite dense, and better suited to being read on a
tablet/desktop than projected as a slideshow.
technologyreview.com, August 23, 2016 by Olga Russakovsky
We discuss recent progress and future directions for imaging in behaving mammals from a systems engineering perspective, which seeks holistic consideration
of fluorescent indicators, optical instrumentation, and analyses. http://dx.doi.org/10.1016/j.neuron.2015.03.055
6. Image-forming characteristics of the eye: Purkinje images
Pablo Artal: “A light source illuminating the eye generates specular reflections at the different ocular interfaces (air-cornea,
cornea-aqueous, aqueous-crystalline lens and lens-vitreous) that are commonly named Purkinje images (PI, PII, PIII and
PIV) after the Czech physiologist Jan Purkinje, who made use of them in the 19th century. In the early times of Physiological Optics
these reflections were the primary source used to obtain information on the ocular structures.”
http://typecast.qwriting.qc.cuny.edu/2012/05/21/purkinje-images/ pabloartal.blogspot.co.uk/2009/02
http://dx.doi.org/10.1364/OE.14.010945
Cornsweet, T. N. , and H. D. Crane. “Accurate two-dimensional eye
tracker using first and fourth Purkinje images.” Journal of the Optical
Society of America 63, no. 8 (1973): 921-928.
http://dx.doi.org/10.1364/JOSA.63.000921
http://dx.doi.org/10.1007/978-94-011-5698-1_36
http://dx.doi.org/10.1177/0748730409360888
Multispectral Fundus camera
Lens Absorption Monitor (LAM)
based on Purkinje images
Note that the PIV image is an
inverted version of all the other
reflections and can be relatively easily
identified automatically using computer
vision techniques.
The difference between the PIII and PIV
images can be used to quantify
misalignment of the ocular surfaces, which
is useful for example after implantation of
intraocular lenses (IOL) in cataract
surgery.
It can also be used to measure the crystalline lens
absorbance of the in vivo human eye (e.g.
already done by Said and Weale in 1959,
Gerontologia 1959;3:213–231, doi:
10.1159/000210900). In practice, the
higher the dynamic range of the camera, the
better.
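Since PIV is simply a flipped copy of the other reflections, its automatic identification can be sketched by correlating candidate reflection patches against a flipped reference. The following is a minimal, hypothetical NumPy sketch (not any published pipeline); the function name, toy template, and noise levels are all invented for illustration:

```python
import numpy as np

def detect_inverted_reflection(patches, reference):
    """Return the index of the patch that best matches the *flipped* reference,
    i.e. the candidate PIV reflection (inverted relative to PI-PIII)."""
    flipped = reference[::-1, ::-1]  # PIV is a 180-degree flipped image
    def ncc(a, b):
        # normalized cross-correlation between two equal-sized patches
        a = (a - a.mean()) / (a.std() + 1e-9)
        b = (b - b.mean()) / (b.std() + 1e-9)
        return (a * b).mean()
    scores = [ncc(p, flipped) for p in patches]
    return int(np.argmax(scores))

# toy example: three noisy copies of an asymmetric spot, plus one flipped copy
rng = np.random.default_rng(0)
ref = np.zeros((8, 8)); ref[1:4, 1:6] = 1.0; ref[5, 2] = 2.0
patches = [ref + 0.01 * rng.standard_normal(ref.shape) for _ in range(3)]
patches.append(ref[::-1, ::-1] + 0.01 * rng.standard_normal(ref.shape))
print(detect_inverted_reflection(patches, ref))  # → 3 (the flipped patch)
```

A real implementation would first localize the specular highlights (e.g. by thresholding and blob detection) before applying such a matching step.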
7. Spectral characteristics of the eye
The eye is composed of several layers,
each different in structure, absorption and
scattering properties. faculty.cua.edu
Teikari thesis (2012)
Enezi et al. 2011.
Stockman and Sharpe (2000), CVRL
Govardovskii et al. 2000
van de Kraats and van Norren 2007
Walraven 2003 CIE Report
Styles et al. (2005)
The Annidis RHA™ system combines advanced multispectral imaging (MSI) technology with multi-image
software processing for early detection of ocular pathologies such as age related macular degeneration,
diabetic retinopathy and glaucoma. http://www.annidis.com/page/technology
9. Nonlinear Optical Susceptibility of the Eye #1
Multimodal nonlinear imaging of intact excised human corneas.
Adapted from IOVS 2010 and Opt Express 2010.
portail.polytechnique.edu
Nonlinear microscopies have the unique ability to provide micrometer-scale 3D
images from within complex, scattering samples like biological tissues.
In particular, third-harmonic generation (THG) microscopy detects interfaces and
optical heterogeneities and provides 3D structural images of unstained biological
samples. This information can be combined with other nonlinear signals such as
two photon microscopy (2-PM) and second harmonic generation (SHG).
Since THG is a coherent process, signal generation must be properly analyzed in
order to interpret the images. We study the contrast mechanisms in THG
microscopy (phase matching, nonlinear susceptibilities), and we develop novel
applications such as :
●
imaging morphogenesis in small animal models (zebrafish, drosophila),
●
imaging lipids in cells and tissues,
●
polarization-resolved THG analysis of organized media: human cornea, skin
lipids, etc.
Jablonski diagrams showing linear vs. non-linear fluorescence. In linear single-photon
excitation, the absorption of short wavelength photons results in a longer wavelength
fluorescence emission. In non-linear two-photon excitation (2PE), the absorption of two long
wavelength photons results in a shorter wavelength fluorescence emission. The techniques
of second and third harmonic generation microscopy (SHG and THG,
respectively) elicit a non-linear optical (NLO) response in molecules that lack a center of
symmetry. When multiple longwave photons are simultaneously absorbed by these
molecules, photons that are ½ or ⅓ of the original wavelength are emitted.
alluxa.com/learning-center
http://dx.doi.org/10.1364/AOP.3.000205
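The ½ and ⅓ relations above are just the excitation wavelength divided by the harmonic order; a trivial helper (the function name is ours) makes the arithmetic explicit:

```python
def harmonic_wavelength_nm(excitation_nm, order):
    """Emission wavelength of an n-th harmonic: n photons combine coherently,
    so the emitted wavelength is the excitation wavelength divided by n."""
    return excitation_nm / order

# e.g. a 1200 nm femtosecond source, as often used for deep-tissue imaging:
print(harmonic_wavelength_nm(1200, 2))  # SHG -> 600.0 nm
print(harmonic_wavelength_nm(1200, 3))  # THG -> 400.0 nm
```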
10. Nonlinear Optical Susceptibility of the Eye #2
https://dx.doi.org/10.1167%2Fiovs.15-16783
http://dx.doi.org/10.1117/1.3183805
http://dx.doi.org/10.1002/lpor.200910024
http://www.molvis.org/molvis/v21/538
Understanding the mechanical behavior of the Optic Nerve Head (ONH) is
important for understanding the pathophysiology of glaucoma. We have
developed an inflation test that uses second harmonic generation (SHG)
imaging and digital volume correlation (DVC) to measure the deformation
response of the lamina cribrosa, the connective tissue structure of the ONH,
to controlled pressurization. Human eyes were obtained from a tissue bank.
http://dx.doi.org/10.1007/978-3-319-21455-9_2
We are also using two-photon and second harmonic generation with confocal
imaging to investigate the extracellular attachments between the trabecular
meshwork and Schlemm's canal. These recent studies show a decrease in elastin
near the base of Schlemm's canal in glaucoma eyes, which may affect the mechano-
sensitive environment and disrupt outflow. In conclusion, we are utilizing multiple
imaging modalities to answer questions regarding fluid flow patterns, local and
global relationships within the eye, and morphological changes that occur in
glaucoma.
journals.cambridge.org
http://dx.doi.org/10.1098/rsif.2015.0066
(a) A typical PFO/DOFA map of a human ONH. (b) A typical SHG image of the
same ONH. PFO/DOFA maps were overlaid and aligned with SHG images to
allow identification of the scleral canal margin in PFO/DOFA maps. (c) The ONH
was subdivided into: the LC, terminating at the edge of the scleral canal; an
insertion region, defined as an annular ring extending 150 µm from the scleral
canal margin; and a peripapillary scleral region, defined as an annular ring
extending from 150 to 1000 µm from the scleral canal margin. (d) The LC was
subdivided into 12 regions for analysis. S, superior; N, nasal; I, inferior; T,
temporal.
http://dx.doi.org/10.1097/ICO.0000000000000015
http://dx.doi.org/10.1117/12.2077569
13. IMAGING TECHNIQUES 2D Fundus photography: Old school imaging
http://dx.doi.org/10.5772/58314
A fundus image is like any other digital image, and is
degraded by a combination of fixed-pattern,
random, and banding noise.
http://www.cambridgeincolour.com/tutorials/image-noise.htm
Diffuse illumination
Direct illumination, optic section technique
Direct illumination, parallelepiped technique
A. Fundus retro illumination
Indirect illumination, scleral scatter technique
Indirect illumination
Conical beam illumination
B. Iris retro illumination
14. Fundus “Components”
Almazroa et al. (2015). doi:10.1155/2015/180972
Hoover and Goldbaum (2003), + STARE
doi:10.1109/TMI.2003.815900
Abdullah et al. (2016)
https://doi.org/10.7717/peerj.2003
Disc + Macula
Girard et al. (2016)
15. Additional features
Annunziata et al. (2015)
In certain pathologies, the treatment itself may
leave features that make automatic analysis of disease
progression more difficult
http://www.coatswortheyeclinic.co.uk/photography/3102576
Vascular enhancement with fluorescein dye
(angiography, emedicine.medscape.com/article/1223882-workup)
16. OCT Optical coherence tomography
http://dx.doi.org/10.5772/58314
http://dx.doi.org/10.5772/58314
Three methods that use low coherence interferometry to acquire
high resolution depth information from the retina. (A) Time
domain OCT. (B) Spectral or Fourier domain OCT. (C) Swept
source OCT. Williams (2011)
http://www.slideshare.net/DrPRATIK189/oct-62435607
by Pratik Gandhi
18. OCT Scan procedures
OCT scan settings used for simulation of the repeatability of different thickness
estimates. Oberwahrenbrock et al. (2015)
https://www.youtube.com/watch?v=_4U3QTrDupE
https://www.youtube.com/watch?v=KKqy8mSFSC0
19. OCT Image model
OCT images are corrupted by multiplicative speckle noise: “The vast majority of surfaces, synthetic or natural, are extremely
rough on the scale of the wavelength. Images obtained from these surfaces by coherent imaging systems such as laser, SAR, and
ultrasound suffer from a common phenomenon called speckle.” wikipedia.org
Results of applying different speckle compensation methods
on the human retina imagery. Cameron et al. (2013)
Comparison of method by Bian et al. (2013) with the other four popular
methods. Input: 8 frames of the pig eye data. (a) is the original image in log
transformed space, while (b) is the averaged image of 455 registered frames. (c)
is the averaged image of the input 8 frames, and (d)-(g) are the recovered
results of four popular methods. The result of our method is shown in (h). The
two clipped patches on the right of each subfigure are closeups of the regions
of interest.
Fourier-domain optical coherence
tomography (FD-OCT) image of
optical nerve head, before (A) and
after (B) curvelet coefficients
shrinkage-based speckle noise
reduction Jian et al. (2009)
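Since speckle is multiplicative, a log transform turns it into additive noise, and averaging N registered frames (as in panels (b) and (c) above) then shrinks its standard deviation by roughly √N. Below is a small self-contained simulation; the exponential intensity statistics are the standard fully-developed-speckle assumption, not taken from the cited papers:

```python
import numpy as np

rng = np.random.default_rng(1)
truth = np.full((64, 64), 100.0)  # idealized constant reflectivity
n_frames = 8
# multiplicative speckle: per-pixel exponential-distributed intensity factor
frames = [truth * rng.exponential(1.0, truth.shape) for _ in range(n_frames)]

single = np.log(frames[0])                               # one noisy frame
log_avg = np.mean([np.log(f) for f in frames], axis=0)   # average in log space

# averaging N independent frames shrinks the noise std by ~sqrt(N)
print(single.std() / log_avg.std())  # ~ sqrt(8) ≈ 2.8
```

This is why averaging 8 or 455 registered frames, as in the Bian et al. comparison above, progressively cleans up the image, at the cost of acquisition time and registration accuracy.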
20. OCT Image model #2
https://www.aop.org.uk/ot/science-and-vision/technology/2017/06/27/breakthroug
h-could-make-oct-images-clearer
http://dx.doi.org/10.1038/ncomms15845
(a) Introducing local phase shifts between scatterers in the same voxel changes the intensity of the
resulting speckle noise, enabling one to reduce speckle noise via averaging many different phase shifts.
This leads to the detection of scatterers otherwise hidden by the speckle noise. (b) Implementation of
SM-OCT on the high-resolution OCT system. DC, dispersion compensation; BS, beam splitter; L1, lens
of the conventional OCT; L2, lenses added to create a 4f imaging system; f1, focal length of L1; f2,
focal length of L2; n, refractive index of the diffuser; λ, the centre wavelength of the light source.
Optotune uses electroactive polymers (EAPs) as an electrostatic
actuator for its series of laser speckle reducers. These so-called
"artificial muscles" can undergo a large amount of deformation while
sustaining large forces. While today’s piezoelectric actuators only
deform by a fraction of a percent, EAPs can exhibit a strain of up to
380%. Optotune has specialized in the use of electroactive polymers as
an actuator in optical components such as laser speckle reducers,
tunable diffraction gratings or tunable phase retarders. The DEAP
principle can also be used as a sensor or even as a generator.
http://www.optotune.com/index.php/products/laser-speckle-reducers
Liba et al. (2017)
essentially introduced
a low-tech version of
the Optotune speckle
reducer.
A new technique used in the study, speckle-modulating OCT (SM-OCT), was found
to clarify and reveal structures that were previously undetectable. Lead author, Orly
Liba, told OT that removing speckle noise allowed clinicians to see the structure of
the tissue much better.
“We were able to see the inner stromal structure of a live mouse cornea and see improved
definition of the mouse retinal layers,” Ms Liba explained.
“We also tested SM-OCT on people and saw reduced speckle noise,” she added.
The research has the potential to give clinicians the tools for improved diagnosis of
eye conditions and better follow-up treatments, Ms Liba elaborated.
21. OCT “Components”
Kraus et al. (2014)
(a, b) Final segmentation on the original image. (c) Definition of eleven retinal surfaces (surfaces 1 – 11), ILM = internal limiting
membrane, NFL = nerve fiber layer, GCL = ganglion cell layer, IPL = inner plexiform layer, INL = inner nuclear layer, OPL = outer
plexiform layer, ONL = outer nuclear layer, ISP-TI = Inner segment of photoreceptors, transition to outer part of inner segment, ISP-
TO = Inner segment of photoreceptors, start of transition to outer segment, RPE = retinal pigment epithelium
Kafieh et al. (2013)
Automated segmentation of 7 retinal layers. NFL: Nerve Fiber Layer, GCL + IPL: Ganglion Cell
Layer + Inner Plexiform Layer, INL: Inner Nuclear Layer, OPL: Outer Plexiform Layer, ONL: Outer
Nuclear Layer, OS: Outer Segments, RPE: Retinal Pigment Epithelium.
Hendargo et al. (2013)
22.
23. Scanning laser ophthalmoscope SLO
http://dx.doi.org/10.5772/5831
https://en.wikipedia.org/wiki/Scanning_laser_ophthalmoscopy
Image from a patient with autosomal dominant RP. The background is an
infra-red SLO image from the Heidelberg Spectralis. The line indicates the
location of the SD-OCT scan, which goes through fixation. The SD-OCT scan
shows that photoreceptors are preserved in the central macula. A reduced-
scale AOSLO montage is aligned and superimposed on the background
image. The insets are full-scale sections of the AOSLO montage at two
locations indicated by the black squares. Godara et al. (2010)
24. What modality is the best for diagnosis?
From “old school methods”
(visual field, fundus photograph
and SD-OCT), the SD-OCT
seems to offer clearly the best
diagnostic capability
Results:” Among the four specialists, the inter-observer
agreement across the three diagnostic tests was poor for VF
and photos, with kappa (κ) values of 0.13 and 0.16,
respectively, and moderate for OCT, with κ value of 0.40.
Using panel consensus as reference standard, OCT had the
highest discriminative ability, with an area under the curve
(AUC) of 0.99 (95% CI 0.96–1.0) compared to photograph AUC
0.85 (95% CI 0.73–0.96) and VF AUC 0.86 (95% CI 0.76–0.96),
suggestive of closer performance to that of a group of
glaucoma specialists.” Blumberg et al. (2016)
For the analysis of the
performance of each test
modality, the scores from each
rater were summed to a
composite, ordinal measure.
Curves and AUC values are shown
for each diagnostic test summed
across all specialists. AUC for VF
was 0.86, for photo was 0.85, and
for OCT was 0.99. The blue line
corresponds to VF, red to photos,
and green to OCT, and the
straight line to the reference
standard
Optic disc photograph, visual
field, and SD-OCT for
representative patient A.
http://dx.doi.org/10.1167/iovs.15-18931
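The κ values quoted above can be reproduced from raw labels in a few lines; this is a generic Cohen's kappa sketch on invented toy gradings, not the Blumberg et al. data:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    ca, cb = Counter(rater_a), Counter(rater_b)
    # chance agreement: product of each rater's marginal label frequencies
    p_chance = sum(ca[l] / n * cb[l] / n for l in set(ca) | set(cb))
    return (p_obs - p_chance) / (1 - p_chance)

# toy example: two graders labelling 10 eyes as glaucoma (1) / normal (0)
a = [1, 1, 0, 0, 1, 0, 0, 1, 0, 0]
b = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
print(round(cohens_kappa(a, b), 2))  # → 0.58, i.e. moderate agreement
```

By convention, κ below about 0.2 is "poor" and 0.4 to 0.6 "moderate", which is how the VF/photo values of 0.13 and 0.16 versus the OCT value of 0.40 should be read.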
25. Future of OCT and retinal biomarkers
●
From Schmidt-Erfurth et al. (2016): “The therapeutic efficacy of VEGF inhibition in combination with the potential of
OCT-based quantitative biomarkers to guide individualized treatment may shift the medical need from CNV treatment towards other and/or
additional treatment modalities. Future therapeutic approaches will likely focus on early and/or disease-modifying interventions aiming to
protect the functional and structural integrity of the morphologic complex that is primarily affected in AMD, i.e. the choriocapillary - RPE –
photoreceptor unit. Obviously, new biomarkers tailored towards early detection of the specific changes in this functional unit will be
required as well as follow-up features defining the optimal therapeutic goal during extended therapy, i.e. life-long in neovascular AMD.
Three novel additions to the OCT armamentarium are particularly promising in their capability to identify the biomarkers of the future:”
Polarization-sensitive OCT | OCT angiography | Adaptive optics imaging
“this modality is particularly appropriate to highlight early
features during the pathophysiological development of
neovascular AMD
Findings from studies using adaptive optics implied that
decreased photoreceptor function in early AMD may be
possible, suggesting that eyes with pseudodrusen appearance
may experience decreased retinal (particularly scotopic) function
in AMD independent of CNV or RPE atrophy.”
“...the specific patterns of RPE plasticity
including RPE atrophy, hypertrophy, and
migration can be assessed and quantified.
Moreover, polarization-sensitive OCT allows
precise quantification of RPE-driven disease
at the early stage of drusen”,
“Angiographic OCT with its potential to
capture choriocapillary, RPE, and
neuroretinal features provides novel
types of biomarkers identifying disease
pathophysiology rather than late
consecutive features during advanced
neovascular AMD.”
Schlanitz et al. (2011)
zmpbmt.meduniwien.ac.at
See also Leitgeb et al. (2014)
Zayit-Soudry et al. (2013)
26. Polarization-sensitive OCT
Features of Retinal Pigment Epithelium (RPE) evaluated on PS-OCT.
Color fundus photographs (1a–4a); PS-OCT RPE thickness maps (1b–
4b); and PS-OCT RPE segmentation B-scans (1c–4c) corresponding to
the yellow horizontal lines in the en-face images. Images illustrate
examples of RPE atrophy ([1a–c], dashed white line); RPE thickening
([2a–c], yellow circle); RPE skip lesion ([3a–c], white arrow) and RPE
aggregations ([4a–c]: yellow arrows). Roberts et al. (2016)
Color fundus photography (a), late phase fluorescein angiography (b), PS-OCT
imaging (c–j), and conventional SD-OCT imaging (k–o) of the right eye of a patient
with subretinal fibrosis secondary to neovascular AMD. Retardation en face (c),
pseudo–scanning laser ophthalmoscope (SLO) (d), median retardation en face (e),
and the axis en face map thresholded by median retardation (f) show similarity with
standard imaging (a, b). In the averaged intensity (g), depolarizing material (h), axis
orientation (i), and retardation B-scans (j) from PS-OCT the scar complex can be
observed as subretinal hyperreflective and birefringent tissue. The retinal pigment
epithelium is absent in the area of fibrosis (h); however, clusters of depolarizing
material are consistent with pigment accumulations in (a). Note the “column-like”
pattern in the axis orientation B-scan image (i) reflecting the intrinsic birefringence
pattern of collagenous fibers in fibrous tissue. Tracings from PS-OCT segmentation
(f) were overlayed on color fundus photography (a) to facilitate the comparison
between the two imaging modalities. The retinal thickness map (k), central
horizontal (l), and vertical (m) B-scans as well as an ETDRS-grid with retinal
thickness (n), and the pseudo-SLO (o) of the fibrous lesion generated from
conventional SD-OCT (Carl Zeiss Meditec) are shown for comparison. Color scales:
0 to 50° for retardation en face (c), −90 to +90° for axis orientation (f, i), 0 to +90°
for median retardation (e), and retardation B-scan (j). Roberts et al. (2016)b
27. OCT angiography
OCT Angiography and Fluorescein Angiography of
Microaneurysms in diabetic retinopathy.
The right eye (A) and left eye (B) of a 45-year-old Caucasian man
with non-proliferative diabetic retinopathy using the swept
source optical coherence tomography angiography (OCTA)
prototype (A1) Fluorescein angiography (FA) cropped to
approximately 6 x 6 mm. Aneurysms are circled in yellow. (A2)
Full-thickness (internal limiting membrane to Bruch’s membrane)
6 x 6 mm OCT angiogram. (B1) FA cropped to approximately 3
x 3 mm. Aneurysms are circled in yellow. (B2) Full-thickness 3 x
3 mm OCT angiogram, which provides improved detail over 6 x
6 mm OCT angiograms, demonstrates higher sensitivity in
detecting microvascular abnormalities. FAZ appears enlarged.
Aneurysms that are seen on FA in B1 that are also seen on
OCTA are circled in yellow. Aneurysms on FA that are seen as
areas of capillary non-perfusion on OCTA are circled in blue.
de Carlo et al. (2015)
Disc photographs (A, C) and en face OCT
angiograms (B, D) of the ONH in representative
normal (A, B) and preperimetric glaucoma
(PPG) subjects (C, D). Both examples are from
left eyes. In (B) and (D) the solid circles indicate
the whole discs, and the dash circles indicate
the temporal ellipses. A dense microvascular
network was visible on the OCT angiography of
the normal disc (B). This network was greatly
attenuated in the glaucomatous disc (D)
Jia et al. (2012)
Total (a) and temporal (b) optic nerve head (ONH)
acquisition in a normal patient. Total (c) and temporal (d)
ONH acquisition in a glaucoma patient.
Lévêque et al. (2016)
In glaucoma the vascularization of
the optic nerve head is greatly
attenuated.
This is not readily visible from
the fundus photograph (see
above).
Prada et al. (2016)
29. Adaptive optics systems in practice
Not many commercial systems are available; most remain in university laboratories.
See Imagine Eyes' http://www.imagine-eyes.com/product/rtx1/
30. Adaptive optics Functional “add-ons”
https://dx.doi.org/10.1364/BOE.3.000225
https://doi.org/10.1364/BOE.6.003405
https://doi.org/10.1364/BOE.7.001051
http://dx.doi.org/10.1145/2857491.288858
Integrate pupillometry for clinical assessment into
the AO system, as pupil tracking is useful for
optimizing imaging quality as well
doi:10.1371/journal.pone.0162015
31. Multispectral Imaging
http://dx.doi.org/10.1038/eye.2011.202
Absorption spectra for the major absorbing elements of the eye. Note that some of the spectra change with
relatively small changes in wavelength. Maximizing the differential visibility requires utilizing small spectral
slices. Melanin is the dominant absorber beyond 600 nm.
Zimmer et al. (2014)
Zimmer et al. (2014)
Zimmer et al. (2014)
Zimmer et al. (2014) The aim of this project is to build and clinically test a
reliable multi-spectral imaging device that allows in
vivo imaging of oxygen tension and β-amyloid in
human eyes. Maps showing the possible existence and
distribution of β-amyloid plaques will be obtained in
glaucoma patients and possibly patients with (early)
Alzheimer’s disease.
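Picking the 'small spectral slices' that maximize differential visibility can be phrased as choosing the band where two chromophores' absorbances differ most. The toy spectra below are entirely made up for illustration (real melanin and haemoglobin curves differ):

```python
import numpy as np

# hypothetical absorbance curves sampled at 10 nm steps over 500-700 nm
wavelengths = np.arange(500, 701, 10)
melanin = np.exp(-(wavelengths - 500) / 150.0)            # slowly decaying
haemoglobin = np.exp(-((wavelengths - 560) / 25.0) ** 2)  # peaked near 560 nm

# select the narrow band where the two toy absorbers differ the most
best = wavelengths[np.argmax(np.abs(melanin - haemoglobin))]
print(best)  # band of maximum differential visibility for these toy curves
```

In practice one would optimize over real extinction spectra, filter bandwidths, and detector noise rather than a single sample point.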
32. OCT towards handheld devices
http://dx.doi.org/10.1364/BOE.5.000293
10.1038/nphoton.2016.141
http://dx.doi.org/10.1364/OE.24.013365
Here, we report the design and operation of a handheld probe that can perform both scanning laser
ophthalmoscopy and optical coherence tomography of the parafoveal photoreceptor structure in infants
and children without the need for adaptive optics. The probe, featuring a compact optical design weighing
only 94 g, was able to quantify packing densities of parafoveal cone photoreceptors and visualize cross-
sectional photoreceptor substructure in children with ages ranging from 14 months to 12 years.
https://aran.library.nuigalway.ie/handle/10379/5481
An EU-funded Horizon 2020 project led by Wolfgang Drexler from the Medical University of Vienna is aiming to shrink the core
technology to no more than the size of a coin, primarily to diagnose eye diseases including diabetic retinopathy and glaucoma.
“OCTCHIP” (short for ophthalmic OCT on a chip; the project began at the start of 2016) applies this
directly in the field of OCT for ophthalmology.
http://optics.org/news/7/6/19 | cordis.europa.eu/project/rcn/199593 | jeppix.eu
34. Functional biomarkers
MICROPERIMETRY VISUAL FIELD
Right eye of a 72-year-old man. Native en-face image
(A) and reticular drusen (RDR) area highlighted (B).
Interpolated test results for both scotopic (C) and
photopic (D) microperimetry. Numerical values for
scotopic (E) and photopic (F) microperimetry.
Steinberg et al. (2015)
The Cassini diagnostic device offers a suite of
examinations including corneal topography, mesopic
and photopic pupillometry, and color photography
for diagnostic purposes. crstodayeurope.com
Nissen et al (2014). “Melanopsin”-based pupillometry:
differential post-illumination pupil response (PIPR) due to
pathological changes in the ganglion cell layer (GCL)
Pupillometry (“Pupillary Light Reflex”)
ophthalmologymanagement.com
Multifocal ERG responses from the macular
area of a patient with AMD. The responses of the
fovea are reduced in amplitude. In the 3-D map
it can be seen that the foveal area is flat,
suggesting no cone activity, compared with the
characteristic peak of responses in the normal
retina.
webvision.med.utah.edu
35. Clinical diagnosis current
Fundus photographs, optical coherence tomography (OCT)
images, thickness maps, and profiles of thickness of the
circumpapillary retinal nerve fiber layer (cpRNFL) in the right eye
of a 60-year-old woman with open-angle glaucoma and a mean
deviation (MD) of –2.33 dB. Nukada et al. (2011)
Conclusions
Assessment of RNFL thickness with OCT was able to detect
glaucomatous damage before the appearance of visual field
defects on SAP. In many subjects, significantly large lead
times were seen when applying OCT as an ancillary
diagnostic tool.
http://dx.doi.org/10.1016/j.ophtha.2015.06.015
36. Visual field
“Patterns of early glaucomatous visual field loss and their evolution
over time” http://iovs.arvojournals.org/article.aspx?articleid=2333021
http://dx.doi.org/10.1016/j.ophtha.2014.08.014
http://dx.doi.org/10.1016/j.ophtha.2015.10.046
http://dx.doi.org/10.1016/j.ophtha.2015.12.014
http://dx.doi.org/10.1016/j.ajo.2015.12.006 http://dx.doi.org/10.1007/s12325-016-0333-6
Humphrey HFA II-i
Field Analyzer http://ibisvision.co.uk/
37. Additional biomarkers
http://dx.doi.org/10.1016/j.ophtha.2015.11.009
Conclusions
We report macular thickness data derived from SD OCT
images collected as part of the UKBB study and found novel
associations among older age, ethnicity, BMI, smoking,
and macular thickness.
Correspondence: Praveen J. Patel, FRCOphth, MD(Res), Moorfields Eye Hospital NHS
Foundation Trust, 162 City Road, London, EC1V2PD UK.
http://dx.doi.org/10.1167/iovs.14-15278
http://dx.doi.org/10.1016/j.arr.2016.05.013
38. Retina beyond retinal pathologies
http://dx.doi.org/10.1016/j.neuroimage.2010.06.020
http://dx.doi.org/10.4172/2161-0460.1000223
http://dx.doi.org/10.1371/journal.pone.0085718
http://dx.doi.org/10.1186/s40478-016-0346-z
Affiliated with: UCL Institute of Ophthalmology, University College
London
retinalphysician.com
http://dx.doi.org/10.1097/WCO.0b013e328334e99b http://dx.doi.org/10.1016/j.pscychresns.2011.08.011
39. Imaging technique
implications for Automatic diagnosis
“Garbage in – Garbage out” Fundus image captures very macro-level changes, and works with advanced pathologies,
but how about detecting very early signs allowing very early interventions as well?
Cannot analyze something that is not visible in the image
The retina appears normal in the fundus photograph, but extensive loss of outer segments is revealed in the superimposed montage of AOSLO images. Dropout is visible
everywhere in the AOSLO montage, but increases sharply at 6.5° (arrow) from the optic disc coinciding with the border of the subject’s enlarged blind spot. Arrow indicates blood
vessel marked in Fig. 2. F = fovea. For a higher resolution image, see Fig. S2. Red boxed region is shown in Fig. 4, green boxed region in Fig. 5a,b. Horton et al. (2015)
Towards multimodal image analysis: try to image all relevant pathological features,
and multivariate analysis incorporating functional measures,
and even some more static variables from electronic health records (EHR)
40. Imaging technique: Mobile phone
Going for quantity rather than quality
Instead of high-end imaging solutions, one could go for a smartphone-based solution on
the side and try to gather as much low-quality training data as possible, which would
be helpful in developing nations to allow easily accessible healthcare.
For some details and startup, see the following slideshow:
http://www.slideshare.net/PetteriTeikariPhD/smartphonepowered-ophthalmic-diagnostics
eyenetra.com
41. Mobile Ecosystems
Apple HealthKit
https://developer.apple.com/healthkit/
theophthalmologist.com/issues/0716
“Despite the availability of multiple health data
aggregation platforms such as Apple’s HealthKit,
Microsoft’s Health, Samsung’s S Health, Google Fit, and
Qualcomm Health, the public will need to be convinced
that such platforms provide long-term security of
health information. In the rapidly developing business
opportunities represented by the worlds of ehealth and
mhealth, the blurring of the lines between consumer
goods and medical devices will be further tested by the
consumer goods industry hoping not to come under
the scrutiny of the FDA.”
meddeviceonline.com
http://www.medscape.com/viewarticle/852779
doi:10.5811%2Fwestjem.2015.12.28781
imedicalapps.com/2016/03/ohiohealth-epic-apple-health/
http://www.wareable.com/sport/google-fit-vs-apple-health
44. Retinal Image Quality
See the nice literature review in Dias' Master's thesis.
– Color
– Focus
– Illumination
– Contrast
+ Camera artifacts
+ Noise (~SNR)
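Some of these quality factors admit cheap no-reference proxies, e.g. focus via the variance of a Laplacian and contrast via the intensity standard deviation. A minimal sketch on synthetic data follows; the metrics are generic image-processing heuristics, not taken from Dias' thesis:

```python
import numpy as np

def laplacian_variance(img):
    """Focus proxy: variance of a 4-neighbour Laplacian; blurred images score lower."""
    lap = (-4 * img[1:-1, 1:-1] + img[:-2, 1:-1] + img[2:, 1:-1]
           + img[1:-1, :-2] + img[1:-1, 2:])
    return lap.var()

def rms_contrast(img):
    """Contrast proxy: standard deviation of pixel intensities."""
    return img.std()

rng = np.random.default_rng(0)
sharp = rng.random((32, 32))
# crude 3x3 box blur: average each pixel with its eight neighbours
blurred = (sharp[:-2, :-2] + sharp[:-2, 1:-1] + sharp[:-2, 2:] +
           sharp[1:-1, :-2] + sharp[1:-1, 1:-1] + sharp[1:-1, 2:] +
           sharp[2:, :-2] + sharp[2:, 1:-1] + sharp[2:, 2:]) / 9.0

print(laplacian_variance(sharp) > laplacian_variance(blurred))  # True
print(rms_contrast(sharp) > rms_contrast(blurred))              # True
```

Such scalar scores can gate a pipeline (reject or re-acquire low-quality images) before any downstream CNN sees the data.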
45. Domain-specific IMAGE QUALITY: Fundus
Usher et al. (2003)
Maberley et al. (2004)
Fleming et al. (2006)
http://dx.doi.org/10.1016/j.compbiomed.2016.01.027
Wang et al. (2016)
46. Domain-specific IMAGE QUALITY: OCT #1
Nawas et al. (2016)
http://dx.doi.org/10.1136/bjo.2004.097022
http://dx.doi.org/10.1080/02713683.2016.1179332
http://dx.doi.org/10.1111/opo.12289
47. Domain-specific IMAGE QUALITY: OCT #2
Focus: The left image is blurred due to poor focusing. This results in increased noise and
loss of transversal resolution in the OCT image on the right.
Signal: The signal strength for this image is 13 dB, which is lower than the limit of 15
dB. This results in a noisier OCT image with a lot of speckling.
Decentration: The ring scan is not correctly centred, as can be observed in the left
image. The edge of the optic nerve head crosses more than two circles; therefore
the ring scan is rejected.
Algorithm failure: The red line in the OCT image on the right is not clearly at the border of
the RNFL. The location corresponds to inferior of the ONH.
Retinal pathology: There is severe peripapillary atrophy, which can be seen to
affect the RNFL enormously.
Illumination: This OCT scan is poorly illuminated, which again results in
speckling and a decrease in resolution.
Beam placement: The laser beam is not placed centrally. This can be seen at the
outer nuclear layer (ONL). The two arrows point to two regions of the ONL: the
left arrow points to a light grey region whereas the other points to a darker grey
region. If there is too much difference in the colour of the ONL itself, the scan is
rejected.
The OSCAR-IB Consensus Criteria for Retinal OCT Quality Assessment
http://dx.doi.org/10.1371/journal.pone.0034823
48. OCT Device Comparison
Comparison of images obtained with 3 different
spectral-domain OCT devices (Topcon 3D OCT-
1000, Zeiss Cirrus, Heidelberg Spectralis) of both
eyes of the same patient with early AMD changes
taken just minutes apart.
Comparison of images obtained with 3 different
spectral-domain OCTs (Heidelberg Spectralis,
Optovue RTVue, Topcon 3D OCT-1000) and with
1 time-domain OCT (Zeiss Stratus) of both eyes
of the same patient with a history of central
serous chorioretinopathy in both eyes.
The same set of images as shown above in pseudo-color.
Comparison of horizontal B-scan images and 3D
images of a patient with neovascular age-related
macular degeneration obtained with Heidelberg
Spectralis, Zeiss Cirrus, Topcon 3D OCT-1000.
Spectral-domain Optical
Coherence Tomography: A
Real-world Comparison
IRENE A. BARBAZETTO, MD · SANDRINE A. ZWEIFEL,
MD · MICHAEL ENGELBERT, MD, PhD · K. BAILEY
FREUND, MD · JASON S. SLAKTER, MD
retinalphysician.com
e.g. from Xie et al. (2015): “Hyper-class
Augmented and Regularized Deep Learning for
Fine-grained Image Classification”
How much inter-device variance is there? Are the images more or less the same between devices in the CNN sense, or does inter-individual variance dominate?
49. OCT IMAGE Quality issues & ARTIFACTS
http://dx.doi.org/10.1155%2F2015%2F746150
Blink artifact Smudged Lens Floaters over
Optic disk
Patient-Dependent Factors
Operator-Dependent Factors
Device-Dependent Factors
Pupil Size, Dry Eye, and Cataract
Floaters and Other Vitreous Opacities
Epiretinal Membranes
Blinks
Motion Artifacts
Signal Strength
OCT Lens Opacities
Incorrect Axial Alignment of the OCT image
Inaccurate Optic Disc Margins Delineation
Inaccurate Retinal Nerve Fiber Layer Segmentation
50. OCT factors affecting quality
RNFLT: retinal nerve fiber layer thickness.
Note: case examples obtained using Cirrus HD-
OCT (Carl Zeiss Meditec, Dublin, CA; software
version 5.0.0.326). The content of this table may
not be applicable to different Cirrus HD-OCT
models or to other Spectral-domain OCT devices.
http://dx.doi.org/10.1155/2015/746150
51. OCT Image quality issues & ARTIFACTS #2
http://dx.doi.org/10.1016/j.ophtha.2009.10.029
Recent studies demonstrated a lower frequency of artifacts in SD-OCT instruments compared with Stratus TD-OCT [2,3]. Interestingly, the authors identified several types of clinically important artifacts generated by SD-OCT, including those previously seen in TD-OCT and those new with SD-OCT [1].
We have recently performed a similar analysis by comparing TD-OCT and SD-OCT (Querques G, unpublished data, June 2010), and our findings completely agree with those reported by the authors. Here, we would like to focus on new artifacts seen on SD-OCT. Given that the Fourier transform of OCT information is Hermitian [4], a real image is always accompanied by its inverted image [5]. This feature of SD-OCT may be responsible for image artifacts that could be mistakenly interpreted as retinal lesions. This is especially true if scan acquisition is performed by a technician, and the physician then analyzes the printout for diagnostic evaluation.
This was recently the case, when we were faced with an unusual printout showing a small, round retinal lesion located within the outer plexiform layer, which presented a shadowing effect not only in the deeper layers but even in the superficial layers. This was evident with both Cirrus HD-OCT and Spectralis HRA-OCT. Interestingly, in some other printouts, the lesion was still located within the outer plexiform layer, even though no clear shadowing effect was evident.
When we returned to this patient by personally performing the SD-OCT examination, we realized that the patient presented asteroid bodies in the vitreous, which, due to the Fourier transformation of OCT information (the inverted image always accompanying the real image), were responsible for the “pseudo” retinal lesions.
Artifacts represent a major concern for every imaging modality. Although SD-OCT marks a significant advance in the ability to image the retina, artifacts may still influence clinical decisions. Recognizing the limitations of OCT, as well as the “new” and “old” misleading image artifacts, will help physicians in everyday clinical practice.
COMMENTARY by Querques et al. 2010
Purpose
To report the frequency of optical coherence tomography (OCT) scan artifacts and
to compare macular thickness measurements, interscan reproducibility, and
interdevice agreeability across 3 spectral-domain (SD) OCT (also known as Fourier
domain; Cirrus HD-OCT, RTVue-100, and Topcon 3D-OCT 1000) devices and 1
time-domain (TD) OCT (Stratus OCT) device.
Results
Time-domain OCT scans contained a significantly higher percentage of clinically significant improper central foveal thickness (IFT) after manual correction (11-μm change or more) compared with SD OCT scans. Cirrus HD-OCT had a
significantly lower percentage of clinically significant IFT (11.1%) compared with the
other SD OCT devices (Topcon 3D, 20.4%; Topcon Radial, 29.6%; RTVue (E)MM5,
42.6%; RTVue MM6, 24.1%; P = 0.001). All 3 SD OCT devices had central foveal
subfield thicknesses that were significantly more than that of TD OCT after manual
correction (P<0.0001). All 3 SD OCT devices demonstrated a high degree of
reproducibility in the central foveal region (ICCs, 0.92–0.97). Bland-Altman plots
showed low agreeability between TD and SD OCT scans.
Conclusions
Of all the OCT devices analyzed, Cirrus HD-OCT scans exhibited the lowest occurrence of any artifacts (68.5%), IFT (40.7%), and clinically significant IFT (11.1%), whereas Stratus OCT scans exhibited the highest occurrence of clinically significant IFT. Further work on improving segmentation algorithms to decrease artifacts is warranted.
http://dx.doi.org/10.1016/j.ophtha.2009.03.034
52. OCT Image quality issues & ARTIFACTS #3
http://dx.doi.org/10.1016/j.ophtha.2006.06.059
Conclusions
Retinal thickness and retinal height could be underestimated in patients
with central serous chorioretinopathy (CSC) or neovascular age-related
macular degeneration (AMD) after retinal thickness analysis in Stratus
OCT when either automatic measurements or manual caliper–assisted
measurements are performed on the analyzed images. We recommend
exporting the original scanned OCT images for retinal thickness and
retinal height measurement in patients with CSC or neovascular AMD.
http://dx.doi.org/10.1212/WNL.0000000000002774
Objective: To develop consensus recommendations for reporting of quantitative
optical coherence tomography (OCT) study results.
Methods: A panel of experienced OCT researchers (including 11 neurologists, 2
ophthalmologists, and 2 neuroscientists) discussed requirements for performing
and reporting quantitative analyses of retinal morphology and developed a list of
initial recommendations based on experience and previous studies. The list of
recommendations was subsequently revised during several meetings of the
coordinating group.
Results: We provide a 9-point checklist encompassing aspects deemed relevant
when reporting quantitative OCT studies. The areas covered are study protocol,
acquisition device, acquisition settings, scanning protocol, funduscopic imaging,
postacquisition data selection, postacquisition data analysis, recommended
nomenclature, and statistical analysis.
Conclusions: The Advised Protocol for OCT Study Terminology and Elements
recommendations include core items to standardize and improve quality of
reporting in quantitative OCT studies. The recommendations will make
reporting of quantitative OCT studies more consistent and in line with existing
standards for reporting research in other biomedical areas. The
recommendations originated from expert consensus and thus represent Class IV
evidence. They will need to be regularly adjusted according to new insights and
practices.
http://dx.doi.org/10.1371/journal.pone.0137316
Methods: Studies that used intra-retinal layer segmentation of macular OCT scans in patients with MS were
retrieved from PubMed. To investigate the repeatability of previously applied layer estimation approaches, we
generated datasets of repeating measurements of 15 healthy subjects and 13 multiple sclerosis patients using
two OCT devices (Cirrus HD-OCT and Spectralis SD-OCT). We calculated each thickness estimate in each
repeated session and analyzed repeatability using intra-class correlation coefficients and coefficients of
repeatability.
Results: We identified 27 articles, eleven of them used the Spectralis SD-OCT, nine Cirrus HD-OCT, two
studies used both devices and two studies applied RTVue-100. Topcon OCT-1000, Stratus OCT and a research
device were used in one study each. In the studies that used the Spectralis, ten different thickness estimates
were identified, while thickness estimates of the Cirrus OCT were based on two different scan settings. In the
simulation dataset, thickness estimates averaging larger areas showed an excellent repeatability for all retinal
layers except the outer plexiform layer (OPL).
Conclusions: Given the good reliability, the thickness estimate of the 6mm-diameter area around the fovea
should be favored when OCT is used in clinical research. Assessment of the OPL was weak in general and
needs further investigation before OPL thickness can be used as a reliable parameter.
53. OCT repeatability
Explanation of different thickness estimates used for the simulation of repeatability. The red areas or
points on the fundus images indicate the values that were averaged to generate the layer thickness
estimates.
http://dx.doi.org/10.1371/journal.pone.0137316
Differences in the outer plexiform layer (OPL)
in repeated OCT measurements
The values in the grid are the mean OPL
thickness differences for each sector. The
right graph maps the OPL thickness of the B-
scans in (A) (green line) and (B) (blue line),
respectively. The red line indicates the
difference between the repeated B-scans
54. OCT Inter-Device variability and intra-device reproducibility
http://dx.doi.org/10.1136/bjophthalmol-2014-305573
Methods: 29 eyes were imaged prospectively with Spectralis (Sp), Cirrus
(Ci), 3D-OCT 2000 (3D) and RS-3000 (RS) OCTs. … Conclusions: By
comparison of identical regions, substantial differences were detected
between the tested OCT devices regarding technical accuracy and clinical
impact. Spectralis showed lowest error incidence but highest error impact.
Purpose: To evaluate and compare the frequency, type and cause of imaging artifacts
incurred when using swept-source optical coherence tomography (SS OCT) and Cirrus
HD OCT in the same patients on the same day.
Conclusions: There was no significant difference in the frequency, type and cause of
artifacts between SS OCT and Cirrus HD OCT. Artifacts in OCT can influence the
interpretation of OCT results. In particular, ERM around the optic disc could contribute
to OCT artifacts and should be considered in glaucoma diagnosis or during patient
follow-up using OCT.
http://dx.doi.org/10.3109/02713683.2015.1075219
http://dx.doi.org/10.1167/tvst.4.1.5
Conclusions: RTVue thickness reproducibility appears similar to Stratus. Conversion equations to transform RTVue measurements to Stratus-equivalent values within 10% of the observed Stratus RT are feasible. CST changes greater than 10% when using the same machine, or 20% when switching from Stratus to RTVue after conversion to Stratus equivalents, are likely due to a true change beyond measurement error.
Translational Relevance: Conversion equations to translate central retinal thickness measurements between OCT instruments are critical to clinical trials.
Bland-Altman plots of the differences between values
on machines (RTVue minus Stratus) versus the means
of the automated Stratus test–retest values, for each
measurement. CST - Central subfield thickness results
More on Bland-Altman, see for example: McAlinden et al. (2011): “Statistical methods
for conducting agreement (comparison of clinical tests) and precision (repeatability or reproducibility)
studies in optometry and ophthalmology” Cited by 108
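The Bland–Altman analysis referenced above reduces to simple statistics on paired measurements: plot (mean of pair, difference of pair) and report the bias with its 95% limits of agreement. A minimal sketch, assuming paired thickness readings from two devices (the function name and inputs are illustrative):

```python
import numpy as np

def bland_altman(a, b):
    """Bias and 95% limits of agreement for paired measurements.

    Plot points are (mean of pair, difference of pair); the bias is the
    mean difference and the limits are bias +/- 1.96 * SD(differences).
    """
    a, b = np.asarray(a, float), np.asarray(b, float)
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

A constant inter-device offset (e.g. RTVue reading 5 μm above Stratus) shows up as a nonzero bias with tight limits of agreement.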
55. OCT IMAGE Quality issues & ARTIFACTS #4
http://dx.doi.org/10.1371/journal.pone.0034823
The total number of rejected OCT scans (prospective validation set of 159 OCT scans from Amsterdam, San Francisco and Calgary) from the pooled prospective validation set was high (42%–43%) for each of the readers.
56. OCT IMAGE Quality Summary
●
Based on the results of the OSCAR-IB study by
Tewarie et al. (2012), we can see that almost half of the OCT
images were rejected!
– This poses challenges for the deep learning classification framework, as bad-quality samples can then be misclassified.
●
Two mutually non-exclusive approaches:
– Improve the image quality of the scans.
●
Improve the hardware itself
●
Make the scanning more intelligent with software without
having to change underlying hardware
– Improve the automated algorithms distinguishing good-quality scans from bad-quality ones.
57. OCT-SPECIFIC CORRECTIONS #1
http://dx.doi.org/10.1109/ISBI.2016.749324
The example of OCT images of the nerve head (below
row) affected by motion artifact (top row). (a) En face
fundus projection (b) B-scan.
http://dx.doi.org/10.1088/2057-1976/2/3/035012
http://dx.doi.org/10.1016/j.ijleo.2016.05.088
(a-1)–(a-3) are cartoon part u, texture part v and speckle noise part w decomposed of Fig. 1 by variational image
decomposition model TV-G-Curvelet; (b-1)–(b-3) are cartoon part u, texture part v and speckle noise
part w decomposed of Fig. 1 by variational image decomposition model TV-Hilbert-Curvelet.
58. OCT-SPECIFIC CORRECTIONS #2
Optical Coherence Tomography (OCT) is an emerging technique in the field of biomedical imaging,
with applications in ophthalmology, dermatology, coronary imaging etc. OCT images usually suffer
from a granular pattern, called speckle noise, which restricts the process of interpretation.
Therefore the need for speckle noise reduction techniques is of high importance. To the best of
our knowledge, use of Independent Component Analysis (ICA) techniques has never been
explored for speckle reduction of OCT images. Here, a comparative study of several ICA
techniques (InfoMax, JADE, FastICA and SOBI) is provided for noise reduction of retinal OCT
images. Having multiple B-scans of the same location, the eye movements are compensated using
a rigid registration technique. Then, different ICA techniques are applied to the aggregated set of
B-scans for extracting the noise-free image. Signal-to-Noise-Ratio (SNR), Contrast-to-Noise-Ratio
(CNR) and Equivalent-Number-of-Looks (ENL), as well as analysis on the computational
complexity of the methods, are considered as metrics for comparison. The results show that the use
of ICA can be beneficial, especially when fewer B-scans are available.
Overall, Second Order Blind Identification (SOBI) is the best among the
ICA techniques considered here in terms of performance based on SNR,
CNR and ENL, while needing less computational power.
http://dx.doi.org/10.1007/978-3-540-77550-8_13
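The ICA comparison above starts from repeated, registered B-scans of the same location. A much simpler baseline on the same input (not the paper's method) is plain frame averaging, which improves SNR by roughly √N for uncorrelated noise; a sketch with a reference-based SNR metric for checking the gain:

```python
import numpy as np

def average_bscans(stack):
    """Average N registered B-scans (shape (N, H, W)) to suppress speckle-like noise."""
    return np.asarray(stack, float).mean(axis=0)

def snr_db(clean, noisy):
    """SNR in dB of `noisy` against a noise-free reference `clean`."""
    clean = np.asarray(clean, float)
    err = np.asarray(noisy, float) - clean
    return 10 * np.log10(np.mean(clean ** 2) / np.mean(err ** 2))
```

With 8 frames the averaged image should gain roughly 9 dB over any single frame; ICA-based methods aim to do better when N is small.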
61. IMAGE QUALITY ASSESSMENT
●
With enough manual labels we could train again a deep learning
network to do the quality classification for us
For natural images:
http://dx.doi.org/10.1109/TNNLS.2014.2336852
For natural images:
http://dx.doi.org/10.1016/j.image.2015.10.005
http://dx.doi.org/10.1016/j.cmpb.2016.03.011
For natural images
http://arxiv.org/abs/1602.05531
What about using generative adversarial networks (GANs) as well when training for proper image quality?
63. OCT Devices already have GPUs
●
Increasing use of GPUs throughout the OCT computation
pipeline.
– More operations in less time compared to CPU
computations for many algorithms.
– GPU computation allows one to embed artificial
intelligence into the device itself
e.g. Moptim Mocean 3000
64. OCT or custom FPGA boards
http://www.alazartech.com/landing/oct-news-2016-09
Complete on-FPGA FFT solution that includes:
• User programmable dispersion compensation function
• User programmable windowing
• Log calculation
• FFT magnitude output in floating point or integer format
Special "Raw + FFT" mode that allows users to acquire both time domain and FFT data
• This can be very useful during the validation process
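The on-FPGA steps listed above (windowing, FFT, log output) mirror the standard spectral-domain OCT reconstruction of one A-scan. A generic offline sketch in NumPy, assuming a raw spectral interferogram as input (this is the textbook pipeline, not AlazarTech's implementation, and dispersion compensation is omitted):

```python
import numpy as np

def reconstruct_ascan(spectrum, window=None):
    """SD-OCT A-scan reconstruction: window the raw interferogram,
    FFT to depth space, keep the log magnitude of the positive-depth
    half (the Hermitian mirror image is discarded)."""
    spectrum = np.asarray(spectrum, float)
    if window is None:
        window = np.hanning(spectrum.size)  # user-programmable windowing
    depth = np.fft.fft(spectrum * window)
    half = depth[: spectrum.size // 2]          # drop the mirrored half
    return 20 * np.log10(np.abs(half) + 1e-12)  # log calculation, in dB
```

Discarding the mirrored half is exactly the Hermitian-symmetry point raised in the artifact discussion above: the inverted image always accompanies the real one.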
65. GPU interventional OCT
4D Optical Coherence Tomography Imaging
Demo of GPU-based real-time 4D OCT technology, providing
comprehensive spatial view of micro-manipulation region with accurate
depth perception. Image reconstruction performed by NVIDIA GTX 580 and
volume rendering by NVIDIA GTS 450. The images are volume rendered
from the same 3D data set. Imaging speed is 5 volumes per second. Each
volume has 256×100×1024 voxels, corresponding to a physical volume of
3.5mm×3.5mm×3mm. http://www.nvidia.co.uk
Real-time 4D signal processing and visualization using graphics processing unit on a regular nonlinear-k Fourier-domain OCT system
K. Zhang (2010), cited by 161
http://dx.doi.org/10.1167/iovs.16-19277
repository.cmu.edu
http://dx.doi.org/10.3807/JOSK.2013.17.1.068
Flowchart of the computation
and image display of the hybrid
CPU/GPU processing scheme
in the program.
http://dx.doi.org/10.1364/OE.20.014797
66. Embedded decision system Next Generation
●
Upgrade from Quadro 600 → Titan X / GTX 970, depending on the power needed per price.
– Accelerating traditional signal processing operations,
and the future artificial intelligence analysis
AI does not have to be limited to analysis for pathology!
●
Use AI to find Regions of Interest (ROIs), and do denser sampling
from possibly pathological areas of the retina.
– More data from relevant regions → better analysis accuracy.
●
AI to optimize image quality, e.g.
– Super-resolution from multiple scans within device
– Multiple scans to get rid of artifacts
– Train AI for image denoising / deconvolution
– Make the analysis quality less reliant on the operator
Systems engineering approach
Optimize the whole process from imaging to analysis jointly rather than separately
MOptim MOcean 3000
67. UPgrading EXISTING SYSTEMS
●
Add value to the existing install base by providing an “AI module” that is in essence a
“Raspberry Pi/Arduino”-style minicomputer running an embedded GPU accelerator (NVIDIA
Jetson)
69. Image quality improvement Super-resolution
Super-resolution from retinal
fundus videos (Köhler et al. 2014)
Improved
dynamic
range
https://www5.informatik.uni-erlangen.de/Forschung/Publikationen/2016/Kohler16-SRI-talk.pdf
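The core idea behind multi-frame super-resolution can be illustrated with a naive "shift-and-add" sketch (this is not Köhler et al.'s actual method): each low-resolution frame samples the high-resolution grid at a known sub-pixel offset, so placing every frame back at its offset and averaging fills in the finer grid. Integer offsets on the HR grid are assumed; real fundus videos need sub-pixel registration first:

```python
import numpy as np

def shift_and_add(frames, shifts, factor=2):
    """Naive shift-and-add super-resolution.

    frames: list of low-res arrays, all the same shape (H, W)
    shifts: per-frame (dy, dx) offsets on the high-res grid, 0 <= dy, dx < factor
    Returns an upscaled (H*factor, W*factor) estimate.
    """
    h, w = frames[0].shape
    hr = np.zeros((h * factor, w * factor))
    count = np.zeros_like(hr)
    for frame, (dy, dx) in zip(frames, shifts):
        hr[dy::factor, dx::factor] += frame   # scatter frame onto its HR offsets
        count[dy::factor, dx::factor] += 1
    return hr / np.maximum(count, 1)          # average overlapping contributions
```

When the four quarter-pixel-shifted frames tile the HR grid exactly, the scene is recovered perfectly; with noisy, imperfectly registered frames the averaging additionally denoises.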
70. Image quality improvement 3D Reconstruction
Multiple GPUs, Threads and
Reconstruction Volumes
Multiple GPUs can be used by Kinect Fusion, however, each must have its own
reconstruction volume(s), as an individual volume can only exist on one GPU. It is
recommended your application is multithreaded for this and each thread specifies a device
index when calling NuiFusionCreateReconstruction.
Multiple volumes can also exist on the same GPU – just create multiple instances of
INuiFusionReconstruction. Individual volumes can also be used in multi-threaded
environments, however, note that the volume related functions will block if a call is in
progress from another thread.
https://msdn.microsoft.com/en-us/library/dn188670.aspx
Motivation from Kinect and microscopy
http://www.label.mips.uha.fr/fichiers/articles/bailleul12spie.pdf
http://dx.doi.org/10.1364/OE.24.011839
71. 3D Reconstruction for OCT
2D example:
"The image reconstructions and super-resolution processing can be further accelerated by paralleled computing with graphics processing units (GPU), which can
potentially improve the applicability of the PSR method illustrated herein." - He et al. (2016)
3D example:
"We use a parallelized and hardware accelerated SVR reconstruction method. A full field of view reconstruction of 8 input stacks at 288 × 288 × 100 voxels takes up
to 1 – 2 hours using a small patch size (e.g., a = 32, ω = 16) on a multi GPU System (Intel Xeon E5-2630 2.60GHz system with 16 GB RAM, an Nvidia Tesla K40
(released back in 2013, 1.43 Tflops in double precision) and a Geforce 780). Using large (k = 0.1) overlapping super-pixels reduces this time to approximately 45min for
a full field-of-view volume, while maintaining a comparable result to the best configuration of overlapping square patches." - Kainz et al. (2015)
72. Image Quality Conclusion
●
As shown in previous slides, almost half of the OCT images were discarded due to bad image quality (OSCAR-IB
study by Tewarie et al., 2012)
– Wasteful to have the operator scan the patient and end up with a suboptimal-quality image.
●
Better to take an automated approach with multi-exposure scans, making scan quality operator-independent in the end. Take inspiration from computational photography and 'smart imaging'
Köhler et al. (2013)
MM’10, October 25–29, 2010, Firenze, Italy
graphics.stanford.edu
http://prolost.com/blog/lightl16
https://light.co/
ee.surrey.ac.uk
doi:10.1109/ICASSP.2012.6288078
Manuscripts are solicited to address a
wide range of topics on computer vision
techniques and applications focusing on
computational photography tasks,
including but not limited to the following:
●
Advanced image processing
●
Computational cameras
●
Computational illumination
●
Computational optics
●
High-performance imaging
●
Multiple images and camera arrays
●
Sensor and illumination hardware
●
Scientific imaging and videography
●
Organizing and exploiting
photo/video collections
●
Vision for graphics
●
Graphics for vision
wikicfp.com
74. Data engineering vs. data science
●
PROBLEM: Datasets come in various formats, often collected
by clinicians with little understanding of the data analysis
steps.
– Try to develop a pre-processing pipeline that takes several
different data formats and can parse them into “standardized”
(internal standard compatible with TensorFlow and similar
libraries) HDF5 dataformat
https://en.wikipedia.org/wiki/Hierarchical_Data_Format
– HDF allows storing image data mixed with metadata, for
example; it can be read in various environments and can be
converted further to other databases relatively easily.
●
If HDF5 proves to be inefficient, we can batch-convert all
the databases to a new format if desired.
– HDF5 is common in deep learning; Fuel, for example, uses HDF5.
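A minimal sketch of the internal HDF5 layout described above, using h5py. The dataset path and attribute names here are illustrative, not a proposed standard:

```python
import h5py
import numpy as np

def write_scan(path, volume, patient_id, device):
    """Store an OCT volume together with its metadata in one HDF5 file."""
    with h5py.File(path, "w") as f:
        dset = f.create_dataset("oct/volume", data=volume,
                                compression="gzip")   # chunked + compressed
        dset.attrs["patient_id"] = patient_id         # metadata as attributes
        dset.attrs["device"] = device

def read_scan(path):
    """Return (volume, metadata dict) from a file written by write_scan."""
    with h5py.File(path, "r") as f:
        dset = f["oct/volume"]
        return dset[...], dict(dset.attrs)
```

Because the metadata travels in the same file as the voxels, a TensorFlow input pipeline (or an R/MATLAB reader) sees one self-describing unit per scan.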
75. ETL (Extract, transform and load)
A Typical Data Science Department
Most companies structure their data science departments
into 3 groups:
Data scientists: the folks who are “better engineers than
statisticians and better statisticians than engineers”. Aka,
“the thinkers”.
Data engineers: these are the folks who build pipelines
that feed data scientists with data and take the ideas from
the data scientists and implement them. Aka, “the doers”.
Infrastructure engineers: these are the folks who
maintain the Hadoop cluster / big data infrastructure. Aka,
“the plumbers”.
https://cran.r-project.org/web/packages/h5/index.html
http://docs.h5py.org/en/latest/build.html
http://www.kdnuggets.com/2016/03/engineers-shouldnt-write-etl.html
80. Proprietary formatS Vendor-specific OCT
zeiss.com
From Huang et al. (2013):
“Scans were obtained with certified photographers to
minimize the OCT data acquisition artifacts [15], [20].
The data samples were saved in the Heidelberg
proprietary .e2e format. They were exported from a
Heidelberg Heyex review software (version 5.1) in .vol
format and converted to the DICOM (Digital Imaging
and Communication in Medicine) [21] OPT (ophthalmic
tomography) format using a custom application built in
MATLAB. “
These plugins interpret raw binary files exported from Heidelberg
Spectralis Viewing Software. They successfully import both 8-bit
SLO and 32-bit SD-OCT images, retaining pixel scale (optical and
SD-OCT), segmentation data, and B-scan position relative to the
SLO image (included in v1.1+). In addition to single B-scan SD-OCT
images, the plug-in also opens multiple B-scan SD-OCT images as a
stack, enabling 3-D reconstruction, analysis, and modeling. The
plug-in is compatible with Spectralis Viewing Module exporting raw
data in HSF-OCT-### format. Compatibility has been tested with
HSF-OCT-101, 102, and 103.
http://dx.doi.org/10.1016/j.exer.2010.10.009
Heidelberg Engineering Spectralis OCT RAW
data (.vol ending): Circular scans and Optic
Nerve Head centered volumes are supported
www5.cs.fau.de .. octseg/
File format? Ease of reading?
nidek-intl.com
No Cube export
moptim.com
optos.com
optovue.com
topconmedical.com
File formats?
Ease of reading?
For these vendors!
81. Proprietary → Open-source data formats
OpenEyes is a collaborative, open source, project led by
Moorfields Eye Hospital. The goal is to produce a
framework which will allow the rapid, and continuous
development of electronic patient records (EPR) with
contributions from Hospitals, Institutions, Academic
departments, Companies, and Individuals.
https://github.com/openeyes/OpenEyes
82. Proprietary Bitbucket C++ Project
https://bitbucket.org/uocte/uocte/wiki/Home
uocte / Heidelberg File Format
Because no specification of this file format was available for the
development of uocte, the file format was reverse engineered for
interoperability. The information on this page therefore is
incomplete and may be incorrect. It only serves to document
which parts of the data are interpreted by uocte and which
assumptions it makes concerning interpretation.
Heidelberg data is stored in a single binary, little endian file with
extension e2e or E2E. It contains a header, a directory that is split
in chunks of entries in a singly-linked list, and data chunks.
uocte / Topcon File Format
Because no specification of this file format was available for the
development of uocte, the file format was reverse engineered for
interoperability. The information on this page therefore is
incomplete and may be incorrect. It only serves to document
which parts of the data are interpreted by uocte and which
assumptions it makes concerning interpretation.
uocte / NIDEK File Format
Because no specification of this file format was available for the
development of uocte, the file format was reverse engineered for
interoperability. The information on this page therefore is
incomplete and may be incorrect. It only serves to document
which parts of the data are interpreted by uocte and which
assumptions it makes concerning interpretation.
….
File Format Notes
UOCTML
Eyetec
Heidelberg
NIDEK
Topcon
Zeiss
Reverse-engineered by Paul Rosenthal to have file readers for proprietary bit ordering
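Reading such reverse-engineered little-endian binaries typically starts with Python's struct module: fix a byte layout, unpack the header, then walk the directory entries. The layout below is purely hypothetical (a 12-byte magic string plus two uint32 fields), not the actual E2E or Topcon header, and only illustrates the pattern:

```python
import struct

# Hypothetical little-endian ("<") header: 12-byte magic, uint32 version,
# uint32 record count -- stand-ins for whatever the real format contains.
HEADER = struct.Struct("<12sII")

def parse_header(raw):
    """Unpack the (hypothetical) fixed-size header from the start of a file."""
    magic, version, n_records = HEADER.unpack_from(raw, 0)
    return magic.rstrip(b"\x00").decode("ascii"), version, n_records
```

The real work in projects like uocte is discovering the field layout in the first place; once known, each record type gets its own `struct.Struct` like this one.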
83. Typical volumetric medical formats
DICOM · NIfTI (.nii) · ANALYZE (.hdr, .img)
brainder.org
http://nipy.org/nibabel/gettingstarted.html
mathworks.com
http://people.cas.sc.edu/rorden/dicom/index.html
http://dicom.nema.org/
NEMA standard PS3, and as ISO standard
12052:2006
Practically outdated
An Analyze 7.5 data set consists of two files:
●
Header file (something.hdr): Provides
dimensional, identifying and some processing
history information
●
Image file (something.img): Stream of voxels,
whose datatype and ordering are described by
the header file
These links also describe the Analyze format in
more detail:
Mayo/Analyze description of file format.
SPM/FIL description of format (this is a less
detailed description than the SPM99 help system
provides - see above). However, note that the
SPM version of the Analyze format uses a couple
of the header fields in an unconventional way (see
below).
The NIfTI format has rapidly replaced Analyze in
neuroimaging research, being adopted as the default format
by some of the most widespread public-domain software
packages, such as FSL [12], SPM [13], and AFNI [14]. The format is
supported by many viewers and image analysis packages like
3D Slicer [15], ImageJ [16], and OsiriX, as well as other
emerging software like R [17] and Nibabel [18], besides
various conversion utilities.
An updated version of the standard, NIfTI-2, developed to
manage larger data sets, was defined in 2011. This
new version encodes each dimension of an image
matrix with a 64-bit integer instead of a 16-bit one as in NIfTI-1,
eliminating the size limit of 32,767.
The updated version maintains almost all the characteristics
of NIfTI-1 but, as it reserves double precision for some header
fields, comes with a header of 544 bytes [19].
Use this
doi:10.1007/s10278-013-9657-9
88. Number of images needed?
●
There is a rule of thumb stating that one should have 10× the number of samples as
parameters in the network (for a more formal approach, see VC dimension). For example,
the ResNet (He et al. 2015) in the ILSVRC2015 challenge had around 1.7M parameters, thus
requiring 17M images with this rule of thumb.
Zagoruyko et al. (2016)
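The arithmetic behind the rule of thumb is simple enough to sketch: count each layer's learnable weights, sum, and multiply by ten. The helper below counts only standard convolution layers and uses the conventional parameter formula; the 10× factor is the heuristic from the slide, not a guarantee:

```python
def conv_params(in_ch, out_ch, k, bias=True):
    """Learnable parameters of one k x k convolution layer:
    out_ch filters, each spanning in_ch * k * k weights (+ optional bias)."""
    return out_ch * (in_ch * k * k + (1 if bias else 0))

def samples_needed(n_params, factor=10):
    """Rule-of-thumb sample count: ~10x the parameter count."""
    return factor * n_params
```

For example, a first 3×3 conv layer from RGB to 64 channels has 1,792 parameters, and a 1.7M-parameter network lands on the 17M-sample figure quoted above.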
89. 17 million images? Not necessarily
●
Synthetically increase the number of training samples by distorting them in ways expected from the
dataset (random xy-shifts, left-right flips, added Gaussian noise, blur, etc.)
– For example, Krizhevsky et al. (2012) from UToronto, who pushed deep learning into the
mainstream, increased their training set (15M images from ImageNet) by a factor of 2,048 with
image translations. Furthermore, they applied RGB intensity alterations with an unspecified factor.
– This has been shown to reduce overfitting.
– We would still need ~8,300 images (17M/2,048) with the same augmentation scheme.
DATA AUGMENTATION
Images from:
ftp://ftp.dca.fee.unicamp.br/pub/docs/vonzuben/ia353_1s15/topico10_IA353_1s2015.pdf |
Wu et al. (2015)
Köhler et al. (2013)
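The distortions listed above can be sketched as one random augmentation draw per image. A minimal NumPy version, assuming images normalized to [0, 1]; the shift range and noise level are illustrative, and rotation/blur are omitted for brevity:

```python
import numpy as np

def augment(img, rng):
    """One random augmentation draw: horizontal flip, small xy-shift,
    and additive Gaussian noise, mirroring the distortions listed above."""
    out = img.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1]                        # left-right flip
    dy, dx = rng.integers(-2, 3, size=2)
    out = np.roll(out, (dy, dx), axis=(0, 1))     # crude circular translation
    out = out + rng.normal(0, 0.01, out.shape)    # additive Gaussian noise
    return np.clip(out, 0, 1)
```

Applying `augment` with fresh randomness at every epoch is what multiplies the effective dataset size in the Krizhevsky et al. sense.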
90. ~8,300 retinal images?
ImageNet -based transfer learning for Medical analysis
●
Tajbakhsh et al. (2016) used the 'original' pre-trained AlexNet (in Caffe) by
Krizhevsky et al. (2012) with 60M parameters, and fine-tuned it for medical image analysis.
●
Very modest-sized datasets outperformed the hand-crafted methods that they selected.
[65] N. Tajbakhsh, “Automatic assessment of image informativeness in colonoscopy”, Discrete Cosine
Transform-based feature engineering
[60] J. Liang and J. Bi,“Computer aided detection of pulmonary embolism with tobogganing
and multiple instance classification in CT pulmonary angiography,”. - ”A set of 116 descriptive
properties, called features, are computed for each candidate”
database consisting of 121 CT pulmonary angiography (CTPA),
datasets with a total of 326 pulmonary embolisms (PEs)
6 complete colonoscopy videos. 40,000 frames
91. NOIsy labels
Now we have a 'circular' problem where our diagnosis
labels come from human experts that we know do a
suboptimal job.
● If human experts reach an AUC-ROC of 0.8, and we get an AUC of 1.0, what would that
mean in practice?
●
Unlike in ImageNet, where correct dog breeds are relatively easy to get right
with proper dog experts, 'real pathology' becomes more ambiguous.
“Considering the recent success of deep learning (Krizhevsky et al.,
2012; Taigman et al., 2014; Sermanet et al., 2014), there is relatively
little work on their application to noisy data”
- Sukhbaatar et al. (2014)
http://dx.doi.org/10.1109/TNNLS.2013.2292894
http://arxiv.org/abs/1607.06988
http://dx.doi.org/10.1177/1062860609354639
92. Gold standard Beyond typical machine learning
Abstract
Despite the accelerating pace of scientific discovery, the current
clinical research enterprise does not sufficiently address pressing
clinical questions. Given the constraints on clinical trials, for a
majority of clinical questions, the only relevant data available to aid
in decision making are based on observation and experience. Our
purpose here is 3-fold. First, we describe the classic context of
medical research guided by Popper's scientific epistemology of
“falsificationism.” Second, we discuss challenges and shortcomings
of randomized controlled trials and present the potential of
observational studies based on big data. Third, we cover several
obstacles related to the use of observational (retrospective) data in
clinical studies. We conclude that randomized controlled trials are
not at risk for extinction, but innovations in statistics, machine
learning, and big data analytics may generate a completely new
ecosystem for exploration and validation.
http://dx.doi.org/10.2196%2Fjmir.5549
http://dx.doi.org/10.1590%2F2176-9451.19.5.027-030.ebo
http://dx.doi.org/10.3310/hta11500
http://dx.doi.org/10.1197/jamia.M1733
Information retrieval studies that involve searching the Internet or marking
phrases usually lack a well-defined number of negative cases. This prevents
the use of traditional interrater reliability metrics like the κ statistic to
assess the quality of expert-generated gold standards. Such studies often
quantify system performance as precision, recall, and F-measure, or as
agreement. It can be shown that the average F-measure among pairs of
experts is numerically identical to the average positive specific agreement
among experts and that κ approaches these measures as the number of
negative cases grows large. Positive specific agreement—or the equivalent F-
measure—may be an appropriate way to quantify interrater reliability and
therefore to assess the reliability of a gold standard in these studies.
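The numerical identity claimed above is easy to verify; a minimal Python check with hypothetical annotation counts (a = phrases both experts marked positive, b and c = phrases only one expert marked):

```python
# Hypothetical counts for two experts marking phrases; negatives are
# ill-defined, so only positive (dis)agreements are counted.
a, b, c = 40, 10, 6

# Treat expert 2 as "gold": precision/recall of expert 1 against it.
precision = a / (a + b)
recall = a / (a + c)
f_measure = 2 * precision * recall / (precision + recall)

# Positive specific agreement between the two experts.
psa = 2 * a / (2 * a + b + c)

# Algebraically, F = 2pr/(p+r) simplifies to 2a/(2a+b+c) = PSA.
assert abs(f_measure - psa) < 1e-12
```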
94. Crowdsourcing labels #2
●
Gamify the segmentation process for electron microscopy
https://www.youtube.com/watch?v=c43jVfpzvZ0
https://www.youtube.com/watch?v=8L_ATqjfjbY
https://www.youtube.com/watch?v=bwcuhbj2rSI
EyeWire, http://eyewire.org/explore
96. Active learning: Helping to annotate
Reduce labeling effort
→ Which new labeled images will
“help” the model to perform the best
for its task (rather than blindly
annotating all images).
https://infoscience.epfl.ch/record/217962/files/top.pdf?version=1
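A common query strategy behind this idea is uncertainty sampling: label next the images whose predicted class probabilities have the highest entropy. A minimal NumPy sketch (the probability matrix is made up for illustration):

```python
import numpy as np

def uncertainty_sampling(probs, n_query):
    """Pick the n_query samples with the highest predictive entropy."""
    eps = 1e-12  # avoid log(0)
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    return np.argsort(entropy)[::-1][:n_query]

# Hypothetical softmax outputs for 4 unlabeled images, 2 classes.
probs = np.array([[0.99, 0.01],   # confident -> low entropy
                  [0.55, 0.45],   # uncertain -> high entropy
                  [0.90, 0.10],
                  [0.50, 0.50]])  # most uncertain
print(uncertainty_sampling(probs, 2))  # → [3 1]
```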
97. ACTIVE LEARNING: Helping to annotate #2
http://dx.doi.org/10.1016/j.neucom.2016.01.091
http://dx.doi.org/10.1109/TGRS.2016.2552507
98. ACTIVE LEARNING with dimensionality reduction
http://dx.doi.org/10.1371/journal.pone.0159088
100. DATA PREPROCESSING
http://tflearn.org/data_preprocessing/
Data Normalization
A standard first step to data preprocessing is data normalization. While there are a few possible
approaches, this step is usually clear depending on the data. The common methods for feature
normalization are:
Simple Rescaling
In simple rescaling, our goal is to rescale the data along each data dimension (possibly
independently) so that the final data vectors lie in the range [0,1] or [−1,1] (depending on your
dataset). This is useful for later processing as many default parameters (e.g., epsilon in PCA-
whitening) treat the data as if it has been scaled to a reasonable range.
Example: When processing natural images, we often obtain pixel values in the range [0,255]. It is
a common operation to rescale these values to [0,1] by dividing the data by 255.
Per-example mean subtraction
If your data is stationary (i.e., the statistics for each data dimension follow the same distribution),
then you might want to consider subtracting the mean-value for each example (computed per-
example).
Example: In images, this normalization has the property of removing the average brightness
(intensity) of the data point. In many cases, we are not interested in the illumination conditions of
the image, but more so in the content; removing the average pixel value per data point makes
sense here. Note: While this method is generally used for images, one might want to take more
care when applying this to color images. In particular, the stationarity property does not generally
apply across pixels in different color channels.
Feature Standardization
Feature standardization refers to (independently) setting each dimension of the data to have zero-
mean and unit-variance. This is the most common method for normalization and is generally used
widely (e.g., when working with SVMs, feature standardization is often recommended as a
preprocessing step). In practice, one achieves this by first computing the mean of each dimension
(across the dataset) and subtracting it from each dimension. Next, each dimension is divided by
its standard deviation.
Natural Grey-scale Images
Since grey-scale images have the stationarity property, we usually first remove the mean-
component from each data example separately (remove DC). After this step, PCA/ZCA
whitening is often employed with a value of epsilon set large enough to low-pass filter the
data.
Color Images
For color images, the stationarity property does not hold across color channels. Hence,
we usually start by rescaling the data (making sure it is in [0,1]). Then perform feature
mean-normalization, and then applying PCA/ZCA with a sufficiently large epsilon.
http://ufldl.stanford.edu/wiki/index.php/Data_Preprocessing:
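The three normalization schemes above can be sketched in NumPy (here `X` holds one flattened image per row; sizes are arbitrary):

```python
import numpy as np

np.random.seed(0)
X = np.random.randint(0, 256, size=(5, 64)).astype(np.float64)  # 5 "images"

# Simple rescaling: [0, 255] -> [0, 1]
X_rescaled = X / 255.0

# Per-example mean subtraction: remove the average brightness of each image
X_demeaned = X - X.mean(axis=1, keepdims=True)

# Feature standardization: zero mean, unit variance per dimension
mu = X.mean(axis=0)
sigma = X.std(axis=0) + 1e-8  # guard against zero-variance dimensions
X_standardized = (X - mu) / sigma
```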
101. DATA PREPROCESSING ZCA
stats.stackexchange.com
Whitening for non-image data
http://blog.explainmydata.com/2012/07/
Nam et al. (2014)
http://ufldl.stanford.edu/wiki/index.php/Whitening:
If we are training on images, the raw input is redundant, since adjacent
pixel values are highly correlated. The goal of whitening is to make the
input less redundant; more formally, our desiderata are that our
learning algorithm sees a training input where (i) the features are less
correlated with each other, and (ii) the features all have the same
variance.
ZCA whitening is a form of pre-processing of the data that maps it from
x to x_ZCAwhite. It turns out that this is also a rough model of how the
biological eye (the retina) processes images. Specifically, as your eye
perceives images, most adjacent "pixels" in your eye will perceive very
similar values, since adjacent parts of an image tend to be highly
correlated in intensity. It is thus wasteful for your eye to have to
transmit every pixel separately (via your optic nerve) to your brain.
Instead, your retina performs a decorrelation operation (this is done via
retinal neurons that compute a function called "on center, off
surround/off center, on surround") which is similar to that performed
by ZCA. This results in a less redundant representation of the input
image, which is then transmitted to your brain.
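ZCA whitening itself is a few lines of linear algebra; a minimal NumPy sketch (the `eps` regularizer and toy data are illustrative):

```python
import numpy as np

def zca_whiten(X, eps=1e-5):
    """ZCA-whiten the rows of X (one flattened image per row)."""
    Xc = X - X.mean(axis=0)                        # zero-mean each feature
    cov = Xc.T @ Xc / Xc.shape[0]                  # feature covariance
    U, S, _ = np.linalg.svd(cov)                   # eigendecomposition
    W = U @ np.diag(1.0 / np.sqrt(S + eps)) @ U.T  # symmetric ZCA matrix
    return Xc @ W

np.random.seed(0)
X = np.random.rand(100, 16)       # toy data: 100 samples, 16 features
Xw = zca_whiten(X)
# The whitened features are decorrelated with ~unit variance.
```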
103. Region-of-interest Masking
●
Mask the 'foreground' (eye) from 'background' (black empty area around eye).
●
The boundary is rather clear, so luckily nothing too fancy is needed.
– Try some fast graph-cut algorithms for example with possible dense CRF refinement
INPUT BINARY MASK
Or alternatively, do quick adaptive thresholding and refine with a
dense CRF, which is faster. Test the robustness.
Active contour methods (ACM)
- Gradient Vector Field (GVF)
- Markov Random Field (MRF)
- Level-set (Chan-Vese energy
function)
Ahmed Thesis (2016)
Balazs, thesis (2015)
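As a baseline before any graph-cut or CRF machinery, even a global threshold on the green channel often separates the dark border; a toy sketch on a synthetic 'fundus' (the threshold heuristic is an assumption, not from any cited work):

```python
import numpy as np

def fundus_mask(green, thresh=None):
    """Binary foreground mask: the empty border around the fundus disc
    is near zero, so a low global threshold separates it cleanly."""
    if thresh is None:
        thresh = 0.1 * green.max()   # heuristic fraction of peak intensity
    return green > thresh

# Synthetic "fundus": a bright disc on a black background.
h, w = 128, 128
yy, xx = np.mgrid[:h, :w]
green = ((yy - 64) ** 2 + (xx - 64) ** 2 < 50 ** 2).astype(float) * 0.8

mask = fundus_mask(green)          # True inside the disc, False outside
```

In practice, the binary mask would then be refined (e.g. morphological opening, or the dense CRF mentioned above) before masking the image.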
105. CLASS IMBALANCE #2
By Tom Fawcett, http://www.svds.com/learning-imbalanced-classes/
Research on imbalanced classes often considers
imbalanced to mean a minority class of 10% to 20%.
In reality, datasets can get far more imbalanced than
this. —Here are some examples:
●
About 2% of credit card accounts are defrauded
per year1. (Most fraud detection domains are
heavily imbalanced.)
●
Medical screening for a condition is usually
performed on a large population of people
without the condition, to detect a small minority
with it (e.g., HIV prevalence in the USA is ~0.4%).
●
Disk drive failures are approximately ~1% per
year.
●
The conversion rate of online ads has been
estimated to lie between 10^-3 and 10^-6.
●
Factory production defect rates typically run
about 0.1%.
Many of these domains are imbalanced because they
are what I call needle in a haystack problems, where
machine learning classifiers are used to sort through
huge populations of negative (uninteresting) cases to
find the small number of positive (interesting, alarm-
worthy) cases.
Handling imbalanced data
Learning from imbalanced data has been studied actively for about two decades in machine
learning. It’s been the subject of many papers, workshops, special sessions, and dissertations
(a recent survey has about 220 references). A vast number of techniques have been tried,
with varying results and few clear answers. Data scientists facing this problem for the first
time often ask What should I do when my data is imbalanced? This has no definite answer
for the same reason that the general question Which learning algorithm is best? has no
definite answer: it depends on the data.
That said, here is a rough outline of useful approaches. These are listed approximately in
order of effort:
●
Do nothing. Sometimes you get lucky and nothing needs to be done. You can train on the
so-called natural (or stratified) distribution and sometimes it works without need for
modification.
●
Balance the training set in some way:
●
Oversample the minority class.
●
Undersample the majority class.
●
Synthesize new minority classes.
●
Throw away minority examples and switch to an anomaly detection framework.
●
At the algorithm level, or after it:
●
Adjust the class weight (misclassification costs).
●
Adjust the decision threshold.
●
Modify an existing algorithm to be more sensitive to rare classes.
●
Construct an entirely new algorithm to perform well on imbalanced data.
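The "balance the training set" and "adjust the class weight" options above can be sketched in NumPy (toy 95/5 class split; the weighting formula follows the common inverse-frequency convention):

```python
import numpy as np

np.random.seed(0)
y = np.array([0] * 95 + [1] * 5)          # 5% minority class
X = np.random.rand(len(y), 3)             # toy features

# Oversample the minority class to match the majority count.
minority = np.where(y == 1)[0]
extra = np.random.choice(minority, size=95 - 5, replace=True)
X_bal = np.vstack([X, X[extra]])
y_bal = np.concatenate([y, y[extra]])

# Alternatively: inverse-frequency class weights for a weighted loss.
counts = np.bincount(y)
class_weights = len(y) / (len(counts) * counts)  # rare class weighted up
```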
106. CLASS IMBALANCE #3
By Tom Fawcett, http://www.svds.com/learning-imbalanced-classes/
Bagging
Neighbor-based approaches
Chawla’s SMOTE (Synthetic Minority Oversampling TEchnique) system
Adjust class weights
Box Drawings for Learning with Imbalanced Data
Buying or creating more data
If rare data simply needs to be labeled reliably by people, a
common approach is to crowdsource it via a service like
Mechanical Turk. Reliability of human labels may be an issue, but
work has been done in machine learning to combine human labels
to optimize reliability. Finally, Claudia Perlich in her Strata talk
All The Data and Still Not Enough gives examples of how
problems with rare or non-existent data can be finessed by using
surrogate variables or problems, essentially using proxies and
latent variables to make seemingly impossible problems possible.
Related to this is the strategy of using transfer learning to learn
one problem and transfer the results to another problem with rare
examples, as described here.
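Chawla's SMOTE mentioned above boils down to interpolating between minority samples and their minority-class nearest neighbours; a minimal NumPy sketch (parameters and toy data are illustrative):

```python
import numpy as np

def smote(X_min, n_new, k=3, rng=None):
    """Minimal SMOTE: for each synthetic point, pick a minority sample,
    one of its k nearest minority neighbours, and interpolate."""
    if rng is None:
        rng = np.random.default_rng(0)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        nn = np.argsort(d)[1:k + 1]        # skip the point itself
        j = rng.choice(nn)
        lam = rng.random()                 # interpolation factor in [0, 1)
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(out)

X_min = np.random.default_rng(1).random((10, 2))  # toy minority samples
X_syn = smote(X_min, n_new=20)
```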
107. Multi-STREAM model | ROI detection
●
Make a multi-stream model as the whole image does not fit to GPU memory
512x512px
CUP DISC
RIM-ONEr3 dataset
Macula/fovea seems harder to segment.
NOTE! The fovea is void of vessels
Maninis et al. (2016)
Sironi et al. (2015)
Merkow et al. (2016) With HED
Balazs Thesis (2015)
108. Macula/fovea segmentation
●
If you are willing to come slightly outside your 'data science' silo (and are actually able to),
you can exploit the physical properties of macula:
– It is void of vasculature
– It is protected by the macular pigment that absorbs maximally at 460 nm.
490nm 620nm
http://www.annidis.com/page/technology
Li et al. (2013)
envisionoptical.com.au
Macular pigment screener
Psychophysical flicker method
news-medical.net
http://guardionpro.com/?page_id=411
109. Vessel segmentation
Copyright Daniele Cortinovis, Orobix Srl (www.orobix.com).
https://github.com/orobix/retina-unet
Retina blood vessel segmentation with a
convolution neural network (U-net)
Running the experiment on DRIVE
The code is written in Python; the experiment on the DRIVE
database can be replicated by following the guidelines below.
Prerequisites
The neural network is developed with the Keras library; refer to
the Keras documentation for installation.
The following dependencies are needed:
numpy >= 1.11.1
PIL >=1.1.7
opencv >=2.4.10
h5py >=2.6.0
ConfigParser >=3.5.0b2
scikit-learn >= 0.17.1
Also, you will need the DRIVE database, which can be freely
downloaded as explained in the next section.
110. Region proposals
Girshick et al. (2014)
https://people.eecs.berkeley.edu/~rbg/papers/r-cnn-cvpr.pdf
http://dx.doi.org/10.1109/TPAMI.2015.2465908
http://arxiv.org/abs/1604.02135
http://arxiv.org/abs/1606.05413
113. Data Augmentation: Natural Images
Images from: ftp://ftp.dca.fee.unicamp.br/pub/docs/vonzuben/ia353_1s15/topico10_IA353_1s2015.pdf | Wu et al. (2015)
Basic idea is to distort the images in the training set in a way that can also be expected to happen for
real-life images, synthetically increasing the size of the training set
→ Acts as a regularizer and reduces overfitting
Are these distortions really sufficient for medical image pipelines, which have lower
SNR due to an image formation process that gives noisy and blurry images?
Should we first denoise and deconvolve the image, for example, and then add the
error components back with varying weights?
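The basic label-preserving distortions (reflections and translations via random crops) can be sketched as (patch and image sizes are arbitrary):

```python
import numpy as np

def augment(img, rng):
    """Label-preserving distortions: random horizontal flip plus a
    random crop, i.e. a small translation of the field of view."""
    if rng.random() < 0.5:
        img = img[:, ::-1]                 # horizontal reflection
    dy, dx = rng.integers(0, 9, size=2)    # random shift of up to 8 px
    return img[dy:dy + 224, dx:dx + 224]   # crop 224x224 from 232x232

rng = np.random.default_rng(0)
img = rng.random((232, 232, 3))            # toy RGB image
patches = [augment(img, rng) for _ in range(4)]
```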
114. Data augmentation in practice
●
He et al. (2015, ResNet): “On the other hand, better generalization is achieved by effective
regularization techniques, aggressive data augmentation [16, 13, 25, 29], and large-scale data.”
– Krizhevsky et al. (2012) [16]: “The easiest and most common method to reduce overfitting on image
data is to artificially enlarge the dataset using label-preserving transformations … The first form of data
augmentation consists of generating image translations and horizontal reflections which increases the
size of our training set by a factor of 2048.
●
The second form of data augmentation consists of altering the intensities of the RGB channels in
training images. Specifically, we perform PCA on the set of RGB pixel values throughout the ImageNet
training set. This scheme approximately captures an important property of natural images, namely,
that object identity is invariant to changes in the intensity and color of the illumination. This
scheme reduces the top-1 error rate by over 1%.”
– Howard (2013)[ 13]: “We add additional transformations (to [16]) that extend the translation invariance
and color invariance”
●
In addition to the random lighting noise that has been used in previous pipelines, we also add
additional color manipulations. We randomly manipulate the contrast, brightness and color using the
python image library (PIL). This helps generate training examples that cover the span of image
variations helping the neural network to learn invariance to changes in these properties.
– Images contain useful predictive elements at different scales (MULTI-SCALE). To capture this we make
predictions at three different scales. We use the original 256 scale as well as 228 and 284.
– In order to make use of all of the pixels in the image when making predictions, we generate
three different square image views. For an 256xN (Nx256) image, we generate a left (upper),
center, and right (lower) view of 256x256 pixels
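The PCA-based color augmentation of Krizhevsky et al. quoted above can be sketched as follows (the sigma value and toy image are illustrative):

```python
import numpy as np

def pca_color_augment(img, rng, sigma=0.1):
    """AlexNet-style color augmentation: add multiples of the RGB
    principal components, scaled by eigenvalue * N(0, sigma)."""
    flat = img.reshape(-1, 3)
    cov = np.cov(flat, rowvar=False)       # 3x3 RGB covariance
    eigval, eigvec = np.linalg.eigh(cov)
    alpha = rng.normal(0.0, sigma, size=3) # one draw per image
    shift = eigvec @ (alpha * eigval)      # constant per-channel offset
    return img + shift                     # broadcast over all pixels

rng = np.random.default_rng(0)
img = rng.random((64, 64, 3))              # toy RGB image in [0, 1]
aug = pca_color_augment(img, rng)
```

The same alpha is drawn once per image, so the shift models a global change in illumination color rather than per-pixel noise.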
116. FUNDUS IMAGE AUGMENTATION
●
The quality of fundus images is rather close to that of natural images (in contrast to OCT
images, which are much noisier and corrupted mainly by multiplicative speckle noise)
Denoise (and smooth additionally) to obtain ground truth image
Start degrading it with noise, image blur and compression
artifacts in addition to typical “deep learning data augmentation”
Köhler et al. (2013)
VAMPIRE (Vascular Assessment and Measurement
Platform for Images of the REtina) is a software
application for efficient, semi-automatic quantification of
retinal vessel properties with large collections of fundus
camera images. VAMPIRE is also an international
collaborative project of several image processing groups
and clinical centres in Europe, Asia and the UK. The
system aims to provide efficient and reliable detection of
retinal landmarks (optic disc, retinal zones, main
vasculature), and quantify key parameters used
frequently in investigative studies, currently vessel width,
vessel branching coefficients, and tortuosity.
The ultimate vision is to enable efficient
quantitative analysis of large collections of retinal
images acquired from multiple instruments. Tools
from the VAMPIRE suite have been used by or in
collaboration with: NHS Princess Alexandra Eye
Pavilion, Edinburgh, UK, NHS Ninewells, Dundee, UK,
Anne Rowling Regenerative Neurology Clinic,
Edinburgh, UK, School of Philosophy, Edinburgh, UK,
Royal Infirmary of Edinburgh, UK, National Healthcare
Group Eye Institute, Singapore, NHS Manchester
Royal Eye Hospital, UK, Moorfields Eye Hospital,
London, UK, Oftalmologia, Pernambuco, Brasil,
Harvard Medical School, USA, NHS Leeds Teaching
Hospitals, UK, Clinica Veterinaria Privata San Marco,
Padova, Italy, Optometry Clinic at Aston University,
UK, Atma Jaya Catholic University of Indonesia,
University of Sydney, Australia, University of
Newcastle, Australia, University of Minnesota, USA,
University Clinic Hradec Králové, Czech Republic, S.V.
Medical College, Tirupati, India, College of the Holy
Cross, Worcester, MA, USA, Regional Center of Anna
University, Tirunelveli, India, University of Brescia,
Italy, Lincoln University, USA, Lappeenranta
University of Technology, Finland, University of
Surrey, UK. College of Public Health, National Taiwan
University, Taiwan
vampire.computing.dundee.ac.uk
http://dx.doi.org/10.1016/j.procs.2016.07.010
SYNTHESIZE FUNDUS IMAGES
117. Data Augmentation: Compression artifacts
●
Due to bandwidth limitations, some of the data coming in might be rather heavily
compressed, and the trained network should handle these artifacts (typically JPEG)
http://arxiv.org/abs/1605.00366
Clipart JPEG artifact removal with L0 smoothing (www.cse.cuhk.edu.hk)
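One cheap way to make the network robust to such artifacts is to augment with JPEG encode/decode round trips at random quality, e.g. via Pillow (the quality range here is an assumption, not from any cited pipeline):

```python
import io
import random

import numpy as np
from PIL import Image

def jpeg_augment(img_array, rng, q_range=(10, 90)):
    """Simulate transmission artifacts with a JPEG encode/decode
    round trip at a random quality factor."""
    q = rng.randint(*q_range)              # inclusive on both ends
    buf = io.BytesIO()
    Image.fromarray(img_array).save(buf, format="JPEG", quality=q)
    buf.seek(0)
    return np.asarray(Image.open(buf))

rng = random.Random(0)
img = np.random.default_rng(0).integers(0, 256, (64, 64, 3), dtype=np.uint8)
degraded = jpeg_augment(img, rng)          # same shape, lossy content
```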
118. Data Augmentation: Compression JPEG2000
●
When going from the Discrete Cosine Transform (DCT) based JPEG to wavelet-based JPEG2000 as the format
to store images, we can reduce the blocking artifacts, but then introduce some blurring and ringing artifacts
– JPEG2000 is not that commonly used in the end, but could serve as an additional augmentation method if the dataset contains such images
Blocking artifacts:
Color distortion:
Ringing artifacts:
Blurring artifacts:
http://www.stat.columbia.edu/~jakulin/jpeg/artifacts.htm
http://dx.doi.org/10.1109/LSP.2003.817179
http://dx.doi.org/10.1109/TPAMI.2015.2389797
http://dx.doi.org/10.1109/TIP.2016.2558825
An example of the wavelet transform that
is used in JPEG 2000. This is a 2nd-level
CDF 9/7 wavelet transform.
wikipedia.org
119. Data Augmentation Compression RNN
http://arxiv.org/abs/1608.05148
Improving image compression over JPEG with recurrent
neural networks
120. Denoising - Comparison
Youssef K, Jarenwattananon NN, Bouchard LS. 2015. Feature-
Preserving Noise Removal. IEEE Transactions on Medical
Imaging PP:1–1. doi: 10.1109/TMI.2015.2409265
Feature-preserving noise removal
L Bouchard, K Youssef - US Patent App. 14/971,775, 2015 - Google Patents
121. Data Augmentation: Fundus Denoising
With and without BM4D Denoising before
ZCA transform of fundus image (green channel). Renormalized zoomed portions of the same images
on the left (imZoom = squeeze(imageComp(14:163,183:300,ch,i)))
122. Data Augmentation: OCT Denoising
Noise component
Difference between “Raw+ZCA”
and “denoising”
GRADIENT MAGNITUDE
imgradient()
123. Training for Denoising CNN of course
CNN
“Different from high-level applications such as segmentation or
recognition, pooling tends to deteriorate denoising performance.”
DATASETS Averaged scans to estimate noise-free ground truths
www5.cs.fau.de
https://arxiv.org/abs/1312.1931
http://dx.doi.org/10.1364/BOE.3.000572
124. EDGE-AWARE SMOOTHING Old-School
Original images (a) and the corresponding images with additive Gaussian
noise (b); denoised images: best result with Gaussian Filter (GF) (c), best
result with MF (d), best result with Perona and Malik (PM) filter (e), and
best result with directional anisotropic diffusion filter (f).
Anisotropic diffusion
filter have been very
commonly used for
edge-aware smoothing
125. edge-Aware smoothing guided filtering
http://arxiv.org/abs/1605.04068
Fusing the concept of guided filtering and conditional random fields (CRF)
FCN - Fully Convolutional Networks
http://arxiv.org/abs/1505.00996
Because of its nice visual quality, fast speed, and ease
of implementation, the guided filter has witnessed
various applications in real products, such as image
editing apps in phones and stereo reconstruction, and
has been included in official MATLAB and OpenCV
https://sites.google.com/site/yijunlimaverick/deepjointfilter, ECCV 2016
126.
127. Data Augmentation: Edge-Aware Smoothing
He et al. (2015), "Fast Guided Filter"
R= 4, eps = 0.02^2, s = 1
NoOfIter = 4; % iterate from previous result (not part of the original implementation)
Xu et al. (2011), "Image Smoothing via L0 Gradient Minimization"
https://www.youtube.com/watch?v=jliea54nNFM
Kappa = 1.5, lambda = 5e-4
Try to remove even more of the fine texture while leaving the relevant edges and “big” structures
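The guided filter of He et al. used above is compact enough to sketch directly in NumPy with integral-image box filtering (the step-edge test image is synthetic; `r` and `eps` mirror the slide's settings):

```python
import numpy as np

def box_mean(x, r):
    """Mean over a (2r+1)x(2r+1) window via an integral image."""
    xp = np.pad(x, r, mode='edge')
    c = np.cumsum(np.cumsum(xp, axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))        # zero row/col for the sums
    n = 2 * r + 1
    s = c[n:, n:] - c[:-n, n:] - c[n:, :-n] + c[:-n, :-n]
    return s / n ** 2

def guided_filter(I, p, r=4, eps=0.02 ** 2):
    """Edge-aware smoothing of p guided by I (He et al. formulation)."""
    mI, mp = box_mean(I, r), box_mean(p, r)
    cov_Ip = box_mean(I * p, r) - mI * mp
    var_I = box_mean(I * I, r) - mI * mI
    a = cov_Ip / (var_I + eps)             # ~1 at edges, ~0 in flat areas
    b = mp - a * mI
    return box_mean(a, r) * I + box_mean(b, r)

rng = np.random.default_rng(0)
img = np.zeros((64, 64)); img[:, 32:] = 1.0          # a step edge
noisy = img + 0.01 * rng.standard_normal(img.shape)
smoothed = guided_filter(noisy, noisy)               # self-guided filtering
```

Flat regions are smoothed (local variance far below eps) while the step edge is preserved (local variance far above eps).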
128. Edge-Aware smoothing & Deep learning
Bilateral convolutional networks
Gadde et al. (2015)
Jampani et al. (2015)
Computationally efficient with
permutohedral lattice optimization.
5D space = 2D spatial + RGB
Barron and Poole (2015)
129. Data Augmentation: (De)noise summary
Raw and denoised version
“conservative solution”
Denoise the ZCA of the denoised image
“slightly less noisy,
some features lost”
EDGE-AWARE SMOOTHING
No texture really left anymore
Note! Now all the operators are 'old-school' signal processing
algorithms whereas with sufficient data we could train
all the operations as well with CNNs
ORIGINAL OCT SLICE
No ZCA
No denoising
Not shown as image
130. Data Augmentation: DECONVOLUTION
SYNTHETIC IMAGE Left column: volume corrupted by Gaussian noise (σ=15) and
Poisson noise (SNR = 30) (b) and deconvolution results, 40 iterations, HuygensPro
(d), AutoDeblur (f), Iterative Deconvolve 3D (h), Parallel Iterative Deconvolution (l),
DeconvolutionLab (n). Right column: volume corrupted by Gaussian noise (σ=15)
and Poisson noise (SNR = 15) (c) and deconvolution results, 40 iterations,
HuygensPro (e), AutoDeblur (g), Iterative Deconvolve 3D (i), Parallel Iterative
Deconvolution (m), DeconvolutionLab (o). http://imaging-git.com/
Gaussian noise (σ=15) and
Poisson noise (SNR = 30)
Gaussian noise (σ=15) and
Poisson noise (SNR = 15)
Marrugo et al. (2014)
131. 3D Data Augmentation
elektronn.org/documentation
Two exemplary results of random rotation,
flipping, deformation and histogram
augmentation. The black regions are only
shown for illustration here, internally the
data pipeline calculates the required input
patch (larger than the CNN input size) such
that if cropped to the CNN input size, after
the transformation, no missing pixels
remain. The labels would be transformed in
the same way but are not shown here.
In addition to two-dimensional augmentation, we
would like to apply 'non-rigid' transformations (see e.g. the
non-rigid registration literature) to the volumetric OCT
stacks (cubes) that we have
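Before any non-rigid machinery, the rigid part of volumetric augmentation (axis flips and in-plane 90-degree rotations) is trivial; a minimal sketch for a toy OCT cube shaped (slices, height, width):

```python
import numpy as np

def augment_volume(vol, rng):
    """Random axis flips plus a random in-plane 90-degree rotation
    for a volumetric stack shaped (slices, height, width)."""
    for axis in range(3):
        if rng.random() < 0.5:
            vol = np.flip(vol, axis=axis)
    k = rng.integers(0, 4)                 # 0, 90, 180 or 270 degrees
    return np.rot90(vol, k=k, axes=(1, 2)) # rotate within each B-scan

rng = np.random.default_rng(0)
cube = rng.random((32, 64, 64))            # toy OCT cube
aug = augment_volume(cube, rng)
```

Non-rigid (e.g. B-spline or elastic) deformations would additionally warp each slice with a smooth random displacement field, as the elastix examples above illustrate.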
http://dx.doi.org/10.1016/j.media.2013.09.009
Retinal image fusion accomplished by our
registration algorithm with one angiogram
grayscale image and one fundus color image.
We show the fusion results obtained by linearly
combining the angiogram and fundus images with
a set of weight values (as shown by the
numbers under picture).
Different transformations. (a) the fixed image, (b) the moving image with a grid overlayed, (c) the deformed
moving image with a translation transformation, (d) a rigid transformation, (e) an affine transformation, and (f) a B-
spline transformation. The deformed moving image nicely resembles the fixed image using the B-spline
transformation. The overlay grids give an indication of the deformations imposed on the moving image.
Elastix manual v 4.7
The second row shows example results when using the default settings for two generic deformable
registration algorithms, SyN [19] and DRAMMS [39], to register the subject image to the target.
http://dx.doi.org/10.1364%2FBOE.5.002196
133. Classification vs detection vs segmentation
We're making the code for DeepMask+SharpMask as
well as MultiPathNet — along with our research
papers and demos related to them — open and
accessible to all, with the hope that they'll help
rapidly advance the field of machine vision. As we
continue improving these core technologies we'll
continue publishing our latest results and updating
the open source tools we make available to the
community.
code.facebook.com, by
Piotr Dollar
134. Shift in natural image classification
Deep learning-based methods have taken over
Surpassing human performance!
137. CLASSIFICATION MODELS: UnSUPERVISEd pretraining
http://dx.doi.org/10.1007/978-3-319-13972-2_8
Cited by 624
Outside image analysis as curiosity
http://dx.doi.org/10.1016/j.neuroimage.2015.05.018
Better generalization. Unsupervised pre-training gives substantially
lower test classification error than no pre-training, for the same
depth or for smaller depth (comparing 1,2,3,4 and 5 hidden layers)
on various vision datasets
Aid to optimization. Bengio et al. (2007) hypothesized that higher layers
in the network were overfitting the training error, thus to make it
clearer whether some optimization effect (of the lower layers) was
going on, they constrained the top layer to be small (20 units instead
of 500 and 1000). In that experiment, they show that the final
training errors are higher without pre-training.
Distinct local minima. With 400 different random initializations, with or
without pre-training, each trajectory ends up in a different apparent
local minimum corresponding not only to different parameters but to
a different function. The regions in function space reached without
pre-training and with pre-training seem completely disjoint (i.e.
no model without pre-training ever gets close to a model with pre-training).
Lower variance. In the same set of experiments, the variance of final
test error with respect to the initialization random seed is larger
without pre-training, and this effect is magnified for deeper
architectures (Erhan et al., 2009). This supports a
regularization explanation, but does not exclude an optimization
hypothesis either.
Capacity control. Finally, when all the layers are constrained to a smaller
size, the pre-training advantage disappears, and for very small sizes
generalization is worse with pre-training. Such a result is highly
compatible with a regularization effect and seems incompatible with
a pure optimization effect.
138. CLASSIFICATION MODELS: SEMI-Supervised with GANs
Joint use of supervised and unsupervised data
Specially useful with medical data where expert time is expensive
for manual labeling needed for supervised learning
We performed semi-supervised experiments on MNIST, CIFAR-10
and SVHN, and sample generation experiments on MNIST, CIFAR-
10, SVHN and ImageNet. We provide code to reproduce the
majority of our experiments
The MNIST dataset contains 60,000 labeled images of digits
The CIFAR-10 dataset contains 60,000 labeled 32x32px tiny images
73257 digits for training, 26032 digits for testing
https://arxiv.org/abs/1606.03498