Single-Image Depth Estimation Based on Fourier Domain Analysis
Ahan M R | Ishaant Agarwal | Karthik K | Srijan Nikhar
BITS PILANI K K BIRLA GOA CAMPUS
● The depth map of an image is used for 3-D reconstruction, human pose estimation, and scene recognition.
● Most mainstream estimation algorithms use stereo images or motion sequences to calculate the depth map of a scene.
● With the rapid development of deep learning, various attempts have been made to use convolutional neural networks (CNNs) for single-image depth estimation.
Main contributions of the paper
● Designs a ResNet-based depth estimation network, i.e. a convolutional neural network built with shortcut connections.
● Proposes a new depth-balanced Euclidean (DBE) loss to enable more reliable training of the network across both shallow and deep depths.
● Is one of the few works to perform a Fourier analysis of the single-image depth estimation problem. It proposes an accurate and reliable scheme to combine multiple depth map candidates in the frequency domain.
Proposed Algorithm
The proposed algorithm involves four major steps.
1. Training
● Define the loss function.
● Train the weights of the ResNet-152 CNN.
2. Candidate Generation
● Divide each image into sub-images using different cropping ratios.
● Combine the depth estimates of the sub-images from the four corners into a depth map candidate.
3. Candidate Combination using Fourier Analysis
● Use Fourier analysis to combine the candidates from different cropping ratios to improve overall accuracy.
4. Testing
● Test the CNN on images from a particular dataset and estimate the accuracy.
Training Stage
Structure of ResNet-152
1. ResNet is used as the CNN architecture for depth estimation. A 34-layer ResNet performs better than an 18-layer one, since the deeper network learns more advanced filters and parameters.
2. The vanishing gradient problem, which is mainly observed in plain deep networks such as VGG16 and AlexNet, is mitigated by the skip connections in ResNet. We chose this model specifically for the advantages these skip connections provide.
3. All the extracted intermediate features are concatenated and fed into a fully connected layer of 1000 neurons to obtain the estimated depth map.
4. The intermediate shortcut connections also help in improving the context of the pixels close to a pixel p in the feedforward network. Each residual block computes f(x) + x, where + is the element-wise addition of the block input x (the output of layer n-2) to the output f(x) of layer n. This is the baseline used in the paper.
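The f(x) + x formulation can be illustrated with a minimal NumPy sketch. This is a simplified stand-in, not the network from the paper: it assumes fully connected layers instead of convolutions, omits batch normalization, and the names `residual_block`, `w1`, and `w2` are hypothetical.

```python
import numpy as np

def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Two-layer residual block: returns f(x) + x.

    Assumption: dense layers stand in for the convolutions of a real ResNet.
    """
    fx = w2 @ relu(w1 @ x)   # f(x): the residual branch (two layers deep)
    return fx + x            # shortcut: add the block input back element-wise

x = np.array([1.0, -2.0, 3.0])
zero = np.zeros((3, 3))

# With all-zero weights the residual branch vanishes and the block is the identity,
# which is why gradients can still flow through very deep stacks of such blocks.
y = residual_block(x, zero, zero)
```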
Training Stage
Euclidean Loss Function
1. The usual loss function for training CNNs is the Euclidean loss, i.e. the mean squared difference between the estimated and ground-truth depths.
2. This loss poses a problem for depth estimation: an error at a large depth value has a higher impact on the weight parameters than an error of proportionate magnitude at a small depth value. This follows from the weight update equation, in which the gradient is proportional to the depth error itself.
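The depth bias can be seen in a small numeric sketch. It assumes the common 1/(2N) normalization of the Euclidean loss (the slide's own equation is not reproduced here), and the function names and depth values are illustrative only.

```python
def euclidean_loss(pred, target):
    """Euclidean loss with an assumed 1/(2N) normalization."""
    n = len(pred)
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / (2 * n)

def euclidean_grad(pred, target):
    """Gradient of the loss with respect to each predicted depth."""
    n = len(pred)
    return [(p - t) / n for p, t in zip(pred, target)]

# Two pixels, both estimated 10% too deep: one shallow (1 m), one deep (10 m).
target = [1.0, 10.0]
pred = [1.1, 11.0]

grad = euclidean_grad(pred, target)
# The deep pixel contributes a gradient about ten times larger than the
# shallow one, although the relative error is the same for both.
ratio = grad[1] / grad[0]
```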
Training Stage
Improved Loss function (DBE)
1. The paper provides an alternative loss function that counteracts the depth bias of the simple Euclidean loss.
2. A quadratic function g(d) is used to balance the loss. If the value of a2 is negative, the impact of larger depth values diminishes, giving a more reliable, balanced loss function.
3. The values of a1 and a2 can be tuned to give the best results. This form of loss is known as the depth-balanced Euclidean (DBE) loss.
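A minimal sketch of such a balanced loss follows. It assumes the quadratic has the form g(d) = a1*d + (a2/2)*d^2 and that the loss is the Euclidean loss applied to g(d) instead of d; the values a1 = 1.5 and a2 = -0.1 and all function names are illustrative assumptions, not settings confirmed by the slides.

```python
def g(d, a1, a2):
    """Assumed balancing function: g(d) = a1*d + (a2/2)*d^2."""
    return a1 * d + 0.5 * a2 * d * d

def dbe_loss(pred, target, a1, a2):
    """Euclidean loss computed on balanced depths g(d)."""
    n = len(pred)
    return sum((g(p, a1, a2) - g(t, a1, a2)) ** 2
               for p, t in zip(pred, target)) / (2 * n)

def dbe_grad(pred, target, a1, a2):
    """Gradient w.r.t. each predicted depth; g'(d) = a1 + a2*d by the chain rule."""
    n = len(pred)
    return [(g(p, a1, a2) - g(t, a1, a2)) * (a1 + a2 * p) / n
            for p, t in zip(pred, target)]

# Same 10% relative error at a shallow (1 m) and a deep (10 m) pixel.
target = [1.0, 10.0]
pred = [1.1, 11.0]

plain = dbe_grad(pred, target, a1=1.0, a2=0.0)      # reduces to the Euclidean loss
balanced = dbe_grad(pred, target, a1=1.5, a2=-0.1)  # negative a2 damps deep pixels

plain_ratio = plain[1] / plain[0]           # deep gradient dominates
balanced_ratio = balanced[1] / balanced[0]  # gradients are of similar size
```

With a1 = 1 and a2 = 0 the function g is the identity, so the sketch reduces to the plain Euclidean loss; a negative a2 shrinks g'(d) as depth grows, which is what evens out the two gradients.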
Candidate Generation
● After the training process we generate depth map candidates for different cropping ratios.
● The cropping ratio is defined as the ratio of the size of a sub-image to the size of the original image.
● The cropped images from the four corners are passed through the trained CNN, and their outputs are combined to form the depth map candidate corresponding to each cropping ratio.
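The candidate generation step might be sketched as follows. Several details are assumptions: the cropping ratio is applied per side length, overlapping regions are averaged, resizing crops to the network input size is skipped, and `predict` is a hypothetical stand-in for the trained CNN. Ratios below 0.5 would leave the image centre uncovered in this sketch.

```python
import numpy as np

def corner_crops(image, ratio):
    """Return the four corner crops together with their top-left offsets."""
    h, w = image.shape[:2]
    ch, cw = int(round(h * ratio)), int(round(w * ratio))
    offsets = [(0, 0), (0, w - cw), (h - ch, 0), (h - ch, w - cw)]
    return [(image[y:y + ch, x:x + cw], (y, x)) for y, x in offsets]

def depth_candidate(image, ratio, predict):
    """Merge the per-crop depth predictions into one full-size candidate."""
    h, w = image.shape[:2]
    total = np.zeros((h, w))
    count = np.zeros((h, w))
    for crop, (y, x) in corner_crops(image, ratio):
        depth = predict(crop)                 # hypothetical trained network
        ch, cw = depth.shape
        total[y:y + ch, x:x + cw] += depth    # paste the crop back in place
        count[y:y + ch, x:x + cw] += 1
    return total / np.maximum(count, 1)       # average where crops overlap

# Toy check: an identity "network" must reproduce the input exactly.
image = np.arange(48.0).reshape(6, 8)
candidate = depth_candidate(image, 0.75, lambda crop: crop)
```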
Candidate Combination using Fourier Analysis.
● Local depth variations, captured by the depth map candidates with small cropping ratios, correspond to high-frequency components.
● Overall depth features, captured by the depth map candidates with larger cropping ratios, correspond to low-frequency components.
● We combine these two complementary features using the Fourier transforms of the individual candidates.
● f̂_m^k: estimated value of the k-th Fourier component for the m-th candidate.
● Combined matrix of the f̂_m^k entries from all cropping ratios.
● f^k: k-th component of the Fourier transform of the true depth map.
Candidate Combination using Fourier Analysis.
We find the weight factor by minimizing the mean squared error (MSE) between the combined estimate f̂^k and the true component f^k. The matrix W_k that achieves the smallest MSE is the required solution.
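Minimizing the MSE for each frequency component is a linear least-squares problem, which the following NumPy sketch solves one component at a time. It is an illustration under assumptions: the array layout, the absence of a bias term, and the function name `fit_frequency_weights` are all hypothetical, and the synthetic data only serves to check the fit.

```python
import numpy as np

def fit_frequency_weights(candidates, truths):
    """Least-squares combination weights, one vector per DFT component.

    candidates: (T, M, H, W) depth map candidates for T training images
    truths:     (T, H, W) ground-truth depth maps
    returns:    (M, H, W) complex weights
    """
    cand_f = np.fft.fft2(candidates)   # 2-D DFT over the last two axes
    true_f = np.fft.fft2(truths)
    T, M, H, W = cand_f.shape
    weights = np.zeros((M, H, W), dtype=complex)
    for u in range(H):
        for v in range(W):
            A = cand_f[:, :, u, v]     # (T, M) candidate coefficients
            b = true_f[:, u, v]        # (T,) ground-truth coefficients
            # Minimizes sum_t |A[t] @ w - b[t]|^2, i.e. the MSE up to a constant.
            weights[:, u, v] = np.linalg.lstsq(A, b, rcond=None)[0]
    return weights

# Synthetic check: the truth is a known mix of two candidates.
rng = np.random.default_rng(0)
candidates = rng.normal(size=(12, 2, 4, 4))
truths = 0.3 * candidates[:, 0] + 0.7 * candidates[:, 1]
weights = fit_frequency_weights(candidates, truths)
```

Because the DFT is linear, the fitted weights in this toy case come out as 0.3 and 0.7 at every frequency; on real data they would differ per frequency, which is the point of combining in the Fourier domain.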
Implementation & Testing.
● As explained previously, we select a1 to be large and a2 to be negative. This modifies the basic Euclidean loss so that deeper depths are captured and both shallow and deep depths are estimated more reliably.
● The DBE loss is used as the objective function. The root-mean-squared (RMS) error of the network decreases after each epoch, and we generate the output depth maps after 50-70 epochs of training.
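For reference, the RMS error tracked across epochs can be computed as in this short sketch; the function name and the sample depths are illustrative.

```python
import math

def rms_error(pred, target):
    """Root-mean-squared error between predicted and ground-truth depths."""
    n = len(pred)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / n)

# Only the third pixel is off (by 2 m), so the RMS error is sqrt(4 / 3).
error = rms_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```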
Implementation & Testing.
● The implementation has three parts: the depth estimation network, the depth-balanced Euclidean loss, and the Fourier-domain combination model.
● Depth estimation network: based on ResNet-152, we modify the last 19 ResNet blocks to extract intermediate features. All extracted features are concatenated and fed into a fully connected layer to estimate the depth map. [1]
● We import the ResNet-152 architecture from the Keras or PyTorch framework and train the weights.
Implementation & Testing.
● We convert the depth map candidates into the frequency domain using the 2-D DFT and combine the candidates from the different cropping ratios there to obtain the final depth map.
● We implemented these steps with the built-in rfft and irfft functions of the PyTorch library, which let us extract and process the features in the frequency domain.
● This is the crucial step of our project, since it differentiates our approach from other depth estimation methods: we generate candidates at several cropping ratios and combine the frequency-domain maps of the different cropped sub-images.
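The frequency-domain combination can be sketched with NumPy's real-input FFT, used here as a stand-in for the PyTorch calls (in current PyTorch releases the analogous functions are `torch.fft.rfft2` and `torch.fft.irfft2`; the older `torch.rfft` and `torch.irfft` were removed in version 1.8). The function name, the weight layout, and the uniform example weights are assumptions for illustration.

```python
import numpy as np

def combine_candidates(candidates, weights):
    """Weighted sum of candidate spectra, transformed back to a depth map.

    candidates: (M, H, W) real-valued depth map candidates
    weights:    (M, H, W // 2 + 1) per-frequency weights
    """
    spectra = np.fft.rfft2(candidates)             # half-spectrum of each candidate
    combined = np.sum(weights * spectra, axis=0)   # combine per frequency
    return np.fft.irfft2(combined, s=candidates.shape[-2:])

# Two toy candidates combined with equal weight at every frequency.
a = np.arange(12.0).reshape(3, 4)
b = a[::-1].copy()
candidates = np.stack([a, b])
weights = np.full((2, 3, 3), 0.5)
depth = combine_candidates(candidates, weights)
```

With frequency-independent weights the result is just the spatial average; in the actual method the weights vary with frequency, so that low frequencies lean on large-crop candidates and high frequencies on small-crop ones.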
Figure: (a) input image; (b) estimated depth map after 15 epochs; (c) estimated depth map after 25 epochs.
Figure: (a) input image; (b) estimated depth map after 15 epochs, estimating much larger depths across multiple objects.
Conclusion.
Real-life applications and Further improvements.
● Object localization and detection: depth maps can localize objects much more tightly, and more efficiently, than the bounding-box prediction used by traditional algorithms. The approach would be as fast as semantic segmentation methods, and because objects would also be segmented based on the depth estimate, the results would be more accurate. [3]
● Planetary exploration: most satellites capture only a single image and must extract as much information from it as possible. Depth estimation maps can help rovers and other space vehicles land safely by recovering the topography of planets and moons.
References
1. Jae-Han Lee, Minhyeok Heo, Kyung-Rae Kim, and Chang-Su Kim, Single-Image Depth Estimation Based on Fourier Domain Analysis, IEEE, 2018
2. Clement Godard, Unsupervised Monocular Depth Estimation with Left-Right Consistency, IEEE, 2017
3. Xiao Lin, Depth Estimation and Semantic Segmentation from Single RGB Image using Hybrid CNNs, 2019

Single Image Depth Estimation using frequency domain analysis and Deep learning

  • 1. Single-Image Depth Estimation Based on Fourier Domain Analysis Ahan M R | Ishaant Agarwal | Karthik K | Srijan Nikhar BITS PILANI K K BIRLA GOA CAMPUS
  • 2. ● The depth map of an image is used for 3-D reconstruction, human pose estimation and scene recognition. ● Most mainstream estimation algorithms use 3-D stereo images or motion sequences to calculate the depth map of a scene. ● With the fast development of deep learning technology, various attempts have been made to use convolutional neural networks (CNNs) for single-image depth estimation.
  • 3. Main contributions of the paper ● Designs a ResNet-based depth estimation network, built on a convolutional neural network with shortcut layers. ● Proposes a new DBE loss to enable more reliable training of the network to determine shallow and deep depths of images. ● This is one of the few works to perform a Fourier analysis of the single-image depth estimation problem. It proposes an accurate and reliable scheme to combine multiple depth estimates in the frequency domain.
  • 4. Proposed Algorithm The proposed algorithm involves four major steps. 1. Training ● Defining the loss function. ● Training the weights of the ResNet-152 CNN. 2. Candidate Generation ● Dividing images into sub-images using different cropping ratios. ● Combining the sub-images from the 4 corners to form an estimated depth map. 3. Candidate Combination using Fourier Analysis ● Using Fourier analysis to combine candidates of different cropping ratios to improve overall accuracy. 4. Testing ● Testing the CNN on images from a particular dataset and estimating the accuracy.
  • 5. Training Stage Structure of ResNet-152 1. ResNet is used as the CNN architecture for depth estimation. A 34-layer model is better than an 18-layer model, since the deeper network learns more advanced filters and parameters. 2. The vanishing gradient problem, which is mainly observed in VGG16 and AlexNet, is solved by the skip connections in ResNet. Hence we chose this model for the advantages that the skip connections provide. 3. All the extracted intermediate features are concatenated and fed into a fully connected layer of 1000 neurons to obtain the estimated depth map. 4. The intermediate shortcut connections also help in improving the context of the pixels close to the pixel p in the feedforward network. The formulation reduces to f(x) + x, where + is the element-wise addition of the features from the nth layer and the (n-2)th layer in the network, which is a given baseline in the paper.
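
The shortcut formulation f(x) + x on slide 5 can be sketched in a few lines. This is a minimal NumPy illustration, not the ResNet-152 used in the slides: the two-layer fully connected form of f, the layer width, and the names `residual_block`, `w1` and `w2` are assumptions made for the example.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Return relu(f(x) + x), where f is two linear layers with a ReLU between them."""
    fx = relu(x @ w1) @ w2   # f(x): the residual branch (hypothetical two-layer form)
    return relu(fx + x)      # shortcut: add the block input back before the final activation

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))            # batch of 4 feature vectors of width 8
w1 = rng.normal(size=(8, 8)) * 0.1
w2 = np.zeros((8, 8))                  # zero residual branch, so the block reduces to relu(x)
out = residual_block(x, w1, w2)
```

With the residual branch set to zero the block simply passes its (rectified) input through. This is why the shortcut gives the signal and its gradient a direct path across layers.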
  • 6. Training Stage Euclidean Loss Function 1. The usual loss function used in CNN algorithms is the Euclidean loss function, given by the equation shown on the slide. 2. This loss function poses a problem for depth estimation: an error at a higher depth value has a greater impact on the weight parameters than an error of proportionate magnitude at a lower depth value. This can be seen from the weight update equation.
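
For reference, one common way to write the Euclidean loss and its gradient is given here. The symbols (N pixels, estimated depth \hat{d}_i, ground-truth depth d_i, network weights \theta) and the 1/(2N) normalization are assumptions, since the slide's own equation is not reproduced in this transcript.

```latex
L_{E} = \frac{1}{2N} \sum_{i=1}^{N} \left( \hat{d}_i - d_i \right)^2,
\qquad
\frac{\partial L_{E}}{\partial \theta}
  = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{d}_i - d_i \right)
    \frac{\partial \hat{d}_i}{\partial \theta}
```

Under this form, a 10% error at a depth of 10 m gives a residual of 1 m, ten times the residual of a 10% error at 1 m, so deep pixels dominate the weight update.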
  • 7. Training Stage Improved Loss function (DBE) 1. The paper provides an alternative loss function that eliminates the depth bias of the simple Euclidean loss function. 2. We use a quadratic function g(d) to balance the loss function. If the value of a2 is negative, the impact of higher depth values diminishes and we get a more reliable, balanced loss function. 3. We can tune the values of a1 and a2 to give the best results. This form of loss function is known as the depth-balanced Euclidean (DBE) loss function.
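
The balancing effect described on slide 7 can be checked numerically. The sketch below assumes the quadratic takes the form g(d) = a1*d + 0.5*a2*d^2 and uses illustrative values a1 = 1.0 and a2 = -0.05. Neither the exact form nor the values come from the slides, and the function names are hypothetical.

```python
import numpy as np

def g(d, a1, a2):
    """Quadratic balancing function (assumed form): g(d) = a1*d + 0.5*a2*d**2."""
    return a1 * d + 0.5 * a2 * d ** 2

def dbe_loss(pred, target, a1=1.0, a2=-0.05):
    """Depth-balanced Euclidean loss: half the mean squared difference of g-transformed depths."""
    diff = g(pred, a1, a2) - g(target, a1, a2)
    return 0.5 * np.mean(diff ** 2)

def euclidean_loss(pred, target):
    """Plain Euclidean loss: half the mean squared depth error."""
    return 0.5 * np.mean((pred - target) ** 2)

# Two pixels with the same 10% relative error, one shallow (1 m) and one deep (10 m).
shallow = (np.array([1.1]), np.array([1.0]))
deep = (np.array([11.0]), np.array([10.0]))

ratio_euclid = euclidean_loss(*deep) / euclidean_loss(*shallow)
ratio_dbe = dbe_loss(*deep) / dbe_loss(*shallow)
print(f"deep/shallow loss ratio: Euclidean {ratio_euclid:.1f}, DBE {ratio_dbe:.1f}")
```

With a negative a2 the deep pixel still contributes more than the shallow one, but by a much smaller factor than under the plain Euclidean loss. The chosen a2 keeps g increasing over the depth range used here, since g'(d) = a1 + a2*d stays positive for d < 20.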
  • 8. Candidate Generation ● After the training process we generate depth map candidates for different cropping ratios. ● The cropping ratio is defined as the ratio of the size of the sub-image to the size of the original image. ● The cropped images from the four corners are passed through the trained CNN and are combined to form depth map candidates corresponding to different cropping ratios.
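
The candidate generation step on slide 8 can be sketched as follows. Several details are assumptions rather than facts from the slides: the cropping ratio is taken as a ratio of side lengths, overlapping crop estimates are merged by averaging, and `estimate_depth` is a hypothetical stand-in for the trained CNN.

```python
import math
import numpy as np

def corner_crops(h, w, r):
    """Return (row_slice, col_slice) pairs for the four corner crops at side ratio r."""
    ch, cw = min(h, math.ceil(h * r)), min(w, math.ceil(w * r))
    rows = [slice(0, ch), slice(h - ch, h)]
    cols = [slice(0, cw), slice(w - cw, w)]
    return [(rs, cs) for rs in rows for cs in cols]

def depth_candidate(image, r, estimate_depth):
    """Merge per-crop depth estimates into one full-size candidate by averaging overlaps."""
    if not 0.5 <= r <= 1.0:
        raise ValueError("r must be in [0.5, 1] so that the four crops cover the image")
    h, w = image.shape[:2]
    total = np.zeros((h, w))
    count = np.zeros((h, w))
    for rs, cs in corner_crops(h, w, r):
        total[rs, cs] += estimate_depth(image[rs, cs])
        count[rs, cs] += 1
    return total / count

image = np.arange(48, dtype=float).reshape(6, 8)
identity = lambda crop: crop            # stand-in for the trained CNN (assumption)
candidates = {r: depth_candidate(image, r, identity) for r in (0.6, 0.8, 1.0)}
```

Each value in `candidates` is a full-resolution map, so candidates from different cropping ratios can be compared or combined pixel by pixel.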
  • 9. Candidate Combination using Fourier Analysis. ● Local depth variations obtained in the depth map candidates with small cropping ratios correspond to high-frequency components. ● Overall depth features obtained in the depth map candidates with larger cropping ratios correspond to low-frequency components. ● We combine these two complementary features using the Fourier transform of the individual candidates. Equation labels on the slide: the estimated value of the kth component of f̂ for the mth candidate; the combined f̂ matrix with entries from all cropping ratios; the Fourier transform of the true depth map.
  • 10. Candidate Combination using Fourier Analysis. We find the weight factor by minimizing the mean squared error between f̂k and fk. The matrix Wk that achieves the smallest MSE is the required solution.
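
The weight fitting described on slides 9 and 10 can be sketched as an independent least-squares problem for each DFT coefficient. This is a sketch under assumptions: the slides do not give the exact form of Wk, so one complex weight per candidate and frequency is used here with no bias term, the data are synthetic, and the function names are hypothetical.

```python
import numpy as np

def fit_frequency_weights(candidates, truths):
    """Least-squares weights per DFT coefficient.

    candidates: (T, M, H, W) depth map candidates for T training images.
    truths:     (T, H, W) ground-truth depth maps.
    Returns complex weights of shape (M, H, W).
    """
    F = np.fft.fft2(candidates)          # 2-D DFT over the last two axes
    G = np.fft.fft2(truths)
    T, M, H, W = F.shape
    weights = np.zeros((M, H, W), dtype=complex)
    for u in range(H):
        for v in range(W):
            # Minimise sum_t |F[t, :, u, v] @ w - G[t, u, v]|^2 for this frequency.
            weights[:, u, v] = np.linalg.lstsq(F[:, :, u, v], G[:, u, v], rcond=None)[0]
    return weights

def combine(candidates, weights):
    """Weighted sum of candidate spectra, transformed back to the spatial domain."""
    F = np.fft.fft2(candidates)                       # shape (..., M, H, W)
    return np.fft.ifft2((weights * F).sum(axis=-3)).real

rng = np.random.default_rng(0)
truths = rng.normal(size=(8, 8, 8))                                  # T=8 synthetic maps of 8x8
candidates = truths[:, None] + 0.3 * rng.normal(size=(8, 3, 8, 8))   # M=3 noisy candidates
weights = fit_frequency_weights(candidates, truths)
combined = combine(candidates, weights)

mse = lambda a: np.mean((a - truths) ** 2)
best_single = min(mse(candidates[:, m]) for m in range(3))
print("combined MSE:", mse(combined), "best single candidate MSE:", best_single)
```

On the training maps used for fitting, the combined estimate cannot have a larger MSE than any single candidate, because using one candidate alone is a special case of the weighted sum. Performance on held-out images would have to be measured separately.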
  • 11. Implementation & Testing. ● As explained previously, we select a large a1 and a negative a2 to extract deeper depths from the image. This modifies the basic Euclidean loss, which helps estimate shallow and deep depths more reliably. ● The loss is used as the objective function. The root mean squared (RMS) error of the network decreases after each epoch, and we generate the output depth map after training for 50-70 epochs.
  • 12. Implementation & Testing. ● The implementation has 3 parts: the Depth Estimation Network, the Depth Balanced Euclidean Loss, and the Fourier Domain Combination model. ● Depth Estimation Network: based on ResNet-152, we modify the last 19 ResNet blocks to extract intermediate features. All extracted features are concatenated and fed into a fully connected layer to estimate the depth map. [1] ● We import the ResNet-152 architecture from the Keras or PyTorch framework and train the weights.
  • 13. Implementation & Testing. ● We convert the depth map candidates into the frequency domain using the 2-D DFT and add the candidates from the different cropping ratios in the frequency domain to obtain the combined depth map. ● We implemented these functions using the built-in rfft and irfft functions from the PyTorch library to extract and process the features in the frequency domain. ● This is the crucial step of our project because it differentiates our approach from other depth estimation methods: we generate candidates at several cropping ratios and combine the frequency-domain maps of the cropped sub-images.
  • 14. Figure: (a) input image; (b) estimated depth map after 15 epochs; (c) estimated depth map after 25 epochs.
  • 15. Figure: (a) input image; (b) estimated depth map after 15 epochs, estimating much larger depths and including multiple objects.
  • 17. Real-life applications and Further improvements. ● Object localization and detection using depth maps can localize objects more closely and more efficiently than the bounding box prediction used by traditional algorithms. This approach would be as fast as semantic segmentation methods, and because the objects would also be segmented based on the depth estimate, the results are more accurate. [3] ● Planetary exploration would find depth estimation very useful, since most satellites capture only a single image and must extract as much information from it as possible. With depth estimation maps, rovers and other space vehicles can land safely by determining the topography of different planets and moons.
  • 18. References 1. Jae-Han Lee, Minhyeok Heo, Kyung-Rae Kim, and Chang-Su Kim, Single-Image Depth Estimation Based on Fourier Domain Analysis, IEEE, 2018. 2. Clement Godard, Unsupervised Monocular Depth Estimation with Left-Right Consistency, IEEE, 2017. 3. Xiao Lin, Depth Estimation and Semantic Segmentation from Single RGB Image using Hybrid CNNs, 2019.