This document summarizes a presentation on prior-embedding deep super-resolution. It discusses challenges such as ill-posedness and proposes solutions such as additional constraints and embedding signal structure. It reviews representative works in deep image super-resolution from 2014 to 2018, and summarizes research on deep band-based image super-resolution and STR-ResNet for video super-resolution, covering network architectures, experimental results, and comparisons to other methods.
Enhanced Deep Residual Networks for Single Image Super-Resolution - NAVER Engineering
Speaker: Heewon Kim (Ph.D. student, Seoul National University)
Date: September 2017
Currently in the combined M.S./Ph.D. program in Electrical and Computer Engineering, Seoul National University
Best Paper Award of NTIRE 2017 Workshop: Challenge Track
Overview:
Single image super-resolution is a research field that restores a low-resolution image to its high-resolution original. Everyday examples include keeping a small region of a social media photo sharp when it is greatly enlarged, or producing original-image resolution from a thumbnail.
In this talk, we first review the research directions before and after deep learning, and then examine our team's work, which won the 2nd NTIRE Workshop Challenge at CVPR 2017, focusing on an analysis of the network architecture.
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state of the art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
The presentation covers convolutional neural network (CNN) design. First, the main building blocks of CNNs are introduced. Then we systematically investigate the impact of a range of recent advances in CNN architectures and learning methods on the object categorization (ILSVRC) problem. In the evaluation, the influence of the following architecture choices is tested: non-linearity (ReLU, ELU, maxout, compatibility with batch normalization), pooling variants (stochastic, max, average, mixed), network width, and classifier design (convolution, fully-connected, SPP), along with image pre-processing and learning parameters: learning rate, batch size, cleanliness of the data, etc.
Super resolution in deep learning era - Jaejun Yoo
Abstract:
Image restoration (IR) is one of the fundamental problems in low-level vision, encompassing denoising, deblurring, super-resolution, and more. In today's talk I will focus on the super-resolution task. There are two main streams in super-resolution research: traditional model-based optimization and discriminative learning methods. I will present the pros and cons of both methods and their recent developments in the field. Finally, I will provide a mathematical view that explains both methods in a single holistic framework, achieving the best of both worlds, and conclude with a summary of the problems that remain unsolved in the field.
This is the 243rd paper review from the TensorFlow Korea paper-reading group PR12.
This time the paper is Designing Network Design Spaces from Facebook AI Research, known as RegNet.
When designing a CNN, are bottleneck layers really beneficial? Do more layers always yield higher performance? When the width and height of an activation map are halved (stride 2 or pooling), the number of channels is doubled; is that really optimal? Might it be better to drop bottleneck layers entirely, is there a magic number of layers for peak performance, or would tripling rather than doubling the channels be better when the activations are halved?
Rather than designing one good neural network, this paper is about designing a good design space: a space populated by good networks, in which techniques such as AutoML can then find them. The authors propose narrowing an almost unconstrained design space down to a good one through a human-in-the-loop process. In the video below, you can see which design space gave rise to RegNet, which outperforms EfficientNet, and whether any of the design choices we took for granted turn out to be wrong.
Video: https://youtu.be/bnbKQRae_u4
Paper: https://arxiv.org/abs/2003.13678
Speaker: Seokjun Jeon (Ph.D. student, KAIST)
Date: August 2018
Super-resolution, the conversion of low-resolution images into high-resolution ones, is a long-studied topic. Recently, its performance has improved dramatically with the application of deep learning. We developed a technique that uses stereo images to obtain higher-resolution images, and this talk presents that work.
1. Multi-Frame Super-Resolution
2. Learning-Based Super-Resolution
3. Stereo Imaging
4. Deep-Learning Based Stereo Super-Resolution
Scaling up Deep Learning Based Super Resolution Algorithms - Xiaoyong Zhu
Super-resolution is a process for obtaining one or more high-resolution images from one or more low-resolution observations. It has been used for many applications, including satellite and aerial imaging, medical image processing, ultrasound imaging, line fitting, automated mosaicking, infrared imaging, facial image improvement, text image improvement, compressed image and video enhancement, and fingerprint image enhancement. While research on super-resolution began in the 1970s, recently, with the power of deep learning, many notable new methods have been created, including SRCNN, SRResNet, and lately SRGANs, which use generative adversarial networks. However, since these approaches require many images to train the deep learning network, they are extremely compute-intensive. Fortunately, with the power of the cloud, you can easily scale up the compute resources as needed, making the algorithms converge faster.
Convolutional Neural Networks: Popular Architectures - ananth
In this presentation we look at some of the popular architectures, such as ResNet, that have been successfully used for a variety of applications. Starting from AlexNet and VGG, which showed that deep learning architectures can deliver unprecedented accuracies for image classification and localization tasks, we review other recent architectures such as ResNet, GoogLeNet (Inception), and the more recent SENet that have won ImageNet competitions.
PR-144: SqueezeNext: Hardware-Aware Neural Network Design - Jinwon Lee
This is the 144th paper review from the TensorFlow-KR paper-reading group PR12.
This time I reviewed SqueezeNext, one of the representative efficient CNNs. I also reviewed its predecessor, SqueezeNet, along with NetScore, a paper on a metric for evaluating CNNs in which SqueezeNext ranked first.
Paper links:
SqueezeNext - https://arxiv.org/abs/1803.10615
SqueezeNet - https://arxiv.org/abs/1602.07360
NetScore - https://arxiv.org/abs/1806.05512
Video: https://youtu.be/WReWeADJ3Pw
Modern Convolutional Neural Network techniques for image segmentation - Gioele Ciaparrone
Recently, convolutional neural networks have been successfully applied to image segmentation tasks. Here we present some of the most recent techniques that have increased accuracy on such tasks. First we describe the Inception architecture and its evolution, which allowed increasing the width and depth of the network without increasing the computational burden. We then show how to adapt classification networks into fully convolutional networks able to perform pixel-wise classification for segmentation tasks. We finally introduce the hypercolumn technique to further improve the state of the art on various fine-grained localization tasks.
Deep learning for image super resolution - Prudhvi Raj
Using deep convolutional networks, the machine can learn an end-to-end mapping between low- and high-resolution images. Unlike traditional methods, this approach jointly optimizes all layers of the pipeline. A lightweight CNN structure is used, which is simple to implement and provides a formidable trade-off compared with existing methods.
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream - NAVER Engineering
Despite recent successes in 3D reconstruction, most research focuses mainly on acquiring precise geometry.
Although many computer graphics applications, such as AR/VR, need more than scene geometry (e.g., surface color and semantics) to provide a richer user experience, existing 3D reconstruction methods leave such auxiliary information out of consideration.
This talk presents our two approaches to reconstructing the color and semantic information of 3D indoor scenes, as follows:
Junho Jeon, Yeongyu Jung, Haejoon Kim, Seungyong Lee, "Texture map generation for 3D reconstructed scenes", The Visual Computer (CGI 2016), Vol. 32, No. 5, May 2016.
Junho Jeon, Jinwoong Jung, Jungeon Kim, Seungyong Lee, "Semantic Reconstruction: Reconstruction of Semantically Segmented 3D Meshes via Volumetric Semantic Fusion", Computer Graphics Forum (Pacific Graphics 2018), Vol. 37, No. 7, October 2018.
Biologically Inspired Methods for Adversarially Robust Deep Learning - MuhammadAhmedShah2
Presentation of Muhammad's research on Biologically Inspired Methods for Adversarially Robust Deep Learning at MIT on April 12, 2024. The talk covers work that integrates various sensory and cerebral biological mechanisms into deep neural networks (DNNs) and evaluates the impact on robustness to noise and adversarial attacks.
Universal plane wave compounding for high quality US imaging using deep learning - Shujaat Khan
Plane-wave compounding sums several successive plane waves incident at different angles to form an image. By applying time reversal to the received signals, transmit focusing can be synthesized. Unfortunately, to improve the temporal resolution, the number of plane waves must be reduced, which often degrades image quality. To address this problem, an image-domain learning method using neural networks has been proposed, but the network needs to be retrained whenever the number of plane waves changes. Herein, we propose, for the first time, a universal plane-wave compounding scheme using deep learning to directly process plane waves and RF data acquired at different view angles and sub-sampling rates to generate high-quality US images.
A comprehensive tutorial on Convolutional Neural Networks (CNNs), which discusses the motivation behind CNNs and deep learning in general, followed by a description of the various components of a typical CNN layer. It explains the theory behind the different variants used in practice and gives a big picture of the whole network by putting everything together.
Next, there is a discussion of the various state-of-the-art frameworks used to implement CNNs to tackle real-world classification and regression problems.
Finally, the implementation of CNNs is demonstrated by implementing the paper 'Age and Gender Classification Using Convolutional Neural Networks' by Levi and Hassner (2015).
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo... - MLconf
Graph Representation Learning with Deep Embedding Approach:
Graphs are a commonly used data structure for representing real-world relationships, e.g., molecular structures, knowledge graphs, and social and communication networks. The effective encoding of graphical information is essential to the success of such applications. In this talk I'll first describe a general deep learning framework, namely structure2vec, for end-to-end graph feature representation learning. Then I'll present direct applications of this model to graph problems at different scales, including community detection and molecule graph classification/regression. We then extend the embedding idea to a temporally evolving user-product interaction graph for recommendation. Finally I'll present our latest work on leveraging reinforcement learning for graph combinatorial optimization, including the vertex cover problem for social influence maximization and the traveling salesman problem for scheduling management.
Deep Learning-Based Universal Beamformer for Ultrasound Imaging - Shujaat Khan
In ultrasound (US) imaging, individual channel RF measurements are back-propagated and accumulated to form an image after applying specific delays. While this time reversal is usually implemented using a hardware- or software-based delay-and-sum (DAS) beamformer, the performance of DAS decreases rapidly in situations where data acquisition is not ideal. Herein, for the first time, we demonstrate that a single data-driven adaptive beamformer designed as a deep neural network can generate high-quality images robustly for various detector channel configurations and subsampling rates. The proposed deep beamformer is evaluated for two distinct acquisition schemes: focused ultrasound imaging and plane-wave imaging. Experimental results show that the proposed deep beamformer exhibits significant performance gains for both focused and plane-wave imaging schemes, in terms of contrast-to-noise ratio and structural similarity.
Image segmentation is a classic computer vision task that aims at labeling pixels with semantic classes. These slides provide an overview of the basic approaches applied from the deep learning field to tackle this challenge and presents the basic subtasks (semantic, instance and panoptic segmentation) and related datasets.
Presented at the International Summer School on Deep Learning (ISSonDL) 2020 held online and organized by the University of Gdansk (Poland) between the 30th August and 2nd September.
http://2020.dl-lab.eu/virtual-summer-school-on-deep-learning/
Deep Learning Based Voice Activity Detection and Speech Enhancement - NAVER Engineering
Speaker: Juntae Kim (Ph.D. student, KAIST)
Date: October 2018
Voice activity detection (VAD) and speech enhancement (SE) are important front-end technologies for noise-robust speech recognition systems.
From an incoming noisy signal, VAD detects only the speech segments, while SE removes the noise while preserving the speech signal.
For VAD and SE, this presentation will cover the traditional methods, deep learning based methods, and our papers as follows:
1. J. Kim and M. Hahn, "Voice Activity Detection Using an Adaptive Context Attention Model," in IEEE Signal Processing Letters, vol. 25, no. 8, pp. 1181-1185, Aug. 2018.
2. J. Kim and M. Hahn, "Speech Enhancement Using a Two Step Network," submitted to IEEE Signal Processing Letters, 2018.
Also, this presentation will briefly introduce some experimental results in real-world environment (far-field, noisy environment), conducted on the embedded board.
For VAD,
Traditional VAD methods.
Deep learning based VAD methods.
Paper presentation: J. Kim and M. Hahn, "Voice Activity Detection Using an Adaptive Context Attention Model," in IEEE Signal Processing Letters, vol. 25, no. 8, pp. 1181-1185, Aug. 2018.
End point detection based on VAD.
Experimental results of DNN-EPD on embedded board in real-world environment.
For SE,
Traditional SE methods.
Deep learning based SE methods.
Paper presentation: J. Kim and M. Hahn, "Speech Enhancement Using a Two Step Network," submitted to IEEE Signal Processing Letters, 2018.
Experimental results in real-world environment.
In this presentation we discuss the convolution operation, the architecture of a convolutional neural network, and different layers such as pooling. The presentation draws heavily on A. Karpathy's Stanford course CS 231n.
3. Outline
Background and Related Work / 005
Deep Band-Based Image Super-Resolution / 014
STR-ResNet for Video Super-Resolution / 037
Prior Embedding Deep Super-Resolution
5. Challenges and Solutions
Prior Embedding Deep Super-Resolution
Challenge: ill-posedness
• The LR-to-HR mapping is one-to-infinity
Solutions:
• Additional constraints
  • Model architectures
  • Loss functions
• Signal structure embedding
  • Domain knowledge
[Haefner18] Bjoern Haefner et al., "Fight Ill-Posedness With Ill-Posedness: Single-Shot Variational Depth Super-Resolution From Shading", CVPR, 2018.
6. DL SR Route
Deep Image Super-Resolution
SRCNN (TPAMI 2016) → VDSR (CVPR 2016) → FSRCNN (ECCV 2016) → DRCN (CVPR 2016) → LapSRN (CVPR 2017) → SRGAN (CVPR 2017) → DBPN (CVPR 2018) → RDB (CVPR 2018)
9. Representative Work
Deep Image Super-Resolution
Timeline: Deep Network (2014-2016), Signal Regularization (2015-2017), Perceptual SR (2017-2018)
Highlighted: DEGREE, SRGAN
Christian Ledig et al., "Photo-realistic single image super-resolution using a generative adversarial network", CVPR, 2017.
10. Representative Work
Deep Image Super-Resolution
Timeline: Deep Network (2014-2016), Signal Regularization (2015-2017), Perceptual SR (2017-2018)
Highlighted: SFTGAN
Xintao Wang et al., "Recovering realistic texture in image super-resolution by deep spatial feature transform", CVPR, 2018.
11. Representative Work
Deep Image Super-Resolution
Timeline: Deep Network (2014-2016), Signal Regularization (2015-2017), Perceptual SR (2017-2018)
Highlighted: RDN, built from residual dense blocks (RDB: DenseNet + ResNet)
Yulun Zhang et al., "Residual dense network for image super-resolution", CVPR, 2018.
12. Representative Work
Deep Image Super-Resolution
Timeline: Deep Network (2014-2016), Signal Regularization (2015-2017), Perceptual SR (2017-2018)
Highlighted: RCAN, the Residual Channel Attention Network (2018)
Yulun Zhang et al., "Image Super-Resolution Using Very Deep Residual Channel Attention Networks", ECCV, 2018.
13. Outline
Background and Related Work / 005
Deep Band-Based Image Super-Resolution / 014
STR-ResNet for Video Super-Resolution / 037
Prior Embedding Deep Super-Resolution
14. Deep Band-Based Image Super-Resolution
Deep Edge Guided Recurrent Residual Learning for Image Super-Resolution
Wenhan Yang, Jiashi Feng, Jianchao Yang, Fang Zhao, Jiaying Liu, Zongming Guo, and Shuicheng Yan, IEEE TIP 2017
15. Previous Works
Deep Band-Based Image Super-Resolution
Issues
• Traditional regularizers in the MAP framework have limited capacity to describe complex features
• Deep learning-based methods are black boxes: how do they work, and how can priors be embedded?
16. Sub-Bands Super-Resolution
Deep Band-Based Image Super-Resolution
Limitations
• The accuracy of reconstruction differs for each band [Singh14]
• High-energy bands carry the macrostructures, while low-energy sub-bands suffer from attenuation
[Singh14] A. Singh and N. Ahuja, "Sub-Band Energy Constraints for Self-Similarity based Super-Resolution", ICPR, 2014.
17. Sub-Bands Super-Resolution
Reconstruction for each band
• Entire signal → separate signals [Freeman91, Taubman94]
• Pay attention to the signal details of each band
Whole-signal MAP formulation:
x̂ = argmin_x ‖DHx − y‖₂² + p(x)
Per-band formulation:
x̂_i = argmin_{x_i} ‖DHx_i − y_i‖₂² + p(x_i),   i = 1, 2, …, n
Weighted aggregation of the recovered bands:
x̂ = ∑_{i=1}^{n} w_i x̂_i
[Freeman91] W. T. Freeman et al., "The Design and Use of Steerable Filters", TPAMI, 1991.
[Taubman94] D. Taubman et al., "Multirate 3-D Subband Coding of Video", TIP, 1994.
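The per-band formulation assumes the signal can be split into sub-bands and recombined by a weighted sum. Below is a minimal 1-D sketch of such a decomposition, using a difference-of-Gaussians filter bank rather than the steerable filters of [Freeman91]; all function names are illustrative, not from the paper.

```python
import numpy as np

def gaussian_kernel(sigma):
    # Truncated, normalized 1-D Gaussian kernel.
    radius = int(3 * sigma) + 1
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def decompose(signal, sigmas=(1.0, 2.0, 4.0)):
    # Split a 1-D signal into band-pass sub-bands by differencing
    # successively stronger Gaussian blurs, plus a final low-pass band.
    bands, prev = [], signal
    for sigma in sigmas:
        low = np.convolve(signal, gaussian_kernel(sigma), mode="same")
        bands.append(prev - low)  # detail lost between the two blur levels
        prev = low
    bands.append(prev)            # residual low-frequency band
    return bands

t = np.linspace(0, 8 * np.pi, 256)
signal = np.sin(t) + 0.3 * np.sin(5 * t)
bands = decompose(signal)
# With weights w_i = 1 the sum telescopes back to the original signal:
x_hat = sum(bands)
assert np.allclose(x_hat, signal)
```

The per-band SR problem then restores each `bands[i]` separately before the weighted aggregation x̂ = ∑ w_i x̂_i.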
18. Sub-Bands Super-Resolution
Gradual reconstruction
• F : band recovery
• G : aggregation function
• F and G: linear transformations [Singh14, Song16]
• Supervised learning [Chatterjee07, Singh14]
s_i = F_i(s_{i−1}),   x̂_i = G_i(x̂_{i−1}, s_i),   i = 1, 2, …, n
[Song16] S. Song et al., "Joint Sub-Band based Neighbor Embedding for Image Super-Resolution", ICASSP, 2016.
[Chatterjee07] P. Chatterjee et al., "Super-Resolution Using Sub-Band Constrained Total Variation", SSVM, 2007.
(Figure: cascade of band-recovery stages. Starting from the LR input y = s_0, each F_i predicts the band s_i and each G_i aggregates it with the previous estimate to form x̂_i, up to the final estimate x̂_n.)
19. Sub-Bands Super-Resolution
Gradual learned reconstruction
• End-to-end learning
• Not dependent on the specific band choice
• G: summation; concise and introduces no extra parameters
s_i = F_i(s_{i−1}),   x̂_i = x̂_{i−1} + s_i
(Figure: cascade of band-recovery stages F_1, …, F_n with summation aggregation. Starting from the LR input y = s_0, each F_i predicts the band s_i, which is added to the running estimate to form x̂_i; the final estimate is x̂_n.)
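The gradual reconstruction with summation as the aggregation G can be sketched as follows. The toy operators standing in for the learned band-recovery networks F_i are purely illustrative.

```python
import numpy as np

def gradual_reconstruction(y, band_ops):
    # s_0 = y (the LR input); each stage predicts the next band
    # s_i = F_i(s_{i-1}) and accumulates x_hat_i = x_hat_{i-1} + s_i.
    s = y
    x_hat = np.zeros_like(y, dtype=float)
    estimates = []
    for F in band_ops:
        s = F(s)
        x_hat = x_hat + s      # G = summation: no extra parameters
        estimates.append(x_hat.copy())
    return estimates           # intermediate estimates x_hat_1 .. x_hat_n

# Toy stand-ins for learned F_i: each passes on part of the band energy.
ops = [lambda s: 0.5 * s, lambda s: 0.5 * s, lambda s: 1.0 * s]
outs = gradual_reconstruction(np.ones(4), ops)
assert np.allclose(outs[-1], 1.0)  # 0.5 + 0.25 + 0.25 = 1.0
```

Because G is a plain sum, the network only has to learn the band-recovery stages F_i end to end.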
20. Sub-Bands Super-Resolution
Gradual learned reconstruction
• F: Convs + ReLU (nonlinear)
• Unsupervised sub-band learning
s_i = F_i(s_{i−1}),   x̂_i = x̂_{i−1} + s_i
(Figure: the same cascade, with each band-recovery stage F_i implemented as stacked convolution layers with ReLU, mapping s_{i−1} to the predicted band s_i; the bands are summed into the intermediate estimates x̂_1, …, x̂_n.)
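A minimal numpy sketch of one nonlinear band-recovery stage F_i (conv → ReLU → conv) on a 1-D signal. The 3-tap filters here are hand-picked only for illustration; in the actual network these weights are learned end to end.

```python
import numpy as np

def make_stage(w1, w2):
    # One band-recovery stage F_i: conv -> ReLU -> conv on a 1-D signal.
    def F(s):
        h = np.maximum(np.convolve(s, w1, mode="same"), 0.0)  # conv + ReLU
        return np.convolve(h, w2, mode="same")                # second conv
    return F

# Illustrative 3-tap filters (learned in the real model).
smooth = np.array([0.25, 0.5, 0.25])
edge = np.array([-1.0, 2.0, -1.0])
F1 = make_stage(smooth, edge)

s0 = np.sin(np.linspace(0, 4 * np.pi, 64))
s1 = F1(s0)                      # predicted sub-band, same length as input
assert s1.shape == s0.shape
```

The ReLU between the convolutions is what makes each stage nonlinear, in contrast to the linear F and G of the earlier sub-band methods.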
21. Prior Embedding Modeling
• Regularization embedding: internal + auxiliary
[Weston08] J. Weston et al., "Deep Learning via Semi-Supervised Embedding", ICML, 2008.
(Figure: three variants of the space-embedding network of [Weston08], each a three-layer network from input to output: (a) Output, where the embedding space is attached at the output; (b) Internal, where an intermediate layer maps into the embedding space; (c) Auxiliary, where a separate embedding layer branches off an intermediate layer.)
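One common way to realize the internal/auxiliary variants is to add a penalty that pulls an intermediate layer's activations toward a prior embedding. The sketch below is a simplified mean-squared version, not the neighbor-based formulation of [Weston08]; the weighting `lam` and all names are illustrative.

```python
import numpy as np

def embedding_regularized_loss(pred, target, hidden, prior, lam=0.1):
    # Reconstruction term on the output plus an auxiliary term that
    # encourages an internal layer's activations to match a prior embedding.
    recon = np.mean((pred - target) ** 2)
    embed = np.mean((hidden - prior) ** 2)
    return recon + lam * embed

pred = np.array([1.0, 2.0]); target = np.array([1.0, 2.0])
hidden = np.array([0.5, 0.5]); prior = np.array([0.0, 0.0])
loss = embedding_regularized_loss(pred, target, hidden, prior, lam=0.1)
assert np.isclose(loss, 0.025)  # recon = 0, embed = 0.25, lam = 0.1
```

During training the auxiliary term regularizes the hidden representation without changing the output head; at inference the embedding branch can be dropped.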
36. Outline
Background and Related Work / 005
Deep Band-Based Image Super-Resolution / 014
STR-ResNet for Video Super-Resolution / 037
Prior Embedding Deep Super-Resolution
37. STR-ResNet for Video Super-Resolution
Video Super-Resolution Based on Spatial-Temporal Recurrent
Residual Networks
Wenhan Yang, Jiashi Feng, Guosen Xie, Jiaying Liu, Zongming Guo, Shuicheng Yan
CVIU 2018
38. Representative Work
2015: BRCN
Huang et al., Bidirectional Recurrent Convolutional Networks for Multi-Frame Super-Resolution, NIPS, 2015.
39. Representative Work
2015: BRCN
2016: Draft Learning
Renjie Liao et al., Video Super-Resolution via Deep Draft-Ensemble Learning, ICCV, 2015.
40. Representative Work
2015: BRCN
2016: Draft Learning
2016: VSRNet
A. Kappeler, S. Yoo, Q. Dai, A. K. Katsaggelos, Video Super-Resolution with Convolutional Neural Networks, TCI, 2016.
41. Representative Work
2015: BRCN
2016: Draft Learning
2016: VSRNet
2017: Sub-Pixel Network
Wenzhe Shi et al., Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network, CVPR, 2016.
42. Representative Work
2015: BRCN
2016: Draft Learning
2016: VSRNet
2017: Sub-Pixel Network
2018: Detail Revealing SR
Xin Tao, Hongyun Gao, Renjie Liao, Jue Wang, Jiaya Jia, Detail-Revealing Deep Video Super-Resolution, ICCV, 2017.
43. Explicit vs. Implicit
Explicit Motion Modeling
• Regions with salient geometric features: better details
• Smooth regions: artifacts
• Motion compensation: high complexity
• Draft Learning [Liao15]: 625 s for a 50×50 region
Implicit Motion Modeling
• Robust and highly efficient
• Motion handled automatically by network learning
• Deficiency: weaker motion modeling
(Figure: visual comparison of Bicubic, BRCN, and Draft results)
44. Spatial Temporal Recurrent ResNet
Implicit Motion Modeling
• No explicit motion compensation or alignment
Spatial Temporal ResNet
• Jointly models intra-frame redundancy and inter-frame correspondence
• Motion embedding
• Inter-frame residue
45. Spatial and Temporal Joint Modeling (1/2)
Spatial Domain
46. Spatial and Temporal Joint Modeling (2/2)
Spatial Temporal Domain
47. Implicit Motion Embedding (1/3)
Temporal Redundancy in Different Temporal Domains
48. Implicit Motion Embedding (2/3)
Spatial Residue vs. Temporal Residue
(Figure: example frames showing the temporal residue, the spatial residue, and the combined spatial and temporal residue)
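The two residues can be written directly as frame differences. A minimal sketch, with array names chosen for illustration:

```python
import numpy as np

def spatial_residue(hr_frame, lr_upsampled):
    # High-frequency detail missing within a single frame:
    # what the SR network must add on top of the interpolated input.
    return hr_frame - lr_upsampled

def temporal_residue(frame_t, frame_prev):
    # Inter-frame difference: near zero in static regions,
    # large where motion occurs, so it implicitly encodes motion.
    return frame_t - frame_prev

a = np.ones((4, 4)); b = np.zeros((4, 4))
assert np.all(temporal_residue(a, a) == 0)   # static scene: no residue
assert np.all(spatial_residue(a, b) == 1)
```

Because the temporal residue concentrates on moving regions, predicting it gives the network an implicit motion embedding without explicit compensation.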
49. Implicit Motion Embedding (3/3)
Input and Predict Inter-Frame Residue
52. Experimental Comparison
Experimental Setting (1/2)
• Dataset
• 20 videos from Xiph.org Video Test Media¹
• 75,000 overlapped 33×33 patches
• Comparison methods
• A+ [Timofte14], SRCNN [Dong15], VE², 3DSKR [Takeda09], Draft SR [Liao15], and BRCN [Huang15]
• Kernel: 3×3, Channels: 64
• Degradation: 9×9 blur kernel, blur level 1.6, scale factor 4
¹ https://media.xiph.org/video/derf/
² http://www.infognition.com/videoenhancer/
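The degradation settings above (9×9 blur kernel, blur level 1.6, scale factor 4) can be sketched as follows, assuming the blur is Gaussian with σ = 1.6, which is the usual reading of "blur level"; the function names are illustrative.

```python
import numpy as np

def gaussian_kernel2d(size=9, sigma=1.6):
    # Normalized size x size Gaussian blur kernel.
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def degrade(hr, scale=4, size=9, sigma=1.6):
    # Blur the HR frame with the Gaussian kernel, then subsample
    # by the scale factor to obtain the LR observation.
    k = gaussian_kernel2d(size, sigma)
    pad = size // 2
    padded = np.pad(hr, pad, mode="edge")
    blurred = np.empty_like(hr, dtype=float)
    for i in range(hr.shape[0]):
        for j in range(hr.shape[1]):
            blurred[i, j] = np.sum(padded[i:i + size, j:j + size] * k)
    return blurred[::scale, ::scale]

lr = degrade(np.ones((32, 32)))
assert lr.shape == (8, 8)
assert np.allclose(lr, 1.0)  # blurring a constant image leaves it unchanged
```

The SR networks are then trained on (degraded, original) patch pairs produced by this pipeline.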
53. Experimental Comparison
Experimental Setting (2/2)
• Loss function
• Training (learning rate)
• Before 250,000 iterations: 0.0001
• After 250,000 iterations: 0.00001, focusing on fine-tuning
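The training schedule is a simple step function. A sketch assuming the 0.0001/0.00001 values are learning rates (the slide does not name them explicitly):

```python
def learning_rate(iteration):
    # Step schedule from the experimental setting: a larger rate for the
    # first 250,000 iterations, then a 10x smaller rate for fine-tuning.
    return 1e-4 if iteration < 250_000 else 1e-5

assert learning_rate(0) == 1e-4
assert learning_rate(250_000) == 1e-5
```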