Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution

The visual computing world is moving to an exciting technological era of ultra HD (UHD) and wide color gamut (WCG) deep colors. The new Gen9 graphics engine in the 6th generation Intel® Core™ processors is the developers’ platform of choice for creating visual excellence in 4K and deep colors. The Gen9 processor graphics offers attractive solutions for high-quality and low-power video scaling that handle UHD and WCG. First, we introduce a hardware fixed-function scaler inside the new SFC (scaling and format conversion) module that provides high-quality scaling on low-power platforms. Second, we present a super-resolution scaling solution based on a convolutional neural network that can be implemented via OpenCL™ running on the execution units (EUs). We discuss the merits of each solution in different user environments.

Published in: Technology

Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution

  1. 1. VICTOR H. S. HA, PH.D. VPG MEDIA AND DISPLAY IP, INTEL CORP.
  2. 2. Title: Ultra High Definition (UHD) Video Scaling: Low-Power (LP) Hardware (HW) Fixed-Function (FF) vs. Convolutional Neural Network (CNN)-based Super-Resolution (SR). Outline: Gen9 Intel® Processor Graphics; Super-Resolution Scaling; SFC Media HW FF; Advanced Video Scaler in SFC; Convolutional Neural Network; Super-Resolution Scaling using CNN; Compare
  3. 3. Gen9 Intel® Processor Graphics
  4. 4. Table of Contents: Gen9 Intel® Processor Graphics; Super-Resolution Scaling; SFC Media HW FF; Advanced Video Scaler in SFC; Convolutional Neural Network; Super-Resolution Scaling using CNN; Compare
  5. 5. 5
  6. 6. UHD End-to-End Support in Gen9 Intel® Processor Graphics UHD Decode, Encode, Display UHD Content UHD Display UHD Capture UHD Video Scaling Support • Upscale from HD to UHD • Downscale from UHD to HD Display Port* (DP), Embedded DisplayPort* (eDP), Miracast* and other names and brands may be claimed as the property of others * GPU Accelerated; Media Codec support may not be available on all operating systems and applications.
  7. 7. Why Is UHD Scaling Different? SD to HD Scaling • Pixel Resolution from 720x480 to 1920x1080 • Aspect Ratio from 4:3 to 16:9 • SD Video in Low Quality, often requiring De-interlace, De-noise, De-blocking, Sharpening, etc. FHD to 4K UHD Scaling • Pixel Resolution from 1920x1080 to 3840x2160 • Aspect Ratio stays at 16:9 • FHD Video already in High Quality with Crisp Details
  8. 8. Why Is UHD Scaling Different? SD to HD Scaling • Pixel Resolution from 720x480 to 1920x1080 • Aspect Ratio from 4:3 to 16:9 • SD Video in Low Quality, often requiring De-interlace, De-noise, De-blocking, Sharpening, etc. • 345,600 pixels to 2,073,600 pixels FHD to 4K UHD Scaling • Pixel Resolution from 1920x1080 to 3840x2160 • Aspect Ratio stays at 16:9 • FHD Video already in High Quality with Crisp Details • 2,073,600 pixels to 8,294,400 pixels
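The pixel counts and scale ratios quoted on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only; the names `RESOLUTIONS` and `scale_summary` are made up for this example and do not come from the deck.

```python
# Resolutions named on the slide (width, height).
RESOLUTIONS = {"SD": (720, 480), "FHD": (1920, 1080), "UHD": (3840, 2160)}


def pixel_count(width, height):
    """Total number of pixels in one frame."""
    return width * height


def scale_summary(src, dst):
    """Pixel counts and per-axis scale ratios for a src -> dst conversion."""
    src_w, src_h = RESOLUTIONS[src]
    dst_w, dst_h = RESOLUTIONS[dst]
    return {
        "src_pixels": pixel_count(src_w, src_h),
        "dst_pixels": pixel_count(dst_w, dst_h),
        "horizontal_ratio": dst_w / src_w,
        "vertical_ratio": dst_h / src_h,
    }


if __name__ == "__main__":
    for src, dst in (("SD", "FHD"), ("FHD", "UHD")):
        print(src, "->", dst, scale_summary(src, dst))
```

SD to FHD uses different horizontal and vertical ratios (the aspect ratio changes), while FHD to UHD is a uniform 2x in both directions.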
  9. 9. [Block diagram labels: Unslice, Geometry, Subslice, Slice Common, FF Media in Unslice] • 6th Generation Intel Core Processor Graphics on 14nm Process • Support of Latest APIs o DirectX* 12/11.3 o OpenCL 2.0 o OpenGL* 4.4 • Scalable uArch Partitioning similar to 5th Generation Intel® Core™ Architecture o Unslice, Slice, Subslice, etc. • Improved Design for Better Energy Efficiency • Flexible and Finer-grain Power Management * Other names and brands may be claimed as the property of others
  10. 10. Multi-Format Codec (MFX) • HEVC Decode • HEVC Encode • HEVC 10bit Decode (GPU Accelerated) • JPEG / MJPEG Decode • JPEG / MJPEG Encode • MPEG2 Decode and Encode • AVC Decode and Encode • VP8 Decode and Encode [Block diagram: FF Media in Unslice]
  11. 11. Video Quality Engine (VQE) • Video Processing and Enhancement • 16bit per channel processing pipe • RAW image processing pipe • De-noise • De-interlace • Contrast/Saturation Enhancement • Skin-tone Detection and Enhancement • Color Space Conversion (BT2020) • Color Correction [Block diagram: FF Media in Unslice]
  12. 12. Scaler and Format Conversion (SFC) • Dedicated Media FF HW • Advanced Video Scaler (AVS) • Sharpness Enhancement • Color Space Conversion • Chroma Sampling • Rotation and other Format Conversions Media Sampler • Video Motion Estimation (VME) • Advanced Video Scaler (AVS) • Sharpness Enhancement [Block diagram: FF Media in Unslice]
  13. 13. SFC (Scaler and Format Converter) Low-Power UHD Video Playback • New SFC HW pipe is added to deliver Ultra Low Power media playback experience • SFC is connected inline (without memory read/write) to MFX (video decode) and VQE (video processing)
  14. 14. SFC AVS Example #1: Video Decode → Scaling → Display (or Encode). GEN8 without SFC: MFX Video Decode, Media Sampler AVS, VQE Video Enhancement, MFX Video Encode. GEN9 with SFC: MFX Video Decode, SFC AVS as VD-SFC (Video Decode SFC), VQE Video Enhancement, MFX Video Encode. [Pipeline diagram; three connections are labeled "memory read/write"]
  15. 15. SFC AVS Example #2: Video Quality Enhancement → Scaling → Display (or Encode). GEN8 without SFC: MFX Video Decode, VQE Video Enhancement, Media Sampler AVS, MFX Video Encode. GEN9 with SFC: MFX Video Decode, VQE Video Enhancement, SFC AVS as VE-SFC (Video Enhance SFC), MFX Video Encode. [Pipeline diagram; three connections are labeled "memory read/write"]
  16. 16. SFC (Scaler and Format Converter) Low-Power UHD Video Playback • New SFC HW pipe is added to deliver Ultra Low Power media playback experience • SFC is connected inline (without memory read/write) to MFX (video decode) and VQE (video processing) SFC pipeline delivers many benefits: • Inline Connection: Reduced bandwidth and power consumption • SFC handles scaling, detail enhancement, color space conversion, and other format conversion on the fly • 12bit Data Path ready for Ultra-HD (UHD), High Dynamic Range (HDR), Wide Color Gamut (WCG) • Free up EU resources (slice/subslice) from media use cases and power-gated when not used • SFC can process UHD Video (3840x2160 @ 60fps) operating at power-efficient low-frequency mode
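To get a feel for why an inline connection reduces bandwidth, the sketch below estimates the memory traffic of one intermediate write plus one read of every frame at 3840x2160 @ 60fps. The surface layout is an assumption that is not stated in the deck: 4:2:0 chroma subsampling (1.5 samples per pixel) with each 12-bit sample stored in a 2-byte container. Real surface formats, compression, and caching would change the numbers.

```python
def frame_bytes(width, height, samples_per_pixel=1.5, bytes_per_sample=2):
    """Bytes in one frame under the assumed layout (4:2:0, 2-byte samples)."""
    return int(width * height * samples_per_pixel * bytes_per_sample)


def round_trip_bandwidth(width, height, fps, samples_per_pixel=1.5, bytes_per_sample=2):
    """Bytes per second for writing every frame to memory and reading it back once."""
    one_way = frame_bytes(width, height, samples_per_pixel, bytes_per_sample) * fps
    return 2 * one_way


if __name__ == "__main__":
    saved = round_trip_bandwidth(3840, 2160, 60)
    print(f"pixel rate: {3840 * 2160 * 60:,} pixels/s")
    print(f"round trip avoided per inline hop: {saved / 1e9:.2f} GB/s (assumed layout)")
```

Under these assumptions each memory round trip that the inline connection avoids is worth roughly 3 GB/s at UHD 60fps.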
  17. 17. AVS (Advanced Video Scaler) in SFC AVS is a Low-Power Fixed-Function Hardware in SFC • Real-time video scaling in a 12bits per channel data path • Consists of a pair of spatial filters, Sharp Filter and Smooth Filter Adaptive Mode • The results of the two filters are alpha-blended to generate the output pixel value • The alpha blending factor, α, is computed for each pixel from neighboring pixels [Block diagram: the Input Pixel feeds the Sharp Filter, the Smooth Filter, and the Blending Factor Computation; the two filter results are blended with Blending Factor α to give the Output Pixel]
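The adaptive mode described above amounts to a per-pixel alpha blend of the two filter outputs. The sketch below shows only that structure. The deck does not say how α is derived or which filter it weights, so `alpha_from_neighbors` is a hypothetical heuristic (a large local intensity range, typical of hard computer-graphics edges, pushes the blend toward the smooth filter), and treating α as the weight of the sharp filter is likewise an assumption.

```python
def blend(sharp, smooth, alpha):
    """Alpha-blend two filter outputs; alpha is assumed to weight the sharp result."""
    return alpha * sharp + (1.0 - alpha) * smooth


def alpha_from_neighbors(neighbors, threshold=64.0):
    """Hypothetical blending-factor rule, NOT the hardware's actual computation.

    A small local range (natural texture) keeps alpha near 1 (sharp filter);
    a large local range (hard edge, where ringing is visible) drives it to 0.
    """
    local_range = max(neighbors) - min(neighbors)
    return max(0.0, min(1.0, 1.0 - local_range / threshold))


def adaptive_output(sharp, smooth, neighbors):
    """Output pixel value for one position in adaptive mode."""
    return blend(sharp, smooth, alpha_from_neighbors(neighbors))


if __name__ == "__main__":
    print(adaptive_output(100.0, 60.0, [100, 110, 132]))  # moderate local contrast
    print(adaptive_output(100.0, 60.0, [0, 255, 0]))      # hard edge -> smooth filter
```

Because α is recomputed at every pixel, the same frame can use the sharp filter in textured regions and the smooth filter around text or UI edges.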
  18. 18. 18 AVS Smooth Filter Reference Ground Truth (1440x960) Smooth Filter (720x480 to 1440x960) ** Blurrier than Reference Ground Truth **
  19. 19. 19 AVS Sharp Filter Reference Ground Truth (1440x960) Sharp Filter (720x480 to 1440x960) ** Similar to Reference Ground Truth **
  20. 20. AVS Sharper Filter Reference Ground Truth (1440x960) Sharper Filter (720x480 to 1440x960) ** Sharper than Reference Ground Truth ** [Callout: visual artifact]
  21. 21. 21 Sharp vs. Smooth Filter Smooth Filter Sharper Filter ** Ringing Artifacts **
  22. 22. 22 Adaptive Mode in AVS Sharp Filter • Sharp and Crisp Output on Natural Scenes • Ringing on Computer Graphics Smooth Filter • Blurrier Output on Natural Scenes • Ringing-free Output on Computer Graphics Adaptive Mode • Best of Both Filters possible based on Per-Pixel Adjustment • Sharp Output on Natural Scenes • Ringing-free Output on Computer Graphics
  23. 23. 23 Sharp vs. Smooth Filter Smooth Filter Sharper Filter ** Ringing Artifacts **
  24. 24. Adaptive Mode 1 [Image panels: Adaptive Mode On, Sharper Filter] Sharper Filter: ** Ringing Artifacts ** Adaptive Mode On: ** Sharper than Smooth Filter without Ringing **
  25. 25. 25 Adaptive Mode 2 Adaptive Mode On Smooth Filter ** Sharper than Smooth Filter **
  26. 26. 26 Adaptive Mode in AVS Sharp Filter • Sharp and Crisp Output on Natural Scenes • Ringing on Computer Graphics Smooth Filter • Blurrier Output on Natural Scenes • Ringing-free Output on Computer Graphics Adaptive Mode • Best of Both Filters possible based on Per-Pixel Adjustment • Sharp Output on Natural Scenes • Ringing-free Output on Computer Graphics
  27. 27. Media Scaler Interface (Interface: Video Scaler) Intel® Media Server Studio SDK https://software.intel.com/en-us/media-sdk • Microsoft Windows* DXVA: SFC AVS (default) • LibVA (Android/Linux): SFC AVS (default) • macOS*: SFC and AVS • Application SW specifies input/output formats: o conf.vpp.In.Width, Height, CropX, CropY, CropW, CropH o conf.vpp.Out.Width, Height, CropX, CropY, CropW, CropH • MSDK then configures the video processing pipeline accordingly * Other names and brands may be claimed as the property of others
  28. 28. Neuron to Convolutional Neural Networks for Super-Resolution Scaling
  29. 29. Table of Contents: Gen9 Intel® Processor Graphics; Super-Resolution Scaling; SFC Media HW FF; Advanced Video Scaler in SFC; Convolutional Neural Network; Super-Resolution Scaling using CNN; Compare
  30. 30. From Neuron to CNN [Roadmap diagram labels: Neuron, CNN, Scaling, Super Resolution, Sparse Coding Super Resolution, CNN-based SR, Sparse Coding, Sparse Coding Deep Network]
  31. 31. Neuron to Convolutional Neural Networks
  32. 32. 32 Neuron A neuron • Is a nerve cell in brains, spinal cords, etc. • Processes and transmits data through electrical/chemical signals • Can give rise to multiple dendrites, but not more than one axon • Signals travel from the axon of one neuron to a dendrite of another (with many exceptions to these rules) via a synapse • Connects to each other to form neural networks • A human brain contains about 100 billion neurons • Each has 5K~100K synaptic connections to other neurons input signal input signal dendrites axon output signal axon terminals nucleus cell body
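  33. 33. Artificial Neuron • A Neuron has a single Axon and multiple Dendrites o Dendrites receive incoming electrical signals o Electrical signal is sent out from an Axon to Dendrites • Model: $f = b + \sum_{i=0}^{n} w_i x_i$ and $out = \begin{cases} 0 & \text{if } f < 0 \\ 1 & \text{if } f \ge 0 \end{cases}$ [Diagram: inputs x0, x1, ..., xn with weights w0, w1, ..., wn and bias b feed the summation S, which produces f and then out; shown next to a biological neuron labeled input signal, dendrites, cell body, nucleus, axon, axon terminals, output signal]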
  34. 34. Artificial Neuron – what does it do? NAND gate is universal for computation - any logic can be built up out of NAND gates. An artificial neuron (perceptron with 2 inputs) can implement a NAND gate: • input = (x0, x1) • weights = (w0, w1) = (-2, -2) • bias b = 3 • $f = b + \sum_{i=0}^{n} w_i x_i$ • out = 0 if f < 0, 1 if f ≥ 0
      NAND Gate:
      x0 | x1 | x0 AND x1 | x0 NAND x1
      0  | 0  | 0         | 1
      0  | 1  | 0         | 1
      1  | 0  | 0         | 1
      1  | 1  | 1         | 0
      Artificial Neuron:
      x0 | x1 | f  | out
      0  | 0  | 3  | 1
      0  | 1  | 1  | 1
      1  | 0  | 1  | 1
      1  | 1  | -1 | 0
      [Diagram: inputs x0, x1 with weights w0, w1 and bias b feed the summation S, which produces f and then out]
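The NAND neuron on this slide is small enough to run directly. The sketch below uses the slide's weights (-2, -2), bias 3, and threshold rule; only the function names `neuron` and `nand` are invented for the example.

```python
def neuron(inputs, weights, bias):
    """Threshold neuron: f = b + sum(w_i * x_i); out = 1 if f >= 0 else 0."""
    f = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if f >= 0 else 0


def nand(x0, x1):
    """NAND gate built from one neuron with the slide's weights and bias."""
    return neuron((x0, x1), weights=(-2, -2), bias=3)


if __name__ == "__main__":
    for x0 in (0, 1):
        for x1 in (0, 1):
            f = 3 - 2 * x0 - 2 * x1
            print(x0, x1, "f =", f, "out =", nand(x0, x1))
```

Running it reproduces the f and out columns of the truth table above.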
  35. 35. Neural Network Connect multiple artificial neurons • Simple compute devices become interconnected • Connections between neurons determine the function of the overall network • Massively parallel structure allows fast results with slow neurons • Multi-layer networks are more powerful [Diagram: inputs in0 and in1 feed artificial neurons arranged in Layer 1 and Layer 2, producing out0, out1, and out2; each neuron has inputs x0, x1, weights w0, w1, bias b, and summation S]
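As a small illustration of how the connections between neurons determine the function of the network, the sketch below wires four identical NAND neurons into an XOR. This is the standard four-NAND construction of XOR and is not the specific network drawn on the slide; the function names are illustrative.

```python
def neuron(inputs, weights, bias):
    """Threshold neuron: out = 1 if bias + sum(w_i * x_i) >= 0 else 0."""
    f = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if f >= 0 else 0


def nand(x0, x1):
    """NAND neuron with weights (-2, -2) and bias 3."""
    return neuron((x0, x1), (-2, -2), 3)


def xor(in0, in1):
    """XOR from four identical NAND neurons; only the wiring differs."""
    n1 = nand(in0, in1)
    n2 = nand(in0, n1)
    n3 = nand(in1, n1)
    return nand(n2, n3)


if __name__ == "__main__":
    for in0 in (0, 1):
        for in1 in (0, 1):
            print(in0, in1, "->", xor(in0, in1))
```

A single threshold neuron cannot compute XOR, so this also gives a concrete instance of a multi-layer network being more powerful than one neuron.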
  36. 36. Convolutional Neural Networks (CNN) What is it? • Multiple layers of artificial neural networks • Some layers performing Convolution Operations that extract features (e.g., edges) from input images • 2D Convolution Operation is $f(x, y) = \sum_{i=-\infty}^{\infty} \sum_{j=-\infty}^{\infty} w(i, j)\, x(x - i, y - j)$, where w is the convolution kernel and x(·, ·) is the input image Usages: • Image Classification • Object Detection • Face Recognition • Denoise • Deblurring • Super-Resolution Scaling
  37. 37. Convolution using a Neuron • Each neuron processes a small part (receptive field) of input image using shared weights in convolutional layers What’s it good for? Why use it? • Instead of designing and optimizing each convolution kernel manually, train the network to solve difficult problems simply by feeding input and output pairs (i.e., feature extraction process is learned by the network) • Neuron: $f = b + \sum_{i=0}^{n} w_i x_i$ • 2D convolution: $f(x, y) = \sum_{i=-\infty}^{\infty} \sum_{j=-\infty}^{\infty} w(i, j)\, x(x - i, y - j)$ [Diagram: a 3x3 Convolution Kernel with weights w0 ... w8 is applied to successive 3x3 Image Patches x0 ... x8 of the Input Image; each application is one neuron with inputs x0 ... xn, weights w0 ... wn, bias b, and summation S]
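The idea of one neuron sliding over image patches with shared weights can be sketched in plain Python. This is a minimal illustration, assuming a grayscale image stored as a list of rows and no padding; `conv_layer` is a name made up for this example. It evaluates the neuron form $f = b + \sum_i w_i x_i$ on every patch, which is the sliding-window correlation commonly used in CNN layers and equals the 2D convolution formula applied with a flipped kernel.

```python
def conv_layer(image, kernel, bias=0.0):
    """Apply one shared-weight neuron to every kernel-sized patch of the image."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Same weights for every patch: f = b + sum(w_i * x_i).
            f = bias
            for i in range(k_h):
                for j in range(k_w):
                    f += kernel[i][j] * image[r + i][c + j]
            row.append(f)
        output.append(row)
    return output


if __name__ == "__main__":
    # A vertical edge: dark on the left, bright on the right.
    image = [[0, 0, 1, 1] for _ in range(4)]
    # Hand-designed horizontal-gradient kernel (a CNN would learn such weights).
    kernel = [[-1, 0, 1] for _ in range(3)]
    print(conv_layer(image, kernel))
```

In a CNN the kernel entries are not designed by hand as they are here; they are the parameters that training adjusts.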
  38. 38. CNN-based Super-Resolution
  39. 39. 39 Super-Resolution Super-Resolution • The term has been used by many to mean many different things over the years • We will define what we mean by it in this talk, and then move on Super-Resolution as Upscaling • Input = Low-resolution Image (e.g., 1920x1080 RGB picture) • Output = High-resolution Image (e.g., 3840x2160 RGB picture) • Super-Resolution Requirements: o Use a single input image to generate a single output image, i.e., Single-frame (Spatial) SR o Output image quality is better than traditional scalers based on interpolation (bilinear, bicubic, etc.) o No visual artifacts are introduced by SR upscaling
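For reference, the kind of interpolation-based scaler that super-resolution is expected to beat can be written in a few lines. The sketch below is a plain bilinear upscaler for a grayscale image stored as a list of rows. The pixel-center alignment and edge clamping are choices made for this example, not details taken from the deck.

```python
def bilinear_upscale(image, factor):
    """Upscale a grayscale image (list of rows) by an integer factor."""
    src_h, src_w = len(image), len(image[0])
    output = []
    for r in range(src_h * factor):
        # Map the output pixel center back to source coordinates, clamped to the image.
        sy = min(max((r + 0.5) / factor - 0.5, 0.0), src_h - 1.0)
        y0 = int(sy)
        y1 = min(y0 + 1, src_h - 1)
        wy = sy - y0
        row = []
        for c in range(src_w * factor):
            sx = min(max((c + 0.5) / factor - 0.5, 0.0), src_w - 1.0)
            x0 = int(sx)
            x1 = min(x0 + 1, src_w - 1)
            wx = sx - x0
            # Interpolate horizontally on two rows, then vertically between them.
            top = image[y0][x0] * (1.0 - wx) + image[y0][x1] * wx
            bottom = image[y1][x0] * (1.0 - wx) + image[y1][x1] * wx
            row.append(top * (1.0 - wy) + bottom * wy)
        output.append(row)
    return output


if __name__ == "__main__":
    print(bilinear_upscale([[0, 100]], 2))
```

Each output pixel is a weighted average of at most four input pixels, so no detail beyond the input can appear; that limitation is what the super-resolution requirements above aim to overcome.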
  40. 40. Publications on CNN-based SR SCN from University of Illinois – Urbana Champaign 1. Image Super-Resolution via Sparse Representation, Huang et al., TIP 2010 2. Coupled Dictionary Training for Image Super-Resolution, Huang et al., TIP 2012 3. Deep Networks for Image Super-Resolution with Sparse Prior, Huang et al., ICCV 2015 4. Self-Tuned Deep Super Resolution, Huang et al., CVPR 2015 5. Robust Single Image Super-Resolution via Deep Networks with Sparse Prior, Huang et al., TIP 2016 SRCNN from The Chinese University of Hong Kong 1. Learning a deep convolutional network for image super-resolution, Tang et al., ECCV 2014 2. Image Super-Resolution using Deep Convolutional Networks, Tang et al., TPAMI 2016 DRCN from Seoul National University 1. Deeply-Recursive Convolutional Network for Image Super-Resolution, Kim et al., CVPR 2016 2. Accurate Image Super-Resolution using Very Deep Convolutional Networks, Kim et al., CVPR 2016 Technische Universität München, Image Super-Resolution with Fast Approximate Convolutional Sparse Coding, Smagt et al., ICONIP 2014 Huaqiao University, Deep Network Cascade for Image Super-Resolution, Chen et al., ECCV 2014
  41. 41. Publications on CNN-based SR SCN from University of Illinois – Urbana Champaign 1. Image Super-Resolution via Sparse Representation, Huang et al., TIP 2010 2. Coupled Dictionary Training for Image Super-Resolution, Huang et al., TIP 2012 3. Deep Networks for Image Super-Resolution with Sparse Prior, Huang et al., ICCV 2015 4. Self-Tuned Deep Super Resolution, Huang et al., CVPR 2015 5. Robust Single Image Super-Resolution via Deep Networks with Sparse Prior, Huang et al., TIP 2016 SRCNN from The Chinese University of Hong Kong 1. Learning a deep convolutional network for image super-resolution, Tang et al., ECCV 2014 2. Image Super-Resolution using Deep Convolutional Networks, Tang et al., TPAMI 2016 DRCN from Seoul National University 1. Deeply-Recursive Convolutional Network for Image Super-Resolution, Kim et al., CVPR 2016 2. Accurate Image Super-Resolution using Very Deep Convolutional Networks, Kim et al., CVPR 2016 Technische Universität München, Image Super-Resolution with Fast Approximate Convolutional Sparse Coding, Smagt et al., ICONIP 2014 Huaqiao University, Deep Network Cascade for Image Super-Resolution, Chen et al., ECCV 2014 [Callout: compared to all SFSR (CNN-based or not) solutions]
  42. 42. From Sparse Coding to CNN-based SR [Roadmap diagram labels: Neuron, CNN, Scaling, Super Resolution, Sparse Coding Super Resolution, CNN-based SR, Sparse Coding, Sparse Coding Deep Network]
  43. 43. Sparse Coding • Reconstruct input signal x using a linear combination of basis vectors of a Dictionary D with sparse coefficients α o $x = D \cdot \alpha$ • where x is an n x 1 input vector, D is an n x m matrix, an overcomplete (m > n) Dictionary with m basis vectors, and α is an m x 1 sparse code vector • Sparse = Most of the sparse code coefficients in α are zero, i.e., α is a sparse representation of x • Optimal sparse code is obtained as $\alpha = \arg\min_z E(x, z)$, with $E(x, z) = \frac{1}{2} \lVert x - Dz \rVert_2^2 + \lambda \lVert z \rVert_1$ • Encoder options: Dictionary D with ISTA/CoD (iterative) or LISTA/LCoD (approximate) [Diagram: Input Vector x → Encoder → Sparse Code α]
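The iterative encoder mentioned on this slide (ISTA) can be sketched in plain Python. This is a minimal illustration of the textbook iterative shrinkage-thresholding algorithm for the energy $\frac{1}{2}\lVert x - Dz\rVert_2^2 + \lambda\lVert z\rVert_1$, not the code used in the talk. The step size uses the squared Frobenius norm of D as a simple upper bound on the Lipschitz constant, and the tiny 2x3 dictionary is made up for the demo.

```python
import math


def soft_threshold(value, threshold):
    """Shrink a value toward zero by threshold; values inside the band become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ista(x, D, lam, iterations=500):
    """Approximate argmin_z 0.5 * ||x - D z||^2 + lam * ||z||_1 with ISTA."""
    n, m = len(D), len(D[0])
    # Squared Frobenius norm bounds the largest eigenvalue of D^T D (safe step size).
    lipschitz = sum(d * d for row in D for d in row)
    z = [0.0] * m
    for _ in range(iterations):
        residual = [sum(D[i][j] * z[j] for j in range(m)) - x[i] for i in range(n)]
        gradient = [sum(D[i][j] * residual[i] for i in range(n)) for j in range(m)]
        z = [soft_threshold(z[j] - gradient[j] / lipschitz, lam / lipschitz)
             for j in range(m)]
    return z


if __name__ == "__main__":
    s = 1.0 / math.sqrt(2.0)
    # Overcomplete toy dictionary: n = 2, m = 3 unit-norm basis vectors (columns).
    D = [[1.0, 0.0, s],
         [0.0, 1.0, s]]
    print(ista([1.0, 1.0], D, lam=0.1))
```

For the input (1, 1) the third basis vector alone explains the signal, so the L1 penalty favors a code with a single nonzero coefficient over one that splits the signal across the first two vectors.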
  44. 44. Sparse Coding Super-Resolution • Super-Resolution Reconstruction: $y = D_y \cdot \alpha_y \rightarrow \alpha_y = \alpha_x \rightarrow D_x \cdot \alpha_x = x$ • Joint Dictionary Training: Iterative Optimization using 100,000 random image patch pairs → Overcomplete LR Dictionary $D_y$ (m = 1024) and Overcomplete HR Dictionary $D_x$ (m = 1024) [Diagram: the 3x3 LR Image Patch y passes through the Sparse Code Encoder to give the LR Sparse Representation $\alpha_y$, a Linear Combination of $D_y$ Dictionary Elements; the HR Sparse Representation $\alpha_x$ is a Linear Combination of $D_x$ Dictionary Elements that yields the 9x9 HR Image Patch x]
  45. 45. SCN (Sparse Coding based Network) Sparse Coding Super-Resolution → Deep Network Super-Resolution 1. Layer #1 (Convolutional Layer H): image patch/feature y is extracted from the LR image $I_y$ with $m_y$ filters 2. Layer #2 and #3 (Sparse Code Encoder as k-iterations of LISTA network): Sparse code α is computed from y 3. Layer #4 (Reconstruction): Sparse code α is multiplied with HR Dictionary $D_x$ to reconstruct HR image patch x 4. Layer #5 (Convolutional Layer G): All HR patches x are combined to HR Image $I_x$ [Diagram: $I_y$ LR Image → y LR Image Patch → Sparse Code Encoder → α Sparse Code → x HR Image Patch → $I_x$ HR Image] Fig. 2 from “Robust Single Image Super-Resolution via Deep Networks with Sparse Prior”, IEEE Transactions on Image Processing, Vol. 25, Issue 7, pp. 3194-3207, 2016
  46. 46. SCN: 5-Layer Deep Network for Super-Resolution Deep Network Architecture • 2 Convolutional Layers (H and G) and 3 Layers for Sparse Coding Encoder • All parameters trained via back-propagation using MSE cost function • Network learns more complex function beyond the sparse coding model • Performs better than sparse coding results even with dictionary size reduced from 1024 to 128 Advantages of SCN • LISTA sub-network to enforce sparse representation, i.e., better interpretation of filter responses and parameter initialization based on domain knowledge in sparse coding • Better SR results, faster training speed and smaller model size Subjective Quality Assessment • Best Visual Quality against other SFSR solutions (sharper boundaries, richer textures, no ringing) • Scale ratio is fixed for the network → Use a cascade of multiple SCNs + bicubic downscaler • Cascade of multiple networks is better than a single network trained with a large scale factor
  47. 47. Quality Study via Simulation
  48. 48. Table of Contents: Gen9 Intel® Processor Graphics; Super-Resolution Scaling; SFC Media HW FF; Advanced Video Scaler in SFC; Convolutional Neural Network; Super-Resolution Scaling using CNN; Compare (PSNR, MSE, Visual Inspection)
  49. 49. Capturing LR and HR Test Images 49 1. Camera Capture • LR: Camera Capture in FHD Mode at 1936x1288, then cropped to 720x480 • HR: Camera Capture in UHD Mode at 3888x2592, then cropped to 1440x960 2. Optical Scanner • LR: Scan a letter-size printed document in 300dpi Mode at 2478x3228, then cropped to 720x480 • HR: Scan the same printed document in 600dpi Mode at 4956x6456, then cropped to 1440x960 3. Screen Capture (www.intel.com) • LR: Screen Capture of Intel Website at 100% Zoom, then cropped to 720x480 • HR: Screen Capture of the same Intel Website at 200% Zoom, then cropped to 1440x960 Test Image #1 Test Image #2 Test Image #3
  50. 50. SR Test Scenarios Scaling Solutions • SFC AVS: Gen9 Intel® Processor Graphics Media HW FF SFC AVS in SW Simulation • SCN: Sparse-Coding Network (SCN) is CNN-based SR from Huang et al. MATLAB codes and network parameters available in http://www.ifp.illinois.edu/~dingliu2/iccv15/ 2x Upscaling for 1920x1080 to 3840x2160 • SFC AVS: 2x • SCN: 2x 4x Upscaling for 1920x1080 to 7680x4320 • SFC AVS: 4x • SCN: 2x (SCN) → 2x (SCN) 1.3x Upscaling for 1920x1080 to 2560x1440 • SFC AVS: 1.3x • SCN: 2x (SCN) → 0.65x (MATLAB Bicubic)
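Because the SCN scale ratio is fixed at 2x, each scenario above decomposes into some number of 2x stages followed by a residual bicubic downscale. The sketch below computes that decomposition from the listed widths; `cascade_plan` is a name made up for this example. With the listed pixel dimensions the third scenario works out to exactly 4/3 overall and a 2/3 residual, which the slide quotes in rounded form as 1.3x and 0.65x.

```python
from fractions import Fraction


def cascade_plan(src_width, dst_width, stage_factor=2):
    """Return (number of fixed-ratio stages, residual scale applied afterwards)."""
    target = Fraction(dst_width, src_width)
    reached = Fraction(1)
    stages = 0
    # Add fixed-ratio upscaling stages until the target size is met or exceeded.
    while reached < target:
        reached *= stage_factor
        stages += 1
    # Whatever overshoot remains is removed by a conventional downscaler.
    return stages, target / reached


if __name__ == "__main__":
    for dst in (3840, 7680, 2560):
        stages, residual = cascade_plan(1920, dst)
        print(f"1920 -> {dst}: {stages} x 2x stage(s), residual scale {float(residual):.3f}")
```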
  51. 51. 51
  52. 52. [Image panels: SFC AVS, SCN; callout: visual artifact] • SCN result is sharper than AVS • SCN adds some visual artifacts • Verdict: +1 to AVS or on par
  53. 53. [Image panels: SCN, SFC AVS; callout: halo added] • SCN has the halo problem that is more pronounced in 4x upscaling • Verdict: +1 to AVS
  54. 54. [Image panels: SCN, SFC AVS; callouts: ringing, severe color bleeding] • SCN result is sharper, but with more visible ringing and color bleeding artifacts • Verdict: +1 to AVS
  55. 55. SR Test Results
      Upscaling Ratio | Test 1        | Test 2  | Test 3
      1.3x            | SFC AVS       | SFC AVS | SCN
      2x              | SFC AVS       | SFC AVS | SCN
      4x              | SFC AVS / SCN | SFC AVS | SFC AVS
      Overall • SFC AVS and SCN performed well against the ground truth and quite closely to each other in 3 test examples • SFC AVS seems to have a slight advantage over SCN on these 3 test examples But, Why...? • SCN has not been trained on a wide range of non-natural scenes / computer graphics contents • Test input images are high-quality LR images, but SCN is trained on very blurry LR input images (Gaussian Blurring + Downsample + Bicubic Upsample) • Better understanding of CNN architecture, training database, and training strategies is required
  56. 56. Summary • Gen9 Intel® Processor Graphics adds a new HW FF called SFC • SFC AVS provides a high-quality video scaling solution at low power • Adaptive mode in AVS combines benefits of smooth and sharp filters on a per-pixel basis for superior output quality
  57. 57. Summary • Super-Resolution scaling solutions have been developed using the CNN framework and present great potential for high-quality video scaling • Gen9 Intel® Processor Graphics adds a new HW FF called SFC • SFC AVS provides a high-quality video scaling solution at low power • Adaptive mode in AVS combines benefits of smooth and sharp filters on a per-pixel basis for superior output quality
  58. 58. Summary • Super-Resolution scaling solutions have been developed using the CNN framework and present great potential for high-quality video scaling • SFC AVS produces very high-quality output that is comparable to current state-of-the-art CNN-based SR solutions • CNN-based SR scaling can be further improved with more intelligent training and architecture in the future • Gen9 Intel® Processor Graphics adds a new HW FF called SFC • SFC AVS provides a high-quality video scaling solution at low power • Adaptive mode in AVS combines benefits of smooth and sharp filters on a per-pixel basis for superior output quality
  59. 59. Summary • Super-Resolution scaling solutions have been developed using the CNN framework and present great potential for high-quality Super-Resolution scaling • SFC AVS produces very high-quality output that is comparable to current state-of-the-art CNN-based SR solutions • CNN-based SR scaling can be further improved with more intelligent training and architecture in the future • Gen9 Intel® Processor Graphics adds a new HW FF called SFC • SFC AVS provides a high-quality video scaling solution at low power • Adaptive mode in AVS combines benefits of smooth and sharp filters on a per-pixel basis for superior output quality • Use Gen9 Intel HW FF Scaler for Low-Power High-Performance High-Quality UHD 4K60 Scaling • Use Gen9 Intel® Processor Graphics for CNN-based SR running on OpenCL for enhanced UHD picture quality
  60. 60. Q&A 60
  61. 61. 61 Acknowledgement Many thanks go to the following individuals from Intel • Yi-jen Chiu • Keith Rowe • Niranjan S Mulay • Ping Liu • Furong Zhang • Wen-fu Kao • Vidhya Krishnan • Sungye Kim • Charles Lingle, Jon Kennedy and other tech reviewers • Michaelle Gonzalez, Naomi Pitfield, and the SIGGRAPH Team
  62. 62. Legal Notices and Disclaimers Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com. Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit http://www.intel.com/performance. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance. Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction. This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. 
Statements in this document that refer to Intel’s plans and expectations for the quarter, the year, and the future, are forward-looking statements that involve a number of risks and uncertainties. A detailed discussion of the factors that could affect Intel’s results and plans is included in Intel’s SEC filings, including the annual report on Form 10-K. All products, computer systems, dates and figures specified are preliminary based on current expectations, and are subject to change without notice. The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate. © 2016 Intel Corporation. Intel, the Intel logo, OpenCL and others are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others.
