STYLE TRANSFER
Lars Lowe Sjösund
AI Research Engineer at Peltarion
OVERVIEW
1. Intro style transfer
2. Convolutional Neural Networks
3. Gatys - A Neural Algorithm of Artistic Style
4. Improvements
STYLE TRANSFER
Content + Style = Desired output
Image courtesy: https://github.com/jcjohnson/neural-style
OVERVIEW
1. Intro style transfer
2. Convolutional Neural Networks
3. Gatys - A Neural Algorithm of Artistic Style
4. Improvements
Image courtesy: Matthieu Cord: Deep CNN and Weak Supervision Learning for visual recognition, https://blog.heuritech.com/2016/02/29/a-brief-report-of-the-heuritech-deep-learning-meetup-5/
HOW DOES A CNN WORK?
Convolution Layer
32x32x3 image (width 32, height 32, depth 3)
Slide credit: CS231n Lecture 7
Slide courtesy: Johnson, cs231n lecture 7, http://web.stanford.edu/class/cs20si/lectures/slides_06.pdf
Convolution Layer
32x32x3 image, 5x5x3 filter
Convolve the filter with the image, i.e. “slide over the image spatially, computing dot products”
Convolution Layer
32x32x3 image, 5x5x3 filter
Convolve the filter with the image, i.e. “slide over the image spatially, computing dot products”
Filters always extend the full depth of the input volume
Convolution Layer
32x32x3 image, 5x5x3 filter
1 number: the result of taking a dot product between the filter and a small 5x5x3 chunk of the image (i.e. 5*5*3 = 75-dimensional dot product + bias)
Convolution Layer
32x32x3 image, 5x5x3 filter
Convolve (slide) over all spatial locations
Activation map: 28x28x1
Convolution Layer
32x32x3 image, 5x5x3 filter
Convolve (slide) over all spatial locations
Consider a second, green filter
Activation maps: 28x28x1 each
Convolution Layer
For example, if we had six 5x5 filters, we’ll get 6 separate activation maps.
Activation maps: 28x28x6
We stack these up to get a “new image” of size 28x28x6!
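The convolution slides above can be illustrated with a minimal NumPy sketch. It assumes stride 1 and no padding (which is what gives 32 - 5 + 1 = 28), and it uses random numbers in place of a real image and learned filters. The helper name `conv_layer` is hypothetical and not taken from the slides.

```python
import numpy as np

def conv_layer(image, filters, biases):
    """Stride-1, no-padding convolution.

    image:   (H, W, C) input volume
    filters: (K, fh, fw, C) bank of K filters, each spanning the full depth C
    biases:  (K,) one bias per filter
    """
    H, W, C = image.shape
    K, fh, fw, fc = filters.shape
    assert fc == C, "filters must extend the full depth of the input volume"
    out_h, out_w = H - fh + 1, W - fw + 1
    out = np.empty((out_h, out_w, K))
    for k in range(K):
        for i in range(out_h):
            for j in range(out_w):
                # One number: dot product of the filter with a small chunk, plus bias
                chunk = image[i:i + fh, j:j + fw, :]
                out[i, j, k] = np.sum(chunk * filters[k]) + biases[k]
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32, 3))     # stand-in for a 32x32x3 image
filters = rng.standard_normal((6, 5, 5, 3))  # six 5x5x3 filters
biases = np.zeros(6)

activation_maps = conv_layer(image, filters, biases)
print(activation_maps.shape)  # (28, 28, 6)
```

The explicit loops are slow but mirror the "slide over all spatial locations" picture; real frameworks compute the same result with optimized kernels.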
Image courtesy: http://vision03.csail.mit.edu/cnn_art/data/single_layer.png
OVERVIEW
1. Intro style transfer
2. Convolutional Neural Networks
3. Gatys - A Neural Algorithm of Artistic Style
4. Improvements
STYLE TRANSFER
Content + Style = Desired output
Image courtesy: https://github.com/jcjohnson/neural-style
Image courtesy: https://github.com/jcjohnson/neural-style
RECONSTRUCTING CONTENT
➤ Given an image, how can we find a new one with the same content?
➤ Find a content distance measure between images
➤ Start from a random-noise image
➤ Minimize the distance through iteration
Image courtesy: D. Ulyanov https://bayesgroup.github.io/bmml_sem/2016/style.pdf
CONTENT DISTANCE MEASURE
1. Load a pre-trained CNN (e.g. VGG19)
2. Pass image #1 through the net
3. Save activation maps from conv-layers
4. Pass image #2 through the net
5. Save activation maps from conv-layers
6. Calculate the squared Euclidean distance between the activation maps from image #1 and #2, and sum over all layers
Image courtesy: Gatys et al., Texture Synthesis Using Convolutional Neural Networks, https://arxiv.org/pdf/1505.07376.pdf
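$$L_{\text{content}}(x, \hat{x}) = \frac{1}{2} \sum_l w_l \left( A^l(x) - A^l(\hat{x}) \right)^2$$
where $x$ and $\hat{x}$ are the two images being compared.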
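As a rough illustration of the content distance, the sketch below evaluates the weighted sum of squared differences between two sets of activation maps. The activations here are random stand-ins; in practice they would come from the conv-layers of a pre-trained network such as VGG19. The function name `content_loss` and the layer shapes are assumptions made for this example.

```python
import numpy as np

def content_loss(acts_x, acts_x_hat, weights):
    """0.5 * sum_l w_l * ||A^l(x) - A^l(x_hat)||^2 over the given layers."""
    return 0.5 * sum(
        w * np.sum((a - a_hat) ** 2)
        for w, a, a_hat in zip(weights, acts_x, acts_x_hat)
    )

rng = np.random.default_rng(0)
layer_shapes = [(28, 28, 6), (14, 14, 12)]  # hypothetical activation map shapes
acts_x = [rng.standard_normal(s) for s in layer_shapes]      # stand-in for image #1
acts_x_hat = [rng.standard_normal(s) for s in layer_shapes]  # stand-in for image #2
weights = [1.0, 1.0]

print(content_loss(acts_x, acts_x, weights))           # 0.0, identical activations
print(content_loss(acts_x, acts_x_hat, weights) > 0)   # True
```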
RECONSTRUCTING CONTENT
➤ Start from random image
➤ Update it using gradient descent
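$$L_{\text{content}}(x, \hat{x}) = \frac{1}{2} \sum_l w_l \left( A^l(x) - A^l(\hat{x}) \right)^2$$
$$\hat{x}_{t+1} = \hat{x}_t - \varepsilon \frac{\partial L_{\text{content}}}{\partial \hat{x}}$$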
Image courtesy: D. Ulyanov, https://bayesgroup.github.io/bmml_sem/2016/style.pdf
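The gradient-descent update can be tried on a toy problem. The sketch below is not the setup from the slides: it replaces the CNN with a hypothetical linear feature map `W @ img` so that the gradient can be written by hand, and it picks the step size from the spectral norm of `W` so that the loss cannot increase. With a real network the gradient would come from backpropagation instead.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 16))  # hypothetical linear stand-in for the network

def features(img):
    return W @ img

x = rng.standard_normal(16)        # "content image" (flattened toy vector)
target = features(x)               # its saved activations
x_hat = rng.standard_normal(16)    # start from random noise

def loss(img):
    r = features(img) - target
    return 0.5 * r @ r

def grad(img):
    # d/d img of 0.5 * ||W img - target||^2
    return W.T @ (features(img) - target)

eps = 1.0 / np.linalg.norm(W, 2) ** 2  # step size: 1 / largest eigenvalue of W^T W
initial_loss = loss(x_hat)
for t in range(500):
    x_hat = x_hat - eps * grad(x_hat)  # x_hat_{t+1} = x_hat_t - eps * dL/dx_hat
final_loss = loss(x_hat)

print(final_loss < initial_loss)  # True
```

Because this toy feature map is injective, the iterate recovers `x` itself; the feature-inversion results later in the deck show that a deep network's higher layers do not pin the image down that tightly.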
Reconstructions from intermediate layers
Higher layers are less sensitive to changes in color, texture, and shape
Mahendran and Vedaldi, “Understanding Deep Image Representations by Inverting Them”, CVPR 2015
Feature Inversion
Slide courtesy: Johnson, http://web.stanford.edu/class/cs20si/lectures/slides_06.pdf
Feature Inversion
Reconstructions from the representation after the last pooling layer (immediately before the first fully connected layer)
Mahendran and Vedaldi, “Understanding Deep Image Representations by Inverting Them”, CVPR 2015
Slide courtesy: Johnson, http://web.stanford.edu/class/cs20si/lectures/slides_06.pdf
STYLE TRANSFER
Content + Style = Desired output
Image courtesy: https://github.com/jcjohnson/neural-style
Style = Texture / Local structure

Ignores global semantic content
STYLE DISTANCE MEASURE
➤ Represent style by the Gram matrix: pairwise covariance of activation maps
➤ It is just the uncentered covariance matrix between vectorized activation maps
Image courtesy: Gatys et al., Texture Synthesis Using Convolutional Neural Networks, https://arxiv.org/pdf/1505.07376.pdf
$$G^l_{ij}(x) = \vec{A}^l_i(x) \cdot \vec{A}^l_j(x)$$
$$\begin{pmatrix} G(A_1, A_1) & \cdots & G(A_1, A_n) \\ \vdots & \ddots & \vdots \\ G(A_n, A_1) & \cdots & G(A_n, A_n) \end{pmatrix}$$
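A minimal sketch of the Gram matrix computation, assuming the activations of one layer are stored as an array of shape (height, width, channels). Each activation map is flattened to a vector, and entry (i, j) is the dot product of maps i and j. The name `gram_matrix` and the random input are assumptions for illustration.

```python
import numpy as np

def gram_matrix(activations):
    """activations: (H, W, N) -> (N, N) matrix of dot products between maps."""
    h, w, n = activations.shape
    flat = activations.reshape(h * w, n)  # column i is the vectorized map i
    return flat.T @ flat

rng = np.random.default_rng(0)
acts = rng.standard_normal((28, 28, 6))   # stand-in for one layer's activations
G = gram_matrix(acts)
print(G.shape)  # (6, 6)
```

No mean is subtracted before the dot products, which is why the slide calls it an uncentered covariance.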
STYLE DISTANCE MEASURE
$$L_{\text{style}}(x, \hat{x}) = \frac{1}{2} \sum_l w_l \left( G^l(x) - G^l(\hat{x}) \right)^2$$
Image courtesy: Gatys et al., Texture Synthesis Using Convolutional Neural Networks, https://arxiv.org/pdf/1505.07376.pdf
➤ Style loss: squared Euclidean distance between the Gram matrices of two images
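The sketch below evaluates the style loss on stand-in activations and also checks one consequence of using Gram matrices: shuffling the spatial positions of the activations leaves the Gram matrix, and hence the style loss, unchanged up to rounding. This is one way to see why the measure ignores global arrangement. The function names and array shapes are assumptions for this example.

```python
import numpy as np

def gram_matrix(activations):
    """activations: (H, W, N) -> (N, N) uncentered covariance of the maps."""
    h, w, n = activations.shape
    flat = activations.reshape(h * w, n)
    return flat.T @ flat

def style_loss(acts_x, acts_x_hat, weights):
    """0.5 * sum_l w_l * ||G^l(x) - G^l(x_hat)||^2 over the given layers."""
    total = 0.0
    for w_l, a, a_hat in zip(weights, acts_x, acts_x_hat):
        diff = gram_matrix(a) - gram_matrix(a_hat)
        total += w_l * np.sum(diff ** 2)
    return 0.5 * total

rng = np.random.default_rng(0)
acts = [rng.standard_normal((8, 8, 4))]    # stand-in activations, one layer
other = [rng.standard_normal((8, 8, 4))]   # activations of a different "image"

# Same activation vectors, but moved to random spatial positions
flat = acts[0].reshape(64, 4)
shuffled = [flat[rng.permutation(64)].reshape(8, 8, 4)]

loss_shuffled = style_loss(acts, shuffled, [1.0])
loss_other = style_loss(acts, other, [1.0])
print(np.isclose(loss_shuffled, 0.0), loss_other > loss_shuffled)
```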
RECONSTRUCTING STYLE
➤ Start from random image
➤ Update it using gradient descent
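$$L_{\text{style}}(x, \hat{x}) = \frac{1}{2} \sum_l w_l \left( G^l(x) - G^l(\hat{x}) \right)^2$$
$$\hat{x}_{t+1} = \hat{x}_t - \varepsilon \frac{\partial L_{\text{style}}}{\partial \hat{x}}$$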
Image courtesy: D. Ulyanov https://bayesgroup.github.io/bmml_sem/2016/style.pdf
RECONSTRUCTING STYLE
Image courtesy: Gatys et al., Texture Synthesis Using Convolutional Neural Networks, https://arxiv.org/pdf/1505.07376.pdf
MATHEMATICAL SIDE NOTE
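Special case of the squared Maximum Mean Discrepancy (MMD):
$$L_{\text{style}}(x, \hat{x}) = \frac{1}{2} \sum_l w_l \left( G^l(x) - G^l(\hat{x}) \right)^2$$
$$L^l_{\text{style}} = \frac{1}{Z^l_k} \mathrm{MMD}^2\left( A^l(x), A^l(\hat{x}) \right)$$
$$\mathrm{MMD}^2\left( A^l(x), A^l(\hat{x}) \right) = \left\| E[\phi(A^l(x))] - E[\phi(A^l(\hat{x}))] \right\|^2$$
$$L^l_{\text{style}} = \frac{1}{Z^l_k} \sum_{i=1}^{M_l} \sum_{j=1}^{M_l} \left( k(A^l_{:,i}, A^l_{:,j}) + k(\hat{A}^l_{:,i}, \hat{A}^l_{:,j}) - 2 k(A^l_{:,i}, \hat{A}^l_{:,j}) \right)$$
with $k(x, \hat{x}) = (x^T \hat{x})^2$, where $Z^l_k$ is a normalization term that depends on the layer $l$ and the kernel $k$.
Further reading: Demystifying Neural Style Transfer, Li et al.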
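The link between Gram matrices and the second-order polynomial kernel can be checked numerically. The sketch below assumes activations stored as an (N, M) array with N channels and M spatial positions, so that column i is the feature vector at position i. It compares the squared distance between the two Gram matrices with the double sum of k(a_i, a_j) + k(â_i, â_j) - 2 k(a_i, â_j) for k(u, v) = (uᵀv)². Normalization constants are left out, and all names are illustrative.

```python
import numpy as np

def poly_kernel_sum(P, Q):
    """Sum over all column pairs (p, q) of k(p, q) = (p^T q)^2."""
    return np.sum((P.T @ Q) ** 2)

rng = np.random.default_rng(0)
N, M = 4, 10                          # N channels, M spatial positions
A = rng.standard_normal((N, M))       # stand-in for A^l(x)
A_hat = rng.standard_normal((N, M))   # stand-in for A^l(x_hat)

# Left-hand side: squared Euclidean distance between the two Gram matrices
gram_distance = np.sum((A @ A.T - A_hat @ A_hat.T) ** 2)

# Right-hand side: kernel double sum, with a minus sign on the cross term
kernel_form = (
    poly_kernel_sum(A, A)
    + poly_kernel_sum(A_hat, A_hat)
    - 2 * poly_kernel_sum(A, A_hat)
)

print(np.isclose(gram_distance, kernel_form))  # True
```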
STYLE TRANSFER
Content + Style = Desired output
$$L_{\text{total}}(x, \hat{x}) = \alpha L_{\text{content}}(x, \hat{x}) + \beta L_{\text{style}}(x, \hat{x})$$
Image courtesy: https://github.com/jcjohnson/neural-style
OVERVIEW
1. Intro style transfer
2. Convolutional Neural Networks
3. Gatys - A Neural Algorithm of Artistic Style
4. Improvements
TOTAL VARIATION LOSS
$$L_{TV} = \sum_{i,j} \left[ (v_{i+1,j} - v_{i,j})^2 + (v_{i,j+1} - v_{i,j})^2 \right]$$
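A small sketch of the total variation loss on an image array. It assumes that differences reaching past the image border are simply dropped, which the formula itself does not specify, and the function name is hypothetical.

```python
import numpy as np

def total_variation_loss(v):
    """Sum of squared differences between vertically and horizontally adjacent pixels."""
    dh = v[1:, :] - v[:-1, :]   # v[i+1, j] - v[i, j]
    dw = v[:, 1:] - v[:, :-1]   # v[i, j+1] - v[i, j]
    return np.sum(dh ** 2) + np.sum(dw ** 2)

flat_image = np.ones((4, 4))
striped = np.array([[0.0, 1.0],
                    [0.0, 1.0]])

print(total_variation_loss(flat_image))  # 0.0
print(total_variation_loss(striped))     # 2.0
```

A constant image has zero loss, while every jump between neighbouring pixels is penalized, so adding this term to the objective favours smoother outputs.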
PERCEPTUAL LOSSES FOR REAL-TIME STYLE TRANSFER AND SUPER-RESOLUTION
➤ Train a network to do the optimization
➤ + Fast
➤ - One network per style
➤ - Quantitatively slightly worse
Image courtesy: Johnson et al., Perceptual Losses for Real-Time Style Transfer and Super-Resolution, https://arxiv.org/abs/1603.08155
ARBITRARY STYLE TRANSFER IN REAL-TIME WITH ADAPTIVE INSTANCE NORMALIZATION
Image courtesy: Huang et al., Arbitrary style transfer in real-time with adaptive instance normalization
$$\mathrm{AdaIN}(x_c, x_s) = \sigma(x_s) \left( \frac{x_c - \mu(x_c)}{\sigma(x_c)} \right) + \mu(x_s)$$
➤ Align mean and variance for activation maps
➤ + Fast (15 fps, 512x512px)
➤ + One net, arbitrary style
➤ - Quantitatively slightly worse
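The AdaIN formula can be sketched in NumPy as below. The sketch assumes activations of shape (height, width, channels) with mean and standard deviation taken per channel over the spatial dimensions, and it adds a small `eps` to the denominator for numerical safety; that term is an assumption and does not appear in the formula above. The inputs are random stand-ins for encoder activations.

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Shift and scale content activations to match the style's per-channel stats."""
    mu_c = content.mean(axis=(0, 1), keepdims=True)
    sigma_c = content.std(axis=(0, 1), keepdims=True)
    mu_s = style.mean(axis=(0, 1), keepdims=True)
    sigma_s = style.std(axis=(0, 1), keepdims=True)
    return sigma_s * (content - mu_c) / (sigma_c + eps) + mu_s

rng = np.random.default_rng(0)
content = rng.standard_normal((16, 16, 8))            # stand-in content activations
style = 3.0 * rng.standard_normal((20, 20, 8)) + 2.0  # stand-in style activations

out = adain(content, style)
print(out.shape)  # (16, 16, 8)
print(np.allclose(out.mean(axis=(0, 1)), style.mean(axis=(0, 1))))  # True
```

The spatial sizes of the two inputs may differ because only per-channel statistics of the style activations are used.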
QUESTIONS & DISCUSSION
THANK YOU!
Email: lars@peltarion.com
Twitter: sjosund

GenAI Risks & Security Meetup 01052024.pdf
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 

Stockholm AI study group #1 - Style Transfer

  • 1. STYLE TRANSFER Lars Lowe Sjösund AI Research Engineer at Peltarion
  • 2. OVERVIEW 1. Intro style transfer 2. Convolutional Neural Networks 3. Gatys - A Neural Algorithm of Artistic Style 4. Improvements
  • 4. STYLE TRANSFER: Content + Style = Desired output. Image courtesy: https://github.com/jcjohnson/neural-style
  • 7. HOW DOES A CNN WORK? Image courtesy: Matthieu Cord: Deep CNN and Weak Supervision Learning for visual recognition, https://blog.heuritech.com/2016/02/29/a-brief-report-of-the-heuritech-deep-learning-meetup-5/
  • 8. Convolution Layer: the input is a 32x32x3 image (width 32, height 32, depth 3). Slide credit: CS231n Lecture 7. Slide courtesy: Johnson, cs231n lecture 7, http://web.stanford.edu/class/cs20si/lectures/slides_06.pdf
  • 9. Convolution Layer: convolve a 5x5x3 filter with the 32x32x3 image, i.e. “slide over the image spatially, computing dot products”.
  • 10. Convolution Layer: filters always extend the full depth of the input volume.
  • 11. Convolution Layer: each filter position yields 1 number, the result of taking a dot product between the filter and a small 5x5x3 chunk of the image (i.e. a 5*5*3 = 75-dimensional dot product + bias).
  • 12. Convolution Layer: convolving (sliding) the 5x5x3 filter over all spatial locations of the 32x32x3 image gives one 28x28x1 activation map.
  • 13. Convolution Layer: consider a second, green filter; it gives a second 28x28x1 activation map.
  • 14. Convolution Layer: for example, if we had 6 5x5 filters, we’ll get 6 separate activation maps. We stack these up to get a “new image” of size 28x28x6!
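
The sliding dot product described on the convolution slides can be sketched in a few lines of NumPy. This is only an illustration under assumptions not in the slides: random values stand in for the image and filters, stride is 1, there is no padding, and the function name `conv_layer` is made up for this example.

```python
import numpy as np

def conv_layer(image, filters, biases):
    """Valid convolution (stride 1, no padding) as sliding dot products.

    image:   (H, W, C) array
    filters: (K, F, F, C) array, each filter spans the full input depth C
    biases:  (K,) array
    returns: (H - F + 1, W - F + 1, K) stack of activation maps
    """
    H, W, C = image.shape
    K, F, _, _ = filters.shape
    out = np.empty((H - F + 1, W - F + 1, K))
    for i in range(H - F + 1):
        for j in range(W - F + 1):
            patch = image[i:i + F, j:j + F, :]  # F x F x C chunk of the image
            # One F*F*C-dimensional dot product per filter, plus its bias
            out[i, j, :] = np.tensordot(filters, patch, axes=3) + biases
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32, 3))     # stand-in for a 32x32x3 image
filters = rng.standard_normal((6, 5, 5, 3))  # six 5x5x3 filters
biases = np.zeros(6)

activation_maps = conv_layer(image, filters, biases)
print(activation_maps.shape)  # (28, 28, 6)
```

The output is 28 wide because a 5-wide filter fits in 32 - 5 + 1 = 28 positions along each spatial axis.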
  • 20. RECONSTRUCTING CONTENT ➤ Given an image, how can we find a new one with the same content? ➤ Find a content distance measure between images ➤ Start from a random noise image ➤ Minimize the distance through iteration Image courtesy: D. Ulyanov, https://bayesgroup.github.io/bmml_sem/2016/style.pdf
  • 21. CONTENT DISTANCE MEASURE 1. Load a pre-trained CNN (e.g. VGG19) 2. Pass image #1 through the net 3. Save activation maps from conv-layers 4. Pass image #2 through the net 5. Save activation maps from conv-layers 6. Calculate Euclidean distance between activation maps from image #1 and #2 and sum up for all layers: $L_{content}(x, \hat{x}) = \frac{1}{2} \sum_l w_l \left(A^l(x) - A^l(\hat{x})\right)^2$ Image courtesy: Gatys et al., Texture Synthesis Using Convolutional Neural Networks, https://arxiv.org/pdf/1505.07376.pdf
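
Step 6 of the content distance measure can be sketched as follows. This is a rough illustration, not the deck's implementation: no pre-trained network is loaded, random arrays with made-up shapes stand in for the saved activation maps, and the per-layer weights are simply set to 1.

```python
import numpy as np

def content_loss(acts_x, acts_xhat, weights):
    """0.5 * sum over layers l of w_l * sum((A^l(x) - A^l(xhat))**2)."""
    return 0.5 * sum(
        w * np.sum((a - b) ** 2)
        for w, a, b in zip(weights, acts_x, acts_xhat)
    )

rng = np.random.default_rng(0)
# Hypothetical shapes standing in for activation maps saved from two conv layers
shapes = [(28, 28, 6), (12, 12, 16)]
acts_x = [rng.standard_normal(s) for s in shapes]     # from image #1
acts_xhat = [rng.standard_normal(s) for s in shapes]  # from image #2
weights = [1.0, 1.0]                                  # assumed layer weights w_l

loss_same = content_loss(acts_x, acts_x, weights)     # identical activations
loss_diff = content_loss(acts_x, acts_xhat, weights)  # different activations
print(loss_same, loss_diff)
```

The loss is zero only when the two images produce identical activations in every weighted layer.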
  • 22. RECONSTRUCTING CONTENT ➤ Start from random image ➤ Update it using gradient descent: $L_{content}(x, \hat{x}) = \frac{1}{2} \sum_l w_l \left(A^l(x) - A^l(\hat{x})\right)^2$, $\hat{x}_{t+1} = \hat{x}_t - \varepsilon \frac{\partial L_{content}}{\partial \hat{x}}$ Image courtesy: D. Ulyanov, https://bayesgroup.github.io/bmml_sem/2016/style.pdf
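
The update rule can be tried on a toy problem. The sketch below makes a strong simplifying assumption: the "network" is a single random linear map `W`, so the gradient can be written by hand. With a real CNN such as VGG19 the gradient would come from backpropagation instead. All names and sizes here are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 16))  # toy "network": A(x) = W @ x (assumption)
x = rng.standard_normal(16)        # target image, flattened
x_hat = rng.standard_normal(16)    # start from random noise

def features(v):
    return W @ v

def content_loss(x, x_hat):
    return 0.5 * np.sum((features(x) - features(x_hat)) ** 2)

# Step size chosen small enough for this quadratic loss to converge
eps = 1.0 / np.linalg.norm(W, 2) ** 2

initial_loss = content_loss(x, x_hat)
for t in range(500):
    # For the linear toy network: dL/dx_hat = W^T (A(x_hat) - A(x))
    grad = W.T @ (features(x_hat) - features(x))
    x_hat = x_hat - eps * grad     # x_hat_{t+1} = x_hat_t - eps * dL/dx_hat
final_loss = content_loss(x, x_hat)

print(initial_loss, final_loss)
```

In this toy setting the feature map is injective, so the noise image converges to the target itself. With deeper layers of a real network, many different images share nearly the same activations, which is why reconstructions from higher layers lose detail.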
  • 24. Feature Inversion: reconstructions from intermediate layers. Higher layers are less sensitive to changes in color, texture, and shape. Mahendran and Vedaldi, “Understanding Deep Image Representations by Inverting Them”, CVPR 2015. Slide courtesy: Johnson, http://web.stanford.edu/class/cs20si/lectures/slides_06.pdf
  • 25. Feature Inversion: reconstructions from the representation after the last pooling layer (immediately before the first Fully Connected layer). Mahendran and Vedaldi, “Understanding Deep Image Representations by Inverting Them”, CVPR 2015. Slide courtesy: Johnson, http://web.stanford.edu/class/cs20si/lectures/slides_06.pdf
  • 28. Style = texture / local structure. Ignores global semantic content.
  • 29. STYLE DISTANCE MEASURE ➤ Represent style by Gram matrix - pairwise covariance of activation maps ➤ Just the uncentered covariance matrix between vectorized activation maps: $G^l_{ij}(x) = \vec{A}^l_i(x) \cdot \vec{A}^l_j(x)$, $G = \begin{pmatrix} G(A_1, A_1) & \cdots & G(A_1, A_n) \\ \vdots & \ddots & \vdots \\ G(A_n, A_1) & \cdots & G(A_n, A_n) \end{pmatrix}$ Image courtesy: Gatys et al., Texture Synthesis Using Convolutional Neural Networks, https://arxiv.org/pdf/1505.07376.pdf
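
A minimal sketch of the Gram matrix computation, assuming the activation maps of one layer are stored as an array of shape (height, width, number of maps). The function name and the random input are illustrative only.

```python
import numpy as np

def gram_matrix(activations):
    """Gram matrix of activation maps with shape (H, W, N)."""
    H, W, N = activations.shape
    A = activations.reshape(H * W, N).T  # row i is the vectorized map A_i
    return A @ A.T                       # G[i, j] = A_i . A_j

rng = np.random.default_rng(0)
acts = rng.standard_normal((28, 28, 6))  # stand-in for six 28x28 activation maps
G = gram_matrix(acts)
print(G.shape)  # (6, 6)
```

Flattening each map discards where in the image a feature fired, which is why the Gram matrix captures texture but ignores global layout.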
  • 30. STYLE DISTANCE MEASURE ➤ Style loss - Euclidean distance between Gram matrices from two images: $L_{style}(x, \hat{x}) = \frac{1}{2} \sum_l w_l \left(G^l(x) - G^l(\hat{x})\right)^2$ Image courtesy: Gatys et al., Texture Synthesis Using Convolutional Neural Networks, https://arxiv.org/pdf/1505.07376.pdf
  • 31. RECONSTRUCTING STYLE ➤ Start from random image ➤ Update it using gradient descent: $L_{style}(x, \hat{x}) = \frac{1}{2} \sum_l w_l \left(G^l(x) - G^l(\hat{x})\right)^2$, $\hat{x}_{t+1} = \hat{x}_t - \varepsilon \frac{\partial L_{style}}{\partial \hat{x}}$ Image courtesy: D. Ulyanov, https://bayesgroup.github.io/bmml_sem/2016/style.pdf
  • 33. RECONSTRUCTING STYLE Image courtesy: Gatys et al., Texture Synthesis Using Convolutional Neural Networks, https://arxiv.org/pdf/1505.07376.pdf
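  • 34. MATHEMATICAL SIDE NOTE The style loss $L_{style}(x, \hat{x}) = \frac{1}{2} \sum_l w_l \left(G^l(x) - G^l(\hat{x})\right)^2$ is a special case of the square of the Maximum Mean Discrepancy (MMD) with kernel $k(x, \hat{x}) = (x^T \hat{x})^2$: $L^l_{style} = \frac{1}{Z^l_k} \mathrm{MMD}^2(A^l(x), A^l(\hat{x}))$, where $\mathrm{MMD}^2(A^l(x), A^l(\hat{x})) = \left\| E[\phi(A^l(x))] - E[\phi(A^l(\hat{x}))] \right\|^2$. Written out with the kernel: $L^l_{style} = \frac{1}{Z^l_k} \sum_{i=1}^{M_l} \sum_{j=1}^{M_l} \left( k(A^l_{:,i}, A^l_{:,j}) + k(\hat{A}^l_{:,i}, \hat{A}^l_{:,j}) - 2k(A^l_{:,i}, \hat{A}^l_{:,j}) \right)$, with constant factors absorbed into the normalization $Z^l_k$. Further reading: Demystifying Neural Style Transfer, Li et al.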
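
The link between the Gram-matrix distance and the kernel sum can be checked numerically for a single layer. The sketch below assumes activations stored as an N x M matrix (N maps, M spatial positions) with random values, leaves out the normalization constant, and uses the second-order polynomial kernel k(u, v) = (u^T v)^2. Note that the cross term enters with a minus sign.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 4, 10                         # N activation maps, M positions (toy sizes)
A = rng.standard_normal((N, M))      # A[:, i]: features of x at position i
A_hat = rng.standard_normal((N, M))  # same for x_hat

def kernel_sum(P, Q):
    """Sum over i, j of k(P[:, i], Q[:, j]) with k(u, v) = (u^T v)**2."""
    return np.sum((P.T @ Q) ** 2)

# Squared Euclidean (Frobenius) distance between the two Gram matrices
gram_distance = np.sum((A @ A.T - A_hat @ A_hat.T) ** 2)

# Kernel form: k(A, A) + k(A_hat, A_hat) - 2 k(A, A_hat), summed over i and j
kernel_form = (kernel_sum(A, A) + kernel_sum(A_hat, A_hat)
               - 2 * kernel_sum(A, A_hat))

print(gram_distance, kernel_form)
```

The two numbers agree up to floating point error, so matching Gram matrices amounts to matching the feature distributions of the two images under this kernel.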
  • 35. STYLE TRANSFER: Content + Style = Desired output. $L_{total}(x, \hat{x}) = \alpha L_{content}(x, \hat{x}) + \beta L_{style}(x, \hat{x})$ Image courtesy: https://github.com/jcjohnson/neural-style
  • 39. TOTAL VARIATION LOSS $L_{TV} = \sum_{i,j} (v_{i+1,j} - v_{i,j})^2 + (v_{i,j+1} - v_{i,j})^2$
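
A short sketch of the total variation loss for a single-channel image. How the image border is handled is an assumption made here: differences that would reach outside the image are simply left out of the sum.

```python
import numpy as np

def total_variation_loss(v):
    """Sum of squared differences between neighbouring pixels of a 2D image."""
    dh = v[1:, :] - v[:-1, :]  # v[i+1, j] - v[i, j]
    dw = v[:, 1:] - v[:, :-1]  # v[i, j+1] - v[i, j]
    return np.sum(dh ** 2) + np.sum(dw ** 2)

flat = np.ones((4, 4))                        # constant image
checker = np.indices((4, 4)).sum(axis=0) % 2  # 0/1 checkerboard

print(total_variation_loss(flat))     # 0.0
print(total_variation_loss(checker))  # 24
```

A constant image has zero loss, while the checkerboard, where every neighbouring pair differs, is penalized the most. Adding this term to the objective pushes the generated image toward smoothness.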
  • 40. PERCEPTUAL LOSSES FOR REAL-TIME STYLE TRANSFER AND SUPER-RESOLUTION ➤ Train a network to do the optimization ➤ + Fast ➤ - One network per style ➤ - Quantitatively slightly worse Image courtesy: Johnson et al., Perceptual Losses for Real-Time Style Transfer and Super-Resolution, https://arxiv.org/abs/1603.08155
  • 41. ARBITRARY STYLE TRANSFER IN REAL-TIME WITH ADAPTIVE INSTANCE NORMALIZATION $\mathrm{AdaIN}(x_c, x_s) = \sigma(x_s) \left( \frac{x_c - \mu(x_c)}{\sigma(x_c)} \right) + \mu(x_s)$ ➤ Align mean and variance for activation maps ➤ + Fast (15 fps, 512x512px) ➤ + One net, arbitrary style ➤ - Quantitatively slightly worse Image courtesy: Huang et al., Arbitrary style transfer in real-time with adaptive instance normalization
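
The AdaIN formula can be sketched directly in NumPy. Two things here are assumptions rather than part of the slide: statistics are computed per activation map over the spatial axes of an (H, W, N) array, and a small `eps` is added to the denominator to avoid division by zero. The inputs are random stand-ins for content and style activations.

```python
import numpy as np

def adain(x_c, x_s, eps=1e-5):
    """Adaptive instance normalization for activations of shape (H, W, N).

    Mean and standard deviation are taken per activation map (over H and W).
    eps is an assumed small constant guarding against division by zero.
    """
    mu_c = x_c.mean(axis=(0, 1), keepdims=True)
    sigma_c = x_c.std(axis=(0, 1), keepdims=True)
    mu_s = x_s.mean(axis=(0, 1), keepdims=True)
    sigma_s = x_s.std(axis=(0, 1), keepdims=True)
    # Normalize the content activations, then rescale and shift to style stats
    return sigma_s * (x_c - mu_c) / (sigma_c + eps) + mu_s

rng = np.random.default_rng(0)
content_acts = rng.standard_normal((16, 16, 8))
style_acts = 3.0 * rng.standard_normal((16, 16, 8)) + 2.0

out = adain(content_acts, style_acts)
print(out.shape)  # (16, 16, 8)
```

After the call, each output map has (up to the effect of `eps`) the mean and standard deviation of the corresponding style map, while its spatial pattern still comes from the content activations.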