Implementing Neural Style Transfer
Authors: Tasmiah Tahsin Mayeesha, Ahraf Sharif, Hashmir Rahsan Toron
Electrical and Computer Engineering Department,
North South University
Abstract— This technical report implements the
neural style transfer method introduced by Gatys
et al. in the paper “A Neural Algorithm of Artistic
Style” and compares different optimization
techniques and a variety of input data.
Keywords—machine learning, deep learning,
convolutional neural networks, neural style transfer,
computer vision
I. INTRODUCTION
“Imitation is the sincerest form of
flattery,” said Charles Caleb Colton, a 19th-century
English cleric and writer. Artists have always built
on the works of earlier artists to push the frontier
of human imagination forward. Art has been used
for depicting religious figures, spreading political
propaganda, inspiring protests, communicating
humor through cartoons, and preserving history
through portraits. But until now, art has always
been created by humans for consumption by other
humans. A significant difference between humans
and machines is that only humans can imagine and
create new art in the form of paintings, books, and
songs.
However, machine learning has now advanced
to the point where machines can create art
themselves. Deep neural networks can separate the
style and the content of images and impose the
style of one image on the content of another,
producing a completely new image that draws on
both inputs. This is the key idea of neural style
transfer by Gatys et al., which uses convolutional
neural networks for the task. This paper describes
our implementation of the style transfer technique
within the computational constraints we faced.
II. RELATED WORK
Transferring the style of one image to
another is an interesting yet difficult problem.
There have been many efforts to develop efficient
methods for automatic style transfer [Hertzmann
et al., 2001; Efros and Freeman, 2001]. Recently,
Gatys et al. proposed a seminal work [Gatys et al.,
2016]: it captures the style of artistic images and
transfers it to other images using Convolutional
Neural Networks (CNNs). This work formulated
the problem as finding an image that matches both
the content and style statistics based on the neural
activations of each layer in the CNN. Further
improvements have since been proposed in papers
such as “Perceptual Losses for Real-Time Style
Transfer and Super-Resolution” by Johnson et al.
(2016).
III. METHODOLOGY
Mathematically, given a content image c
and a style image s, we want to generate an output
image x that takes its texture and color from s and
its content from c. Following Gatys et al., we can
pose this as an optimization problem:

x* = argmin_x ( α·L_content(c, x) + β·L_style(s, x) )

Here α is the weight of the content loss and β is
the weight of the style loss. We want to find the
output image x that minimizes the loss, i.e. differs
as little as possible from c in content and from s in
style.
A. Algorithms and Techniques
To generate the output image with the
neural style transfer technique, another method
called transfer learning is used. Transfer learning
refers to using the weights of a network pretrained
on one task (here, the ImageNet dataset) for some
other task the network was not originally trained
for. For example, the ImageNet challenge is a
1000-class classification problem, but it is possible
to take the weights of a network trained on it and
reuse them for a binary classification problem by
replacing the final softmax layer.
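For illustration, a minimal transfer-learning sketch is shown below, assuming a Keras-style API; the added dense layers and the binary head are hypothetical and only demonstrate the idea of replacing the final classifier on top of frozen pretrained features.

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

# Reuse ImageNet-pretrained convolutional features and freeze them.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False

# Replace the original 1000-way softmax with a small binary-classification head.
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```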
When a convolutional neural network is
trained for a classification task, its convolutional
layers learn feature representations of the input
images. As we go deeper into the network, the
higher convolutional layers capture the high-level
content and arrangement of the input image
without preserving exact pixel values, while the
lower layers capture local features such as textures
and colors. Thus, in a sense, the style and the
content of an image are separable.
In this work we use a CNN called VGG16,
released by Oxford’s Visual Geometry Group in
2014. We use this network to obtain the feature
representations of the images and use them to
define the loss score and gradients, which in turn
update a randomly generated image to minimize
the loss. The architecture of the network is shown
in the following diagram:
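As a sketch of how these feature representations can be read off, the snippet below builds a small Keras model that returns the activations of a set of VGG16 layers; the specific layer names are typical choices for style transfer and are assumptions rather than the exact configuration of our implementation.

```python
import tensorflow as tf
from tensorflow.keras.applications import VGG16

# Layers commonly used for style and content in neural style transfer (assumed choices).
STYLE_LAYERS = ["block1_conv1", "block2_conv1", "block3_conv1",
                "block4_conv1", "block5_conv1"]
CONTENT_LAYER = "block4_conv2"

vgg = VGG16(weights="imagenet", include_top=False)
vgg.trainable = False

# A model mapping an input image to the activations of the selected layers.
outputs = [vgg.get_layer(name).output for name in STYLE_LAYERS + [CONTENT_LAYER]]
feature_model = tf.keras.Model(inputs=vgg.input, outputs=outputs)
# feature_model(image) returns the feature maps used to build the loss terms.
```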
B. Data Preprocessing
Before using the VGG-16 network on our
images to extract feature representations, we need
to preprocess them as in the original paper.
For this, the following transforms were
applied:
1. Subtraction of the mean RGB value
(computed on the ImageNet training set) from each
pixel.
2. Flipping the ordering of the multi-
dimensional array from RGB to BGR (the ordering
used in the paper).
Due to memory constraints we also resized
the images to 224 x 224, since larger images mean
more parameters to tune. A 224 x 224 image with 3
channels (R, G, B) already gives 224 x 224 x 3 =
150,528 parameters to optimize in the combined
image.
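A minimal preprocessing sketch is shown below, assuming NumPy and Pillow; the mean values follow the original VGG release, but the function itself is illustrative rather than our verbatim implementation.

```python
import numpy as np
from PIL import Image

# Mean RGB of the ImageNet training set, as published with the original VGG models.
IMAGENET_MEAN_RGB = np.array([123.68, 116.779, 103.939], dtype=np.float32)

def preprocess(path, size=(224, 224)):
    """Resize, subtract the ImageNet mean, and flip RGB -> BGR."""
    img = Image.open(path).convert("RGB").resize(size)
    arr = np.asarray(img, dtype=np.float32)
    arr -= IMAGENET_MEAN_RGB          # 1. mean subtraction per channel
    arr = arr[:, :, ::-1]             # 2. RGB -> BGR, the ordering VGG expects
    return arr[np.newaxis, ...]       # add a batch dimension: (1, 224, 224, 3)
```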
C. Loss Function
A loss (or cost) function in machine learning
scores an algorithm by comparing its generated
output with the expected output. For neural style
transfer the output is an image that should contain
both the style of the style image and the content of
the content image as much as possible.
The loss function outputs a score indicating
how close the generated image is to the style image
in style and to the content image in content. Unlike
image classification, where the loss is used to
update the weights of the network after comparing
predictions with the true classes, the loss score in
neural style transfer is used to update the pixels of
the generated image with stochastic gradient
descent or another optimizer.
Since the loss function has to measure both
the style loss and the content loss, we can write it
in the following way:

Loss = α·L_content(c, x) + β·L_style(s, x)
The content loss is the mean squared error
between the feature representations of the content
image and the combination image. The style loss is
the scaled, squared Frobenius norm of the difference
between the Gram matrices of the style and
combination images. A Gram matrix is the matrix
formed by multiplying a matrix with its own
transpose.
In order to denoise the result images we also
add a total variation loss, which reduces noise and
was introduced in the paper “Understanding Deep
Image Representations by Inverting Them” by
Mahendran and Vedaldi (2014). Thus the loss
function is the summation of these three terms.
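The sketch below shows one way the three loss terms can be written with TensorFlow tensors; the scaling constants follow Gatys et al. and the Keras style-transfer example, and the exact normalization in our implementation may differ slightly.

```python
import tensorflow as tf

def content_loss(content_feat, combo_feat):
    # Mean squared error between the content and combination feature maps.
    return tf.reduce_mean(tf.square(combo_feat - content_feat))

def gram_matrix(feat):
    # Flatten the spatial dimensions, then multiply the feature matrix by its transpose.
    channels = int(feat.shape[-1])
    f = tf.reshape(feat, (-1, channels))
    return tf.matmul(f, f, transpose_a=True)

def style_loss(style_feat, combo_feat, height=224, width=224):
    # Scaled, squared Frobenius norm of the difference between Gram matrices.
    channels = int(style_feat.shape[-1])
    s, c = gram_matrix(style_feat), gram_matrix(combo_feat)
    return tf.reduce_sum(tf.square(s - c)) / (4.0 * (channels ** 2) * ((height * width) ** 2))

def total_variation_loss(img):
    # Penalize differences between neighboring pixels to encourage smooth outputs.
    a = tf.square(img[:, :-1, :-1, :] - img[:, 1:, :-1, :])
    b = tf.square(img[:, :-1, :-1, :] - img[:, :-1, 1:, :])
    return tf.reduce_sum(tf.pow(a + b, 1.25))

# Total loss = alpha * content loss + beta * style loss + gamma * total variation loss.
```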
IV. MODEL TRAINING AND
EVALUATION
A. Model Training
The combination image is initialized as a
random collection of pixels. We then use the
L-BFGS algorithm (a quasi-Newton method that
converges significantly faster than standard
gradient descent) to iteratively improve it.
To train the model we pass the content
image, style image and combination image through
the VGG-16 network, extract features, and evaluate
the loss functions. In each iteration we measure the
loss and update the combination image accordingly.
Each iteration took around 5 minutes on a machine
with 4 GB of RAM, but we expect a significant
speed-up using GPUs.
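The optimization loop can be sketched as below with SciPy’s L-BFGS routine; `evaluate_loss_and_gradient` is a hypothetical helper standing in for the step that runs the image through the VGG-16 feature model and returns the total loss and its gradient with respect to the pixels.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def loss_and_grads(x_flat):
    # Reshape the flat pixel vector, evaluate loss and gradient, and flatten back.
    x = x_flat.reshape((1, 224, 224, 3)).astype(np.float32)
    loss_value, grad_value = evaluate_loss_and_gradient(x)   # assumed helper
    return float(loss_value), grad_value.flatten().astype(np.float64)

# Start from random noise and refine it for a few epochs.
x = (np.random.uniform(0, 255, (1, 224, 224, 3)) - 128.0).flatten()
for epoch in range(5):
    x, min_loss, _ = fmin_l_bfgs_b(loss_and_grads, x, maxfun=20)
    print(f"Epoch {epoch + 1}: loss = {min_loss:.4e}")
```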
B. Model Evaluation
The training loss for the generated image was
measured with the loss function described above,
using the L-BFGS optimizer.
Training loss per epoch:

Epoch    Loss (x 1e10)    Time (s)
1        5.849            284
2        2.633            312
3        2.032            275
4        1.801            279
5        1.610            272

Graph of the loss function (figure omitted).
As we were able to obtain good-quality
images after only 5 epochs, we did not increase the
number of training iterations. Using a content
image of the Jatio Songsod Bhaban of Bangladesh
and a style image of the impressionist painting
“Forest” by the artist Leonid Afremov, we were
able to generate the output shown below:
We can also experiment with different
content, style and total variation loss weights to see
how the outputs differ. This image of Savar uses
the following parameters: α = 0.025, β = 5, γ = 1.
Here the style weight is much larger than the
content weight, so the output image looks a lot like
Van Gogh’s Starry Night. But if we change the
parameters to α = 4, β = 2, γ = 1, the output
changes considerably: since the content weight is
now larger than the style weight, the output looks
much more like the original image, with only a
light stylistic filter applied.
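As a hypothetical usage example, re-running the same pipeline with different weights might look like the following; `run_style_transfer` is a stand-in name for the optimization loop sketched earlier, not a function from our code or any library.

```python
# Style-dominated result (style weight much larger than content weight).
stylized_a = run_style_transfer(content_img, style_img, alpha=0.025, beta=5.0, gamma=1.0)

# Content-dominated result (content weight larger than style weight).
stylized_b = run_style_transfer(content_img, style_img, alpha=4.0, beta=2.0, gamma=1.0)
```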
V. IMPROVEMENTS AND DEPLOYMENT
A. Improvements
1. Variation of Parameters
We will try different variations of the
parameters by changing the input images, their
sizes, the weights of the different loss functions,
and the features used to construct them, and
compare the results given our computational
constraints. As our laptop has only 4 GB of
memory, we have so far been unable to run the
algorithm on images larger than 224 x 224. Deep
learning algorithms are best run on a GPU, which
we do not yet have.
2. Speed Optimization
As the current process is very slow, we plan
to replace our implementation with an image
transformation CNN and implement the fast style
transfer method described in the perceptual-loss
paper by Johnson et al. (2016). This is reported to
give roughly a 1000x speed-up over the present
optimization-based implementation, making it
suitable for a web app.
B. Deployment
The preferred outcome of this project
would have been a deployed application
implemented in Python that lets users create new
images in real time, styled with traditional
Bangladeshi paintings, via neural style transfer.
However, because of the complexity of the
technique, so far we have implemented only the
backend, using the basic neural style transfer
technique described above.
The front-end design for such an app has
also been developed. Prototype front-end designs
are attached below.
REFERENCES
1. Gatys, L. A., Ecker, A. S., and Bethge, M. “A
Neural Algorithm of Artistic Style.”
https://arxiv.org/pdf/1508.06576.pdf (first neural
style transfer paper)
2. Johnson, J., Alahi, A., and Fei-Fei, L. “Perceptual
Losses for Real-Time Style Transfer and Super-
Resolution.” ECCV 2016.
https://arxiv.org/pdf/1603.08155.pdf
3. Application: Prisma
4. Course.fast.ai
5. Mahendran, A., and Vedaldi, A. “Understanding
Deep Image Representations by Inverting Them.”
https://arxiv.org/abs/1412.0035