2022. 06. 03.
A Style-Based Generator Architecture for Generative
Adversarial Networks
Tero Karras, Samuli Laine, Timo Aila
CVPR 2019
Hyunwook Lee
Contents
• Overview
• Preliminaries
• Disentangled Representation
• Various Normalizations in the Deep Learning Domain
• StyleGAN
• Disentanglement of StyleGAN
• AdaIN in the StyleGAN
• Applications of StyleGAN
• Case: Music Generation
• Conclusion
Overview: What is StyleGAN?
(a) Traditional generator and (b) StyleGAN generator
One of the most famous GANs for image synthesis
Automatic, unsupervised separation of high-level attributes
Control image synthesis inspired by style transfer
Scale-specific mixing and interpolation
[Figure legend: learnable operation, AdaIN]
Overview: Examples with StyleGAN
• StyleGAN enables scale-specific styling
• The styles at each layer affect only the corresponding scale of the image
• Coarse – pose, general hair style, face shape, eyeglasses
• Middle – small facial features, hair style, eyes
• Fine – color scheme and microstructure
• How is this styling achieved?
→ AdaIN from style transfer & disentangled representation!
Preliminaries: Disentangled Representation
Entangled and disentangled representation
A form of unsupervised representation learning in generative modeling
Preliminaries: Disentangled Representation
Training GANs in the traditional way
→ The latent z becomes a kind of feature vector (i.e., a representation)
Train & Test
• Changing the latent z along an arbitrary dimension changes two or more features → these features are entangled!
• This degrades the interpretability and controllability of the generation process
→ Each latent dimension should correspond to one “independent feature”
Preliminaries: Disentangled Representation
• Based on the manifold hypothesis
• Real-world high-dimensional data lie on low-dimensional manifolds embedded within the high-dimensional space
• A unit Gaussian is not enough to represent image manifolds
• Images can be badly reconstructed
• Goal: a latent space that consists of linear subspaces, each of which controls one factor of variation
• Reading materials:
• InfoGAN, β-VAE, Spatial CBN, LAPGAN,…
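A toy sketch of the entangled vs. disentangled idea above, assuming a purely linear "generator" (features = M @ z). The matrices and the `changed_features` helper are hypothetical and only illustrate the definition; they are not taken from the paper.

```python
import numpy as np

# Hypothetical toy generators: features = M @ z.
M_disentangled = np.diag([2.0, -1.0, 0.5])       # each latent dim drives one feature
M_entangled = np.array([[1.0, 0.8, 0.0],
                        [0.5, 1.0, 0.3],
                        [0.0, 0.7, 1.0]])        # latent dims are mixed together

def changed_features(M, dim, step=1.0, tol=1e-9):
    """Indices of the output features that change when latent `dim` is moved."""
    z = np.zeros(M.shape[1])
    z_moved = z.copy()
    z_moved[dim] += step
    delta = M @ z_moved - M @ z
    return np.flatnonzero(np.abs(delta) > tol).tolist()

print(changed_features(M_disentangled, 1))  # a single feature index
print(changed_features(M_entangled, 1))     # several feature indices
```

In the disentangled case one latent dimension moves one feature; in the entangled case the same move changes several features at once, which is what hurts controllability.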
Various Normalization in Deep Learning
• Commonly used in most deep learning models
• Main idea: normalize each layer's input → every layer sees the same or a similar input distribution
[Figure: normalization variants; panel captions: mean/std among minibatch, mean/std among channels, mean/std in minibatch, mean/std in minibatch]
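A minimal NumPy sketch of how common normalization variants differ, assuming the usual (N, C, H, W) feature layout: the only difference is the set of axes over which mean/std are computed. The tensor sizes and the `normalize` helper are illustrative assumptions, and the learned scale/shift parameters that real layers usually carry are left out.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(8, 4, 5, 5))  # (N, C, H, W), made-up sizes

def normalize(x, axes, eps=1e-5):
    """Subtract the mean and divide by the std computed over `axes`."""
    mean = x.mean(axis=axes, keepdims=True)
    std = x.std(axis=axes, keepdims=True)
    return (x - mean) / (std + eps)

batch_norm = normalize(x, (0, 2, 3))   # statistics per channel, shared across the batch
layer_norm = normalize(x, (1, 2, 3))   # statistics per sample, shared across channels
instance_norm = normalize(x, (2, 3))   # statistics per sample and per channel
```

Instance normalization is the variant that matters for the following slides, since it removes per-image, per-channel statistics.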
Instance Normalization in Style Transfer
• Convolutional feature statistics of a DNN can capture the style of an image
• Recent work shows that channel-wise mean/variance are effective for style transfer
→ Instance Normalization can be seen as a form of style normalization!
Adaptive Instance Normalization
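• Given a content image x and a style image y, the stylized features are obtained by:
AdaIN(x, y) = σ(y) · (x − μ(x)) / σ(x) + μ(y), where μ and σ are the channel-wise mean and standard deviation
• Normalize x (remove the style of the content) → denormalize x with the style of y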
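The normalize-then-denormalize step can be sketched in NumPy as follows. This is a minimal sketch that assumes feature maps of shape (C, H, W); the `adain` function name, the `eps` term, and the random stand-in features are assumptions made for illustration.

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """AdaIN on feature maps of shape (C, H, W): align channel-wise mean/std."""
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True)
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True)
    normalized = (content - c_mean) / (c_std + eps)  # remove the content's own style
    return s_std * normalized + s_mean               # re-apply the style statistics

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(3, 16, 16))  # stand-in content features
y = rng.normal(5.0, 3.0, size=(3, 8, 8))    # stand-in style features (spatial size may differ)
out = adain(x, y)
```

After the call, each channel of `out` has (approximately) the mean and standard deviation of the matching channel of `y`, while keeping the spatial layout of `x`.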
Style-based Generator (StyleGAN)
How do they achieve a disentangled representation?
How do they design the generator as a form of style transfer?
Why does each layer need a noise input?
(a) Traditional generator and (b) StyleGAN generator
StyleGAN: Disentangled Representation
• Latent space disentanglement is a crucial part of both style transfer and generative models
• Hard to achieve by direct mapping (b in the lower figure)
• StyleGAN generates a disentangled intermediate latent space 𝒲
• Not a fixed distribution, but a learned mapping
• Spatially invariant, modified by the affine transformation A
• Generating images from a disentangled representation is much easier than from an entangled one
→ training should therefore push the mapping network toward a disentangled representation
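A rough sketch of the mapping from z to the intermediate latent w, assuming a small MLP with leaky ReLU activations. The paper describes an 8-layer MLP with 512-dimensional z and w; the tiny sizes, the random weights, and the exact form of the input normalization below are assumptions chosen to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, NUM_LAYERS = 16, 4  # illustrative sizes only

# Hypothetical randomly initialised weights standing in for trained parameters.
weights = [rng.normal(0.0, np.sqrt(2.0 / LATENT_DIM), size=(LATENT_DIM, LATENT_DIM))
           for _ in range(NUM_LAYERS)]
biases = [np.zeros(LATENT_DIM) for _ in range(NUM_LAYERS)]

def mapping_network(z):
    """Map z of shape (batch, LATENT_DIM) to intermediate latents w of the same shape."""
    # Normalize each latent to unit RMS before the fully connected layers.
    h = z / np.sqrt(np.mean(z ** 2, axis=1, keepdims=True) + 1e-8)
    for W, b in zip(weights, biases):
        h = h @ W + b
        h = np.where(h > 0, h, 0.2 * h)  # leaky ReLU
    return h

z = rng.normal(size=(5, LATENT_DIM))  # samples from the fixed Gaussian prior
w = mapping_network(z)                # learned, non-fixed distribution over w
```

The point of the design is that w is not forced to follow the fixed prior of z, so the learned mapping is free to "unwarp" the latent space.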
StyleGAN: AdaIN as the Styling Method
• Through the affine transformation A, the vector w becomes the style y = (ys, yb)
• ys is the style scale (standard deviation) and yb is the style bias (mean)
• Step-by-Step
• Input x is normalized with Instance Normalization
• Effectively localizes the styles
• Denormalized by ys and yb
• The actual multiplier is ys + 1, so the scale is centered around one; this keeps the multiplier positive only while ys > −1
• Forward to the next layer
• Note: scale-specific styling is only possible when the network output can be separated gradually, scale by scale
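The step-by-step list above can be sketched for a single layer as follows. This is a minimal sketch under stated assumptions: `A_weight` and `A_bias` are hypothetical stand-ins for the learned affine transformation A, the sizes are made up, and `style_adain` is an illustrative name rather than a function from any library.

```python
import numpy as np

rng = np.random.default_rng(0)
W_DIM, CHANNELS = 8, 4  # illustrative sizes

# Hypothetical parameters of the affine transformation A for one layer.
A_weight = rng.normal(0.0, 0.1, size=(W_DIM, 2 * CHANNELS))
A_bias = np.zeros(2 * CHANNELS)

def style_adain(x, w, eps=1e-8):
    """x: feature maps (C, H, W); w: intermediate latent of shape (W_DIM,)."""
    style = w @ A_weight + A_bias                    # y = A(w)
    y_s, y_b = style[:CHANNELS], style[CHANNELS:]    # per-channel scale and bias
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    x_norm = (x - mean) / (std + eps)                # instance normalization
    return (y_s[:, None, None] + 1.0) * x_norm + y_b[:, None, None]

x = rng.normal(2.0, 3.0, size=(CHANNELS, 6, 6))
w = rng.normal(size=W_DIM)
out = style_adain(x, w)
```

Because x is re-normalized before every style is applied, each style overrides the statistics left by the previous one, which is why a style only controls one layer.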
StyleGAN: Style Mixing
• Encourage the styles to localize by Style Mixing during training
• Simply,
• Run two different latent codes z1, z2 through the mapping network
• Switch between the corresponding intermediate latents w1, w2 at a randomly selected point in the synthesis network 𝑔
• This prevents the network from assuming that adjacent styles are correlated
→ more localized, scale-specific modification!
StyleGAN: Style Mixing
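A small sketch of the mixing step, assuming the per-layer styles are simply stacked into one array with one row per style input. The 18 style inputs correspond to a 1024x1024 generator (two per resolution); `W_DIM`, the random stand-ins for w1 and w2, and the `mix_styles` helper are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_STYLE_LAYERS, W_DIM = 18, 8  # W_DIM is illustrative

def mix_styles(w1, w2, crossover):
    """Per-layer styles: w1 for layers before `crossover`, w2 from `crossover` on."""
    layer_idx = np.arange(NUM_STYLE_LAYERS)[:, None]
    return np.where(layer_idx < crossover, w1[None, :], w2[None, :])

w1 = rng.normal(size=W_DIM)  # stand-ins for the mapping-network outputs of z1, z2
w2 = rng.normal(size=W_DIM)
crossover = int(rng.integers(1, NUM_STYLE_LAYERS))  # random switch point in 1..17
per_layer_w = mix_styles(w1, w2, crossover)         # shape (18, W_DIM)
```

Choosing a small `crossover` copies almost everything from w2 and keeps only the coarsest styles of w1; a large `crossover` changes only the fine styles.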
Style Mixing at the Coarse Level (4² – 8²)
Style Mixing at the Middle Level (16² – 32²)
• Brings smaller-scale facial features, hair style, eyes open / closed, …
Style Mixing at the Fine Level (64² – 1024²)
• Mainly brings the color scheme and microstructure
• Doesn’t change coarse / middle styles
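For reference, the coarse / middle / fine split can be written out per style input. This sketch assumes a 1024x1024 generator with two style inputs per resolution (18 in total); the `scale_group` helper and the tuple layout are illustrative choices.

```python
# Resolutions of the synthesis network: 4, 8, 16, ..., 1024.
resolutions = [4 * 2 ** i for i in range(9)]

def scale_group(resolution):
    """Coarse: 4-8, middle: 16-32, fine: 64-1024 (as on the slides)."""
    if resolution <= 8:
        return "coarse"
    if resolution <= 32:
        return "middle"
    return "fine"

# Two style inputs per resolution: (layer index, resolution, group).
style_layers = []
for res in resolutions:
    for _ in range(2):
        style_layers.append((len(style_layers), res, scale_group(res)))

for idx, res, group in style_layers:
    print(f"style input {idx:2d}: {res}x{res} ({group})")
```

Under these assumptions, style inputs 0-3 are coarse, 4-7 are middle, and 8-17 are fine.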
StyleGAN: Stochastic Variation
• Traditional GANs achieve stochastic variation by…
• Generating spatially-varying pseudorandom numbers inside the network
• Consumes network capacity
• Not always successful
• In StyleGAN…
• Introduce random noise at the layer level
• Hypothesis: at any point in the generator, there is pressure to introduce new content as soon as possible
• To fool the discriminator
• The easiest way: introduce new random noise at each layer → variation comes from the random noise
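A minimal sketch of per-layer noise injection, assuming one single-channel Gaussian noise image per layer that is broadcast to all feature maps with a learned per-channel scaling factor. The `add_noise` name and the example scaling values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(x, noise_scale, rng):
    """x: (C, H, W). One noise image is shared by all channels, scaled per channel."""
    _, h, w = x.shape
    noise = rng.normal(size=(1, h, w))           # fresh noise on every call
    return x + noise_scale[:, None, None] * noise

x = np.zeros((3, 4, 4))                          # stand-in feature maps
noise_scale = np.array([0.0, 0.5, 1.0])          # hypothetical learned per-channel factors
out = add_noise(x, noise_scale, rng)
```

A channel with scaling factor zero ignores the noise entirely, so the network can decide per feature map how much stochastic detail to use.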
StyleGAN: Stochastic Variation
• The main areas of stochastic variation are
• The hair
• Silhouettes
• Parts of the background
→ The noise doesn’t affect global aspects!
StyleGAN: Water Droplet-like Artifacts
Advances of StyleGAN: StyleGAN2
Advances of StyleGAN: StyleGAN2
Phase artifacts in StyleGAN
Examples of unnatural images w/ StyleGAN
• StyleGAN (left) has a texture sticking problem due to progressive growing
• The image at each scale is generated by the corresponding generator, independently
• Adopting a ResNet architecture to solve the problem
• Note: not perfectly solved – it’ll be discussed in StyleGAN3
Examples of natural images w/ StyleGAN2
Advances of StyleGAN: StyleGAN3
• StyleGAN2 (left) has not perfectly solved the texture sticking problem
• (left) Averaging images generated from small changes of the latent should blur the central image
• (left) But in StyleGAN2, details stick to the same pixel coordinates
Advances of StyleGAN: StyleGAN3
Applications of StyleGAN: Image Domain
• InterFaceGAN
• Extract linear editing directions through attribute-level supervision
• StyleFlow
• First to present edits that remain stable when composed
• Normalizing flows and attribute-level supervision
• DyStyle
• Addresses compositional editing directly
• Accurate, elaborate, and diverse editing
• StyleCLIP
• Free-form textual editing w/ a pretrained visual-linguistic model
• Pose with Style
• Human pose supervision to edit body poses and clothing
• StyleMapGAN
• Localized editing by augmenting StyleGAN’s architecture
w/ spatially adaptive modulation
Applications of StyleGAN: StyleMapGAN
Applications of StyleGAN: StyleMapGAN
• Localized editing by augmenting StyleGAN’s
architecture w/ spatially adaptive modulation
• Localized editing conducted with
Music Generation: Recent Works w/ and w/o GANs
• Style-Conditioned Music Generation
• Style transfer-like methods in music generation w/ LSTM-based GANs
• Builds a style codebook that decides the overall style of the music
• Symbolic Music Generation with Transformer-GANs
• Compound Word Transformer: Learning to Compose Full-Song Music over
Dynamic Directed Hypergraphs
• Transformer-based music generation model
Music Generation: Why is StyleGAN hard to utilize?
• The main “scale-specific controllability” of StyleGAN comes from the stacked CNNs of various sizes
→ To utilize StyleGAN, the output should be separable into scales
• A music composition should be extendable indefinitely
→ CNNs cannot be utilized in the temporal dimension
• Separation of the musical components (e.g., Motive – Phrase – Period)
→ Hard to model like CNNs (an intuitive separation of the components is hard)
→ The components share an overall flow → separation causes incoherent music
• Separation of the MIDI components (e.g., bar – beat - …)
→ Using CNNs to combine them can cause information loss (e.g., structured tokens)
• Too many additional features to consider
• StyleGAN and image processing → no other input or consideration except the image
• Music has a bunch of extra features, such as the instrument
Conclusion
• The “scale-specific controllability” of StyleGAN comes from the stacked CNNs of various sizes
→ To utilize StyleGAN, the target domain output should be separable
(e.g., 4x4 → 8x8 → 16x16 → 32x32 → … → 1024x1024)
• It may be usable in GUI design, but it would be more like “conditioned image synthesis regardless of the structure”
• If we utilize StyleGAN in GUI design, we should be able to defend…
• Why do we ignore the structures?
• Isn’t this design a combination of existing designs?
Thank you

 
Augmenting Decisions of Taxi Drivers through Reinforcement Learning for Impro...
Augmenting Decisions of Taxi Drivers through Reinforcement Learning for Impro...Augmenting Decisions of Taxi Drivers through Reinforcement Learning for Impro...
Augmenting Decisions of Taxi Drivers through Reinforcement Learning for Impro...
 
Natural Language to Visualization by Neural Machine Translation
Natural Language to Visualization by Neural Machine TranslationNatural Language to Visualization by Neural Machine Translation
Natural Language to Visualization by Neural Machine Translation
 
Recommending What Video to Watch Next: A Multitask Ranking System
Recommending What Video to Watch Next: A Multitask Ranking SystemRecommending What Video to Watch Next: A Multitask Ranking System
Recommending What Video to Watch Next: A Multitask Ranking System
 

A Style-Based Generator Architecture for Generative Adversarial Networks

  • 1. A Style-Based Generator Architecture for Generative Adversarial Networks
      • Tero Karras, Samuli Laine, Timo Aila (CVPR 2019)
      • Hyunwook Lee, 2022. 06. 03.
  • 2. Contents
      • Overview
      • Preliminaries
      • Disentangled Representation
      • Various Normalization in Deep Learning Domain
      • StyleGAN
      • Disentanglement of StyleGAN
      • AdaIN in the StyleGAN
      • Applications of StyleGAN
      • Case: Music Generation
      • Conclusion
  • 3. Overview: What is StyleGAN?
      • One of the best-known GANs for image synthesis
      • Automatic, unsupervised separation of high-level attributes
      • Controls image synthesis with a design inspired by style transfer
      • Scale-specific mixing and interpolation
      • Figure: (a) traditional generator and (b) StyleGAN generator (legend: learnable operation, AdaIN)
  • 4. Overview: Examples with StyleGAN
      • StyleGAN enables scale-specific styling
      • The style fed to each layer affects only the corresponding scale
          • Coarse: pose, general hair style, face, eyeglasses
          • Middle: small facial features, hair style, eyes
          • Fine: color scheme and microstructure
      • How is this styling achieved? → AdaIN from style transfer, plus a disentangled representation
  • 5. Preliminaries: Disentangled Representation
      • Figure: entangled and disentangled representation
      • A form of unsupervised representation learning in generative modeling
      • Control image synthesis inspired by style transfer
      • Automatic, unsupervised separation of high-level attributes
      • Scale-specific mixing and interpolation
  • 6. Preliminaries: Disentangled Representation
      • Training GANs in the traditional way (figure: train & test) → the latent z becomes a kind of feature vector (i.e., a representation)
      • Changing an arbitrary dimension of the latent z changes two or more features → these features are entangled
      • Entanglement degrades the interpretability and controllability of the generation process → each latent dimension should correspond to one "independent feature"
  • 7. Preliminaries: Disentangled Representation
      • Based on the manifold hypothesis
          • Real-world high-dimensional data lie on low-dimensional manifolds embedded within the high-dimensional space
      • A unit Gaussian is not enough to represent image manifolds
          • Images can be badly reconstructed
      • A latent space that consists of linear subspaces, each of which controls one factor of variation
      • Reading materials: InfoGAN, β-VAE, Spatial CBN, LAPGAN, …
  • 8. Various Normalization in Deep Learning
      • Commonly used in most deep learning models
      • Main idea: normalize each layer's input → all layers see the same or a similar input distribution
      • Figure: normalization variants (panel labels: mean/std among minibatch; mean/std among channels; mean/std in minibatch; mean/std in minibatch)
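The normalization variants on this slide differ mainly in the axes over which the mean and standard deviation are computed. A minimal NumPy sketch, assuming activations of shape (N, C, H, W) and leaving out the learnable scale/shift and the running statistics that real layers keep; the function names are illustrative and not taken from the slides:

```python
import numpy as np

def normalize(x, axes, eps=1e-5):
    """Subtract the mean and divide by the std computed over `axes`."""
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def batch_norm(x):
    # Statistics per channel, shared across the minibatch and spatial positions.
    return normalize(x, (0, 2, 3))

def layer_norm(x):
    # Statistics per sample, shared across channels and spatial positions.
    return normalize(x, (1, 2, 3))

def instance_norm(x):
    # Statistics per sample and per channel, over spatial positions only.
    return normalize(x, (2, 3))

def group_norm(x, groups):
    # Statistics per sample and per group of channels.
    n, c, h, w = x.shape
    grouped = x.reshape(n, groups, c // groups, h, w)
    return normalize(grouped, (2, 3, 4)).reshape(n, c, h, w)

x = np.random.default_rng(0).standard_normal((4, 8, 16, 16))
# One channel per group reduces to instance norm; a single group reduces to layer norm.
print(np.allclose(group_norm(x, groups=8), instance_norm(x)))
print(np.allclose(group_norm(x, groups=1), layer_norm(x)))
```

Seen this way, group normalization sits between layer and instance normalization, with the number of groups selecting the point in between.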
  • 9. Instance Normalization in Style Transfer
      • Convolutional feature statistics of a DNN can capture the style of an image
      • Recent work shows that channel-wise mean/variance are effective for style transfer → Instance Normalization can be seen as a form of style normalization
  • 10. Adaptive Instance Normalization
      • Given a content image x and a style image y, the stylized features are obtained by: AdaIN(x, y) = σ(y) · (x − μ(x)) / σ(x) + μ(y)
      • Normalize x (remove the style of the content) → denormalize x with the style of y
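The normalize-then-denormalize step can be sketched in a few lines of NumPy. This assumes feature maps of shape (N, C, H, W) and takes μ and σ per sample and per channel over the spatial axes; `channel_stats` and `adain` are illustrative names, and the random arrays only stand in for real encoder features:

```python
import numpy as np

def channel_stats(t, eps=1e-5):
    """Per-sample, per-channel mean and std over the spatial axes of (N, C, H, W)."""
    mean = t.mean(axis=(2, 3), keepdims=True)
    std = np.sqrt(t.var(axis=(2, 3), keepdims=True) + eps)
    return mean, std

def adain(x, y):
    """Re-style content features x with the channel statistics of style features y."""
    mu_x, sigma_x = channel_stats(x)
    mu_y, sigma_y = channel_stats(y)
    normalized = (x - mu_x) / sigma_x      # remove the style of x
    return sigma_y * normalized + mu_y     # apply the style of y

rng = np.random.default_rng(0)
content = rng.standard_normal((1, 3, 8, 8))
style = 2.0 * rng.standard_normal((1, 3, 8, 8)) + 5.0
out = adain(content, style)
# The output's channel means follow the style features, not the content features.
print(out.mean(axis=(2, 3)), style.mean(axis=(2, 3)))
```

Only the channel statistics of y are used, so the spatial layout of the output still comes from x.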
  • 11. Style-based Generator (StyleGAN)
      • How does it achieve a disentangled representation?
      • How is the generator designed as a style transfer network?
      • Why does each layer need a noise input?
      • Figure: (a) traditional generator and (b) StyleGAN generator (legend: learnable operation, AdaIN)
  • 12. StyleGAN: Disentangled Representation
      • Latent space disentanglement is a crucial part of both style transfer and generative models
          • Hard to achieve by direct mapping (b in the lower figure)
      • StyleGAN generates a disentangled intermediate latent space 𝒲
          • Not a fixed distribution, but a learned mapping
          • Spatially invariant, modified by the affine transformation A
      • Generating images from a disentangled representation is much easier than from an entangled one → the mapping network is expected to be pushed toward a disentangled representation during training
  • 13. StyleGAN: AdaIN as the Styling Method
      • Through the affine transformation A, the vector w becomes the style y = (ys, yb)
          • ys is the style scale (standard deviation) and yb is the style bias (mean)
      • Step by step
          • The input x is normalized as in Instance Normalization
              • This effectively localizes the styles
          • It is then denormalized by ys and yb
              • The actual multiplier is ys + 1, so that the scale is centered on 1 and behaves like a (positive) standard deviation
          • The result is forwarded to the next layer
      • Note: scale-specific styling is only possible when each network output can be separated gradually
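The steps above can be sketched as a single function. This is a minimal NumPy illustration, not the authors' implementation: `a_weight` and `a_bias` are hypothetical stand-ins for the learned affine transformation A, and the shapes are chosen only for the example:

```python
import numpy as np

def instance_norm(x, eps=1e-8):
    """Normalize each channel of each sample over its spatial positions."""
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def style_modulation(x, w, a_weight, a_bias):
    """Apply a style derived from w to feature maps x of shape (N, C, H, W).

    a_weight (w_dim, 2*C) and a_bias (2*C,) stand in for the learned affine
    transformation A; here they are plain arrays supplied by the caller.
    """
    n, c, _, _ = x.shape
    y = w @ a_weight + a_bias                 # (N, 2*C): the style y = (ys, yb)
    ys = y[:, :c].reshape(n, c, 1, 1)         # per-channel scale
    yb = y[:, c:].reshape(n, c, 1, 1)         # per-channel bias
    return (ys + 1.0) * instance_norm(x) + yb # multiplier is ys + 1, as on the slide

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4, 8, 8))
w = rng.standard_normal((2, 16))
a_weight = 0.1 * rng.standard_normal((16, 8))
a_bias = np.zeros(8)
print(style_modulation(x, w, a_weight, a_bias).shape)
```

With the `+ 1.0` offset, an affine layer that outputs zeros leaves the normalized features unchanged, which gives a neutral starting point for the style.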
  • 14. StyleGAN: Style Mixing
      • Style mixing during training encourages the styles to localize
      • Simply put:
          • Run two different latent codes z1, z2 through the mapping network
          • Mix the corresponding intermediate latents w1, w2 at a randomly selected point in the synthesis network 𝑔
          • This prevents the network from assuming that adjacent styles are correlated → more localized, scale-specific modification
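A minimal sketch of the crossover itself, assuming the synthesis network takes one style vector per layer. The layer count of 18 (two styles per resolution from 4² to 1024²) and the 512-dimensional w are assumptions made for the example, and `mix_styles` is a hypothetical helper:

```python
import numpy as np

def mix_styles(w1, w2, num_layers=18, rng=None):
    """Return per-layer styles that switch from w1 to w2 at a random crossover layer."""
    rng = np.random.default_rng() if rng is None else rng
    # Crossover in [1, num_layers - 1], so both latents are used at least once.
    crossover = int(rng.integers(1, num_layers))
    layer_index = np.arange(num_layers)[:, None]
    per_layer = np.where(layer_index < crossover, w1[None, :], w2[None, :])
    return per_layer, crossover

rng = np.random.default_rng(0)
w1, w2 = rng.standard_normal(512), rng.standard_normal(512)
styles, crossover = mix_styles(w1, w2, rng=rng)
# Layers before the crossover use w1 (coarser scales); the rest use w2.
print(styles.shape, crossover)
```

An early crossover hands most of the scales to w2, while a late one swaps only the fine details, which is the scale-specific behavior shown on the style-mixing slides.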
  • 16. Style Mixing in Coarse Level (4² – 8²)
  • 17. Style Mixing in Middle Level (16² – 32²)
      • Brings smaller-scale facial features, hair style, eyes open / closed, …
  • 18. Style Mixing in Fine Level (64² – 1024²)
      • Mainly brings color scheme and microstructure
      • Doesn't change coarse / middle styles
  • 19. StyleGAN: Stochastic Variation
      • Traditional GANs achieve stochastic variation by…
          • generating spatially varying pseudorandom numbers
          • This consumes network capacity
          • It is not always successful
      • In StyleGAN…
          • Random noise is introduced at the layer level
          • Hypothesis: at any point there is pressure to introduce new content as soon as possible
              • To fool the discriminator
          • The easiest way: introduce new random noise for each layer → variation comes from the random noise
  • 20. StyleGAN: Stochastic Variation
      • The main areas of stochastic variation are
          • the hair
          • silhouettes
          • parts of the background
      • → The noise does not affect global aspects
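Per-layer noise injection can be sketched as adding a single-channel noise image to every feature map, scaled by a learned per-channel factor as the StyleGAN paper describes. The shapes, the `add_noise` name, and the hand-picked scales below are illustrative assumptions:

```python
import numpy as np

def add_noise(x, noise_scale, rng):
    """Add single-channel Gaussian noise to x (N, C, H, W), scaled per channel."""
    n, c, h, w = x.shape
    noise = rng.standard_normal((n, 1, h, w))      # one noise image per sample
    return x + noise_scale.reshape(1, c, 1, 1) * noise

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 4, 8, 8))
noise_scale = np.array([0.0, 0.1, 0.5, 1.0])       # would be learned in practice
a = add_noise(x, noise_scale, rng)
b = add_noise(x, noise_scale, rng)
# Two draws differ more in channels with a larger scale; a zero scale ignores the noise.
print(np.abs(a - b).mean(axis=(0, 2, 3)))
```

Since fresh noise is drawn at every layer, the network does not have to manufacture randomness from its own activations.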
  • 21. StyleGAN: Water Droplet-like Artifacts
  • 23. Advances of StyleGAN: StyleGAN2
      • Figures: phase artifacts in StyleGAN; examples of unnatural images w/ StyleGAN; examples of natural images w/ StyleGAN2
      • StyleGAN (left) has a texture sticking problem due to progressive growing
          • The image at each scale is generated independently by the corresponding generator
      • A ResNet architecture is adopted to solve the problem
      • Note: not perfectly solved; this is discussed in StyleGAN3
  • 24. Advances of StyleGAN: StyleGAN3
      • StyleGAN2 (left) has not perfectly solved the texture sticking problem
      • (left) Averaging images generated from small changes of the latent should blur the central image
      • (left) But with StyleGAN2, details stick to the same pixel coordinates
  • 26. Applications of StyleGAN: Image Domain
      • InterFaceGAN
          • Extracts linear editing directions through attribute-level supervision
      • StyleFlow
          • First to present editing that is stable when composed
          • Normalizing flows and attribute-level supervision
      • DyStyle
          • Addresses compositional editing directly
          • Accurate, elaborate, and diverse editing
      • StyleCLIP
          • Free textual editing w/ a visual-linguistic pretrained model
      • Pose with Style
          • Human pose supervision to edit body poses and clothing
      • StyleMapGAN
          • Localized editing by augmenting StyleGAN's architecture w/ spatially adaptive modulation
  • 28. Applications of StyleGAN: StyleMapGAN
      • Localized editing by augmenting StyleGAN's architecture w/ spatially adaptive modulation
      • Localized editing conducted with
  • 29. Music Generation: Recent Works w/ and w/o GANs
      • Style-Conditioned Music Generation
          • Style transfer-like methods in music generation w/ LSTM-based GANs
          • Builds a style codebook that decides the overall style of the music
      • Symbolic Music Generation with Transformer-GANs
      • Compound Word Transformer: Learning to Compose Full-Song Music over Dynamic Directed Hypergraphs
          • Transformer-based music generation model
  • 30. Music Generation: Why is StyleGAN hard to utilize?
      • The main "scale-specific controllability" of StyleGAN comes from the stacked CNNs of various sizes → to utilize StyleGAN, the output should be separable
      • A music composition should be infinitely extensible → CNNs cannot be used along the temporal dimension
      • Separation of the musical components (e.g., Motive – Phrase – Period)
          • → Hard to model as with CNNs (an intuitive separation of the components is hard)
          • → The components share the overall flow → separation causes incoherent music
      • Separation of the MIDI components (e.g., bar – beat – …)
          • → Using CNNs to combine them can cause information loss (e.g., structured tokens)
      • Too many additional features to consider
          • StyleGAN and image processing → no other input or consideration except the image
          • Music has a bunch of extra features, such as the instrument
  • 31. Conclusion
      • The "scale-specific controllability" of StyleGAN comes from the stacked CNNs of various sizes → to utilize StyleGAN, the target domain output should be separable (e.g., 4x4 → 8x8 → 16x16 → 32x32 → … → 1024x1024)
      • It may be usable in GUI design, but it would be more like "conditioned image synthesis regardless of the structure"
      • If we utilize StyleGAN in GUI design, we should be able to defend…
          • Why do we ignore the structures?
          • Isn't this design a combination of existing designs?

Editor's Notes

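  1. Batch Norm → the most basic normalization technique; it assumes that the mean / variance of a batch represent the whole dataset, and it behaves differently at inference and at training time. Layer Norm → robust to the input scale and to scaling / shifting of the weights. Instance Norm → mean / std normalization per channel and per sample; it can be used in the same way at inference, and it can normalize properties such as contrast. Group Norm → presented by Kaiming He in 2018; a compromise between Layer Norm and Instance Norm.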
  2. Brings high-level aspects (i.e., pose, general hair style, face shape, and eyeglasses)
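  3. This is a non-smooth mapping caused by normalization. It appears from the 64 by 64 image onward and occurs in all feature maps → because AdaIN ends up destroying the correlation between channels. Also, because the normalization depends on the input, a very large spike in the input makes fine adjustment elsewhere easier.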
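  4. If the bias and noise are applied before normalization, their relative influence becomes inversely proportional to the style magnitude → the noise and bias become correlated with the style → therefore, the noise and bias are applied after normalization. The mean subtraction is removed, and the data-dependent normalization is removed. AdaIN is replaced by normalization / denormalization of the weights (this is straightforward given the properties of a convolution without bias).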
  5. EMA is added to work with averaged values. Filtering is performed through upsampling. The filter shown in (a) is used for upsampling → assumes a continuous, infinite spatial domain.
  6. Note: scale-specific styling is only possible when we can separate each network output gradually