SlideShare a Scribd company logo
Satoshi Iizuka* Edgar Simo-Serra* Hiroshi Ishikawa
Waseda University
(*equal contribution)
2
Colorization of Black-and-white Pictures
3
Our Goal: Fully-automatic colorization
4
Colorization of Old Films
5
Related Work
 Scribble-based [Levin+ 2004; Yatziv+ 2004;
An+ 2009; Xu+ 2013; Endo+ 2016]
 Specify colors with scribbles
 Require manual inputs
 Reference image-based [Chia+ 2011;
Gupta+ 2012]
 Transfer colors of reference images
 Require very similar images
[Levin+ 2004]
[Gupta+ 2012]
Input Reference Output
6
Related Work
 Automatic colorization with hand-crafted features [Cheng+ 2015]
 Uses existing multiple image features
 Computes chrominance via a shallow neural network
 Depends on the performance of semantic segmentation
 Only handles simple outdoor scenes
Image
features
Chroma OutputInput Neural Network
7
Contributions
 Novel end-to-end network that jointly learns global and local features
for automatic image colorization
 New fusion layer that elegantly merges the global and local features
 Exploit classification labels for learning
8
Layers of Our Model
 Fully-connected layer
 All neurons are connected between layers
 Convolutional layer
 Takes into account underlying spatial structure
No. of feature maps
Convolutional layer
…
…
Fully-connected layer
Neuron
𝑥
𝑦
9
Our Model
 Two branches: local features and global features
 Composed of four networks
Scaling
Chrominance
Low-Level Features
Network Global Features Network
Mid-Level Features
Network
Colorization
Network
Upsampling
Luminance
Fusion Layer
10
Low-Level Features Network
 Extract low-level features such as edges and corners
 Lower resolution for efficient processing
Shared
weights
Scaling
Chrominance
Low-Level Features
Network Global Features Network
Mid-Level Features
Network
Colorization
Network
Upsampling
Luminance
Fusion Layer
11
Global Features Network
 Compute a global 256-dimensional vector representation of the image
Shared
weights
Scaling
Chrominance
Low-Level Features
Network Global Features Network
Mid-Level Features
Network
Colorization
Network
Upsampling
Luminance
Fusion Layer
12
Mid-Level Features Network
 Extract mid-level features such as texture
Shared
weights
Scaling
Chrominance
Low-Level Features
Network Global Features Network
Mid-Level Features
Network
Colorization
Network
Upsampling
Luminance
Fusion Layer
13
Fusion Layer
Shared
weights
Scaling
Chrominance
Low-Level Features
Network Global Features Network
Mid-Level Features
Network
Colorization
Network
Upsampling
Luminance
Fusion Layer
14
Fusion Layer
 Combine the global features with the mid-level features
 The resulting features are independent of any resolution
Fusion Layer
Mid-Level Features
Network
Global Features Network
𝐲 𝑢,𝑣
fusion = 𝜎 𝐛 + 𝑊
𝐲global
𝐲 𝑢,𝑣
mid
15
Colorization Network
 Compute chrominance from the fused features
 Restore the image to the input resolution
Shared
weights
Scaling
Chrominance
Low-Level Features
Network Global Features Network
Mid-Level Features
Network
Colorization
Network
Upsampling
Luminance
Fusion Layer
16
Training of Colors
 Mean Squared Error (MSE) as loss function
 Optimization using ADADELTA [Zeiler 2012]
 Adaptively sets a learning rate
Model
Forward
Backward
MSE
Input Output Ground truth
17
Joint Training
 Training for classification jointly with the colorization
 Classification network connected to the global features
Shared
weights
Scaling
Low-Level Features
Network Global Features Network
Mid-Level Features
Network
Colorization
Network
Upsampling
Luminance
Fusion Layer
20.60% Formal Garden
16.13% Arch
13.50% Abbey
7.07% Botanical Garden
6.53% Golf Course
Predicted labels
Classification
Network
Chrominance
18
Dataset
 MIT Places Scene Dataset [Zhou+ 2014]
 2.3 million training images with 205 scene labels
 256 × 256 pixels
Abbey Airport terminal Aquarium Baseball field
Dining room Forest road Gas station Gift shop
⋯
⋯
20
Computational Time
 Colorize within a few seconds
80ms
21
Colorization of MIT Places Dataset
22
Comparisons
Input [Cheng+ 2015] Ours
(w/ global features)
Ours
(w/o global features)
23
Effectiveness of Global Features
w/ global featuresInput w/o global features
24
User Study
 10 users participated
 We show 500 images of each type: total 1,500 images per user
 90% of our results are considered “natural”
Natural Unnatural
25
Colorization of Historical Photographs
Mount Moran, 1941 Scott's Run, 1937 Youngsters, 1912 Burns Basement, 1910
26
Style Transfer
Low-Level Features
Network
27
Style Transfer
Low-Level Features
Network
28
Style Transfer
 Adapting the colorization of one image to the style of another
Inputs
Local Global Local Global Local Global
Output
29
Limitations
 Difficult to output colorful images
 Cannot restore exact colors
Input Ground truth Output
Input Ground truth Output
30
Conclusion
 Novel approach for image colorization by fusing global and local information
 Fusion layer
 Joint training of colorization and classification
 Style transfer
Farm Land, 1933 California National
Park, 1936
Homes, 1936 Spinners, 1910 Doffer Boys, 1909
31
Thank you!
 Project Page http://hi.cs.waseda.ac.jp/~iizuka/projects/colorization
 Code on GitHub! https://github.com/satoshiiizuka/siggraph2016_colorization
Norris Dam, 1933North Dome,
1936
Miner,
1937
Community Center,
1936

More Related Content

What's hot

1.arithmetic & logical operations
1.arithmetic & logical operations1.arithmetic & logical operations
1.arithmetic & logical operationsmukesh bhardwaj
 
Lect 02 second portion
Lect 02  second portionLect 02  second portion
Lect 02 second portionMoe Moe Myint
 
Image Interpolation Techniques with Optical and Digital Zoom Concepts
Image Interpolation Techniques with Optical and Digital Zoom ConceptsImage Interpolation Techniques with Optical and Digital Zoom Concepts
Image Interpolation Techniques with Optical and Digital Zoom Conceptsmmjalbiaty
 
Image segmentation ppt
Image segmentation pptImage segmentation ppt
Image segmentation pptGichelle Amon
 
Fundamental steps in image processing
Fundamental steps in image processingFundamental steps in image processing
Fundamental steps in image processingPremaPRC211300301103
 
Chapter 9 morphological image processing
Chapter 9   morphological image processingChapter 9   morphological image processing
Chapter 9 morphological image processingAhmed Daoud
 
DIGITAL IMAGE PROCESSING - LECTURE NOTES
DIGITAL IMAGE PROCESSING - LECTURE NOTESDIGITAL IMAGE PROCESSING - LECTURE NOTES
DIGITAL IMAGE PROCESSING - LECTURE NOTESEzhilya venkat
 
Image Restoration
Image RestorationImage Restoration
Image RestorationPoonam Seth
 
Texture in image processing
Texture in image processing Texture in image processing
Texture in image processing Anna Aquarian
 
A thesis presentation on pothole detection
A thesis presentation on pothole detectionA thesis presentation on pothole detection
A thesis presentation on pothole detectionPrimeAsia University
 
Image classification using convolutional neural network
Image classification using convolutional neural networkImage classification using convolutional neural network
Image classification using convolutional neural networkKIRAN R
 
Color image processing Presentation
Color image processing PresentationColor image processing Presentation
Color image processing PresentationRevanth Chimmani
 
Deep learning for image video processing
Deep learning for image video processingDeep learning for image video processing
Deep learning for image video processingYu Huang
 
Image Restoration (Digital Image Processing)
Image Restoration (Digital Image Processing)Image Restoration (Digital Image Processing)
Image Restoration (Digital Image Processing)Kalyan Acharjya
 
Digital Image Processing: Image Segmentation
Digital Image Processing: Image SegmentationDigital Image Processing: Image Segmentation
Digital Image Processing: Image SegmentationMostafa G. M. Mostafa
 
Ray Tracing in Computer Graphics
Ray Tracing in Computer GraphicsRay Tracing in Computer Graphics
Ray Tracing in Computer GraphicsKABILESH RAMAR
 
Color Image Processing
Color Image ProcessingColor Image Processing
Color Image Processingkiruthiammu
 

What's hot (20)

1.arithmetic & logical operations
1.arithmetic & logical operations1.arithmetic & logical operations
1.arithmetic & logical operations
 
Lect 02 second portion
Lect 02  second portionLect 02  second portion
Lect 02 second portion
 
Image Interpolation Techniques with Optical and Digital Zoom Concepts
Image Interpolation Techniques with Optical and Digital Zoom ConceptsImage Interpolation Techniques with Optical and Digital Zoom Concepts
Image Interpolation Techniques with Optical and Digital Zoom Concepts
 
Image segmentation ppt
Image segmentation pptImage segmentation ppt
Image segmentation ppt
 
Digital Image Processing
Digital Image ProcessingDigital Image Processing
Digital Image Processing
 
Fundamental steps in image processing
Fundamental steps in image processingFundamental steps in image processing
Fundamental steps in image processing
 
Chapter 9 morphological image processing
Chapter 9   morphological image processingChapter 9   morphological image processing
Chapter 9 morphological image processing
 
DIGITAL IMAGE PROCESSING - LECTURE NOTES
DIGITAL IMAGE PROCESSING - LECTURE NOTESDIGITAL IMAGE PROCESSING - LECTURE NOTES
DIGITAL IMAGE PROCESSING - LECTURE NOTES
 
Image Restoration
Image RestorationImage Restoration
Image Restoration
 
Texture in image processing
Texture in image processing Texture in image processing
Texture in image processing
 
A thesis presentation on pothole detection
A thesis presentation on pothole detectionA thesis presentation on pothole detection
A thesis presentation on pothole detection
 
Image classification using convolutional neural network
Image classification using convolutional neural networkImage classification using convolutional neural network
Image classification using convolutional neural network
 
Hog
HogHog
Hog
 
Color image processing Presentation
Color image processing PresentationColor image processing Presentation
Color image processing Presentation
 
Deep learning for image video processing
Deep learning for image video processingDeep learning for image video processing
Deep learning for image video processing
 
image compression ppt
image compression pptimage compression ppt
image compression ppt
 
Image Restoration (Digital Image Processing)
Image Restoration (Digital Image Processing)Image Restoration (Digital Image Processing)
Image Restoration (Digital Image Processing)
 
Digital Image Processing: Image Segmentation
Digital Image Processing: Image SegmentationDigital Image Processing: Image Segmentation
Digital Image Processing: Image Segmentation
 
Ray Tracing in Computer Graphics
Ray Tracing in Computer GraphicsRay Tracing in Computer Graphics
Ray Tracing in Computer Graphics
 
Color Image Processing
Color Image ProcessingColor Image Processing
Color Image Processing
 

Similar to [SIGGRAPH 2016] Automatic Image Colorization

[SIGGRAPH 2017] Globally and Locally Consistent Image Completion
[SIGGRAPH 2017] Globally and Locally Consistent Image Completion[SIGGRAPH 2017] Globally and Locally Consistent Image Completion
[SIGGRAPH 2017] Globally and Locally Consistent Image CompletionSatoshi Iizuka
 
Searching Images with MPEG-7 (& MPEG-7 Like) Powered Localized dEscriptors (S...
Searching Images with MPEG-7 (& MPEG-7 Like) Powered Localized dEscriptors (S...Searching Images with MPEG-7 (& MPEG-7 Like) Powered Localized dEscriptors (S...
Searching Images with MPEG-7 (& MPEG-7 Like) Powered Localized dEscriptors (S...Savvas Chatzichristofis
 
An Introduction to Computer Vision
An Introduction to Computer VisionAn Introduction to Computer Vision
An Introduction to Computer Visionguestd1b1b5
 
Artist Assistant AI(AAA)
Artist Assistant AI(AAA)Artist Assistant AI(AAA)
Artist Assistant AI(AAA)Gunhee Lee
 
Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...
Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...
Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...Jedha Bootcamp
 
Towards Utilizing GPUs in Information Visualization
Towards Utilizing GPUs in Information VisualizationTowards Utilizing GPUs in Information Visualization
Towards Utilizing GPUs in Information VisualizationNiklas Elmqvist
 
Interactive Stereoscopic Rendering for Non-Planar Projections (GRAPP 2009)
Interactive Stereoscopic Rendering for Non-Planar Projections (GRAPP 2009)Interactive Stereoscopic Rendering for Non-Planar Projections (GRAPP 2009)
Interactive Stereoscopic Rendering for Non-Planar Projections (GRAPP 2009)Matthias Trapp
 
Computational Methods in Physics for Students
Computational Methods in Physics for StudentsComputational Methods in Physics for Students
Computational Methods in Physics for StudentsMaheshPatil527151
 
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]Dongmin Choi
 
Image_Processing_LECTURE_c#_programming.ppt
Image_Processing_LECTURE_c#_programming.pptImage_Processing_LECTURE_c#_programming.ppt
Image_Processing_LECTURE_c#_programming.pptLOUISSEVERINOROMANO
 
AN INTEGRATED APPROACH TO CONTENT BASED IMAGE RETRIEVAL by Madhu
AN INTEGRATED APPROACH TO CONTENT BASED IMAGERETRIEVAL by MadhuAN INTEGRATED APPROACH TO CONTENT BASED IMAGERETRIEVAL by Madhu
AN INTEGRATED APPROACH TO CONTENT BASED IMAGE RETRIEVAL by MadhuMadhu Rock
 
IRJET- An Approach to FPGA based Implementation of Image Mosaicing using Neur...
IRJET- An Approach to FPGA based Implementation of Image Mosaicing using Neur...IRJET- An Approach to FPGA based Implementation of Image Mosaicing using Neur...
IRJET- An Approach to FPGA based Implementation of Image Mosaicing using Neur...IRJET Journal
 
Unreal Engine 4 Introduction
Unreal Engine 4 IntroductionUnreal Engine 4 Introduction
Unreal Engine 4 IntroductionSperasoft
 
Shallow vs. Deep Image Representations: A Comparative Study with Enhancements...
Shallow vs. Deep Image Representations: A Comparative Study with Enhancements...Shallow vs. Deep Image Representations: A Comparative Study with Enhancements...
Shallow vs. Deep Image Representations: A Comparative Study with Enhancements...CSCJournals
 
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D streamColor and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D streamNAVER Engineering
 
Screen2Vec: Semantic Embedding of GUI Screens and GUI Components
Screen2Vec: Semantic Embedding of GUI Screens and GUI ComponentsScreen2Vec: Semantic Embedding of GUI Screens and GUI Components
Screen2Vec: Semantic Embedding of GUI Screens and GUI Componentsivaderivader
 
Using A Application For A Desktop Application
Using A Application For A Desktop ApplicationUsing A Application For A Desktop Application
Using A Application For A Desktop ApplicationTracy Huang
 
iVideo Editor with Background Remover and Image Inpainting
iVideo Editor with Background Remover and Image InpaintingiVideo Editor with Background Remover and Image Inpainting
iVideo Editor with Background Remover and Image InpaintingIRJET Journal
 
lectureAll-OpenGL-complete-Guide-Tutorial.pdf
lectureAll-OpenGL-complete-Guide-Tutorial.pdflectureAll-OpenGL-complete-Guide-Tutorial.pdf
lectureAll-OpenGL-complete-Guide-Tutorial.pdfSanjeevSaharan5
 

Similar to [SIGGRAPH 2016] Automatic Image Colorization (20)

[SIGGRAPH 2017] Globally and Locally Consistent Image Completion
[SIGGRAPH 2017] Globally and Locally Consistent Image Completion[SIGGRAPH 2017] Globally and Locally Consistent Image Completion
[SIGGRAPH 2017] Globally and Locally Consistent Image Completion
 
Searching Images with MPEG-7 (& MPEG-7 Like) Powered Localized dEscriptors (S...
Searching Images with MPEG-7 (& MPEG-7 Like) Powered Localized dEscriptors (S...Searching Images with MPEG-7 (& MPEG-7 Like) Powered Localized dEscriptors (S...
Searching Images with MPEG-7 (& MPEG-7 Like) Powered Localized dEscriptors (S...
 
An Introduction to Computer Vision
An Introduction to Computer VisionAn Introduction to Computer Vision
An Introduction to Computer Vision
 
Graphics and Java 2D
Graphics and Java 2DGraphics and Java 2D
Graphics and Java 2D
 
Artist Assistant AI(AAA)
Artist Assistant AI(AAA)Artist Assistant AI(AAA)
Artist Assistant AI(AAA)
 
Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...
Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...
Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...
 
Towards Utilizing GPUs in Information Visualization
Towards Utilizing GPUs in Information VisualizationTowards Utilizing GPUs in Information Visualization
Towards Utilizing GPUs in Information Visualization
 
Interactive Stereoscopic Rendering for Non-Planar Projections (GRAPP 2009)
Interactive Stereoscopic Rendering for Non-Planar Projections (GRAPP 2009)Interactive Stereoscopic Rendering for Non-Planar Projections (GRAPP 2009)
Interactive Stereoscopic Rendering for Non-Planar Projections (GRAPP 2009)
 
Computational Methods in Physics for Students
Computational Methods in Physics for StudentsComputational Methods in Physics for Students
Computational Methods in Physics for Students
 
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
 
Image_Processing_LECTURE_c#_programming.ppt
Image_Processing_LECTURE_c#_programming.pptImage_Processing_LECTURE_c#_programming.ppt
Image_Processing_LECTURE_c#_programming.ppt
 
AN INTEGRATED APPROACH TO CONTENT BASED IMAGE RETRIEVAL by Madhu
AN INTEGRATED APPROACH TO CONTENT BASED IMAGERETRIEVAL by MadhuAN INTEGRATED APPROACH TO CONTENT BASED IMAGERETRIEVAL by Madhu
AN INTEGRATED APPROACH TO CONTENT BASED IMAGE RETRIEVAL by Madhu
 
IRJET- An Approach to FPGA based Implementation of Image Mosaicing using Neur...
IRJET- An Approach to FPGA based Implementation of Image Mosaicing using Neur...IRJET- An Approach to FPGA based Implementation of Image Mosaicing using Neur...
IRJET- An Approach to FPGA based Implementation of Image Mosaicing using Neur...
 
Unreal Engine 4 Introduction
Unreal Engine 4 IntroductionUnreal Engine 4 Introduction
Unreal Engine 4 Introduction
 
Shallow vs. Deep Image Representations: A Comparative Study with Enhancements...
Shallow vs. Deep Image Representations: A Comparative Study with Enhancements...Shallow vs. Deep Image Representations: A Comparative Study with Enhancements...
Shallow vs. Deep Image Representations: A Comparative Study with Enhancements...
 
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D streamColor and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream
 
Screen2Vec: Semantic Embedding of GUI Screens and GUI Components
Screen2Vec: Semantic Embedding of GUI Screens and GUI ComponentsScreen2Vec: Semantic Embedding of GUI Screens and GUI Components
Screen2Vec: Semantic Embedding of GUI Screens and GUI Components
 
Using A Application For A Desktop Application
Using A Application For A Desktop ApplicationUsing A Application For A Desktop Application
Using A Application For A Desktop Application
 
iVideo Editor with Background Remover and Image Inpainting
iVideo Editor with Background Remover and Image InpaintingiVideo Editor with Background Remover and Image Inpainting
iVideo Editor with Background Remover and Image Inpainting
 
lectureAll-OpenGL-complete-Guide-Tutorial.pdf
lectureAll-OpenGL-complete-Guide-Tutorial.pdflectureAll-OpenGL-complete-Guide-Tutorial.pdf
lectureAll-OpenGL-complete-Guide-Tutorial.pdf
 

Recently uploaded

Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀DianaGray10
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...Elena Simperl
 
In-Depth Performance Testing Guide for IT Professionals
In-Depth Performance Testing Guide for IT ProfessionalsIn-Depth Performance Testing Guide for IT Professionals
In-Depth Performance Testing Guide for IT ProfessionalsExpeed Software
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...Product School
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsVlad Stirbu
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsPaul Groth
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Product School
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backElena Simperl
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Product School
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutesconfluent
 
UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2DianaGray10
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesThousandEyes
 

Recently uploaded (20)

Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
In-Depth Performance Testing Guide for IT Professionals
In-Depth Performance Testing Guide for IT ProfessionalsIn-Depth Performance Testing Guide for IT Professionals
In-Depth Performance Testing Guide for IT Professionals
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
 
UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 

[SIGGRAPH 2016] Automatic Image Colorization

  • 1. Satoshi Iizuka* Edgar Simo-Serra* Hiroshi Ishikawa Waseda University (*equal contribution)
  • 5. 5 Related Work  Scribble-based [Levin+ 2004; Yatziv+ 2004; An+ 2009; Xu+ 2013; Endo+ 2016]  Specify colors with scribbles  Require manual inputs  Reference image-based [Chia+ 2011; Gupta+ 2012]  Transfer colors of reference images  Require very similar images [Levin+ 2004] [Gupta+ 2012] Input Reference Output
  • 6. 6 Related Work  Automatic colorization with hand-crafted features [Cheng+ 2015]  Uses existing multiple image features  Computes chrominance via a shallow neural network  Depends on the performance of semantic segmentation  Only handles simple outdoor scenes Image features Chroma OutputInput Neural Network
  • 7. 7 Contributions  Novel end-to-end network that jointly learns global and local features for automatic image colorization  New fusion layer that elegantly merges the global and local features  Exploit classification labels for learning
  • 8. 8 Layers of Our Model  Fully-connected layer  All neurons are connected between layers  Convolutional layer  Takes into account underlying spatial structure No. of feature maps Convolutional layer … … Fully-connected layer Neuron 𝑥 𝑦
  • 9. 9 Our Model  Two branches: local features and global features  Composed of four networks Scaling Chrominance Low-Level Features Network Global Features Network Mid-Level Features Network Colorization Network Upsampling Luminance Fusion Layer
  • 10. 10 Low-Level Features Network  Extract low-level features such as edges and corners  Lower resolution for efficient processing Shared weights Scaling Chrominance Low-Level Features Network Global Features Network Mid-Level Features Network Colorization Network Upsampling Luminance Fusion Layer
  • 11. 11 Global Features Network  Compute a global 256-dimensional vector representation of the image Shared weights Scaling Chrominance Low-Level Features Network Global Features Network Mid-Level Features Network Colorization Network Upsampling Luminance Fusion Layer
  • 12. 12 Mid-Level Features Network  Extract mid-level features such as texture Shared weights Scaling Chrominance Low-Level Features Network Global Features Network Mid-Level Features Network Colorization Network Upsampling Luminance Fusion Layer
  • 13. 13 Fusion Layer Shared weights Scaling Chrominance Low-Level Features Network Global Features Network Mid-Level Features Network Colorization Network Upsampling Luminance Fusion Layer
  • 14. 14 Fusion Layer  Combine the global features with the mid-level features  The resulting features are independent of any resolution Fusion Layer Mid-Level Features Network Global Features Network 𝐲 𝑢,𝑣 fusion = 𝜎 𝐛 + 𝑊 𝐲global 𝐲 𝑢,𝑣 mid
  • 15. 15 Colorization Network  Compute chrominance from the fused features  Restore the image to the input resolution Shared weights Scaling Chrominance Low-Level Features Network Global Features Network Mid-Level Features Network Colorization Network Upsampling Luminance Fusion Layer
  • 16. 16 Training of Colors  Mean Squared Error (MSE) as loss function  Optimization using ADADELTA [Zeiler 2012]  Adaptively sets a learning rate Model Forward Backward MSE Input Output Ground truth
  • 17. 17 Joint Training  Training for classification jointly with the colorization  Classification network connected to the global features Shared weights Scaling Low-Level Features Network Global Features Network Mid-Level Features Network Colorization Network Upsampling Luminance Fusion Layer 20.60% Formal Garden 16.13% Arch 13.50% Abbey 7.07% Botanical Garden 6.53% Golf Course Predicted labels Classification Network Chrominance
  • 18. 18 Dataset  MIT Places Scene Dataset [Zhou+ 2014]  2.3 million training images with 205 scene labels  256 × 256 pixels Abbey Airport terminal Aquarium Baseball field Dining room Forest road Gas station Gift shop ⋯ ⋯
  • 19.
  • 20. 20 Computational Time  Colorize within a few seconds 80ms
  • 21. 21 Colorization of MIT Places Dataset
  • 22. 22 Comparisons Input [Cheng+ 2015] Ours (w/ global features) Ours (w/o global features)
  • 23. 23 Effectiveness of Global Features w/ global featuresInput w/o global features
  • 24. 24 User Study  10 users participated  We show 500 images of each type: total 1,500 images per user  90% of our results are considered “natural” Natural Unnatural
  • 25. 25 Colorization of Historical Photographs Mount Moran, 1941 Scott's Run, 1937 Youngsters, 1912 Burns Basement, 1910
  • 28. 28 Style Transfer  Adapting the colorization of one image to the style of another Inputs Local Global Local Global Local Global Output
  • 29. 29 Limitations  Difficult to output colorful images  Cannot restore exact colors Input Ground truth Output Input Ground truth Output
  • 30. 30 Conclusion  Novel approach for image colorization by fusing global and local information  Fusion layer  Joint training of colorization and classification  Style transfer Farm Land, 1933 California National Park, 1936 Homes, 1936 Spinners, 1910 Doffer Boys, 1909
  • 31. 31 Thank you!  Project Page http://hi.cs.waseda.ac.jp/~iizuka/projects/colorization  Code on GitHub! https://github.com/satoshiiizuka/siggraph2016_colorization Norris Dam, 1933North Dome, 1936 Miner, 1937 Community Center, 1936