Photo Enhance
Agenda
01 Problems and Architecture
• Pipeline
• Why Classical CV is still awesome
• Why GANs are not the best solution
• HSV vs RGB vs YUV
02 Color enhancing
• AutoHDR
• Real-time photo color enhancement with Bilateral Filter
03 Other steps
• Super Resolution
• Style Transfer
• Noise & aberration reduction. Why JPEG is a problem
01 Problems and Architecture
Pipeline
• Raw to 16-32 bit image
• Noise & aberration reduction
• Color enhance
• Style transfer
• Super resolution
• Skin editing / painting
Why Classical CV is still awesome
● White-box solution
● Many tasks are based on optical physics and clean mathematics, so no NN is needed
● Controlled results, which is very important for professional photo editing
● Very fast and can be used on almost any device
● No training or datasets needed
- Some very simple examples of color enhancing with polynomials in my article
- Building a shape descriptor of an image
- Unwarping example
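As an illustration of the polynomial color enhancing mentioned above, a tone curve can be fitted through a few control points and applied to every pixel. This is only a minimal sketch, not the code from the article: the control points, the cubic degree, and the function names are assumptions chosen for illustration.

```python
import numpy as np

def fit_tone_curve(control_in, control_out, degree=3):
    """Fit a polynomial that maps input intensity to output intensity (both in [0, 1])."""
    return np.polyfit(control_in, control_out, degree)

def apply_tone_curve(image, coeffs):
    """Evaluate the polynomial at every pixel and clip the result back to [0, 1]."""
    return np.clip(np.polyval(coeffs, image), 0.0, 1.0)

# Hypothetical control points: lift shadows and midtones, keep black and white fixed.
coeffs = fit_tone_curve([0.0, 0.25, 0.5, 1.0], [0.0, 0.35, 0.6, 1.0])

# Tiny synthetic grayscale "image" with intensities 0, 0.25, 0.5, 0.75, 1.
image = np.linspace(0.0, 1.0, 5).reshape(1, 5)
enhanced = apply_tone_curve(image, coeffs)
```

The whole "model" is four numbers, which is what makes this kind of solution a white box: the curve can be plotted, inspected and adjusted by hand.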
Why GANs are not the best solution
● Too many uncontrolled artifacts
● Require a lot of computational resources
● Hard to train (mode collapse, vanishing gradients, etc.)
● Need big datasets
So it's Money x Money x Money = Money³
And bad for data conversion
Why GANs are not the best solution
BigGAN - 1 training cycle: 15 days, 8 GPUs, ~$8k-$12k
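The figures on this slide imply a rough price per GPU-hour. The sketch below only rearranges the numbers quoted above; the per-hour rate is derived here and is not stated in the slides.

```python
# Figures quoted on the slide for one BigGAN training cycle.
days, gpus = 15, 8
cost_low_usd, cost_high_usd = 8_000, 12_000

gpu_hours = days * 24 * gpus             # total GPU-hours for one cycle
rate_low = cost_low_usd / gpu_hours      # implied USD per GPU-hour, low end
rate_high = cost_high_usd / gpu_hours    # implied USD per GPU-hour, high end

print(f"{gpu_hours} GPU-hours, about ${rate_low:.2f}-${rate_high:.2f} per GPU-hour")
```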
RGB vs YUV vs HSV
RGB
- Good for NN
- Bad for classic CV
- Not human readable
YUV
- Good for NN
- So-so for classic CV
- Not human readable
- Efficient coding, reduces bandwidth
HSV
- Bad for NN
- Good for classic CV
- Human readable
- Based on RGB (not a real color space)
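To make the "human readable" point concrete, here is a small sketch that expresses the same color in all three spaces. It uses Python's standard `colorsys` module for HSV and the BT.601 analog YUV formulas; the helper name `rgb_to_yuv` and the sample color are assumptions for illustration.

```python
import colorsys

def rgb_to_yuv(r, g, b):
    """BT.601 analog YUV from RGB components in [0, 1]."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, 0.492 * (b - y), 0.877 * (r - y)

orange = (1.0, 0.5, 0.0)                   # RGB: the color is not obvious from the numbers
h, s, v = colorsys.rgb_to_hsv(*orange)     # HSV: hue, saturation, value, each in [0, 1]
yuv = rgb_to_yuv(*orange)                  # YUV: one luma value plus two chroma offsets

# Darkening in HSV touches only V; hue and saturation stay as they are.
darker = colorsys.hsv_to_rgb(h, s, v * 0.5)

print(f"hue = {h * 360:.0f} degrees, saturation = {s:.2f}, value = {v:.2f}")
```

In HSV an edit such as "make it darker" or "shift the hue" is a change to a single number, which is why it suits hand-written classic CV rules.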
02 Color enhancing
Auto HDR
Raw - Sensors have different physical grids,
formats have different strategies for saving data,
and sensor data has to be processed differently.
PAIN
Use third-party software: pip install rawkit
Auto HDR
Overexposed is better than underexposed
That is how sensors are built
Auto HDR
White Balance - Photographers are wrong, and classical CV is not always right
Train a NN to learn which objects should be white,
and use this for your WB settings
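For reference, the classical baseline that "is not always right" is often the gray-world assumption: the average color of a scene is taken to be neutral gray. The sketch below shows that baseline only, not the NN approach suggested on the slide; the function name is an assumption.

```python
import numpy as np

def gray_world_white_balance(image):
    """Scale each channel so that all three channel means match the global mean.

    image: float array of shape (H, W, 3) with values in [0, 1].
    """
    channel_means = image.reshape(-1, 3).mean(axis=0)
    gains = channel_means.mean() / channel_means
    return np.clip(image * gains, 0.0, 1.0)

# Synthetic neutral scene with a warm color cast applied to it.
rng = np.random.default_rng(0)
gray_scene = rng.uniform(0.2, 0.6, size=(8, 8, 1))
warm_cast = gray_scene * np.array([1.0, 0.8, 0.6])
balanced = gray_world_white_balance(warm_cast)
```

The assumption breaks on scenes dominated by one color, such as a sunset or a forest, which is one reason to learn what should be white instead.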
Auto HDR
Camera Response Function (CRF) - relates scene irradiance to image intensities.
It is needed for most HDR transformations
God save OpenCV developers
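A toy example may help show what the CRF is used for. The sketch below assumes a simple gamma-shaped response, which is an assumption made for illustration; real cameras need calibration from a bracketed exposure series (OpenCV provides calibrators such as `cv2.createCalibrateDebevec`). Once the response is inverted, every exposure gives an estimate of the same scene irradiance.

```python
import numpy as np

GAMMA = 2.2  # assumed shape of the response curve, not a measured value

def crf(exposure):
    """Toy camera response: exposure (irradiance * time) -> intensity in [0, 1]."""
    return np.clip(exposure, 0.0, 1.0) ** (1.0 / GAMMA)

def inverse_crf(intensity):
    """Undo the toy response to get back a value proportional to exposure."""
    return intensity ** GAMMA

def estimate_irradiance(intensity, exposure_time):
    """Recover relative scene irradiance from one pixel intensity."""
    return inverse_crf(intensity) / exposure_time

# The same scene point photographed with three exposure times.
true_irradiance = 0.2
estimates = [estimate_irradiance(crf(true_irradiance * t), t) for t in (1.0, 2.0, 4.0)]
```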
Auto HDR
CLAHE - Contrast Limited Adaptive Histogram Equalization. A useful algorithm for fixing
histogram imbalance in an image. Visually similar to what HDR is trying to do
Available in OpenCV, scikit-image and many others
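The core of CLAHE is a clipped histogram equalization applied per tile. The sketch below shows that building block for a single tile in NumPy; it is not the full algorithm, which also splits the image into tiles and interpolates between the per-tile mappings. The function name and parameter defaults are assumptions; in practice one would typically use a library version such as `cv2.createCLAHE`.

```python
import numpy as np

def clipped_equalize_tile(tile, clip_limit=4.0, bins=256):
    """Contrast-limited histogram equalization of one uint8 tile."""
    hist = np.bincount(tile.ravel(), minlength=bins).astype(np.float64)
    # Clip the histogram at clip_limit times the average bin height...
    ceiling = clip_limit * tile.size / bins
    excess = np.maximum(hist - ceiling, 0.0).sum()
    hist = np.minimum(hist, ceiling)
    # ...and spread the clipped mass evenly over all bins.
    hist += excess / bins
    # The normalized cumulative histogram is the intensity mapping.
    cdf = np.cumsum(hist) / tile.size
    lut = np.round(cdf * (bins - 1)).astype(np.uint8)
    return lut[tile]

# Low-contrast synthetic tile: all values squeezed into [100, 140).
rng = np.random.default_rng(0)
tile = rng.integers(100, 140, size=(64, 64), dtype=np.uint8)
equalized = clipped_equalize_tile(tile)
```

The clip limit is what keeps noise in flat regions from being amplified as strongly as it would be with plain histogram equalization.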
Auto HDR
HDR - Basically, we fuse a few images taken with different parameters to get a good
exposure of every object in the image. A tone mapping technique is used to map the 32-bit
image to an 8-bit image (for most devices)
OpenCV
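The two steps described above, fusing exposures and tone mapping, can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions: the inputs are assumed to have a linear response (the CRF is already inverted), the hat-shaped weight and the global Reinhard operator are common textbook choices, and the function names are made up here. OpenCV ships production versions of both steps.

```python
import numpy as np

def merge_exposures(images, exposure_times):
    """Merge linear-response exposures (floats in [0, 1]) into one radiance map."""
    numerator = np.zeros_like(images[0], dtype=np.float64)
    denominator = np.zeros_like(images[0], dtype=np.float64)
    for image, t in zip(images, exposure_times):
        # Hat weight: trust mid-tones, distrust pixels near 0 (noise) or 1 (clipping).
        weight = 1.0 - np.abs(2.0 * image - 1.0) + 1e-6
        numerator += weight * image / t
        denominator += weight
    return numerator / denominator

def tone_map_reinhard(radiance):
    """Compress radiance in [0, inf) to 8 bits with the global operator L / (1 + L)."""
    mapped = radiance / (1.0 + radiance)
    return np.round(mapped * 255.0).astype(np.uint8)

# Three scene points spanning two orders of magnitude, shot with three exposure times.
scene = np.array([0.05, 0.5, 5.0])
times = [0.1, 1.0, 10.0]
exposures = [np.clip(scene * t, 0.0, 1.0) for t in times]

radiance = merge_exposures(exposures, times)
ldr = tone_map_reinhard(radiance)
```

No single exposure captures all three points without clipping or crushing one of them, but the merged radiance map recovers all of them.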
Auto HDR
And yes, we can do the same with one image
Auto HDR
Now merge all these things together and we get something like this
HDRNet
Real-time photo color enhancement with Bilateral Filter
Bilateral Filter
The main idea is that it uses not only spatial distance
but also intensity distance when weighting neighboring pixels.
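The idea above can be written down directly: each neighbor gets a weight that is the product of a spatial Gaussian and an intensity Gaussian. The brute-force NumPy sketch below is for illustration only; the parameter names and defaults are assumptions, and real-time use relies on accelerated variants such as the bilateral grid.

```python
import numpy as np

def bilateral_filter(image, radius=2, sigma_space=2.0, sigma_intensity=0.1):
    """Brute-force bilateral filter for a 2-D float image with values in [0, 1]."""
    height, width = image.shape
    padded = np.pad(image, radius, mode="edge")
    weighted_sum = np.zeros_like(image, dtype=np.float64)
    weight_total = np.zeros_like(image, dtype=np.float64)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Neighbor at offset (dy, dx) for every pixel at once.
            shifted = padded[radius + dy:radius + dy + height,
                             radius + dx:radius + dx + width]
            # Spatial term: far-away neighbors count less.
            spatial = np.exp(-(dx * dx + dy * dy) / (2.0 * sigma_space ** 2))
            # Intensity term: neighbors across an edge count less.
            tonal = np.exp(-((shifted - image) ** 2) / (2.0 * sigma_intensity ** 2))
            weight = spatial * tonal
            weighted_sum += weight * shifted
            weight_total += weight
    return weighted_sum / weight_total

# Noisy step edge: dark left half, bright right half.
rng = np.random.default_rng(0)
clean = np.full((16, 16), 0.2)
clean[:, 8:] = 0.8
noisy = clean + rng.normal(0.0, 0.02, clean.shape)
smoothed = bilateral_filter(noisy)
```

A plain Gaussian blur would smear the step; here the intensity term makes pixels on the other side of the edge contribute almost nothing, so the noise is reduced while the edge stays sharp.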
Bilateral Grid
HDRNet
Figure: Input, Result, Ground Truth
HDRNet
https://github.com/creotiv/hdrnet/
https://github.com/creotiv/hdrnet-pytorch/ - under development
- Only color enhancement
- Uses some sort of segmentation with Bilateral Grid
- Supervised learning
- Can't handle edge noise
- Small network can't handle a wide range of input image params (inputs should be unified)
HDRNet
Other interesting solutions
https://github.com/nothinglo/Deep-Photo-Enhancer
https://github.com/yuanming-hu/exposure
03 Other steps
Super Resolution
Super Resolution - SRGAN
Super Resolution - ESRGAN
Super Resolution - SFTGAN
Figure: Input, Bicubic, ESRGAN
Super Resolution - ESRGAN
Changes from SRGAN
- Deeper model using Residual-in-Residual Dense Blocks (RRDB)
- Uses a Relativistic average GAN instead of the vanilla GAN
- Improves the perceptual loss by using features before activation
- Pixel Shuffle replaced by nearest-neighbor interpolation
- Generator is pretrained before use in the GAN
- Batch Normalization removed to avoid artifacts
https://arxiv.org/abs/1809.00219
https://arxiv.org/abs/1804.02815
https://github.com/xinntao/ESRGAN
https://github.com/xinntao/SFTGAN
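The relativistic average GAN from the list above scores how much more realistic a real image looks than the average fake one, instead of scoring each image on its own. Below is a minimal NumPy sketch of the two losses as defined in the ESRGAN paper; the function names and the toy logits are assumptions, and a real implementation would live in a deep learning framework.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relativistic_average_losses(real_logits, fake_logits, eps=1e-12):
    """Return (discriminator_loss, generator_loss) of the relativistic average GAN.

    real_logits / fake_logits: raw discriminator outputs C(x) for a batch.
    """
    # Probability that a real image is more realistic than the average fake one.
    real_vs_fake = sigmoid(real_logits - fake_logits.mean())
    # Probability that a fake image is more realistic than the average real one.
    fake_vs_real = sigmoid(fake_logits - real_logits.mean())
    d_loss = (-np.mean(np.log(real_vs_fake + eps))
              - np.mean(np.log(1.0 - fake_vs_real + eps)))
    g_loss = (-np.mean(np.log(1.0 - real_vs_fake + eps))
              - np.mean(np.log(fake_vs_real + eps)))
    return d_loss, g_loss

# Toy logits: a discriminator that currently separates real from fake well.
d_loss, g_loss = relativistic_average_losses(np.array([2.0, 3.0]), np.array([-2.0, -3.0]))
# A discriminator that cannot tell them apart at all.
d_tie, g_tie = relativistic_average_losses(np.zeros(4), np.zeros(4))
```

Unlike the vanilla GAN loss, the generator loss depends on both real and fake batches, so the generator receives gradient from real data as well.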
Super Resolution
What can we see from these improvements?
- Context and data are what matter
- Layers like BN and Max Pooling, which destroy data, make things worse for the generator
- Understanding the materials behind the image can greatly improve quality
- Skip connections passing context to the end of the network also improve results
Style Transfer
What matters
- No MaxPool in the encoder network, use strides instead
- Style poisoning does not work as well as we want
- No BatchNorm in the transformer network
Adaptive Style Transfer
What matters
- GAN, for tasks like this where the loss is hard to design
- Multiple artworks per style
- Additional loss - maybe, hard to tell
https://compvis.github.io/adaptive-style-transfer/
Noise & Aberrations
JPEG Noise
https://csfieldguide.org.nz/en/chapters/coding-compression/image-compression-using-jpeg/
Figure: Low frequencies, High frequencies
JPEG Noise
- Vast context loss
- Noise position is static
- Noise patterns are repetitive
- Transform function is not bijective: different inputs X1 and X2 can give the same output, F(X1) = F(X2)
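The "not bijective" point can be demonstrated with the core of JPEG: an 8x8 DCT followed by quantization. The sketch below is a simplification under stated assumptions: it uses one uniform quantization step instead of JPEG's per-frequency table and skips color conversion and entropy coding. The function names are made up for illustration.

```python
import numpy as np

N = 8
k = np.arange(N).reshape(-1, 1)
n = np.arange(N).reshape(1, -1)
# Orthonormal 8x8 DCT-II matrix, the transform JPEG applies to each block.
DCT = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
DCT[0, :] = np.sqrt(1.0 / N)

def jpeg_like_encode(block, step=16):
    """DCT of one 8x8 block followed by uniform quantization (the lossy step)."""
    coeffs = DCT @ (block - 128.0) @ DCT.T
    return np.round(coeffs / step).astype(int)

def jpeg_like_decode(quantized, step=16):
    """Dequantize and apply the inverse DCT."""
    return DCT.T @ (quantized * step) @ DCT + 128.0

# X1: a flat block. X2: the same block with a fine +/-1 checkerboard texture.
block_flat = np.full((N, N), 160.0)
checkerboard = (-1.0) ** np.indices((N, N)).sum(axis=0)
block_textured = block_flat + checkerboard

code_flat = jpeg_like_encode(block_flat)
code_textured = jpeg_like_encode(block_textured)
restored = jpeg_like_decode(code_textured)
```

Both blocks produce identical codes, so the decoder returns the flat block in both cases: the texture is gone and no inverse function can bring it back. Any restoration network has to guess the missing detail from context.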
JPEG Noise
UnBlock http://www.johncostella.com/unblock/
Wavelet transforms https://sci-hub.tw/10.1109/ICIEA.2006.257369
+
Detail-enhancing GAN network ~ SuperRes network
https://pywavelets.readthedocs.io/en/latest/
Hot Noise aka Grain Noise
God Bless AutoEncoders
Simple MNIST Example http://bit.ly/2TDQaFr
Noise2Noise (Unsupervised) https://github.com/NVlabs/noise2noise
(https://arxiv.org/abs/1803.04189)
Chromatic Aberration
TCA_Correct + Hugin(Fulla) https://wiki.panotools.org/Tca_correct
https://wiki.panotools.org/Lens_correction_model
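The lens correction model linked above describes radial distortion as r_src = (a*r^3 + b*r^2 + c*r + d) * r_dest, with r normalized so that 1.0 is half the smaller image side. Transverse chromatic aberration is corrected by giving the red and blue channels their own coefficients and leaving green as the reference. The sketch below illustrates that idea with nearest-neighbor resampling; the function names and the coefficient values are hypothetical, and real values would come from a tool such as tca_correct.

```python
import numpy as np

def radial_scale(r_dest, a, b, c, d):
    """Scale factor of the PanoTools lens model: r_src = radial_scale(r_dest) * r_dest."""
    return a * r_dest ** 3 + b * r_dest ** 2 + c * r_dest + d

def correct_channel(channel, a, b, c, d):
    """Resample one color channel radially (nearest neighbor, for brevity)."""
    height, width = channel.shape
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    norm = min(height, width) / 2.0          # radius 1.0 = half the smaller side
    yy, xx = np.indices((height, width), dtype=np.float64)
    dy, dx = (yy - cy) / norm, (xx - cx) / norm
    scale = radial_scale(np.hypot(dx, dy), a, b, c, d)
    # For every output pixel, look up the source pixel at the scaled radius.
    src_y = np.clip(np.rint(cy + dy * scale * norm), 0, height - 1).astype(int)
    src_x = np.clip(np.rint(cx + dx * scale * norm), 0, width - 1).astype(int)
    return channel[src_y, src_x]

# Hypothetical coefficients: red and blue get slightly different radial scales.
image = np.random.default_rng(0).random((9, 11, 3))
corrected = image.copy()
corrected[..., 0] = correct_channel(image[..., 0], 0.0, 0.0, 0.0, 1.002)
corrected[..., 2] = correct_channel(image[..., 2], 0.0, 0.0, 0.0, 0.998)
```

With coefficients (0, 0, 0, 1) the mapping is the identity, which is a convenient sanity check before plugging in measured values.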
Thanks
Help to save homeless animals
Donate to animal shelter
BIT.LY/MLCATS
fb.me/anikishaev
t.me/ml_world

Photo enhance. Problems. Solutions. Ideas