In the past few years we have been witnessing incredible
progress in the field of computer vision, mainly due to deep
learning.
Tackling challenges in
computer vision
Augustin Marty
CEO Deepomatic
Deep learning has transformed the way image-related problems are solved.
You need to feed your model with examples: give it thousands of images so that it learns to differentiate between them.
ImageNet: the iconic challenge in the world of computer vision, with thousands of categories. Algorithms must place every image into the correct category.
Democratisation of image recognition: the top-5 error rate dropped from 26% to 3% today. For comparison, a human's error rate is about 5%.
DEEP LEARNING
IS THE NEW PARADIGM
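The ImageNet error rates quoted above are usually reported as top-5 error: a prediction counts as correct if the true category is among the model's five highest-scoring guesses. The following is a minimal sketch of that metric, assuming each prediction is a plain dictionary of category scores; the function name and the toy data are illustrative only and do not come from the talk.

```python
def top_k_error(scores, true_labels, k=5):
    """Fraction of images whose true label is NOT among the k best-scored labels.

    scores: list of dicts mapping label -> score (one dict per image).
    true_labels: list of the correct label for each image.
    """
    misses = 0
    for image_scores, truth in zip(scores, true_labels):
        # Labels sorted from highest to lowest score, truncated to the top k.
        top_k = sorted(image_scores, key=image_scores.get, reverse=True)[:k]
        if truth not in top_k:
            misses += 1
    return misses / len(true_labels)


# Toy data (hypothetical): two images, three categories.
toy_scores = [
    {"cat": 0.6, "dog": 0.3, "car": 0.1},
    {"cat": 0.2, "dog": 0.1, "car": 0.7},
]
toy_truth = ["dog", "cat"]

top1 = top_k_error(toy_scores, toy_truth, k=1)
top2 = top_k_error(toy_scores, toy_truth, k=2)
```

With a larger k the metric is more forgiving, which is why top-5 figures are lower than top-1 figures for the same model.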
Progress in deep learning was also made possible because the tech giants started investing massively in it, developing bigger models (with more layers and millions of parameters).
This is a heavy process.
GoogLeNet (2014)
22 LAYERS
ResNet (2015)
152 LAYERS
With bigger and more complex models and algorithms, there is a need for great computing power.
Here NVIDIA’s CEO, Jen-Hsun Huang, is presenting an OpenAI supercomputer to Elon Musk.
This is a relatively small box, packed with GPUs that process calculations amazingly fast.
This type of supercomputer is now affordable and accessible, especially since it is available in the cloud.
This progress and computing power have led to interesting applications in image recognition:
OPEN-AI SUPERCOMPUTER
Google Show and Tell: Google’s image captioning model.
The model is able to describe the scene with a collection of
verbs and adjectives, like a human does.
Google’s Show and Tell
Style transfer is another example:
Deep learning algorithms can understand the style of a
painting and reproduce it.
Here it understands Van Gogh’s expressionist painting and
- coupled with a picture of houses along a river -
reproduces the style, creating a new picture.
Style
Transfer
Video colouration:
Artificial intelligence colours the video, turning a black and
white video into a coloured one.
How was this achieved?
1. Engineers took thousands of coloured videos and made
them black and white.
2. They then trained the algorithm to understand the
correlation between the B&W videos and the respective
coloured ones.
3. Then the algorithm was able to colour new, initially B&W
videos.
Video
Colouration
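Step 1 of the colouration recipe above (turning coloured footage into black and white to obtain training pairs) can be sketched as follows. This is only an illustration under stated assumptions: frames are represented as nested lists of (r, g, b) tuples, the grayscale conversion uses the common ITU-R BT.601 luma weights, and the function names are hypothetical. The talk does not say how the engineers actually did the conversion.

```python
# ITU-R BT.601 luma weights, a common (assumed) choice for RGB -> grayscale.
R_WEIGHT, G_WEIGHT, B_WEIGHT = 0.299, 0.587, 0.114


def to_grayscale(frame):
    """Convert a frame (list of rows of (r, g, b) tuples) to rows of gray values."""
    return [
        [R_WEIGHT * r + G_WEIGHT * g + B_WEIGHT * b for (r, g, b) in row]
        for row in frame
    ]


def make_training_pairs(colour_frames):
    """Build (black-and-white input, colour target) pairs for training."""
    return [(to_grayscale(frame), frame) for frame in colour_frames]


# Two tiny 1x2 "frames" standing in for real video frames (hypothetical data).
colour_frames = [
    [[(255, 255, 255), (0, 0, 0)]],
    [[(255, 0, 0), (0, 0, 255)]],
]
pairs = make_training_pairs(colour_frames)
```

A model trained on such pairs learns to map the gray input back to the colour target, which is what allows it to colour footage that was black and white to begin with.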
For specific industry problems, however, those models don’t necessarily work. The following 3 images are taken from Microsoft’s online image recognition platform.
Here an automotive part is mistaken for a “close up of a plane”. Not quite!
“Close up of a plane”
A terrorist is labelled as “man looking at the ocean”.
Of course there's an ocean and a man, but first of all he’s
clearly looking away from the ocean. The machine doesn’t
recognise the dark knife in his hands, and can’t properly
identify the face because of the mask.
“Man looking at the ocean”
“A group of men standing on top of a dirt field”:
it’s not wrong: you have a group of men and yes, it’s a dirt field.
But we humans understand the context of this picture better than the algorithm does: these men are fighters, they have weapons, and they are probably engaging in an act of war.
All this goes to show that progress is undeniable, but there is still much to do.
When you want to apply these technologies to your industry- or company-specific challenges, they might not work. You may therefore think that it’s not for you, that AI and computer vision don’t solve your need.
But…
“Group of men standing on top of a dirt field”
Artificial Intelligence is for everyone, for every company.
The following examples show industry-specific problems
that were solved thanks to computer vision.
AI IS FOR EVERY COMPANY
Sadako Technologies, a Spanish firm, has developed a waste sorting device by combining robotics and computer vision.
They are therefore able to automatically distinguish plastic from other waste on a conveyor belt.
This has promising applications for future waste sorting and management systems, and for other cleaning applications.
CASE STUDY: SADAKO
Waste Sorting
Regaind is a startup that is able to assess image
aesthetics. By selecting the best pictures (amongst thousands of
pictures taken during a vacation), it creates photo albums
automatically, based on an analysis of picture quality.
CASE STUDY: REGAIND
Image Aesthetics
Coming back to the image of the fighters:
it’s possible, with today’s technology, to develop weapon
detection for images and videos. This is of great use and
importance to the military and intelligence services.
CASE STUDY: SECURITY
Weapon Detection
The three previous applications have been developed by
small companies. If they were able to do this, then so can
you, if you follow the right methodology.
There is a secret sauce to tackle image recognition
challenges specific to your industry:
THE SECRET AI SAUCE
TO SOLVE
YOUR PROBLEMS
First, you need a deep learning framework.
These are available as they are open source. You just
need an engineer to use them.
A framework
Second ingredient: annotated images.
These must be relevant to the task you are tackling.
The images differ for each use case and problem: you need to
develop a dataset.
ANNOTATED DATA
Third ingredient: annotators, the humans in the loop.
HUMANS-IN-THE-LOOP
Assemble the 3 ingredients:
You train an algorithm using your dataset and
framework.
Your first algorithm is applied to never-before-seen images
(it is never perfect at first).
Your algorithm won’t give an answer for every new image it
is given: you need humans in the loop to annotate
those it couldn’t answer (when the machine lacked confidence),
completing the task.
AI + annotators, who complete the job and also keep
building the dataset, creating a better algorithm …
[Diagram: a loop linking Dataset, Neural network models and Humans in the loop, with arrows labelled Training, Annotation and “Calling humans when the model is not sure”]
THE LOOP
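The loop described above can be sketched in a few lines: the model answers when it is confident, a human annotator is called otherwise, and every human answer is added to the dataset for the next training round. This is a minimal sketch under stated assumptions: the 0.8 confidence threshold, the `Prediction` class and the `route` function are all hypothetical names and values chosen for illustration, not something the talk specifies.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.8  # assumed value; in practice tuned per use case


@dataclass
class Prediction:
    label: str
    confidence: float  # model's confidence in `label`, between 0 and 1


def route(image_id, prediction, ask_human, dataset, threshold=CONFIDENCE_THRESHOLD):
    """Return the final label, calling a human when the model is not sure."""
    if prediction.confidence >= threshold:
        return prediction.label
    label = ask_human(image_id)
    # The human annotation also grows the dataset used for the next training round.
    dataset.append((image_id, label))
    return label


# Hypothetical model outputs and human answers.
predictions = {
    "img_001": Prediction("plastic", 0.97),
    "img_002": Prediction("plastic", 0.41),
}
human_answers = {"img_002": "glass"}
new_annotations = []

results = {
    image_id: route(image_id, pred, human_answers.__getitem__, new_annotations)
    for image_id, pred in predictions.items()
}
```

Only the low-confidence image reaches a human, so annotators spend their time where the model is weakest.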
This may all seem relatively simple, but there’s a catch: you need a very good dataset for it to work well, meaning a
huge number of perfectly annotated images.
Creating these datasets takes lots of time.
BUT…
THERE’S A CATCH
Here’s an example of furniture detection
To develop an algorithm that detects furniture in images, you need a dataset with boxes around every single item in the image.
Consequently, you need to do this manually at first, making sure the boxes fit perfectly around each item and that no object is missed.
This takes 10 minutes for 1 image!
Sadako Technologies, mentioned earlier, needed to do this to train their technology: they put millions of boxes around plastic bottles to create their dataset.
FURNITURE
DETECTION
(10 min)
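One common way to check how well a box fits an item is intersection over union (IoU), which compares an annotator's box with a reference box. The talk does not mention IoU, so treat this as an illustrative assumption; the box format (x_min, y_min, x_max, y_max) and the function name are also hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Width and height of the overlapping region (zero if the boxes do not overlap).
    overlap_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    overlap_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = overlap_w * overlap_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


reference_box = (0, 0, 10, 10)   # hypothetical "perfect" box around a chair
annotator_box = (5, 0, 15, 10)   # hypothetical box drawn half off the item

fit = iou(reference_box, annotator_box)
```

A value of 1.0 means the two boxes coincide exactly; a review step could flag annotations whose IoU against a second annotator's box falls below a chosen threshold.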
Some tasks are even more time-consuming. If you want to
develop algorithms for robotics (automated cars, robots,
drones etc.), they need to understand their entire
environment.
So in this case, to train algorithms you need to determine
what each pixel represents in the image.
This segmentation task takes over an hour per image.
URBAN
SEGMENTATION
(70 min)
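To get a feel for what 10 minutes (boxes) or 70 minutes (pixel-wise segmentation) per image means at dataset scale, here is a small back-of-the-envelope calculation. The per-image times come from the slides; the dataset size of 10,000 images is an assumption picked purely for illustration.

```python
def annotation_hours(num_images, minutes_per_image):
    """Total annotation effort in hours for a dataset of the given size."""
    return num_images * minutes_per_image / 60


DATASET_SIZE = 10_000  # hypothetical dataset size, not a figure from the talk

box_hours = annotation_hours(DATASET_SIZE, 10)           # bounding boxes
segmentation_hours = annotation_hours(DATASET_SIZE, 70)  # pixel-wise segmentation

print(f"Boxes:        {box_hours:,.0f} hours")
print(f"Segmentation: {segmentation_hours:,.0f} hours")
```

Under these assumptions, segmentation costs seven times as many hours as box annotation for the same number of images.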
The real bottleneck is now dataset creation.
DATA IS AI
BOTTLENECK
Good dataset creation is crucial to speed up the pace of AI
progress
LACK OF DATA IS SLOWING DOWN AI EXPANSION
There are currently 2 ways to make datasets:
1) Have it done internally by data scientists
2) Use crowdsourcing, such as Amazon’s Mechanical Turk. This isn’t too bad, but it is time-consuming and you need to do many quality reviews and checks to ensure satisfactory results.
Every data scientist has, at least once, thrown out a dataset due to its poor quality.
Solutions
AMAZON MECHANICAL TURK: time-consuming, poor quality
or
DO IT INTERNALLY: make your data scientists want to quit
There is a real need to industrialise the dataset creation process: this is the true way forward for solving each company’s image-related challenges.
There are a few elements that may help the industrialisation and democratisation of dataset creation:
INDUSTRIALISING THE
ANNOTATION PROCESS
1. Improve the UX of annotation tools
We need dedicated software: today there is no dedicated software to produce datasets. It’s crazy to think that each company develops its own small tool.
A big leap in productivity can be achieved simply by improving the design and the annotation experience.
The second element to increase the pace of AI production is to work on active learning.
You don’t want to annotate millions of images. Active learning is the science of selecting the most informative images, so that you can build AI with as few images as possible.
2. Active Learning & HITL
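One simple active learning strategy is uncertainty sampling: send to annotators the images on which the current model is least certain. The sketch below ranks images by the entropy of the model's predicted class probabilities. It is only one of several possible selection criteria, and the talk does not say which one is used; the function names and the toy probabilities are hypothetical.

```python
import math


def entropy(probabilities):
    """Shannon entropy of a probability distribution (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def select_most_informative(predictions, k):
    """Return the ids of the k images with the most uncertain predictions.

    predictions: dict mapping image id -> list of class probabilities.
    """
    ranked = sorted(predictions, key=lambda i: entropy(predictions[i]), reverse=True)
    return ranked[:k]


# Hypothetical model outputs for three unlabelled images (two classes).
unlabelled = {
    "img_a": [0.98, 0.02],  # model is almost certain
    "img_b": [0.50, 0.50],  # model cannot decide
    "img_c": [0.70, 0.30],
}

to_annotate = select_most_informative(unlabelled, k=2)
```

Here the images the model already handles well are skipped, so the annotation budget goes to the examples expected to teach the model the most.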
Improve software with machine learning: if software knows
what you’re doing it can really improve the ease of the
3. Improve tools with AI
We intend to reduce the time from 70 to 5 minutes,
increasing productivity by more than 10x (70 / 5 = 14x).
We intend to reduce the time from
70 to 5 minutes in 10 months
Here machine learning helps software to annotate images.
This video shows that if you are looking for a face the box
will automatically adjust around the head. The same goes
for annotating objects pixel-wise, speeding up the
completion of the task.
AI really is for everyone and can solve any company’s
challenges. Algorithms are becoming commodities and
datasets are the bottleneck of AI.
Democratising and industrialising the process of dataset
creation will allow all of us, all companies, to move
forward with our AI applications and goals.
THANK YOU

Tackling Challenges in Computer Vision

  • 1. In the past few years we have been witnessing incredible progress in the field of computer vision, mainly due to deep learning. Tackling challenges in computer vision Augustin Marty CEO Deepomatic
  • 2. Deep learning changed so much in solving image related questions. You need to feed your model with examples. Give your model images, thousands of images so that it learns to differentiate. Imagenet: Iconic challenge in the world of computer vision –with thousands of categories: all the images must be placed by algorithms into the correct category. Democratisation of image recognition – error rate dropped from 26% to 3% today. 5% is the error rate of a human. DEEP LEARNING IS THE NEW PARADIGM
  • 3. The deep learning progress was also made possible because the tech giants started to massively investing in it, developing bigger models (with more layers and millions of parameters) . This is a heavy process GoogleNet (2014) 22 LAYERS ResNet (2015) 152 LAYERS
  • 4. With bigger and more complex models and algorithms there is a need for great computing power. Here NVIDIA’s CEO, Jen-Hsun Huang, is presenting and Open AI supercomputer to Elon Musk. This is a relatively small box, aligned with GPU processing calculations amazingly fast. This type of supercomputer is now affordable and accessible, especially since it is in the Cloud. This progress and computing power has led to interesting applications in image recognition: OPEN-AI SUPERCOMPUTER
  • 5. google show and tell : Google’s image captioning model. The model is able to describe the scene with a collection of verbs, adjectives – like a human does. Google’s Show and Tell
  • 6. Style transfer is another example: Deep learning algorithms can understand style of a painting and reproduce it. Here it understands Van Gogh’s expressionist painting and - coupled with a picture of houses along a river - reproduces the style, creating a new picture. Style Transfer
  • 7. Video Coloration: Artificial intelligence colours the video turning the black and white video into a coloured one. How was this achieved?: 1.Engineers took thousands of coloured videos and made them black and white. 2. They then trained the algorithm to understand the correlation between the B&W videos and the respective coloured ones. 3. Then the algorithm was able to colour new, initially B&W videos. Video Colouration
  • 8. In specific industry problems those models don’t necessarily work. Here the following 3 images are taken from Microsoft’s image recognition platform online. Here an automotive part is mistaken for a “close up of a plane” - not quite! “Close up of a plane”
  • 9. A terrorist is labeled as ‘man looking at the ocean' . Of course there's an ocean and a man but first of all he’s clearly looking away from the ocean. The machine doesn’t recognise the dark knife in his hands, and can’t properly identify the face because of the mask. “Man looking at the ocean”
  • 10. a group of men standing on a dirt field : it’s not wrong: you have a group of men and yes it’s a dirt road. But we humans, understand the context of this picture better , unlike the algorithm: these group of men are fighters – they have weapons, they are fighters, and are probably engaging in an act of war All this goes to show that progress is undeniable but there is much to still do. When you want to apply these technologies to your industry or company specific challenges it might not work. You may think that therefore its not for you, that AI and computer vision doesn’t solve your need But… “Group of men standing on top of a dirt field”
  • 11. Artificial Intelligence is for everyone, for every company. The following examples show industry specific problems that were solved thanks to computer visions AI IS FOR EVERY COMPANY
  • 12. SADAKO TECHNOLOGIES- a Spanish firm- has developed a waste sorting device by combining robotics and computer vision. They are therefore able to automatically distinguish plastic from other waste on a conveyer belt. This can have incredibly promising applications for the future waste sorting systems and management,, and other cleaning applications CASE STUDY: SADAKO Waste Sorting
  • 13. Regaind is a startup that is able to qualify the image aesthetics. Selecting best pictures (amongst thousands of pictures taken during a vacation) it creates photo albums automatically by analysing the quality of the picture. CASE STUDY: REGAIND Image Aesthetics
  • 14. Coming back to the image of the fighters. Its possible, with todays’ technology, to develop weapon detection for images and in videos. This has great use and is of great importance to military and intelligence. CASE STUDY: SECURITY Weapon Detection
  • 15. The three previous application have been developed by small companies. If they were able to do this than so can you, if you follow the right methodology There is a secret sauce to tackle image recognition challenges specific to your industry: THE SECRET AI SAUCE TO SOLVE YOUR PROBLEMS
  • 16. First, you need a deep learning framework. These are available as they are open source. You just need an engineer to use them. A framework
  • 17. Second ingredient: annotated images. These must be relevant to the task you are tackling The images differ for each use-case and problem – need to develop a dataset. ANNOTATED DATA
  • 18. Annotators – human in the loop HUMANS-IN-THE-LOOP
  • 19. Assemble the 3 ingredients; You trained an algorithm thanks to your dataset and framework. Your first algorithm is applied to never-before-seen images (it is never perfect at first). Your algorithm won’t give an answer on all new images provides: you need humans-in-the-loop to keep annotating those that weren’t (when the machine lacked confidence), completing the task. AI + annotators who complete the job and also keeps building the dataset, creating a better algorithm … Dataset Neural network models Humans in the loop TrainingAnnotation Calling humans when the model is not sure THE LOOP
  • 20. This may all seem relatively simple but there’s a catch: you need to have a very good dataset for it to work well: a huge amount of perfectly annotated images Creating these datasets takes lot’s of time. BUT… THERE’S A CATCH
  • 21. Here’s an example of furniture detection To develop and algorithm that detects furniture in images you need a dataset with boxes around every single item in the image. Consequently, you need to do this manually at first, making sure the boxes are perfectly around the item, and that no object is missed. This takes 10 minutes for 1 image! Sadako Technology, mentioned earlier, needed to do this to train their technology: they put millions of boxes around plastic bottles to create their dataset. FURNITURE DETECTION (10 min)
  • 22. Some tasks are even more time-consuming. Algorithms for robotics (automated cars, robots, drones, etc.) need to understand their entire environment. To train them, you need to specify what each pixel in the image represents. This segmentation task takes over an hour per image. URBAN SEGMENTATION (70 min)
  • 23. The real bottleneck is now dataset creation. DATA IS AI BOTTLENECK
  • 24. Good dataset creation is crucial to speeding up the pace of AI progress. LACK OF DATA IS SLOWING DOWN AI EXPANSION
  • 25. Today there are two ways to make datasets: 1) have it done internally by data scientists; 2) use crowdsourcing, such as Amazon’s Mechanical Turk. The latter isn’t too bad, but it is time-consuming and you need many quality reviews and checks to ensure satisfactory results. Every data scientist has, at least once, thrown out a dataset because of its poor quality. Solutions: DO IT INTERNALLY (make your data scientists want to quit) or AMAZON MECHANICAL TURK (time consuming, poor quality)
  • 26. Industrialising the dataset creation process is the real way forward for solving each company’s image-related challenges. A few elements may help the industrialisation and democratisation of dataset creation: INDUSTRIALISING THE ANNOTATION PROCESS
  • 27. 1. Improve the UX of annotation tools. We need dedicated software: today there is no dedicated software for producing datasets. It’s crazy to think that each company develops its own small tool. A big leap in productivity can be achieved simply by improving the design and the annotation experience.
  • 28. The second element for increasing the pace of AI production is active learning. You don’t want to annotate millions of images. Active learning is the science of selecting the most informative images, so that you can build AI with as few images as possible. 2. Active Learning & HITL
  • 29. Improve the software with machine learning: if the software knows what you’re doing, it can really improve the ease of the task. 3. Improve tools with AI
  • 30. We intend to reduce the time from 70 to 5 minutes, increasing productivity more than tenfold (70 / 5 = 14x). We intend to reduce the time from 70 to 5 minutes in 10 months
  • 31. Here machine learning helps the software annotate images. The video shows that if you are annotating a face, the box automatically adjusts around the head. The same applies when annotating objects pixel-wise, which speeds up completion of the task.
  • 32. AI really is for everyone and can solve any company’s challenges. Algorithms are becoming commodities, and datasets are the bottleneck of AI. Democratising and industrialising dataset creation will allow all of us, all companies, to move forward with our AI applications and goals. THANK YOU
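
The loop described on slide 19 (the model answers when it is confident, humans annotate the rest, and every human answer goes back into the dataset) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the speaker's implementation: the function name `route_images`, the `predict` and `ask_human` callables, and the 0.8 confidence threshold are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical types for this sketch: a prediction is (label, confidence in [0, 1]).
Prediction = Tuple[str, float]


def route_images(
    image_ids: Iterable[str],
    predict: Callable[[str], Prediction],
    ask_human: Callable[[str], str],
    threshold: float = 0.8,  # assumed cut-off, not a value from the talk
) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Label every image, calling a human whenever the model is not sure.

    Returns (answers, new_training_examples): `answers` holds one label per
    image, and `new_training_examples` holds only the human-annotated pairs,
    which would be appended to the dataset before the next training round.
    """
    answers: Dict[str, str] = {}
    new_training_examples: List[Tuple[str, str]] = []
    for image_id in image_ids:
        label, confidence = predict(image_id)
        if confidence < threshold:
            # The machine lacked confidence: a human completes the task.
            label = ask_human(image_id)
            new_training_examples.append((image_id, label))
        answers[image_id] = label
    return answers, new_training_examples


# Usage with stand-in functions (a real system would call a trained model
# and an annotation tool instead).
fake_model = {"a.jpg": ("bottle", 0.95), "b.jpg": ("bottle", 0.40)}
answers, new_examples = route_images(
    ["a.jpg", "b.jpg"],
    predict=lambda image_id: fake_model[image_id],
    ask_human=lambda image_id: "can",
)
print(answers)       # every image gets an answer
print(new_examples)  # only the low-confidence image was sent to a human
```

Retraining on the growing dataset after each pass is what closes the loop; the threshold controls the trade-off between how often humans are called and how many model errors slip through.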

Editor's Notes

  1. In the past few years we have been witnessing incredible progress in the field of computer vision, mainly due to deep learning.
  2. Deep learning has changed a great deal in how image-related questions are solved. You need to feed your model with examples: give it images, thousands of images, so that it learns to differentiate. ImageNet is the iconic challenge in the world of computer vision, with thousands of categories: algorithms must place every image into the correct category. Democratisation of image recognition: the error rate has dropped from 26% to 3% today; the human error rate is 5%.
  3. Deep learning progress was also made possible because the tech giants started investing massively in it, developing bigger models (with more layers and millions of parameters). This is a heavy process.
  4. Bigger and more complex models and algorithms require great computing power. Here NVIDIA’s CEO, Jen-Hsun Huang, is presenting an OpenAI supercomputer to Elon Musk. This is a relatively small box, lined with GPUs that process calculations amazingly fast. This type of supercomputer is now affordable and accessible, especially since it is available in the cloud. This progress and computing power have led to interesting applications in image recognition:
  5. Google’s Show and Tell: Google’s image captioning model. The model is able to describe a scene with a collection of verbs and adjectives, as a human does.
  6. Style transfer is another example: deep learning algorithms can understand the style of a painting and reproduce it. Here the algorithm captures the style of a Post-Impressionist painting by Van Gogh and, combined with a picture of houses along a river, reproduces that style to create a new picture.
  7. Video colourisation: artificial intelligence colours the video, turning a black-and-white video into a coloured one. How was this achieved? 1. Engineers took thousands of colour videos and made them black and white. 2. They then trained the algorithm to learn the correspondence between the B&W videos and their colour originals. 3. The algorithm was then able to colour new videos that were originally B&W.
  8. On industry-specific problems, those models don’t necessarily work. The following three images are taken from Microsoft’s online image recognition platform. Here an automotive part is mistaken for a “close up of a plane”. Not quite!
  9. A terrorist is labelled as “man looking at the ocean”. Of course there is an ocean and a man, but first of all he is clearly looking away from the ocean. The machine doesn’t recognise the dark knife in his hands and can’t properly identify the face because of the mask.
  10. “A group of men standing on a dirt field”: it’s not wrong. There is a group of men and, yes, it’s a dirt road. But we humans, unlike the algorithm, understand the context of this picture better: these men are fighters, they carry weapons, and they are probably engaged in an act of war. All this goes to show that progress is undeniable but there is still much to do. When you want to apply these technologies to your industry- or company-specific challenges, they might not work. You may therefore think that it’s not for you, that AI and computer vision don’t meet your needs. But…
  11. Artificial intelligence is for everyone, for every company. The following examples show industry-specific problems that were solved thanks to computer vision.
  12. Sadako Technologies, a Spanish firm, has developed a waste-sorting device by combining robotics and computer vision. It can automatically distinguish plastic from other waste on a conveyor belt. This has very promising applications for future waste sorting and management systems, and for other cleaning applications.
  13. Regaind is a startup that assesses image aesthetics. By selecting the best pictures (among the thousands taken during a vacation), it creates photo albums automatically by analysing the quality of each picture.
  14. Coming back to the image of the fighters: with today’s technology, it is possible to develop weapon detection for images and videos. This is of great use and importance to the military and intelligence services.
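
Slide 28 describes active learning as selecting the most informative images so that as few as possible need annotating. One common way to do this is uncertainty sampling: rank unlabelled images by the entropy of the model's predicted class probabilities and send the most uncertain ones to annotators first. The talk does not say which strategy was used, so the sketch below is an assumption, and `prediction_entropy`, `select_for_annotation`, and the example probabilities are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def prediction_entropy(probabilities: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted class distribution.

    Higher entropy means the model is less sure which class the image belongs to.
    """
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


def select_for_annotation(
    predictions: Dict[str, Sequence[float]],
    budget: int,
) -> List[str]:
    """Return the `budget` image ids whose predictions are most uncertain."""
    ranked = sorted(
        predictions,
        key=lambda image_id: prediction_entropy(predictions[image_id]),
        reverse=True,  # most uncertain first
    )
    return ranked[:budget]


# Made-up class probabilities for three unlabelled images (two classes each).
unlabelled = {
    "img1.jpg": [0.98, 0.02],  # model is nearly certain
    "img2.jpg": [0.50, 0.50],  # model cannot decide
    "img3.jpg": [0.70, 0.30],  # somewhat uncertain
}
to_annotate = select_for_annotation(unlabelled, budget=2)
print(to_annotate)
```

With an annotation budget of two, the confident prediction is skipped and human time goes to the two images the model finds hardest, which is the point of annotating fewer but more informative images.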