SlideShare a Scribd company logo
1 of 18
Image Recognition:
For RAX Studio Citrix Automation
Fast.ai
Deep Learning Algorithm
Convolutional Neural Network (CNN)
1
How it works:
Dog
3
Cat
System
Identifies:
What is this?
…..
…..
Feature Detection
Feature Matching
Specific patterns which are unique and can be
easily compared and tracked.
SIFT - Scale-Invariant Feature Transform
SURF - Speeded-Up Robust Features
ORB - Oriented FAST and Rotated BRIEF
2
How it works:
5
Template Matching
Searching for an object in an image3
How it works:
7
Calculation Methods:
8
SQDIFF - calculates square difference
CCORR - calculates correlation
COEFF - calculates correlation coefficient
Optical Character
Recognition
Electronic conversion of images to machine-
encoded text
4
10
How it should work:
- Use OCR on the whole
screen
- Find the nth occurence
of word
- Get the image position
of that word
“
- We cannot train fast.ai to identify all
icons/text that the user will use for
automation.
- Feature matching is good for finding
scene images but an image of a website
,for example, could be full of text.
Therefore confusing the features to be
matched.
11
Assessments:
“
- Template Matching is the most reliable
one for citrix automation especially if it
is for matching images with static
graphical user interface.
- OCR is good for web content which
contains dynamic changes to its
graphical design that may be hard for
template matching to track.
12
Assessments:
Proposal:
Template
Matching
13
Make use of 2 different methods to broaden the
option of the users.
OCR
Proposal:
Template
Matching
14
- Template Matching would be used for content
which is unique and those with minimal
changes to the GUI.
- The user should be able to choose the size of
the cropped template. Larger template means
more unique elements included.
Proposal:
15
- If template matching still fails, then the user
should opt in to OCR.
- OCR should be able to find the nth occurrence
of the word and get its position.
- Then it could click anywhere on the screen
relative to the text’s position.
OCR
Suggestion:
16
- The keyboard keys & shortcut keys would be a
great tool for navigating and getting the cursor
to the different textboxes/links.
Ex. Tab, Page up, Page Down
Win + Down = Minimize
etc.
Thanks!
Any questions?
17
References &
Resources:
◇ https://docs.fast.ai/
◇ http://cs231n.github.io/convolutional-networks/
◇ https://docs.opencv.org/
◇ https://www.kaggle.com/wesamelshamy/tutorial-image-feature-
extraction-and-matching
◇ http://www.aishack.in/tutorials/template-matching/
◇ http://scikit-image.org/docs
◇ https://www.slidescarnival.com/
◇ Google Images
18

More Related Content

Similar to Image Recognition for RAX Studio Citrix Automation

Licence plate recognition using matlab programming
Licence plate recognition using matlab programming Licence plate recognition using matlab programming
Licence plate recognition using matlab programming somchaturvedi
 
16 OpenCV Functions to Start your Computer Vision journey.docx
16 OpenCV Functions to Start your Computer Vision journey.docx16 OpenCV Functions to Start your Computer Vision journey.docx
16 OpenCV Functions to Start your Computer Vision journey.docxssuser90e017
 
MCA Society Project Seminar.pptx
MCA Society Project Seminar.pptxMCA Society Project Seminar.pptx
MCA Society Project Seminar.pptxNomearod1
 
The Guide To Wireframing
The Guide To WireframingThe Guide To Wireframing
The Guide To WireframingLewis Lin 🦊
 
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTrivadis
 
Image processing project list for java and dotnet
Image processing project list for java and dotnetImage processing project list for java and dotnet
Image processing project list for java and dotnetredpel dot com
 
SharePoint 2013 Preview
SharePoint 2013 PreviewSharePoint 2013 Preview
SharePoint 2013 PreviewRegroove
 
Deep learning on mobile
Deep learning on mobileDeep learning on mobile
Deep learning on mobileAnirudh Koul
 
Over view of Technologies
Over view of TechnologiesOver view of Technologies
Over view of TechnologiesChris Mitchell
 
DSL (Domain Specific Language) for Maps Mashups
DSL (Domain Specific Language) for Maps MashupsDSL (Domain Specific Language) for Maps Mashups
DSL (Domain Specific Language) for Maps Mashupsaliraza786
 
Js foo famo.us- build native quality apps using html5 within a day
Js foo  famo.us- build native quality apps using html5 within a dayJs foo  famo.us- build native quality apps using html5 within a day
Js foo famo.us- build native quality apps using html5 within a dayDebnath Sinha
 
LogiLogicless UI prototyping with Node.js | SuperSpeaker@CodeCamp Iasi, 2014
LogiLogicless UI prototyping with Node.js | SuperSpeaker@CodeCamp Iasi, 2014LogiLogicless UI prototyping with Node.js | SuperSpeaker@CodeCamp Iasi, 2014
LogiLogicless UI prototyping with Node.js | SuperSpeaker@CodeCamp Iasi, 2014Endava
 
Angularjs architecture
Angularjs architectureAngularjs architecture
Angularjs architectureMichael He
 
Machine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabsMachine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabszekeLabs Technologies
 

Similar to Image Recognition for RAX Studio Citrix Automation (20)

Licence plate recognition using matlab programming
Licence plate recognition using matlab programming Licence plate recognition using matlab programming
Licence plate recognition using matlab programming
 
16 OpenCV Functions to Start your Computer Vision journey.docx
16 OpenCV Functions to Start your Computer Vision journey.docx16 OpenCV Functions to Start your Computer Vision journey.docx
16 OpenCV Functions to Start your Computer Vision journey.docx
 
MCA Society Project Seminar.pptx
MCA Society Project Seminar.pptxMCA Society Project Seminar.pptx
MCA Society Project Seminar.pptx
 
The Guide To Wireframing
The Guide To WireframingThe Guide To Wireframing
The Guide To Wireframing
 
The guide to wireframing
The guide to wireframingThe guide to wireframing
The guide to wireframing
 
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
 
Modern Web Applications
Modern Web ApplicationsModern Web Applications
Modern Web Applications
 
Image processing project list for java and dotnet
Image processing project list for java and dotnetImage processing project list for java and dotnet
Image processing project list for java and dotnet
 
Symphony Driver Essay
Symphony Driver EssaySymphony Driver Essay
Symphony Driver Essay
 
SharePoint 2013 Preview
SharePoint 2013 PreviewSharePoint 2013 Preview
SharePoint 2013 Preview
 
Mr bi
Mr biMr bi
Mr bi
 
What’s Up, EDoc?!
What’s Up,EDoc?!What’s Up,EDoc?!
What’s Up, EDoc?!
 
Deep learning on mobile
Deep learning on mobileDeep learning on mobile
Deep learning on mobile
 
Over view of Technologies
Over view of TechnologiesOver view of Technologies
Over view of Technologies
 
DSL (Domain Specific Language) for Maps Mashups
DSL (Domain Specific Language) for Maps MashupsDSL (Domain Specific Language) for Maps Mashups
DSL (Domain Specific Language) for Maps Mashups
 
Js foo famo.us- build native quality apps using html5 within a day
Js foo  famo.us- build native quality apps using html5 within a dayJs foo  famo.us- build native quality apps using html5 within a day
Js foo famo.us- build native quality apps using html5 within a day
 
LogiLogicless UI prototyping with Node.js | SuperSpeaker@CodeCamp Iasi, 2014
LogiLogicless UI prototyping with Node.js | SuperSpeaker@CodeCamp Iasi, 2014LogiLogicless UI prototyping with Node.js | SuperSpeaker@CodeCamp Iasi, 2014
LogiLogicless UI prototyping with Node.js | SuperSpeaker@CodeCamp Iasi, 2014
 
Angularjs architecture
Angularjs architectureAngularjs architecture
Angularjs architecture
 
Ankur Bajad
Ankur BajadAnkur Bajad
Ankur Bajad
 
Machine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabsMachine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabs
 

Recently uploaded

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 

Recently uploaded (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 

Image Recognition for RAX Studio Citrix Automation

Editor's Notes

  1. CNN - similar to a neural network but assumes that the input is an image so it can extract the specific properties that an image have.
  2. Basically we will build our own model and train it based on what the elements on the desktop can be seen. These pictures of dogs and cats are fed as training images for the system which is labeled. So the system should be fed with a lot of pictures for higher accuracy. After that the system can give a percentage on how confident it is in identifying if it is a cat or dog. Dot product, Loss function.
  3. look for the regions in images which have maximum variation when moved (by a small amount) in all regions around it. SIFT - good when scale of images changes SURF - faster than SIFT ORB - since SIFT and SURF are both patented. OPENCV DEVS CREATED THIS ONE WHICH IS Said to be faster.
  4. Basically template matching uses a subimage and finds it specific position inside a larger image.
  5. The goal is the find the highest matching area. The algorithm will start by sliding the subimage and compare it.
  6. Square difference of the pixels Correlation - connection of two variables Correlation coefficient - statistical relationship of two variables