Image Recognition for RAX Studio Citrix Automation

This presentation gives a brief introduction to a variety of algorithms that can be used for image recognition. It assesses the different tools and algorithms in terms of their reliability in identifying desktop/web elements. This criterion was chosen because the results of the research will be used for desktop automation. To know more about RAX Automation Suite, visit www.raxsuite.com

Image Recognition:
For RAX Studio Citrix Automation

1. Fast.ai
Deep Learning Algorithm
Convolutional Neural Network (CNN)

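How it works:
[Figure: labeled training images (Dog, Cat) are fed to the system; shown a new image ("What is this?"), the system identifies which class it belongs to.]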
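
To make the "system identifies" step a little more concrete, here is a minimal sketch of how raw class scores can be turned into a confidence percentage. It is not fast.ai's API: the feature vector, the `WEIGHTS` table, and the function names are all hypothetical, and the scoring is a plain dot product followed by a softmax rather than a real convolutional network.

```python
import math

def dot(weights, features):
    """Dot product of two equal-length vectors."""
    return sum(w * f for w, f in zip(weights, features))

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(features, class_weights):
    """Return (label, confidence) for the highest-probability class."""
    labels = list(class_weights)
    scores = [dot(class_weights[label], features) for label in labels]
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]

# Hypothetical, hand-picked weights; a real model learns these from labeled images.
WEIGHTS = {"dog": [2.0, -1.0, 0.5], "cat": [-1.0, 2.0, 0.5]}

label, confidence = classify([1.0, 0.0, 1.0], WEIGHTS)
print(f"{label}: {confidence:.0%}")  # dog: 95%
```

In a trained network the weights come from minimizing a loss function over many labeled images, which is why more training pictures generally mean better accuracy.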

2. Feature Detection and Feature Matching
Features are specific patterns which are unique and can be easily compared and tracked.
SIFT - Scale-Invariant Feature Transform
SURF - Speeded-Up Robust Features
ORB - Oriented FAST and Rotated BRIEF
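
As a rough illustration of the matching half, the sketch below brute-force matches binary descriptors by Hamming distance, which is the usual way to compare ORB-style binary descriptors. The descriptors here are toy 8-bit integers invented for the example, and `max_distance` is an assumed threshold; detecting the keypoints and computing real descriptors is left to a library such as OpenCV.

```python
def hamming(a, b):
    """Number of differing bits between two integer-encoded descriptors."""
    return bin(a ^ b).count("1")

def match_descriptors(query, train, max_distance=2):
    """Brute-force match: for each query descriptor, find the nearest train
    descriptor and keep the pair only if the distance is small enough."""
    matches = []
    for qi, q in enumerate(query):
        ti, dist = min(
            ((ti, hamming(q, t)) for ti, t in enumerate(train)),
            key=lambda pair: pair[1],
        )
        if dist <= max_distance:
            matches.append((qi, ti, dist))
    return matches

# Toy 8-bit descriptors (real ORB descriptors are much longer bit strings).
query = [0b10110010, 0b00001111, 0b11111111]
train = [0b00001110, 0b10110011, 0b01010101]

# Each tuple is (query index, train index, distance).
print(match_descriptors(query, train))  # [(0, 1, 1), (1, 0, 1)]
```

The third query descriptor has no close counterpart, so it is dropped. On a text-heavy screenshot many descriptors look alike, so several candidates can fall under the threshold and matching becomes ambiguous.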

3. Template Matching
Searching for an object in an image

Calculation Methods:
SQDIFF - calculates the squared difference
CCORR - calculates the cross-correlation
CCOEFF - calculates the correlation coefficient
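
The three measures can be sketched in plain Python as follows. This is an unnormalized toy version written for readability, assuming small grayscale images stored as lists of rows; `SCREEN`, `ICON`, and the function names are made up for the example. OpenCV provides the production equivalents through `cv2.matchTemplate` with `TM_SQDIFF`, `TM_CCORR`, and `TM_CCOEFF`.

```python
def window(image, x, y, w, h):
    """Flattened pixels of the w-by-h patch whose top-left corner is (x, y)."""
    return [image[y + j][x + i] for j in range(h) for i in range(w)]

def sqdiff(t, p):
    """Sum of squared pixel differences; 0 means a perfect match."""
    return sum((a - b) ** 2 for a, b in zip(t, p))

def ccorr(t, p):
    """Raw cross-correlation; larger is better, but bright patches score high too."""
    return sum(a * b for a, b in zip(t, p))

def ccoeff(t, p):
    """Correlation after subtracting each patch's mean; larger is better."""
    mt, mp = sum(t) / len(t), sum(p) / len(p)
    return sum((a - mt) * (b - mp) for a, b in zip(t, p))

def match_template(image, template, score, best=max):
    """Slide the template over the image; return ((x, y), score) of the best spot."""
    h, w = len(template), len(template[0])
    flat_t = [v for row in template for v in row]
    results = []
    for y in range(len(image) - h + 1):
        for x in range(len(image[0]) - w + 1):
            results.append(((x, y), score(flat_t, window(image, x, y, w, h))))
    return best(results, key=lambda r: r[1])

SCREEN = [
    [0, 0, 0, 0, 0],
    [0, 9, 1, 0, 0],
    [0, 1, 9, 0, 0],
    [0, 0, 0, 5, 5],
]
ICON = [[9, 1], [1, 9]]

print(match_template(SCREEN, ICON, sqdiff, best=min))  # ((1, 1), 0)
print(match_template(SCREEN, ICON, ccoeff, best=max))  # ((1, 1), 64.0)
```

Note that SQDIFF looks for the minimum score while CCORR and CCOEFF look for the maximum, which is why `best` is a parameter.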

4. Optical Character Recognition
Electronic conversion of images to machine-encoded text

How it should work:
- Use OCR on the whole screen
- Find the nth occurrence of the word
- Get the position of that word in the image
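
The last two steps above can be sketched as follows, assuming the OCR engine returns one bounding box per recognized word. The `WordBox` shape, the `OCR_RESULT` data, and `find_nth` are hypothetical stand-ins and not the output format of any particular OCR library.

```python
from typing import NamedTuple

class WordBox(NamedTuple):
    """One recognized word and its bounding box in screen pixels (assumed shape)."""
    text: str
    left: int
    top: int
    width: int
    height: int

def find_nth(words, target, n=1):
    """Return the n-th (1-based) box whose text equals target, or None.

    Boxes are ordered top-to-bottom, then left-to-right. Sorting on the exact
    top coordinate is a simplification; real OCR output may need line grouping.
    """
    ordered = sorted(words, key=lambda w: (w.top, w.left))
    hits = [w for w in ordered if w.text.lower() == target.lower()]
    return hits[n - 1] if 1 <= n <= len(hits) else None

# Hand-written stand-in for the result of running OCR on the whole screen.
OCR_RESULT = [
    WordBox("Save", 400, 300, 40, 16),
    WordBox("File", 10, 5, 30, 16),
    WordBox("Save", 60, 5, 40, 16),
    WordBox("Cancel", 460, 300, 55, 16),
]

second_save = find_nth(OCR_RESULT, "save", 2)
print(second_save.left, second_save.top)  # 400 300
```

Counting occurrences in reading order keeps "the 2nd Save" stable as long as the layout of the screen does not reorder the words.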

Assessments:
- We cannot train fast.ai to identify every icon or piece of text that users will want to automate.
- Feature matching is good for finding objects in scene images, but an image of a website, for example, could be full of text, which confuses the features to be matched.

Assessments:
- Template matching is the most reliable option for Citrix automation, especially for matching images of a static graphical user interface.
- OCR is good for web content whose graphical design changes dynamically, which may be hard for template matching to track.

Proposal:
Make use of two different methods, Template Matching and OCR, to broaden the options available to users.

Proposal: Template Matching
- Template matching would be used for content that is unique and whose GUI changes only minimally.
- The user should be able to choose the size of the cropped template. A larger template includes more unique elements.
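
The effect of template size on uniqueness can be illustrated with a toy example. The sketch below counts exact matches only, and the `SCREEN` grid and helper names are invented for the illustration; a real matcher would use a similarity score with a threshold instead of exact equality.

```python
def count_matches(screen, template):
    """Count positions where the template matches the screen exactly."""
    h, w = len(template), len(template[0])
    count = 0
    for y in range(len(screen) - h + 1):
        for x in range(len(screen[0]) - w + 1):
            if all(screen[y + j][x:x + w] == template[j] for j in range(h)):
                count += 1
    return count

def crop(screen, x, y, w, h):
    """Cut a w-by-h template out of the screen at (x, y)."""
    return [row[x:x + w] for row in screen[y:y + h]]

# Toy screen: two identical "buttons" (the 7s); only the left one has a label (3).
SCREEN = [
    [0, 0, 0, 0, 0, 0],
    [3, 7, 7, 0, 7, 7],
    [0, 0, 0, 0, 0, 0],
]

small = crop(SCREEN, 1, 1, 2, 1)  # just the button
large = crop(SCREEN, 0, 1, 3, 1)  # the button plus its label

print(count_matches(SCREEN, small), count_matches(SCREEN, large))  # 2 1
```

The small crop is ambiguous because it matches both buttons, while the larger crop that includes the neighboring label matches only once.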

Proposal: OCR
- If template matching still fails, the user should opt in to OCR.
- OCR should be able to find the nth occurrence of the word and get its position.
- It could then click anywhere on the screen relative to the text's position.
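
Clicking relative to a word's position comes down to simple coordinate arithmetic. The sketch below assumes the box is given as (left, top, width, height) in screen pixels and that the screen is 1920x1080; both are assumptions made for the example, and `click_point` is a hypothetical helper that only computes the coordinates without sending any click.

```python
def click_point(box, dx=0, dy=0, screen_size=(1920, 1080)):
    """Return the (x, y) to click: the center of the text box shifted by
    (dx, dy) pixels, clamped so the point stays on the screen."""
    left, top, width, height = box
    x = left + width // 2 + dx
    y = top + height // 2 + dy
    max_x, max_y = screen_size[0] - 1, screen_size[1] - 1
    return min(max(x, 0), max_x), min(max(y, 0), max_y)

# Hypothetical box for the label "Username:" as (left, top, width, height).
label_box = (100, 200, 80, 20)

# Click 150 px to the right of the label's center, where its textbox is assumed to sit.
print(click_point(label_box, dx=150))   # (290, 210)
# An offset that would leave the screen is clamped to the last pixel column.
print(click_point(label_box, dx=5000))  # (1919, 210)
```

Working from the center of the box instead of a corner makes the offset less sensitive to small differences in the size of the detected text.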

Suggestion:
- Keyboard keys and shortcut keys would be a great tool for navigating and moving the cursor to the different textboxes/links.
Ex. Tab, Page Up, Page Down, Win + Down = Minimize, etc.

References & Resources:
◇ https://docs.fast.ai/
◇ http://cs231n.github.io/convolutional-networks/
◇ https://docs.opencv.org/
◇ https://www.kaggle.com/wesamelshamy/tutorial-image-feature-extraction-and-matching
◇ http://www.aishack.in/tutorials/template-matching/
◇ http://scikit-image.org/docs
◇ https://www.slidescarnival.com/
◇ Google Images

Editor's Notes

CNN - similar to an ordinary neural network, but it assumes that the input is an image, so it can exploit the specific properties that images have.
Basically, we will build our own model and train it on the elements that can be seen on the desktop. In the example, labeled pictures of dogs and cats are fed to the system as training images. The system should be fed many pictures for higher accuracy. After training, the system can give a percentage indicating how confident it is that an image shows a cat or a dog. Dot product, loss function.
Feature detection looks for regions in an image that show maximum variation when shifted by a small amount in any direction. SIFT - good when the scale of the images changes. SURF - faster than SIFT. ORB - created by the OpenCV developers as an alternative, since SIFT and SURF are both patented; it is said to be faster.
Basically, template matching takes a subimage and finds its specific position inside a larger image.
The goal is to find the highest-matching area. The algorithm slides the subimage across the larger image and compares it at each position.
Square difference - squared difference of the pixels. Correlation - connection between two variables. Correlation coefficient - statistical relationship between two variables.
