Mika Kaukoranta presents what computer vision is, that is, how computers gain high-level understanding from digital images or videos, and how it can be used in software testing.
2. Agenda
• Computer vision overview
• How does computer vision relate to robotic SW testing?
• Under the hood: pixels, OCR, machine learning
Mika Kaukoranta @mikaukora
3. Computer vision
• Sub-domains: scene reconstruction, event detection, video tracking, object recognition, 3D pose estimation, learning, indexing, motion estimation, and image restoration
• Related fields: artificial intelligence, solid-state physics, neurobiology, signal processing, mathematics
• Distinctions: computer graphics, image processing, image analysis, machine vision, imaging, pattern recognition, photogrammetry
4. Overview
• Computer vision - an interdisciplinary field that deals with how computers can be made to gain high-level understanding from digital images or videos
• Image processing - neither requires assumptions nor produces interpretations about the image content
• Machine vision - focuses on applications, mainly in manufacturing, e.g., vision-based robots and systems for vision-based inspection
• Imaging - focuses on the process of producing images, but sometimes also deals with processing and analysis of images
5. Example applications
• Object recognition
• Optical character recognition (OCR)
• Medical imaging
• Machine vision
6. Computer vision in SW testing and automation
• Take screenshot
• Analyze image
• Control keyboard and mouse
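The screenshot, analysis, and control steps can be read as a loop. The following is only a minimal sketch of that idea, not the presenter's implementation: `run_step`, `find_bright`, and the tiny fake screen are hypothetical names, and the screenshot and click functions are passed in as plain callables so that the example runs without a real display.

```python
from typing import Callable, List, Optional, Tuple

Image = List[List[int]]   # assumption: an image is a list of rows of grayscale values
Point = Tuple[int, int]   # (x, y) screen coordinate


def run_step(take_screenshot: Callable[[], Image],
             analyze: Callable[[Image], Optional[Point]],
             click: Callable[[Point], None],
             max_attempts: int = 3) -> bool:
    """Repeat capture -> analyze -> act until a target is found or attempts run out."""
    for _ in range(max_attempts):
        image = take_screenshot()      # 1. take screenshot
        target = analyze(image)        # 2. analyze image
        if target is not None:
            click(target)              # 3. control keyboard and mouse
            return True
    return False


def find_bright(image: Image) -> Optional[Point]:
    """Hypothetical analysis: locate the first fully white pixel."""
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value == 255:
                return (x, y)
    return None


# Fake 3x3 screen in which one white pixel stands in for a button.
screen = [[0, 0, 0],
          [0, 0, 255],
          [0, 0, 0]]
clicks: List[Point] = []               # records where the fake mouse clicked
found = run_step(lambda: screen, find_bright, clicks.append)
```

In a real tool the three callables would wrap a screen-capture library, an image-analysis routine, and an input-control library; injecting them keeps the loop itself easy to test.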
7. Computer vision in SW testing and automation
• Generic instead of application specific approach
• Control over any UI (user interface)
‒ Legacy systems, remote desktop connections, systems that “can’t be automated”
• Visual inspection (vs. APIs or objects)
• Enabler for machine learning approaches
9. Testing and control approaches
• Record mouse coordinates
‒ Fixed position.
• Template matching
‒ Crop and find match. Fixed UI.
• Object recognition
‒ Detect object positions. Fixed elements.
• Optical character recognition (OCR)
‒ Recognize text elements. Fixed texts.
• Combinations of the above
• Combinations with other approaches such as API access
ClickCoord 200,300
ClickIcon button.png
ClickButton 1
ClickText OK
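The four keyword lines appear to give one example per approach, in the same order as the list. As a rough sketch under that assumption, a test script made of such lines could be parsed like this; `parse_step` and the `APPROACHES` mapping are hypothetical and do not reflect any particular tool's syntax.

```python
from typing import Tuple, Union

Argument = Union[Tuple[int, int], int, str]

# Assumed mapping from the slide's keywords to the approaches listed above.
APPROACHES = {
    "ClickCoord": "recorded mouse coordinates",
    "ClickIcon": "template matching",
    "ClickButton": "object recognition",
    "ClickText": "optical character recognition",
}


def parse_step(line: str) -> Tuple[str, Argument]:
    """Split a script line into (keyword, typed argument)."""
    keyword, _, argument = line.strip().partition(" ")
    if keyword not in APPROACHES:
        raise ValueError(f"unknown keyword: {keyword!r}")
    if keyword == "ClickCoord":
        # "200,300" -> (200, 300): a fixed screen position
        x, y = (int(part) for part in argument.split(","))
        return keyword, (x, y)
    if keyword == "ClickButton":
        # "1" -> 1: index of a detected button
        return keyword, int(argument)
    # ClickIcon takes a template image file name, ClickText the text to find
    return keyword, argument


script = ["ClickCoord 200,300", "ClickIcon button.png", "ClickButton 1", "ClickText OK"]
steps = [parse_step(line) for line in script]
```

Each parsed step would then be handed to the matching locator (coordinates, template matcher, object detector, or OCR) before the click is issued.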
10. Discussion
• Do you have systems that are hard to automate?
• Could computer vision help?
11. Pixels in memory
• Grayscale image
• Each pixel represented as a single 8-bit number (0-255)
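To make the one-byte-per-pixel layout concrete, here is a small sketch that stores a grayscale image in a flat `bytearray`. The row-major order, the 4x3 size, and the names `gray` and `gray_index` are assumptions chosen for illustration.

```python
GRAY_WIDTH, GRAY_HEIGHT = 4, 3

# One byte per pixel; a new bytearray is filled with 0 (black).
gray = bytearray(GRAY_WIDTH * GRAY_HEIGHT)


def gray_index(x: int, y: int) -> int:
    """Position of pixel (x, y) in the flat buffer, assuming row-major order."""
    return y * GRAY_WIDTH + x


gray[gray_index(2, 1)] = 255   # white pixel
gray[gray_index(0, 0)] = 128   # mid-gray pixel
value = gray[gray_index(2, 1)]
```

A `bytearray` only accepts values from 0 to 255, which matches the 8-bit range on the slide.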
12. Pixels in memory
• RGB image
• Each pixel represented as three 8-bit numbers [0-255, 0-255, 0-255]
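An RGB image can be sketched the same way with three bytes per pixel. The interleaved R, G, B order and the row-major layout are assumptions (real libraries differ, and some use B, G, R); `rgb`, `rgb_offset`, `set_pixel`, and `get_pixel` are hypothetical names.

```python
from typing import Tuple

RGB_WIDTH, RGB_HEIGHT, CHANNELS = 4, 3, 3

# Three bytes per pixel, stored one after another: R, G, B, R, G, B, ...
rgb = bytearray(RGB_WIDTH * RGB_HEIGHT * CHANNELS)


def rgb_offset(x: int, y: int) -> int:
    """Index of the first (red) byte of pixel (x, y), assuming row-major order."""
    return (y * RGB_WIDTH + x) * CHANNELS


def set_pixel(x: int, y: int, color: Tuple[int, int, int]) -> None:
    start = rgb_offset(x, y)
    rgb[start:start + CHANNELS] = bytes(color)   # each component must be 0-255


def get_pixel(x: int, y: int) -> Tuple[int, ...]:
    start = rgb_offset(x, y)
    return tuple(rgb[start:start + CHANNELS])


set_pixel(1, 2, (255, 128, 0))   # an orange pixel
```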
13. Processing steps in OCR
• Image capture
• Image preprocessing
• Text detection
• Character segmentation
• Character recognition
Found text: “value:”, “123”, “Unit:”, “euro”
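Two of these steps are simple enough to sketch without an OCR library. The example below shows one possible form of preprocessing (thresholding to black and white) and of character segmentation (splitting at columns that contain no ink). It is an illustration under those assumptions, not the pipeline of any specific OCR engine, and it leaves out text detection and character recognition; `binarize`, `segment_columns`, and the tiny image are hypothetical.

```python
from typing import List, Tuple


def binarize(image: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Preprocessing: mark dark pixels (ink) as 1 and light pixels as 0."""
    return [[1 if value < threshold else 0 for value in row] for row in image]


def segment_columns(binary: List[List[int]]) -> List[Tuple[int, int]]:
    """Character segmentation: return (start, end) column ranges that contain ink."""
    width = len(binary[0])
    has_ink = [any(row[x] for row in binary) for x in range(width)]
    segments, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a character begins
        elif not ink and start is not None:
            segments.append((start, x))    # a blank column ends it
            start = None
    if start is not None:
        segments.append((start, width))    # character touching the right edge
    return segments


# Tiny grayscale image: 255 is white paper, 0 is black ink.
text_image = [
    [255, 0, 255, 255, 0, 0, 255],
    [255, 0, 255, 255, 0, 255, 255],
    [255, 0, 255, 255, 0, 0, 255],
]
segments = segment_columns(binarize(text_image))
```

Each column range would then be cropped and passed to the character recognition step.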
14. Machine learning process
• Gather and prepare training data: labeled examples (“A” is “A”, “A” is “A”, “A” is “A”)
• Training: produces the trained model
• Inference (prediction): the trained model is asked “A” is ? and answers “A” with 87 % probability
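The train-then-predict flow can be sketched with a deliberately simple model. This example uses a nearest-centroid classifier with a softmax over negative squared distances to produce a probability; that choice, the two-number feature vectors, and the names `train` and `predict` are assumptions made for illustration. It is not the model behind the slide and it will not reproduce the 87 % figure.

```python
import math
from typing import Dict, List, Tuple

Example = Tuple[List[float], str]   # (feature vector, label)


def train(examples: List[Example]) -> Dict[str, List[float]]:
    """Training: average the feature vectors of each label into one centroid."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(model: Dict[str, List[float]], features: List[float]) -> Tuple[str, float]:
    """Inference: return the most likely label and its softmax probability."""
    scores = {label: -sum((a - b) ** 2 for a, b in zip(features, centroid))
              for label, centroid in model.items()}
    top = max(scores.values())                       # subtract for numerical stability
    weights = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(weights.values())
    best = max(weights, key=weights.get)
    return best, weights[best] / total


# Hypothetical labeled training data: each character image reduced to two features.
training_data = [([1.0, 0.0], "A"), ([0.9, 0.1], "A"),
                 ([0.0, 1.0], "B"), ([0.1, 0.9], "B")]
model = train(training_data)
label, probability = predict(model, [0.8, 0.3])      # an unseen sample
```

The returned probability plays the role of the confidence value on the slide: the closer the unseen sample lies to one centroid relative to the others, the closer it gets to 1.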
15. Future development
• More machine learning
• Automatic testing, e.g. Testar, AET
• Robotic process automation (RPA)
16. Template matching demo
• Recognize template images from a video stream
• Test case passes when the image is found
• Can be used for end-user video testing, for example
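The pass criterion of the demo can be sketched as follows. The example slides a template over each frame and compares pixels using the sum of absolute differences; that metric, the exhaustive search, and the names `match_template` and `template_seen` are assumptions for illustration, since the slides do not say which matching method the demo uses.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]   # assumption: a frame is a list of rows of grayscale values


def match_template(image: Frame, template: Frame,
                   max_diff: int = 0) -> Optional[Tuple[int, int]]:
    """Return (x, y) of the best match with summed absolute difference <= max_diff."""
    image_h, image_w = len(image), len(image[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    best = None
    for y in range(image_h - tmpl_h + 1):
        for x in range(image_w - tmpl_w + 1):
            diff = sum(abs(image[y + dy][x + dx] - template[dy][dx])
                       for dy in range(tmpl_h) for dx in range(tmpl_w))
            if diff <= max_diff and (best is None or diff < best[0]):
                best = (diff, (x, y))
    return None if best is None else best[1]


def template_seen(frames: List[Frame], template: Frame) -> bool:
    """The test case passes as soon as any frame contains the template."""
    return any(match_template(frame, template) is not None for frame in frames)


template = [[200, 50],
            [50, 200]]
blank_frame = [[0] * 4 for _ in range(4)]
frame_with_icon = [[0, 0, 0, 0],
                   [0, 0, 200, 50],
                   [0, 0, 50, 200],
                   [0, 0, 0, 0]]
position = match_template(frame_with_icon, template)
passed = template_seen([blank_frame, frame_with_icon], template)
```

Raising `max_diff` above 0 tolerates small pixel differences, which real video frames usually need because of compression and scaling.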