@shawnmjones
Managed by Triad National Security, LLC, for the U.S. Department of Energy’s NNSA.
Abstract Images Have Different Levels of Retrievability Per Reverse Image Search Engine
Shawn M. Jones & Diane Oyen
Information Sciences (CCS-3)
2022/10/24
LA-UR-22-30888
There are few computer vision research papers focused on querying and retrieving abstract, technical drawings
• Technical documents typically contain abstract images
• Many reasons exist to search for abstract images online:
  • protect intellectual property
  • build datasets
  • find evidence for legal cases
  • establish scholarly evidence
  • justify funding through image reuse
https://commons.wikimedia.org/wiki/File:Complete_neuron_cell_diagram_en.svg
https://commons.wikimedia.org/wiki/File:Carriage-house-2.jpg
https://commons.wikimedia.org/wiki/File:Interspiro_DCSC_loop_schematic.png
Now major search engines support reverse image search
Baidu, Bing, Google, Yandex
Screenshot sources:
• https://image.baidu.com
• https://images.google.com
• https://www.bing.com/
• https://yandex.com/images
With each service, a user can upload an image and receive different types of results
Screenshot labels: the uploaded query image, pages-with results, similar-to results
Uploaded image source: https://commons.wikimedia.org/wiki/File:Adams_The_Tetons_and_the_Snake_River.jpg
Screenshot from: https://www.bing.com
Research Question
When using the reverse image search capability of general web search engines, are natural images more easily discovered than abstract images?
To collect query images, we submitted terms to Wikimedia Commons’ API
• abstract images: “diagram”, “schematic”
• natural images: “photo”, “photograph”
• images collected per term, in the order shown on the slide: 100, 100, 100, 99
Previous studies have shown that Wikipedia content has high retrievability.
Image sources:
• https://commons.wikimedia.org/wiki/File:Galileo_Diagram.jpg
• https://commons.wikimedia.org/wiki/File:Complete_neuron_cell_diagram_en.svg
• https://commons.wikimedia.org/wiki/File:Bicycle_diagram-es.svg
• https://commons.wikimedia.org/wiki/File:Systems_Engineering_V_diagram.jpg
Image sources:
• https://commons.wikimedia.org/wiki/File:Hvdc_bipolar_schematic.svg
• https://commons.wikimedia.org/wiki/File:Beve_gear_schematic.png
• https://commons.wikimedia.org/wiki/File:Interspiro_DCSC_loop_schematic.png
• https://commons.wikimedia.org/wiki/File:Carriage-house-2.jpg
Image sources:
• https://commons.wikimedia.org/wiki/File:Manatee_photo.jpg
• https://commons.wikimedia.org/wiki/File:Frank_W._Micklethwaite_photo_of_downtown_Toronto,_1890_-2.jpg
• https://commons.wikimedia.org/wiki/File:James_Abram_Garfield,_photo_portrait_seated.jpg
• https://commons.wikimedia.org/wiki/File:Wtc-photo.jpg
Image sources:
• https://commons.wikimedia.org/wiki/File:Adams_The_Tetons_and_the_Snake_River.jpg
• https://commons.wikimedia.org/wiki/File:Photographing_sunrise_1745.jpg
• https://commons.wikimedia.org/wiki/File:FEMA_-_5399_-_Photograph_by_Andrea_Booher_taken_on_09-28-2001_in_New_York.jpg
• https://commons.wikimedia.org/wiki/File:Photographing_a_model.jpg
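The slides do not say which API module or parameters were used to collect the query images. As a minimal sketch, assuming the standard MediaWiki Action API search module restricted to the File namespace, a request URL for one search term could be built like this. The function name and the choice of parameters are illustrative assumptions, not the authors' code.

```python
from urllib.parse import urlencode

# Wikimedia Commons endpoint of the MediaWiki Action API.
COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def build_commons_search_url(term, limit=100):
    """Build a search URL for files on Wikimedia Commons (hypothetical helper).

    Assumption: list=search over namespace 6 ("File:") approximates the
    collection step; the slides do not confirm the exact query.
    """
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,      # e.g. "diagram", "schematic", "photo", "photograph"
        "srnamespace": 6,      # namespace 6 holds file pages
        "srlimit": limit,      # number of results requested per call
        "format": "json",
    }
    return f"{COMMONS_API}?{urlencode(params)}"


url = build_commons_search_url("diagram")
```

Fetching the URL and downloading each returned file is left out so that the sketch stays free of network access.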
We then submitted the same image to each reverse image search engine, then repeated the process with the next image, and so on.
Image source: https://commons.wikimedia.org/wiki/File:Manatee_photo.jpg
Image source: https://commons.wikimedia.org/wiki/File:Interspiro_DCSC_loop_schematic.png
Screenshot sources:
• https://images.google.com
• https://www.bing.com/
• https://image.baidu.com
• https://yandex.com/images
Using ImageHash’s pHash and GoFigure’s VisHash, we evaluated how often the same image existed in the results
• pHash was designed to compare photographs via Discrete Cosine Transforms (DCT).
• VisHash was designed to compare diagrams and technical drawings by finding shapes in the image.
Uploaded images:
https://commons.wikimedia.org/wiki/File:Manatee_photo.jpg
https://commons.wikimedia.org/wiki/File:Interspiro_DCSC_loop_schematic.png
Screenshot source:
https://yandex.com/images
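To make the DCT-based idea behind pHash concrete, here is a minimal standard-library sketch. It is not ImageHash's implementation: it assumes the image has already been converted to a small square grid of grayscale values, keeps the low-frequency DCT coefficients, and sets one bit per coefficient depending on whether it exceeds the median. All function names are hypothetical. VisHash is not sketched because the slides describe it only at a high level.

```python
import math
from statistics import median


def low_freq_dct(gray, hash_size):
    """Top-left hash_size x hash_size block of the 2-D DCT-II of a square grid."""
    n = len(gray)
    # Cosine basis computed once: basis[k][i] = cos((2i + 1) * k * pi / (2n)).
    basis = [[math.cos((2 * i + 1) * k * math.pi / (2 * n)) for i in range(n)]
             for k in range(hash_size)]
    coeffs = []
    for u in range(hash_size):
        for v in range(hash_size):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += gray[x][y] * basis[u][x] * basis[v][y]
            coeffs.append(total)
    return coeffs


def phash_sketch(gray, hash_size=8):
    """Perceptual-hash-style bit list: 1 where a low-frequency coefficient > median."""
    if len(gray) < hash_size or any(len(row) != len(gray) for row in gray):
        raise ValueError("expected a square grid at least hash_size wide")
    coeffs = low_freq_dct(gray, hash_size)
    med = median(coeffs)
    return [1 if c > med else 0 for c in coeffs]


def hamming_distance(hash_a, hash_b):
    """Number of differing bits; small distances suggest the same image."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Synthetic 16x16 "image" and a copy with every pixel value doubled.
photo = [[(x * 7 + y * 13) % 128 for y in range(16)] for x in range(16)]
brighter = [[2 * p for p in row] for row in photo]
distance = hamming_distance(phash_sketch(photo, 4), phash_sketch(brighter, 4))
```

Doubling every pixel value scales every coefficient and the median by the same factor, so the two hashes in the usage lines are identical. Deciding that a search result is "the same image" then comes down to choosing a distance threshold, which the slides do not specify.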
Precision differs based on pages-with or similar-to results, with Yandex performing best
blue = abstract images
green = natural images
Precision@k:
What percentage of images in the results are the same as the query image if we stop at k results?
S. M. Jones and D. Oyen. 2022. “Abstract Images Have Different Levels of Retrievability Per Reverse Image Search Engine,” Proceedings
of the 2nd Drawings and abstract Imagery: Representation and Analysis (DIRA) Workshop. (Tel Aviv, Israel).
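As a small illustration of Precision@k as defined on this slide, the sketch below takes a ranked list of booleans (True where a result was judged to be the same image as the query) and divides the number of matches in the first k positions by k. Dividing by k even when fewer than k results were returned is an assumption made here; the function name is hypothetical.

```python
def precision_at_k(matches, k):
    """Fraction of the top-k results that are the same image as the query.

    matches: booleans in rank order, True for a result judged identical.
    Assumption: missing results (fewer than k returned) count as non-matches.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for is_match in matches[:k] if is_match) / k


# Two of the first four results match the query image.
example = precision_at_k([True, False, True, False], 4)
```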
After reviewing 10 pages-with results, Google shows a maximum retrievability difference of 54% between images in the photograph and diagram categories
blue = abstract images
green = natural images
Retrievability:
Given a query image, was it retrieved within the cutoff c?
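The retrievability question above can be sketched as a yes/no check per query, then averaged over a category of query images. This is a minimal sketch under the assumption that each query's results are available as a ranked list of booleans; the function names are hypothetical and the aggregation as a simple fraction is an assumption.

```python
def retrieved_within_cutoff(matches, c):
    """True if the query image appears among the first c ranked results."""
    return any(matches[:c])


def retrievability_rate(per_query_matches, c):
    """Share of query images that were retrieved within cutoff c."""
    if not per_query_matches:
        return 0.0
    hits = sum(1 for matches in per_query_matches
               if retrieved_within_cutoff(matches, c))
    return hits / len(per_query_matches)


# Three queries: found at rank 1, found at rank 2, never found.
queries = [[True], [False, True], [False, False]]
rate_at_1 = retrievability_rate(queries, 1)
rate_at_2 = retrievability_rate(queries, 2)
```

Comparing this rate between the abstract and natural image sets at the same cutoff gives the kind of difference the slide title reports.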
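For similar-to results, Yandex consistently provides a high MRR (0.8) for natural images
MRR (mean reciprocal rank):
How many results, on average, across all queries, must a visitor review before finding the same image again? The score is the mean of 1/rank of the first matching result, so values closer to 1 mean fewer results to review.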
Google does well with pages-with results
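A short sketch of mean reciprocal rank may help when reading the MRR values on this slide. It assumes each query's results are a ranked list of booleans and that a query with no match contributes 0; both the function names and that convention are assumptions rather than details taken from the slides.

```python
def reciprocal_rank(matches):
    """1/rank of the first matching result, or 0.0 if nothing matches."""
    for rank, is_match in enumerate(matches, start=1):
        if is_match:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(per_query_matches):
    """Average reciprocal rank over all queries."""
    if not per_query_matches:
        return 0.0
    total = sum(reciprocal_rank(matches) for matches in per_query_matches)
    return total / len(per_query_matches)


# First match at rank 1, at rank 2, and never: (1 + 0.5 + 0) / 3.
mrr = mean_reciprocal_rank([[True], [False, True], [False, False]])
```

Under this definition, an MRR of 0.8 means the matching image is usually the first or second result.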
Key Takeaways
• We submitted abstract and natural images from Wikimedia Commons to four major reverse image search engines.
• When they do return results, Bing and Baidu do not perform well.
• Google does not perform well for similar-to results, likely indicating that their definition of similar-to differs from other search engines.
• Yandex performs best in all cases.
• Yandex and Google consistently perform better for natural images in pages-with results.
