2022/10/24
@shawnmjones
Managed by Triad National Security, LLC, for the U.S. Department of Energy’s NNSA.
Abstract Images Have Different Levels of
Retrievability Per Reverse Image Search Engine
Shawn M. Jones & Diane Oyen
Information Sciences (CCS-3)
2022/10/24
LA-UR-XXXXXX
There are few computer vision research papers focused
on querying and retrieving abstract, technical drawings
• Technical documents typically contain
abstract images
• Many reasons exist to search for
abstract images online:
• protect intellectual property
• build datasets
• find evidence for legal cases
• establish scholarly evidence
• justify funding through image
reuse
https://commons.wikimedia.org/wiki/File:Complete_neuron_cell_diagram_en.svg
https://commons.wikimedia.org/wiki/File:Carriage-house-2.jpg
https://commons.wikimedia.org/wiki/File:Interspiro_DCSC_loop_schematic.png
Baidu Bing Google Yandex
Major search engines now support reverse image search
Screenshot source:
https://image.baidu.com
Screenshot source:
https://images.google.com
Screenshot source:
https://www.bing.com/
Screenshot source:
https://yandex.com/images
With each service, a user can upload an image and receive different types of results
• pages-with results
• similar-to results
• the uploaded query image
Uploaded image source: https://commons.wikimedia.org/wiki/File:Adams_The_Tetons_and_the_Snake_River.jpg
Screenshot from: https://www.bing.com
Research Question
When using the reverse image search
capability of general web search engines,
are natural images more easily discovered
than abstract images?
To collect query images, we submitted terms to
Wikimedia Commons’ API
“diagram”, “schematic”: abstract images
“photo”, “photograph”: natural images
100 images, 100 images, 100 images, 99 images
Previous studies have shown that Wikipedia content has high retrievability.
Image sources:
• https://commons.wikimedia.org/wiki/File:Galileo_Diagram.jpg
• https://commons.wikimedia.org/wiki/File:Complete_neuron_cell_diagram_en.svg
• https://commons.wikimedia.org/wiki/File:Bicycle_diagram-es.svg
• https://commons.wikimedia.org/wiki/File:Systems_Engineering_V_diagram.jpg
Image sources:
• https://commons.wikimedia.org/wiki/File:Hvdc_bipolar_schematic.svg
• https://commons.wikimedia.org/wiki/File:Beve_gear_schematic.png
• https://commons.wikimedia.org/wiki/File:Interspiro_DCSC_loop_schematic.png
• https://commons.wikimedia.org/wiki/File:Carriage-house-2.jpg
Image sources:
• https://commons.wikimedia.org/wiki/File:Manatee_photo.jpg
• https://commons.wikimedia.org/wiki/File:Frank_W._Micklethwaite_photo_of_downtown_Toronto,_1890_-2.jpg
• https://commons.wikimedia.org/wiki/File:James_Abram_Garfield,_photo_portrait_seated.jpg
• https://commons.wikimedia.org/wiki/File:Wtc-photo.jpg
Image sources:
• https://commons.wikimedia.org/wiki/File:Adams_The_Tetons_and_the_Snake_River.jpg
• https://commons.wikimedia.org/wiki/File:Photographing_sunrise_1745.jpg
• https://commons.wikimedia.org/wiki/File:FEMA_-_5399_-_Photograph_by_Andrea_Booher_taken_on_09-28-2001_in_New_York.jpg
• https://commons.wikimedia.org/wiki/File:Photographing_a_model.jpg
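The term-based collection step can be sketched as a request to the MediaWiki search API that Wikimedia Commons exposes. The slides do not say which API module or parameters were used, so the choice of `list=search`, the restriction to the File namespace, and the helper name `build_search_url` are all assumptions made for illustration. The sketch only builds the request URL and does not contact the network.

```python
from urllib.parse import urlencode

# Public MediaWiki API endpoint for Wikimedia Commons.
COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def build_search_url(term, limit=100):
    """Build a search URL for one query term (hypothetical helper).

    Assumption: the full-text search module (list=search) restricted to the
    File namespace (6) is a reasonable stand-in for the collection step.
    """
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,
        "srnamespace": 6,   # namespace 6 holds File: pages on Commons
        "srlimit": limit,   # number of results requested per term
        "format": "json",
    }
    return COMMONS_API + "?" + urlencode(params)


# One request per search term named on the slide.
urls = {term: build_search_url(term)
        for term in ("diagram", "schematic", "photo", "photograph")}
```

Each URL could then be fetched with any HTTP client and the returned file titles resolved to image URLs.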
We then submitted the same image to each reverse image search engine,
then again with the next query image, and so on...
Image source: https://commons.wikimedia.org/wiki/File:Manatee_photo.jpg
Image source: https://commons.wikimedia.org/wiki/File:Interspiro_DCSC_loop_schematic.png
Screenshot source:
https://images.google.com
Screenshot source:
https://www.bing.com/
Screenshot source:
https://image.baidu.com
Screenshot source:
https://yandex.com/images
Using ImageHash’s pHash and GoFigure’s VisHash, we
evaluated how often the query image appeared in the
results
pHash was designed to compare photographs via Discrete Cosine Transforms (DCT).
VisHash was designed to compare diagrams and technical drawings by finding shapes in the image.
Uploaded images:
https://commons.wikimedia.org/wiki/File:Manatee_photo.jpg
https://commons.wikimedia.org/wiki/File:Interspiro_DCSC_loop_schematic.png
Screenshots source:
https://yandex.com/images
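The DCT-based comparison described for pHash can be illustrated with a small pure-Python sketch. This is not ImageHash's implementation: it is a simplified perceptual hash written under the assumption that the image has already been reduced to a square grid of grayscale values, and the names `dct_hash`, `hamming`, and `random_image` are hypothetical. It keeps the lowest 8x8 block of DCT frequencies, thresholds each coefficient against the block's median, and compares two hashes by Hamming distance.

```python
import math
import random
import statistics


def dct_hash(pixels, hash_size=8):
    """Return hash_size*hash_size bits for a square grayscale image.

    `pixels` is a list of rows of numeric grayscale values (e.g. 32x32).
    """
    n = len(pixels)
    # Cosine basis for the lowest `hash_size` frequencies of a DCT-II.
    basis = [
        [math.cos(math.pi * (2 * i + 1) * f / (2 * n)) for i in range(n)]
        for f in range(hash_size)
    ]
    coeffs = []
    for v in range(hash_size):        # vertical frequency
        for u in range(hash_size):    # horizontal frequency
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += pixels[y][x] * basis[u][x] * basis[v][y]
            coeffs.append(total)
    # One bit per coefficient: is it above the median of the block?
    median = statistics.median(coeffs)
    return [c > median for c in coeffs]


def hamming(bits_a, bits_b):
    """Count the positions where two hashes disagree."""
    return sum(a != b for a, b in zip(bits_a, bits_b))


def random_image(seed, n=32):
    """Synthetic test image (hypothetical stand-in for a real picture)."""
    rng = random.Random(seed)
    return [[rng.randrange(256) for _ in range(n)] for _ in range(n)]


original = random_image(seed=1)
brighter = [[p + 10 for p in row] for row in original]  # uniform brightness shift
unrelated = random_image(seed=2)

near = hamming(dct_hash(original), dct_hash(brighter))
far = hamming(dct_hash(original), dct_hash(unrelated))
```

A small distance (`near`) suggests the two inputs are the same image, while unrelated images (`far`) disagree on many bits. The distance threshold that counts as "the same image" is a design choice that the slides do not state.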
Precision differs based on pages-with or similar-to
results, with Yandex performing best
blue = abstract images
green = natural images
Precision@k:
What percentage of images in the results are the same as the query image if we stop at k results?
S. M. Jones and D. Oyen. 2022. “Abstract Images Have Different Levels of Retrievability Per Reverse Image Search Engine,” Proceedings
of the 2nd Drawings and abstract Imagery: Representation and Analysis (DIRA) Workshop. (Tel Aviv, Israel).
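The Precision@k definition above can be written as a short function. This is a minimal sketch, assuming each ranked result has already been labeled as matching or not matching the query image (for example by a hash comparison). Dividing by k even when fewer than k results come back is an assumption, not something the slide specifies.

```python
def precision_at_k(is_match, k):
    """Fraction of the first k results that are the same as the query image.

    `is_match` is a ranked list of booleans, one per result.
    Assumption: the denominator is always k, so missing results count as misses.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return sum(is_match[:k]) / k


# Hypothetical labels for one query: results 1 and 3 match the query image.
labels = [True, False, True, False]
p_at_2 = precision_at_k(labels, 2)
p_at_4 = precision_at_k(labels, 4)
```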
Within the first 10 pages-with results, Google shows a retrievability difference
of up to 54% between images from the photograph and diagram categories
blue = abstract images
green = natural images
Retrievability:
Given a query image, was it retrieved within the cutoff c?
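The retrievability question above reduces to a yes/no check per query, which can then be averaged over a category. The sketch below assumes each query is summarized by the 1-based rank of the first result that matches the query image, or `None` when no result matches. The function names and the sample ranks are hypothetical and are not data from the study.

```python
def retrieved_within(first_match_rank, c):
    """True if the query image appeared at or before cutoff c."""
    return first_match_rank is not None and first_match_rank <= c


def retrievability_rate(first_match_ranks, c):
    """Share of queries whose image was retrieved within cutoff c."""
    if not first_match_ranks:
        raise ValueError("need at least one query")
    hits = sum(retrieved_within(rank, c) for rank in first_match_ranks)
    return hits / len(first_match_ranks)


# Hypothetical first-match ranks for four queries (None = never found).
ranks = [1, 3, None, 12]
rate_at_10 = retrievability_rate(ranks, c=10)
```

Computing this rate separately for two image categories and subtracting gives a retrievability difference of the kind reported on this slide.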
For similar-to results, Yandex consistently provides a
high MRR (0.8) for natural images
MRR (mean reciprocal rank):
How many results, on average across all queries, must a visitor
review before finding the same image again?
Google does well with pages-with results
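MRR is conventionally computed as the mean of 1/rank of the first matching result, so an MRR of 0.8 means the query image usually appears at or near the top of the results. The sketch below uses that standard definition. Scoring a query with no match as 0 is a common convention but an assumption here, and the sample ranks are hypothetical.

```python
def mean_reciprocal_rank(first_match_ranks):
    """Mean of 1/rank over all queries.

    `first_match_ranks` holds the 1-based rank of the first matching result
    per query, or None when the query image was never found (scored as 0).
    """
    if not first_match_ranks:
        raise ValueError("need at least one query")
    total = sum(0.0 if rank is None else 1.0 / rank
                for rank in first_match_ranks)
    return total / len(first_match_ranks)


# Hypothetical ranks: found first, found second, never found, found fourth.
mrr = mean_reciprocal_rank([1, 2, None, 4])
```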
Key Takeaways
• We submitted abstract and natural images
from Wikimedia Commons to four major
reverse image search engines.
• When they do return results, Bing and Baidu
do not perform well.
• Google does not perform well for similar-to
results, likely indicating that its definition
of similar-to differs from that of other search
engines.
• Yandex performs best in all cases.
• Yandex and Google consistently perform
better for natural images in pages-with
results.