@shawnmjones @WebSciDL
It’s All About The Cards:
Sharing on Social Media
Encouraged HTML
Metadata Growth
Shawn M. Jones · Valentina Neblitt-Jones · Martin Klein
Los Alamos National Laboratory
Research Library
Michele C. Weigle · Michael L. Nelson
Old Dominion University
Web Science and Digital Libraries Research Group
Metadata is key to organizing content and
providing context
Creating metadata takes time and effort.
Web page authors can add metadata to their pages with HTML’s META element.
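As a rough illustration of how META elements can be read back out of a page, the sketch below collects field names and values with Python's standard-library `html.parser`. The class name `MetaFieldParser`, the helper `extract_meta_fields`, and the sample page are hypothetical and are not taken from the study's tooling.

```python
from html.parser import HTMLParser


class MetaFieldParser(HTMLParser):
    """Collect field name -> content pairs from META elements (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Classic HTML fields use name="...", while OGP fields use property="...".
        field = attributes.get("name") or attributes.get("property")
        if field:
            self.fields[field.lower()] = attributes.get("content") or ""


def extract_meta_fields(html):
    """Return a dict of the metadata fields found in an HTML string."""
    parser = MetaFieldParser()
    parser.feed(html)
    return parser.fields


# Hypothetical page used only to exercise the sketch.
SAMPLE_PAGE = """
<html><head>
<title>Example article</title>
<meta name="description" content="A short summary of the article.">
<meta name="keywords" content="metadata, news">
<meta property="og:title" content="Example article">
</head><body></body></html>
"""

fields = extract_meta_fields(SAMPLE_PAGE)
```

Here `fields` maps each field name found in the sample page to its `content` value, which is the raw material for the usage counts discussed in the rest of the talk.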
Web page authors have many
choices in metadata standards
Creating metadata is
expensive
How do authors spend their metadata
budget?
Past studies focused on Dublin Core, and
show that systems favor certain fields
title is the most popular field per 10 studies
description is the second most popular field per 6 studies
Our study evaluates the evolution of
metadata usage over time
Web archives capture web page HTML, JavaScript, CSS, and embedded content as mementos.
Mementos have a specific capture date and time, their memento-datetime.
Each memento represents an author’s behavior at that specific time.
Capture dates shown in the figure: 2/28/2021, 3/20/2021, 3/27/2021
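To make the idea of a memento-datetime concrete, here is a minimal sketch that recovers it in two ways: from the 14-digit timestamp in an Internet Archive memento URI, and from a `Memento-Datetime` HTTP header value. The function names, the example URI, and the example header value are assumptions made for illustration, not material from the talk.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Wayback Machine memento URIs embed the capture time as YYYYMMDDhhmmss.
URI_M_PATTERN = re.compile(r"/web/(\d{14})")


def datetime_from_uri_m(uri_m):
    """Parse the capture time out of a Wayback-style memento URI."""
    match = URI_M_PATTERN.search(uri_m)
    if match is None:
        raise ValueError(f"no 14-digit timestamp found in {uri_m!r}")
    parsed = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


def datetime_from_header(header_value):
    """Parse a Memento-Datetime header value (an HTTP date in GMT)."""
    return parsedate_to_datetime(header_value)


# Hypothetical memento of a hypothetical page.
from_uri = datetime_from_uri_m(
    "https://web.archive.org/web/20210228120000/https://example.com/"
)
from_header = datetime_from_header("Sun, 28 Feb 2021 12:00:00 GMT")
```

Both calls describe the same instant, so grouping mementos by `from_uri.year` is one simple way to place each observation on a timeline.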
We thank Max Grusky for access
to the NEWSROOM dataset
NEWSROOM contains 1.3 million mementos of news articles that contain metadata.
All articles contain at least an HTML description field.
NEWSROOM’s mementos were captured by the Internet Archive between 1998 and 2016.
We sampled 227,724 mementos of news articles
from the 39 outlets found in NEWSROOM
In 1998, the mean number of metadata fields used was 2; by 2016, it was 39.
The sharp increase in 2006 may be an artifact of the uneven sampling in the dataset.
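The per-year mean described above could be computed along the following lines. This is only a sketch on toy data: the function `mean_fields_per_year` and the three sample records are hypothetical and do not reproduce the study's numbers.

```python
from collections import defaultdict
from statistics import mean


def mean_fields_per_year(mementos):
    """Average the number of metadata fields per memento, grouped by capture year.

    `mementos` is an iterable of (year, set_of_field_names) pairs.
    """
    counts_by_year = defaultdict(list)
    for year, fields in mementos:
        counts_by_year[year].append(len(fields))
    return {year: mean(counts) for year, counts in sorted(counts_by_year.items())}


# Toy records, invented for illustration.
toy_mementos = [
    (1998, {"description", "keywords"}),
    (1998, {"description"}),
    (2016, {"description", "og:title", "og:image", "twitter:card"}),
]

yearly_means = mean_fields_per_year(toy_mementos)
```

Because each year's mean depends on how many mementos fall in that year, uneven sampling across years can produce jumps that reflect the sample rather than author behavior.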
If we look at each individual metadata field, how are they being used?
We grouped metadata fields into categories
Metadata usage exploded after 2008.
A category’s size = percentage of articles that contain at least one metadata field from that category.
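The category size defined above can be sketched as follows. The category names, their member fields, and the four toy articles are assumptions chosen for illustration; the talk's actual category assignments are not reproduced here.

```python
def category_percentages(articles, categories):
    """Percentage of articles containing at least one field from each category.

    `articles` is a list of sets of field names; `categories` maps a
    category name to the set of field names that belong to it.
    """
    total = len(articles)
    result = {}
    for category, category_fields in categories.items():
        # An article counts once, no matter how many of the category's fields it uses.
        hits = sum(1 for fields in articles if fields & category_fields)
        result[category] = 100.0 * hits / total if total else 0.0
    return result


# Hypothetical categories and articles.
toy_categories = {
    "social cards": {"og:title", "twitter:card"},
    "Dublin Core": {"dc.title"},
}
toy_articles = [
    {"description", "og:title"},
    {"description"},
    {"dc.title", "twitter:card"},
    {"description", "og:title", "twitter:card"},
]

sizes = category_percentages(toy_articles, toy_categories)
```

Counting each article at most once per category keeps the measure a percentage of articles rather than a count of fields, so a page with many fields from one category does not inflate that category's size.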
We evaluated the use of the fields specified in
HTML standards from HTML 2.0 to HTML 5
keywords are still in use even though most search engines do not process them.
author usage is on the rise.
The heavy use of description is an artifact of the dataset.
To contrast with previous studies, we
analyzed the adoption of Dublin Core
Dublin Core’s usage has not grown much compared to other categories.
Schema.org is designed to assist
search engines
SEO experts imply better placement among search results for pages using schema.org, but the adoption rate seems moderate.
Other search engine metadata usage has
not grown much either
We see very similar usage for metadata related to identifying pages for Google and Bing.
Metadata that supports sharing on social
media has experienced a renaissance
Social cards are summaries of web pages shared on social media.
They are built from authors’ web page metadata.
Example card fields: twitter:image, twitter:title, twitter:description
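As a small sketch of how a card could be assembled from a page's metadata, the code below maps the three Twitter fields named above onto a simple record. The `SocialCard` class, the `build_card` helper, and the sample values are hypothetical; real platforms apply their own rules when rendering cards.

```python
from dataclasses import dataclass


@dataclass
class SocialCard:
    """The pieces of a page summary shown when a link is shared (illustrative)."""
    title: str
    description: str
    image: str


def build_card(fields):
    """Build a card from a dict of metadata field names to values."""
    return SocialCard(
        title=fields.get("twitter:title", ""),
        description=fields.get("twitter:description", ""),
        image=fields.get("twitter:image", ""),
    )


# Hypothetical metadata for a hypothetical article.
card = build_card({
    "twitter:title": "Example article",
    "twitter:description": "A short summary of the article.",
    "twitter:image": "https://example.com/lead-photo.jpg",
})
```

If an author omits these fields, every part of the resulting card is empty, which is one way to see why card support gives authors a direct incentive to supply metadata.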
Usage of OGP (Facebook) fields for social cards
has skyrocketed since it was introduced
Card fields required per testing are outlined in red.
Additional card fields required per documentation are in dotted red.
There has been far less growth for fields not related to social cards.
The Twitter Card standard shows the same meteoric
rise in metadata usage specific to social cards
The card fields required after we tested creating cards with Twitter are outlined in red.
Additional card fields required per documentation are in dotted red.
The growing field usage mirrors their Facebook counterparts.
Twitter will use OGP fields, but only if twitter:card is specified.
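The fallback behavior noted above can be modeled with a short sketch. It encodes only the observation from this slide (OGP fields are used when `twitter:card` is present) and is not Twitter's actual implementation; the function name `resolve_twitter_card` and the sample values are hypothetical.

```python
def resolve_twitter_card(fields):
    """Model of the observed behavior: no twitter:card field means no card.

    When twitter:card is present, each card part prefers the twitter:* field
    and falls back to the matching og:* field.
    """
    if not fields.get("twitter:card"):
        return None
    card = {"card": fields["twitter:card"]}
    for part in ("title", "description", "image"):
        value = fields.get(f"twitter:{part}") or fields.get(f"og:{part}")
        if value:
            card[part] = value
    return card


# A page with only OGP fields produces no card in this model.
ogp_only = resolve_twitter_card({"og:title": "Example article"})

# Adding twitter:card lets the OGP title fill the card.
with_card_type = resolve_twitter_card({
    "twitter:card": "summary",
    "og:title": "Example article",
})
```

Under this model an author who already maintains OGP fields needs to add only one Twitter-specific field to get a card on both platforms.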
Facebook supports non-OGP fields as
part of its Marketing API
Facebook’s sharing debugger implies that authors need to supply fb:app_id for Facebook to generate a card, but it works fine without it.
Many of the articles we reviewed contained a blank string or “dummy value” for this field.
In conclusion: It’s all about the cards
• We analyzed 227,724 mementos of news articles to understand how authors used their metadata budget.
• In 2008, metadata usage exploded.
• When we break down usage by individual fields, we see that authors favor fields associated with social cards.
• This insight can help future metadata standard authors understand what spurs metadata adoption.
S. M. Jones, V. Neblitt-Jones, M. C. Weigle, M. Klein, and M. L. Nelson, “It's All About The Cards: Sharing on Social Media Probably Encouraged HTML Metadata Growth,” ACM/IEEE Joint Conference on Digital Libraries, 2021.
[preprint: https://arxiv.org/abs/2104.04116]

GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 

Recently uploaded (20)

Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 

It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata Growth

  • 1. @shawnmjones @WebSciDL It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata Growth Shawn M. Jones· Valentina Neblitt-Jones· Martin Klein Los Alamos National Laboratory Research Library Michele C. Weigle· Michael L. Nelson Old Dominion University Web Science and Digital Libraries Research Group
  • 2. Metadata is key to organizing content and providing context. Creating metadata takes time and effort. Web page authors can add metadata to their pages with HTML’s META element.
  • 3. Web page authors have many choices in metadata standards.
  • 4. Creating metadata is expensive. How do authors spend their metadata budget?
  • 5. Past studies focused on Dublin Core, and show that systems favor certain fields. title is the most popular field per 10 studies. description is the second most popular field per 6 studies.
  • 6. Our study evaluates the evolution of metadata usage over time. Web archives capture web page HTML, JavaScript, CSS, and embedded content as mementos. Mementos have a specific capture date and time, their memento-datetime. Each memento represents an author’s behavior at that specific time. Capture dates shown: 2/28/2021, 3/20/2021, 3/27/2021.
  • 7. We thank Max Grusky for access to the NEWSROOM dataset. NEWSROOM contains 1.3 million mementos of news articles that contain metadata. All articles contain at least an HTML description field. NEWSROOM’s mementos were captured by the Internet Archive between 1998 and 2016.
  • 8. We sampled 277,724 mementos of news articles from the 39 outlets found in NEWSROOM.
  • 9. In 1998, the mean number of metadata fields used was 2; by 2016, it was 39. The sharp increase in 2006 may be an artifact of the uneven sampling in the dataset. If we look at each individual metadata field, how are they being used?
  • 10. We grouped metadata fields into categories. Metadata usage exploded after 2008. A category’s size = percentage of articles that contain at least one metadata field from that category.
  • 11. We evaluated the use of the fields specified in HTML standards from HTML 2.0 to HTML 5. keywords are still in use even though most search engines do not process them. author usage is on the rise. The heavy use of description is an artifact of the dataset.
  • 12. To contrast with previous studies, we analyzed the adoption of Dublin Core. Dublin Core’s usage has not grown much compared to other categories.
  • 13. Schema.org is designed to assist search engines. SEO experts imply better placement among search results for pages using schema.org, but the adoption rate seems moderate.
  • 14. Other search engine metadata usage has not grown much either. We see very similar usage for metadata related to identifying pages for Google and Bing.
  • 15. Metadata that supports sharing on social media has experienced a renaissance. Social cards are summaries of web pages shared on social media. They are built from authors’ web page metadata. Fields shown: twitter:image, twitter:title, twitter:description.
  • 16. Usage of OGP (Facebook) fields for social cards has skyrocketed since it was introduced. Card fields required per testing are outlined in red. Additional card fields required per documentation are in dotted red. There has been far less growth for fields not related to social cards.
  • 17. The Twitter Card standard shows the same meteoric rise in metadata usage specific to social cards. The card fields required after we tested creating cards with Twitter are outlined in red. Additional card fields required per documentation are in dotted red. The growing field usage mirrors their Facebook counterparts. Twitter will use OGP fields, but only if twitter:card is specified.
  • 18. Facebook supports non-OGP fields as part of its Marketing API. Facebook’s sharing debugger implies that authors need to supply fb:app_id for Facebook to generate a card, but it works fine without it. Many of the articles we reviewed contained a blank string or “dummy value” for this field.
  • 19. In conclusion: It’s all about the cards. We analyzed 227,724 mementos of news articles to understand how authors used their metadata budget. In 2008, metadata usage exploded. When we break down usage by individual fields, we see that authors favor fields associated with social cards. This insight can help future metadata standard authors understand what spurs metadata adoption. Reference: S. M. Jones, V. Neblitt-Jones, M. C. Weigle, M. Klein, and M. L. Nelson, “It's All About The Cards: Sharing on Social Media Probably Encouraged HTML Metadata Growth,” ACM/IEEE Joint Conference on Digital Libraries, 2021. [preprint: https://arxiv.org/abs/2104.04116.]
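
The category measure described on slide 10 (a category’s size is the percentage of articles that contain at least one metadata field from that category) can be illustrated with a minimal Python sketch. This is not the authors’ code. The prefix-to-category mapping, the function names, and the sample pages below are all assumptions made for illustration, and the real study grouped fields into more categories than are shown here.

```python
from html.parser import HTMLParser

# Hypothetical prefix-to-category mapping, for illustration only;
# the study's actual category definitions are not reproduced here.
CATEGORY_PREFIXES = {
    "og:": "OGP",
    "twitter:": "Twitter Cards",
    "dc.": "Dublin Core",
    "dcterms.": "Dublin Core",
}


class MetaFieldParser(HTMLParser):
    """Collects the name/property value of every META element in a page."""

    def __init__(self):
        super().__init__()
        self.fields = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        # OGP fields use the property attribute; most others use name.
        field = attr_map.get("name") or attr_map.get("property")
        if field:
            self.fields.add(field.strip().lower())


def extract_fields(html):
    """Return the set of metadata field names found in one HTML page."""
    parser = MetaFieldParser()
    parser.feed(html)
    return parser.fields


def categorize(field):
    """Map a field name to a category by prefix; unmatched fields are 'Other'."""
    for prefix, category in CATEGORY_PREFIXES.items():
        if field.startswith(prefix):
            return category
    return "Other"


def category_coverage(pages):
    """Percentage of pages containing at least one field from each category."""
    counts = {}
    for html in pages:
        # A set ensures each page counts at most once per category.
        categories = {categorize(f) for f in extract_fields(html)}
        for category in categories:
            counts[category] = counts.get(category, 0) + 1
    return {c: 100.0 * n / len(pages) for c, n in counts.items()}


# Made-up sample pages standing in for mementos of news articles.
sample_pages = [
    '<head><meta name="description" content="x">'
    '<meta property="og:title" content="t"></head>',
    '<head><meta name="description" content="y">'
    '<meta name="twitter:card" content="summary"/>'
    '<meta name="DC.title" content="t"></head>',
]

if __name__ == "__main__":
    for category, percent in sorted(category_coverage(sample_pages).items()):
        print(f"{category}: {percent:.1f}%")
```

Counting each page once per category, rather than once per field, is what makes the result a share of articles and not a share of fields. Grouping the pages by the year of their memento-datetime before calling category_coverage would give one such breakdown per year.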