Interactive and Context-Aware Tag Spell Check and Correction

•

1 like•1,366 views

Fabrizio Silvestri

Technology

Interactive and Context-Aware
Tag Spell Check and Correction
F. Bonchi, Yahoo! Research Barcelona
O. Frieder, Georgetown University, Washington DC, US
F. M. Nardini, F. Silvestri, H. Vahabi, ISTI – CNR, Pisa, Italy

Data Model:
Weighteg Tag Co-occurrence Graph
Consider the
Neighborhoods of a tag’s
contexts. Tags in these
neighborhoods are ranked
according to a measure
function.

Experimental Settings
• We assess the effectiveness and the efficiency of our
spell check and correction technique.
• We run our experiments on a dataset of tags obtained
by a crawl of YouTube videos:
– 568,458 distinct videos
– 5,817,896 tags in total
• After preprocessing the dataset by removing noisy tags
(i.e., the ones having a low frequency) the dataset
contains 4,357,971 tags, 434,156 of which are unique.
• Human editors to assess a random subset of 250 set of
tags associated with the video.

Conclusions
• We introduced a model for context-aware, interactive, tag spell
check and correction:
– our described approach exploits this context together with the
available co-occurrences
• of tags in a set of resources to provide an interactive tag spell
correction.
• Experiments on a dataset of tags coming from YouTube videos show
that our method is effective and outperforms two important
baselines.
• We also prove that the method is efficient as it is able to check and
correct a tag in two milliseconds with a good trade-off in terms of
average precision and average coverage.

In our future investigation, we intend to increase the degree of
interactivity of the system by enabling “as-you-type” tag spell check
and correction.

Similar to Interactive and Context-Aware Tag Spell Check and Correction

Deep neural networks for Youtube recommendationsAryan Khandal

BU - Wellesely iGEM 2011 World FinalsConsuelo Valdes

150219 agbt giab_poster_marcGenomeInABottle

ENTERTAINMENT CONTENT RECOMMENDATION SYSTEM USING MACHINE LEARNINGIRJET Journal

Agile Mumbai 2022 - Rohit Handa | Combining Human and Artificial Intelligence...AgileNetwork

IRJET - Support Vector Machine versus Naive Bayes Classifier:A Juxtaposition ...IRJET Journal

Metaphors as design points for collaboration 2012KM Chicago

A Research Paper on BFO and PSO Based Movie Recommendation System | J4RV4I1016Journal For Research

IRJET- Sentimental Analysis for Students’ Feedback using Machine Learning App...IRJET Journal

Content - Based Recommendations Enhanced with Collaborative InformationAlessandro Liparoti

Web Apollo Tutorial for the i5K copepod research community.Monica Munoz-Torres

Tags Prediction from Movie Plot Synopsis Using Machine LearningIRJET Journal

Video + Language: Where Does Domain Knowledge Fit in?Goergen Institute for Data Science

Estimating the overall sentiment score by inferring modus ponens lawInternational Journal of Advance Research and Innovative Ideas in Education

Content-based filtering with applications on tv viewing dataElaine Cecília Gatto

IRJET- Fusion Method for Image Reranking and Similarity Finding based on Topi...IRJET Journal

Visual Object Category RecognitionAshish Gupta

Proceedings Template - WORDbutest

IRJET- Semantic Analysis of Online Customer QueriesIRJET Journal

Similar to Interactive and Context-Aware Tag Spell Check and Correction (20)

Deep neural networks for Youtube recommendations

BU - Wellesely iGEM 2011 World Finals

150219 agbt giab_poster_marc

ENTERTAINMENT CONTENT RECOMMENDATION SYSTEM USING MACHINE LEARNING

Agile Mumbai 2022 - Rohit Handa | Combining Human and Artificial Intelligence...

IRJET - Support Vector Machine versus Naive Bayes Classifier:A Juxtaposition ...

Metaphors as design points for collaboration 2012

A Research Paper on BFO and PSO Based Movie Recommendation System | J4RV4I1016

IRJET- Sentimental Analysis for Students’ Feedback using Machine Learning App...

Content - Based Recommendations Enhanced with Collaborative Information

Web Apollo Tutorial for the i5K copepod research community.

Tags Prediction from Movie Plot Synopsis Using Machine Learning

Video + Language: Where Does Domain Knowledge Fit in?

Estimating the overall sentiment score by inferring modus ponens law

Content-based filtering with applications on tv viewing data

IRJET- Fusion Method for Image Reranking and Similarity Finding based on Topi...

Visual Object Category Recognition

Proceedings Template - WORD

IRJET- Semantic Analysis of Online Customer Queries

Recently uploaded

The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los

Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko

Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung

The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad

Salesforce Community Group Quito, Salesforce 101Paola De la Torre

CNv6 Instructor Chapter 6 Quality of Servicegiselly40

2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong

IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge

Finology Group – Insurtech Innovation Award 2024The Digital Insurer

Presentation on how to chat with PDF using ChatGPT code interpreternaman860154

Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik

[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745

A Domino Admins Adventures (Engage 2024)Gabriella Davis

GenCyber Cyber Security Day PresentationMichael W. Hawkins

From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software

TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc

08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia

Scaling API-first – The story of a global engineering organizationRadu Cotescu

Recently uploaded (20)

The 7 Things I Know About Cyber Security After 25 Years | April 2024

Handwritten Text Recognition for manuscripts and early printed texts

Driving Behavioral Change for Information Management through Data-Driven Gree...

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...

The Codex of Business Writing Software for Real-World Solutions 2.pptx

Salesforce Community Group Quito, Salesforce 101

CNv6 Instructor Chapter 6 Quality of Service

2024: Domino Containers - The Next Step. News from the Domino Container commu...

IAC 2024 - IA Fast Track to Search Focused AI Solutions

Finology Group – Insurtech Innovation Award 2024

Presentation on how to chat with PDF using ChatGPT code interpreter

Injustice - Developers Among Us (SciFiDevCon 2024)

[2024]Digital Global Overview Report 2024 Meltwater.pdf

A Domino Admins Adventures (Engage 2024)

GenCyber Cyber Security Day Presentation

From Event to Action: Accelerate Your Decision Making with Real-Time Automation

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments

08448380779 Call Girls In Civil Lines Women Seeking Men

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...

Scaling API-first – The story of a global engineering organization

Interactive and Context-Aware Tag Spell Check and Correction

1. Interactive and Context-Aware Tag Spell Check and Correction F. Bonchi, Yahoo! Research Barcelona O. Frieder, Georgetown University, Washington DC, US F. M. Nardini, F. Silvestri, H. Vahabi, ISTI – CNR, Pisa, Italy

2. Motivations

3. Data Model: Weighteg Tag Co-occurrence Graph Consider the Neighborhoods of a tag’s contexts. Tags in these neighborhoods are ranked according to a measure function.

4. Measures

5. Experimental Settings • We assess the effectiveness and the efficiency of our spell check and correction technique. • We run our experiments on a dataset of tags obtained by a crawl of YouTube videos: – 568,458 distinct videos – 5,817,896 tags in total • After preprocessing the dataset by removing noisy tags (i.e., the ones having a low frequency) the dataset contains 4,357,971 tags, 434,156 of which are unique. • Human editors to assess a random subset of 250 set of tags associated with the video.

6. Effectiveness

7. Efficiency

8. Conclusions • We introduced a model for context-aware, interactive, tag spell check and correction: – our described approach exploits this context together with the available co-occurrences • of tags in a set of resources to provide an interactive tag spell correction. • Experiments on a dataset of tags coming from YouTube videos show that our method is effective and outperforms two important baselines. • We also prove that the method is efficient as it is able to check and correct a tag in two milliseconds with a good trade-off in terms of average precision and average coverage. In our future investigation, we intend to increase the degree of interactivity of the system by enabling “as-you-type” tag spell check and correction.

Interactive and Context-Aware Tag Spell Check and Correction

Recommended

Recommended

More Related Content

Similar to Interactive and Context-Aware Tag Spell Check and Correction

Similar to Interactive and Context-Aware Tag Spell Check and Correction (20)

Recently uploaded

Recently uploaded (20)

Interactive and Context-Aware Tag Spell Check and Correction