Enhancing Navigation on Wikipedia with Social Tags
                   Wikimania 2009


                   Arkaitz Zubiaga

                 NLP & IR Group @ UNED


                  August 28th, 2009
Introduction


Index


1   Introduction

2   Navigating on Wikipedia

3   Benefits of Tagging

4   Dataset Generation

5   Results

6   Conclusions



    Arkaitz Zubiaga (UNED)     Wikipedia & Social Tags   August 28th, 2009   2 / 33


A Tagging System






Simple Tagging






Collaborative Tagging






Tag Cloud




Navigating on Wikipedia




    Categories
    Links in articles
    Search engine






Navigating on Wikipedia: Categories




    Pro: Great for organizing content.
    Con: Limited to the taxonomy, and membership is binary:
           an article either is or is not in a category.






Navigating on Wikipedia: Links in articles




     Pro: Access to related articles
     Con: Subject to link availability






Navigating on Wikipedia: Search engine




    Pro: Article search by keyword
    Con: Subject to term availability in content/categories



Motivation




Can article retrieval be improved by means of
tagging?




Benefits of Tagging




Tags...
     ...are simple
     ...rely on an open vocabulary
     ...can be aggregated
     ...allow users to differ on the definition






How Can We Navigate Through Tags?




Three ways to navigate through tags [1]:
       Pivot-browsing
       Popularity
       Filtering

  [1] Gene Smith, Tagging: People-Powered Metadata for the Social Web


Tag Navigation: Pivot-browsing

    Benefit: Switching to related tags/topics.
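
Pivot-browsing can be sketched as a co-occurrence lookup: starting from one tag, list the tags that most often appear on the same articles. This is only an illustration; the data is invented and `related_tags` is a hypothetical helper, not code from the prototype.

```python
from collections import Counter

# Toy data (invented for illustration): article title -> set of tags.
ARTICLE_TAGS = {
    "Framework": {"programming", "software", "design"},
    "Zipf's law": {"mathematics", "statistics", "language"},
    "Python (programming language)": {"programming", "software", "language"},
}

def related_tags(tag, article_tags):
    """Return (tag, count) pairs that co-occur with `tag`, most frequent first."""
    counts = Counter()
    for tags in article_tags.values():
        if tag in tags:
            counts.update(tags - {tag})
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

print(related_tags("programming", ARTICLE_TAGS))
```

A reader looking at the tag programming could then pivot to software, the tag that shares the most articles with it in this toy data.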






Tag Navigation: Popularity


    Benefit: What have people considered interesting/relevant for a tag?
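
Popularity-based navigation could be approximated by ranking the articles that carry a tag by how many users applied it. The sketch below assumes per-article user counts are available; the numbers and the `most_popular` helper are invented for illustration.

```python
# Toy data (invented): tag -> {article title: number of users who applied the tag}.
TAG_USERS = {
    "mathematics": {"Zipf's law": 42, "Fractal": 130, "Prime number": 75},
}

def most_popular(tag, tag_users, limit=10):
    """Return article titles for `tag`, ordered by how many users applied it."""
    per_article = tag_users.get(tag, {})
    # Descending user count, ties broken alphabetically.
    ranked = sorted(per_article, key=lambda title: (-per_article[title], title))
    return ranked[:limit]

print(most_popular("mathematics", TAG_USERS, limit=2))
```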






Tag Navigation: Filtering
    Benefit: Retrieving documents that carry one or more tags, but not another.
    Hence, users can retrieve articles related to a topic while excluding a
    subtopic.
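
Filtering maps naturally onto set operations. The following is a minimal sketch with invented data, where `filter_articles` is a hypothetical helper: it keeps articles that carry every tag in `include` and none of the tags in `exclude`.

```python
# Toy data (invented for illustration): article title -> set of tags.
ARTICLE_TAGS = {
    "Framework": {"programming", "software"},
    "Web framework": {"programming", "software", "web"},
    "Zipf's law": {"mathematics", "statistics"},
}

def filter_articles(article_tags, include, exclude=()):
    """Return titles tagged with all of `include` and none of `exclude`."""
    include, exclude = set(include), set(exclude)
    return sorted(
        title
        for title, tags in article_tags.items()
        if include <= tags and not (exclude & tags)
    )

# Articles about programming, leaving out the web subtopic.
print(filter_articles(ARTICLE_TAGS, include={"programming"}, exclude={"web"}))
```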






How Can Tagging Enhance Taxonomies? (1)




    Definition
           Categories must be created before they can be used, whereas tags are
           created the first time a user applies them.
           Each user defines a set of tags, and together these form a global
           weighted set of tags. One tag can therefore be more relevant than
           another, whereas categories require agreement among users.
           Tags can handle synonymy, hypernyms, hyponyms, etc., unlike categories.
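
The global weighted set of tags can be illustrated by counting how many users chose each tag for an article. This is a sketch under the assumption that each user contributes at most one vote per tag; the annotations are invented.

```python
from collections import Counter

# Toy data (invented): each inner list holds the tags one user gave an article.
USER_ANNOTATIONS = [
    ["programming", "software"],
    ["programming", "design"],
    ["programming", "software", "patterns"],
]

def aggregate_tags(annotations):
    """Count how many users chose each tag for an article."""
    weights = Counter()
    for user_tags in annotations:
        weights.update(set(user_tags))  # each user counts once per tag
    return weights

weights = aggregate_tags(USER_ANNOTATIONS)
print(weights.most_common(2))
```

No agreement step is needed: the weights emerge from individual choices, so programming outranks software simply because more users picked it.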






How Can Tagging Enhance Taxonomies? (2)




    Navigation and search
           Providing the aforementioned new ways to navigate.
           Providing new terms to describe the articles.
           Providing a bottom-up classification, instead of a top-down one.






An Example Combining Tags and Categories: Amazon






Avoiding Tag Mess


       Letting users relate tags to one another, as in LibraryThing [2].


  [2] http://www.librarything.com
Dataset Generation


      Starting point: a set of 2M+ English Wikipedia [3] articles.
      Data retrieval:
           Tag data for each article on Delicious [4].
           Article content.
      Removed tags that are not informative for Wikipedia articles:
      wikipedia, reference, wiki.
      Filtered to articles annotated by at least 10 users.
      Result: 20,764 tagged Wikipedia articles [5].

  [3] http://en.wikipedia.org
  [4] http://delicious.com
  [5] The dataset is available for download at http://nlp.uned.es/social-tagging/
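
The two filtering steps described for the dataset could look roughly like the sketch below. The input shape (a `users` count and a `tags` mapping per article) is an assumption made for illustration, as is applying the user threshold before dropping the uninformative tags; the slides do not specify either.

```python
# Tags the slides report removing as not relevant for Wikipedia articles.
STOP_TAGS = {"wikipedia", "reference", "wiki"}
MIN_USERS = 10  # threshold stated on the slide

def build_dataset(bookmarks):
    """Filter raw bookmark records into a tagged-article dataset.

    `bookmarks` maps an article title to a dict with a `users` count and a
    `tags` mapping of tag -> count (a hypothetical input shape).
    """
    dataset = {}
    for title, record in bookmarks.items():
        if record["users"] < MIN_USERS:
            continue  # too few annotators
        dataset[title] = {
            tag: count
            for tag, count in record["tags"].items()
            if tag.lower() not in STOP_TAGS
        }
    return dataset

# Invented records for illustration only.
RAW = {
    "Zipf's law": {"users": 25, "tags": {"mathematics": 12, "wikipedia": 20}},
    "Obscure stub": {"users": 3, "tags": {"wiki": 2, "misc": 1}},
}
print(build_dataset(RAW))
```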
Results: Tag cloud






Results: Presence of tags




                               Found         Not Found    % Found
                 Document     251,139         206,569     54.86%
                 Content      202,151         255,557     44.16%
                 Categories   35,237          422,471      7.70%
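
As a quick arithmetic check on the table, the snippet below recomputes the percentage found for each row as found / (found + not found). Every row sums to the same total of 457,708, and the recomputed values agree with the reported ones to within 0.01 percentage points; the small differences are presumably due to rounding.

```python
# (found, not found, reported % found) per row, copied from the table above.
ROWS = {
    "Document": (251_139, 206_569, 54.86),
    "Content": (202_151, 255_557, 44.16),
    "Categories": (35_237, 422_471, 7.70),
}

def percent_found(found, not_found):
    """Share of found items, as a percentage of found + not found."""
    return 100.0 * found / (found + not_found)

for name, (found, not_found, reported) in ROWS.items():
    total = found + not_found
    recomputed = percent_found(found, not_found)
    print(f"{name:<11}{total:>9,}{recomputed:>8.2f}%  (reported {reported:.2f}%)")
```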






Results: Some examples




    Users assigned the tag programming to the article Framework, but
    that word is not present in its content.
    The same occurs for the tag mathematics in the article Zipf’s law,
    as well as for the tag audio in the article List of Internet stations.
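
One way to test whether a tag is present in an article is a whole-word, case-insensitive match against its text. The slides do not say how matching was actually done, so the rule below is an assumption, and the content string is an invented stand-in for a real article.

```python
import re

def tag_in_text(tag, text):
    """True if `tag` occurs in `text` as a whole word, ignoring case."""
    pattern = r"\b" + re.escape(tag) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

# Invented snippet standing in for an article's content.
content = "Zipf's law is an empirical law about rank-frequency distributions."
print(tag_in_text("law", content))          # whole-word match
print(tag_in_text("mathematics", content))  # tag absent from the text
```

A tag that fails this check for both content and categories is a term that only the taggers contributed, which is the case the examples above highlight.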






Prototype




A preliminary prototype that applies tags to
Wikipedia is available at:
http://taggedwiki.zubiaga.org




Conclusions


    A social tagging system would provide new ways to navigate
    Wikipedia:
           Pivot-browsing
           Popularity
           Filtering
    Bottom-up classification complements the existing top-down classification.
    An experiment with tags over Wikipedia shows encouraging results:
           Providing new terms that do not appear in the articles.
           Providing new ways to navigate through tags.
           Helping to improve the search engine.
           Discovering popular articles.






Thank You for Your Attention



    Achiu Arigato Danke Dhannvaad Dua Netjer en ek Efcharisto
    Gracias Gràcies Gratia Grazie Guishepeli Hvala Kiitos
    Köszönöm Mercé Merci Mila esker Obrigado Shukran
    Shukriya Tack Tak Takk Tänan Tapadh leat Tesekkür ederim Thank
    you Toda
                           http://blog.zubiaga.org



