



Biographical social networks on Wikipedia
     A cross-cultural study of links that made history


            Pablo Aragon, Andreas Kaltenbrunner,
              David Laniado and Yana Volkovich

                           Social Media Research Group,
                                 Barcelona Media,
                                 Barcelona, Spain


                           August 27th, 2012
                       WikiSym ’12, Linz, Austria



Aragon, Kaltenbrunner, Laniado & Volkovich   Biographical social networks on Wikipedia

Outline


  1   Introduction
         Motivation
         Data extraction


  2   Results
        Global network statistics
        Most central persons
        Similarity between languages


  3   Conclusions




Motivation

  Is history made by great men and women, or vice versa?
      Unclear, but social connections undoubtedly shape history.

  Wikipedia as a global place of collective memory ...
      allows us to extract from biographies how social links are
      recorded across cultures ...
      and to generate networks of links between biographical articles.

  Research questions
      Who are the most central characters in these networks?
      Do culture-related peculiarities exist?
      Which cultures are more similar?
      What is the shared knowledge about connections between
      persons across cultures?

Data extraction
Building biographical networks for 15 language editions of Wikipedia


         Selected the 15 largest language editions of Wikipedia
         Starting point: 296 511 biographies from the English
         Wikipedia (from DBpedia)
         Identified the corresponding articles (where they exist) in the
         remaining 14 languages
         Generated a directed network for each language version:
                nodes → persons
                edges → links between the articles of the corresponding
                persons
         Handled alternative titles of articles by tracking redirects
         Data collected through Wikipedia APIs between
         September 8th and 13th, 2011
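
The construction steps above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `build_network`, the three input structures and the toy data are hypothetical, and dropping self-links is an assumption the slides do not confirm.

```python
def build_network(biographies, redirects, article_links):
    """Return the set of directed edges between biography articles.

    biographies   -- set of canonical biography titles (the nodes)
    redirects     -- dict mapping an alternative title to its canonical title
    article_links -- dict mapping a biography title to the titles it links to
    """
    edges = set()
    for source, targets in article_links.items():
        if source not in biographies:
            continue
        for target in targets:
            # Follow a redirect, if any, so alternative titles reach the same node.
            target = redirects.get(target, target)
            # Keep only links between two distinct biographies
            # (dropping self-links is an assumption of this sketch).
            if target in biographies and target != source:
                edges.add((source, target))
    return edges


# Toy data, invented for illustration only.
biographies = {"Barack Obama", "George W. Bush", "Bill Clinton"}
redirects = {"Obama": "Barack Obama", "Dubya": "George W. Bush"}
article_links = {
    "Barack Obama": ["Dubya", "Bill Clinton", "Chicago"],
    "George W. Bush": ["Obama", "Texas"],
    "Bill Clinton": ["Bill Clinton"],
}
edges = build_network(biographies, redirects, article_links)
```

Links to non-biographical articles ("Chicago", "Texas") are discarded, and links through a redirect title land on the canonical node.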




Redirect statistics
Distribution of the number of redirects per biographical article in the English Wikipedia




  [Figure: "# redirects per article", log-log plot;
   x-axis: # redirects (10^0 to 10^2), y-axis: # articles (10^0 to 10^5)]

  Persons with most redirects:
      Muammar al-Gaddafi         251
      Osama bin Laden            117
      Barack Obama               114
      Jesus                      109
      Elizabeth II               101
      Eminem                      96
      Joseph Stalin               88
      Omar al-Bashir              87
      Genghis Khan                84
      Pyotr Ilyich Tchaikovsky    84
      Athelred the Unready        83
      George W. Bush              80
      Mary (mother of Jesus)      80





Properties of the different language networks

        Language       code          N           K       C     % GC         d       r      dmax
        English         en      198 190     928 339    0.03     95%       6.53    0.17      43
        German          de       62 402     260 889    0.05     94%       6.83    0.14      33
        French           fr      51 811     283 453    0.06     96%       6.11    0.15      36
        Italian          it      35 756     190 867    0.06     95%       6.28    0.14      42
        Spanish         es       34 828     169 302    0.06     97%       6.29    0.16      36
        Japanese        ja       26 155     109 081    0.08     96%       6.47    0.20      26
        Dutch           nl       24 496      76 651    0.08     94%       7.91    0.18      37
        Portuguese      pt       23 705      85 295    0.07     94%       6.98    0.18      45
        Swedish         sv       23 085      60 745    0.07     91%       8.27    0.20      46
        Polish          pl       22 438      50 050    0.08     85%       8.94    0.16      43
        Finnish          fi       18 594      44 941    0.07     87%       7.80    0.17      30
        Norwegian       no       18 423      49 303    0.09     83%       8.31    0.22      48
        Russian         ru       16 403      34 436    0.06     87%       9.10    0.10      35
        Chinese         zh       11 715      44 739    0.17     91%       7.20    0.20      32
        Catalan         ca       11 027      42 321    0.09     93%       7.14    0.17      32


     N, K → number of non-isolated nodes and number of edges
     C → average clustering coefficient
     % GC → percentage of nodes in the giant component
     d → average path length between nodes
     r → reciprocity
     dmax → maximal distance between two nodes in the network
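
Two of the measures in the table can be illustrated with a short sketch. The definitions used here are common ones and are assumptions, since the slides do not spell them out: reciprocity is taken as the fraction of directed edges whose reverse edge also exists, and the giant component as the largest weakly connected component. The function names and the toy edge set are hypothetical.

```python
def reciprocity(edges):
    """Fraction of directed edges whose reverse edge also exists."""
    edges = set(edges)
    if not edges:
        return 0.0
    mutual = sum(1 for (u, v) in edges if (v, u) in edges)
    return mutual / len(edges)


def giant_component_share(edges):
    """Share of non-isolated nodes in the largest weakly connected component."""
    # Build the undirected neighbourhood of every node that has an edge.
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    seen, largest = set(), 0
    for start in neighbours:
        if start in seen:
            continue
        # Iterative depth-first search over the undirected graph.
        stack, size = [start], 0
        seen.add(start)
        while stack:
            node = stack.pop()
            size += 1
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        largest = max(largest, size)
    return largest / len(neighbours) if neighbours else 0.0


# Toy edge set, invented for illustration only.
toy_edges = {("a", "b"), ("b", "a"), ("b", "c"), ("d", "e")}
r = reciprocity(toy_edges)               # 2 of the 4 edges are reciprocated
gc = giant_component_share(toy_edges)    # {a, b, c} out of 5 nodes
```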


Most central persons in the English Wikipedia
Sorted by in-degree. Ranks for out-degree, betweenness and PageRank in parentheses


             person                    in-degree        out-degree     betw.        PageRank
             George W. Bush                 2123       89      (107)     (1)     0.00209    (1)
             Barack Obama                   1677       51      (710)     (8)     0.00162    (2)
             Bill Clinton                   1660       74      (205)     (4)     0.00156    (4)
             Ronald Reagan                  1652       90      (103)     (2)     0.00156    (3)
             Adolf Hitler                   1407      119       (26)     (3)     0.00149    (5)
             Richard Nixon                  1299       86      (127)     (7)     0.00136    (6)
             William Shakespeare            1229       25     (4203)    (63)     0.00113    (9)
             John F. Kennedy                1208      104       (53)     (5)     0.00123    (8)
             Franklin D. Roosevelt          1052       71      (237)    (15)     0.00131    (7)
             Lyndon B. Johnson              1000      106       (50)    (12)     0.00108   (11)
             Jimmy Carter                    953       80      (158)     (9)     0.00113   (10)
             Elvis Presley                   948       82      (142)    (27)     0.00063   (24)
             Pope John Paul II               941       59      (444)    (11)     0.00083   (18)
             Dwight D. Eisenhower            891       55      (564)    (22)     0.00095   (14)
             Frank Sinatra                   882      108       (47)    (18)     0.00056   (28)
             George H. W. Bush               878       87      (118)    (19)     0.00096   (13)
             Abraham Lincoln                 846       54      (593)    (40)     0.00089   (16)
             Bob Dylan                       835      151       (11)    (14)     0.00055   (30)
             Winston Churchill               748       84      (136)    (10)     0.00092   (15)
             Harry S. Truman                 743       81      (145)    (24)     0.00099   (12)
             Joseph Stalin                   723       69      (265)    (43)     0.00089   (17)
             Michael Jackson                 663       71      (237)    (34)     0.00042   (51)
             Elizabeth II                    653       52      (665)     (6)     0.00074   (19)
             Jesus                           572       38     (1595)    (51)     0.00068   (20)
             Hillary Rodham Clinton          554       87      (118)    (32)     0.00063   (25)
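
As a rough illustration of two of the measures in this table, the sketch below counts in-degree and computes PageRank by power iteration on a toy edge set. The damping factor of 0.85 and the fixed number of iterations are assumptions (the slides do not state the parameters used), and the function names and data are hypothetical.

```python
def in_degree(edges):
    """Number of incoming links per node."""
    counts = {}
    for u, v in edges:
        counts.setdefault(u, 0)
        counts[v] = counts.get(v, 0) + 1
    return counts


def pagerank(edges, damping=0.85, iterations=100):
    """PageRank by power iteration; damping=0.85 is an assumed default."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {node: [] for node in nodes}
    for u, v in edges:
        out_links[u].append(v)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread evenly over all nodes.
        dangling = sum(rank[u] for u in nodes if not out_links[u])
        new_rank = {
            node: (1 - damping) / n + damping * dangling / n for node in nodes
        }
        for u in nodes:
            for v in out_links[u]:
                new_rank[v] += damping * rank[u] / len(out_links[u])
        rank = new_rank
    return rank


# Toy edge set, invented for illustration only.
toy_edges = {("a", "c"), ("b", "c"), ("c", "a")}
degrees = in_degree(toy_edges)
ranks = pagerank(toy_edges)
top_by_in_degree = sorted(degrees, key=degrees.get, reverse=True)
```

In this toy graph node "c" receives the most links and also ends up with the highest PageRank, but the two rankings need not agree in general, which is why the table reports both.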




Most central persons in different language Wikipedias
Top 5 most central persons for each language by betweenness

     lang            #1                 #2                   #3                    #4                  #5
     en    George W. Bush       Ronald Reagan           Adolf Hitler          Bill Clinton    John F. Kennedy
     de         Adolf Hitler   George W. Bush     Martin Luther King, Jr    Barack Obama        Frank Sinatra
     fr         Adolf Hitler   George W. Bush     William Shakespeare       Barack Obama       Jacques Chirac
     it       Frank Sinatra    George W. Bush      Pope John Paul II       Michael Jackson        Elton John
     es    Michael Jackson         Fidel Castro   William Shakespeare        Che Guevara          Adolf Hitler
     ja         Adolf Hitler   Michael Jackson      Ronald Reagan           Yukio Mishima      Barack Obama
     nl        Elvis Presley        Adolf Hitler        Bill Clinton        Joseph Stalin   William Shakespeare
     pt    Michael Jackson      Richard Wagner          Adolf Hitler       Ronald Reagan         David Bowie
     sv    George W. Bush      Winston Churchill        Elizabeth II       Michael Jackson        Adolf Hitler
     pl         Elizabeth II   Pope John Paul II   Margaret Thatcher       George W. Bush     Ronald Reagan
     fi       Barack Obama           Adolf Hitler    Michael Jackson        George W. Bush     Benito Mussolini
     no     Marilyn Monroe          Adolf Hitler    John F. Kennedy           Bob Dylan           Bill Clinton
     ru   William Shakespeare       Napoleon II    Kenneth Branagh            Elton John        Joseph Stalin
     zh    Chiang Kai-Shek    William Shakespeare    Barack Obama           Deng Xiaoping         Adolf Hitler
     ca         Adolf Hitler       Che Guevara         Juan Carlos I     Michael Schumacher   Juan Manuel Fangio


   Most are known to be (or to have been) highly influential
          We find political leaders, revolutionaries, famous
          musicians, writers and actors.
          Hitler, Bush and Obama dominate almost all top rankings.
          The top-ranked persons in many languages reflect
          country-specific particularities.


Languages similarity network
Every language links to the two most similar ones according to the Jaccard coefficient




   Definition of the Jaccard coefficient J
      Given the sets of links A and B of two networks:

                              J = |A ∩ B| / |A ∪ B|

         J is the ratio between the number of links present in both
         networks (their intersection) and the number of links
         existing in their union.
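
The definition above, together with the rule that every language links to its two most similar ones, can be sketched as follows. The sketch assumes that persons carry the same identifier in every language network (for instance the English article title), so that links can be compared across languages. The function names and the toy networks are hypothetical, and breaking ties alphabetically is an arbitrary choice of this sketch.

```python
def jaccard(links_a, links_b):
    """Jaccard coefficient J = |A ∩ B| / |A ∪ B| of two sets of links."""
    union = links_a | links_b
    if not union:
        return 0.0
    return len(links_a & links_b) / len(union)


def two_most_similar(networks):
    """Map each language to its two most similar languages by Jaccard."""
    result = {}
    for lang, links in networks.items():
        scored = [
            (jaccard(links, other_links), other)
            for other, other_links in networks.items()
            if other != lang
        ]
        # Highest coefficient first; ties broken alphabetically (arbitrary).
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        result[lang] = [other for _, other in scored[:2]]
    return result


# Toy networks, invented for illustration only.
networks = {
    "en": {("A", "B"), ("B", "C"), ("C", "A")},
    "es": {("A", "B"), ("B", "C")},
    "ca": {("A", "B"), ("C", "D")},
    "ru": {("C", "D")},
}
similar = two_most_similar(networks)
```

Here "en" and "es" share two of three distinct links (J = 2/3), so each lists the other as its most similar language.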

Intersection of networks in different languages




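
One way to compute such an intersection is to count, for every link, the number of language networks in which it appears, and keep the links above a threshold. This is a minimal sketch and not necessarily how the figure on this slide was produced. It assumes persons share one identifier across languages, and the function names and toy data are hypothetical.

```python
from collections import Counter


def link_language_counts(networks):
    """Count in how many language networks each directed link appears."""
    counts = Counter()
    for links in networks.values():
        counts.update(set(links))
    return counts


def shared_links(networks, min_languages):
    """Links present in at least `min_languages` of the networks."""
    counts = link_language_counts(networks)
    return {link for link, k in counts.items() if k >= min_languages}


# Toy networks, invented for illustration only.
toy_networks = {
    "en": {("A", "B"), ("B", "C")},
    "de": {("A", "B"), ("B", "C")},
    "fr": {("A", "B")},
}
in_all = shared_links(toy_networks, min_languages=len(toy_networks))
in_most = shared_links(toy_networks, min_languages=2)
```

Setting `min_languages` to the number of networks gives the strict intersection; lower values give the links found in most, but not all, language editions.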


Conclusions and future work
  Conclusions
     Global social network measures are largely similar for all
     networks.
      Most central persons unveil interesting peculiarities about
      the language communities.
      Networks are more similar for geographically or
      linguistically closer communities.
      Many connections can be found in most of the
      analysed language Wikipedias.

  Future work
      Apply the methodology to generate subnetworks for
      other categories of articles.
      Consider all biographies for each language.
      Analyse links missing only in a few language Wikipedias.

Questions?




SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 

Recently uploaded (20)

Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 

Biographical social networks on Wikipedia: A cross-cultural study

  • 1. Biographical social networks on Wikipedia: A cross-cultural study of links that made history
      Pablo Aragon, Andreas Kaltenbrunner, David Laniado and Yana Volkovich
      Social Media Research Group, Barcelona Media, Barcelona, Spain
      August 27th, 2012, WikiSym ’12, Linz, Austria
  • 2. Outline
      1 Introduction: Motivation; Data extraction
      2 Results: Global network statistics; Most central persons; Similarity between languages
      3 Conclusions
  • 3. Outline (section 1: Introduction)
  • 4. Motivation
      Is history made by great men and women, or vice versa?
        Unclear, but social connections undoubtedly shape history.
      Wikipedia as a global place of collective memory ...
        allows us to extract from biographies how social links are recorded across cultures ...
        and to generate networks of links between biographical articles.
      Research questions
        Who are the most central characters in these networks?
        Do culture-related peculiarities exist?
        Which cultures are more similar?
        What is the shared knowledge about connections between persons across cultures?
  • 5. Data extraction
      Building biographical networks for 15 language editions of Wikipedia
        Selected the 15 largest language editions of Wikipedia.
        Starting point: 296 511 biographies from the English Wikipedia (from DBpedia).
        Identified the corresponding articles (when existing) in the remaining 14 languages.
        Generated a directed network for each language version:
          nodes → persons
          edges → links between the articles of the corresponding persons
        Managed alternative titles of articles by tracking redirects.
        Data collected through the Wikipedia APIs between September 8th and 13th, 2011.
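The extraction steps on this slide can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the input dictionaries, the function names `resolve` and `build_network`, the toy redirect titles, and the choice to drop self-links are all assumptions made for the example.

```python
def resolve(title, redirects):
    """Follow redirect entries until a canonical article title is reached."""
    seen = set()
    while title in redirects and title not in seen:
        seen.add(title)  # guards against redirect loops
        title = redirects[title]
    return title


def build_network(page_links, redirects, persons):
    """Return the set of directed edges (source, target) between persons.

    page_links: article title -> list of linked titles (hypothetical input)
    redirects:  alternative title -> canonical title (hypothetical input)
    persons:    set of canonical titles of biographical articles
    """
    edges = set()
    for source, targets in page_links.items():
        src = resolve(source, redirects)
        if src not in persons:
            continue
        for target in targets:
            dst = resolve(target, redirects)
            # Assumption: links to non-biographies and self-links are ignored.
            if dst in persons and dst != src:
                edges.add((src, dst))
    return edges


# Toy, made-up data for one language edition.
persons = {"Barack Obama", "George W. Bush", "Bill Clinton"}
redirects = {"Obama": "Barack Obama", "Dubya": "George W. Bush"}
page_links = {
    "Barack Obama": ["Dubya", "Bill Clinton", "Chicago"],
    "George W. Bush": ["Obama"],
    "Bill Clinton": ["Barack Obama", "Bill Clinton"],
}
edges = build_network(page_links, redirects, persons)
print(sorted(edges))
```

Resolving redirects before testing membership in `persons` is what lets a link written with an alternative title still count as an edge to the right person.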
  • 6. Redirect statistics
      [Figure: distribution of the number of redirects per biographical article in the English Wikipedia; log-log plot of # articles against # redirects per article]
      Persons with most redirects:
        Muammar al-Gaddafi          251
        Osama bin Laden             117
        Barack Obama                114
        Jesus                       109
        Elizabeth II                101
        Eminem                       96
        Joseph Stalin                88
        Omar al-Bashir               87
        Genghis Khan                 84
        Pyotr Ilyich Tchaikovsky     84
        Æthelred the Unready         83
        George W. Bush               80
        Mary (mother of Jesus)       80
  • 7. Outline (section 2: Results)
  • 8. Properties of the different language networks
      Language    code        N        K     C   % GC     d     r  dmax
      English     en    198 190  928 339  0.03    95%  6.53  0.17    43
      German      de     62 402  260 889  0.05    94%  6.83  0.14    33
      French      fr     51 811  283 453  0.06    96%  6.11  0.15    36
      Italian     it     35 756  190 867  0.06    95%  6.28  0.14    42
      Spanish     es     34 828  169 302  0.06    97%  6.29  0.16    36
      Japanese    ja     26 155  109 081  0.08    96%  6.47  0.20    26
      Dutch       nl     24 496   76 651  0.08    94%  7.91  0.18    37
      Portuguese  pt     23 705   85 295  0.07    94%  6.98  0.18    45
      Swedish     sv     23 085   60 745  0.07    91%  8.27  0.20    46
      Polish      pl     22 438   50 050  0.08    85%  8.94  0.16    43
      Finnish     fi     18 594   44 941  0.07    87%  7.80  0.17    30
      Norwegian   no     18 423   49 303  0.09    83%  8.31  0.22    48
      Russian     ru     16 403   34 436  0.06    87%  9.10  0.10    35
      Chinese     zh     11 715   44 739  0.17    91%  7.20  0.20    32
      Catalan     ca     11 027   42 321  0.09    93%  7.14  0.17    32
      N, K → number of (non-isolated) nodes and edges
      C → average clustering coefficient
      % GC → percentage of nodes in the giant component
      d → average path length between nodes
      r → reciprocity
      dmax → maximal distance between two nodes in the network
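Two of the measures in this table, reciprocity r and the giant component share % GC, can be sketched in plain Python. The slides do not give the exact definitions used, so this sketch assumes that r is the fraction of directed edges whose reverse edge also exists, and that the giant component is the largest weakly connected component among non-isolated nodes. The function names and the toy edge set are made up for illustration.

```python
from collections import defaultdict, deque


def reciprocity(edges):
    """Fraction of directed edges (u, v) for which (v, u) also exists."""
    edges = set(edges)
    if not edges:
        return 0.0
    mutual = sum(1 for (u, v) in edges if (v, u) in edges)
    return mutual / len(edges)


def giant_component_share(edges):
    """Share of non-isolated nodes in the largest weakly connected component."""
    neighbours = defaultdict(set)
    for u, v in edges:
        # Edge direction is ignored here (assumption: weak connectivity).
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen, largest = set(), 0
    for start in neighbours:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:  # breadth-first search over one component
            node = queue.popleft()
            size += 1
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        largest = max(largest, size)
    return largest / len(neighbours) if neighbours else 0.0


# Toy directed network: one mutual pair, one extra link, one separate pair.
toy_edges = {("a", "b"), ("b", "a"), ("b", "c"), ("d", "e")}
print(reciprocity(toy_edges))            # 2 of the 4 edges are reciprocated
print(giant_component_share(toy_edges))  # {a, b, c} out of 5 nodes
```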
  • 9. Most central persons in the English Wikipedia, sorted by in-degree. Ranks for out-degree, betweenness and PageRank in parentheses.
      person                   in-degree  out-degree (rank)  betw. (rank)  PageRank (rank)
      George W. Bush                2123       89 (107)          (1)        0.00209 (1)
      Barack Obama                  1677       51 (710)          (8)        0.00162 (2)
      Bill Clinton                  1660       74 (205)          (4)        0.00156 (4)
      Ronald Reagan                 1652       90 (103)          (2)        0.00156 (3)
      Adolf Hitler                  1407      119 (26)           (3)        0.00149 (5)
      Richard Nixon                 1299       86 (127)          (7)        0.00136 (6)
      William Shakespeare           1229       25 (4203)         (63)       0.00113 (9)
      John F. Kennedy               1208      104 (53)           (5)        0.00123 (8)
      Franklin D. Roosevelt         1052       71 (237)          (15)       0.00131 (7)
      Lyndon B. Johnson             1000      106 (50)           (12)       0.00108 (11)
      Jimmy Carter                   953       80 (158)          (9)        0.00113 (10)
      Elvis Presley                  948       82 (142)          (27)       0.00063 (24)
      Pope John Paul II              941       59 (444)          (11)       0.00083 (18)
      Dwight D. Eisenhower           891       55 (564)          (22)       0.00095 (14)
      Frank Sinatra                  882      108 (47)           (18)       0.00056 (28)
      George H. W. Bush              878       87 (118)          (19)       0.00096 (13)
      Abraham Lincoln                846       54 (593)          (40)       0.00089 (16)
      Bob Dylan                      835      151 (11)           (14)       0.00055 (30)
      Winston Churchill              748       84 (136)          (10)       0.00092 (15)
      Harry S. Truman                743       81 (145)          (24)       0.00099 (12)
      Joseph Stalin                  723       69 (265)          (43)       0.00089 (17)
      Michael Jackson                663       71 (237)          (34)       0.00042 (51)
      Elizabeth II                   653       52 (665)          (6)        0.00074 (19)
      Jesus                          572       38 (1595)         (51)       0.00068 (20)
      Hillary Rodham Clinton         554       87 (118)          (32)       0.00063 (25)
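The in-degree, out-degree and PageRank columns can be illustrated with a small sketch. It is not the computation behind the table: the damping factor of 0.85, the fixed number of iterations, the uniform redistribution of rank from nodes without outgoing links, and the toy graph are all assumptions. Betweenness is left out to keep the example short.

```python
from collections import Counter


def degrees(edges):
    """Return (in_degree, out_degree) counters for a set of directed edges."""
    in_deg = Counter(v for _, v in edges)
    out_deg = Counter(u for u, _ in edges)
    return in_deg, out_deg


def pagerank(edges, damping=0.85, iterations=100):
    """Plain power-iteration PageRank; damping=0.85 is an assumed default."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for u, v in edges:
        out_links[u].append(v)
    count = len(nodes)
    rank = {n: 1.0 / count for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {n: base for n in nodes}
        for u in nodes:
            for v in out_links[u]:
                new_rank[v] += damping * rank[u] / len(out_links[u])
        rank = new_rank
    return rank


# Toy network: three articles link to "hub", which links back to "a".
toy_edges = {("a", "hub"), ("b", "hub"), ("c", "hub"), ("hub", "a")}
in_deg, out_deg = degrees(toy_edges)
rank = pagerank(toy_edges)
for person, _ in in_deg.most_common():
    print(person, in_deg[person], out_deg[person], round(rank[person], 5))
```

In the toy network, "a" has an in-degree of only 1 but a high PageRank because its single incoming link comes from the hub, which is one way in-degree and PageRank rankings can diverge.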
  • 10. Most central persons in different language Wikipedias
      Top 5 most central persons for each language by betweenness
      lang  #1                   #2                 #3                      #4                  #5
      en    George W. Bush       Ronald Reagan      Adolf Hitler            Bill Clinton        John F. Kennedy
      de    Adolf Hitler         George W. Bush     Martin Luther King, Jr  Barack Obama        Frank Sinatra
      fr    Adolf Hitler         George W. Bush     William Shakespeare     Barack Obama        Jacques Chirac
      it    Frank Sinatra        George W. Bush     Pope John Paul II       Michael Jackson     Elton John
      es    Michael Jackson      Fidel Castro       William Shakespeare     Che Guevara         Adolf Hitler
      ja    Adolf Hitler         Michael Jackson    Ronald Reagan           Yukio Mishima       Barack Obama
      nl    Elvis Presley        Adolf Hitler       Bill Clinton            Joseph Stalin       William Shakespeare
      pt    Michael Jackson      Richard Wagner     Adolf Hitler            Ronald Reagan       David Bowie
      sv    George W. Bush       Winston Churchill  Elizabeth II            Michael Jackson     Adolf Hitler
      pl    Elizabeth II         Pope John Paul II  Margaret Thatcher       George W. Bush      Ronald Reagan
      fi    Barack Obama         Adolf Hitler       Michael Jackson         George W. Bush      Benito Mussolini
      no    Marilyn Monroe       Adolf Hitler       John F. Kennedy         Bob Dylan           Bill Clinton
      ru    William Shakespeare  Napoleon II        Kenneth Branagh         Elton John          Joseph Stalin
      zh    Chiang Kai-Shek      William Shakespeare  Barack Obama          Deng Xiaoping       Adolf Hitler
      ca    Adolf Hitler         Che Guevara        Juan Carlos I           Michael Schumacher  Juan Manuel Fangio
      Most are known to be (or to have been) highly influential.
      We find political leaders, revolutionaries, famous musicians, writers and actors.
      Hitler, Bush and Obama dominate in almost all top rankings.
      The top-ranked persons in many languages reflect country specificities.
  • 11. Languages similarity network
      Every language links to the two most similar ones according to the Jaccard coefficient.
      Definition of the Jaccard coefficient J: given the sets of links A and B of two networks,
        $J = \frac{|A \cap B|}{|A \cup B|}$
      J is the ratio between the number of links present in both networks (their intersection) and the number of links existing in their union.
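The Jaccard definition and the "two most similar languages" rule can be sketched as below. The edge sets are made-up toy data, the function names are hypothetical, and breaking ties alphabetically by language code is an assumption that the slides do not specify.

```python
def jaccard(links_a, links_b):
    """J = |A ∩ B| / |A ∪ B| for two sets of directed links."""
    a, b = set(links_a), set(links_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def two_most_similar(networks):
    """Map each language to its two most similar languages by Jaccard."""
    result = {}
    for lang, links in networks.items():
        scored = [
            (jaccard(links, other_links), other)
            for other, other_links in networks.items()
            if other != lang
        ]
        # Highest coefficient first; ties broken alphabetically (assumption).
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        result[lang] = [other for _, other in scored[:2]]
    return result


# Toy edge sets; a link is a (source person, target person) pair.
networks = {
    "en": {("p1", "p2"), ("p2", "p3"), ("p3", "p4")},
    "de": {("p1", "p2"), ("p2", "p3")},
    "fr": {("p1", "p2"), ("p4", "p5")},
    "es": {("p4", "p5")},
}
similar = two_most_similar(networks)
for lang in sorted(similar):
    print(lang, "->", similar[lang])
```

Because links are compared as (source, target) pairs, the same persons must be identified consistently across language editions before the sets are intersected.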
  • 12. Intersection of networks in different languages
      [Figure: intersection of networks in different languages]
  • 13. Outline (section 3: Conclusions)
  • 14. Conclusions and future work
      Conclusions
        Global social network measures are largely similar for all networks.
        The most central persons unveil interesting peculiarities about the language communities.
        Networks are more similar for geographically or linguistically closer communities.
        Many connections can be found in most of the analysed language Wikipedias.
      Future work
        Apply the methodology to generate subnetworks of other kinds of article categories.
        Consider all biographies for each language.
        Analyse links missing only in a few language Wikipedias.
  • 15. Questions?