Introduction Peaks Growth measure Conclusions




      There is No Deadline - Time Evolution of
               Wikipedia Discussions

                 Andreas Kaltenbrunner                 David Laniado

                                Social Media Research Group,
                                      Barcelona Media,
                                      Barcelona, Spain


                                August 28th, 2012
                             WikiSym ’12, Linz, Austria




Andreas Kaltenbrunner & David Laniado           Time Evolution of Wikipedia Discussions

Outline

  1      Introduction
            Motivation
            Dataset

  2      Peaks
           Peak Detection Algorithm
           Peak Statistics

  3      Growth measure
           Discussion Complexity
           Growth in Complexity

  4      Conclusions



Motivation

  Wiki means quick in Hawaiian
        How to study the speed with which an article changes?
        First choice would be the number of edits per time unit.

  But the larger an article becomes ...
        more of its generative process happens in talk pages.
        ⇒ Looking at the associated discussion is often the most
        effective way to understand the editing process.

  Research questions
        What is the relationship between discussion and edits?
        How frequent are spikes of activity?
        How fast do discussions grow, and for how long?
        Which are the fastest discussions?

Dataset Dump of March 12th, 2010
Co-evolution of comments and edits

[Figure: number of comments and edits per day, 2003 to 2010 (top), with a zoom on the year 2007 (bottom). Left axis: number of edits; right axis: number of comments.]

                            6 comments per 100 edits

Example for a single article
Activity is less synchronised


[Figure: peaks in the discussion and edit activity of the article "Barack Obama"; one panel per year (2007 to 2010), x-axis from January 1 to December 31, y-axis #comments and #edits per day.]

How to detect peaks?

How to detect peaks?
Compare with median activity


[Figure: #comments and #edits per day for the article "Barack Obama", one panel per year (2007 to 2010), overlaid with the median #comments and the median #edits during ± 2 weeks.]





Peak if activity > c · max(m(t), n_min), adapted from [Lehmann 2012]
m(t): 4 weeks median, n_min: activity minimum, c: peak factor
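
The peak rule above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function name `detect_peaks`, the parameter `half_window`, and the choice to shorten the median window at the ends of the series are assumptions. The defaults c = 5 and n_min = 10 are the values quoted on the Peak Statistics slide.

```python
from statistics import median


def detect_peaks(daily_counts, c=5.0, n_min=10, half_window=14):
    """Return the day indices t where activity > c * max(m(t), n_min).

    m(t) is the median of the counts from t - half_window to t + half_window
    (about 4 weeks). The window is cut short at the ends of the series,
    which is an assumption of this sketch.
    """
    peaks = []
    for t, activity in enumerate(daily_counts):
        start = max(0, t - half_window)
        stop = min(len(daily_counts), t + half_window + 1)
        m_t = median(daily_counts[start:stop])
        if activity > c * max(m_t, n_min):
            peaks.append(t)
    return peaks


# Hypothetical series: a quiet month with one burst on day 15.
comments_per_day = [2] * 30
comments_per_day[15] = 60
print(detect_peaks(comments_per_day))  # prints [15]
```

The floor n_min keeps rarely discussed articles from producing peaks out of a handful of comments: with a median of 2, the threshold is 5 · 10 = 50 rather than 5 · 2 = 10.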


[Figure: #comments and #edits per day for the article "Barack Obama", one panel per year (2007 to 2010), with the medians during ± 2 weeks and the detected comment peaks and edit peaks marked.]





Edit and comment peaks do not always coincide ...
and can be caused by endogenous or exogenous events


[Figure: peaks in the discussion and edit activity of the article "Barack Obama", one panel per year (2007 to 2010), with insets zooming in on the annotated peaks.]

   Date                      Annotation
   15-Feb-2007               Endogenous peak
   11 and 12-Mar-2007        Endogenous peak
   17-Mar-2008               Endogenous peak
   04-Jun-2008               Nomination Win
   29-Aug-2008               Official Nomination
   10-Oct-2008               Endogenous peak
   05-Nov-2008               Pres. Elections
   20-Jan-2009               Pres. Inauguration
   09 and 10-Mar-2009        Endogenous peak
   09-Oct-2009               Nobel Prize Win





Peak Statistics with c = 5, n_min = 10: 2 580 discussion and 32 853 edit peaks

[Figure: three plots on logarithmic axes with power-law fits.]

   Plot                                                               Comment peaks                   Edit peaks
   number of articles vs. number of peaks per article                 y ~ x^-2.57 (1198 articles)     y ~ x^-3.50 (20681 articles)
   number of peaks vs. peak length (days)                             y ~ x^-3.52 (2580 peaks)        y ~ x^-3.97 (32853 peaks)
   number of time intervals vs. time between consecutive peaks (days) y ~ x^-1.41                     y ~ x^-1.36
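
The fits reported on this slide have the form y ~ x^a with a negative exponent a. The slides do not say how the exponents were estimated. One simple option, shown here purely as an assumption, is an ordinary least-squares line through the points on log-log axes; the function name `fit_power_law_exponent` and the synthetic data are hypothetical. Maximum-likelihood estimators are often preferred for heavy-tailed data, so treat this as a rough sketch.

```python
import math


def fit_power_law_exponent(xs, ys):
    """Least-squares slope of log(y) against log(x), i.e. a in y ~ x^a."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    covariance = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    variance = sum((lx - mean_x) ** 2 for lx in log_x)
    return covariance / variance


# Synthetic histogram that follows y = 1000 * x^-2.5 exactly (not real data).
peaks_per_article = list(range(1, 11))
article_counts = [1000 * x ** -2.5 for x in peaks_per_article]
print(round(fit_power_law_exponent(peaks_per_article, article_counts), 2))  # prints -2.5
```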




   In the entire dataset
        only 12% of all comment peaks coincide with an edit peak
                                    27% when allowing one day of difference
                                    33.8% when allowing two.

   Peaks in the discussion activity do not have to lead to peaks in
   the editing activity as well.
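
The coincidence percentages above can be computed along the following lines. This is a minimal sketch under stated assumptions: each peak is represented by a single day index for one article, and the function name `coincidence_fraction` and the example days are hypothetical, not taken from the dataset.

```python
def coincidence_fraction(comment_peak_days, edit_peak_days, tolerance=0):
    """Fraction of comment peaks with an edit peak within `tolerance` days."""
    if not comment_peak_days:
        return 0.0
    edit_days = set(edit_peak_days)
    hits = sum(
        any(day + shift in edit_days for shift in range(-tolerance, tolerance + 1))
        for day in comment_peak_days
    )
    return hits / len(comment_peak_days)


# Hypothetical peak days for a single article.
comment_peak_days = [10, 20, 30, 40]
edit_peak_days = [10, 21, 33]
for tolerance in (0, 1, 2):
    print(tolerance, coincidence_fraction(comment_peak_days, edit_peak_days, tolerance))
```

Widening the tolerance can only keep or increase the fraction, which matches the growth from 12% to 27% to 33.8% reported above.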

Top 10 articles with most comment and edit peaks
   Title                                             #comment-peaks      #edit-peaks
   Intelligent design                                      15                 2
   September 11 attacks                                    15                 3
   Race and intelligence                                   14                 5
   British Isles                                           11                 0
   Main page                                               11                 0
   Anarchism                                               10                 12
   Catholic church                                         10                 0
   Canada                                                  10                 0
   Transnistria                                             9                 3
   New Anti-Semitism                                        9                 0

   Title                                               #edit-peaks       #com.-peaks
   Uxbridge, Massachusetts                                 19                 0
   Voodoo (D’Angelo album)                                 17                 0
   List of World Wrestling Entertainment employees         16                 3
   Super Smash Bros. Brawl                                 16                 2
   Michael Jackson                                         16                 1
   The Biggest Loser: Couples 2                            16                 0
   Roger Federer                                           15                 0
   Rafael Nadal                                            15                 0
   List of Barney & Friends episodes and videos            15                 0
   Total Drama Action                                      15                 0





How to measure the complexity of a Discussion?
Discussion tree for article “Presidency of Barack Obama”




   [Figure: discussion tree]
   red → root (the article)
   blue → structural nodes
   green → anonymous comments
   grey → registered comments





Using the h-index of a discussion introduced in [Gómez 2008]

   The h-index ...
         is a balanced depth measure.
         is the maximal number h such that there are at least h comments
         at level (depth) h, but not h + 1 comments at level h + 1.
         In other words, there are h sub-threads of depth at least h.

   Example
         [Figure: discussion tree with h-index = 3]
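
The definition above can be turned into a short computation. The sketch below is an illustration rather than the code from [Gómez 2008]: it assumes the discussion is given as a hypothetical `parents` mapping from each comment to the node it replies to, with the article as root at depth 0, and it reads the definition as the largest h for which level h holds at least h comments. Structural nodes are ignored here.

```python
from collections import Counter


def comment_depths(parents, root="article"):
    """Map each comment to its depth; direct replies to the root have depth 1."""
    depths = {root: 0}
    for node in parents:
        path = []
        while node not in depths:  # climb until a node with known depth
            path.append(node)
            node = parents[node]
        base = depths[node]
        for offset, child in enumerate(reversed(path), start=1):
            depths[child] = base + offset
    del depths[root]
    return depths


def h_index(parents, root="article"):
    """Largest h such that at least h comments sit at depth h (0 if none)."""
    per_level = Counter(comment_depths(parents, root).values())
    return max((h for h, count in per_level.items() if count >= h), default=0)


# Hypothetical tree: 3 comments each at depths 1, 2 and 3, one at depth 4.
parents = {
    "a1": "article", "a2": "article", "a3": "article",
    "b1": "a1", "b2": "a1", "b3": "a2",
    "c1": "b1", "c2": "b1", "c3": "b3",
    "d1": "c1",
}
print(h_index(parents))  # prints 3
```

Level 4 holds only one comment, so the index stops at 3 even though the tree is four levels deep. This is what makes the measure balanced: a single long chain of replies does not raise it.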




Example for growth of discussions

   [Figure: two panels showing the number of comments (log scale) from Jan-2002 to Jan-2010 for the discussions of George W. Bush, Barack Obama and Bill Clinton; the left panel also shows a smoothed trend for each article]

                             Can we use the h-index to measure this growth?





Example for growth rate ∆h

   [Figure: h-index over time (Jan-2003 to Jan-2009) for the discussions of George W. Bush, Barack Obama and Bill Clinton]

   George W. Bush   ∆h = 70.7 days
   Barack Obama     ∆h = 90.2 days
   Bill Clinton     ∆h = 331.9 days

  We define the growth rate ∆h as
                 the average time a discussion needs to increase its h-index by one:

                     \Delta h = \frac{t_h - t_1}{h - 1}

                 It is related to the inverse of the m-index proposed in [Hirsch 2005].
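
As a rough sketch of how this formula could be evaluated, the snippet below assumes that t_k is the date on which a discussion first reaches h-index k; the slide does not spell this out, so this reading is an assumption. The function name `growth_rate_days` and the toy dates are hypothetical and are not taken from the dataset.

```python
from datetime import date
from typing import Sequence


def growth_rate_days(reach_dates: Sequence[date]) -> float:
    """Average number of days needed to raise the h-index by one.

    reach_dates[k - 1] is assumed to be the date on which the discussion
    first reached h-index k, so the sequence has length h.
    """
    h = len(reach_dates)
    if h < 2:
        raise ValueError("the growth rate needs a final h-index of at least 2")
    t_1, t_h = reach_dates[0], reach_dates[-1]
    # (t_h - t_1) / (h - 1), expressed in days
    return (t_h - t_1).days / (h - 1)


# Made-up dates for illustration only: h-index 1 to 4 reached on these days.
toy = [date(2007, 1, 1), date(2007, 1, 11), date(2007, 2, 10), date(2007, 3, 2)]
print(growth_rate_days(toy))
```

Only the first and last dates enter the result, so the measure describes the overall pace of a discussion rather than the regularity of its individual steps.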

Distribution of growth rates ∆h
of all discussions with more than 1000 comments

   [Figure: histogram of ∆h; x-axis: days (log scale, 1 to 1000), y-axis: # discussions (0 to 60)]

   Different growth rates
          The rates at which discussions increase in complexity span
          several orders of magnitude.


The 15 fastest and slowest discussions (∆h and duration in days)
    Title                                           ∆h   start date    end date      duration   final h-index
    Virginia Tech massacre                         0.5   15-Apr-2007   20-Apr-2007          5               9
    2009 flu pandemic                              0.9   25-Apr-2009   30-Apr-2009          5               7
    Bronze Soldier of Tallinn                      0.9   26-Apr-2007   02-May-2007          6               7
    2009 Honduran constitutional crisis            1.0   27-Jun-2009   05-Jul-2009          8               8
    Seung-Hui Cho                                  1.0   16-Apr-2007   24-Apr-2007          8               8
    2008 Mumbai attacks                            1.0   26-Nov-2008   01-Dec-2008          5               6
    Israeli-occupied territories                   1.2   22-Sep-2005   03-Oct-2005         11              10
    International status of Abkhazia and South     1.3   25-Aug-2008   04-Sep-2008         10               8
    Air France Flight 447                          1.4   01-Jun-2009   08-Jun-2009          7               6
    7 July 2005 London bombings                    1.7   10-Jul-2005   15-Jul-2005          5               5
    State terrorism and the United States          1.7   15-Feb-2008   06-Mar-2008         20              13
    July 2009 Ürümqi riots                         1.9   06-Jul-2009   21-Jul-2009         15               9
    Henry Louis Gates arrest controversy           2.0   24-Jul-2009   09-Aug-2009         16               9
    Teach the Controversy                          2.6   11-Apr-2005   29-Apr-2005         18               8

    Shakespeare authorship question              485.6   02-Jun-2003   24-Jan-2010       2428               7
    Karl Marx                                    487.6   19-Sep-2004   21-Jan-2010       1950               6
    Led Zeppelin                                 511.9   31-Jan-2003   03-Feb-2010       2560               6
    Vampire                                      517.1   19-Nov-2002   18-Jul-2008       2068               6
    World War II casualties                      523.0   13-Sep-2004   29-Dec-2008       1568               4
    War on Terrorism                             533.1   07-Oct-2005   22-Feb-2010       1599               6
    Fathers’ rights movement                     546.0   07-Mar-2004   01-Sep-2008       1639               4
    Instant-runoff voting                        546.3   09-Jul-2003   03-Jan-2008       1639               5
    Scientific method                            553.5   15-Jun-2003   08-Jul-2009       2215               6
    France                                       566.5   13-Nov-2003   26-Jan-2010       2266               6
    Harry Potter                                 589.9   27-Nov-2002   02-Oct-2007       1770               6
    Anna Anderson                                604.5   17-Mar-2004   09-Jul-2007       1209               3
    New York City                                617.3   09-Dec-2003   03-Jan-2009       1852               5
    Pi                                           627.3   07-Dec-2002   20-Oct-2009       2509               6
    Christopher Columbus                        1159.0   24-Oct-2003   27-Feb-2010       2318               5
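
To put the spread in this table into numbers, the short sketch below compares the smallest and largest ∆h values listed (0.5 days for Virginia Tech massacre, 1159.0 days for Christopher Columbus). The variable names are illustrative, and the calculation covers only these two table entries, not the full distribution.

```python
import math

fastest_days = 0.5     # Virginia Tech massacre (smallest ∆h in the table)
slowest_days = 1159.0  # Christopher Columbus (largest ∆h in the table)

# How many times slower the slowest listed discussion grows than the fastest.
ratio = slowest_days / fastest_days
# The same gap expressed in powers of ten.
orders_of_magnitude = math.log10(ratio)

print(f"ratio = {ratio:.0f}, about {orders_of_magnitude:.1f} orders of magnitude")
```

The two extremes differ by a factor of a few thousand, i.e. more than three orders of magnitude.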


Conclusions and future work
  Conclusions
        Discussion and edit peaks occur mostly independently of
        each other.
        Both endogenous (Wikipedia-internal) and exogenous
        (offline world) events can cause such peaks.
        We have introduced a simple growth measure.
        Some discussions need only a few days to evolve, while
        the slowest go on for years.

  Future work
        Use metrics for early detection of controversies.
        Apply metrics on sub-threads to detect hot spots.
        Assess discussion maturity.



Questions?





Bibliography I
        [Gómez 2008] Vicenç Gómez, Andreas Kaltenbrunner & Vicente López.
        Statistical analysis of the social network and discussion threads in Slashdot.
        In WWW ’08: Proceeding of the 17th international conference on World Wide Web, pages 645–654,
        New York, NY, USA, 2008. ACM.

        [Hirsch 2005] J. E. Hirsch.
        An index to quantify an individual’s scientific research output.
        PNAS, vol. 102, no. 46, pages 16569–16572, 2005.

        [Lehmann 2012] J. Lehmann, B. Gonçalves, J.J. Ramasco & C. Cattuto.
        Dynamical Classes of Collective Attention in Twitter.
        In Proc. of WWW, 2012.




There is No Deadline - Time Evolution of Wikipedia Discussions

  • 1. Introduction Peaks Growth measure Conclusions There is No Deadline - Time Evolution of Wikipedia Discussions Andreas Kaltenbrunner David Laniado Social Media Research Group, Barcelona Media, Barcelona, Spain August 28th, 2012 WikiSym ’12, Linz, Austria Andreas Kaltenbrunner & David Laniado Time Evolution of Wikipedia Discussions
  • 2. Introduction Peaks Growth measure Conclusions Outline 1 Introduction Motivation Dataset 2 Peaks Peak Detection Algorithm Peak Statistics 3 Growth measure Discussion Complexity Growth in Complexity 4 Conclusions Andreas Kaltenbrunner & David Laniado Time Evolution of Wikipedia Discussions
  • 3. Introduction Peaks Growth measure Conclusions Motivation Dataset Outline 1 Introduction Motivation Dataset 2 Peaks Peak Detection Algorithm Peak Statistics 3 Growth measure Discussion Complexity Growth in Complexity 4 Conclusions Andreas Kaltenbrunner & David Laniado Time Evolution of Wikipedia Discussions
  • 4. Introduction Peaks Growth measure Conclusions Motivation Dataset Motivation Wiki means quick in Hawaiian How to study the speed with which an article changes? First choice would the number of edits per time unit. But the larger an article becomes ... more of its generative process happens in talk pages. ⇒ Looking at the associated discussion is often the most effective way to understand the editing process. Research questions What is the relationship between discussion and edits? How frequent are spikes of activity? How fast do discussions grow, and for how long? Which are the fastest discussions? Andreas Kaltenbrunner & David Laniado Time Evolution of Wikipedia Discussions
  • 5. Introduction Peaks Growth measure Conclusions Motivation Dataset Outline 1 Introduction Motivation Dataset 2 Peaks Peak Detection Algorithm Peak Statistics 3 Growth measure Discussion Complexity Growth in Complexity 4 Conclusions Andreas Kaltenbrunner & David Laniado Time Evolution of Wikipedia Discussions
  • 6. Introduction Peaks Growth measure Conclusions Motivation Dataset Dataset Dump of March 12th, 2010 Co-evolution of comments and edits 4 number of comments and edits per day x 10 15 9000 edits number of comments 12.5 comments 7500 number of edits 10 6000 7.5 4500 5 3000 2.5 1500 0 0 2003 2004 2005 2006 2007 2008 2009 2010 5 zoom on the year 2007 x 10 1.5 9000 number of comments number of edits 1.25 7500 1 6000 0.75 4500 0.5 3000 Jan−1 Feb Mar Apr May Jun Jul Ago Sep Oct Nov Dec−1 Dec−31 2007 6 comments per 100 edits Andreas Kaltenbrunner & David Laniado Time Evolution of Wikipedia Discussions
  • 7. Introduction Peaks Growth measure Conclusions Motivation Dataset Example for a single article Activity is less synchronised Peaks in the discussion and edit activity of the article "Barack Obama" 100 100 0 2010 0 300 300 200 100 2009 200 100 #comments, #edits per day. 0 0 300 300 200 100 0 2008 200 100 0 300 300 200 100 0 2007 #comments per day #edits per day 200 100 0 Jan−01 Feb Mar Apr May Jun Jul Ago Sep Oct Nov Dec−01 Dec−31 How to detect & David Laniado Andreas Kaltenbrunner peaks? Time Evolution of Wikipedia Discussions
  • 8. Introduction Peaks Growth measure Conclusions Peak Detection Algorithm Peak Statistics Outline 1 Introduction Motivation Dataset 2 Peaks Peak Detection Algorithm Peak Statistics 3 Growth measure Discussion Complexity Growth in Complexity 4 Conclusions Andreas Kaltenbrunner & David Laniado Time Evolution of Wikipedia Discussions
  • 9. Introduction Peaks Growth measure Conclusions Peak Detection Algorithm Peak Statistics How to detect peaks? Compare with median activity Peaks in the discussion and edit activity of the article "Barack Obama" 100 100 0 2010 0 300 300 200 100 2009 200 100 #comments, #edits per day. 0 0 300 300 200 100 0 2008 200 100 0 300 300 200 100 0 2007 #comments per day median #comments during ± 2 weeks 200 #edits per day median #edits during ± 2 weeks 100 0 Jan−01 Feb Mar Apr May Jun Jul Ago Sep Oct Nov Dec−01 Dec−31 Andreas Kaltenbrunner & David Laniado Time Evolution of Wikipedia Discussions
  • 10. Introduction Peaks Growth measure Conclusions Peak Detection Algorithm Peak Statistics Peak if activity > c · max(m(t), nmin ) adapted from [Lehmann 2012] m(t) . . . 4 weeks median, nmin . . . activity minimum, c . . . peak factor Peaks in the discussion and edit activity of the article "Barack Obama" 100 100 0 2010 0 300 300 200 100 2009 200 100 #comments, #edits per day. 0 0 300 300 200 100 0 2008 200 100 0 #comments per day 300 median #comments during ± 2 weeks 300 200 100 0 2007 comment peaks #edits per day median #edits during ± 2 weeks edit peaks 200 100 0 Jan−01 Feb Mar Apr May Jun Jul Ago Sep Oct Nov Dec−01 Dec−31 Andreas Kaltenbrunner & David Laniado Time Evolution of Wikipedia Discussions
  • 11. Introduction Peaks Growth measure Conclusions Peak Detection Algorithm Peak Statistics Edit and comment peaks do not always coincide ... and can be caused be endogenous or exogenous events Peaks in the discussion and edit activity of the article "Barack Obama" 100 400 05−Nov−2008 100 0 2010 09 and 10−Mar−2009 350 Endogenous peak 300 300 Pres. Elections 0 20−Jan−2009 250 09−Oct−2009 300 Pres. Inaguaration Nobel Price Win 300 2009 200 200 200 200 150 200 200 100 100 100 100 50 100 100 #comments, #edits per day. 0 0 0 0 0 04−Jun−2008 0 200 17−Mar−2008 200 Nomination Win 200 29−Aug−2008 200 10−Oct−2008 Endogenous peak Official Nomiation Endogenous peak 300 300 2008 150 150 150 150 100 100 100 100 200 200 50 50 50 50 0 0 0 0 100 100 0 0 15−Feb−2007 11 and 12−Mar−2007 Endogenous peak Endogenous peak #comments per day 300 200 200 median #comments during ± 2 weeks 300 200 100 0 2007 150 100 50 0 150 100 50 0 comment peaks #edits per day median #edits during ± 2 weeks edit peaks 200 100 0 Jan−01 Feb Mar Apr May Jun Jul Ago Sep Oct Nov Dec−01 Dec−31 Andreas Kaltenbrunner & David Laniado Time Evolution of Wikipedia Discussions
  • 12. Introduction Peaks Growth measure Conclusions Peak Detection Algorithm Peak Statistics Outline 1 Introduction Motivation Dataset 2 Peaks Peak Detection Algorithm Peak Statistics 3 Growth measure Discussion Complexity Growth in Complexity 4 Conclusions Andreas Kaltenbrunner & David Laniado Time Evolution of Wikipedia Discussions
  • 13. Introduction Peaks Growth measure Conclusions Peak Detection Algorithm Peak Statistics Peak Statistics with c = 5, n min = 10, 2 580 discussion and 32 853 edit peaks 5 1198 articles with comment peaks 10 2580 comment peaks comment peaks y~x−2.57 y~x−1.41 4 10 y~x−3.52 3 20681 articles with edit peaks 10 edit peaks 4 32853 edit peaks y~x−3.50 10 y~x−3.97 y~x−1.36 number of time intervalls 3 10 number of articles number of peaks 3 2 10 10 2 10 2 10 1 10 1 10 1 10 0 0 10 0 10 10 0 1 2 3 4 5 6 7 8 9 10 15 20 1 2 3 4 5 6 7 8 9 1 2 3 number of peaks per article 10 10 10 10 peak length (days) time between consecutive peaks (in days) In the entire dataset only 12% of all comment peaks coincide with an edit peak 27% when allowing one day of difference 33.8% when allowing two. Peaks in the discussion activity do not have to lead to peaks in the editing activity as well. Andreas Kaltenbrunner & David Laniado Time Evolution of Wikipedia Discussions
  • 14. Introduction Peaks Growth measure Conclusions Peak Detection Algorithm Peak Statistics Top 10 articles with most comment and edit peaks Title #comment-peaks #edit-peaks Intelligent design 15 2 September 11 attacks 15 3 Race and intelligence 14 5 British Isles 11 0 Main page 11 0 Anarchism 10 12 Catholic church 10 0 Canada 10 0 Transnistria 9 3 New Anti-Semitism 9 0 Title #edit-peaks #com.-peaks Uxbridge, Massachusetts 19 0 Voodoo (D’Angelo album) 17 0 List of World Wrestling Entertainment employees 16 3 Super Smash Bros. Brawl 16 2 Michael Jackson 16 1 The Biggest Loser: Couples 2 16 0 Roger Federer 15 0 Rafael Nadal 15 0 List of Barney & Friends episodes and videos 15 0 Total Drama Action 15 0 Andreas Kaltenbrunner & David Laniado Time Evolution of Wikipedia Discussions
  • 16. How to measure the complexity of a discussion?
      [Figure: discussion tree for the article “Presidency of Barack Obama”]
      Legend: red → root (the article); blue → structural nodes; green → anonymous comments; grey → registered comments
  • 17. Using the h-index of a discussion, introduced in [Gómez 2008]
      The h-index ...
        is a balanced depth measure.
        is the maximal number h such that there are at least h comments at level (depth) h, but not h + 1 comments at level h + 1.
      In other words, there are h sub-threads of depth at least h.
      [Figure: example discussion tree with h-index = 3]
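A minimal sketch of this definition follows. It assumes the discussion is given as a child-to-parent mapping and that the article root sits at depth 0, so top-level comments are at level 1; it ignores the structural nodes of real Wikipedia discussion trees. The function name, the data layout, and the toy tree are hypothetical. The code reads the definition as "the largest level h that holds at least h comments".

```python
from collections import Counter


def discussion_h_index(parent, root="root"):
    """h-index of a discussion tree given as {comment_id: parent_id}.

    Assumption: `root` stands for the article itself (depth 0) and every
    comment eventually leads back to it through `parent`.
    """
    depth = {root: 0}

    def depth_of(node):
        # Walk up until a node with known depth is found, then fill in
        # the depths of the visited nodes on the way back down.
        path = []
        while node not in depth:
            path.append(node)
            node = parent[node]
        d = depth[node]
        for n in reversed(path):
            d += 1
            depth[n] = d
        return d

    for node in parent:
        depth_of(node)

    # Number of comments at each level, excluding the root.
    per_level = Counter(d for n, d in depth.items() if n != root)
    # Largest level h that holds at least h comments.
    return max((h for h, count in per_level.items() if count >= h), default=0)


# Hypothetical toy tree: 3 comments at each of levels 1-3, 1 at level 4.
toy_tree = {
    "a": "root", "b": "root", "c": "root",
    "a1": "a", "a2": "a", "b1": "b",
    "a11": "a1", "a12": "a1", "b11": "b1",
    "a111": "a11",
}

print(discussion_h_index(toy_tree))  # prints 3
```

In the toy tree, level 3 still holds three comments while level 4 holds only one, so the h-index is 3.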
  • 19. Example for growth of discussions
      [Figure: two panels showing the number of comments (log scale) over time, Jan 2002 to Jan 2010, for the discussions of George W. Bush, Barack Obama and Bill Clinton, each with a smoothed trend.]
      Can we use the h-index to measure this growth?
  • 20. Example for growth rate ∆h
      [Figure: h-index over time, Jan 2003 to Jan 2009, for George W. Bush (∆h = 70.7 days), Barack Obama (∆h = 90.2 days) and Bill Clinton (∆h = 331.9 days).]
      We define the growth rate ∆h as the average time a discussion takes to increase its h-index by one:
      $\Delta h = \frac{t_h - t_1}{h - 1}$
      It is related to the inverse of the m-index proposed in [Hirsch 2005].
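The growth rate can be sketched in a few lines. The sketch assumes that t_k denotes the date on which the discussion's h-index first reached the value k, which is an interpretation of the slide rather than something it states; the function name and the example dates are hypothetical.

```python
from datetime import date


def growth_rate_days(first_reached):
    """Average number of days needed to raise the h-index by one.

    Assumption: `first_reached` maps each h-index value k to the date t_k
    on which the discussion first reached it, and contains the key 1.
    Implements delta_h = (t_h - t_1) / (h - 1), with h the largest key.
    """
    h = max(first_reached)
    if h < 2:
        raise ValueError("the growth rate needs a final h-index of at least 2")
    elapsed_days = (first_reached[h] - first_reached[1]).days
    return elapsed_days / (h - 1)


# Hypothetical discussion that gains one h-index level every 10 days.
toy_timeline = {
    1: date(2007, 1, 1),
    2: date(2007, 1, 11),
    3: date(2007, 1, 21),
    4: date(2007, 1, 31),
}

print(growth_rate_days(toy_timeline))  # prints 10.0
```

A small ∆h therefore means a fast-growing discussion, and a large ∆h a slow one.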
  • 21. Distribution of growth rates ∆h of all discussions with more than 1000 comments
      [Figure: histogram of the number of discussions against ∆h in days, logarithmic x-axis from 1 to 1000.]
      Different growth rates
        The rates at which discussions increase in complexity span several orders of magnitude.
  • 22. The 15 fastest and slowest discussions (∆h and duration in days)
      Title                                          ∆h      start date    end date      duration   final h-index
      Fastest:
      Virginia Tech massacre                         0.5     15-Apr-2007   20-Apr-2007      5            9
      2009 flu pandemic                              0.9     25-Apr-2009   30-Apr-2009      5            7
      Bronze Soldier of Tallinn                      0.9     26-Apr-2007   02-May-2007      6            7
      2009 Honduran constitutional crisis            1.0     27-Jun-2009   05-Jul-2009      8            8
      Seung-Hui Cho                                  1.0     16-Apr-2007   24-Apr-2007      8            8
      2008 Mumbai attacks                            1.0     26-Nov-2008   01-Dec-2008      5            6
      Israeli-occupied territories                   1.2     22-Sep-2005   03-Oct-2005     11           10
      International status of Abkhazia and South     1.3     25-Aug-2008   04-Sep-2008     10            8
      Air France Flight 447                          1.4     01-Jun-2009   08-Jun-2009      7            6
      7 July 2005 London bombings                    1.7     10-Jul-2005   15-Jul-2005      5            5
      State terrorism and the United States          1.7     15-Feb-2008   06-Mar-2008     20           13
      July 2009 Ürümqi riots                         1.9     06-Jul-2009   21-Jul-2009     15            9
      Henry Louis Gates arrest controversy           2.0     24-Jul-2009   09-Aug-2009     16            9
      Teach the Controversy                          2.6     11-Apr-2005   29-Apr-2005     18            8
      Slowest:
      Shakespeare authorship question              485.6     02-Jun-2003   24-Jan-2010   2428            7
      Karl Marx                                    487.6     19-Sep-2004   21-Jan-2010   1950            6
      Led Zeppelin                                 511.9     31-Jan-2003   03-Feb-2010   2560            6
      Vampire                                      517.1     19-Nov-2002   18-Jul-2008   2068            6
      World War II casualties                      523.0     13-Sep-2004   29-Dec-2008   1568            4
      War on Terrorism                             533.1     07-Oct-2005   22-Feb-2010   1599            6
      Fathers’ rights movement                     546.0     07-Mar-2004   01-Sep-2008   1639            4
      Instant-runoff voting                        546.3     09-Jul-2003   03-Jan-2008   1639            5
      Scientific method                            553.5     15-Jun-2003   08-Jul-2009   2215            6
      France                                       566.5     13-Nov-2003   26-Jan-2010   2266            6
      Harry Potter                                 589.9     27-Nov-2002   02-Oct-2007   1770            6
      Anna Anderson                                604.5     17-Mar-2004   09-Jul-2007   1209            3
      New York City                                617.3     09-Dec-2003   03-Jan-2009   1852            5
      Pi                                           627.3     07-Dec-2002   20-Oct-2009   2509            6
      Christopher Columbus                        1159.0     24-Oct-2003   27-Feb-2010   2318            5
  • 23. Conclusions and future work
      Conclusions
        Discussion and edit peaks occur mostly independently of each other.
        Both endogenous (Wikipedia-internal) and exogenous (offline-world) events can cause such peaks.
        We have introduced a simple growth measure.
        Some discussions need only a few days to evolve, while the slowest go on for years.
      Future work
        Use metrics for early detection of controversies.
        Apply metrics on sub-threads to detect hot spots.
        Assess discussion maturity.
  • 24. Questions?
  • 25. Bibliography I
      Vicenç Gómez, Andreas Kaltenbrunner & Vicente López. Statistical analysis of the social network and discussion threads in Slashdot. In WWW ’08: Proceeding of the 17th international conference on World Wide Web, pages 645–654, New York, NY, USA, 2008. ACM.
      J. E. Hirsch. An index to quantify an individual’s scientific research output. PNAS, vol. 102, no. 46, pages 16569–16572, 2005.
      J. Lehmann, B. Gonçalves, J. J. Ramasco & C. Cattuto. Dynamical Classes of Collective Attention in Twitter. In Proc. of WWW, 2012.