Distributed wikis
         Brianna Laugher
  Freedom in the Cloud miniconf
             LCA2011
[[SPEAKER:IS]]
   Wikipedian c. 2005-2010


       [[TALK:IS]]
      Strictly vaporware
       Philosophising


     [[TALK:ISNOT]]
          A demo
Concerned with technical specs
Distribute/decentralise what?
Part            Software                      Wiki

Interface       Already existed with          Barely exists, although
                centralised VCS.              possible. Vast majority of
                                              access via web UI.
Repository/     = DVCS                        meh
storage
Access point    Possible but not done;        “Marketplace of ideas”
                projects use official         model
                releases.
Community       Kinda, like Linux?            Not really
“Marketplace of ideas” model
● Multiple versions of articles
● Opposite of “One True Version”
● Some mechanism allows the best to “rise to the top” (like PageRank?)
    ● Isn't that like the internet before Wikipedia? …
● Similar to Knol? UrbanDictionary? StackOverflow?
● Problems:
  ● rewards older contributions
  ● evaluating is boring
  ● no canonical/reliable version
  ● does not force/reward collaboration
No more “One True Version”?


 “A new-generation Wikipedia based on Git-style
technologies could allow there to be not just one
Ocelot article per language, but an infinite number
of them, each of which could be easily mixed and
     merged into your own preferred version.”

                      – Anil Dash, “Forking is a Feature”
               http://dashes.com/anil/2010/09/forking-is-a-feature.html
Some ideas

● Wiki = VCS + prose text project + web UI.
● Copyleft license => “right to fork” => “keeps the bastards honest”.
● (Software) releases : (wiki) approved versions?
● English Wikipedia is 10. Can it survive to 20?
  ● Too big to fail?
  ● Too big to fork?
Wiki = web front-end for VCS
   for prose text content
What's missing from wiki VCS?
VCS for code vs prose
● Prose diffs need to be per-word, not per-line
● Code contributions are generally expected to be self-contained, and
  generally come in larger chunks than prose contributions
● Code needs to be machine readable, (optionally?) human readable. Onus is
  on contributor to check machine readability
   => higher technical barrier to contributing is widely accepted
● Drive-by vandalism of code is virtually non-existent
● Prose projects rarely do “releases”
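
To make the first point concrete, here is a minimal sketch of a per-word diff using Python's standard difflib. The function name word_diff and the sample sentences are made up for illustration; real wiki diff engines are considerably more involved.

```python
import difflib

def word_diff(old, new):
    """Return a word-level diff as a list of (tag, text) pairs.

    tag is '=' for unchanged words, '-' for removed, '+' for added.
    """
    old_words = old.split()
    new_words = new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == 'equal':
            changes.append(('=', ' '.join(old_words[i1:i2])))
            continue
        if i1 != i2:  # words present only in the old text
            changes.append(('-', ' '.join(old_words[i1:i2])))
        if j1 != j2:  # words present only in the new text
            changes.append(('+', ' '.join(new_words[j1:j2])))
    return changes

# Hypothetical one-line "article" before and after an edit.
old = "The ocelot is a small wild cat native to South America."
new = "The ocelot is a medium-sized wild cat native to the Americas."

for tag, text in word_diff(old, new):
    print(tag, text)
```

A per-line diff would flag the whole sentence as changed; splitting on words narrows the change to the few words that were actually edited.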
Merging for code vs prose



Code for unrelated technical functions should be
               able to be merged

  Can we make the same promise for prose?
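
One way to explore the question is a toy three-way merge that treats paragraphs the way a VCS treats lines. This is only a sketch under a strong assumption: no paragraphs are added, removed or reordered, so paragraph i lines up across all three versions. The name merge_paragraphs and the sample texts are hypothetical.

```python
def merge_paragraphs(base, ours, theirs):
    """Three-way merge of prose, one paragraph at a time.

    Returns (merged_text, conflicts), where conflicts lists the indices
    of paragraphs that both sides changed in different ways.
    """
    b = base.split("\n\n")
    o = ours.split("\n\n")
    t = theirs.split("\n\n")
    if not (len(b) == len(o) == len(t)):
        raise ValueError("paragraph counts differ; this sketch cannot align them")

    merged, conflicts = [], []
    for i, (pb, po, pt) in enumerate(zip(b, o, t)):
        if po == pt or pt == pb:
            merged.append(po)   # same edit on both sides, or only "ours" changed
        elif po == pb:
            merged.append(pt)   # only "theirs" changed
        else:
            merged.append(po)   # both changed differently: keep ours, flag it
            conflicts.append(i)
    return "\n\n".join(merged), conflicts

base = "Ocelots are wild cats.\n\nThey are nocturnal."
ours = "Ocelots are medium-sized wild cats.\n\nThey are nocturnal."
theirs = "Ocelots are wild cats.\n\nThey hunt mostly at night."

merged, conflicts = merge_paragraphs(base, ours, theirs)
print(merged)
print("conflicts:", conflicts)
```

A clean result here only means the two edits touched different paragraphs. It says nothing about whether the merged article still reads coherently, which is exactly the promise in question.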
Can Wikipedia survive another 10?



   Sense of dissatisfaction in the community

Unlike software, a certain critical mass is needed
             to stave off vandalism
Low barrier to entry
 (incl. anonymity)
          +
    high visibility
          +
    many pages
         =>
     vandalism
Wikipedia the monopoly
●   One destination: convenient and     ●   Practically impossible to fork
    simple for users                        ●   hardware/bandwidth
●   Great SEO (=> project growth)           ●   community
●   Potential for serendipity in        ●   Widespread bureaucratese,
    editor activities                       instruction creep
●   Consistency (at least               ●   Impersonal
    superficially)
MediaWiki has a write API!

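# Init site object
import mwclient
site = mwclient.Site('commons.wikimedia.org')
username, password = 'ExampleUser', 'example-password'  # placeholders
site.login(username, password) # Optional

# Edit page
page = site.Pages['Commons:Sandbox']
text = page.text()  # current wikitext; older mwclient releases used page.edit()
print('Text in sandbox:', text)
# Append a line and save the page with an edit summary
page.save(text + '\nExtra data', summary='Test edit')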
“Pending changes” aka
    “FlaggedRevs”
“Pending changes” separates

“what change I want to make”
          (a commit)
             from
“what I want users to receive”
 (tag as approved = “release”)
It's almost like
having a release branch
 mixed in with trunk....
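
The commit/release split can be sketched as a tiny data model. This is not how FlaggedRevs is implemented; the Revision and Page classes and their method names are hypothetical, chosen only to show that what editors see and what readers receive can diverge.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Revision:
    text: str
    approved: bool = False   # True once a reviewer has flagged it

@dataclass
class Page:
    revisions: List[Revision] = field(default_factory=list)

    def commit(self, text: str) -> int:
        """Record an edit ("what change I want to make"); return its index."""
        self.revisions.append(Revision(text))
        return len(self.revisions) - 1

    def approve(self, index: int) -> None:
        """Tag a revision as approved, i.e. fit for "release"."""
        self.revisions[index].approved = True

    def latest(self) -> Optional[str]:
        """What editors see: the tip of "trunk"."""
        return self.revisions[-1].text if self.revisions else None

    def released(self) -> Optional[str]:
        """What readers receive: the newest approved revision, if any."""
        for rev in reversed(self.revisions):
            if rev.approved:
                return rev.text
        return None

page = Page()
first = page.commit("Ocelots are wild cats.")
page.approve(first)
page.commit("Ocelots are AWESOME!!!")   # pending, not yet reviewed

print("editors see:", page.latest())
print("readers get:", page.released())
```

Every edit lands in the same history, but readers are served the most recent approved revision, hence the feeling of a release branch mixed in with trunk.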
Free license + API – what's the
           hold-up?



        Parsing mark-up :(

          Templates :( :(
“WikiProjects” FTW




● Self-organised groups of editors dedicated to a particular topic
  (e.g. Australia) or, less commonly, focus (e.g. standardising dates)
● Very informal, light-weight
● Narrower focus => better opportunity for community
Fork the UI?




(à la Twitter API)
“Why would anyone contribute to a
         feeder wiki?”


      Promise of Wikipedia visibility
                     +
   Domain-specific and relevant interface
                     +
              Community
What is community?
             People

           Intent/aims

           Social norms
          - for interacting
        - for contributing
         (e.g. style guide)

             License

Meta-planning for all of the above
In summary...




If forking Wikipedia is too hard, what can we do to
              make it practical again?
Credits
Screenshots and logos are © their respective owners
Wikipedia 10: David Peters, CC-BY-SA
http://commons.wikimedia.org/wiki/File:Wikipedia_10mark_rev_k.svg

Vandalism post-its: cdaltonrowe, CC-BY
http://www.flickr.com/photos/30485180@N06/3490537301/

Coloured post-its: Michael Goodine, CC-BY
http://www.flickr.com/photos/watchsmart/3227691975/

Branches: Piotrus, CC-BY-SA
http://commons.wikimedia.org/wiki/File:Shaped_tree_branches_Tenerife.JPG

Wikimania schedule editing: Kat Walsh, CC-BY-SA
http://commons.wikimedia.org/wiki/File:Wikimania2007_everythings_a_wiki.jpg

WikiProject Council logo: Neurolysis, CC-BY-SA
http://commons.wikimedia.org/wiki/File:WikiProject_Council.svg

flagged revs maybe: Neurolysis/Kotra, CC-BY-SA
http://commons.wikimedia.org/wiki/File:FlagRevsMaybe.png

pending changes clock logo: Adam Miller/Anomie/Dodoïste, CC-BY-SA
http://commons.wikimedia.org/wiki/File:Pending_changes_clock.svg
Thanks!

        brianna@modernthings.org

         identi.ca/pfctdayelise

          brianna.laugher.id.au

This work is © Brianna Laugher and licensed under
   the Creative Commons Attribution ShareAlike
      license, except where otherwise noted.
