Digital Methods Initiative
How can the internet be made to show what is happening in
society?
How to collect and analyze data and distill trends from the Web?
Follow the medium, as opposed to importing standard methods
from the social sciences.
tools @ dmi wiki
http://wiki.digitalmethods.net/Dmi/ToolDatabase?cat=DeviceCentric&subcat=Wikipedia
wikipedia bot edits
S. Niederer and J. van Dijck (2010). “The case of Wikipedia:
Wisdom of the crowd or technicity of content?” New Media &
Society.
Short version @ http://wiki.digitalmethods.net/Dmi/NetworkedContent
wikipedia bot edits scraper
How?
• Enter the link to an article
• Scraper retrieves all edit logs for an article
• Filters the logs for all mentions of ‘bot’ and ‘using’
• Returns permalink, date, time, user, permalink, comment
Why?
To find out how much an article's upkeep depends on bots.
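The steps above can be sketched roughly as follows. This is a minimal illustration, not the DMI tool itself: it swaps HTML scraping for the MediaWiki API, fetches only the most recent revisions rather than the full edit log, and assumes that ‘bot’ and ‘using’ are matched case-insensitively in both the user name and the edit comment. The names `fetch_revisions`, `to_row`, `bot_edits` and `BOT_HINT` are hypothetical.

```python
import json
import re
import urllib.parse
import urllib.request

# Assumption: English Wikipedia; swap the host for other language versions.
API = "https://en.wikipedia.org/w/api.php"
INDEX = "https://en.wikipedia.org/w/index.php"

# Assumption: 'bot' at the end of a word (ClueBot, Cydebot) or the word 'using'.
BOT_HINT = re.compile(r"bot\b|\busing\b", re.IGNORECASE)


def fetch_revisions(title, limit=500):
    """Fetch up to `limit` recent revisions of one article (network call).

    A full history would need to follow the API's continuation parameters,
    which this sketch leaves out.
    """
    params = {
        "action": "query", "format": "json", "formatversion": "2",
        "prop": "revisions", "titles": title,
        "rvprop": "ids|timestamp|user|comment", "rvlimit": str(limit),
    }
    url = API + "?" + urllib.parse.urlencode(params)
    req = urllib.request.Request(url, headers={"User-Agent": "dmi-sketch/0.1"})
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["query"]["pages"][0].get("revisions", [])


def to_row(rev, title):
    """Turn one API revision into permalink, date, time, user, comment."""
    date, _, time = rev["timestamp"].rstrip("Z").partition("T")
    permalink = INDEX + "?" + urllib.parse.urlencode(
        {"title": title, "oldid": rev["revid"]})
    return {"permalink": permalink, "date": date, "time": time,
            "user": rev.get("user", ""), "comment": rev.get("comment", "")}


def bot_edits(revisions, title):
    """Keep only the edits whose user name or comment hints at a bot or tool."""
    rows = [to_row(rev, title) for rev in revisions]
    return [row for row in rows
            if BOT_HINT.search(row["user"]) or BOT_HINT.search(row["comment"])]
```

Calling `bot_edits(fetch_revisions("Amsterdam"), "Amsterdam")` would return the matching rows for that article. The share of such rows among all revisions gives a first, crude indication of bot dependency.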
wikipedia edits scraper and ip localizer
How?
• Enter the link to an article
• Scraper retrieves all edit logs for an article
• When an IP address is encountered instead of a username, MaxMind's
IP-to-geo database is queried for location information
• Returns permalink, date, time, user (or IP), permalink,
comment, (city, country, lat, lon)
Why? Edit-history analysis, scandal research, places of edits.
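The IP step can be illustrated with a short sketch. It assumes each edit is a dictionary with a `user` field and that the geo lookup is passed in as a function, so that a lookup backed by MaxMind's database (or any other source) could be plugged in. The names `is_ip`, `localize`, `lookup` and `GEO_FIELDS` are hypothetical and not taken from the DMI tool.

```python
import ipaddress

# Output columns added to each edit; None when no location is known.
GEO_FIELDS = ("city", "country", "lat", "lon")


def is_ip(user):
    """Return True if the user field is an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(user)
        return True
    except ValueError:
        return False


def localize(rows, lookup):
    """Add city, country, lat, lon to each edit made from an IP address.

    `lookup` is a caller-supplied function (an assumption of this sketch)
    that maps an IP string to a dict with the GEO_FIELDS keys, or to None.
    """
    out = []
    for row in rows:
        geo = lookup(row["user"]) if is_ip(row["user"]) else None
        geo = geo or {}
        out.append({**row, **{field: geo.get(field) for field in GEO_FIELDS}})
    return out
```

Edits by registered users pass through unchanged apart from the four empty location fields, so the output keeps one row per edit and can be tabulated directly.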
ip to geo cases
Scandal research
WikiScanner (http://wikiscanner.virgil.gr)
Places of edits
http://mastersofmedia.hum.uva.nl/2007/10/07/repurposing-the-wikiscanner-comparing-dutch-universities-edits-on-wikipedia/
wikipedia network analysis
How?
• Enter the link to an article
• Scraper retrieves all bidirectional links to the article, from
within Wikipedia
• Scraper parses those articles and retrieves all their links
• (repeat the previous step up to a chosen depth)
• List links in table (link from -> to)
• Visualize
Why? Article network ecology.
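The crawl described above amounts to a depth-limited breadth-first traversal that produces a "from -> to" edge table. The sketch below is one possible reading of it, under stated assumptions: link retrieval is abstracted into a caller-supplied `get_links(title)` function (in practice it might wrap a scraper or the MediaWiki API), and bidirectional links are identified after the crawl rather than used to seed it. The names `crawl`, `reciprocal` and `get_links` are hypothetical.

```python
from collections import deque


def crawl(seed, get_links, max_depth=2):
    """Breadth-first crawl from `seed`; returns a list of (source, target) edges.

    Pages at `max_depth` are recorded as targets but not expanded further.
    """
    edges = []
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth >= max_depth:
            continue
        for target in get_links(page):
            edges.append((page, target))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return edges


def reciprocal(edges):
    """Return the sorted pairs of pages that link to each other."""
    edge_set = set(edges)
    return sorted({tuple(sorted((a, b))) for a, b in edge_set
                   if a != b and (b, a) in edge_set})
```

The resulting edge list can be written out as a two-column table and loaded into a graph visualization tool. Keeping `get_links` separate from the traversal makes it possible to test the crawl on a small in-memory graph before pointing it at Wikipedia.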
wip: controversy generator
Wikipedia can be seen as a controversy-defusing device, as it
strives for well-balanced articles written from a neutral point of view (NPOV).
What if one disentangles the consensus and lays bare
controversies? How would one do that?
wip: controversy generator, possible ways forward
• analyze traces in the system
• edit-histories
• protected pages
• number of followers
• forkings / splits
• article length
• bot edits
• templates (detecting controversy types)
• ...