The Internet is Big Data: How internet research has changed our understanding of the world
Ralph Schroeder, Professor & MSc Programme Director
Eric T. Meyer, Research Fellow & DPhil Programme Director, @etmeyer
http://www.slideshare.net/etmeyer/2012oiisvco
Silicon Valley Comes to Oxford 20:20 Talk, 18 November 2012
Introduction
What is the Oxford Internet Institute?
Introduction
Big Data: Our definition
Big data are data that are unprecedented in scale and scope in relation to a given phenomenon. They are often streams of data (rather than fixed datasets), accumulating large volumes, often at high velocity.
Business Value and Academic Value
Strategic Knowledge
• Generally time-limited (with exceptions)
• Value comes from knowing what your competitors don’t
• Often has high monetary value if it can be exploited
Business Value and Academic Value
Durable Knowledge
• Less time-limited (with exceptions)
• Value comes from adding to the world’s knowledge (the global brain is cumulative/scientific)
• Rarely has direct monetary value, but has value in terms of creating the possibility both of future knowledge and of future exploitation and commercial uses
Mark Graham & Bernie Hogan’s project investigated inequalities in the creation of knowledge.
The map shown here reveals uneven spread of geo-tagged Wikipedia articles, 2011-12.
Sandra González-Bailón et al.
USENET Political Discussions (1999-2005)
[Figure: time series with word clouds of frequent terms in the discussions, 09/1999 to 09/2004; vertical axis scaled x 10000]
Showed the connection between emotional reactions and U.S. presidential approval ratings, but also how emotional language underwent long-term shifts after events such as 9/11.
Source: González-Bailón, S., Banchs, R. E., & Kaltenbrunner, A. (2012). Emotions, Public Opinion, and U.S. Presidential Approval Rates: A 5-Year Analysis of Online Political Discussions. Human Communication Research, 38(2), 121-143.
Twitter-bots
OII master’s students Alexander Furnas and Devin Gaffney saw a large spike in then-US presidential candidate Mitt Romney’s Twitter followers, and decided to look at the new followers.
Furnas, A. and Gaffney, D. (2012). ‘Statistical Probability That Mitt Romney’s New Twitter Followers Are Just Normal Users: 0%’. The Atlantic, July 31, http://www.theatlantic.com/technology/archive/2012/07/statistical-probability-that-mitt-romneys-new-twitter-followers-are-just-normal-users-0/260539/ (accessed August 31, 2012).
[Figure: cloud of labels for kinds of online activity: Idea, Suggestion, Comment, Link, Copy, Share, Question, Thought, Information, Feeling, Action, Fact, Reply, Retweet, Support, Cats, Deny, Query, Denial]
“Surprisingly, the distribution of types of search query did not vary significantly across the different Lifestyle Groups (p>0.01).”
Source: Waller, V. (2011). “Not Just Information: Who Searches for What on the Search Engine Google?” Journal of the American Society for Information Science & Technology 62(4): 761-775.
Big Data Analytics
• Cost of analytical tools
• Access to data
• Why should anyone share?
• Skills to use the tools
• How different skills and disciplines work together
• From Big Data to Big (Hi-res) Picture
• Marketing Tailoring
• Forecasting Prediction
• A/B and other experiments
• Complex Trends Linking datasets plus modelling
Trawling the Sea of Big Data
Fishing for Knowledge in a Sea of Information
Image sources (All CC):
http://www.flickr.com/photos/mikecogh/6610354927/
http://www.flickr.com/photos/ponyboy101/2278057689/
http://www.flickr.com/photos/buzzhoffman/4140232128/
http://www.flickr.com/photos/circulating/1785350080/
Additional readings and references
Bond, Robert et al. (2012). ‘A 61-million-person experiment in social influence and political mobilization’, Nature 489: 295–298.
Bruns, A. and Liang, Y.E. (2012). ‘Tools and methods for capturing Twitter data during natural disasters’, First Monday, 17 (4 – 2), http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/viewArticle/3937/3193
Furnas, A. and Gaffney, D. (2012). ‘Statistical Probability That Mitt Romney’s New Twitter Followers Are Just Normal Users: 0%’. The Atlantic, July 31, http://www.theatlantic.com/technology/archive/2012/07/statistical-probability-that-mitt-romneys-new-twitter-followers-are-just-normal-users-0/260539/ (accessed August 31, 2012).
Giles, J. (2012). ‘Making the Links: From E-mails to Social Networks, the Digital Traces left by Life in the Modern World are Transforming Social Science’, Nature, 488: 448-50.
Kwak, H. et al. (2010). ‘What is Twitter, a Social Network or a News Media?’ Proceedings of the 19th International World Wide Web (WWW) Conference, April 26-30, 2010, Raleigh NC.
Manyika, J. et al. (2011). ‘Big data: the next frontier for innovation, competition and productivity’, McKinsey Global Institute, available at: http://www.mckinsey.com/insights/mgi/research/technology_and_innovation/big_data_the_next_frontier_for_innovation (last accessed August 29, 2012).
Silver, Nate. (2012). The Signal and the Noise: The Art and Science of Prediction. London: Allen Lane.
Tancer, B. (2009). Click: What Millions of People are Doing Online and Why It Matters. New York: HarperCollins.
Wu, S., J.M. Hofman, W.A. Mason, and D.J. Watts (2011). ‘Who says what to whom on twitter’, Proceedings of the 20th international conference on World Wide Web. (on Duncan Watts webpage, http://research.microsoft.com/en-us/people/duncan/, last accessed August 29, 2012).
Oxford Internet Institute
Ralph Schroeder, email@example.com, http://www.oii.ox.ac.uk/people/?id=26
Eric T. Meyer, firstname.lastname@example.org, http://www.oii.ox.ac.uk/people/?id=120, @etmeyer
http://www.slideshare.net/etmeyer/2012oiisvco
See http://www.oii.ox.ac.uk/research/projects/?id=98
With support from: