Introduction to Computational Social Science - Lecture 1

First lecture of the course CSS01: Introduction to Computational Social Science at the University of Helsinki, Spring 2015. (http://blogs.helsinki.fi/computationalsocialscience/).

Lecturer: Lauri Eloranta
Questions & Comments: https://twitter.com/laurieloranta

Published in: Data & Analytics

  1. 1. INTRODUCTION TO COMPUTATIONAL SOCIAL SCIENCE LECTURE 1, 1.9.2015 INTRODUCTION TO COMPUTATIONAL SOCIAL SCIENCE (CSS01) LAURI ELORANTA
  2. 2. @LAURIELORANTA
  3. 3. IT IS A JUNGLE OUT THERE: DATA MINING, DATA AND SOCIETY, BIG DATA, PREDICTIVE ANALYSIS, DIGITAL METHODS, DIGITAL HUMANITIES, SOCIAL NETWORK ANALYSIS, PROGRAMMING IN SOCIAL SCIENCE, COMPLEX SYSTEMS, DATA SCIENCE, HADOOP/MAP REDUCE, REACTIVE PROGRAMMING, PERSONAL DATA, MY DATA, OPEN DATA, IOT / WEARABLES. BUZZ HYPE BUZZ HYPE BUZZ HYPE. THE BACKGROUND IMAGE “JUNGLE” BY LUKE JONES IS UNDER CREATIVE COMMONS LICENSE. SEE ORIGINAL IMAGE HERE. SEE LICENSE TERMS HERE.
  4. 4. NOT THAT MUCH TALKING AND EVEN LESS DOING. ONLY A FEW PIONEERS IN THE DESERTED CSS SCENE IN FINLAND. THE BACKGROUND IMAGE “DESERT” BY MOYAN BRENN IS UNDER CREATIVE COMMONS LICENSE. SEE ORIGINAL IMAGE HERE. SEE LICENSE TERMS HERE.
  5. 5. • Practicalities • What is computational social science? • Areas of Computational Social Science • (Big) Data & automated information extraction • Social Networks • Social Complexity • Simulation • Research examples • Lecture 1 Reading LECTURE 1 OVERVIEW
  6. 6. PRACTICALITIES
  7. 7. • The slides and all materials will be online at http://blogs.helsinki.fi/computationalsocialscience/ • Course consists of • 8 Lectures • A Research Plan Assignment (required if you want the study credits, 5 op) • Any questions? • Contact lecturer Lauri Eloranta at firstname dot lastname @helsinki.fi PRACTICALITIES: GENERAL
  8. 8. • LECTURE 1: Introduction to Computational Social Science [TODAY] • Tuesday 01.09. 16:00 – 18:00, U35, Seminar room 114 • LECTURE 2: Basics of Computation and Modeling • Wednesday 02.09. 16:00 – 18:00, U35, Seminar room 113 • LECTURE 3: Big Data and Information Extraction • Monday 07.09. 16:00 – 18:00, U35, Seminar room 114 • LECTURE 4: Network Analysis • Monday 14.09. 16:00 – 18:00, U35, Seminar room 114 • LECTURE 5: Complex Systems • Tuesday 15.09. 16:00 – 18:00, U35, Seminar room 114 • LECTURE 6: Simulation in Social Science • Wednesday 16.09. 16:00 – 18:00, U35, Seminar room 113 • LECTURE 7: Ethical and Legal Issues in CSS • Monday 21.09. 16:00 – 18:00, U35, Seminar room 114 • LECTURE 8: Summary • Tuesday 22.09. 17:00 – 19:00, U35, Seminar room 114 LECTURES: SCHEDULE
  9. 9. • Course Book • Cioffi-Revilla, Claudio (2014). Introduction to Computational Social Science. Springer-Verlag, London. • Further Reading: LITERATURE: COURSE BOOK
  10. 10. • The full eBook is available via Helsinki University Library: https://helka.linneanet.fi/cgi-bin/Pwebrecon.cgi?BBID=2753081 LITERATURE: COURSE BOOK
  11. 11. LITERATURE: ADDITIONAL READING • Additional reading will be given for each lecture • Research articles on the topic at hand, some of which will be given as “homework reading” • The full list of articles can be found at: http://blogs.helsinki.fi/computationalsocialscience
  12. 12. • Write a short research plan where you apply a computational social science method to a research problem • Length: 8 pages for Master’s students, 10 pages for PhD students • Focus on research method <-> research data <-> research problem • How to write a research plan, general instructions: • http://www.uta.fi/cmt/en/doctoralstudies/apply/Tutkimussuunnitelmaohjeet_EN%5B1%5D.pdf • https://into.aalto.fi/display/endoctoraltaik/Research+Plan ASSIGNMENT: GENERAL
  13. 13. • The assignment deadline is Friday 2.10.2015 at end of day (midnight). • All assignments are returned in PDF format • How do I save my work in PDF format? -> You can ”Save as PDF” or ”Print to PDF” in MS Word • Include your name, student ID and contact details • Assignments are returned to the lecturer Lauri Eloranta via email: firstname dot lastname @ helsinki.fi • Grading is done within one month, and you will receive the study credits on or before 30.10.2015. ASSIGNMENT: HOW TO RETURN THE ASSIGNMENT
  14. 14. • Contains six courses, covering different aspects of computational social science • Full study block 25-30 op. • Basic courses (mandatory) • Introduction to Computational Social Science (5 op) (I period) • Introduction to Programming in Social Science (5 op) (II period) • Special courses • Data extraction (5 op) (IV period) • Network Analysis (5 op) (in 2016 – 2017) • Complex Systems (5 op) (III period) • Simulation (5 op) (in 2016 – 2017) COMPUTATIONAL SOCIAL SCIENCE STUDY BLOCK
  15. 15. WHAT IS COMPUTATIONAL SOCIAL SCIENCE?
  16. 16. “In short, a computational social science is emerging [field] that leverages the capacity to collect and analyze data with an unprecedented breadth and depth and scale.” (Lazer et al. 2009.) Lazer, D. et al. 2009. Computational Social Science. Science. 6 February 2009: Vol. 323, no. 5915, pp. 721-723.
  17. 17. • “In short, a computational social science is emerging [field] that leverages the capacity to collect and analyze data with an unprecedented breadth and depth and scale.” • Lazer, D. et al. 2009. Computational Social Science. Science. 6 February 2009: Vol. 323, no. 5915, pp. 721-723. LAZER ET AL. 2009
  18. 18. • “The increasing integration of technology into our lives has created unprecedented volumes of data on society’s everyday behaviour. Such data opens up exciting new opportunities to work towards a quantitative understanding of our complex social systems, within the realms of a new discipline known as Computational Social Science. Against a background of financial crises, riots and international epidemics, the urgent need for a greater comprehension of the complexity of our interconnected global society and an ability to apply such insights in policy decisions is clear.” (Conte et al. 2012) • Conte, R. et al. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics. November 2012: Vol. 214, Issue 1, pp. 325-346. CSS MANIFESTO (CONTE ET AL. 2012)
  19. 19. • “Computational social science refers to the academic sub-disciplines concerned with computational approaches to the social sciences. Fields include computational economics and computational sociology. It is a multi-disciplinary and integrated approach to social survey focusing on information processing by means of advanced information technology. The computational tasks include the analysis of social networks and social geographic systems.” • (Wikipedia 2015, http://en.wikipedia.org/wiki/Computational_social_science) WIKIPEDIA
  20. 20. • “The new field of Computational Social Science can be defined as the interdisciplinary investigation of the social universe of many scales, ranging from individual actors to the largest groupings, through the medium of computation.” (Cioffi-Revilla, 2014.) CIOFFI-REVILLA, 2014 Cioffi-Revilla, Claudio (2014). Introduction to Computational Social Science. Springer-Verlag, London.
  21. 21. INCREASINGLY COMPLEX SOCIETY. THE BACKGROUND IMAGE “POINT AND LINE TO (MULTIPLE) PLANE(S).” BY RODRIGO CARVALHO IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE. SEE ORIGINAL IMAGE HERE. SEE LICENSE TERMS HERE.
  22. 22. IT IS FOREMOST AN INSTRUMENTAL REVOLUTION. THE BACKGROUND IMAGE “TATEL TELESCOPE” BY EP_JHU IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE. SEE ORIGINAL IMAGE HERE. SEE LICENSE TERMS HERE.
  23. 23. COMPUTER SCIENCE, SOCIAL SCIENCE, STATISTICS: COMPUTATIONAL SOCIAL SCIENCE
  24. 24. Chart over time (vertical axis from less to more): • Speed and performance of IT (CPU, RAM, network) • Access to IT / Internet • Amount of data generated • Cost of IT
  25. 25. FUNDAMENTAL CHANGES IN RESEARCH SETUP. THE BACKGROUND IMAGE “HOME VISIT” BY NICOLAS NOVA IS UNDER CREATIVE COMMONS LICENSE. SEE ORIGINAL IMAGE HERE. SEE LICENSE TERMS HERE.
  26. 26. MAJOR QUESTIONS REGARDING RESEARCH ETHICS. THE BACKGROUND IMAGE “CAMÉRA DE SURVEILLANCE” BY TRISTAN NITOT IS UNDER CREATIVE COMMONS LICENSE. SEE ORIGINAL IMAGE HERE. SEE LICENSE TERMS HERE.
  27. 27. COMPUTATIONAL SOCIAL SCIENCE IS NOT A SILVER BULLET. THE BACKGROUND IMAGE “9MM BULLET BW” BY AN NGUYEN IS UNDER CREATIVE COMMONS LICENSE. SEE ORIGINAL IMAGE HERE. SEE LICENSE TERMS HERE.
  28. 28. Computational Social Science offers revolutionary opportunities for the social sciences, but it still faces challenges in relation to methods, interdisciplinary cooperation and research ethics.
  29. 29. 1. Solving increasingly complex problems: the problems of a global world are complex, and computational methods might be able to solve these complex issues 2. The rise of data: the amount of data has exploded during the 21st century 3. IT and instrumental revolution: all the new tools and possibilities 4. Complex systems: modeling our dynamic organisations and societies 5. Social networks: modeling human behavior as networks 6. Making predictions and simulations: predicting the future from the past 7. Interdisciplinary field: (social sciences, math, computer science…) 8. Many problems and challenges, especially regarding research ethics CSS COMPONENTS
  30. 30. • The information processing paradigm has two aspects in relation to CSS: 1. Information processing is substantive to the complex systems of society that CSS researches: this means that information processing takes part in the formation and evolution of complex systems. 2. Information processing is methodological in the sense that it serves as the core instrument of CSS COMPUTATIONAL PARADIGM OF SOCIETY (Cioffi-Revilla, 2014.)
  31. 31. THE MAIN AREAS OF CSS: 1. BIG DATA & AUTOMATED INFORMATION EXTRACTION 2. SOCIAL NETWORK ANALYSIS 3. COMPLEX SYSTEMS & MODELING 4. SIMULATION
  32. 32. • Areas of Computational Social Science 1. (Big) Data & automated data extraction • Generate, retrieve, sort, modify, transform, … data 2. Social Networks • Network analysis and social networks 3. Social Complexity • Social complexity, complex adaptive systems, complex systems modeling 4. Simulation FOUR MAIN AREAS OF CSS (Cioffi-Revilla, 2014.)
  33. 33. • Data and automated information extraction can be seen as the foundation for the other areas of CSS • Raw data can be used as: 1. Data for its own sake: as research data -> data is the subject of research 2. Data for modeling or validating other phenomena via, e.g., network analysis, complex systems analysis or simulation • Data is generated, retrieved, modified, transformed,… for research purposes via computational automation BIG DATA & AUTOMATED INFORMATION EXTRACTION (Cioffi-Revilla, 2014.)
  34. 34. • A long tradition in network analysis (a much older field than CSS) • Social networks (Facebook, Twitter, etc.) are just one part of network analysis • Many other social interactions can be modeled as networks -> thus social networks are not technology dependent as such • -> e.g. modeling a family as a network • -> e.g. modeling a project as a network SOCIAL NETWORKS (Cioffi-Revilla, 2014.)
  35. 35. • Society seen as a complex adaptive system: • Phase transitions • Adaptation (multi-stage process) • Need -> intent -> capacity -> implementation • Goal • Information processing in many parts of complex adaptive systems • To help adaptation, allocating resources, coordination, … • Family as a complex adaptive system: • Development, hardships, births, deaths, successes, failures • Adaptation over decades SOCIAL COMPLEXITY (Cioffi-Revilla, 2014.)
  36. 36. • Three types of systems 1. Natural systems 2. Human systems 3. Artificial systems • Artificial systems (or artifacts) exist because they have a function: they serve as adaptive buffers between humans and nature • Humans pursue the strategy of building artifacts to achieve goals • Two kinds of artificial systems working in synergy • Tangible (e.g. roads, buildings) • Intangible (e.g. organisations, social structures) SIMON’S THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY (Cioffi-Revilla, 2014.)
  37. 37. • Large (and old) research field • Two main areas of simulation 1. Variable-Oriented Models • System Dynamics Models (e.g. modeling a nuclear plant) • Queuing Models (e.g. modeling how a box office line behaves) 2. Object-Oriented Models • Cellular automata (e.g. Game of Life: http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life, http://pmav.eu/stuff/javascript-game-of-life-v3.1.1/) • Agent-based models (e.g. modeling the communication of a project organisation of many individuals) • Also, Evolutionary Models SIMULATION (Cioffi-Revilla, 2014.)
  38. 38. • 4 main areas of Computational Social Science 1. Big data and automatic information extraction 2. Social networks 3. Social complexity 4. Simulation • Typically all of these work together • CSS has a lot of problems, especially concerning privacy and ethics • CSS is not a silver bullet and it does not replace other social science fields or methods: instead, CSS complements other research fields and methods SUMMARY
  39. 39. SOME RESEARCH EXAMPLES
  40. 40. • Tracking and predicting how flu or other contagious diseases spread • Based on network and social media analysis and modeling • Many different variations, one of the first: Google Flu Trends, based on flu-related search queries • For example: • Achrekar, H.; Gandhe, A.; Lazarus, R.; Ssu-Hsin Yu; Benyuan Liu, 2011. Predicting Flu Trends using Twitter data. Computer Communications Workshops (INFOCOM WKSHPS), 2011 IEEE Conference on, pp. 702-707, 10-15 April 2011 MODELING THE SPREAD OF DISEASES: ALREADY AN EPIDEMIOLOGY CLASSIC
  41. 41. • http://www.google.org/flutrends/intl/en_us/ GOOGLE FLU TRENDS
  42. 42. • Leskovec, J.; Backstrom, L.; Kleinberg, J. 2009. Meme-tracking and the dynamics of the news cycle. Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 497-506. • Tracking new topics, ideas, and "memes" across the Web has been an issue of considerable interest. Recent work has developed methods for tracking topic shifts over long time scales, as well as abrupt spikes in the appearance of particular named entities. However, these approaches are less well suited to the identification of content that spreads widely and then fades over time scales on the order of days - the time scale at which we perceive news and events. • We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad class of memes that exhibit wide spread and rich variation on a daily basis. MODELING NEWS CYCLE DYNAMICS
  43. 43. • Athanasiadis, I. N.; Mentes, A. K.; Mitkas, P. A.; Mylopoulos, Y. A. 2005. A Hybrid Agent-Based Model for Estimating Residential Water Demand. SIMULATION, March 2005, 81: 175-187, doi:10.1177/0037549705053172 • Picardi, A. C. and Saeed, K. 1979. The dynamics of water policy in southwestern Saudi Arabia. SIMULATION, October 1979; vol. 33, 4: pp. 109-118. SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
  44. 44. • Venturini, T.; Laffite, N. B.; Cointet, J-P.; Gray, I.; Zabban, V.; De Pryck, K. 2014. Three maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data & Society, July-December 2014, 1: 2053951714543804, first published on August 5, 2014, doi:10.1177/2053951714543804 CLIMATE DIPLOMACY MAPPING
  45. 45. • Can electoral popularity be predicted using socially generated big data? Information Technology. Volume 56, Issue 5, Pages 246–253, ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046, September 2014 • Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections of these socially generated footprints, often known as big data, could help us to re-investigate different aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through the analysis of socially generated data on the web during electoral campaigns. Such data offer considerable possibility for improving our awareness of popularity dynamics. However they also suffer from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss potential ways around such problems, suggesting the nature of different political systems and contexts might lend differing levels of predictive power to certain types of data source. We offer an initial exploratory test of these ideas, focussing on two data streams, Wikipedia page views and Google search queries. On the basis of this data, we present popularity dynamics from real case examples of recent elections in three different countries. PREDICTING ELECTIONS?
  46. 46. • DIGIVAALIT 2015 • http://www.hiit.fi/digivaalit-2015 • Researching the parliamentary elections 2015 in Finland, focusing on digital media data (Twitter, Facebook) • Trying to understand how media is used and how the public agenda is set • CITIZEN MINDSCAPES • http://challenge.helsinki.fi/blog/citizen-mindscapes-kansakunnan-mielentila • Diving deep into the unscoped virtual territories of a nation’s collective consciousness may reveal something remarkable. The Finnish, hugely popular Suomi24 discussion forum has 1.9 million monthly visitors, who use the online town square to talk about anything and everything close to their hearts. If this data could be harnessed into research use, what amazing things could we learn about Finnish society? A team of media professionals at the forum’s owner company Aller and researchers at the National Consumer Research Center plan to make use of this immense database. DIGIVAALIT 2015 & CITIZEN MINDSCAPES
  47. 47. • Listen to “The Trust Engineers” podcast by Radiolab • http://www.radiolab.org/story/trust-engineers/ • Think about and discuss different ethical research issues in relation to what you heard ETHICS
  48. 48. • Lazer, D. et al. 2009. Computational Social Science. Science. 6 February 2009: Vol. 323, no. 5915, pp. 721-723. • Conte, R. et al. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics. November 2012: Vol. 214, Issue 1, pp. 325-346. • Anderson, C. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired. http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory • Einav, L. and Levin, J. 2014. The Data Revolution and Economic Analysis. In Innovation Policy and the Economy, edited by Josh Lerner and Scott Stern. http://web.stanford.edu/~leinav/pubs/IPE2014.pdf • King, G. 2011. Ensuring the Data-Rich Future of the Social Sciences. Science. 11 February 2011: Vol. 331, no. 6018, pp. 719-721. • Wallach, H. 2014. Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency. Medium.com. https://medium.com/@hannawallach/big-data-machine-learning-and-thesocial-sciences-927a8e20460d LECTURE 1 READING
  49. 49. Thank You! Questions and comments? twitter: @laurieloranta
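
Example for slide 37 (cellular automata): the simulation slide names Conway's Game of Life as an example of a cellular automaton. The sketch below is not from the lecture. It is one minimal way to write a single generation step in Python, assuming an unbounded grid stored as a set of live (x, y) cells. The function name life_step and the blinker starting pattern are illustrative choices.

```python
from collections import Counter


def life_step(live_cells):
    """Advance Conway's Game of Life by one generation.

    live_cells is a set of (x, y) tuples on an unbounded grid.
    """
    # For every cell next to at least one live cell, count its live neighbours.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live_cells
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is alive in the next generation if it has exactly three live
    # neighbours, or if it is alive now and has exactly two.
    return {
        cell
        for cell, count in neighbour_counts.items()
        if count == 3 or (count == 2 and cell in live_cells)
    }


# Illustrative starting pattern: a horizontal "blinker" of three cells.
blinker = {(0, 1), (1, 1), (2, 1)}
after_one = life_step(blinker)    # the row flips to a vertical column
after_two = life_step(after_one)  # and flips back to the original row
```

The same two rules are applied to every cell at once, which is the sense in which the slide groups cellular automata under object-oriented models: global patterns arise from purely local rules.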
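
Example for slide 34 (social networks): the slide suggests that a family can be modeled as a network. The sketch below is an illustration only, not material from the course. The family members and ties are invented, and the helper names build_network and distance are hypothetical. It stores the network as an adjacency mapping and uses breadth-first search to count how many ties separate two people.

```python
from collections import deque

# Invented family ties, treated as undirected edges.
ties = [
    ("grandmother", "mother"),
    ("grandmother", "aunt"),
    ("mother", "child_a"),
    ("mother", "child_b"),
    ("father", "child_a"),
    ("father", "child_b"),
    ("aunt", "cousin"),
]


def build_network(edges):
    """Return a dict mapping each person to the set of people they are tied to."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def distance(adjacency, start, goal):
    """Number of ties on the shortest path from start to goal, or None if unreachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, steps = queue.popleft()
        if person == goal:
            return steps
        for neighbour in adjacency.get(person, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, steps + 1))
    return None


network = build_network(ties)
mother_degree = len(network["mother"])                     # direct ties of one person
child_to_cousin = distance(network, "child_a", "cousin")   # ties separating two people
```

Nothing here depends on an online platform, which matches the slide's point that social networks are not technology dependent as such.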
