Introduction to Computational Social Science - Lecture 1

INTRODUCTION TO
COMPUTATIONAL
SOCIAL SCIENCE
LECTURE 1, 1.9.2015
INTRODUCTION TO COMPUTATIONAL SOCIAL SCIENCE (CSS01)
LAURI ELORANTA

DATA MINING
DATA AND SOCIETY
BIG DATA
PREDICTIVE ANALYSIS
DIGITAL METHODS
DIGITAL HUMANITIES
SOCIAL NETWORK ANALYSIS
PROGRAMMING IN SOCIAL SCIENCE
IT IS A JUNGLE
OUT THERE
COMPLEX SYSTEMS
DATA SCIENCE
HADOOP/MAP REDUCE
REACTIVE PROGRAMMING
PERSONAL DATA
MY DATA
OPEN DATA
IOT / WEARABLES
BUZZ
HYPE
BUZZ
HYPE
BUZZ
HYPE
THE BACKGROUND IMAGE “JUNGLE” BY LUKE JONES
IS UNDER A CREATIVE COMMONS LICENSE.

NOT THAT MUCH
TALKING AND
EVEN LESS
DOING
ONLY A FEW PIONEERS
IN THE DESERTED CSS SCENE IN FINLAND
THE BACKGROUND IMAGE “DESERT” BY MOYAN BRENN

• Practicalities
• What is computational social science?
• Areas of Computational Social Science
• (Big) Data & automated information extraction
• Social Networks
• Social Complexity
• Simulation
• Research examples
• Lecture 1 Reading
LECTURE 1 OVERVIEW

• The slides and all materials will be online at
http://blogs.helsinki.fi/computationalsocialscience/
• The course consists of
• 8 Lectures
• A Research Plan Assignment (required if you want the study credits, 5 op)
• Any questions?
• Contact lecturer Lauri Eloranta at firstname dot lastname @helsinki.fi
PRACTICALITIES: GENERAL

• LECTURE 1: Introduction to Computational Social Science [TODAY]
• Tuesday 01.09. 16:00 – 18:00, U35, Seminar room 114
• LECTURE 2: Basics of Computation and Modeling
• Wednesday 02.09. 16:00 – 18:00, U35, Seminar room 113
• LECTURE 3: Big Data and Information Extraction
• Monday 07.09. 16:00 – 18:00, U35, Seminar room 114
• LECTURE 4: Network Analysis
• LECTURE 5: Complex Systems
• Tuesday 15.09. 16:00 – 18:00, U35, Seminar room 114
• LECTURE 6: Simulation in Social Science
• Wednesday 16.09. 16:00 – 18:00, U35, Seminar room 113
• LECTURE 7: Ethical and Legal issues in CSS
• LECTURE 8: Summary
• Tuesday 22.09. 17:00 – 19:00, U35, Seminar room 114
LECTURES: SCHEDULE

• Course Book
• Cioffi-Revilla, Claudio (2014). Introduction to
Computational Social Science. Springer-Verlag, London.
• Further Reading:
LITERATURE: COURSE BOOK

• The full eBook is available via Helsinki
University Library:
https://helka.linneanet.fi/cgi-bin/Pwebrecon.cgi?BBID=2753081
LITERATURE: COURSE BOOK

LITERATURE: ADDITIONAL READING
• Additional reading will be given for each lecture
• Research articles on the topic at hand; some will be assigned as “homework
reading”
• The full list of articles can be found at:
http://blogs.helsinki.fi/computationalsocialscience

• Write a short research plan where you apply a computational social
science method to a research problem
• Length: 8 pages for Master’s students, 10 pages for PhD students
• Focus on research method <-> research data <-> research problem
• How to write a research plan, general instructions:
• http://www.uta.fi/cmt/en/doctoralstudies/apply/Tutkimussuunnitelmaohjeet_EN%5B1%5D.pdf
• https://into.aalto.fi/display/endoctoraltaik/Research+Plan
ASSIGNMENT: GENERAL

• The assignment deadline (DL) is Friday 2.10.2015 at end of day (midnight).
• All assignments are submitted in PDF format
• How do I save my work in PDF format? You can “Save as PDF” or “Print to PDF” in MS
Word
• Include your name, student ID and contact details
• Assignments are submitted to the lecturer Lauri Eloranta via email:
firstname dot lastname @ helsinki.fi
• Grading is done within one month, and you will receive the study
credits on or before 30.10.2015.
ASSIGNMENT: HOW TO RETURN THE ASSIGNMENT

• Contains six courses, covering different aspects of computational social
science
• Full study block 25-30 op.
• Basic courses (mandatory)
• Introduction to Computational Social Science (5 op) (I period)
• Introduction to Programming in Social Science (5 op) (II period)
• Special courses
• Data extraction (5 op) (IV period)
• Network Analysis (5 op) (in 2016 – 2017)
• Complex Systems (5 op) (III period)
• Simulation (5 op) (in 2016 – 2017)
COMPUTATIONAL SOCIAL
SCIENCE STUDY BLOCK

WHAT IS
COMPUTATIONAL
SOCIAL SCIENCE?

“In short, a computational social science is
emerging [field] that leverages the capacity
to collect and analyze data with an
unprecedented breadth and depth and
scale.” (Lazer et al. 2009.)
Lazer, D. et al. 2009. Computational Social Science. Science. 6 February 2009: Vol. 323, no. 5915, pp. 721-723.

• “The increasing integration of technology into our lives has created
unprecedented volumes of data on society’s everyday behaviour. Such
data opens up exciting new opportunities to work towards a quantitative
understanding of our complex social systems, within the realms of a
new discipline known as Computational Social Science. Against a
background of financial crises, riots and international epidemics, the
urgent need for a greater comprehension of the complexity of our
interconnected global society and an ability to apply such insights in
policy decisions is clear.” (Conte et al. 2012)
• Conte, R. et al. 2012. Manifesto of Computational Social Science. The
European Physical Journal Special Topics. November 2012: Vol. 214,
Issue 1, pp. 325-346.
CSS MANIFESTO (CONTE ET AL. 2012)

• “Computational social science refers to the academic sub-disciplines
concerned with computational approaches to the social sciences. Fields
include computational economics and computational sociology.
It is a multi-disciplinary and integrated approach to social survey
focusing on information processing by means of advanced information
technology. The computational tasks include the analysis of social
networks and social geographic systems.”
• (Wikipedia 2015, http://en.wikipedia.org/wiki/Computational_social_science)
WIKIPEDIA

• “The new field of Computational Social Science can be
defined as the interdisciplinary investigation of the social
universe of many scales, ranging from individual actors to
the largest groupings, through the medium of computation.”
(Cioffi-Revilla, 2014.)
CIOFFI-REVILLA, 2014
Cioffi-Revilla, Claudio (2014). Introduction to Computational Social Science.
Springer-Verlag, London.

INCREASINGLY
COMPLEX
SOCIETY
THE BACKGROUND IMAGE “POINT AND LINE TO (MULTIPLE) PLANE(S).” BY RODRIGO CARVALHO
IS UNDER A NON-COMMERCIAL CREATIVE COMMONS LICENSE.

IT IS FOREMOST AN
INSTRUMENTAL
REVOLUTION
THE BACKGROUND IMAGE “TATEL TELESCOPE” BY EP_JHU
IS UNDER A NON-COMMERCIAL CREATIVE COMMONS LICENSE.

COMPUTER SCIENCE
SOCIAL SCIENCE
STATISTICS
COMPUTATIONAL SOCIAL SCIENCE

Chart axes: Time; Less to More
• Speed and performance of IT (CPU, RAM, network)
• Access to IT / Internet
• Amount of data generated
• Cost of IT

FUNDAMENTAL
CHANGES IN
RESEARCH
SETUP
THE BACKGROUND IMAGE “HOME VISIT” BY NICOLAS NOVA

MAJOR
QUESTIONS
REGARDING
RESEARCH
ETHICS
THE BACKGROUND IMAGE “CAMÉRA DE SURVEILLANCE” BY TRISTAN NITOT

COMPUTATIONAL
SOCIAL SCIENCE IS
NOT A
SILVER
BULLET
THE BACKGROUND IMAGE “9MM BULLET BW” BY AN NGUYEN

Computational Social Science
offers revolutionary opportunities
for the social sciences, but it still
faces challenges in methods,
interdisciplinary cooperation and
research ethics.

1. Solving increasingly complex problems: The problems of a globalized
world are complex; computational methods might help solve
these complex issues
2. The rise of data: The amount of data has exploded during the 21st
century
3. IT and Instrumental revolution: all the new tools and possibilities
4. Complex systems: modeling our dynamic organisations and societies
5. Social networks: modeling human behavior as networks
6. Making predictions and simulations: predicting future from the past
7. Interdisciplinary field: (social sciences, math, computer science…)
8. Many problems and challenges, especially regarding research
ethics
CSS COMPONENTS

• The information processing paradigm has two aspects in relation
to CSS:
1. Information processing is substantive to the complex
systems of society that CSS researches: This means that
information processing takes part in the formation and
evolution of complex systems.
2. Information processing is methodological in the sense
that it serves as the core instrument of CSS
COMPUTATIONAL
PARADIGM OF SOCIETY

1. BIG DATA & AUTOMATED INFORMATION EXTRACTION
2. SOCIAL NETWORK ANALYSIS
3. COMPLEX SYSTEMS & MODELING
4. SIMULATION
THE MAIN AREAS OF CSS

• Areas of Computational Social Science
1. (Big) Data & automated data extraction
• Generate, retrieve, sort, modify, transform, … data
2. Social Networks
• Network analysis and social networks
3. Social Complexity
• Social complexity, complex adaptive systems, complex
systems modeling
4. Simulation
FOUR MAIN AREAS OF CSS

• Data and automated information extraction can be seen as the foundation
for the other areas of CSS
• Raw data can be used as:
1. Data for its own sake: as research data -> data is the subject of
research
2. Data for modeling or validating other phenomena via e.g. network
analysis, complex systems analysis or simulation
• Data is generated, retrieved, modified, transformed, … for research
purposes via computational automation
BIG DATA & AUTOMATED
INFORMATION EXTRACTION
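
As a minimal sketch of what "retrieved, modified, transformed" can mean in practice, the Python snippet below pulls hashtags out of a few short texts and counts them. The example posts and the helper names (extract_hashtags, count_hashtags) are hypothetical and made up for illustration; real research data would come from an actual source and need far more careful cleaning.

```python
import re
from collections import Counter

# Hypothetical example posts, invented for this sketch.
posts = [
    "Voting today! #election #Helsinki",
    "Long queue at the polling station #election",
    "Sunny day in #helsinki",
]

# A hashtag is taken to be '#' followed by word characters (an assumption).
HASHTAG = re.compile(r"#(\w+)")


def extract_hashtags(text):
    """Retrieve the hashtags in a text and lowercase them (transform step)."""
    return [tag.lower() for tag in HASHTAG.findall(text)]


def count_hashtags(texts):
    """Count hashtag occurrences over a collection of texts."""
    counts = Counter()
    for text in texts:
        counts.update(extract_hashtags(text))
    return counts


counts = count_hashtags(posts)
print(counts.most_common())
```

The resulting counts could then serve either as research data in their own right or as input to a network model or simulation.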

• A long tradition in network analysis (a much older field than CSS)
• Social Networks (Facebook, Twitter, etc.) are just one part of network
analysis
• Many other social interactions can be modeled as networks -> thus
social networks are not technology dependent as such
• -> e.g. modeling a family as a network
• -> e.g. modeling a project as a network
SOCIAL NETWORKS
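
The idea of modeling a family as a network can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the family members, the ties between them and the function names are all hypothetical, and ties are treated as undirected and unweighted.

```python
from collections import deque

# Hypothetical family: each pair is one tie (e.g. spouse or parent-child).
ties = [
    ("Anna", "Ben"),
    ("Anna", "Clara"),
    ("Ben", "Clara"),
    ("Clara", "David"),
    ("Clara", "Eva"),
    ("David", "Eva"),
]


def build_network(edges):
    """Build an undirected network as a dict: node -> set of neighbours."""
    network = {}
    for a, b in edges:
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network


def degrees(network):
    """Number of ties per person (degree centrality, unnormalised)."""
    return {node: len(neighbours) for node, neighbours in network.items()}


def distance(network, start, goal):
    """Shortest path length in ties (breadth-first search); None if unreachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in network[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


family = build_network(ties)
print(degrees(family))
print(distance(family, "Anna", "Eva"))
```

The same structure works for a project organisation: replace the people and ties, and the degree and distance measures carry over unchanged.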

• Society seen as a complex adaptive system:
• Phase transitions
• Adaptation (multi stage process)
• Need -> intent -> capacity -> implementation
• Goal
• Information processing takes place in many parts of complex adaptive systems
• To help adaptation, allocating resources, coordination, …
• Family as a complex adaptive system:
• Development, hardships, births, deaths, successes, failures
• Adaptation over decades
SOCIAL COMPLEXITY

• Three types of systems
1. Natural systems
2. Human systems
3. Artificial systems
• Artificial systems (or artifacts) exist because they have a function: they
serve as adaptive buffers between humans and nature
• Humans pursue the strategy of building artifacts to achieve goals
• Two kinds of artificial systems working in synergy
• Tangible (e.g. roads, buildings)
• Intangible (e.g. organisations, social structures)
SIMON’S THEORY OF ARTIFACTS
AND SOCIAL COMPLEXITY

• Large (and old) research field
• Two main areas of simulation
1. Variable-Oriented Models
• System Dynamics Models (e.g. modeling a nuclear plant)
• Queuing Models (e.g. modeling how a box office line behaves)
2. Object-Oriented Models
• Cellular automata (e.g. Game of Life: http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life,
http://pmav.eu/stuff/javascript-game-of-life-v3.1.1/)
• Agent-based models (e.g. modeling the communication of a project
organisation of many individuals)
• Also, Evolutionary Models
SIMULATION
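
To make the cellular automaton example concrete, here is a minimal Python sketch of one generation of Conway's Game of Life on an unbounded grid. It uses the standard rules (a live cell survives with 2 or 3 live neighbours, a dead cell becomes alive with exactly 3); representing the grid as a set of live (x, y) cells and the function name step are choices made for this sketch.

```python
from collections import Counter


def step(live):
    """Advance one generation. `live` is a set of (x, y) live cells."""
    # Count, for every cell adjacent to a live cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth with exactly 3 neighbours; survival with 2 or 3.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


# The "blinker" pattern: three cells in a row that flip between
# horizontal and vertical every generation.
blinker = {(0, 1), (1, 1), (2, 1)}
print(sorted(step(blinker)))
```

Each cell follows the same simple local rule, yet patterns such as oscillators and gliders emerge at the level of the whole grid, which is the point of object-oriented simulation models.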

• 4 main areas of Computational Social Science
1. Big data and automatic information extraction
2. Social networks
3. Social complexity
4. Simulation
• Typically all of these work together
• CSS has a lot of problems, especially concerning privacy and ethics
• CSS is not a silver bullet and it does not replace other social science
fields or methods: Instead, CSS complements other research fields and
methods
SUMMARY

• Tracking and predicting how flu or other contagious diseases spread
• Based on network and social media analysis and modeling
• Many different variations, one of the first: Google Flu Trends, based on
flu related search queries
• For example:
• Achrekar, H.; Gandhe, A.; Lazarus, R.; Yu, S.-H.; Liu, B. 2011. Predicting Flu
Trends using Twitter data. 2011 IEEE Conference on Computer Communications
Workshops (INFOCOM WKSHPS), pp. 702-707, 10-15 April 2011.
MODELING THE SPREAD
OF DISEASES: ALREADY AN EPIDEMIOLOGY CLASSIC

• http://www.google.org/flutrends/intl/en_us/
GOOGLE FLU TRENDS

• Leskovec, J.; Backstrom, L.; Kleinberg, J. 2009. Meme-tracking and the dynamics of
the news cycle. Proceedings of the 15th ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining, pp. 497-506.
• Tracking new topics, ideas, and "memes" across the Web has been an issue of considerable interest.
Recent work has developed methods for tracking topic shifts over long time scales, as well as abrupt
spikes in the appearance of particular named entities. However, these approaches are less well suited to
the identification of content that spreads widely and then fades over time scales on the order of days -
the time scale at which we perceive news and events.
• We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line
text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad
class of memes that exhibit wide spread and rich variation on a daily basis.
MODELING NEWS CYCLE
DYNAMICS
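
The idea of clustering textual variants of a phrase can be illustrated with a deliberately simple Python sketch. This is not the algorithm of Leskovec et al.; it is a stand-in that greedily groups phrases by word overlap (Jaccard similarity). The example phrases, the threshold of 0.5 and the function names are assumptions made for illustration.

```python
def words(phrase):
    """Lowercased set of whitespace-separated words in a phrase."""
    return set(phrase.lower().split())


def jaccard(a, b):
    """Word-overlap similarity between two phrases, from 0.0 to 1.0."""
    wa, wb = words(a), words(b)
    union = wa | wb
    if not union:
        return 0.0
    return len(wa & wb) / len(union)


def group_variants(phrases, threshold=0.5):
    """Greedy grouping: add a phrase to the first group whose first
    member is similar enough, otherwise start a new group."""
    groups = []
    for phrase in phrases:
        for group in groups:
            if jaccard(phrase, group[0]) >= threshold:
                group.append(phrase)
                break
        else:
            groups.append([phrase])
    return groups


# Hypothetical phrases, invented for this sketch.
phrases = [
    "yes we can",
    "yes we can do it",
    "read my lips",
    "read my lips no new taxes",
]
for group in group_variants(phrases):
    print(group)
```

A greedy pass like this depends on the order of the input and would not scale to web-sized text collections; it only shows what "grouping variants of the same phrase" means.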

• Athanasiadis, I. N.; Mentes, A. K.; Mitkas, P. A.; Mylopoulos, Y. A. 2005. A Hybrid
Agent-Based Model for Estimating Residential Water Demand. SIMULATION, March 2005;
vol. 81: pp. 175-187, doi:10.1177/0037549705053172
• Picardi, C. and Saeed, K. 1979. The dynamics of water policy in southwestern Saudi
Arabia. SIMULATION, October 1979; vol. 33, 4: pp. 109-118.
SUSTAINABLE WATER
DEMAND MANAGEMENT
MODELING

• Venturini, T.; Laffite, N. B.; Cointet, J-P.; Gray, I.; Zabban, V.; De Pryck, K. 2014. Three
maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data
& Society, July-December 2014, 1: 2053951714543804, first published on August 5, 2014,
doi:10.1177/2053951714543804
CLIMATE DIPLOMACY
MAPPING

• Can electoral popularity be predicted using socially generated big
data? Information Technology. Volume 56, Issue 5, Pages 246–253,
ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046,
September 2014
• Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections
of these socially generated footprints, often known as big data, could help us to re-investigate different
aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one
such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through
the analysis of socially generated data on the web during electoral campaigns. Such data offer
considerable possibility for improving our awareness of popularity dynamics. However they also suffer
from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss
potential ways around such problems, suggesting the nature of different political systems and contexts
might lend differing levels of predictive power to certain types of data source. We offer an initial
exploratory test of these ideas, focussing on two data streams, Wikipedia page views and Google
search queries. On the basis of this data, we present popularity dynamics from real case examples of
recent elections in three different countries.
PREDICTING ELECTIONS?

• DIGIVAALIT 2015
• http://www.hiit.fi/digivaalit-2015
• Researching the parliamentary elections 2015 in Finland, focusing on
digital media data (Twitter, Facebook)
• Trying to understand how media is used and how public agenda is set
• CITIZEN MINDSCAPES
• http://challenge.helsinki.fi/blog/citizen-mindscapes-kansakunnan-mielentila
• Diving deep into the unscoped virtual territories of a nation’s collective consciousness may reveal something remarkable. The
Finnish, hugely popular Suomi24 discussion forum has 1.9 million monthly visitors, who use the online town square to talk about
anything and everything close to their hearts. If this data could be harnessed into research use, what amazing things could we learn
about Finnish society? A team of media professionals at the forums owner company Aller and researchers at the National Consumer
Research Center plan to make use of this immense database.
DIGIVAALIT 2015 & CITIZEN
MINDSCAPES

• Listen to “The Trust Engineers” podcast by Radiolab
• http://www.radiolab.org/story/trust-engineers/
• Think about and discuss different ethical research issues in relation to
what you heard
ETHICS

• Lazer, D. et al. 2009. Computational Social Science. Science. 6 February 2009: Vol. 323, no. 5915, pp.
721-723.
• Conte, R. et al. 2012. Manifesto of Computational Social Science. The European Physical Journal Special
Topics. November 2012: Vol. 214, Issue 1, pp. 325-346.
• Anderson, C. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired.
http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory
• Einav, L. and Levin, J. 2014. The Data Revolution and Economic Analysis. In Innovation Policy and the
Economy edited by Josh Lerner and Scott Stern. http://web.stanford.edu/~leinav/pubs/IPE2014.pdf
• King, G. 2011. Ensuring the Data-Rich Future of the Social Sciences. Science. 11 February 2011: Vol.
331 no. 6018 pp. 719-721.
• Wallach, H. 2014. Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and
Transparency. Medium.com.
https://medium.com/@hannawallach/big-data-machine-learning-and-thesocial-sciences-927a8e20460d
LECTURE 1 READING

Thank You!
Questions and comments?
twitter: @laurieloranta
