2013 05 20 field_directors
1. Computational Social Science:
The Pros and Cons of ‘Big Data’
Cliff Lampe
- School of Information, University of Michigan
May 20, 2013
Monday, May 20, 13
2. Cliff Lampe
School of Information
- associate professor
Social media
“Socio-technical systems”
Primarily a social scientist
3. Samples of my research in this area
Effects of participation on Facebook
Information cascades on Twitter
User collaboration on Wikipedia
Discussion patterns in large-scale news sites
Coordination in massive online games
Information seeking on search engines vs. social media
4. Interactions in social media leave communication
traces we can mine to understand social processes.
These compete with insights from surveys.
5. Defining “Big Data”
“Big Data” is a rough categorization, a marketing
term, and a paradigm shift.
6. Why “big data” has become
a big deal...
More devices collecting data
More data born digital
Easier/cheaper to store
Better processors
New skills / techniques
Insights have proven effective
9. Big Data is increasingly being
applied to social science questions
10. What counts as “big”?
LHC: 0.001% of sensor data leads
to 25 petabytes annually.
Wikipedia: 17 terabytes
Twitter: ~10 GB/day
How many observations
needed to count as “big”?
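To put the three figures above on one scale, here is a rough sketch that converts them to terabytes. It assumes decimal (SI) units and a 365-day year, neither of which the slide specifies, and the names `volumes` and `in_terabytes` are made up for illustration. Note that the Wikipedia figure is a total size while the other two are rates.

```python
# Rough comparison of the data volumes quoted on the slide.
# Assumption: decimal (SI) units; the slide does not state a convention.
GB = 10**9
TB = 10**12
PB = 10**15

# Bytes per source. LHC and Twitter are annualized rates;
# Wikipedia is a total size, so it is not strictly comparable.
volumes = {
    "LHC (per year)": 25 * PB,
    "Wikipedia (total)": 17 * TB,
    "Twitter (per year, from ~10 GB/day)": 10 * GB * 365,
}


def in_terabytes(n_bytes):
    """Convert a byte count to terabytes (decimal)."""
    return n_bytes / TB


for name, n_bytes in volumes.items():
    print(f"{name}: {in_terabytes(n_bytes):,.2f} TB")
```

Under these assumptions the three sources span roughly four orders of magnitude, which is one reason a single threshold for "big" is hard to pin down.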
11. ‘Big Data’ requires multiple,
interlinked skills and tools.
16. Common characteristics
User generated content
Direct user-to-user interaction
Bundles of applications
More than Facebook and Twitter
30. Social media scale
Nearly 1 million people join Facebook every week
People spend on average 16 hours a month on
Facebook
There are about 250 million Tweets per day
People upload 3000 pictures to Flickr every minute
Wikipedia has 17 million articles by 91,000 editors
YouTube has 490 million unique visitors per month
Google+ reached 10 million users in 16 days
31. The social media landscape
is constantly changing.
48. Issues with social media
trace data
Access
Representing results
Representativeness
Validity
Cross-channel difficulty
Appropriate skill sets
Ethics
49. Access
Data often owned by
private corporations.
Need special skills to
access.
52. Validity
Social media users are
performing (though they
don’t know scientists
are observing them).
Different sites have
different purposes.
53. Cross-channel use
How do you track one
user over multiple
social media sites?
59. Humble Suggestions
More interdisciplinary work.
Propose and fund work to test these issues.
Don’t pretend it isn’t coming OR that it is a panacea.
We’re just at the beginning of the journey.
60. Social media and surveys
project
Can social media data ever replace and/or supplement
social measurement, especially for official statistics,
based on self-reported answers to questions asked of
a representative sample?
Fred Conrad, Michael Schober, Josh Pasek