Approaches to big data research in the social and political fields:
1. Data governance and privacy
2. Media analysis
3. Social network analysis
4. Complex system analysis
Big data and social and political research
1. Big Data for Social and Political Research
widyawan@ugm.ac.id
2.
• Lecturer, DTETI UGM
• Director, DSSDI UGM
• Commissioner, PT Gamatechno and Datains
• Chair, Big Data Working Group, Forum Masyarakat Statistik
• Bachelor's degree (S1), Electrical Engineering, UGM
• Master's degree (S2), Erasmus University, Netherlands
• PhD (S3), Electronics, Cork Institute of Technology, Ireland
3. Agenda
• New Oil
• Data Governance
• Text Analysis
• Social Network Analysis
“Processed data is information, processed information is knowledge,
processed knowledge is wisdom.”
Ankala V. Subbarao
9. Why Data Silos
• Structural
• Governments and organizations are increasingly divided into units and teams
• Units are often left to implement their own processes and software
• This creates data separation
• E.g.: BPS, Dukcapil, MDP, health data
• Social
• No incentives for sharing data
• People withhold data to assert control and power
• Technological
• Legacy systems
• Vendor lock-in
10. How
• End the silo mentality
• Clear regulation on data transparency
• Encourage open data
11. “Personal data is a new currency of the digital world”
Meglena Kuneva, European Consumer Commissioner
12. People’s Privacy is the Loser
• Protection of personal data (privacy) is weak
• Users have no control over how their personal data is used, shared, commercialized, and disseminated
• Data sovereignty (related to the physical location of data) is often a point of contention between states and corporations
[Diagram: State, People, and Corporations, with links labeled personal data, surveillance, monetize, and sovereignty]
13. Data Privacy Protection Map
Endemic surveillance societies
Extensive surveillance societies
Systemic failure to uphold safeguards
Some safeguards but weakened protections
Adequate safeguards against abuse
https://www.privacyinternational.org/
17. “In God we trust, all others must bring data.”
W. Edwards Deming
19. Notes about data sources
Data Source        | Availability | Veracity
Sensor / IoT       | Closed       | Medium - High
Software database  | Closed       | High
Social media       | Open         | Low
Online news        | Open         | Medium
• The public Twitter API provides access to only 1% of its data
• Facebook and Instagram only provide access to public groups or pages; access has become more difficult since the Cambridge Analytica case
• WhatsApp does not provide access to its data, at least not legally
• Online news can be obtained from RSS feeds or by scraping
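As a rough illustration of the RSS route, the sketch below parses an RSS 2.0 document with Python's standard library. The feed content, the example.com URLs, and the function name `parse_rss_items` are made up for this example; a real pipeline would first download the feed from a news site and would need to respect that site's terms of use.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document (assumption: stands in for a downloaded feed).
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>First headline</title>
      <link>https://example.com/news/1</link>
      <pubDate>Wed, 03 Oct 2018 08:00:00 +0700</pubDate>
    </item>
    <item>
      <title>Second headline</title>
      <link>https://example.com/news/2</link>
      <pubDate>Thu, 04 Oct 2018 09:30:00 +0700</pubDate>
    </item>
  </channel>
</rss>"""


def parse_rss_items(rss_text):
    """Return one dict (title, link, published) per <item> element."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
            "published": item.findtext("pubDate", default="").strip(),
        })
    return items


articles = parse_rss_items(SAMPLE_RSS)
for article in articles:
    print(article["published"], "|", article["title"])
```

The resulting list of dictionaries can then be filtered by keyword or date before any further analysis.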
20. Mathematical Modelling in Social Data
• Media Analysis
• Social Network Analysis
• Complexity Analysis
• Social Simulations
C. Cioffi-Revilla, Introduction to Computational Social Science: Principles and Applications. London, U.K.: Springer, 2013.
21. Media Analysis
• Comprises information extraction and classification
Information extraction:
• An unobtrusive method of parsing and coding documents to extract information from data
• Mainly text, but increasingly all types of media
• Wisdom of the crowd → the vast majority of the extant literature is on Twitter datasets, with only 5% of the papers analyzing Facebook
R. Vatrapu et al., "Social Set Analysis: A Set Theoretical Approach to Big Data Analytics," IEEE Access, 2016.
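To make "parsing and coding documents" concrete, here is a minimal sketch that extracts mentions and hashtags from short posts with regular expressions and counts them. The two sample posts are invented for illustration, and the patterns are a simplification; real tweet text needs more careful handling (URLs, punctuation, non-Latin scripts).

```python
import re
from collections import Counter

# Simplified patterns (assumption): a handle or tag is a run of word characters.
MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_entities(text):
    """Code one post into lowercase mentions and hashtags."""
    return {
        "mentions": [m.lower() for m in MENTION_RE.findall(text)],
        "hashtags": [h.lower() for h in HASHTAG_RE.findall(text)],
    }


# Hypothetical posts, not taken from any real dataset.
sample_posts = [
    "Traffic is heavy in #Jogja this morning, thanks @JogjaUpdate",
    "Happy anniversary #jogja! cc @YogyakartaCity @JogjaUpdate",
]

mention_counts = Counter()
hashtag_counts = Counter()
for post in sample_posts:
    entities = extract_entities(post)
    mention_counts.update(entities["mentions"])
    hashtag_counts.update(entities["hashtags"])

print(mention_counts.most_common())
print(hashtag_counts.most_common())
```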
22. Wisdom of the crowd: considerations
• Information spreads/diffuses across various media. Online news, Facebook, or Twitter content reflects the content and resonance of other media.
• The difference → online news goes through a curation process by an editor. Therefore (ideally) the content is factual (not gossip/speculation), covers both sides, is impartial, and is neutral, without personal opinions.
• Big Data, Big Noise → on social media such as Twitter and Facebook, by contrast, a status/tweet is a personal expression. It contains opinions, and there is no mechanism for curating/checking, so it is more prone to hoaxes.
26. Classification
• The action or process of classifying something according to shared qualities or characteristics.
• A computational linguistic approach using various mathematical methods
• Regression
• Logical/rule-based
• Geometric models
• Probabilistic models
• Neural networks → evolved into deep learning
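As one possible instance of the probabilistic family listed above, the sketch below implements a small multinomial Naive Bayes text classifier with add-one smoothing. The choice of Naive Bayes, the training sentences, and the labels are assumptions made for illustration; the slides do not say which classifier was actually used.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (deliberately simplistic)."""
    return text.lower().split()


def train_naive_bayes(labeled_docs):
    """Count documents per class and word occurrences per class."""
    class_doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_docs:
        class_doc_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_doc_counts, word_counts, vocabulary


def classify(text, model):
    """Return the class with the highest log posterior score."""
    class_doc_counts, word_counts, vocabulary = model
    total_docs = sum(class_doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_doc_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen in training
            smoothed = word_counts[label][word] + 1  # add-one smoothing
            score += math.log(smoothed / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data with two made-up topic labels.
training_docs = [
    ("the match ended with a late goal", "sports"),
    ("the team won the football match", "sports"),
    ("earthquake warning issued by the agency", "weather"),
    ("heavy rain and storm warning today", "weather"),
]
model = train_naive_bayes(training_docs)
print(classify("football match tonight", model))
print(classify("storm warning", model))
```

With only four training sentences this is a toy; the same structure applies to larger labeled corpora, where a library implementation would normally be preferred.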
29. Metcalfe’s Law & Network Economics
● The value or power of a network grows quadratically (roughly as n²) with the number of network members n
● As network members increase, more people want to use it
network value = n(n-1)/2
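The formula n(n-1)/2 counts the distinct pairs that can be formed among n members. A short sketch, with the function name `network_value` chosen for this example, shows how quickly that number grows:

```python
def network_value(n):
    """Number of distinct member pairs in a network of n members."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


for members in (2, 10, 100, 1000):
    print(members, network_value(members))
```

Doubling the membership from 100 to 200 raises the pair count from 4,950 to 19,900, roughly a fourfold increase, which is the n² behavior of the formula.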
30. Social Network Analysis
• The study of social structures using network and graph theory
• Explores basic relations in dyadic structures
• A node can be an actor/person, and the edges are relationships between nodes (e.g. retweet or mention)
Example edge: a retweet between @gusmusgusmu and @me
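The node-and-edge view can be sketched in a few lines: build an undirected graph from (source, target) interactions and compute each node's degree centrality, i.e. its number of neighbors divided by the maximum possible (n - 1). The account names and interactions below are hypothetical, and treating retweets as undirected ties is a simplifying assumption.

```python
from collections import defaultdict

# Hypothetical interactions: source retweeted or mentioned target.
interactions = [
    ("alice", "news_account"),
    ("bob", "news_account"),
    ("carol", "news_account"),
    ("carol", "bob"),
    ("dave", "alice"),
]


def build_graph(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    neighbors = defaultdict(set)
    for source, target in edges:
        if source == target:
            continue
        neighbors[source].add(target)
        neighbors[target].add(source)
    return dict(neighbors)


def degree_centrality(graph):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


graph = build_graph(interactions)
centrality = degree_centrality(graph)
most_central = max(centrality, key=centrality.get)
print(most_central, centrality[most_central])
```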
34. Statistics from tweets and online news
period: 3-8 October 2018
Twitter keywords: jogja, yogya, jogjakarta, DIY
hits: 30,435 tweets
online news keywords: jogja, yogya, jogjakarta, DIY
hits: 356 news articles
@YogyakartaCity
@JogjaUpdate
topic: football
topic: BMKG
topic: birthday/anniversary
SNA
layout: Yifan Hu algorithm
note: colors in the SNA graph indicate topic clusters
37. Notes about SNA
• Polarization is clearly visible on divisive political issues
• Some nodes have a higher degree centrality → influencers
• Some nodes need to play the role of boundary spanner.
• A boundary spanner is a node that connects/bridges two different communities that would otherwise not communicate with each other (MAC case: republikaonline, vivacoid, maklambeturah, MbahUyok).
• Otherwise, an echo chamber effect emerges
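One simple proxy for finding boundary spanner candidates is to look for nodes whose removal splits the network into more connected components (cut vertices). The sketch below does this by brute force on a hypothetical graph with two tight communities joined through a single account. This is only one possible operationalization, assumed here for illustration; betweenness centrality is another common measure, and the slides do not state which method was used.

```python
from collections import deque

# Hypothetical undirected graph: communities a* and b* joined via "bridge".
graph = {
    "a1": {"a2", "a3"},
    "a2": {"a1", "a3"},
    "a3": {"a1", "a2", "bridge"},
    "bridge": {"a3", "b1"},
    "b1": {"bridge", "b2", "b3"},
    "b2": {"b1", "b3"},
    "b3": {"b1", "b2"},
}


def count_components(graph, removed=None):
    """Count connected components, optionally pretending one node is gone."""
    seen = set()
    if removed is not None:
        seen.add(removed)
    components = 0
    for start in graph:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:  # breadth-first search from this start node
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
    return components


def boundary_spanner_candidates(graph):
    """Nodes whose removal increases the number of components."""
    baseline = count_components(graph)
    return sorted(
        node for node in graph
        if count_components(graph, removed=node) > baseline
    )


candidates = boundary_spanner_candidates(graph)
print(candidates)
```

In this toy graph every node on the single path between the two communities is flagged, not only the middle one, so the result should be read as a shortlist to inspect rather than a final answer.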
42. Takeaways
• To get value from data, data governance is needed
• Privacy needs more protection from commercial interests and state surveillance
• Media analysis comprises information extraction and classification
• SNA is the study of social structures using network and graph theory