These are the slides from a workshop I presented on September 23, 2014 to the University of Wisconsin-Madison Digital Humanities Research Network (http://dhresearchnetwork.wordpress.com/). The workshop covered an overview of my research using DiscoverText (https://discovertext.com/), a cloud-based big data analytics tool; the steps for collecting and coding data in DiscoverText; and limitations, challenges, and other resources for social media data collection and analysis.
Collecting and Coding Twitter Data in DiscoverText
1. Collecting and Coding Twitter Data in DiscoverText
Jill Hopke
@jillhopke
Digital Humanities Research Network
UW-Madison
September 23, 2014
2. Workshop Overview
My Research on Global Frackdown
Steps to collect Twitter data in DiscoverText
Coding data in DiscoverText
Limitations and Challenges
Other Tools/Resources
7. Research Questions
RQ1: What Twitter strategies do Global Frackdown activists use to mobilize for the October 19, 2013 day of action?
RQ2: How do Global Frackdown tweeters frame protest against hydraulic fracturing?
8. Project Data
• Dataset of 9,449 tweets for the hashtag #globalfrackdown.
• Data collected from October 13 to October 27, 2013 using DiscoverText.
• Textual analysis of English (n=7,678) and Spanish (n=1,314) tweets.
• Unit of analysis is the individual tweet.
• Also conducted in-depth interviews with transnational activists.
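As a quick check on the figures above, the short sketch below works out what share of the full dataset the English and Spanish subsets represent. The counts come straight from the slide; the helper name `share` is made up for illustration, and the slides do not say what the remaining tweets are (other languages, exclusions, or something else), so they are simply labeled "not analyzed".

```python
# Counts reported on the Project Data slide.
TOTAL_TWEETS = 9449
ENGLISH = 7678
SPANISH = 1314


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Tweets included in the textual analysis (English + Spanish).
analyzed = ENGLISH + SPANISH
# Remainder of the dataset; the slides do not explain what these are.
not_analyzed = TOTAL_TWEETS - analyzed

print(f"English:      {ENGLISH} ({share(ENGLISH, TOTAL_TWEETS)}%)")
print(f"Spanish:      {SPANISH} ({share(SPANISH, TOTAL_TWEETS)}%)")
print(f"Analyzed:     {analyzed} ({share(analyzed, TOTAL_TWEETS)}%)")
print(f"Not analyzed: {not_analyzed} ({share(not_analyzed, TOTAL_TWEETS)}%)")
```

Under these numbers the two language subsets together cover 8,992 of the 9,449 collected tweets.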
31. (Part of) What I Did (Theory-Driven)
• First round, code for language.
• Second round, read sub-section of data and developed set of “working themes.”
• Code for themes. Memo/annotate interesting examples.
• Refine codebook (themes) and continue coding.
• Intercoder reliability (you might want to do this… Depends on your methodological approach).
• I also used the machine-learning functions for a separate chapter to “classify” data for valence and certainty.
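For the intercoder reliability step, one common way to quantify agreement between two coders is Cohen's kappa. The slides do not say which statistic (if any) was used, so the sketch below is only an illustration under that assumption. The function name `cohens_kappa`, the theme labels, and the example codes are all hypothetical and are not drawn from the project data or from DiscoverText.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labeled the same items in the same order."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of items.")
    n = len(coder_a)

    # Observed agreement: fraction of items given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Expected chance agreement, from each coder's own label frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical label; kappa is undefined,
        # so treat it as perfect agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical theme codes for eight tweets (illustrative only).
coder_a = ["mobilize", "mobilize", "frame", "frame",
           "mobilize", "frame", "frame", "mobilize"]
coder_b = ["mobilize", "frame", "frame", "frame",
           "mobilize", "frame", "mobilize", "mobilize"]

kappa = cohens_kappa(coder_a, coder_b)
print(f"Cohen's kappa: {kappa:.2f}")  # prints "Cohen's kappa: 0.50"
```

In this made-up example the coders agree on 6 of 8 tweets (0.75), and chance agreement is 0.50, giving a kappa of 0.50. Whether a given value is acceptable depends on your methodological approach.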
34. Limitations and Challenges
PRO: Doesn’t require programming knowledge. User-friendly interface. Powerful tool.
CON: Software’s advanced machine-learning functions are expensive! DiscoverText is one of the “affordable” platforms. Also, human subjects research/IRB considerations.
= Need for collaborations and grant funding.
36. Other Tools and Resources
• “Social Media Data Collection Tools” (see here): Running list of tools curated by Deen Freelon, Ph.D., freelon@american.edu, http://dfreelon.org, @dfreelon.
• Digital Methods Initiative at University of Amsterdam (see here).
• Digital Methods (2013) by Richard Rogers (see here).
• Join Association of Internet Researchers AIR-L mailing list (see here)!