Tcat
1. SMAC LAB, LSU
Sep 28, 2018
SMAC Talks
TCAT
Instructor: Dr. Ke (Jenny) Jiang
2. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet Statistics and Activity Metrics 1: Tweet Stats (.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
# of tweets, tweets with links, tweets with hashtags, tweets with
mentions, retweets, replies
Get a feel for the overall characteristics of your data set
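The interval options above amount to binning tweets by their creation time. A minimal sketch, assuming tweet timestamps are already parsed into datetime objects; the function name and sample times are hypothetical and not part of TCAT:

```python
from collections import Counter
from datetime import datetime


def tweets_per_interval(timestamps, fmt="%Y-%m-%d %H"):
    """Count tweets per time bin; fmt sets the bin size (default: per hour)."""
    return Counter(ts.strftime(fmt) for ts in timestamps)


# Hypothetical creation times, for illustration only.
sample = [
    datetime(2018, 9, 28, 10, 5),
    datetime(2018, 9, 28, 10, 40),
    datetime(2018, 9, 28, 11, 15),
]
per_hour = tweets_per_interval(sample)
per_day = tweets_per_interval(sample, fmt="%Y-%m-%d")
```

Changing the format string switches between per-hour, per-day, or per-month bins.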
3. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet Statistics and Activity Metrics 2: User Stats Overall(.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Contains the min, max, average, Q1, median, Q3, and trimmed mean for:
number of tweets per user, urls per user, number of followers, number of
friends, number of tweets
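To make the listed summary measures concrete, here is a rough sketch using Python's standard library. The 10% trim fraction and the quartile method are assumptions; the slides do not say how TCAT computes its trimmed mean or quartiles.

```python
import statistics


def summarize(values, trim=0.1):
    """Return min, max, average, quartiles, and a trimmed mean for a list."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    # Assumed trimming rule: drop the lowest and highest `trim` share of values.
    k = int(len(data) * trim)
    trimmed = data[k:len(data) - k] if k else data
    return {
        "min": data[0],
        "max": data[-1],
        "avg": statistics.mean(data),
        "Q1": q1,
        "median": median,
        "Q3": q3,
        "trimmed_mean": statistics.mean(trimmed),
    }


# Hypothetical tweets-per-user counts with one very active account.
stats = summarize([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

In this sample the average is pulled up by the single outlier, while the median and trimmed mean are not, which is why the export reports all of them.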
5. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet Statistics and Activity Metrics 3: User Stats Individual(.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Lists users and their number of tweets, number of followers, number of
friends, how many times they are listed, their UTC time offset, whether
the user has a verified account and how many times they appear in the
data set.
8. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet Statistics and Activity Metrics 4: Hashtag frequency(.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
find out which hashtags are most often associated with your subject.
9. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet Statistics and Activity Metrics 5: Hashtag-user activity(.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Lists hashtags, the number of tweets with that hashtag, the number of
distinct users tweeting with that hashtag, the number of distinct
mentions tweeted together with the hashtag, and the total number of
mentions tweeted together with the hashtag.
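The four columns described above can be sketched as follows. The tweet dictionaries and their keys ("user", "hashtags", "mentions") are a hypothetical structure for illustration, not TCAT's actual schema.

```python
from collections import defaultdict


def hashtag_user_activity(tweets):
    """Per hashtag: tweet count, distinct users, distinct and total mentions."""
    rows = defaultdict(lambda: {"tweets": 0, "users": set(), "mentions": []})
    for t in tweets:
        for tag in set(t["hashtags"]):  # count each hashtag once per tweet
            row = rows[tag]
            row["tweets"] += 1
            row["users"].add(t["user"])
            row["mentions"].extend(t["mentions"])
    return {
        tag: {
            "tweets": r["tweets"],
            "distinct_users": len(r["users"]),
            "distinct_mentions": len(set(r["mentions"])),
            "total_mentions": len(r["mentions"]),
        }
        for tag, r in rows.items()
    }


# Hypothetical sample tweets.
sample_tweets = [
    {"user": "a", "hashtags": ["lsu"], "mentions": ["b", "c"]},
    {"user": "a", "hashtags": ["lsu", "tcat"], "mentions": ["b"]},
    {"user": "d", "hashtags": ["lsu"], "mentions": []},
]
activity = hashtag_user_activity(sample_tweets)
```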
10. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet Statistics and Activity Metrics 6: Twitter client (source) frequency(.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Lists the frequency of tweet software sources per interval.
11. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet Statistics and Activity Metrics 7:
Twitter client (source) stats (individual)(.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Lists sources and their number of tweets, retweets, hashtags, URLs and mentions
12. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet Statistics and Activity Metrics 8:
User visibility (mention frequency)(.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Lists usernames and the number of times they were mentioned by others.
find out which users are "influentials"
13. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet Statistics and Activity Metrics 9:
User activity (tweet frequency)(.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Lists usernames and the number of tweets posted.
find the most active tweeters
see if the dataset is dominated by certain twitterati.
14. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet Statistics and Activity Metrics 10:
User activity + visibility (tweet+mention frequency)(.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Lists usernames with both their tweet and mention counts.
see whether the users mentioned are also those who tweet a lot
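Comparing activity with visibility comes down to two counts per username. A minimal sketch, again assuming a hypothetical list of tweet dictionaries with "user" and "mentions" keys:

```python
from collections import Counter


def activity_and_visibility(tweets):
    """Map each username to (tweets posted, times mentioned by others)."""
    tweet_counts = Counter(t["user"] for t in tweets)
    mention_counts = Counter(m for t in tweets for m in t["mentions"])
    users = set(tweet_counts) | set(mention_counts)
    # Counter returns 0 for users who never tweeted or were never mentioned.
    return {u: (tweet_counts[u], mention_counts[u]) for u in users}


# Hypothetical sample tweets.
sample_tweets = [
    {"user": "a", "mentions": ["b"]},
    {"user": "a", "mentions": ["b", "c"]},
    {"user": "c", "mentions": []},
]
profile = activity_and_visibility(sample_tweets)
```

Here user "b" is visible (mentioned twice) but not active, and user "a" is active but never mentioned.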
15. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet Statistics and Activity Metrics 11:
Url frequency (.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Contains the frequencies of tweeted URLs.
find out which contents (articles, videos, etc.) are referenced most often
16. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet Statistics and Activity Metrics 12:
Host name frequency (.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Contains the frequencies of tweeted domain names.
find out which sources (media, platforms, etc.) are referenced most often
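Host name frequency can be derived from a URL list by keeping only the host part of each URL. A small sketch with the standard library; the sample URLs are made up:

```python
from collections import Counter
from urllib.parse import urlparse


def host_frequency(urls):
    """Count how often each host name appears in a list of URLs."""
    hosts = (urlparse(u).hostname for u in urls)
    return Counter(h for h in hosts if h)  # skip URLs without a host


# Hypothetical tweeted URLs.
sample_urls = [
    "https://www.example.com/article/1",
    "https://www.example.com/article/2",
    "http://example.org/video",
]
hosts = host_frequency(sample_urls)
```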
17. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet Statistics and Activity Metrics 13:
Identical tweet frequency (.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Contains tweets and the number of times they have been (re)tweeted identically
get a grasp of the most "popular" content
18. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet Statistics and Activity Metrics 14:
Word frequency (.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Contains words and the number of times they have been used
get a grasp of the most used language
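A word frequency table is essentially a token count. The sketch below uses a simple lowercase tokenizer, which is an assumption; the slides do not describe how TCAT splits tweets into words.

```python
import re
from collections import Counter


def word_frequency(texts):
    """Count words across tweet texts using a naive lowercase tokenizer."""
    words = (
        w
        for text in texts
        for w in re.findall(r"[a-z0-9']+", text.lower())
    )
    return Counter(words)


# Hypothetical tweet texts.
freq = word_frequency(["Go Tigers go", "Tigers win"])
```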
19. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet Statistics and Activity Metrics 15:
Media frequency (.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Contains media URLs and the number of times they have been used
get a grasp of the most popular media
20. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet Statistics and Activity Metrics 16:
Export table with potential gaps in your data (.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Exports a spreadsheet with all known data gaps in your current query, during which
TCAT was not running or capturing data for this bin
Gain insight into possible missing data due to outages
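TCAT reports gaps from its own capture records. As a rough cross-check on an exported dataset, long silences between consecutive tweets can be flagged; this is a different heuristic, not TCAT's method, and the one-hour threshold is an arbitrary assumption.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, max_silence=timedelta(hours=1)):
    """Return (start, end) pairs where no tweet arrived for over max_silence."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_silence]


# Hypothetical tweet creation times.
sample_times = [
    datetime(2018, 9, 28, 10, 0),
    datetime(2018, 9, 28, 10, 20),
    datetime(2018, 9, 28, 13, 0),
    datetime(2018, 9, 28, 13, 30),
]
gaps = find_gaps(sample_times)
```

A silence may also simply mean that nobody tweeted about the topic, so flagged periods still need to be checked against the gap export.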
21. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet exports 1:
Random set of tweets from selection (.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Contains 1000 randomly selected tweets and information about them (user, date
created, from_user_name, retweet_count, favorite_count, lang, to_user_name,
in_reply_to_status_id, quoted_status_id, source, location, lat, lng, from_user_id,
from_user_realname, from_user_verified, from_user_description, from_user_url,
from_user_profile_image_url, from_user_timezone, from_user_tweetcount,
from_user_followercount, from_user_friendcount, from_user_favourites_count,
from_user_listed, from_user_created_at)
a random subset of tweets is a representative sample that can be manually
classified and coded much more easily than the full set
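Drawing a random subset for manual coding can be sketched as below. The function name and the seed parameter are hypothetical; a fixed seed is only there so that the same subset can be drawn again.

```python
import random


def sample_tweets(tweets, k=1000, seed=None):
    """Draw up to k tweets uniformly at random, without replacement."""
    rng = random.Random(seed)
    tweets = list(tweets)
    return rng.sample(tweets, min(k, len(tweets)))


# Hypothetical tweet ids standing in for full tweet records.
subset = sample_tweets(range(5000), seed=42)
```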
22. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet exports 2:
All tweets from selection (.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Contains all tweets and information about them (user, date created, ...)
spend time with your data
23. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet exports 3:
List each individual retweet (.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Lists all retweets (and all the tweet metadata, such as follower_count)
chronologically: RT @
This script is slow. Small datasets only!
24. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet exports 4:
Only tweets with lat/lon (.csv)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Contains only geo-located tweets
Geo location is different from the self-reported location
26. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet exports 7:
Export mentions table (tweet id, user from id, user from name, user to
id, user to name, mention, mention type)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Contains tweet ids from your selection, with mentions and the mention type.
Mention network
27. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Tweet exports 8:
Export URLs table (tweet id, url, expanded url, followed url)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Contains tweet ids from your selection and URLs.
28. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Networks 1: All network exports come as .gexf or .gdf files which you can
open in Gephi or similar
Social graph by mentions
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Produces a directed graph based on interactions between users. If a user
mentions another one, a directed link is created. The more often a user
mentions another, the stronger the link ("link weight"). The "count" value
contains the number of tweets for each user in the specified period.
analyze patterns in communication, find "hubs" and "communities",
categorize user accounts.
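The link weights and "count" values described above can be sketched as two counters. The tweet structure is hypothetical, and the sketch builds only the edge list, not the .gexf or .gdf file itself.

```python
from collections import Counter


def mention_graph(tweets):
    """Build weighted directed edges (sender, mentioned) and tweet counts."""
    edges = Counter()   # (from_user, to_user) -> link weight
    counts = Counter()  # user -> number of tweets in the period
    for t in tweets:
        counts[t["user"]] += 1
        for mentioned in t["mentions"]:
            edges[(t["user"], mentioned)] += 1
    return edges, counts


# Hypothetical sample tweets.
sample_tweets = [
    {"user": "a", "mentions": ["b"]},
    {"user": "a", "mentions": ["b", "c"]},
    {"user": "b", "mentions": ["a"]},
]
edges, counts = mention_graph(sample_tweets)
```

The edge ("a", "b") and the edge ("b", "a") are kept separate, which is what makes the graph directed.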
29. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Networks 2: All network exports come as .gexf or .gdf files which you can
open in Gephi or similar
Social graph by in_reply_to_status_id
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Produces a directed graph based on interactions between users. If a tweet
was written in reply to another one, a directed link is created.
analyze patterns in communication, find "hubs" and "communities",
categorize user accounts.
30. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Networks 3: All network exports come as .gexf or .gdf files which you can
open in Gephi or similar
Co-hashtag graph
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Produces an undirected graph based on co-word analysis of hashtags. If
two hashtags appear in the same tweet, they are linked. The more often they
appear together, the stronger the link ("link weight").
explore the relations between hashtags, find and analyze sub-issues,
distinguish between different types of hashtags (event related, etc.).
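The co-hashtag linking rule can be sketched by counting every pair of hashtags that share a tweet. Lowercasing the hashtags is an assumption made here so that "LSU" and "lsu" count as the same node.

```python
from collections import Counter
from itertools import combinations


def co_hashtag_graph(tweets_hashtags):
    """Count undirected hashtag pairs that appear in the same tweet."""
    edges = Counter()
    for tags in tweets_hashtags:
        unique = sorted({t.lower() for t in tags})  # sorted: one key per pair
        edges.update(combinations(unique, 2))
    return edges


# Hypothetical hashtag lists, one per tweet.
co_tags = co_hashtag_graph([["LSU", "tcat"], ["tcat", "lsu", "gephi"], ["gephi"]])
```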
31. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Networks 4: All network exports come as .gexf or .gdf files which you can
open in Gephi or similar
Bipartite hashtag-mention graph
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Produces a bipartite graph based on co-occurrence of hashtags and
@mentions. If an @mention co-occurs in a tweet with a certain hashtag,
there will be a link between that @mention and the hashtag. The more often
they appear together, the stronger the link ("link weight").
explore the relational activity between mentioned users and hashtags,
find and analyze which users are considered experts around which
topics.
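A bipartite graph links nodes of one type only to nodes of the other type. A minimal sketch of the hashtag-mention case, with a hypothetical tweet structure and "#" / "@" prefixes added here to keep the two node types apart:

```python
from collections import Counter
from itertools import product


def hashtag_mention_graph(tweets):
    """Count (hashtag, mention) pairs that co-occur in the same tweet."""
    edges = Counter()
    for t in tweets:
        tags = {"#" + h for h in t["hashtags"]}
        users = {"@" + m for m in t["mentions"]}
        edges.update(product(tags, users))  # every hashtag with every mention
    return edges


# Hypothetical sample tweets.
sample_tweets = [
    {"hashtags": ["lsu"], "mentions": ["b"]},
    {"hashtags": ["lsu", "tcat"], "mentions": ["b", "c"]},
    {"hashtags": ["tcat"], "mentions": []},
]
bipartite = hashtag_mention_graph(sample_tweets)
```

The same pattern applies to the other bipartite exports by swapping in sources, users, URLs, or hosts for one side.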
32. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Networks 5: All network exports come as .gexf or .gdf files which you can
open in Gephi or similar
Bipartite hashtag-source graph
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Produces a bipartite graph based on co-occurrence of hashtags and
"sources" (the client a tweet was sent from is its source). If a hashtag is
tweeted from a particular client, there will be a link between that client and
the hashtag. The more often they appear together, the stronger the link ("link
weight").
explore the relations between clients and hashtags, find and analyze
which clients are related to which topics.
33. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Networks 6: All network exports come as .gexf or .gdf files which you can
open in Gephi or similar
user-source graph
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Produces a bipartite graph based on co-occurrence of users and
"sources" (the client a tweet was sent from is its source). If a user tweets
from a particular client, there will be a link between that client and the user.
The more often they appear together, the stronger the link ("link weight").
explore the relations between clients and users, find and analyze which
users use which clients.
34. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Networks 7: All network exports come as .gexf or .gdf files which you can
open in Gephi or similar
Bipartite domain-source graph
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Produces a bipartite graph based on co-occurrence of (URL-)domains and
"sources" (the client a tweet was sent from is its source). If a domain is
tweeted from a particular client, there will be a link between that client and
the domain. The more often they appear together, the stronger the link ("link
weight").
explore the relations between domains and sources, find and analyze
which domains are related to which sources.
35. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Networks 8: All network exports come as .gexf or .gdf files which you can
open in Gephi or similar
Bipartite URL-user graph
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Produces a bipartite graph based on co-occurrence of URLs and users. If a
user wrote a tweet with a certain URL, there will be a link between that user
and the URL. The more often they appear together, the stronger the link
("link weight").
explore the relations between users and URLs, find and analyze which
users group around which URLs.
36. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Networks 8: All network exports come as .gexf or .gdf files which you can
open in Gephi or similar
Bipartite hashtag-URL graph
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Creates a .csv file that contains URLs and the number of times they have
co-occurred with a particular hashtag.
Creates a .gexf file that contains a bipartite graph (.gexf, open in Gephi)
based on co-occurrence of URLs and hashtags. If a URL co-occurs with a
certain hashtag, there will be a link between that URL and the hashtag. The
more often they appear together, the stronger the link ("link weight").
get a grasp of how URLs are qualified
37. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Networks 9: All network exports come as .gexf or .gdf files which you can
open in Gephi or similar
Bipartite hashtag-host (domain) graph
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Creates a .csv file that contains hosts and the number of times they have
co-occurred with a particular hashtag.
Creates a .gexf file that contains a bipartite graph (.gexf, open in Gephi)
based on co-occurrence of hosts and hashtags. If a host co-occurs with a
certain hashtag, there will be a link between that host and the hashtag. The
more often they appear together, the stronger the link ("link weight").
get a grasp of how hosts are qualified
38. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Experimental 1:
Cascade
(overall, /min, /hour, /day, /week, /month, /year, custom…)
User accounts are distributed vertically; tweets - shown as dots - are spread
out horizontally over time. Lines indicate retweets.
visually explore temporal structures and retweet patterns.
40. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Experimental 2:
The Sankey Maker
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Produces an alluvial diagram. Alluvial diagrams are a type of flow diagram
originally developed to represent changes in network structure over time.
plot the relation between various fields such as from_user_lang,
hashtags or Twitter client
42. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Experimental 3:
Associational profile (hashtags)
(overall, /min, /hour, /day, /week, /month, /year, custom…)
Produces an associational profile as well as a time-encoded co-hashtag
network.
explore shifts in hashtag associations
45. TCAT
Twitter Capture and Analysis Toolkit (DMI-TCAT) - By Keywords
Please visit this TCAT installation at this URL:
http://18.223.107.254/analysis/
TCAT standard login (for analysis only):
Username: tcat
Password: FTHnX73cFuUVp7KyVzGZLxdkLPSEp7KCMc