Hashtag Conversations, Eventgraphs, and User Ego Neighborhoods: Extracting Social Network Data from Twitter


Shalin Hai-Jew
Kansas State University
2014 National Extension Technology Conference

Published in: Technology, Business

Transcript of "Hashtag Conversations, Eventgraphs, and User Ego Neighborhoods: Extracting Social Network Data from Twitter"

  1. Hashtag Conversations, Eventgraphs, and User Ego Neighborhoods: Extracting Social Network Data from Twitter
     Shalin Hai-Jew, Kansas State University
     2014 National Extension Technology Conference, May 2014
  2. Presentation Overview
     • This presentation introduces methods for extracting and analyzing social network data from Twitter for hashtag conversations (and emergent events), event graphs, search networks, and user ego neighborhoods (using NodeXL). There will be direct demonstrations and discussions of how to analyze social network graphs. This information may be extended with human- and / or machine-based sentiment analysis.
  3. Self-Intros
     • Do you use Twitter? If so, how?
     • Who do you follow on Twitter, and why?
     • Have you analyzed your own social networks on Twitter? What’s the company you keep (online)?
     • Have you ever created a hashtag for a formal conference event? Were you able to gain some insights about what your participants were experiencing during the conference?
     • What would you like to learn in this session?
     * My goal for you is to learn capability (what is fairly easily possible), not method… Method is for another day, another time.
  4. Twitter Social Networking and Microblogging Social Media Platform
     • 140-character text-based Tweets
     • Images (Twitpics) and videos (Vine)
     • Accounts as humans, ‘bots (collecting and re-tweeting information, sensor networks), and cyborgs (humans and ‘bots co-Tweeting)
     • Created in 2006 and based out of San Francisco, California
     • 500 million registered users in 2012
     • 340 million Tweets a day as the “SMS of the Internet”
     • Has attracted a range of public, private, and governmental organizations; groups (religious, political, advocacy, and others); individuals
     • Has an application programming interface (API) which enables some limited access to their public data
  5. Electronic Social Network Analysis
     • Extraction of social network data from social media platforms (through their APIs): social networking sites, email systems, wikis, blogs, microblogging sites, web networks, and others
     • Node-link, vertex-edge, entity-relationship
     • A form of structure mining with implications for
       • Organizational analysis
       • Entity (node) analysis
       • Social ties
       • Understandings of social structure and power
       • Diffusion of innovation, information, culture, attitudes, and other transmissible resources
       • Electronic event analysis
  6. Some Basics of E-SNA
  7. Some Basics of E-SNA (cont.)
     • Core-periphery dynamic and influence (and power) / “primary” and “secondary” membership in the network
     • Knowledge and influence
     • Collection of resources
     • Clustering
     • Motif censuses, network structures, network topologies, geodesic distance, connectivity
     • Bridging
     • Network structure, network topology
     • Thick ties / tight coupling in electronic social spaces
     • Thin ties / loose coupling in electronic social spaces
     • Homophily vs. heterophily
     • The company you keep
  8. Some Basics of E-SNA (cont.)
     Global Social Network Structures
     • Betweenness centrality (shortest path betweenness centrality)
     • Closeness centrality (closeness of a node to all other nodes in the network graph)
     • Eigenvector centrality (closeness to important neighbors)
     • Clustering coefficient (the amount of clustering in a network)
     Local Social Network Structures
     • Degree centrality (in-degree and out-degree)
     • Clustering coefficient (embeddedness)
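Two of the measures on this slide, degree (in and out) and closeness, can be sketched on a toy graph. This is a minimal illustration in plain Python, not NodeXL's implementation: the account names, the edge list, and the function names are all hypothetical, and closeness is computed here on the undirected version of the graph as a simplifying assumption.

```python
from collections import deque

# Hypothetical directed edges: (source account, target account).
EDGES = [("ann", "bob"), ("bob", "cat"), ("cat", "ann"),
         ("dan", "ann"), ("ann", "dan")]

def degrees(edges):
    """Return {vertex: (in_degree, out_degree)} for a directed edge list."""
    counts = {}
    for src, dst in edges:
        counts.setdefault(src, [0, 0])
        counts.setdefault(dst, [0, 0])
        counts[src][1] += 1  # one more outgoing edge for the source
        counts[dst][0] += 1  # one more incoming edge for the target
    return {v: tuple(c) for v, c in counts.items()}

def closeness(edges, start):
    """Reachable vertices divided by the sum of shortest-path lengths
    from `start`, ignoring edge direction (an assumption of this sketch)."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    dist = {start: 0}
    queue = deque([start])
    while queue:  # breadth-first search gives shortest hop counts
        v = queue.popleft()
        for w in neighbors.get(v, ()):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0
```

In this toy graph "ann" touches every other account directly, so her closeness is the maximum of 1.0, while "bob" needs two hops to reach "dan" and scores lower.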
  9. Units of Analysis
     • Entity: Node or vertex
     • Relationships: Links, edges
     • Dyads, triads, … motifs (different relational structures)
     • Clusters and sub-clusters (groups or meta-nodes)
     • Islands
     • Pendants (one node, one link); whiskers (one link, multiple nodes)
     • Isolates
     • Ego neighborhoods
     • Social network
     • Multiple social networks
     • “Big data” universes
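Several of these units can be found mechanically from a vertex list and an edge list. The sketch below is a hedged illustration in plain Python with made-up vertex names: it treats isolates as vertices with no links, pendants as vertices with exactly one link, and islands as connected components. The function name `classify` is hypothetical.

```python
def classify(vertices, edges):
    """Return (isolates, pendants, components) for an undirected graph.
    Every edge endpoint must also appear in `vertices`."""
    neighbors = {v: set() for v in vertices}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    isolates = sorted(v for v, n in neighbors.items() if not n)
    pendants = sorted(v for v, n in neighbors.items() if len(n) == 1)
    components, seen = [], set()
    for v in sorted(neighbors):
        if v in seen:
            continue
        stack, comp = [v], set()
        while stack:  # depth-first walk of one connected component
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(neighbors[u] - comp)
        seen |= comp
        components.append(sorted(comp))
    return isolates, pendants, components

# Hypothetical toy network: a triangle with a pendant, a separate dyad, an isolate.
VERTICES = ["a", "b", "c", "d", "e", "f", "g"]
EDGES = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("e", "f")]
isolates, pendants, components = classify(VERTICES, EDGES)
```

Here the triangle a-b-c with its pendant d forms the main component, the dyad e-f is a separate island, and g is an isolate.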
  10. Why Learn about Electronic Social Networks?
     • Understand respective roles in the community
     • Identify informally influential individuals who are otherwise hidden
     • Monitor what messages are moving through the network to understand public sentiment and understandings
     • Plan diffusion of prosocial information and actions; head off negative diffusions in a social network
     • Wire new networks for social and individual resilience (such as regarding health, emotion, economics, and other)
     • Rewire social networks for different objectives and aims; optimize social groups based on what is known about people’s socializing and preferences
  11. E-SNA on Twitter….
     • Hashtag conversations (#)
     • Event graphs (unfolding formal and informal events by hashtags and key words)
     • Search networks
     • Understanding user (account) social networks
     • Ego neighborhoods on Twitter (direct alters)
     • Clusters and sub-clusters; islands; pendants; isolates
     • Motif censuses
     • Egos
  12. Questions so Far?
     • What do you think about (electronic) social network analysis (and structure mining)? Do you think that the assumptions are valid? Why or why not?
     • What do you think about electronic social network analysis?
  13. Hashtag Conversations
     • Narrow-casting (to a distinct small group) and broad-casting (communicating broadly to any who care to follow)
     • Identifying the messages shared
       • Sentiments
       • Semantics
       • Main conversationalists
       • Calls to action
     • Identifying the networks of accounts in connection to each other around this discussion
     • Observing the interactions between accounts (nodes or vertices) around the particular discussion
     • Identifying the “mayor of your hashtag” (using Dr. Marc A. Smith’s phrasing) or the influential discussants and their important (central, widely followed, re-tweeted) messaging
  14. Eventgraphs
     • Mapped networks of interactions based around a physical or virtual or other event (in this case)
     • Formal, informal, or semi-formal
     • Planned or unplanned events
       • Conferences with disambiguated or original hashtags; may include online or augmented reality games to increase participation (planned)
       • Accidents, mass health events, or unusual “spectacle” occurrences (unplanned)
     • Micro (local or distributed) or mass (locationally clustered or distributed)
     • Trending microblogging messaging over time (exponential messaging to peaks or multiple peaks and gradual diminishment or steep drop-off)
     • Multimedial with microblogged text, images, and video; interactive; dynamic
     • Identification of the main geographical locations of the discussants
  15. Search (Social) Networks (Online)
     • Identification of
       • particular topics in discussion (the less ambiguous the term, the better; otherwise, the tools will track a broad range of terms with various word senses)
       • discussants (social media platform accounts)
       • main messaging of the discussants (Tweet or microblogging streams)
       • main physical locations of the discussants (based on noisy geo information)
  16. User Social Networks
     • Node / vertex / entity / agent analysis
     • Link / edge / arc / tie / relationship analysis
     • Identification of the alters in the ego neighborhood
     • Analysis of transitivity among the alters in the ego neighborhood
     • Capture of a 2-degree social network on Twitter
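The ego-neighborhood ideas on this slide can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not how NodeXL crawls Twitter: the edge list and account names are invented, edges are treated as undirected, and `ego_network` and `alter_ties` are hypothetical helper names.

```python
def ego_network(edges, ego, degree=1):
    """Return the set of vertices within `degree` hops of `ego`."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    frontier, reached = {ego}, {ego}
    for _ in range(degree):
        # Step one hop outward, keeping only vertices not already reached.
        frontier = {w for v in frontier for w in neighbors.get(v, ())} - reached
        reached |= frontier
    return reached

def alter_ties(edges, ego):
    """Edges that connect two direct alters of `ego` to each other."""
    alters = ego_network(edges, ego, 1) - {ego}
    return sorted((a, b) for a, b in edges if a in alters and b in alters)

# Hypothetical edge list around an account called "ego".
EDGES = [("ego", "a"), ("ego", "b"), ("ego", "c"),
         ("a", "b"), ("c", "x"), ("x", "y")]
```

With this data the 1-degree neighborhood is the ego plus alters a, b, and c; the 2-degree capture adds x; and the single tie between a and b is the only closed triangle around the ego, which is the raw material for a transitivity measure.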
  17. Motif Censuses
     • Understanding of the global nature of the network
     • The power structures within the network
     • The clusters, sub-clusters, islands, pendants, and isolates
     • The social individuals and entities within the network
     • The transmissibles moving through the network
     • Static (vs. dynamic information captures)
  18. The Data Extraction and Network Visualization Tool: NodeXL
     Network Overview, Discovery and Exploration for Excel
  19. Network Overview, Discovery and Exploration for Excel (NodeXL)
     • NodeXL
     • Free and open-source code
     • Data scraping from social media platforms through their respective APIs (of publicly available information only)
     • Add-on to Excel (formerly known as NetMap)
     • Available on the Microsoft CodePlex platform
     • Requires Windows (or Parallels on Mac)
     • Sponsored by the Social Media Research Foundation
     • NodeXL Graph Gallery for shared graphs and datasets
  20. Types of Data Extractions from Twitter
     NodeXL (relations, structure, select contents)
     • #hashtag
     • Search
     • Twitter “List Network”
     • Twitter User Network
     NCapture of NVivo (semantics, message contents)
     • Twitter User Tweets
     • Twitter List Tweets
  21. Input Parameters
     • Size of the crawl
     • Degree of the crawl
     • Image capture
     • Tweet capture
     • Direction (followed by / following / both)
     • Edge definition: Followed / following; replies-to; mentions
     • Tweet column
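The "replies-to" and "mentions" edge definitions can be illustrated with a small text-based sketch. It assumes, purely for illustration, that a Tweet beginning with @name is a reply to that account and that any other @name is a mention; real extraction tools rely on the metadata the Twitter API returns rather than on text parsing. The function name, account names, and hashtag are hypothetical.

```python
import re

MENTION = re.compile(r"@(\w+)")

def tweet_edges(author, text):
    """Return (source, target, relationship) tuples for one Tweet."""
    names = [m.lower() for m in MENTION.findall(text)]
    starts_with_mention = text.lstrip().startswith("@")
    edges = []
    for i, name in enumerate(names):
        # Assumption: only a leading @name counts as a reply target.
        kind = "replies-to" if i == 0 and starts_with_mention else "mentions"
        edges.append((author.lower(), name, kind))
    return edges
```

For example, `tweet_edges("Ann", "@Bob thanks, cc @cat")` yields one replies-to edge from ann to bob and one mentions edge from ann to cat.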
  22. Data Processing: Graph Metrics
     • Degree, in-degree, out-degree
     • Betweenness and closeness centralities
     • Eigenvector centrality
     • Vertex clustering coefficient
     • Vertex pagerank
     • Edge reciprocation
     • Words and word pairs
     • Twitter search network top items
     • …and others
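Edge reciprocation is one of the simpler metrics in this list to compute by hand. The sketch below shows one common definition, the share of directed edges whose reverse edge also exists; NodeXL's exact definition may differ, and the edge list and function name are hypothetical.

```python
def reciprocated_edge_ratio(edges):
    """Share of distinct directed edges whose reverse edge is also present."""
    unique = {(a, b) for a, b in edges if a != b}  # drop duplicates and self-loops
    if not unique:
        return 0.0
    mutual = sum(1 for a, b in unique if (b, a) in unique)
    return mutual / len(unique)

# Hypothetical follow edges: only ann and bob follow each other.
EDGES = [("ann", "bob"), ("bob", "ann"), ("ann", "cat"), ("dan", "ann")]
```

Two of the four edges here belong to a mutual pair, so the ratio is 0.5.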
  23. Data Processing: Grouping
     • Group by vertex attribute
     • Group by connected component
     • Group by cluster
     • Group by motif
  24. Data Visualization
     • Type of layout algorithm applied to the data
     • Autofill
     • Labeling of vertices
     • Labeling of edges
     • Graph pane
     • Graph options
     • Zoom
     • Scale
  25. Dynamic Filtering
     • Adjust parameters (with the sliders) to limit what is visualized
     • Change up the time zones to analyze what is being communicated and by whom at which time (UTC / coordinated universal time)
     • Capture broadly and then focus in using dynamic filtering
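The "capture broadly, then focus in" idea can be mimicked outside the tool with a simple filter over time-stamped edges. This is a sketch under assumptions: the edge tuples, the date, and the `filter_edges` helper are invented for illustration, and the real sliders in NodeXL operate on whatever worksheet columns are available.

```python
from datetime import datetime, timezone

def filter_edges(edges, start, end, min_degree=1):
    """Keep edges stamped in [start, end) whose two endpoints each have
    at least `min_degree` edges inside that same window."""
    in_window = [e for e in edges if start <= e[2] < end]
    degree = {}
    for src, dst, _ in in_window:
        degree[src] = degree.get(src, 0) + 1
        degree[dst] = degree.get(dst, 0) + 1
    return [e for e in in_window
            if degree[e[0]] >= min_degree and degree[e[1]] >= min_degree]

def utc(hour):
    """Hypothetical helper: a UTC timestamp on an arbitrary example day."""
    return datetime(2014, 5, 20, hour, tzinfo=timezone.utc)

# Hypothetical (source, target, UTC time) edges.
EDGES = [
    ("ann", "bob", utc(9)),
    ("bob", "cat", utc(10)),
    ("ann", "cat", utc(9)),
    ("dan", "bob", utc(11)),
    ("cat", "ann", utc(15)),
]
morning = filter_edges(EDGES, utc(8), utc(12))
busy = filter_edges(EDGES, utc(8), utc(12), min_degree=2)
```

The first call narrows the capture to a morning window; the second also drops the edge involving "dan", who has only one interaction in that window.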
  26. Data Analysis
     • Use both the dataset and the visualizations (they both complement each other and are necessary for full understanding)
     • Capture the Tweets column and import that into a text analysis software program
  27. Limits -> Controlling for Input Parameters for the Data Extraction
     • Social media platform (Twitter and its data processing rate limits), even with an account for “whitelisting” (and the time-of-day of the data extraction through its data-streaming API)
     • NodeXL (up to about 300,000 records or so)
     • Computational power of researcher machine
     • Computer memory of researcher machine
     • No early indicator of size of data crawl or the acquirability of the electronic social network
     • Costly (computational and time expense) non-captures at system limits
  28. Addendum
     • May apply Boolean operators into the query (and query multiple terms simultaneously)
     • May use macros
     • May re-crawl using original parameters of a data extraction
     • May automate data extractions
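To show what Boolean operators do to a multi-term query, the sketch below applies AND / OR / NOT logic to Tweets that have already been captured. It is a local illustration only and makes no claim about the query syntax that NodeXL or Twitter accepts; the `matches` function, the sample text, and the hashtags are hypothetical.

```python
import re

def matches(text, all_of=(), any_of=(), none_of=()):
    """True if `text` has every term in all_of (AND), at least one term
    in any_of (OR, when given), and no term in none_of (NOT)."""
    words = set(re.findall(r"[#@]?\w+", text.lower()))
    has_all = all(t.lower() in words for t in all_of)
    has_any = not any_of or any(t.lower() in words for t in any_of)
    has_none = not any(t.lower() in words for t in none_of)
    return has_all and has_any and has_none
```

For instance, `matches("Loving #NETC14 in Kansas", any_of=["#netc14", "#extension"])` is true, while adding `none_of=["kansas"]` would exclude the same Tweet.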
  29. Some Sample Graph Visualizations from NodeXL Extractions from Twitter
     Note: Other details have been excluded because these visualizations are incomplete without the graph metrics and other complementary data…and it would be misrepresentational to explain the contexts of the data crawl behind the social network graphs incompletely. All of these graphs may be found in fuller detail and some with downloadable data sets on the NodeXL Graph Gallery. At the graph gallery, put “SHJ” in the Search bar at the top right.
  30. Grid
  31. Circle Layout (Ring Lattice Graph)
  32. Harel-Koren Fast Multiscale with Vertex Labels
  33. Random Layout Algorithm, Images at the Vertices
  34. Sugiyama Layout of Groups, Force-Based Overall Network Layout
  35. Harel-Koren Fast Multiscale
  36. Horizontal Sine Wave
  37. Harel-Koren Fast Multiscale
  38. Motif, Harel-Koren Fast Multiscale
  39. Harel-Koren Fast Multiscale
  40. Fruchterman-Reingold Layout, Partitioned
  41. 3D Fruchterman-Reingold Force-Based Graph
  42. Circle Layout / Ring Lattice Graph at Group Level, Force-Based Layout at Network Level
  43. Harel-Koren Fast Multiscale
  44. Harel-Koren Fast Multiscale
  45. Fruchterman-Reingold Layout, Imagery for Vertices
  46. Random Layout of Groups, Force-Based Layout of Network with Combined Edges
  47. Harel-Koren Fast Multiscale Layout at Cluster Level, Force-Based Layout at Network Level
  48. Motifs Extraction (Census), Sugiyama Layout at Network Level
  49. Harel-Koren Fast Multiscale for Groups, Force-Based Layout at Network Level
  50. Clustering by Clauset-Newman-Moore, Network Layout with Harel-Koren Fast Multiscale
  51. Motifs at Group Level, Spiral at Network Level
  52. Random at Group Level, Packed Rectangles for Network
  53. Harel-Koren Fast Multiscale for Clusters, Treemap Layout for Network
  54. Horizontal Sine Wave Layout (on beta)
  55. Harel-Koren Fast Multiscale
  56. Sugiyama, Stacked Rectangles
  57. Fruchterman-Reingold
  58. Fruchterman-Reingold
  59. Harel-Koren Fast Multiscale
  60. Harel-Koren Fast Multiscale
  61. Motif, Fruchterman-Reingold, on Grid
  62. Grid, Imagery on Vertices
  63. Multi-Sequence Mixed Visualization
  64. And…
  65. NodeXL Graph Server
     • Continuous crawl based on a certain term or account for over a month
     • Academic purposes only
     • Must be requested through Dr. Marc A. Smith (Connected Action Consulting Group @ marc@connectedaction.net)
     • Not retroactive crawls (a limitation of Twitter)
  66. NodeXL Beta Layouts
     • Treemap
     • Packed rectangles
     • Force directed
  67. Mixing Up Datasets
     Twitter Data Grants
     • Feb. 2014
     • Twitter Engineering Blog
     Other Sources
     • Content-sharing sites (with public APIs)
       • YouTube
       • Flickr
     • Social networking sites (with public APIs)
       • Facebook
       • LinkedIn
     • Email Networks
     • Web networks
     • Wiki networks
  68. Semantic (Meaning) Analysis of a Tweet Stream
     Using NCapture (add-in to Google Chrome and MS Internet Explorer browsers) and NVivo (a qualitative and mixed methods data analysis tool)
  69. (Partial) Twitter Feed Capture using NCapture of NVivo 10
  70. Word Cloud based on Word Frequency Count from Twitter Feed (Gist)
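The word frequency count behind a word cloud can be approximated in a few lines. This is a rough sketch in plain Python, not what NVivo does internally: the stop-word list is a tiny made-up sample, the Tweets are invented, and `word_frequencies` is a hypothetical helper.

```python
import re
from collections import Counter

# A deliberately small, hypothetical stop-word list.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "rt", "at", "for", "on"}

def word_frequencies(tweets, top=5):
    """Return the `top` most frequent words across the Tweets."""
    counts = Counter()
    for text in tweets:
        text = re.sub(r"https?://\S+", " ", text.lower())  # drop links
        for word in re.findall(r"[a-z0-9']+", text):
            if word not in STOP_WORDS and len(word) > 1:
                counts[word] += 1
    return counts.most_common(top)

# Invented sample Tweets.
TWEETS = [
    "RT @ann: Network graphs at the conference http://t.co/x",
    "Graphs and more graphs",
    "The conference is on",
]
```

A word cloud tool would then size each word in proportion to its count; here "graphs" would be drawn largest, followed by "conference".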
  71. Geolocation (Lat / Long) Data of Active Twitter User Accounts on a Tweet Stream / Feed
  72. Word Similarity Analysis
  73. Word Frequency Treemap (classical content analysis)
  74. Word Search Word Tree (and Stemming)
  75. Manual Analysis…through Coding, Categorizing, and Evaluation
      • Data reduction
      • Summary
      • Matrix analysis
      • Coding and analysis
      Topic | Pro (sentiment) | Con (sentiment)
  76. Human-Machine Analysis
      • Network Text Analysis Theory (language modeled as networks of words and relations)
      • Semantic network
        • Nodes: concepts or ideas, ideational kernels
        • Links: statements, relationships (strength of relationship; directionality such as agreement / disagreement or positive / negative; type of relation; sentiment)
        • Network: semantic map, union of all statements
      • May be a one-mode network (all nodes of a type)
        • Concepts
      • May be a multi-modal network (based on ontological coding with various mixes of node types)
        • Persons, places, concepts, sentiments, locations, and others
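      A one-mode semantic network of the kind described above can be sketched in a few lines of Python. This assumes a deliberately simple linking rule (two concepts are linked each time they appear in the same statement, and the count is the link strength); it is not the method used by any particular tool, and `build_semantic_network` and the coded statements are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def build_semantic_network(statements):
    """Return {(concept_a, concept_b): weight} for concepts sharing a statement."""
    edges = defaultdict(int)
    for concepts in statements:
        # Each unordered pair of distinct concepts in a statement forms one link;
        # sorting makes (a, b) and (b, a) map to the same edge key.
        for a, b in combinations(sorted(set(concepts)), 2):
            edges[(a, b)] += 1
    return dict(edges)

# Hypothetical statements, already coded into concepts by a human analyst.
statements = [
    ["twitter", "network", "hashtag"],
    ["network", "hashtag"],
    ["twitter", "event"],
]

network = build_semantic_network(statements)  # union of all statements
nodes = sorted({concept for pair in network for concept in pair})
```

      Here `nodes` holds the concepts and each key in `network` is an undirected link whose weight stands in for strength of relationship. Directionality, relation type, and sentiment would need extra edge attributes.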
  77. Human-Machine Analysis (cont.)
      • Meta-network analysis based on a text corpus / merged text corpuses
      • Drawn from unstructured natural language text data
      • Identification of users (account holders on Twitter) and their interrelationships with others based on messaging and re-Tweeting and following / not following
      • May use Carnegie Mellon University’s freeware text-mining tool AutoMap 3.0.10.18 on Windows (by Center for Computational Analysis of Social and Organizational Systems, CASOS) (2001 – present)
      • Graph visualizations in 2D and 3D made in ORA-NetScenes (CASOS)
  78. Human-Machine Analysis (cont.)
      • AutoMap…requires data pre-processing (setting parameters)
        • Requires text corpuses as .txt files (transcoding from .doc, .docx, .HTML, or other)
        • May combine multiple text sets (through merging); can then query on the whole set or on the individual text sets
        • May create “stop words” (or “delete”) lists to de-noise data (with “stop words” like relative pronouns, personal pronouns, articles, conjunctions, and other words with less semantic meaning)
        • May use universal or domain-specific “thesauruses” to define, filter, and hone the meta-network extractions
        • Enables the defining of sentiment
        • Requires testing of a sample set and meta-network visualization to ensure appropriateness of the data refinements
        • Involves the design of meta-networks and ontologies from the text corpuses
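      The two pre-processing ideas above, a “delete” list and a thesaurus, can be illustrated with a small Python sketch. It does not use AutoMap or reproduce its file formats; the `preprocess` function, the stop-word set, and the thesaurus entries are assumptions invented for this example.

```python
import re

# Hypothetical delete list and thesaurus, invented for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "or", "who", "which", "he", "she", "it", "we"}
THESAURUS = {
    "tweets": "tweet",
    "tweeting": "tweet",
    "k-state": "kansas_state_university",
}

def preprocess(text, stop_words=STOP_WORDS, thesaurus=THESAURUS):
    """Tokenize, drop stop words, then map variants onto one concept label."""
    tokens = re.findall(r"[a-z][a-z\-']*", text.lower())
    kept = [t for t in tokens if t not in stop_words]   # de-noise the text
    return [thesaurus.get(t, t) for t in kept]          # generalize variants

cleaned = preprocess("We saw the Tweets and K-State tweeting")
# cleaned == ["saw", "tweet", "kansas_state_university", "tweet"]
```

      Running a small sample like this and inspecting the result before processing the whole corpus mirrors the sample-set testing step listed on the slide.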
  79. Human-Machine Analysis (cont.)
      • …requires data processing and data visualization
        • May run the textual data processing
        • Includes a web scraper for the main social media platforms in its ScriptRunner feature
      • …requires data post-processing
        • Includes accessing AutoMap data from ORA-NetScenes to create network visualizations
        • Includes data “mining” for meaning / sense-making (identification of patterns)
        • Includes data visualization analysis
      • Note: The work may require re-running this cycle multiple times for different data queries.
  80. Sampler: Wordle™ Word Cloud to Create an Emergent Thesaurus
  81. Sampler: Excerpt from a Year’s Worth of a Blog’s Text Corpus
  82. Sampler: @kstate_pres Tweets Visualization
  83. Demos?
      • Would you like to see how to set up a simple data crawl from Twitter using NodeXL? (Note: Because of Twitter rate limiting, the data extraction may not complete, but you can at least see what a basic setup looks like.)
      • Any questions?
  84. Conclusion and Contact
      • Dr. Shalin Hai-Jew
      • Instructional Designer
      • Information Technology Assistance Center
      • Kansas State University
      • 212 Hale Library
      • 785-532-5262
      • shalin@k-state.edu
      • Thanks to Dr. Marc A. Smith, sociologist and Chief Social Scientist for Connected Action, for generously presenting a webinar at K-State to our faculty and staff. Also, Tony Capone, NodeXL developer, made the NodeXL beta available to me and has been very gracious and encouraging.