Boston DataSwap 2013 -- Network Visualization in NodeXL
  • Brent Spiner as Data on Star Trek: TNG
  • Visual bandwidth is enormous. Human perceptual skills are remarkable: trend, cluster, gap, outlier... Human image storage is fast and vast. Challenges: meaningful visual displays of massive data (color, size, shape, proximity...); interaction (widgets & window coordination). Image from Wikipedia user Shultz: http://en.wikipedia.org/wiki/File:Anscombe%27s_quartet_3.svg
  • Tukey, John W. "We Need Both Exploratory and Confirmatory." The American Statistician 34.1 (1980): 23-25.
  • A NodeXL social media network diagram of relationships among Twitter users mentioning the hashtag “#WIN09”, used by attendees of a conference on network science at New York University in September 2009. The size of each user’s vertex is proportional to the number of tweets that user has ever made. An edge represents a follow, mention, or reply. Two distinct groups – separate disciplines.
  • Sociology (Newman & Girvan, 2004); Scientometrics (Henry et al., 2007); Politics (Adamic & Glance, 2005); Urban Planning (Scott Dempwolf); Biology (Kelley et al., 2003); Archaeology (Tom Brughmans); WWW (Cheswick et al., 2000)
  • Matrix – unreadable with many nodes. Aggregation: attribute grouping or clustering; lose topology info.
      o Dunne, C.; Riche, N. H.; Lee, B.; Metoyer, R. A. & Robertson, G. G. GraphTrail: Analyzing large multivariate and heterogeneous networks while supporting exploration history. CHI '12: Proc. 2012 international conference on Human factors in computing systems, 2012.
      o Gove, R.; Gramsky, N.; Kirby, R.; Sefer, E.; Sopan, A.; Dunne, C.; Shneiderman, B. & Taieb-Maimon, M. NetVisia: Heat map & matrix visualization of dynamic social network statistics & content. SocialCom '11: Proc. 2011 IEEE 3rd International Conference on Social Computing, 2011, 19-26. DOI:10.1109/PASSAT/SocialCom.2011.216
      o Blue R, Dunne C, Fuchs A, King K and Schulman A (2008), "Visualizing real-time network resource usage", In VizSec '08. pp. 119-135.
      o Henry, N. & Fekete, J.-D. MatrixExplorer: A dual-representation system to explore social networks. TVCG: IEEE Transactions on Visualization and Computer Graphics, 2006, 12, 677-684. DOI:10.1109/TVCG.2006.160
      o Freire, M.; Plaisant, C.; Shneiderman, B. & Golbeck, J. ManyNets: An interface for multiple network analysis and visualization. CHI '10: Proc. 28th international conference on Human factors in computing systems, ACM, 2010, 213-222. DOI:10.1145/1753326.1753358
      o Wattenberg, M. Visual exploration of multivariate graphs. CHI '06: Proc. SIGCHI conference on Human Factors in Computing Systems, ACM, 2006, 811-819. DOI:10.1145/1124772.1124891
  • Collect data, Excel analysis, statistics, visualization, layout algorithms, filtering, clustering, attribute mapping…
  • ~30 courses on network analysis. Tutorials I taught.
  • Involved since 2008. As of June 2012, ~20 team, 7 me. Total: 80 ACM, 117 Scopus, 1270 Google Scholar.
      o Dunne C and Shneiderman B (2013), "Motif simplification: improving network visualization readability with fan, connector, and clique glyphs", In CHI '13.
      o Shneiderman B and Dunne C (2012), "Interactive network exploration to derive insights: Filtering, clustering, grouping, and simplification", In Graph Drawing '12. pp. 2-18. DOI:10.1007/978-3-642-36763-2_2
      o Bonsignore EM, Dunne C, Rotman D, Smith M, Capone T, Hansen DL and Shneiderman B (2009), "First steps to NetViz Nirvana: Evaluating social network analysis with NodeXL", In SocialCom '09. pp. 332-339. DOI:10.1109/CSE.2009.120
      o Mohammad S, Dunne C and Dorr B (2009), "Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus", In EMNLP '09. pp. 599-608.
      o Smith M, Shneiderman B, Milic-Frayling N, Rodrigues EM, Barash V, Dunne C, Capone T, Perer A and Gleave E (2009), "Analyzing (social media) networks with NodeXL", In C&T '09. pp. 255-264. DOI:10.1145/1556460.1556497
      o Dunne C, Chaturvedi S, Ashktorab Z, Zacharia R, and Shneiderman B (2013), "Fitted rectangles and force-directed group-in-a-box layouts for clustered network visualization", In preparation.
      o Dunne C and Shneiderman B (2009), "Improving graph drawing readability by incorporating readability metrics: A software tool for network analysts". University of Maryland. Human-Computer Interaction Lab Tech Report No. (HCIL-2009-13).
  • Aggregate topology to reduce stored information – combine functionally equivalent nodes. Hard to tell underlying structure. Difficult to understand summarization process. Can’t see attributes.
      o Navlakha, S.; Rastogi, R. & Shrivastava, N. Graph summarization with bounded error. SIGMOD '08: Proc. 2008 ACM SIGMOD international conference on Management of data, ACM, 2008, 419-432. DOI:10.1145/1376616.1376661
  • Arc: fix left side, move clockwise; fixed radius. Shape. Size. Retain attribute encodings (head). Unique color. Attribute color.
  • Shape. Size. Same color. Meta-edge size & color (unbalanced).
  • Lossless transformations. Direct manipulation. Visual & textual cues.
  • Ben Nelson (NE) in main D. Chuck Hagel (NE) in main R.
  • Ben Nelson (NE) in main D. Chuck Hagel (NE) wildcard: hard right. Fewer connections with moderates.
  • Ben Nelson (NE) blue dog: elected under a moderate platform. Closest to potential moderates like Snowe, Lieberman, Collins. Chuck Hagel (NE) hard right: against No Child Left Behind, which the rest of the party lined up for. Against the Bush prescription drug (Medicare) act.
  • Published in NodeXL book. Think straightforward. Size unclear. Hidden relationships.
  • Based on Lee et al. 2006 taxonomy:
      o Node count: About how many nodes are in the network?
      o Articulation point: Which individual node would we remove to disconnect the most nodes from the main network?
      o Largest motif & size: Which is the largest ( fan | connector | clique ) motif and how many nodes does it contain?
      o Labels: Which node has the label “XXX”?
      o Shortest path: What is the length of the shortest path between the two highlighted nodes?
      o Neighbors: Which of the two highlighted nodes has more neighbors?
      o Common Neighbors: How many common neighbors are shared by the two highlighted nodes?
      o Common Neighbors: Which of these two pairs of nodes has more common neighbors?
  • Clustered with CNM
  • http://www.boardgamegeek.com/image/1466865/risk
  • “Never get involved in a land war in Asia”
  • Transcript

    • 1. Network Visualization in NodeXL. Cody Dunne, IBM Research – Cambridge, MA, cdunne@us.ibm.com. Boston Data Swap Skill-A-Thon, Oct. 17, 2013
    • 2. The Data Problem
    • 3. Anscombe’s Quartet
              I             II           III            IV
              x      y      x      y      x      y      x      y
          10.00   8.04  10.00   9.14  10.00   7.46   8.00   6.58
           8.00   6.95   8.00   8.14   8.00   6.77   8.00   5.76
          13.00   7.58  13.00   8.74  13.00  12.74   8.00   7.71
           9.00   8.81   9.00   8.77   9.00   7.11   8.00   8.84
          11.00   8.33  11.00   9.26  11.00   7.81   8.00   8.47
          14.00   9.96  14.00   8.10  14.00   8.84   8.00   7.04
           6.00   7.24   6.00   6.13   6.00   6.08   8.00   5.25
           4.00   4.26   4.00   3.10   4.00   5.39  19.00  12.50
          12.00  10.84  12.00   9.13  12.00   8.15   8.00   5.56
           7.00   4.82   7.00   7.26   7.00   6.42   8.00   7.91
           5.00   5.68   5.00   4.74   5.00   5.73   8.00   6.89
    • 4. Anscombe’s Quartet - Statistics
        Property                                    Value               Equality
        Mean of x in each case                      9                   Exact
        Variance of x in each case                  11                  Exact
        Mean of y in each case                      7.50                To 2 decimal places
        Variance of y in each case                  4.122 or 4.127      To 3 decimal places
        Correlation between x and y in each case    0.816               To 3 decimal places
        Linear regression line in each case         y = 3.00 + 0.500x   To 2 and 3 decimal places, respectively
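
The summary statistics on slide 4 can be checked directly from the data on slide 3. The following is a minimal sketch in plain Python; the names `QUARTET`, `summarize`, and `SUMMARIES` are illustrative choices, not part of the talk or of NodeXL. It uses the sample variance (dividing by n - 1) and an ordinary least-squares fit.

```python
from math import sqrt

# Anscombe's quartet, transcribed from the table on slide 3.
X_123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared by sets I, II, III
QUARTET = {
    "I": (X_123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                  7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                   6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                    6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return mean, sample variance, correlation, and least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "var_x": sxx / (n - 1),
        "mean_y": mean_y,
        "var_y": syy / (n - 1),
        "r": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, s in SUMMARIES.items():
        print(f"{name:>3}: mean_x={s['mean_x']:.2f} var_x={s['var_x']:.2f} "
              f"mean_y={s['mean_y']:.2f} var_y={s['var_y']:.3f} "
              f"r={s['r']:.3f} y={s['intercept']:.2f}+{s['slope']:.3f}x")
```

All four sets produce nearly the same row of numbers, which is the point of the slide: the differences only show up in the scatterplots.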
    • 5. Anscombe’s Quartet - Scatterplots
    • 6. No catalogue of techniques can convey a willingness to look for what can be seen, whether or not anticipated. Yet this is at the heart of exploratory data analysis. ... the picture-examining eye is the best finder we have of the wholly unanticipated. – Tukey, 1980
    • 7. Node-Link Network Visualization
        Node 1   Node 2
        Alice    Bob
        Alice    Cathy
        Cathy    Alice
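
The edge list on slide 7 is the basic input to a node-link diagram. As a rough sketch, assuming the rows are directed edges (Alice to Cathy and Cathy to Alice are listed separately), the list can be turned into an adjacency structure like this. The function names are hypothetical and only for illustration.

```python
# Edge list from slide 7, read as (source, target) pairs.
# Treating the rows as directed edges is an assumption.
EDGES = [("Alice", "Bob"), ("Alice", "Cathy"), ("Cathy", "Alice")]


def build_adjacency(edges):
    """Map each node to the set of nodes it points to."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set())  # keep nodes with no out-edges
    return adjacency


def reciprocated_pairs(adjacency):
    """List node pairs that are connected in both directions."""
    return sorted(
        (a, b)
        for a, targets in adjacency.items()
        for b in targets
        if a < b and a in adjacency[b]
    )


ADJACENCY = build_adjacency(EDGES)
MUTUAL = reciprocated_pairs(ADJACENCY)

if __name__ == "__main__":
    for node in sorted(ADJACENCY):
        print(node, "->", sorted(ADJACENCY[node]))
    print("reciprocated:", MUTUAL)
```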
    • 8. Tweets of the #Win09 Workshop
        #     User 1            User 2
        1     20andlife         barrywellman
        2     20andlife         BrianDavidson
        3     barrywellman      elizabethmdaly
        4     barrywellman      informor
        5     BrianDavidson     hcraygliangjie
        6     BrianDavidson     informor
        7     BrianDavidson     NetSciWestPoint
        8     byaber            barrywellman
        9     byaber            danielequercia
        10    byaber            mcscharf
        11    chrisnordyke      RebeccaBadger
        12    danevans87        barrywellman
        13    danevans87        BrianDavidson
        14    danevans87        drewconway
        15    danevans87        informor
        16    danevans87        NetSciWestPoint
        17    danielequercia    BrianDavidson
        18    danielequercia    drewconway
        19    danielequercia    ipeirotis
        20    danielequercia    johnflurry
        21    danielequercia    loyan
        22    danielequercia    loyan
        23    danielequercia    mcscharf
        24    danielequercia    NetSciWestPoint
        …     …                 …
        106   sechrest          Japportreport
        107   sechrest          loyan
        108   sechrest          RebeccaBadger
    • 9. Tweets of the #Win09 Workshop
    • 10. Who Uses Network Analysis: Sociology, Scientometrics, Biology, Urban Planning, Politics, Archaeology, WWW
    • 11. Network visualization is highly useful, but hard! There are many ways to make it easier
    • 12. Alternate visualizations... Dunne et al., 2012; Gove et al., 2011; Blue et al., 2008; Henry & Fekete, 2006; Freire et al., 2010; Wattenberg, 2006
    • 13. 1. Tools for network analysis that are easy to learn, powerful, and insightful
    • 34. NodeXL Graph Gallery
    • 35. NodeXL as a Teaching Tool
        I. Getting Started with Analyzing Social Media Networks
            1. Introduction to Social Media and Social Networks
            2. Social media: New Technologies of Collaboration
            3. Social Network Analysis
        II. NodeXL Tutorial: Learning by Doing
            4. Layout, Visual Design & Labeling
            5. Calculating & Visualizing Network Metrics
            6. Preparing Data & Filtering
            7. Clustering & Grouping
        III. Social Media Network Analysis Case Studies
            8. Email
            9. Threaded Networks
            10. Twitter
            11. Facebook
            12. WWW
            13. Flickr
            14. YouTube
            15. Wiki Networks
        http://www.elsevier.com/wps/find/bookdescription.cws_home/723354/description
    • 36. NodeXL as a Research Tool
    • 37. NodeXL Results
        • Easy to learn, yet powerful and insightful
        • Widely used by both students and researchers
        • Free and open source software
        • World-wide team of collaborators
        Malik S, Smith A, Papadatos P, Li J, Dunne C, and Shneiderman B (2013), "TopicFlow: Visualizing topic alignment of Twitter data over time", In ASONAM '13.
        Bonsignore EM, Dunne C, Rotman D, Smith M, Capone T, Hansen DL and Shneiderman B (2009), "First steps to NetViz Nirvana: Evaluating social network analysis with NodeXL", In CSE '09. pp. 332-339. DOI:10.1109/CSE.2009.120
        Mohammad S, Dunne C and Dorr B (2009), "Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus", In EMNLP '09. pp. 599-608.
        Smith M, Shneiderman B, Milic-Frayling N, Rodrigues EM, Barash V, Dunne C, Capone T, Perer A and Gleave E (2009), "Analyzing (social media) networks with NodeXL", In C&T '09. pp. 255-264. DOI:10.1145/1556460.1556497
    • 38. 2. Visualize complex relationships with limited screen space
    • 39. Lostpedia articles. Observations:
        1: There are repeating patterns in networks (motifs)
        2: Motifs often dominate the visualization
        3: Motif members can be functionally equivalent
    • 40. Graph Summarization… Navlakha et al., 2008
    • 41. Motif Simplification: Fan Motif; 2-Connector Motif
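
To make the two motifs on slide 41 concrete, here is a small detection sketch. It is not the NodeXL implementation or the algorithm from the CHI '13 paper. It assumes an undirected graph, treats a fan as a head node with two or more neighbors that connect to nothing else, and treats a connector as two or more nodes that share exactly the same set of two or more neighbors (the anchors). The toy edges and the names `find_fans` and `find_connectors` are made up for illustration; only the head label "Theory" echoes slide 47.

```python
from collections import defaultdict


def neighbors_of(edges):
    """Build an undirected neighbor map, ignoring self-loops."""
    nbrs = defaultdict(set)
    for a, b in edges:
        if a != b:
            nbrs[a].add(b)
            nbrs[b].add(a)
    return nbrs


def find_fans(nbrs, min_leaves=2):
    """Group degree-1 nodes (leaves) by the single node they attach to."""
    leaves_by_head = defaultdict(list)
    for node, ns in nbrs.items():
        if len(ns) == 1:
            (head,) = ns
            leaves_by_head[head].append(node)
    return {head: sorted(leaves)
            for head, leaves in leaves_by_head.items()
            if len(leaves) >= min_leaves}


def find_connectors(nbrs, min_anchors=2, min_spans=2):
    """Group nodes whose neighbor sets are identical (same anchors)."""
    spans_by_anchors = defaultdict(list)
    for node, ns in nbrs.items():
        if len(ns) >= min_anchors:
            spans_by_anchors[frozenset(ns)].append(node)
    return {tuple(sorted(anchors)): sorted(spans)
            for anchors, spans in spans_by_anchors.items()
            if len(spans) >= min_spans}


# Hypothetical toy graph: a fan around "Theory" and a 2-connector
# whose span nodes s1 and s2 link the anchors A and B.
TOY_EDGES = [
    ("Theory", "t1"), ("Theory", "t2"), ("Theory", "t3"),
    ("Theory", "A"),
    ("A", "s1"), ("s1", "B"),
    ("A", "s2"), ("s2", "B"),
]
NBRS = neighbors_of(TOY_EDGES)
FANS = find_fans(NBRS)
CONNECTORS = find_connectors(NBRS)

if __name__ == "__main__":
    print("fans:", FANS)
    print("connectors:", CONNECTORS)
```

Once detected, each group of leaves or span nodes could be collapsed into a single glyph sized by the member count, which is the idea behind the simplification. In a 4-cycle this sketch reports both pairs as connectors for each other, a case a fuller implementation would need to resolve.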
    • 42. Lostpedia articles
    • 43. Lostpedia articles
    • 44. Glyph Design: Fan
    • 45. Glyph Design: Connector
    • 46. Cliques too!
    • 47. Interactivity. Fan motif: 133 leaf vertices with head vertex “Theory”
    • 48. Interactivity in NodeXL
    • 49. Senate Co-Voting: 65% Agreement
    • 50. Senate Co-Voting: 70% Agreement
    • 51. Senate Co-Voting: 80% Agreement
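
Slides 49 to 51 show the same co-voting network at three agreement thresholds. One plausible way to build such a network, sketched here under the assumption that an edge joins two senators when the fraction of votes on which they agree meets the threshold, is shown below. The vote records are invented placeholder data, and `agreement` and `covoting_edges` are hypothetical helper names.

```python
from itertools import combinations

# Hypothetical roll-call records: 1 = yea, 0 = nay, one entry per vote.
VOTES = {
    "Senator A": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "Senator B": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    "Senator C": [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
    "Senator D": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
}


def agreement(votes_a, votes_b):
    """Fraction of roll calls on which two senators voted the same way."""
    same = sum(1 for a, b in zip(votes_a, votes_b) if a == b)
    return same / len(votes_a)


def covoting_edges(votes, threshold):
    """Return (name1, name2, agreement) for pairs at or above threshold."""
    edges = []
    for (n1, v1), (n2, v2) in combinations(sorted(votes.items()), 2):
        share = agreement(v1, v2)
        if share >= threshold:
            edges.append((n1, n2, share))
    return edges


if __name__ == "__main__":
    # Thresholds taken from the slide titles.
    for threshold in (0.65, 0.70, 0.80):
        pairs = [(a, b) for a, b, _ in covoting_edges(VOTES, threshold)]
        print(f"{threshold:.0%}: {pairs}")
```

Raising the threshold removes weaker ties, so loosely attached senators separate from the main clusters as the cutoff increases.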
    • 52. Voson Web Crawl
    • 53. Voson Web Crawl
    • 54. Voson Web Crawl
    • 55. Motif Simplification Results
        • Controlled experiment with 36 users showed that motif simplification improves user task performance
            o Reducing complexity
            o Understanding larger or hidden relationships
        • Algorithms for detecting fans, connectors, and cliques
        • Publicly available implementation in NodeXL: nodexl.codeplex.com
        Dunne C and Shneiderman B (2013), "Motif simplification: improving network visualization readability with fan, connector, and clique glyphs", In CHI '13. pp. 3247-3256. DOI:10.1145/2470654.2466444
        Shneiderman B and Dunne C (2012), "Interactive network exploration to derive insights: Filtering, clustering, grouping, and simplification", In Graph Drawing '12. pp. 2-18. DOI:10.1007/978-3-642-
    • 56. 3. Explore groups in the network, including their size, membership, and relationships
    • 58. Previous Meta-Layouts
        • Poorly show ties (Rodrigues et al., 2011)
            o Long ties
            o Group arrangement
            o Aggregate relationships
        OR
        • Poorly show nodes & groups (Noack, 2003)
            o Require much more space
            o Harder to see groups
    • 59. Group-in-a-Box Meta-Layouts
        • Squarified Treemap
        • Croissant-Donut
        • Force-Directed
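
The common idea behind the Group-in-a-Box layouts is to give each group its own region of the canvas, sized by the group, and then lay out the group's nodes inside it. The sketch below illustrates only the box-assignment step, and it uses a simple recursive slicing scheme rather than the squarified treemap, Croissant-Donut, or force-directed variants named on the slide. The group names, sizes, and the `group_boxes` function are hypothetical.

```python
def group_boxes(group_sizes, x=0.0, y=0.0, width=1.0, height=1.0):
    """Assign each group a rectangle (x, y, w, h) with area proportional
    to its size. Largest groups are placed first; each one takes a slice
    across the longer side of the space that is still free."""
    ordered = sorted(group_sizes.items(), key=lambda kv: kv[1], reverse=True)
    remaining = float(sum(size for _, size in ordered))
    boxes = {}
    for name, size in ordered:
        share = size / remaining  # share of the space still unassigned
        if width >= height:
            w = width * share
            boxes[name] = (x, y, w, height)
            x += w
            width -= w
        else:
            h = height * share
            boxes[name] = (x, y, width, h)
            y += h
            height -= h
        remaining -= size
    return boxes


# Hypothetical cluster sizes (node counts) on a 100 x 60 canvas.
SIZES = {"G1": 50, "G2": 30, "G3": 20}
BOXES = group_boxes(SIZES, width=100.0, height=60.0)

if __name__ == "__main__":
    for name, (bx, by, bw, bh) in BOXES.items():
        print(f"{name}: x={bx:.1f} y={by:.1f} w={bw:.1f} h={bh:.1f}")
```

A squarified treemap would additionally try to keep each box close to square, which this slicing scheme does not attempt.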
    • 61. Risk Movements: Plain Layout with Clusters
    • 62. Risk Movements: GIB Treemap
    • 63. Risk Movements: GIB Croissant
    • 64. Risk Movements: GIB Force-Directed
    • 65. Meta-Layout Results
        • Three Group-in-a-Box layout algorithms for dissecting networks
        • Improved group and overview visualization
        • Empirical evaluation on 309 Twitter networks using readability metrics
        • Publicly available implementation in NodeXL: nodexl.codeplex.com
        Shneiderman B and Dunne C (2012), "Interactive network exploration to derive insights: Filtering, clustering, grouping, and simplification", In Graph Drawing '12. pp. 2-18. DOI:10.1007/978-3-642-36763-2_2
        Chaturvedi S, Ashktorab Z, Dunne C, Zacharia R, and Shneiderman B (2013), "Croissant-Donut and Force-Directed Group-in-a-Box layouts for clustered network visualization", In preparation.
        Rodrigues EM, Milic-Frayling N, Smith M, Shneiderman B, and Hansen (2011), "Group-in-a-Box layout for multi-faceted analysis of communities", In SocialCom '11. pp. 354-361.
    • 66. Available Now in NodeXL!
        • Motif Simplification
        • Group-in-a-Box Layouts
        • Data import spigots
        • Excel functions & macros
        • Network statistics
        • Layout algorithms
        • Filtering
        • Clustering
        • Attribute mapping
        • Automate analyses
        • Email reporting
        • Graph Gallery
        • C# libraries
        nodexl.codeplex.com
        Cody Dunne, IBM Research – Cambridge, MA, cdunne@us.ibm.com
