Visualizing (BIG) data.

A collection of slides on visualizing data (BIG or not). I am still adding slides here and tweaking things so if you have a correction, or opinion, or addition please let me know on Twitter @jamesonthecrow


Transcript

  • 1. Visualizing (BIG) Data. Jameson Toole, PhD Candidate, Human Mobility and Networks Lab (HuMNet), MIT
  • 2. Outline: 1. General Principles 2. Tools 3. Geographic Data 4. Networks 5. Inspiration
  • 3. Before we start… 1. There are no rules, only suggestions. 2. Sometimes suggestions are contradictory. 3. Be opinionated. 4. Guidelines may vary depending on your intended audience.
  • 4. *Blue Seven - “Clean Start Project”
  • 5. sensor favorite knee paint stain *Blue Seven - “Clean Start Project”
  • 6. The Big Picture. Tell a story. [Figure: five-panel figure shown with an excerpt of the paper's text. Axis labels: time; distance traveled in one day, D (km); cumulative distribution; population difference since December 1, 2009; distance from PaP (km).] Fig. 1. Overview of population movements. (A) Shows the geography of Haiti, with distances from PaP marked. The epicenter of the earthquake is marked by a cross. (B) Gives the proportion of individuals who traveled more than d km between day t − 1 and t. Distances are calculated by comparing the person’s current location with his or her latest observed location. In (C), we graph the change in the number of individuals in the various provinces in Haiti. (D) Gives a cumulative probability distribution of the daily travel distances d for people in PaP at the time of the earthquake. (E) Shows the cumulative probability dis… Source: “The Predictability of population displacement after the 2010 Haiti earthquake” http://www.pnas.org/content/early/2012/06/11/1203882109.abstract
  • 7. Scientific vs. Pop Visualization. Scientific: • Must maintain data integrity. • Quantification more important. • Be consistent with tradition and expectations. • Work within publication medium (black and white, non-interactive). Popular: • Interpretable by viewers of different backgrounds. • “Smooth” data to show trends without losing people in details. • Experiment with new formats, styles, designs. Many principles overlap!
  • 8. Tufte Design Principles: • Maximize data-ink ratio • Maximize data density • Avoid “chartjunk”. Prof. Edward Tufte: Statistics, Computer Science, Political Science, Yale University
  • 9. Data Density. Data Density = (# Data Points) / (Sq. Area) http://bmander.com/dotmap/index.html
  • 10. Data Density. Data Density = (# Data Points) / (Sq. Area). Faded series provide context and comparison. Highlight focal point. http://projects.flowingdata.com/life-expectancy/
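A minimal sketch of the data density formula from slides 9 and 10. The function name `data_density` and the point counts and figure sizes are hypothetical, chosen only to illustrate that the same data drawn in a smaller area has a higher density.

```python
def data_density(num_points: int, width: float, height: float) -> float:
    """Return the number of data points per unit of display area."""
    area = width * height
    if area <= 0:
        raise ValueError("display area must be positive")
    return num_points / area


# Hypothetical example: the same 10,000 points drawn at two figure sizes.
large_figure = data_density(10_000, width=8, height=6)
small_figure = data_density(10_000, width=4, height=3)
print(f"8x6 figure: {large_figure:.1f} points per unit area")
print(f"4x3 figure: {small_figure:.1f} points per unit area")
```

Halving both dimensions quarters the area, so the density is four times higher. Whether the denser figure is still readable is the design question the slides raise.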
  • 11. Data-Ink Ratio. Data-Ink Ratio = (Data-Ink) / (Total-Ink). Low: http://www.statmethods.net/advgraphs/ggplot2.html High: The Visual Display of Quantitative Information - Edward Tufte
  • 12. Tools: Analog. Don’t assume visualization has to be digital! Blue Seven - “Clean Start Project”
  • 13. Tools: Analog. Don’t assume visualization has to be digital! http://petapixel.com/2011/05/24/long-exposure-night-photos-of-airplanes-taking-off-and-landing/
  • 14. Tools: Spreadsheets. Be careful! Defaults make bad choices easy! Examples: Excel, Open Office, Google Docs. Which slice is bigger?
  • 15. Tools: Spreadsheets. Be careful! Defaults make bad choices easy! Examples: Excel, Open Office, Google Docs. Which series is the minimum?
  • 16. Tools: Spreadsheets. Be careful! Defaults make bad choices easy! Examples: Excel, Open Office, Google Docs. What is this value?
  • 17. Tools: Spreadsheets. You think I’m joking. This was actually published! http://bioinformatics.oxfordjournals.org/content/25/12/i39/F4.expansion
  • 18. Tools: Spreadsheets. Keep it simple.
  • 19. Tools: Design. Examples: Adobe Photoshop, Illustrator, Inkscape http://flowingdata.com/2009/11/12/how-to-make-a-us-county-thematic-map-using-free-tools/
  • 20. Tools: Web & Interactive. Examples: JavaScript (d3.js), HTML5, CSS, WebGL http://d3js.org/
  • 21. Tools: Web & Interactive. Examples: JavaScript (d3.js), HTML5, CSS, WebGL http://www.chromeexperiments.com/webgl/
  • 22. Tools: Web & Interactive. Examples: Tableau, Google Fusion Tables http://www.tableausoftware.com/ http://www.google.com/drive/apps.html#fusiontables
  • 23. Tools: Programming (Figures). Examples: R (ggplot2), Matlab, Python (Matplotlib). Scripting figures programmatically gives more control and reproducibility. http://is-r.tumblr.com http://flowingdata.com
  • 24. Tools: Programming (Advanced). Examples: processing.org • Built on Java • Use any Java library • Relatively fast • Rapid prototyping • Active community • Hard to share
  • 25. Tools: Programming (Advanced). Examples: processing.org. Eric Fischer - http://www.flickr.com/photos/walkingsf/
  • 26. Tools: Programming (Advanced). Examples: processing.org http://www.vispolitics.com
  • 27. Tools: Programming (Advanced). Examples: processing.org http://casualdata.com/senseofpatterns/ Jer Thorp - http://blog.blprnt.com/
  • 28. Tools: Programming (Advanced). Examples: processing.org
  • 29. Chartjunk (Infographics). If you have to write every data value on your chart, re-think your design. http://junkcharts.typepad.com/junk_charts/2010/04/another-ipad-post.html
  • 30. Chartjunk (Infographics). Don’t use stick figure people. Just don’t. Please. http://www.marketplace.org/topics/business/news-brief/us-unemployment-picture-glance-august-2011
  • 31. Visualization Toolkit Models: Component, Grammar, D3.js, HTML/DOM. Slides by Eugene Wu: http://www.mit.edu/~eugenewu/
  • 32. Visualization Toolkit Models. Example table, columns: name | lat | lon | time | #. Rows: TD Garden | 42.2 | -71.1 | 1pm | 100; South Stn | 42.3 | -72.1 | 1pm | 200; TD Garden | 42.2 | -71.1 | 3pm | 100. Slides by Eugene Wu: http://www.mit.edu/~eugenewu/
  • 33. Component model. Diagram labels: Data, Renderer, Bar Chart, Line Chart, Stacked Line Chart, Pie Chart. Slides by Eugene Wu: http://www.mit.edu/~eugenewu/
  • 34. Component model. Bar Chart: • Learn attr domains • Map data, bar attributes • Render bars, axes, legend • Interaction. Diagram labels: Data, Renderer. Slides by Eugene Wu: http://www.mit.edu/~eugenewu/
  • 35. Visualization Toolkit Models: Component, Grammar, D3.js, HTML/DOM. Slides by Eugene Wu: http://www.mit.edu/~eugenewu/
  • 36. Visualization Toolkit Models. Slides by Eugene Wu: http://www.mit.edu/~eugenewu/
  • 37. Visualization Toolkit Models. Data; Scales: Response (categorical), Gender (categorical); Statistical Transform: Bin; Geometry Mapping: Data Interval; Positioning: Stacked; Coordinates: Euclidean, Polar; Aesthetic Mappings: Color: Response. Slides by Eugene Wu: http://www.mit.edu/~eugenewu/
  • 38. Visualization Toolkit Models. Data; Scales: Response (categorical), Gender (categorical); Statistical Transform: Bin; Geometry Mapping: Data Interval; Positioning: Stacked; Coordinates: Euclidean, Polar; Aesthetic Mappings: Color: Response. Slides by Eugene Wu: http://www.mit.edu/~eugenewu/
  • 39. Visualization Toolkit Models. Data ⨝ DOM el. Array Utilities, Formatting, Shapes, Layout, Data Utilities, Color, Scales, Interaction. http://bost.ocks.org/mike/join/ Slides by Eugene Wu: http://www.mit.edu/~eugenewu/
  • 40. Retinal Variables (Encodings). Making Maps: A Visual Guide to Map Design for GIS by John Krygier and Denis Wood.
  • 41. Retinal Variables (Encodings). How many encodings? • Size • Color • Position http://www.nytimes.com/interactive/2012/06/11/sports/basketball/nba-shot-analysis.html
  • 42. Retinal Variables (Encodings). The color encoding is good, but what about size? http://fathom.info/dencity/
  • 43. Basic Charts. What do you want to show? A trend, a distribution, a relationship? Choose a chart that tells your story. http://labs.juiceanalytics.com/chartchooser.html
  • 44. Schematics. What is the minimum amount of detail needed to convey your message?
  • 45. Schematics. It’s possible to be too simple…
  • 46. Advanced Charts. How many variables are displayed here? What is (sort of) missing? http://en.wikipedia.org/wiki/File:Minard.png
  • 47. Advanced Charts
  • 48. Advanced Charts http://xkcd.com/657/large/
  • 49. Small Multiples http://flowingdata.com/2013/10/14/pizza-place-geography/
  • 50. Color. What type of relationship do you want to show? Qualitative or quantitative? Do they blend? ColorBrewer for maps and more! Brewer, Cynthia A., 2013. http://www.ColorBrewer.org
  • 51. Geographic Data Formats. Address (qualitative): 77 Mass. Ave, Cambridge, MA 02139. Geocoded (quantitative): Latitude: 42.359368°, Longitude: -71.094208° (N 42° 21' 33.7", W 71° 5' 39.1")
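The two geocoded notations on slide 51 (decimal degrees and degrees/minutes/seconds) describe the same point. A minimal sketch of the conversion, assuming seconds are rounded to one decimal place as on the slide; the function name `to_dms` is hypothetical.

```python
def to_dms(decimal_degrees: float, is_latitude: bool) -> str:
    """Format a decimal-degree coordinate as degrees, minutes, seconds."""
    if is_latitude:
        hemisphere = "N" if decimal_degrees >= 0 else "S"
    else:
        hemisphere = "E" if decimal_degrees >= 0 else "W"

    # Work in integer tenths of an arc-second so that rounding can never
    # produce a seconds value of 60.0.
    total_tenths = round(abs(decimal_degrees) * 36000)
    degrees, remainder = divmod(total_tenths, 36000)
    minutes, tenths = divmod(remainder, 600)
    return f"{hemisphere} {degrees}° {minutes}' {tenths / 10:.1f}\""


# The coordinates of 77 Mass. Ave from the slide.
print(to_dms(42.359368, is_latitude=True))    # N 42° 21' 33.7"
print(to_dms(-71.094208, is_latitude=False))  # W 71° 5' 39.1"
```

Both outputs match the values printed on the slide, which is a quick sanity check that a dataset's decimal and DMS columns agree.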
  • 52. Geographic Data Formats. Points: (lat, lon). Lines: (lat1, lon1), (lat2, lon2). Polygons: (lat1, lon1) ... (latN, lonN).
  • 53. Geographic Data Formats. Address (qualitative): 77 Mass. Ave, Cambridge, MA 02139. Geocoded (quantitative): Latitude: 42.359368°, Longitude: -71.094208° (N 42° 21' 33.7", W 71° 5' 39.1")
  • 54. Geocoding. Address to Latitude/Longitude.
  • 55. Geocoding. Coordinate systems: Map Projection. Sphere (3D) to Plane (2D).
  • 56. Geocoding. Coordinate systems: Map Projection. Some terminology: • UTM - Universal Transverse Mercator (cartesian) • UPS - Universal Polar Stereographic (cartesian, polar regions) • Datum - Reference point (origin). Standards: • World Geodetic System (WGS84) • North American Datum (NAD83) • UTM Zones • State Plane Coordinate System (SPCS). ** Be sure all data are using the same projection! ** http://en.wikipedia.org/wiki/Geodetic_system
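To make the sphere-to-plane idea concrete, here is a minimal sketch of one simple projection, the spherical Mercator. It is not one of the systems listed on the slide; it is used here only because it fits in a few lines. It assumes a spherical Earth with radius equal to the WGS84 semi-major axis, and the function name `spherical_mercator` is hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis in metres (sphere assumed)


def spherical_mercator(lat_deg: float, lon_deg: float) -> tuple[float, float]:
    """Project latitude/longitude in degrees to planar x, y in metres."""
    if abs(lat_deg) >= 90:
        raise ValueError("Mercator is undefined at the poles")
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = EARTH_RADIUS_M * lon
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y


# 77 Mass. Ave, using the coordinates from the earlier slides.
x, y = spherical_mercator(42.359368, -71.094208)
print(f"x = {x:.0f} m, y = {y:.0f} m")
```

Two datasets projected with different formulas or datums will not line up, which is the reason for the slide's warning to keep everything in the same projection.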
  • 57. Geocoding. Watch out for ambiguous addresses! e.g. Georgia
  • 58. Geocoding. Know where your geocoder defaults.
  • 59. Geocoding Tips • Look for locations accumulating more points than expected • Know where your software defaults at higher spatial levels (city centers, state centers, etc.) • Clean common typos before geocoding addresses
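The first tip can be checked mechanically. A minimal sketch, assuming geocoded results are available as (lat, lon) pairs: count how many records share each rounded coordinate and flag the ones above a threshold. The function name, the threshold, and the sample points are all hypothetical.

```python
from collections import Counter


def suspicious_locations(points, threshold, precision=4):
    """Return (location, count) pairs shared by at least `threshold` records.

    Coordinates are rounded to `precision` decimal places so that tiny
    floating-point differences do not hide a pile-up at one location.
    """
    counts = Counter(
        (round(lat, precision), round(lon, precision)) for lat, lon in points
    )
    return [(loc, n) for loc, n in counts.most_common() if n >= threshold]


# Hypothetical output of a geocoder: five records collapsed onto one point,
# which may be a default such as a city center.
points = [(42.3601, -71.0589)] * 5 + [(42.3594, -71.0942), (42.3662, -71.0621)]
for location, count in suspicious_locations(points, threshold=3):
    print(location, count)
```

A flagged location is not necessarily wrong (a large apartment building also accumulates points), so the output is a list to inspect rather than a list to delete.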
  • 60. Scale and Scope. Draw a scale on your map, or use common references. http://www.theatlantic.com/technology/archive/2012/08/the-apollo-11-landing-site-superimposed-on-a-baseball-diamond/261802/
  • 61. Scale and Scope. How much context do you need? http://en.wikipedia.org/ ProTip: wikimedia.org has amazing SVG maps!
  • 62. Scale and Scope. Convey order of magnitude. http://xkcd.com/radiation/
  • 63. Scale and Scope. Convey order of magnitude. http://www.informationisbeautiful.net/visualizations/million-lines-of-code/
  • 64. Overlays. An image is placed on top of a geographic map. http://visual.ly/tornado-tracks
  • 65. Overlays. An image is placed on top of a geographic map. • Google Earth KML • Google Maps API • Need to know the spatial coordinates of the image boundaries
  • 66. Logarithmic Scaling. When the z-axis has extreme variance, logarithmic scaling makes the data easier to display.
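A minimal sketch of what logarithmic scaling does to values that span several orders of magnitude, for example when mapping a z value to color intensity in the range 0 to 1. The function name `log_scale` and the sample values are hypothetical.

```python
import math


def log_scale(values, out_min=0.0, out_max=1.0):
    """Map positive values onto [out_min, out_max] using a base-10 log scale."""
    if any(v <= 0 for v in values):
        raise ValueError("logarithmic scaling needs strictly positive values")
    logs = [math.log10(v) for v in values]
    lo, hi = min(logs), max(logs)
    if hi == lo:
        # All values are identical, so there is no range to spread out.
        return [out_min for _ in values]
    span = out_max - out_min
    return [out_min + (value - lo) / (hi - lo) * span for value in logs]


# Hypothetical counts per map cell, spanning six orders of magnitude.
counts = [1, 10, 100, 1000, 1_000_000]
linear = [c / max(counts) for c in counts]
print([round(v, 3) for v in linear])
print([round(v, 3) for v in log_scale(counts)])
```

On the linear scale every cell except the largest is squeezed into the bottom 0.1% of the range; on the log scale each factor of ten gets an equal step. Zero or negative values need separate handling, which this sketch deliberately rejects.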
  • 67. Text and Labeling. What is needed and where has evolved. http://google-latlong.blogspot.com/2011/07/evolving-look-of-google-maps-redux.html
  • 68. Text and Labeling. Use text data as your marker instead of a label. http://names.mappinglondon.co.uk/
  • 69. Points of interest. Tooltip or overlay box to display attributes.
  • 70. Points of interest. Colors help the eye define polygons while still displaying all the data! http://livehoods.org/
  • 71. Routing. The London Tube Map is a masterpiece!
  • 72. Routing. Simple color overlays and markers. (What about colorblind viewers?)
  • 73. Routing. Do we even need the map? http://www.aaronkoblin.com/work/flightpatterns/
  • 74. Routing. Complex encodings http://casualdata.com/senseofpatterns/
  • 75. Effective Distance. What distance do we care about? Time, geographic, number of transfers? 20 min walk. 20 min drive. 20 min subway. http://www.mapnificent.net/
  • 76. Spatial Patterns. How can we reveal different spreading behaviors? http://www.historyofinformation.com/index.php?category=Statistics+%2F+Demography http://mobs.soic.indiana.edu/projects/contagion-models-andadaptive-behavior
  • 77. Spatial Patterns. Do we need backgrounds, scales, context? http://cargocollective.com/coopersmith#1327371/Nike-Plus-Visualization
  • 78. Geographic Data. Good start, but what about Tufte’s principles? How could we improve this? What encodings could we add? http://blog.echen.me/2012/07/06/soda-vs-pop-with-twitter/
  • 79. Geographic Data. Seriously, are you still showing me dot maps? YES! http://ny.spatial.ly/
  • 80. Geographic Data. Public data, from space! natronics.github.com/ISS-photo-locations/
  • 81. Geographic Data. Eric Fischer is the king of mapping dots. https://www.mapbox.com/labs/twitter-gnip/locals/#5/38.000/-95.000 https://www.mapbox.com/blog/mapping-millions-of-dots/ http://www.flickr.com/photos/walkingsf/sets/72157627140310742/ http://demographics.coopercenter.org/DotMap/index.html
  • 82. Geographic Divisions. Choose a division that fits your analysis: census tract, zip code, precinct.
  • 83. Geographic Data. Area is misleading. 2008 presidential election http://www-personal.umich.edu/~mejn/election/2008/
  • 84. Geographic Data. Cartograms rescale area to represent data. 2008 presidential election http://www-personal.umich.edu/~mejn/election/2008/
  • 85. Geographic Data. (Yes, of course there is a dot map for this too) http://demographics.coopercenter.org/DotMap/congress.html
  • 86. Geographic Data. GDP http://www-personal.umich.edu/~mejn/cartograms/
  • 87. Geographic Data. Child Mortality http://www-personal.umich.edu/~mejn/cartograms/
  • 88. Changing variables http://www.stonebrowndesign.com/boston-t-time.html
  • 89. Provide context. Even if you have never been to Paris, you know how big your country is. http://persquaremile.com/2011/01/18/if-the-worlds-population-lived-in-one-city/
  • 90. Just population maps… Everything correlates with population density! http://xkcd.com/1138/
  • 91. !91 When maps shouldn’t be maps “But sometimes the reflexive impulse to map the data can make you forget that showing the data in another form might answer other — and sometimes more important — questions.” - Matthew Ericson http://www.ericson.net/content/2011/10/when-maps-shouldnt-be-maps/
  • 92. Maps for non-spatial data. Show hierarchy and proportion. http://xkcd.com/802/
  • 93. Maps for non-spatial data. Show hierarchy and proportion. http://bigthink.com/strange-maps/579-a-1939-map-of-physics
  • 94. When maps shouldn’t be maps • When the interesting patterns aren’t geographic patterns • When the geographic data is more effective for analysis http://www.ericson.net/content/2011/10/when-maps-shouldnt-be-maps/
  • 95. When maps shouldn’t be maps. Should this be a better map or not a map at all? http://life.mappinglondon.co.uk/
  • 96. Relationships. Same system, different intent. http://www.washingtontimes.com/blog/watercooler/2010/jul/28/republicans-release-new-more-complexobamacare-cha/ http://stevemackley.com/2009/08/healthcare-graphic/
  • 97. Network Visualization. Problem: Networks are high dimensional objects that must be visualized in 2 dimensional space. The same network has many visualizations. Choose a mapping that gives insight into the structure of your data. Embedded slides (“Using just visualization”, Thursday, June 23, 2011): http://jponnela.com/web_documents/icpsr_visualization.pdf http://upload.wikimedia.org/wikipedia/commons/d/d2/Internet_map_1024.jpg
  • 98. Network Flows. Nodes and edges are encoded with color, size, and direction. http://www.nytimes.com/imagepages/2011/10/22/opinion/20111023_DATAPOINTS.html?ref=sunday-review
  • 99. Networks. Networks can be stunningly effective if presented correctly. Watch Eric Berlow explain how to interpret this network in a great TED Talk. http://www.ted.com/talks/eric_berlow_how_complexity_leads_to_simplicity.html http://www.nytimes.com/2010/04/27/world/27powerpoint.html
  • 100. Networks. Circular layouts show ego connections. http://chrisharrison.net/index.php/Visualizations/ClusterBall
  • 101. Networks. Spatial networks have constrained topology and different statistical properties. Eric Fischer uses Twitter data to map important roads. http://www.flickr.com/photos/walkingsf/6747484741/
  • 102. Networks. Statistically similar networks can have strikingly different topologies. Using just metrics: Network A (Barabasi-Albert), Network B (Watts-Strogatz). http://jponnela.com/web_documents/icpsr_visualization.pdf
  • 103. Networks. Shells show communities of egos at a glance. http://www.d3.do/labs/circleoftrust/
  • 104. Graph Drawing Algorithms • Spring-force layout (communities) • Spectral Layout • Orthogonal Layout • Tree Layout (hierarchical networks) • Layered Graph Drawing • Arc Diagram • Circular Layout (good for ego network) • Dominance Drawings http://en.wikipedia.org/wiki/Graph_drawing
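Of the layouts listed on slide 104, the circular layout is the simplest to write down: nodes are spaced evenly around a circle. A minimal sketch with a hypothetical function name and node labels; the other layouts (spring-force, spectral, and so on) need considerably more machinery and are not shown.

```python
import math


def circular_layout(nodes, radius=1.0):
    """Place nodes evenly on a circle and return {node: (x, y)}."""
    n = len(nodes)
    return {
        node: (
            radius * math.cos(2 * math.pi * i / n),
            radius * math.sin(2 * math.pi * i / n),
        )
        for i, node in enumerate(nodes)
    }


# Hypothetical ego network: the order of the list decides the position.
for node, (x, y) in circular_layout(["ego", "friend 1", "friend 2", "friend 3"]).items():
    print(f"{node}: ({x:.2f}, {y:.2f})")
```

Because position carries no information beyond the ordering, the choice of node order (by community, by degree, by time) is where the design decision lies.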
  • 105. Networks. Citation networks have a built-in temporal order. http://www.autodeskresearch.com/projects/citeology
  • 106. Networks. Don’t forget about adjacency matrices! [Figure: adjacency matrix A] http://en.wikipedia.org/wiki/Adjacency_matrix http://www.cs.purdue.edu/homes/dgleich/demos/matlab/spectral/spectral.html
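A minimal sketch of building the adjacency matrix A from an edge list, assuming an undirected, unweighted network. The node labels and edges are hypothetical.

```python
def adjacency_matrix(nodes, edges):
    """Return a symmetric 0/1 matrix; row and column order follows `nodes`."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1  # undirected: mirror each edge
    return matrix


# Hypothetical four-node network.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("C", "D")]
for label, row in zip(nodes, adjacency_matrix(nodes, edges)):
    print(label, row)
```

Drawn as a grid of filled and empty cells, the matrix avoids the edge crossings of a node-link diagram, and reordering rows and columns can make communities show up as blocks.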
  • 107. Bipartite Networks. Can show the cross-type network or the dual graphs. Example node types: Actors, Movies.
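One reading of the "dual graphs" on slide 107 is the one-mode projection: connect two actors whenever they appear in the same movie. A minimal sketch under that assumption, with hypothetical movie and actor names; the edge weight is the number of shared movies.

```python
from collections import defaultdict
from itertools import combinations


def project_bipartite(memberships):
    """Project a movie -> actors mapping onto an actor-actor network.

    Returns {(actor_a, actor_b): number_of_shared_movies}, with each pair
    stored in alphabetical order.
    """
    weights = defaultdict(int)
    for actors in memberships.values():
        for a, b in combinations(sorted(set(actors)), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical cast lists.
movies = {
    "Movie 1": ["Ann", "Bob", "Cy"],
    "Movie 2": ["Ann", "Bob"],
    "Movie 3": ["Cy", "Dee"],
}
for pair, shared in sorted(project_bipartite(movies).items()):
    print(pair, shared)
```

Swapping the roles (an actor -> movies mapping) gives the other projection, a movie-movie network linked by shared cast.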
  • 108. Network Visualization. Node and edge attributes show important relationships (sometimes)… Population density problem! http://mashable.com/2010/12/13/facebook-members-visualization/
  • 109. Mashups make stories. The whole of two data sets is greater than the sum of its parts. Now it tells a geopolitical story! http://woj.com/False-Color-Facebook-NASA-Mashup.png
  • 110. Mashups make stories. Interacting, overlapping networks and systems. http://www.globia.org
  • 111. Mashups make stories. Does this visualization best convey the claim? “An image of regional communication diversity and socioeconomic ranking for the UK. We find that communities with diverse communication patterns tend to rank higher (represented from light blue to dark blue) than the regions with more insular communication. This result implies that communication diversity is a key indicator of an economically healthy community.” http://www.sciencemag.org/content/328/5981/1029.abstract
  • 112. Mashups make stories. Layering helps make correlations accessible. http://project.wnyc.org/stop-frisk-guns/
  • 113. Time. How do you show changes in order/rank over time? [Figure 4.1: Evolution of the ranking of countries based on ECI between 1964 and 2008, with country codes ranked 1 to 101 at each end and years 1964 to 2008 along the x-axis. From “Mapping Paths to Prosperity”, p. 81.] http://www.chidalgo.com/Papers/HidalgoHausmann_DAI_2008.pdf
  • 114. Time. The x-axis is reserved for left and right political meanings; time is moved to the y-axis. http://friggeri.net/research/senate/
  • 115. Time. Aligning different units of time makes for easier comparison. http://www.vijayp.ca/blog/2012/06/colours-in-movie-posters-since-1914/
  • 116. Streamgraphs. Show relative proportion over time. What is lost? http://www.nytimes.com/interactive/2008/02/23/movies/20080223_REVENUE_GRAPHIC.html https://euro2012.twitter.com/
  • 117. Know your audience. Who is watching? What do they need to know? The Weather Channel vs. New York Times. http://understandinggraphics.com/visualizations/communicatingcritical-information-hurricane-irene/
  • 118. Complexity. “Measures of Complexity: a non-exhaustive list” 1. Difficulty of description. Typically measured in bits. • Information; • Entropy; • Algorithmic Complexity or Algorithmic Information Content; • Minimum Description Length; • Fisher Information; • Renyi Entropy; • Code Length (prefix-free, Huffman, Shannon-Fano, error-correcting, Hamming); • Chernoff Information; • Dimension; • Fractal Dimension; • Lempel-Ziv Complexity. 2. Difficulty of creation. Typically measured in time, energy, dollars, etc. • Computational Complexity; • Time Computational Complexity; • Space Computational Complexity; • Information-Based Complexity; • Logical Depth; • Thermodynamic Depth; • Cost; • Crypticity. 3. Degree of organization. This may be divided up into two quantities: a) Difficulty of describing organizational structure, whether corporate, chemical, cellular, etc.; b) Amount of information shared between the parts of a system as the result of this organizational structure. a) Effective Complexity: • Metric Entropy; • Fractal Dimension; • Excess Entropy; • Stochastic Complexity; • Sophistication; • Effective Measure Complexity; • True Measure Complexity; • Topological epsilon-machine size; • Conditional Information; • Conditional Algorithmic Information Content; • Schema length; • Ideal Complexity; • Hierarchical Complexity; • Tree subgraph diversity; • Homogeneous Complexity; • Grammatical Complexity. b) Mutual Information: • Algorithmic Mutual Information; • Channel Capacity; • Correlation; • Stored Information; • Organization. In addition to the above measures, there are a number of related concepts that are not quantitative measures of complex… Source: Gell-Mann, Murray and Seth Lloyd. "Information measures, effective complexity, and total information." Complexity 2 (1996): 44-52.
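As an illustration of one "difficulty of description" measure from the list on slide 118, here is a minimal sketch of Shannon entropy in bits, estimated from the symbol frequencies of a sequence. The function name and the sample strings are hypothetical, and treating observed frequencies as probabilities is an assumption of the sketch.

```python
import math
from collections import Counter


def shannon_entropy_bits(symbols) -> float:
    """Estimate Shannon entropy (bits per symbol) from observed frequencies."""
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = sum over symbols of p * log2(1 / p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


print(shannon_entropy_bits("aaaa"))  # one symbol only: 0.0 bits
print(shannon_entropy_bits("abab"))  # two equally likely symbols: 1.0 bit
print(shannon_entropy_bits("abcd"))  # four equally likely symbols: 2.0 bits
```

The measure depends only on how often each symbol occurs, not on the order, so "abab" and "aabb" score the same. That limitation is one reason the list contains so many alternatives.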
  • 119. Complexity. How do you visualize a relationship? Labels: Hi, YouTube, Like, Love.
  • 120. Complexity. Can you scan it like a barcode? Annotations: “Bad boyfriend”, “Technology shift”.
  • 121. Complexity. When “at a glance” is not enough. [Figure 2: Network representation of the 1998–2000 product space. Legend: node color (product category), link color (proximity: >0.65, >0.55, >0.4, <0.4), node size (millions of dollars).] “most poor countries can only reach the levels of development enjoyed by rich countries if they are able to jump distances that are quite infrequent in the historical record (Figure 2). In other words, the “stairway to heaven” presents some very tall steps.” http://www.chidalgo.com/Papers/HidalgoHausmann_DAI_2008.pdf
  • 122. Inspiration: Senseable City Lab http://senseable.mit.edu/
  • 123. Inspiration: Senseable City Lab. What are some points of critique? http://senseable.mit.edu/
  • 124. Inspiration: New York Times (Thursday, October 18, 12)
  • 125. Inspiration: New York Times. Schematics, small multiples, quantitative color.
  • 126. Inspiration: New York Times. Diverging colors, small multiples, inserts.
  • 127. Inspiration: New York Times. Map that isn’t a map.
  • 128. Inspiration: New York Times. High data density!
  • 129. Inspiration: Nicholas Felton Annual Reports http://feltron.com/
  • 130. Inspiration: Nicholas Felton Annual Reports http://feltron.com/
  • 131. Inspiration: Scientific plots. Understanding Road Usage Patterns in Urban Areas (P. Wang, T. Hunter, A. Bayen, K. Schechtner, M.C. González), In Scientific Reports, volume 2, 2012.
  • 132. Inspiration: Beautiful Infographics http://giorgialupi.net/
  • 133. Inspiration: Beautiful Infographics http://giorgialupi.net/
  • 134. Inspiration: Karl Gude http://karlgude.com/
  • 135. Huge list of resources: Matplotlib (Python plotting); ggplot2 (R and Python); Processing; Unfolding Maps (maps for Processing); D3.js; OpenLayers (JavaScript drawing on maps); WebGL; Google Maps API; OpenStreetMap API; CloudMade (OSM Style Editor); Quantum GIS (QGIS); TileMill (Custom Map Tiles); ColorBrewer; COLOURlovers (Crowdsourced color palettes); JunkCharts (commentary on bad visualizations); Flowing Data (tutorials, commentary, and more); WTFviz (collection of bad examples); Information is Beautiful (data visualizations); InfoAesthetics (data visualization blog)
  • 136. Many thanks to: Tom Crawford (Visual Thinker, Speaker Coach, App Designer, Educator) http://www.viznetwork.com/about.html and Karl Gude (Designer, Story Teller, Journalist, Educator) http://karlgude.com/
  • 137. Visualizing (BIG) Data. Thank you! Questions? jltoole@mit.edu, @jamesonthecrow, humnetlab.mit.edu