Intro to Data Vis for the Humanities, Nov 2013


This is an extensive but high-level look at the principles, methods, and tools of data visualisation for humanities research, illustrated with a couple of case studies.

Published in: Education, Technology

  1. 1. Data Visualisation in the Digital Arts and Humanities: Tools, Methods & Techniques to Put Your Research Data to Work. Shawn Day, Queen’s University Library
  2. 2. Objective ‣ To appreciate the rich variety of techniques and tools available to digital humanities scholars for data visualisation and analysis. 
 
 This workshop will introduce the varied uses of data visualisation in the humanities through examples and case studies, and will hopefully inspire you to some hands-on fun.
  3. 3. The beginning
 of a conversation …
  4. 4. Upcoming Seminars and Workshops ‣ 18 November - A Survey of Digital Humanities ‣ 2 December - Engaging Your Audience with Your Research Data (Exhibit) ‣ 9 December - Telling Stories with Data – Collections Visualisation for Arts and Humanities Scholars (OMEKA) ‣ January - Digital Project Management ‣ February - Hands On Workshop – Data Visualisation for Presentation ‣ February - Social Scholarship – Tools for Collaborative Research ‣ March - Data Visualisation for Textual and Spatial Analysis ‣ More to come: http://qubdh.co.uk
  5. 5. Agenda ‣ Introduction ‣ What is Data Visualisation? ‣ Why Visualise Data? ‣ Case Studies ‣ Things to Visualise ‣ Ways to Visualise ‣ Tools for Visualisation
  6. 6. Breakpoint ‣ One of the keys to good visualization is understanding what your immediate (and longer term) goals are. ‣ Are you visualizing data to understand what’s in it, or are you trying to communicate meaning to others? ‣ You - Visualisation for Data Analysis ‣ Share with Others - Visualisation for Presentation
  7. 7. Why Visualise?
 The Basics ‣ Open Up Large Datasets ‣ Increase Density of Observable Data ‣ Reduce Complexity ‣ Aestheticise Data ‣ Illustrate an Interpretation ‣ Make an Argument
  8. 8. Why Visualise? 
 The Psychology and Physiology ‣ Bypass language centres to tap directly into the visual cortex; ‣ Leverage ability to recognise patterns - what they call visual sense-making; ‣ Powerful graphics engines now allow for live data processing and sophisticated animations and interactive research environments.
  9. 9. Why Visualise?
 From a Data Perspective ‣ Can link different formats ‣ Can share more easily with others ‣ Can see new meanings and connections ‣ Sort and re-organize in automated fashion ‣ Manage larger amounts of information ‣ Visualise your results
  10. 10. Why Visualise?
 For Humanities Research ‣ Work with new data to create new knowledge ‣ Explore data to discover things that used to be unknown, unknowable or impractical to know ‣ Take a new perspective on the familiar to reveal previously hidden insights
  11. 11. Data Visualisation has
 definitely hit the big-time ‣ Guardian Awards ‣ New York Times ‣ Why?
  12. 12. Visualise New Information Tourists vs Locals, Eric Fischer, 2010 - Flickr
  13. 13. Red - Tourists; Blue - Locals; Yellow - NA
  14. 14. Areas of Interest
  15. 15. Crowdsourcing
  16. 16. Visualising New Information
  17. 17. The Familiar
 through New
 Eyes The London Times Atlas
  18. 18. Joanna Kamradt and Christian Tate
  19. 19. How Could You Use Data Analysis? ‣ “In the Lab” - for your own analysis ‣ Online as part of collaborative groups ‣ Through dissemination for extension of own work: crowdsourcing ‣ Others?
  20. 20. Case Study: The Time Strip
  21. 21. Visualisation Objective ‣ Exploring the ‘ordinary’ lives of rural pioneers/farmers in nineteenth century Ontario
  22. 22. Canada
  23. 23. Ontario
  24. 24. South Western Ontario
  25. 25. Farm Journal Raw Materials ‣ 100s of pages ‣ Varying hands ‣ Varying quality ‣ Columns ‣ No Context. William Sunter Farm Diary, 1858
  26. 26. Example: Medical Diary Medical Diary by BlueChillies
  27. 27. Example: History Flow History flow by Martin Wattenberg and Fernanda Viegas
  28. 28. Mechanics of the Process ‣ Generate word frequency (Voyant, TAPoR) ‣ Isolate known farm activities (NLP - LanguageWare) ‣ Collocate to link activity references to time, duration, and resources (Voyant)
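The first and third steps on this slide (word frequency, then collocation around a known activity term) can be sketched in a few lines of Python. This is a minimal illustration of the general technique, not a reproduction of what Voyant, TAPoR, or LanguageWare do internally. The function names, the window size, and the sample diary entries are all invented for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def word_frequencies(entries):
    """Count how often each word occurs across all diary entries."""
    counts = Counter()
    for entry in entries:
        counts.update(tokenize(entry))
    return counts


def collocates(entries, keyword, window=3):
    """Count words that appear within `window` tokens of `keyword`."""
    counts = Counter()
    for entry in entries:
        tokens = tokenize(entry)
        for i, token in enumerate(tokens):
            if token == keyword:
                lo = max(0, i - window)
                # Words before and after the keyword, excluding the keyword itself.
                neighbours = tokens[lo:i] + tokens[i + 1:i + 1 + window]
                counts.update(neighbours)
    return counts


# Invented sample entries for illustration only (not taken from the Sunter diary).
entries = [
    "Haying in the north field all day",
    "Rain in the morning, haying after noon",
    "Drew wood and mended the fence",
]

freq = word_frequencies(entries)
near_haying = collocates(entries, "haying", window=2)
```

Here `freq.most_common(10)` would give a first word list to scan for activity terms, and `near_haying` shows which words sit close to one of them. Linking an activity to a time or duration would start from collocates like these.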
  29. 29. The Result/ New Patterns
  30. 30. The Result/ New Patterns ‣ Less time haying ‣ The impact of technology ‣ More tasks faster
  31. 31. How Else Could this be done?
  32. 32. What is the Value of this Visualisation? ‣ Easier to compare over intervals ‣ Multiple vectors with greater granularity in a compressed space ‣ The challenge is to find rich enough source materials to yield substantive datasets
  33. 33. Case Study: The Tree Map
  34. 34. Example: Newsmap http://newsmap.jp/
  35. 35. Example: Panopticon Ben Shneiderman and Hard Drive Space
  36. 36. Example: Bachelor’s Degrees 2011 Ben Schmidt, 2013 http://benschmidt.org/Degrees/2011Overview/
  37. 37. Case Study: Occupations of Politicians ‣ What are we studying? • Self-declared occupations of politicians ‣ Why? • What bias might they bring to their job? ‣ How? • Visualising past occupation and mapping to political platform of party affiliated with
  38. 38. Occupations of MPs in the 2nd Canadian Parliament
  39. 39. Occupations of MPs in the 37th Canadian Parliament
  40. 40. Occupations of TDs in the 30th Dáil Éireann
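A tree map sizes each rectangle in proportion to a count, such as the number of members declaring a given occupation. The slides do not say which layout algorithm was used. The sketch below assumes the simplest one: a single "slice" pass that cuts a rectangle into vertical strips. The function name and the counts are invented for illustration.

```python
def slice_layout(values, x, y, width, height):
    """Split a rectangle into vertical strips with areas proportional to values.

    `values` maps a label to a non-negative number. Returns a dict mapping
    each label to an (x, y, width, height) tuple, largest value first.
    """
    total = sum(values.values())
    if total <= 0:
        raise ValueError("values must sum to a positive number")
    rects = {}
    offset = x
    for label, value in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
        strip_width = width * value / total
        rects[label] = (offset, y, strip_width, height)
        offset += strip_width
    return rects


# Invented counts for illustration, not the figures behind the slides.
occupations = {"Lawyer": 50, "Farmer": 30, "Merchant": 20}
rects = slice_layout(occupations, 0, 0, 100, 60)
```

A nested tree map (party, then occupation, for example) would apply the same split again inside each strip, alternating between vertical and horizontal cuts.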
  41. 41. The Result/ New Patterns ‣ The emergence of the professional politician with no private sector experience ‣ Occupational continuity across changes in governing party ‣ http://dev.dho.ie/~sday/dail/index.html
  42. 42. How Else Could this be Done?
  43. 43. How Else Could this be Done?
  44. 44. The Value of Data Vis for Analysis ‣ New ways of presenting allow new ways of seeing ‣ Hidden patterns become evident ‣ Suggest other hypotheses to test for ‣ Good research raises more questions than answers
  45. 45. People demanding more… ‣ Interactivity ‣ Involvement ‣ Action ‣ Participation ‣ Web 2.0 … 3.0 ….
  46. 46. General Steps in Data Vis for DH 1. Discovery / Acquisition 2. Cleaning / ‘Munging’ 3. Analysis / Exploratory Vis 4. Presentation
  47. 47. Types of Data to Visualise ‣ Audio Data ‣ Categorical Data ‣ Cartographic Data ‣ Collections ‣ Image Data • Still • Moving ‣ Metadata ‣ Multimedia Data ‣ Network Data • Social • Other ‣ Numerical Data ‣ Temporal Data ‣ Textual Data • Narrative • Qualitative ‣ ????
  48. 48. Audio Data ‣ Spectrogram ‣ Wave forms ‣ Notes ‣ Frequency ‣ Beats
  49. 49. Audio Data ‣ What does sound look like? Visualisation of "Canada is Really Big" by The Arrogant Worms
 http://www.sonicvisualiser.org/
  50. 50. Audio Data: The Shape of Song ‣ http://www.turbulence.org/Works/song/index.html ‣ Measuring Musical Patterns using Translucent Arcs ‣ Repetition. Examples: Philip Glass, Candyman 2; Madonna, Like a Prayer
  51. 51. Audio Data: IBM ‘Glass Engine’ http://www.philipglass.com/glassengine/
  52. 52. Categorical Data ‣ Data is grouped into categories based on a qualitative trait ‣ The resulting data represents the labels of these groups ‣ Nominal, Ordinal and/or Binary
  53. 53. Cartographic Data ‣ Communicate spatial information
  54. 54. Cartographic Visualisation
  55. 55. Cartographic Visualisation http://maps.stamen.com/watercolor/#13/53.3355/-6.2181
  56. 56. Digital Collections ‣ Collections of data, images, movies, sound … etc • Visualise the object in context as part of collection • Represent the structure of the collection
  57. 57. Digital Collection Visualisation Google Art Project: Visualising Museum Collections
  58. 58. Digital Still Image Data ‣ Colour ‣ Texture ‣ Shape ‣ Content ‣ Format ‣ Metadata ‣ Luminosity/Hue/Saturation/Range
  59. 59. Digital Moving Image Data ‣ Adding Data on: • Narrative • Length • Frame rate • Sound/Image • Key Frames • Storyboard
  60. 60. Metadata
  61. 61. Numerical/Quantitative Data ‣ Does anyone really need me to tell them about this? • Analysed using statistical methods • displayed using tables, charts, histograms and graphs…
  62. 62. Social Network Data ‣ Nodes and Edges ‣ Representing relations between objects, and quantifying and qualifying them
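As a rough sketch of what "nodes and edges" look like as data, the Python below stores an undirected, weighted network as a dictionary of neighbours and derives two simple measures. The names, the people, and the weights are invented for illustration. Tools such as Gephi work from the same kind of edge list.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected weighted graph from (a, b, weight) tuples.

    Repeated edges between the same pair add their weights together.
    Assumes there are no self-loops (a == b).
    """
    graph = defaultdict(dict)
    for a, b, weight in edges:
        graph[a][b] = graph[a].get(b, 0) + weight
        graph[b][a] = graph[b].get(a, 0) + weight
    return graph


def degree(graph, node):
    """Number of distinct neighbours: how many relations a node has."""
    return len(graph[node])


def strength(graph, node):
    """Sum of edge weights: how intense those relations are in total."""
    return sum(graph[node].values())


# Invented correspondence counts for illustration only.
edges = [("Ann", "Ben", 3), ("Ann", "Cara", 1), ("Ben", "Cara", 2), ("Ann", "Ben", 1)]
graph = build_graph(edges)
```

Degree quantifies a relation (how many ties), while the weight on each edge is one way to qualify it (how strong each tie is).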
  63. 63. Temporal Data ‣ Show changes over time ‣ Show temporal clusters
  64. 64. Different Ways of Seeing Time http://www.itc.nl/personal/kraak/ Xerox PARC, Stuart K. Card, George G. Robertson, Jock D. Mackinlay
  65. 65. Combining Time and Space http://www.edwardtufte.com/tufte/posters
  66. 66. Quantitative Textual Visualisation
  67. 67. Textual - Qualitative ‣ Textual attributes graphically represented • Frequency • Collocation • Adjacency
  68. 68. Textual - Narrative
  69. 69. Time, Space, Narrative: MythEngine
  70. 70. Time, Space, Narrative: MythEngine http://www.bbc.co.uk/blogs/researchanddevelopment/2010/03/the-mythology-engine-represent.shtml
  71. 71. General Steps in Data Vis for DH 1. Discovery / Acquisition 2. Cleaning / ‘Munging’ 3. Analysis / Exploratory Vis 4. Presentation
  72. 72. Step 1 Discovery / Acquisition
  73. 73. An Iterative Process ACQUIRE → PARSE → FILTER → MINE → REPRESENT → REFINE → INTERACT
  74. 74. Visualizing What? ‣ Basic types of content that we are used to dealing with: • Text • Numbers • Image • Video ‣ Other, more “complex” stuff: • Relations, connections, links - a genealogy • Time and space coords - the path of migratory birds • Animations – a piece of courseware • 3D models – the plan of your house
  75. 75. Acquisition: Junar ‣ http://www.junar.com http://goo.gl/oexnB
  76. 76. Acquisition: Public Data Sources ‣ CSO: Data Formats ‣ The Data Hub: Linked Data
  77. 77. Acquisition: Public Data Sources
  78. 78. Cleaning / Munging (Normalisation, Format Conversion) ‣ Tools: • Data Wrangler • Google Refine • Mr. Data Converter ‣ Data Wrangler • Does simple split, clear, and fold/unfold transforms on data • See example --> Data and Script ‣ Google Refine • Works with larger datasets
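The fold/unfold transforms mentioned above reshape a table between a "wide" form (one column per variable) and a "long" form (one row per observation). The Python below is a sketch of that general idea under assumed names. It is not Data Wrangler's own implementation, and the column names and numbers are invented.

```python
def fold(rows, id_key, value_columns, key_name="variable", value_name="value"):
    """Turn wide rows into long rows: one output row per (row, column) pair."""
    long_rows = []
    for row in rows:
        for column in value_columns:
            long_rows.append(
                {id_key: row[id_key], key_name: column, value_name: row[column]}
            )
    return long_rows


def unfold(long_rows, id_key, key_name="variable", value_name="value"):
    """Inverse of fold: gather long rows back into one wide row per id."""
    wide = {}
    for row in long_rows:
        record = wide.setdefault(row[id_key], {id_key: row[id_key]})
        record[row[key_name]] = row[value_name]
    return list(wide.values())


# Invented figures for illustration: days spent on each task per year.
wide_rows = [
    {"year": "1858", "haying": 12, "ploughing": 20},
    {"year": "1859", "haying": 9, "ploughing": 22},
]
long_rows = fold(wide_rows, "year", ["haying", "ploughing"])
```

Many charting tools expect the long form, so folding is often the last cleaning step before visualisation.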
  79. 79. Open Data/Linked Data Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
  80. 80. Munging Tool: Data Wrangler ‣ http://vis.stanford.edu/wrangler/app/
  81. 81. Cleaning Exercise
  82. 82. Alternate: Google Refine ‣ http://code.google.com/p/google-refine/
  83. 83. Alternate: Mr Data Converter ‣ http://shancarter.com/data_converter/
  84. 84. Now You’ve Got Data ... ‣ What’s Next? ‣ Data Visualisation in the Analysis Process ‣ Data Visualisation for Presentation
  85. 85. General Steps in Data Vis for DH 1. Discovery / Acquisition 2. Cleaning / ‘Munging’ 3. Analysis / Exploratory Vis 4. Presentation
  86. 86. Breakpoint ‣ Are you visualizing data to understand what’s in it, or are you trying to communicate meaning to others? ‣ You - Visualisation for Data Analysis ‣ Share with Others - Visualisation for Presentation
  87. 87. Google NGram Viewer ‣ Examine word frequency in digitised books ‣ Currently about 4% of books ever published ‣ In English, Chinese, French, German, Hebrew, Russian, and Spanish ‣ Changes in word usage ‣ Trends
  88. 88. Google NGram Viewer http://books.google.com/ngrams/graph
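The measure behind a word-usage trend line is a relative frequency: how often a term occurs in a given year, divided by all the words from that year. The Python below is a small sketch of that calculation on an invented three-document corpus. It does not call Google's service, and the function name is hypothetical.

```python
from collections import Counter


def yearly_relative_frequency(documents, term):
    """Return {year: occurrences of term / total tokens in that year}.

    `documents` is an iterable of (year, text) pairs. Tokenisation is a
    plain whitespace split on lowercased text, which is a simplification.
    """
    term = term.lower()
    term_counts = Counter()
    totals = Counter()
    for year, text in documents:
        tokens = text.lower().split()
        totals[year] += len(tokens)
        term_counts[year] += tokens.count(term)
    return {
        year: term_counts[year] / totals[year]
        for year in sorted(totals)
        if totals[year]
    }


# Invented mini-corpus for illustration only.
documents = [
    (1900, "the railway reached the town"),
    (1900, "a new railway line"),
    (1950, "the motorway replaced the old road"),
]
trend = yearly_relative_frequency(documents, "railway")
```

Dividing by the yearly total matters because far more text survives from some years than from others, so raw counts alone would mostly show the size of the corpus.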
  89. 89. The Value of Data Vis for Analysis ‣ New ways of presenting allow new ways of seeing ‣ Hidden patterns become evident ‣ Suggest other hypotheses to test for ‣ Good research raises more questions than answers
  90. 90. Infographics?
  91. 91. Data Analysis Principles 1. Process is a Way of Thinking, not a Substitute for Thinking 2. Data needs to be considered and reported in Context 3. Look Before you Leap - Get to Know Your Data 4. Question Everything - Collection Process, Bias, etc. 5. Do a Gut Check 6. Coincidence is Not the Same as Causality 7. Just Because Data Exists Doesn’t Mean it’s Relevant. Source: Fern Halper - Seven Guiding Principles
  92. 92. Analysis / Exploratory Visualisation
  93. 93. Cleaning & Structuring: Google Fusion Tables
  94. 94. Orange http://orange.biolab.si/
  95. 95. IBM ManyEyes
  96. 96. Text Analysis: Voyant http://voyeurtools.org
  97. 97. Gephi: Analysis and Discovery of Networks
  98. 98. Where to Keep up with the Community ‣ Highbrow: http://osc.hul.harvard.edu/highbrow ‣ http://chronicle.com/blogs/profhacker ‣ Flowing Data: http://flowingdata.com ‣ Perceptual Edge: http://www.perceptualedge.com ‣ Info is Beautiful: http://www.informationisbeautiful.net ‣ Visualising Data: http://www.visualisingdata.com ‣ Infosthetics: http://infosthetics.com ‣ Datavisualisation.ch: http://datavisualization.ch ‣ Dig Hum Specialist: https://dhs.stanford.edu/the-digitalhumanities-as
  99. 99. New Perspectives on Old Data: Presenting Your Data Visually
  100. 100. Objectives ‣ Consider best practices in sharing research findings using visualisation tools; ‣ Identify and judge between publicly available tools to create and deploy humanities visualisation research products; ‣ Consider data visualisation as part of a larger research discussion.
  101. 101. General Steps in Data Vis for DH ‣ Discovery / Acquisition ‣ Cleaning / ‘Munging’ ‣ Analysis / Exploratory Vis ‣ Presentation
  102. 102. Academic Visualisation?
 There are lots of published papers out there
 http://www.autodeskresearch.com/projects/citeology
  103. 103. The Life of an Idea through Citations
  104. 104. Data Visualisation Lessons from Tufte ‣ Show the Data ‣ Provoke Thought about the Subject at Hand ‣ Avoid Distorting the Data ‣ Present Many Numbers in a Small Space ‣ Make Large Datasets Coherent ‣ Encourage Eyes to Compare Data ‣ Reveal Data at Several Levels of Detail ‣ Serve a Reasonably Clear Purpose ‣ Be Closely Integrated with Statistical and Verbal Descriptions of the Dataset
  105. 105. What Visual Techniques Exist? ‣ Connecting your data with the right visualisation ‣ What is your message? ‣ How do we know what we might use? ‣ Start with your Exploratory/Research/Analytical Environment ‣ How do visuals fit into your narrative?
  106. 106. What Visual Techniques Exist? Connecting your data with the right visualisation. Visual Everything
  107. 107. Structured Data Presentation Tools (a tiny subset) ‣ Webservices • Temporal: TimeFlow • Google Fusion Tables • Textual, Spatial and Numeric: Many Eyes • Temporal: Dipity • Infographics: Visual.ly ‣ Frameworks • GraphViz • Gephi • Prefuse • D3 • Processing • Exhibit (Exercise)
  108. 108. TimeFlow ‣ Journalism ‣ Getting the flow of events and facts straight ‣ http://flowingmedia.com/timeflow.html ‣ Great for historians
  109. 109. Google FusionTables ‣ Initially Exploratory and useful for ‘Munging’ ‣ Allows for Embedding ‣ And for User Interaction ‣ Transparency ‣ Experimental (Good) ‣ http://www.google.com/fusiontables/Home/
  110. 110. Many Eyes ‣ http://www-958.ibm.com ‣ Rich, Varied and Accessible ‣ Free Rapid Prototyping
  111. 111. Visual.ly
  112. 112. Visual.ly ‣ Well crafted Infographics gaining credibility ‣ The new poster presentation ‣ Data-driven narrative in words and pictures ‣ Visual.ly currently driven by social media
  113. 113. Dipity
  114. 114. Frameworks and Languages ‣ GraphViz ‣ R Programming Language ‣ JIT (JavaScript Infovis Toolkit) ‣ Protovis ‣ D3 ‣ Processing ‣ Tableau ‣ Prefuse ‣ Gephi ‣ WEAVE (http://www.oicweave.org/) ‣ Exhibit (Exercise)
  115. 115. Graphviz ‣ An Open Source Framework ‣ Mature (1988) ‣ AT&T Labs ‣ Used as a basis for subsequent ‣ A great prototyping and starting point ‣ http://www.graphviz.org/
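Graphviz reads a plain-text graph description in its DOT language, which is part of why it suits prototyping. The Python below is a minimal sketch that writes such a description from a list of edges. The function name and the sample edges are invented, and it assumes that labels contain no double quotes.

```python
def to_dot(edges, name="G"):
    """Render undirected (a, b) edges as a Graphviz DOT graph description.

    Assumes node labels contain no double-quote characters.
    """
    lines = [f"graph {name} {{"]
    for a, b in edges:
        lines.append(f'    "{a}" -- "{b}";')
    lines.append("}")
    return "\n".join(lines)


# Invented edges for illustration only.
dot_text = to_dot([("Farmer", "Parliament"), ("Lawyer", "Parliament")])
print(dot_text)
```

Saving the output as `occupations.dot` and running `dot -Tpng occupations.dot -o occupations.png` should draw the graph, assuming Graphviz is installed.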
  116. 116. R Programming Language ‣ Geared towards statistical analysis ‣ More recently has had some powerful graphics frameworks added ‣ Open Source ‣ Typically Command Line but a variety of GUI editors available ‣ Jeff Rydberg-Cox: R for the Digital Humanities
  117. 117. JavaScript InfoVis Toolkit (JIT) ‣ JIT Demos (http://thejit.org/demos/) ‣ The JavaScript InfoVis Toolkit is a complete set of tools to create Interactive Data Visualizations for the Web. It includes JSON loading, animation, 2D point and graph classes and some predefined tree visualization methods. ‣ Smaller datasets in a clean form ‣ Related and Aggregated/Categorised Data
  118. 118. JavaScript InfoVis Toolkit (JIT)
  119. 119. JavaScript InfoVis Toolkit (JIT)
  120. 120. JavaScript InfoVis Toolkit (JIT)
  121. 121. ProtoVis ‣ Protovis is a visualization toolkit for JavaScript using SVG. It takes a graphical approach to data visualization, composing custom views of data with simple graphical primitives like bars and dots. These primitives are called marks, and each mark encodes data visually through dynamic properties such as color and position. ‣ Jerome Cukier: ProtoVis Tutorial ‣ Development shifted to D3 ‣ ProtoVis still very accessible and usable
  122. 122. ProtoVis http://mbostock.github.com/protovis/ex/crimea-rose.html
  123. 123. ProtoVis http://mbostock.github.com/protovis/ex/napoleon.html
  124. 124. D3 ‣ D3 allows you to bind arbitrary data to a Document Object Model (DOM), and then apply data-driven transformations to the document. As a trivial example, you can use D3 to generate a basic HTML table from an array of numbers. Or, use the same data to create an interactive SVG bar chart with smooth transitions and interaction. ‣ Open Source
  125. 125. D3 http://www.visualizing.org/full-screen/16266
  126. 126. Processing ‣ Now we are getting serious... ‣ Ben Fry ‣ Like R has a serious statistical bent ‣ Has a client and development environment, but deploys easily to the web using processing.js ‣ Large and very large datasets ‣ Good with related data ‣ Serious support for aesthetics ‣ Modelling Environment ‣ http://processing.org/ ‣ http://www.openprocessing.org/
  127. 127. OpenProcessing
  128. 128. Processing.js
  129. 129. Processing.JS http://nytlabs.com/projects/cascade.html
  130. 130. Tableau ‣ Commercial ‣ Offers a Free Public Application ‣ Encourages sharing and focusses on building a narrative around visualisation of your research data ‣ Education and Non-Commercial Licenses available ‣ Mature and evolving rapidly to demonstrate the newest and most exciting visualisation types
  131. 131. Tableau http://www.tableausoftware.com/public
  132. 132. Prefuse ‣ flare.prefuse ‣ Flash-based ‣ Great transitions and very approachable ‣ Beware of Datalocking ‣ http://flare.prefuse.org/demo
  133. 133. Gephi ‣ Open Source ‣ Mapping and Visualising Relationships and Networks ‣ An outstanding Visual Development Environment ‣ Multiplatform ‣ Extensible!! ‣ https://gephi.org/
  134. 134. Gephi
  135. 135. Gephi
  136. 136. Where to go further ‣ DIRT (Digital Research Toolkit) ‣ Timeline Tools ‣ Visualisation in Education ‣ Visual Complexity ‣ DataVis.ca ‣ R: A Tiny Handbook of R - Springer ‣ Using R in DH ‣ MONK ‣ http://datajournalism.stanford.edu/
  137. 137. Upcoming Workshops ‣ 18 November - A Survey of Digital Humanities ‣ 2 December - Engaging Your Audience with Your Research Data (Exhibit) ‣ 9 December - Telling Stories with Data – Collections Visualisation for Arts and Humanities Scholars (OMEKA) ‣ January - Digital Project Management ‣ February - Hands On Workshop – Data Visualisation for Presentation ‣ February - Social Scholarship – Tools for Collaborative Research ‣ March - Data Visualisation for Textual and Spatial Analysis ‣ More to come: http://qubdh.co.uk
  138. 138. Thank You. Shawn Day - s.day@qub.co.uk - @iridium. The Library/Institute for Collaborative Research in the Humanities
 18 University Square Ground Floor http://qubdh.co.uk
