Intro to Data Vis for the Humanities, Nov 2013

An extensive but high-level look at the principles, methods, and tools of data visualisation for humanities research, illustrated with a couple of case studies.

Transcript

  • 1. Data Visualisation in the Digital Arts and Humanities: Tools, Methods & Techniques to Put Your Research Data to Work. Shawn Day, Queen’s University Library
  • 2. Objective ‣ To appreciate the rich variety of techniques and tools available to digital humanities scholars for data visualisation and analysis. This workshop introduces the varied uses of data visualisation in the humanities through examples and case studies, and will hopefully inspire you to some hands-on fun.
  • 3. The beginning of a conversation …
  • 4. Upcoming Seminars and Workshops ‣ 18 November - A Survey of Digital Humanities ‣ 2 December - Engaging Your Audience with Your Research Data (Exhibit) ‣ 9 December - Telling Stories with Data – Collections Visualisation for Arts and Humanities Scholars (OMEKA) ‣ January - Digital Project Management ‣ February - Hands On Workshop – Data Visualisation for Presentation ‣ February - Social Scholarship – Tools for Collaborative Research ‣ March - Data Visualisation for Textual and Spatial Analysis ‣ More to come: http://qubdh.co.uk
  • 5. Agenda ‣ Introduction ‣ What is Data Visualisation ‣ Why Visualise Data? ‣ Case Studies ‣ Things to Visualise ‣ Ways to Visualise ‣ Tools for Visualisation
  • 6. Breakpoint ‣ One of the keys to good visualization is understanding what your immediate (and longer term) goals are. ‣ Are you visualizing data to understand what’s in it, or are you trying to communicate meaning to others? ‣ You - Visualisation for Data Analysis ‣ Share with Others - Visualisation for Presentation
  • 7. Why Visualise? The Basics ‣ Open Up Large Datasets ‣ Increase Density of Observable Data ‣ Reduce Complexity ‣ Aestheticise Data ‣ Illustrate an Interpretation ‣ Make an Argument
  • 8. Why Visualise? The Psychology and Physiology ‣ Bypass language centres to tap directly into the visual cortex; ‣ Leverage ability to recognise patterns - what they call visual sense-making; ‣ Powerful graphics engines now allow for live data processing and sophisticated animations and interactive research environments.
  • 9. Why Visualise? From a Data Perspective ‣ Can link different formats ‣ Can share more easily with others ‣ Can see new meanings and connections ‣ Sort and re-organize in automated fashion ‣ Manage larger amounts of information ‣ Visualise your results
  • 10. Why Visualise? For Humanities Research ‣ Work with new data to create new knowledge ‣ Explore data to discover things that used to be unknown, unknowable or impractical to know ‣ Take a new perspective on the familiar to reveal previously hidden insights
  • 11. Data Visualisation has definitely hit the big-time ‣ Guardian Awards ‣ New York Times ‣ Why?
  • 12. Visualise New Information Tourists vs Locals, Eric Fischer, 2010 - Flickr
  • 13. Red - Tourists Blue - Locals Yellow- NA
  • 14. Areas of Interest
  • 15. Crowdsourcing
  • 16. Visualising New Information
  • 17. The Familiar through New Eyes: The London Times Atlas
  • 18. Joanna Kamradt and Christian Tate
  • 19. How Could You Use Data Analysis? ‣ “In the Lab” - for your own analysis ‣ Online as part of collaborative groups ‣ Through dissemination for extension of own work crowdsourcing ‣ Others?
  • 20. Case Study: The Time Strip
  • 21. Visualisation Objective ‣ Exploring the ‘ordinary’ lives of rural pioneers/farmers in nineteenth century Ontario
  • 22. Canada
  • 23. Ontario
  • 24. South Western Ontario
  • 25. Farm Journal Raw Materials: William Sunter Farm Diary, 1858 ‣ 100s of pages ‣ Varying hands ‣ Varying quality ‣ Columns ‣ No Context
  • 26. Example: Medical Diary Medical Diary by BlueChillies
  • 27. Example: History Flow History flow by Martin Wattenberg and Fernanda Viegas
  • 28. Mechanics of the Process ‣ Generate word frequency (Voyant, TAPoR) ‣ Isolate known farm activities (NLP - LanguageWare) ‣ Collocate to link activity references to time, duration, and resources (Voyant)
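The three steps on this slide (word frequency, isolating known activities, collocation) can be sketched in plain Python. This is a minimal illustration only, not the Voyant, TAPoR, or LanguageWare workflow the slide names: the diary entries, the activity list, and the helper names are all invented for the example.

```python
import re
from collections import Counter

# Hypothetical diary entries (invented for illustration, not from the Sunter diary).
ENTRIES = [
    "Monday ploughing the north field all day",
    "Tuesday haying with two men half day",
    "Wednesday haying again all day",
]

# Hypothetical list of known farm activities to isolate.
ACTIVITIES = {"ploughing", "haying", "threshing"}


def tokenize(text):
    """Lower-case a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def word_frequency(entries):
    """Step 1: count how often each word occurs across all entries."""
    counts = Counter()
    for entry in entries:
        counts.update(tokenize(entry))
    return counts


def collocates(entries, targets, window=3):
    """Step 3: for each target word, count the words within `window` tokens of it."""
    result = {target: Counter() for target in targets}
    for entry in entries:
        tokens = tokenize(entry)
        for i, token in enumerate(tokens):
            if token in targets:
                lo, hi = max(0, i - window), i + window + 1
                result[token].update(tokens[lo:i] + tokens[i + 1:hi])
    return result


freq = word_frequency(ENTRIES)
# Step 2: keep only the words that are known activities.
activity_counts = {word: n for word, n in freq.items() if word in ACTIVITIES}
near = collocates(ENTRIES, ACTIVITIES)

print(activity_counts)
print(near["haying"].most_common(3))
```

Widening `window` pulls in more distant words such as durations ("half day"), at the cost of more noise.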
  • 29. The Result/ New Patterns
  • 30. The Result/ New Patterns ‣ Less time haying ‣ The impact of technology ‣ More tasks faster
  • 31. How Else Could this be done?
  • 32. What is the Value of this Visualisation? ‣ Easier to compare over intervals ‣ Multiple vectors with greater granularity in a compressed space ‣ The challenge is to find rich enough source materials to yield substantive datasets
  • 33. Case Study: The Tree Map
  • 34. Example: Newsmap http://newsmap.jp/
  • 35. Example: Panopticon Ben Shneiderman and Hard Drive Space
  • 36. Example: Bachelor’s Degrees 2011 Ben Schmidt, 2013 http://benschmidt.org/Degrees/2011Overview/
  • 37. Case Study: Occupations of Politicians ‣ What are we studying? • Self-declared occupations of politicians ‣ Why? • What bias might they bring to their job? ‣ How? • Visualising past occupation and mapping to political platform of party affiliated with
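A tree map gives each category a rectangle whose area is proportional to its count. The sketch below shows only the simplest building block, a single-level "slice" of one rectangle into strips; it is not the layout used for the parliament visualisations in this case study, and the occupation counts and the name `slice_layout` are invented for illustration.

```python
def slice_layout(counts, x=0.0, y=0.0, width=100.0, height=60.0):
    """Split one rectangle into vertical strips with areas proportional to the counts.

    Returns a dict mapping each label to (x, y, width, height),
    ordered from the largest count to the smallest.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    rects = {}
    offset = x
    for label, value in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        strip = width * value / total  # strip width carries the proportion
        rects[label] = (offset, y, strip, height)
        offset += strip
    return rects


# Hypothetical counts, not taken from any real parliament.
OCCUPATIONS = {"lawyer": 50, "farmer": 30, "merchant": 20}
rects = slice_layout(OCCUPATIONS)
for label, rect in rects.items():
    print(label, rect)
```

A full tree map would apply the same split recursively inside each strip, alternating between vertical and horizontal cuts at each level of the hierarchy.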
  • 38. Occupations of MPs in the 2nd Canadian Parliament
  • 39. Occupations of MPs in the 37th Canadian Parliament
  • 40. Occupations of TDs in the 30th Dáil Éireann
  • 41. The Result/ New Patterns ‣ The emergence of the professional politician with no private sector experience ‣ Occupational continuity across changes in governing party ‣ http://dev.dho.ie/~sday/dail/index.html
  • 42. How Else Could this be Done?
  • 43. How Else Could this be Done?
  • 44. The Value of Data Vis for Analysis ‣ New ways of presenting allow new ways of seeing ‣ Hidden patterns become evident ‣ Suggest other hypotheses to test for ‣ Good research raises more questions than answers
  • 45. People demanding more… ‣ Interactivity ‣ Involvement ‣ Action ‣ Participation ‣ Web 2.0 … 3.0 ….
  • 46. General Steps in Data Vis for DH 1. Discovery / Acquisition 2. Cleaning / ‘Munging’ 3. Analysis / Exploratory Vis 4. Presentation
  • 47. Types of Data to Visualise ‣ Audio Data ‣ Categorical Data ‣ Cartographic Data ‣ Collections ‣ Image Data • Still • Moving ‣ Metadata ‣ Multimedia Data ‣ Network Data • Social • Other ‣ Numerical Data ‣ Temporal Data ‣ Textual Data • Narrative • Qualitative ‣ ????
  • 48. Audio Data ‣ Spectrogram ‣ Wave forms ‣ Notes ‣ Frequency ‣ Beats
  • 49. Audio Data ‣ What does sound look like? Visualisation of “Canada is Really Big” by The Arrogant Worms http://www.sonicvisualiser.org/
  • 50. Audio Data: The Shape of Song ‣ http://www.turbulence.org/Works/song/index.html ‣ Measuring Musical Patterns using Translucent Arcs ‣ Repetition. Philip Glass, Candyman 2; Madonna, Like a Prayer
  • 51. Audio Data: IBM ‘Glass Engine’ http://www.philipglass.com/glassengine/
  • 52. Categorical Data ‣ Data is grouped into categories based on a qualitative trait ‣ The resulting data represents the labels of these groups ‣ Nominal, Ordinal and/or Binary
  • 53. Cartographic Data ‣ Communicate spatial information
  • 54. Cartographic Visualisation
  • 55. Cartographic Visualisation http://maps.stamen.com/watercolor/#13/53.3355/-6.2181
  • 56. Digital Collections ‣ Collections of data, images, movies, sound … etc • Visualise the object in context as part of collection • Represent the structure of the collection
  • 57. Digital Collection Visualisation Google Art Project: Visualising Museum Collections
  • 58. Digital Still Image Data ‣ Colour ‣ Texture ‣ Shape ‣ Content ‣ Format ‣ Metadata ‣ Luminosity/Hue/Saturation/Range
  • 59. Digital Moving Image Data ‣ Adding Data on: • Narrative • Length • Frame rate • Sound/Image • Key Frames • Storyboard
  • 60. Metadata
  • 61. Numerical/Quantitative Data ‣ Does anyone really need me to tell them about this? • Analysed using statistical methods • displayed using tables, charts, histograms and graphs…
  • 62. Social Network Data ‣ Nodes and Edges ‣ Representing relations between objects, and quantifying and qualifying those relations
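To make "nodes and edges" concrete, here is a minimal sketch of a weighted network held in plain Python dictionaries. The correspondence data and the helper names are invented for illustration; a tool such as Gephi would normally do this work, and nothing here reflects its file formats or API.

```python
from collections import defaultdict

# Hypothetical correspondence network: (person, person, number of letters exchanged).
EDGES = [
    ("Ann", "Ben", 3),
    ("Ann", "Cal", 1),
    ("Ben", "Cal", 2),
]


def build_graph(edges):
    """Build an undirected weighted adjacency map from (a, b, weight) triples."""
    graph = defaultdict(dict)
    for a, b, weight in edges:
        graph[a][b] = weight
        graph[b][a] = weight
    return graph


def degree(graph, node):
    """Number of distinct neighbours: how many relations a node has."""
    return len(graph[node])


def strength(graph, node):
    """Sum of edge weights: how intense those relations are in total."""
    return sum(graph[node].values())


graph = build_graph(EDGES)
for person in sorted(graph):
    print(person, degree(graph, person), strength(graph, person))
```

Degree and strength are two simple ways of quantifying a relation; qualifying it (for example, "family" versus "business") would need an extra label on each edge.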
  • 63. Temporal Data ‣ Show changes over time ‣ Show temporal clusters
  • 64. Different Ways of Seeing Time http://www.itc.nl/personal/kraak/ Xerox Parc, Stuart K.Card, George G. Robertson, Jock D. Mackinlay
  • 65. Combining Time and Space http://www.edwardtufte.com/tufte/posters
  • 66. Quantitative Textual Visualisation
  • 67. Textual - Qualitative ‣ Textual attributes graphically represented • Frequency • Collocation • Adjacency
  • 68. Textual - Narrative
  • 69. Time, Space, Narrative: MythEngine
  • 70. Time, Space, Narrative: MythEngine http://www.bbc.co.uk/blogs/researchanddevelopment/2010/03/the-mythology-engine-represent.shtml
  • 71. General Steps in Data Vis for DH 1. Discovery / Acquisition 2. Cleaning / ‘Munging’ 3. Analysis / Exploratory Vis 4. Presentation
  • 72. Step 1 Discovery / Acquisition
  • 73. An Iterative Process ACQUIRE → PARSE → FILTER → MINE → REPRESENT → REFINE → INTERACT
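The first five stages of this process can be sketched as small functions chained together. This is a toy illustration using only the Python standard library; the data, the function names, and the text "bar chart" are assumptions made for the example, and the REFINE and INTERACT stages are left out.

```python
import csv
import io
from collections import Counter

# Hypothetical raw data standing in for an acquired file.
RAW = """year,occupation
1872,lawyer
1872,farmer
1872,lawyer
1874,
"""


def acquire():
    """ACQUIRE: obtain the raw data (here, an in-memory string)."""
    return RAW


def parse(text):
    """PARSE: give the raw text structure, one dict per row."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_rows(rows):
    """FILTER: drop rows with no occupation recorded."""
    return [row for row in rows if row["occupation"]]


def mine(rows):
    """MINE: count how often each occupation appears."""
    return Counter(row["occupation"] for row in rows)


def represent(counts):
    """REPRESENT: render the counts as a simple text bar chart."""
    return [f"{label:<8}{'#' * n}" for label, n in counts.most_common()]


chart = represent(mine(filter_rows(parse(acquire()))))
print("\n".join(chart))
```

Because each stage is a separate function, going back to change the filter or the representation (the iterative part of the process) does not disturb the other stages.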
  • 74. Visualizing What? ‣ Basic types of content that we are used to dealing with: • Text • Numbers • Image • Video ‣ Other, more “complex” stuff: • Relations, connections, links - a genealogy • Time and space coords - the path of migratory birds • Animations – a piece of courseware • 3D models – the plan of your house
  • 75. Acquisition: Junar ‣ http://www.junar.com http://goo.gl/oexnB
  • 76. Acquisition: Public Data Sources ‣ CSO: Data Formats ‣ The Data Hub: Linked Data
  • 77. Acquisition: Public Data Sources
  • 78. Cleaning / Munging (Normalisation, Format Conversion) ‣ Tools: • Data Wrangler • Google Refine • Mr. Data Converter ‣ Data Wrangler • Does simple split, clear, and fold/unfold transforms on data • See example --> Data and Script ‣ Google Refine • Works with larger datasets
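The fold/unfold transforms mentioned on this slide reshape a table between a "wide" layout (one column per year, say) and a "long" layout (one row per observation). The sketch below illustrates the idea in plain Python; it is not Data Wrangler's implementation or script syntax, and the farm data and function names are invented for the example.

```python
def fold(rows, key, value_columns):
    """Turn each wide row into one long row per (key, column) pair."""
    long_rows = []
    for row in rows:
        for column in value_columns:
            long_rows.append(
                {key: row[key], "column": column, "value": row[column]}
            )
    return long_rows


def unfold(long_rows, key):
    """Reverse of fold: gather long rows back into one wide row per key."""
    wide = {}
    for row in long_rows:
        wide.setdefault(row[key], {key: row[key]})[row["column"]] = row["value"]
    return list(wide.values())


# Hypothetical wide table: days spent haying per farm per year.
WIDE = [
    {"farm": "A", "1858": 12, "1859": 9},
    {"farm": "B", "1858": 7, "1859": 11},
]

LONG = fold(WIDE, "farm", ["1858", "1859"])
for row in LONG:
    print(row)
```

Many charting tools expect the long layout, so folding is often the last cleaning step before exploratory visualisation.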
  • 79. Open Data/Linked Data Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
  • 80. Munging Tool: Data Wrangler ‣ http://vis.stanford.edu/wrangler/app/
  • 81. Cleaning Exercise
  • 82. Alternate: Google Refine ‣ http://code.google.com/p/google-refine/
  • 83. Alternate: Mr Data Converter ‣ http://shancarter.com/data_converter/
  • 84. Now You’ve Got Data ... ‣ What’s Next? ‣ Data Visualisation in the Analysis Process ‣ Data Visualisation for Presentation
  • 85. General Steps in Data Vis for DH 1. Discovery / Acquisition 2. Cleaning / ‘Munging’ 3. Analysis / Exploratory Vis 4. Presentation
  • 86. Breakpoint ‣ Are you visualizing data to understand what’s in it, or are you trying to communicate meaning to others? ‣ You - Visualisation for Data Analysis ‣ Share with Others - Visualisation for Presentation
  • 87. Google NGram Viewers ‣ Examine word frequency in digitised books ‣ Currently about 4% of books ever published ‣ In English, Chinese, French, German, Hebrew, Russian, and Spanish ‣ Changes in word usage ‣ Trends
  • 88. Google NGram Viewer http://books.google.com/ngrams/graph
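The core idea behind an n-gram viewer, a word's share of all words printed in each year, can be sketched in a few lines. This is an illustration only: the two-year "corpus" is invented, and the code does not use Google's data or reproduce how the NGram Viewer itself computes or smooths its curves.

```python
from collections import Counter

# Hypothetical mini-corpus: all text "published" in a given year.
CORPUS = {
    1850: "the railway and the canal",
    1900: "the railway the railway and the telephone",
}


def relative_frequency(corpus, word):
    """Return {year: occurrences of `word` / total words in that year}."""
    series = {}
    for year, text in sorted(corpus.items()):
        tokens = text.lower().split()
        series[year] = Counter(tokens)[word] / len(tokens) if tokens else 0.0
    return series


series = relative_frequency(CORPUS, "railway")
for year, share in series.items():
    print(year, round(share, 3))
```

Dividing by the total number of words per year matters: raw counts would rise simply because more books were published, whether or not usage of the word changed.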
  • 89. The Value of Data Vis for Analysis ‣ New ways of presenting allow new ways of seeing ‣ Hidden patterns become evident ‣ Suggest other hypotheses to test for ‣ Good research raises more questions than answers
  • 90. Infographics?
  • 91. Data Analysis Principles 1. Process is a Way of Thinking, not a Substitute for Thinking 2. Data needs to be considered and reported in Context 3. Look Before you Leap - Get to Know Your Data 4. Question Everything - Collection Process, Bias, etc. 5. Do a Gut Check 6. Coincidence is Not the Same as Causality 7. Just Because Data Exists Doesn’t Mean It’s Relevant (Fern Halper - Seven Guiding Principles)
  • 92. Analysis / Exploratory Visualisation
  • 93. Cleaning & Structuring: Google Fusion Tables
  • 94. Orange http://orange.biolab.si/
  • 95. IBM ManyEyes
  • 96. Text Analysis: Voyant http://voyeurtools.org
  • 97. Gephi: Analysis and Discovery of Networks
  • 98. Where to Keep up with the Community ‣ Highbrow: http://osc.hul.harvard.edu/highbrow ‣ http://chronicle.com/blogs/profhacker ‣ Flowing Data: http://flowingdata.com ‣ Perceptual Edge: http://www.perceptualedge.com ‣ Info is Beautiful: http://www.informationisbeautiful.net ‣ Visualising Data: http://www.visualisingdata.com ‣ Infosthetics: http://infosthetics.com ‣ Datavisualisation.ch: http://datavisualization.ch ‣ Dig Hum Specialist: https://dhs.stanford.edu/the-digitalhumanities-as
  • 99. New Perspectives on Old Data: Presenting Your Data Visually
  • 100. Objectives ‣ Consider best practices in sharing research findings using visualisation tools; ‣ Identify and judge between publicly available tools to create and deploy humanities visualisation research products; ‣ Consider data visualisation as part of a larger research discussion.
  • 101. General Steps in Data Vis for DH ‣ Discovery / Acquisition ‣ Cleaning / ‘Munging’ ‣ Analysis / Exploratory Vis ‣ Presentation
  • 102. Academic Visualisation? There are lots of published papers out there http://www.autodeskresearch.com/projects/citeology
  • 103. The Life of an Idea through Citations
  • 104. Data Visualisation Lessons from Tufte ‣ Show the Data ‣ Provoke Thought about the Subject at Hand ‣ Avoid Distorting the Data ‣ Present Many Numbers in a Small Space ‣ Make Large Datasets Coherent ‣ Encourage Eyes to Compare Data ‣ Reveal Data at Several Levels of Detail ‣ Serve a Reasonably Clear Purpose ‣ Be Closely Integrated with Statistical and Verbal Descriptions of the Dataset
  • 105. What Visual Techniques Exist? ‣ Connecting your data with the right visualisation ‣ What is your message? ‣ How do we know what we might use? ‣ Start with your Exploratory/Research/Analytical Environment ‣ How do visuals fit into your narrative?
  • 106. What Visual Techniques Exist? Connecting your data with the right visualisation. Visual Everything
  • 107. Structured Data Presentation Tools (a tiny subset) ‣ Webservices • Temporal: TimeFlow • Google Fusion Tables • Textual, Spatial and Numeric: Many Eyes • Temporal: Dipity • Infographics: Visual.ly ‣ Frameworks • GraphViz • Gephi • Prefuse • D3 • Processing • Exhibit (Exercise)
  • 108. TimeFlow ‣ Journalism ‣ Getting the flow of events and facts straight ‣ http://flowingmedia.com/timeflow.html ‣ Great for historians
  • 109. Google FusionTables ‣ Initially Exploratory and useful for ‘Munging’ ‣ Allows for Embedding ‣ And for User Interaction ‣ Transparency ‣ Experimental (Good) ‣ http://www.google.com/fusiontables/Home/
  • 110. Many Eyes ‣ http://www-958.ibm.com ‣ Rich, Varied and Accessible ‣ Free Rapid Prototyping
  • 111. Visual.ly
  • 112. Visual.ly ‣ Well crafted Infographics gaining credibility ‣ The new poster presentation ‣ Data-driven narrative in words and pictures ‣ Visual.ly currently driven by social media
  • 113. Dipity
  • 114. Frameworks and Languages ‣ GraphViz ‣ R Programming Language ‣ JIT (JavaScript Infovis Toolkit) ‣ Protovis ‣ D3 ‣ Processing ‣ Tableau ‣ Prefuse ‣ Gephi ‣ WEAVE (http://www.oicweave.org/) ‣ Exhibit (Exercise)
  • 115. Graphviz ‣ An Open Source Framework ‣ Mature (1988) ‣ AT&T Labs ‣ Used as a basis for subsequent ‣ A great prototyping and starting point ‣ http://www.graphviz.org/
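Graphviz reads a plain-text graph description in its DOT language, which makes it easy to prototype from a script. The sketch below writes DOT text from a list of edges; the names in the edge list and the helper `to_dot` are invented for illustration, and the code only produces text (it does not call Graphviz itself).

```python
def to_dot(edges, name="G"):
    """Render (source, target) pairs as an undirected graph in DOT text."""
    lines = [f"graph {name} {{"]
    for a, b in edges:
        # Quote node names so spaces and punctuation are safe.
        lines.append(f'    "{a}" -- "{b}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical correspondence links.
dot_text = to_dot([("Ann", "Ben"), ("Ben", "Cal")])
print(dot_text)
```

Assuming Graphviz is installed, saving this output as `letters.dot` (a hypothetical file name) and running `dot -Tpng letters.dot -o letters.png` should draw the graph.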
  • 116. R Programming Language ‣ Geared towards statistical analysis ‣ More recently has had some powerful graphics frameworks added ‣ Open Source ‣ Typically Command Line but a variety of GUI editors available ‣ Jeff Rydberg-Cox: R for the Digital Humanities
  • 117. JavaScript InfoVis Toolkit (JIT) ‣ JIT Demos (http://thejit.org/demos/) ‣ The JavaScript InfoVis Toolkit is a complete set of tools to create Interactive Data Visualizations for the Web. It includes JSON loading, animation, 2D point and graph classes and some predefined tree visualization methods. ‣ Smaller datasets in a clean form ‣ Related and Aggregated/Categorised Data
  • 118. JavaScript InfoVis Toolkit (JIT)
  • 119. JavaScript InfoVis Toolkit (JIT)
  • 120. JavaScript InfoVis Toolkit (JIT)
  • 121. ProtoVis ‣ Protovis is a visualization toolkit for JavaScript using SVG. It takes a graphical approach to data visualization, composing custom views of data with simple graphical primitives like bars and dots. These primitives are called marks, and each mark encodes data visually through dynamic properties such as color and position. ‣ Jerome Cukier: ProtoVis Tutorial ‣ Development shifted to D3 ‣ ProtoVis still very accessible and usable
  • 122. ProtoVis http://mbostock.github.com/protovis/ex/crimea-rose.html
  • 123. ProtoVis http://mbostock.github.com/protovis/ex/napoleon.html
  • 124. D3 ‣ D3 allows you to bind arbitrary data to a Document Object Model (DOM), and then apply data-driven transformations to the document. As a trivial example, you can use D3 to generate a basic HTML table from an array of numbers. Or, use the same data to create an interactive SVG bar chart with smooth transitions and interaction. ‣ Open Source
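The "trivial example" on this slide, an HTML table generated from an array of numbers, shows the general idea of letting data drive the markup. The sketch below is plain Python rather than D3 (which is a JavaScript library), so it illustrates the idea only and says nothing about D3's own API; `numbers_to_table` is a hypothetical helper name.

```python
def numbers_to_table(numbers):
    """Build a one-column HTML table with one row per number."""
    rows = "".join(f"<tr><td>{n}</td></tr>" for n in numbers)
    return f"<table>{rows}</table>"


html_table = numbers_to_table([4, 8, 15])
print(html_table)
```

Changing the input list changes the table, with no edits to the markup code; D3 applies the same principle to live DOM elements in the browser.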
  • 125. D3 http://www.visualizing.org/full-screen/16266
  • 126. Processing ‣ Now we are getting serious... ‣ Ben Fry ‣ Like R has a serious statistical bent ‣ Has a client and development environment, but deploys easily to the web using processing.js ‣ Large and VL datasets ‣ Good with related data ‣ Serious support for aesthetics ‣ Modelling Environment ‣ http://processing.org/ ‣ http://www.openprocessing.org/
  • 127. OpenProcessing
  • 128. Processing.js
  • 129. Processing.JS http://nytlabs.com/projects/cascade.html
  • 130. Tableau ‣ Commercial ‣ Offers a Free Public Application ‣ Encourages sharing and focusses on building a narrative around visualisation of your research data ‣ Education and Non-Commercial Licenses available ‣ Mature and evolving rapidly to demonstrate the newest and most exciting visualisation types
  • 131. Tableau http://www.tableausoftware.com/public
  • 132. Prefuse ‣ flare.prefuse ‣ Flash-based ‣ Great transitions and very approachable ‣ Beware of Datalocking ‣ http://flare.prefuse.org/demo
  • 133. Gephi ‣ Open Source ‣ Mapping and Visualising Relationships and Networks ‣ An outstanding Visual Development Environment ‣ Multiplatform ‣ Extensible!! ‣ https://gephi.org/
  • 134. Gephi
  • 135. Gephi
  • 136. Where to go further ‣ DIRT (Digital Research Toolkit) ‣ Timeline Tools ‣ Visualisation in Education ‣ Visual Complexity ‣ DataVis.ca ‣ R: A Tiny Handbook of R - Springer ‣ Using R in DH ‣ MONK ‣ http://datajournalism.stanford.edu/
  • 137. Upcoming Workshops ‣ 18 November - A Survey of Digital Humanities ‣ 2 December - Engaging Your Audience with Your Research Data (Exhibit) ‣ 9 December - Telling Stories with Data – Collections Visualisation for Arts and Humanities Scholars (OMEKA) ‣ January - Digital Project Management ‣ February - Hands On Workshop – Data Visualisation for Presentation ‣ February - Social Scholarship – Tools for Collaborative Research ‣ March - Data Visualisation for Textual and Spatial Analysis ‣ More to come: http://qubdh.co.uk
  • 138. Thank You Shawn Day - s.day@qub.co.uk - @iridium The Library/Institute for Collaborative Research in the Humanities, 18 University Square, Ground Floor http://qubdh.co.uk