Introduction to information visualisation for humanities PhDs

206 views

Published on

Training workshop for the CHASE Arts and Humanities in the Digital Age programme. (

This session will give you an overview of a variety of techniques and tools available for data visualisation and analysis in the humanities. You will learn about common types of visualisations and the role of exploratory and explanatory visualisations, explore examples of scholarly visualisations, try some visualisation tools, and know where to find further information about analysing and building data visualisations.

Published in: Data & Analytics
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
206
On SlideShare
0
From Embeds
0
Number of Embeds
95
Actions
Shares
0
Downloads
3
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Introduction to information visualisation for humanities PhDs

  1. 1. Introduction to Information Visualisation Dr Mia Ridge, @mia_out Digital Curator, British Library CHASE Arts and Humanities in the Digital Age, February 2017
  2. 2. While we're getting started... • Check that you can get online with the browsers Firefox or Chrome • The Exercises page contains all the links you need during the day • Check you can view it now: http://bit.ly/2kYtGx4 • Check you can log in to Viewshare with your new account http://viewshare.org/ • Timetable • 11am Tea and coffee • 1 - 1:45pm Lunch • 3 - 3:15pm Tea and coffee • 4:30pm Finish; free working time until 5pm
  3. 3. Overview • What is information visualisation and why use it? • The building blocks of visualisations • Exploring and critiquing interactive visualisations • Getting from the data you have to the visualisation you want
  4. 4. What is visualisation?
  5. 5. Visualisation is the graphical display of quantitative or qualitative information to create insights by highlighting patterns, trends, variations and anomalies.
  6. 6. From this...
  7. 7. ...to this
  8. 8. ...or this
  9. 9. Data visualisation can help you... Explore your data Explain your results
  10. 10. Why visualise information? For 'sense-making (also called data analysis) and communication' (Stephen Few) '…showing quantitative and qualitative information so that a viewer can see patterns, trends, or anomalies, constancy or variation' (Michael Friendly) '…interactive, visual representations of abstract data to amplify cognition' (Card et al) 'Distant reading' (Moretti) - focus on the shape rather than detail of a collection
  11. 11. Introductions • In a sentence or two, what's your interest in data visualisation? – What kinds of data do you work with? – What's the goal of any visualisations you're interested in creating? – Do you have any potential users in mind?
  12. 12. The building blocks of visualisation
  13. 13. Joseph Priestley, 1769
  14. 14. Florence Nightingale's petal charts, 1857
  15. 15. Charts https://cloud.highcharts.com/show/azujym
  16. 16. John Snow's cholera map, 1854
  17. 17. Charles Minard's figurative map, 1869 'Figurative Map of the successive losses in men of the French Army in the Russian campaign 1812-1813'. Drawn up by M. Minard, Inspector General of Bridges and Roads in retirement. Paris, November 20, 1869.
  18. 18. Web 2.0 and the mashup, 2006 http://www.bombsight.org
  19. 19. Small multiples
  20. 20. The old tube map
  21. 21. Harry Beck, 1931
  22. 22. Exercise: compare n-gram tools http://bit.ly/2kYtGx4 • Think of two words or phrases you'd like to compare over time (e.g. Burma, Burmah). • Open two browser windows • In one, go to http://books.google.com/ngrams • In the other, go to http://benschmidt.org/OL/ • Enter your words or phrases in each and compare the results • Discuss with your neighbour: what differences did you find, and why?
  23. 23. Exploring words http://www.codeitmagazine.com/images/text.png
  24. 24. Exploring words http://www.jasondavies.com/wordtree/
  25. 25. Networks http://networks.viraltexts.org/1836to1899/
  26. 26. Networks Every point on this diagram represents a male film producer. The pink dots represent men who worked exclusively with other men in the period surveyed, and the green dots represent those who worked with women. https://theconversation.com/women-arent-the-problem-in-the-film-industry-men-are-68740 Deb Verhoeven and Stuart Palmer
  27. 27. Visualising images and video http://www.flickr.com/photos/culturevis/5883371358/ 'Mondrian vs. Rothko', Lev Manovich, 2010. Image preparation: Xiaoda Wang
  28. 28. Sonification http://www.caseyrule.com/projects/sounds-of-sorting/
  29. 29. http://notes.husk.org/post/509063519/infographics
  30. 30. Data types • Quantitative • Qualitative • Geographic • Temporal • Media • Entities (people, places, events, concepts, things)
  31. 31. How do you get data to visualise? • Make it – Type it into a spreadsheet or database • Automate it – Extract it from text, images, audio or video • Find it – Lots of freely available data to practice with
  32. 32. Topic modelling http://discontents.com.au/mining-for-meanings/
  33. 33. Other forms of text analysis Entity recognition: turning text into things
  34. 34. Entity recognition examples
  35. 35. Extracting information from video http://emotions.periscopic.com/inauguration/
  36. 36. Extracting information from images https://www.clarifai.com/demo
  37. 37. Exercise: try entity recognition Go to http://bit.ly/2kYtGx4 and follow the steps for text or images
  38. 38. Exploring scholarly visualisations
  39. 39. Scholarly data visualisations • Visualisations as 'distant reading' where distance is 'a specific form of knowledge: fewer elements, hence a sharper sense of their overall interconnection' (Moretti, 2005) • Inspiring curiosity and research questions • But - which questions do they privilege and what do they leave out?
  40. 40. Exercise: critiquing scholarly visualisations Go to http://bit.ly/2kYtGx4 and follow the steps for Exercise 3 Pair up and discuss together before reporting back.
  41. 41. America's Public Bible http://americaspublicbible.org/
  42. 42. http://on-broadway.nyc/
  43. 43. http://www.sixdegreesoffrancisbacon.com/
  44. 44. http://maps.bristol.gov.uk/knowyourplace/
  45. 45. https://www.historypin.org/
  46. 46. Visualizing Emancipation http://www.americanpast.org/emancipation/
  47. 47. New York Society Library’s City Readers http://cityreaders.nysoclib.org/About/visualizations
  48. 48. Mapping the Republic of Letters http://www.stanford.edu/group/toolingup/rplviz/rplviz.swf
  49. 49. https://www.locatinglondon.org/
  50. 50. Digital Harlem http://digitalharlem.org
  51. 51. Digital Public Library of America http://dp.la/
  52. 52. Orbis http://orbis.stanford.edu
  53. 53. Lost Change http://tracemedia.co.uk/lostchange/
  54. 54. State of the Union http://benschmidt.org/poli/2015-SOTU
  55. 55. http://viraltexts.northeastern.edu/
  56. 56. Comments or questions?
  57. 57. From the data you have to the visualisation you want
  58. 58. Dealing with humanities data
  59. 59. Considerations for humanities data Commercial tools often assume complete, born- digital datasets – no missing fields or changes in data entry over time • Historical records often contain uncertainty and fuzziness (e.g. date ranges, multiple values, uncertain or unavailable information) • Includes metadata, data, digital surrogates
  60. 60. Messiness in historical data • 'Begun in Kiryu, Japan, finished in France' • 'Bali? Java? Mexico?' • Variations on USA: – U.S. – U.S.A – U.S.A. – USA – United States of America – USA ? – United States (case) • Inconsistency in uncertainty – U.S.A. or England – U.S.A./England ? – England & U.S.A.
  61. 61. When were objects collected? http://ibm.co/OS3HBa
  62. 62. Computers don't cope
  63. 63. Preparing data for visualisations Historical data often needs manual cleaning to:  remove rows where vital information is missing  tidy inconsistencies in term lists or spelling  convert words to numbers (e.g. dates)  remove hard returns and non-ASCII characters (or change data format)  split multiple values in one field into other columns (e.g. author name, date in single field)  expand coded values (e.g. countries, language)
  64. 64. Open Refine
  65. 65. …but be careful
  66. 66. What do you want to visualise?
  67. 67. Structure Purpose Data Audience
  68. 68. What do you want to do? • See relationships among data points • Compare a set of values • Track change over time • See the parts of a whole
  69. 69. See relationships among data points • Scatterplot • Matrix • Network diagram
  70. 70. Compare a set of values • Bar chart • Bubble chart • Histogram
  71. 71. Track change over time • Line graph • Stack graph
  72. 72. See the parts of a whole • Pie chart • Treemap
  73. 73. Key format decisions • Static or interactive? • Print or digital? • Narrative or 'factual'? • Shape (distant view) or detail (close view)?
  74. 74. Purpose, data, audience, structure • Intersections of format and purpose • Data types: quantitative, qualitative, geographic, time series, media, entities (people, places, events, concepts, things) • Static, interactive; print, digital; product, process • Exploratory, explanatory: find new insights, or tell a story? Pragmatic, emotive?
  75. 75. Dealing with complex data • Find a visualisation type that can harbour the data in a meaningful way or reduce the data in a meaningful way. – e.g. go from individual values to distribution of values – e.g. introduce interaction: overview, zoom and filter, details on demand (Ben Shneiderman)
  76. 76. Exercise: 10 minute Viewshare tutorial Instructions http://bit.ly/2kYtGx4 Discuss: what did you learn about preparing data and using visualisation software?
  77. 77. Choosing a structure
  78. 78. http://extremepresentation.com/design/7-charts/
  79. 79. http://extremepresentation.com/design/7-charts/
  80. 80. Giorgia Lupi and Stefanie Posavec http://www.dear-data.com/all
  81. 81. Preparing data
  82. 82. Data Preparation • Generally needs to be in tables, one row per item, one column per value • Aggregate or individual values - might need to calculate totals in advance • Data should be made as consistent as possible with tools like Excel, OpenRefine
  83. 83. Document data preparation!
  84. 84. Sample advice From viewshare, on spreadsheets: • Remove any data that is not in a solid rectangular area. This includes white space, page titles, scattered cells, and additional worksheets. • Check that your formatting is consistent throughout each column (e.g. column is all in date format, currency format, etc. as appropriate). • Make sure that data of the same type but in different columns is formatted consistently (e.g. dates in different columns are in the same date format).
  85. 85. If all else fails... • Sketch out your visualisation on paper to test it • Iteration is key, and... • Stubbornness is a virtue!
  86. 86. Exercise: try views and widgets in Viewshare Instructions http://bit.ly/2kYtGx4 Views • Lists, maps, pie charts, bar charts, scatter plots, tables, timelines or galleries Widgets • Search boxes, lists, tag clouds, sliders, ranges, logos or text How might you apply these with your own data?
  87. 87. Design matters
  88. 88. Worst practice in data visualisations Source: http://www.forbes.com/sites/naomirobbins/2013/01/03/deceptive-donut-chart/
  89. 89. Worst practice in data visualisations Source: https://twitter.com/altonncf/status/293392615225823232
  90. 90. Visualisations and 'truthiness' A sample of publication printing locations 1534-1831 (British Library data) http://bit.ly/W9VM7D
  91. 91. Visualising uncertainty Matt Lincoln http://blogs.getty.edu/iris/metadata-specialists-share-their-challenges-defeats-and-triumphs/#matt
  92. 92. Visualising uncertainty
  93. 93. Publishing visualisations • How can you contextualise, explain any limitations of your visualisations? e.g. – provenance and qualities of original dataset; – what you needed to do to it to get it into software (how transformed, how cleaned); – what's left out of the visualisation, and why?
  94. 94. Best practice for design • How effectively does the visualisation support cognitive tasks? • The most important and frequent visual queries/pattern finding should be supported with the most visually distinct objects • Question: which examples did this well?
  95. 95. Do you really need a visualisation? • Use tables when: – doc will be used to look up individual values – to compare individual values – precise values are required – the quantitative info to be communicated involves more than one unit of measure • Use graphs when: – the message is contained in the shape of the values – the document will be used to reveal relationships among values
  96. 96. Don’t Do try this at home
  97. 97. Tools that don't require programming • Excel • Google Fusion Tables, Google Drive • Viewshare • Tableau Public NB: be careful about sensitive data on cloud platforms
  98. 98. Thank you! http://bit.ly/2kYtGx4 Mia Ridge @mia_out Digital Curator, British Library CHASE Arts and Humanities in the Digital Age, February 2017

×