MPhil Lecture on Data Vis for Analysis
Published in: Education, Technology

  • 1. An Introduction to Data Visualisation for Analysis: Exploring the Dataset - Textual, Numerical and Otherwise
  • 2. Agenda • Thoughts from last week - Introduction • What do we mean by Data Analysis? • Some foundation terms and concepts • The Data Visualisation Process • Tools and Methods • Extending your toolset • An Exercise
  • 3. Objective To appreciate the rich variety of techniques and tools available to digital humanities scholars for data visualisation and analysis. The intention is to be able to add tools to your arsenal and to have a sense of where to look for more.
  • 4. Breakpoint • One of the keys to good visualization is understanding what your immediate goals are. Are you visualizing data to understand what’s in it, or are you trying to communicate meaning to others? • You - Visualisation for Data Analysis • Others - Visualisation for Presentation
  • 5. Speaking of Data Analysis • SPSS • SAS • OS Equivalents
  • 6. So Why Would You Want to Visualise Your Data? • Bypass language centres to tap directly into the visual cortex • Leverage the ability to recognise patterns - what they call visual sense-making • Powerful graphics engines now allow for live data processing, sophisticated animations and interactive research environments • Sources: Geoff McGhee, Getting Started with Data Viz
  • 7. So Why Would You Want to Visualise Your Data? • Work with new data to create new knowledge • Explore data to discover things that used to be unknown, unknowable or impractical to know • Take a new perspective on the familiar to reveal previously hidden insights
  • 8. Visualising New Information Tourists vs Locals, Eric Fischer, 2010 - Flickr
  • 9. Visualising New Information Flickr Flow, Martin Wattenberg and Fernanda Viegas, 2009
  • 10. The Familiar through New Eyes The Times Atlas
  • 11. How Could You Use Data Analysis? • “In the Lab” - for your own analysis • Online as part of collaborative groups • Through dissemination for extension of own work - crowdsourcing • Others?
  • 12. The Time Ribbon and the Tree Map
  • 13. Visualisation Objective Exploring the ordinary life of rural pioneers in nineteenth century Ontario
  • 14. Farm Journal William Sunter Farm Diary, 1858
  • 15. Diaries: the raw materials • 100s of pages • Varying hands • Varying quality
  • 16. The Process • Generate word frequency (Voyeur, TAPoR) • Isolate known farm activities (NLP - LanguageWare) • Collocate to link activity references to time, duration, and resources (Voyeur)
  • 17. Example: Medical Diary Medical Diary by BlueChillies
  • 18. Example: History Flow History flow by Martin Wattenberg and Fernanda Viegas
  • 19. The Result/ New Patterns
  • 20. The Result/ New Patterns • Less time haying • The impact of technology • More tasks faster
  • 21. How Else Could this be done?
  • 22. What is the Value of this Visualisation • Easier to compare over intervals • Multiple vectors with greater granularity in a compressed space • The challenge is to find rich enough source materials to yield substantive datasets
  • 23. The Tree Map
  • 24. Example: Newsmap
  • 25. Example: Panopticon
  • 26. Case Study: Occupations of Politicians • What are we studying? – Self-declared occupations of politicians • Why? – What bias might they bring to their job? • How? – Visualising past occupations and mapping them to the political platform of the affiliated party
  • 27. Occupations of TDs in the 30th Dáil
  • 28. Occupations of MPs in the 2nd Parliament
  • 29. Occupations of MPs in the 37th Parliament
  • 30. The Result/ New Patterns • The emergence of the professional politician with no private sector experience • Occupational continuity across changes in governing party
  • 31. How Else Could this be Done?
  • 32. The Value of Data Vis for Analysis • New ways of presenting allow new ways of seeing • Hidden patterns become evident • Other hypotheses to test are suggested
  • 33. Basic Terms • Datamining • Statistics • Structured/Unstructured Data • Visualisation • Modelling
  • 34. Types of Data to Visualise • Audio Data • Network Data (Social, Other) • Categorical Data • Cartographic Data • Collections • Numerical Data • Image Data (Still, Moving) • Temporal Data • Textual Data (Narrative, Metadata, Qualitative) • Multimedia Data • ????
  • 35. General Steps in Data Vis for DH • Discovery / Acquisition • Cleaning / ‘Munging’ • Analysis / Exploratory Vis • Presentation
  • 36. Discovery / Acquisition • Original Research: Spreadsheets, Databases, Digitized Media, Other • Scraping: Junar, Outwit Hub, ScraperWiki • Downloads: Public Data, Archives/Libraries, Academic Partners, Purchase
  • 37. Demo/Hands-On: Junar
  • 38. Cleaning / Munging (Normalisation, Format Conversion) • Tools: Data Wrangler, Google Refine, Mr. Data Converter • Data Wrangler: does simple split, clear, and fold/unfold transforms on data (see example --> Data and Script) • Google Refine: works with larger datasets
  • 39. Hands-On: Data Wrangler
  • 40. Hands-On: Google Refine
  • 41. Hands-On: Mr Data Converter
  • 42. Analysis / Exploratory Visualisation • Web Services: Google Fusion Tables, Google Spreadsheets, IBM ManyEyes, TimeFlow • Applications: Tableau/Tableau Public, MS Office, OpenOffice, Gephi, Node XL (plug-in for Excel), Spotfire, R, Processing
  • 43. Google NGram Viewers • Examine word frequency in digitised books • Currently about 4% of books ever published • In English, Chinese, French, German, Hebrew, Russian, and Spanish • Changes in word usage • Trends • Check out the Cultural Observatory @ Harvard
  • 44. Google NGram Viewer
  • 45. Wordle • Visually presents word frequency using size, weight, colour • Consider “Word Clouds Considered Harmful”
  • 46. Exercise • Choose a dataset from a source such as: The CSO, Project Gutenberg or your own material • Choose an appropriate Data Visualisation from a webservice we explored in workshop. • Explain the process and how you made your choice and embed it in your own blog using as we explored last week. • Suggest a research question that can be answered by using this data visualisation as a research environment • Send the link to me at: Maybe: business-post-red-c-poll-4th-september-2011/
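
The process on slide 16 (generate word frequency, then collocate activity references with nearby terms) can be tried on a small scale before turning to Voyeur or TAPoR. What follows is only a minimal sketch in plain Python and not the workflow those tools use: the diary entries are invented sample text, and the stopword list, the function names and the window size are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Invented sample entries standing in for transcribed diary text (assumption).
ENTRIES = [
    "Cut hay in the east field all day",
    "Drew hay to the barn with the team",
    "Ploughing the east field in the morning",
]

# A tiny illustrative stopword list (assumption, not a standard list).
STOPWORDS = frozenset({"the", "in", "to", "with", "all"})


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def collocates(tokens, keyword, window=2):
    """Count the words found within `window` tokens of each `keyword`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Neighbours on both sides, excluding the keyword itself.
            counts.update(tokens[lo:i] + tokens[i + 1:hi])
    return counts


freq = Counter()            # step 1: overall word frequency
hay_neighbours = Counter()  # step 3: words that occur near "hay"

for entry in ENTRIES:
    kept = [t for t in tokenize(entry) if t not in STOPWORDS]
    freq.update(kept)
    # Collocate per entry so windows never span two diary entries.
    hay_neighbours.update(collocates(kept, "hay", window=2))

print(freq.most_common(3))
print(sorted(hay_neighbours))
```

Step 2 of the slide (isolating known farm activities) is reduced here to a single hand-picked keyword, "hay". A real study would need a vocabulary of activities, which is what the slide attributes to NLP tooling.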