MPhil Lecture on Data Vis for Analysis

Published in: Education, Technology
Transcript of "MPhil Lecture on Data Vis for Analysis"

  1. An Introduction to Data Visualisation for Analysis • Exploring the Dataset - Textual, Numerical and Otherwise • http://www.slideshare.net/shawnday/m-phil-datavisforanalysis
  2. Agenda • Thoughts from last week - wordpress.com? • Introduction • What do we mean by Data Analysis? • Some foundation terms and concepts • The Data Visualisation Process • Tools and Methods • Extending your toolset • An Exercise
  3. Objective • To appreciate the rich variety of techniques and tools available to digital humanities scholars for data visualisation and analysis. The intention is to be able to add tools to your arsenal and to have a sense of where to look for more.
  4. Breakpoint • One of the keys to good visualization is understanding what your immediate goals are. Are you visualizing data to understand what’s in it, or are you trying to communicate meaning to others? • You - Visualisation for Data Analysis • Others - Visualisation for Presentation
  5. Speaking of Data Analysis • SPSS • SAS • OS Equivalents
  6. So Why Would You Want to Visualise Your Data? • Bypass language centres to tap directly into the visual cortex • Leverage ability to recognise patterns - what they call visual sense-making • Powerful graphics engines now allow for live data processing and sophisticated animations and interactive research environments • Sources: Geoff McGhee, Getting Started with Data Viz
  7. So Why Would You Want to Visualise Your Data? • Work with new data to create new knowledge • Explore data to discover things that used to be unknown, unknowable or impractical to know • Take a new perspective on the familiar to reveal previously hidden insights
  8. Visualising New Information • Tourists vs Locals, Eric Fischer, 2010 - Flickr
  9. Visualising New Information • Flickr Flow, Martin Wattenberg and Fernanda Viegas, 2009
  10. The Familiar through New Eyes • The Times Atlas
  11. How Could You Use Data Analysis? • “In the Lab” - for your own analysis • Online as part of collaborative groups • Through dissemination for extension of own work - crowdsourcing • Others?
  12. The Time Ribbon and the Tree Map
  13. Visualisation Objective • Exploring the ordinary life of rural pioneers in nineteenth century Ontario
  14. Farm Journal • William Sunter Farm Diary, 1858
  15. Diaries: the raw materials • 100s of pages • Varying hands • Varying quality
  16. The Process • Generate word frequency (Voyeur, TAPoR) • Isolate known farm activities (NLP - LanguageWare) • Collocate to link activity references to time, duration, and resources (Voyeur)
  17. Example: Medical Diary • Medical Diary by BlueChillies
  18. Example: History Flow • History flow by Martin Wattenberg and Fernanda Viegas
  19. The Result / New Patterns
  20. The Result / New Patterns • Less time haying • The impact of technology • More tasks faster
  21. How Else Could this be Done?
  22. What is the Value of this Visualisation? • Easier to compare over intervals • Multiple vectors with greater granularity in a compressed space • The challenge is to find rich enough source materials to yield substantive datasets
  23. The Tree Map
  24. Example: Newsmap
  25. Example: Panopticon
  26. Case Study: Occupations of Politicians • What are we studying? – Self-declared occupations of politicians • Why? – What bias might they bring to their job? • How? – Visualising past occupation and mapping to political platform of party affiliated with
  27. Occupations of TDs in the 30th Dáil
  28. Occupations of MPs in the 2nd Parliament
  29. Occupations of MPs in the 37th Parliament
  30. The Result / New Patterns • The emergence of the professional politician with no private sector experience • Occupational continuity across changes in governing party
  31. How Else Could this be Done?
  32. The Value of Data Vis for Analysis • New ways of presenting allow new ways of seeing • Hidden patterns become evident • Suggest other hypotheses to test
  33. Basic Terms • Datamining • Statistics • Structured/Unstructured Data • Visualisation • Modelling
  34. Types of Data to Visualise • Audio Data • Categorical Data • Cartographic Data • Collections • Image Data (Still, Moving) • Metadata • Multimedia Data • Network Data (Social, Other) • Numerical Data • Temporal Data • Textual Data (Narrative, Qualitative) • ????
  35. General Steps in Data Vis for DH • Discovery / Acquisition • Cleaning / ‘Munging’ • Analysis / Exploratory Vis • Presentation
  36. Discovery / Acquisition • Original Research • Spreadsheets • Databases • Digitized Media • Other • Public Data • Academic Partners • Scraping (Junar, Outwit Hub, ScraperWiki) • Downloads • Archives/Libraries • Purchase
  37. Demo/Hands-On: Junar • http://www.junar.com
  38. Cleaning / Munging (Normalisation, Format Conversion) • Tools: Data Wrangler, Google Refine, Mr. Data Converter • Data Wrangler: does simple split, clear, and fold/unfold transforms on data (See example --> Data and Script) • Google Refine: works with larger datasets
  39. Hands-On: Data Wrangler • http://vis.stanford.edu/wrangler/app/
  40. Hands-On: Google Refine • http://code.google.com/p/google-refine/
  41. Hands-On: Mr Data Converter • http://shancarter.com/data_converter/
  42. Analysis / Exploratory Visualisation • Web Services: Google Fusion Tables, Google Spreadsheets, IBM ManyEyes, TimeFlow • Applications: Tableau/Tableau Public, MS Office, OpenOffice, Gephi, NodeXL (plug-in for Excel), Spotfire, R, Processing
  43. Google NGram Viewer • Examine word frequency in digitised books • Currently about 4% of books ever published • In English, Chinese, French, German, Hebrew, Russian, and Spanish • Changes in word usage • Trends • Check out the Cultural Observatory @ Harvard
  44. Google NGram Viewer
  45. Wordle • Visually present word frequency using size, weight, colour • Consider "Word Clouds Considered Harmful"
  46. Exercise • Choose a dataset from a source such as: The CSO, Project Gutenberg, or your own material • Choose an appropriate data visualisation from a web service we explored in the workshop • Explain the process and how you made your choice, and embed it in your own blog using wordpress.com as we explored last week • Suggest a research question that can be answered by using this data visualisation as a research environment • Send the link to me at: days@tcd.ie • Maybe: http://politicalreform.ie/2011/12/04/state-of-enda-sunday-business-post-red-c-poll-4th-september-2011/
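
A rough sketch of the farm diary process on slide 16 (generate word frequency, isolate known farm activities, link activity references to time) might look like the following. It is not how Voyeur, TAPoR or LanguageWare work internally; it is a plain-Python stand-in. The diary entries and the activity list are invented for illustration and are not taken from the William Sunter diary, and the function names are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Invented sample entries (date, text); NOT real diary content.
ENTRIES = [
    ("1858-07-02", "Haying in the back field all day. Haying slow."),
    ("1858-07-03", "Finished haying, then ploughing with the team."),
    ("1858-09-10", "Threshing at the barn. Ploughing after noon."),
]

# Hypothetical list of known farm activities to isolate.
ACTIVITIES = {"haying", "ploughing", "threshing"}


def tokenise(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequency(entries):
    """Step 1: count how often each word appears across all entries."""
    counts = Counter()
    for _, text in entries:
        counts.update(tokenise(text))
    return counts


def activity_by_month(entries, activities):
    """Steps 2 and 3: keep only known activities and tie them to a month."""
    table = defaultdict(Counter)
    for date, text in entries:
        month = date[:7]  # "YYYY-MM"
        for token in tokenise(text):
            if token in activities:
                table[month][token] += 1
    return table


freq = word_frequency(ENTRIES)
by_month = activity_by_month(ENTRIES, ACTIVITIES)

for month in sorted(by_month):
    print(month, dict(by_month[month]))
```

A month-by-activity table of this kind is the sort of input a time ribbon or tree map could be drawn from. Real diaries would first need transcription and spelling normalisation, which this sketch skips.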