Dashboarding Dirty Data with Dave Tarrant

How to dashboard messy data - 1 hour session.

Published in: Technology
  1. Content created by The Open Data Institute. Dashboarding dirty data with Dave. Dr David Tarrant (@davetaz), The Open Data Institute.
  2. Course aim: create a dashboard from dirty input data.
  3. Outcomes: design a properly structured spreadsheet; create a schema for a given set of data; clean a set of dirty data; sort, filter and analyse data in a spreadsheet; create a dashboard using data.
  4. Part 1 – Organising data. Outcomes: design a properly structured spreadsheet; create a schema for a given set of data.
  5. Exercise 1 – Organising data. Download and open bit.ly/tz_source. What would you do (practically) to improve this spreadsheet?
  6. Top 3 tips: (1) a single sheet for all data; (2) a simple schema without abbreviations; (3) no mixed data types in columns.
  7. Structured and unstructured
  8. Documents vs data. For documents, the machine is told where to put different things on screen to suit humans, so the output is very fixed. Given data, the machine can decide how to use it and how best to display it without being told explicitly by a human.
  9. Part 2 – Cleaning. Outcome: clean a set of dirty data.
  10. OpenRefine
  11. Exercise 2 – Cleaning. Download bit.ly/tz_unclean and open it with OpenRefine (available from http://training.theodi.org/InADay). Explore clustering and other cleaning features to ensure this data is ready for analysis.
  12. Part 3 – Sort, filter & basic analysis. Outcome: sort, filter and analyse data in a spreadsheet.
  13. Exercise 3 – Filtering and analysing. Download bit.ly/tz_clean and open it with Excel. Instructor-facilitated session.
  14. Key spreadsheet features: (1) sort; (2) filter; (3) formula; (4) pivot table.
  15. Part 4 – Dashboarding your data. Outcome: create a dashboard using data.
  16. Exercise 4 – Dashboarding. Upload the bit.ly/tz_clean CSV dataset to dataseedapp.com (you will need to register for a free account).
  17. (No text transcribed for this slide.)
  18. Dataseed – Editing: re-design elements; change colour; change measurement; export/embed.
  19. Summary: what did we need to do in order to dashboard the original dirty data?
  20. Outcomes: design a properly structured spreadsheet; create a schema for a given set of data; clean a set of dirty data; sort, filter and analyse data in a spreadsheet; create a dashboard using data.
  21. Thank you. Dr David Tarrant (@davetaz), The Open Data Institute. Tools used: Microsoft Excel, OpenRefine, Dataseedapp.
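
The clustering step in Exercise 2 can be illustrated outside OpenRefine with a minimal Python sketch of key-collision clustering, the general idea behind OpenRefine's "fingerprint" method. This is a simplified approximation, not OpenRefine's actual implementation. The function names `fingerprint` and `cluster` and the sample place names are hypothetical and do not come from the course datasets.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Build a simplified normalisation key (an approximation, not OpenRefine's code).

    Steps: trim and lowercase, strip accents, drop ASCII punctuation,
    then sort the unique whitespace-separated tokens.
    """
    text = unicodedata.normalize("NFKD", value.strip().lower())
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(text.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values sharing a key; keep only groups with more than one spelling."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


# Hypothetical dirty values, invented for illustration only.
SAMPLE = ["Dar es Salaam", "dar es salaam ", "Salaam, Dar es", "Arusha", "ARUSHA", "Dodoma"]

if __name__ == "__main__":
    for group in cluster(SAMPLE):
        print(group)
```

Each printed group is a set of spellings that probably refer to the same thing. As in OpenRefine, a person should still review each group before merging, because a key collision is only a hint that two values match.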

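The third tip on slide 6, no mixed data types in columns, can be checked automatically before any analysis. The sketch below is one possible approach using only the Python standard library. It assumes the data is a CSV with a header row and treats anything `float()` accepts as a number. The names `classify` and `mixed_type_columns` and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def classify(cell: str) -> str:
    """Label a cell as 'empty', 'number' or 'text' (a deliberately crude rule)."""
    cell = cell.strip()
    if cell == "":
        return "empty"
    try:
        float(cell)  # note: also accepts strings such as "nan" and "inf"
        return "number"
    except ValueError:
        return "text"


def mixed_type_columns(csv_text: str) -> dict:
    """Return the columns that contain both numeric and text cells, with counts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kinds = {name: Counter() for name in (reader.fieldnames or [])}
    for row in reader:
        for name in kinds:
            kinds[name][classify(row.get(name) or "")] += 1
    return {
        name: dict(counts)
        for name, counts in kinds.items()
        if "number" in counts and "text" in counts
    }


# Hypothetical rows, invented for illustration only.
SAMPLE = "region,population\nNorth,1200\nSouth,unknown\nEast,950\n"

if __name__ == "__main__":
    print(mixed_type_columns(SAMPLE))
```

A column flagged here (such as `population` in the invented sample) is a candidate for cleaning, for example by replacing a placeholder like "unknown" with an empty cell.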