Conversations with data


#dalmooc 27/10/12 slides

Published in: Education

  1. Conversations with Data. Tony Hirst, Computing and Communications, The Open University
  2. (Recognising and addressing a skills gap)
  3. John Tukey: “journeyman carpenter of data-analytical tools”. From “The Technical Tools of Statistics”, read at the 125th Anniversary Meeting of the American Statistical Association, Boston, November 1964, published in April 1965 American Statistician. /via Adam Cooper, “Exploratory Data Analysis”
  4. “A Boy's Work is Never Done”, KellyB. (flickr: foreverphoto/2467694199/)
  5. “Exploratory data analysis is an attitude, a flexibility, and reliance on display, not a bundle of techniques and should be so taught.” John Tukey, "We need both exploratory and confirmatory." The American Statistician 34.1 (1980): 23-25.
  6. “I … cannot disagree strongly enough with statements about the dangers of putting powerful tools in the hands of novices. Computer algebra, statistics, and graphics systems provide plenty of rope for novices to hang themselves and may even help to inhibit the learning of essential skills needed by researchers. The obvious problems caused by this situation do not justify blunting our tools, however. They require better education in the imaginative and disciplined use of these tools. And they call for more attention to the way powerful and sophisticated tools are presented to novice users.” Leland Wilkinson, The Grammar of Graphics, Springer-Verlag, 1999, ISBN 0-387-98774-6, p15-16.
  7. Data accessibility. Data sensemaking.
  8. Clean. Shape. Augment. Look.
  9. Dirty Data
  10.
  11. Shapes…
  12. I see trees…
  13. See also: IPython notebook demo
  14. “There is no more reason to expect one graph to ‘tell all’ than to expect one number to do the same.” -- John Tukey
  15. If quantities are conserved, can you think of them in terms of flow?
  16. “[T]he picture examining eye is the best finder we have of the wholly unanticipated.” John Tukey, "We need both exploratory and confirmatory." The American Statistician 34.1 (1980): 23-25.
  17. How can we look at data?
  18. How do we ask questions of data?
  19. Search limits: underspend filetype:xls
  20. Structured queries. Search: underspend filetype:xls. SQL: select webPages where text like “%underspend%” and filetype=“xls” and domain=“”
  21. Count things. Sort things.
  22.
  23. How do we interpret the answers?
  24. Look for outliers. Top 3… …bottom 3
  25. Outliers may be rare occurrences over time too… Streaks and runs…
  26. Look for similarities & differences
  27. Look for trends
  28. Look for patterns & structure
  29. “Hand-drawing of graphs, except perhaps for reproduction in books and in some journals, is now economically wasteful, slow, and on the way out.” – John Tukey
  30. Recording your conversations
  31.
  32. IPython Notebook
  33. “I know of no person or group that is taking nearly adequate advantage of the graphical potentialities of the computer.” – John Tukey
  34. Hopefully, that contained some -- @psychemedia
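
Several of the slides above name simple moves for questioning a dataset: count things, sort things, look at the top 3 and bottom 3, and look for streaks and runs. The sketch below shows one way these might look in Python, the language of the IPython Notebook mentioned in the deck. It is not taken from the talk: the records, the department names, the win/loss sequence and all function names are invented for illustration.

```python
from collections import Counter
from itertools import groupby

# Hypothetical records (department, underspend amount); invented for illustration only.
records = [
    ("Health", 120), ("Transport", 15), ("Education", 40),
    ("Health", 30), ("Defence", 300), ("Transport", 5),
    ("Education", 60), ("Culture", 2),
]


def count_by_key(rows):
    """Count things: how many rows each department contributes."""
    return Counter(dept for dept, _ in rows)


def total_by_key(rows):
    """Sum the amounts for each department."""
    totals = {}
    for dept, amount in rows:
        totals[dept] = totals.get(dept, 0) + amount
    return totals


def top_and_bottom(totals, n=3):
    """Sort things: rank by total, then take the n largest and n smallest."""
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n], ranked[-n:]


def longest_run(values, predicate):
    """Streaks and runs: length of the longest unbroken run satisfying predicate."""
    best = 0
    for matches, group in groupby(values, key=predicate):
        if matches:
            best = max(best, sum(1 for _ in group))
    return best


counts = count_by_key(records)
totals = total_by_key(records)
top3, bottom3 = top_and_bottom(totals)

# Hypothetical sequence of results over time, used to look for a streak.
results = ["W", "W", "L", "W", "W", "W", "L"]
win_streak = longest_run(results, lambda r: r == "W")

print(counts.most_common())
print(top3, bottom3)
print(win_streak)
```

With only five departments the top 3 and bottom 3 overlap, which is itself a reminder to check how many distinct things there are before reading much into a ranking.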