Elizabeth Churchill, "Data by Design"

3,976 views

Published in: Education, Technology

  1. Data by Design. Elizabeth F. Churchill
  2. Design/Science of participation: (1) Science through (platforms for mediated communication) → TMSP; (2) Science on (social science contributions about fundamentals of psychology/communication/collaboration/cooperation) → “Hubble telescope” of social science. WE NEED TO ADDRESS THE DESIGN OF DATA (FOR) SCIENCE ISSUE DIRECTLY
  3. On (1) – TMSP via SMPs ▪ Awareness ▪ Conversation and content exchange good; content storage, indexing and search poor ▪ Content sharing ▪ Malleable as well as stable content ▪ Coordination ▪ Long and short term ▪ Collaborative production ▪ Lightweight to complex ▪ Longevity ▪ Currently questionable…
  4. Cooperative activities, centralised; Collective action, centralised; Collective action, decentralised
  5. On (2) – Sciences of the social ▪ Data quality ▪ descriptive/predictive; observed/understood; local/universal; reactive/proactive; stand-alone/replicated ▪ Science quality ▪ Data stability/longevity, TOS, content and social responsibility. WE NEED TO ADDRESS THE DESIGN OF DATA (FOR) SCIENCE ISSUE DIRECTLY. Designers : Statisticians : Computer scientists : Data Scientists : Social scientists
  6. Focus on (2): Mike Loukides, http://radar.oreilly.com/2010/06/what-is-data-science.html
  7. On Data Science: “What differentiates data science from statistics is that data science is a holistic approach. We’re increasingly finding data in the wild, and data scientists are involved with gathering data, massaging it into a tractable form, making it tell its story, and presenting that story to others.” The first step of any data analysis project is “data conditioning,” or getting data into a state where it’s usable.
  8. On Data Science: The most meaningful definition I’ve heard: “big data” is when the size of the data itself becomes part of the problem. The need to define a schema in advance conflicts with reality of multiple, unstructured data sources, in which you may not know what’s important until after you’ve analyzed the data.
  9. On Data Science: Data scientists … come up with new ways to view the problem, or to work with very broadly defined problems: “here’s a lot of data, what can you make from it?” The future belongs to the companies who figure out how to collect and use data successfully. → …and the scientists?
  10. Business logic is not science logic
  11. http://www.forbes.com/sites/onmarketing/2012/06/28/social-media-and-the-big-data-explosion/
  12. Data – the ‘this is the dataset’ problem
  13. Verbeeldingskr8 on Flickr
  14. Interface elements… lead to data, inviting action and inviting information
  15. Facebook
  16. Like! Like? Agree! Disagree! (bookmarked) Hello Sherry
  17. Dating
  18. profile creation; explicit versus passive; “personalisation”
  19. Anxiety, self reflection, identity… Eva Illouz
  20. Flickr
  21. Recording and Sharing; Documenting; Personal and Collective Memory; Competition; Status; Affiliation; Group Membership; Learning; Emulating; Awareness; Near and Far; Curiosity/Voyeurism
  22. Flickr – Photo sharing by user location
  23. The Library of Congress, the Powerhouse Museum, the Smithsonian, New York Public Library, and Cornell University Library
  24. http://www.flickr.com/photos/powerhouse_museum/2980051095/
  25. http://www.museumsandtheweb.com/mw2011/papers/rethinking_evaluation_metrics_in_light_of_flic
  26. Data longevity: “Like all Commons members, the other qualitative measure we value highly is the sheer inventiveness of Flickr members who engage with the photographs. Currently, Cornell saves links to examples of reuse on delicious (http://www.delicious.com) and displays them as a feed on its website.”
  27. Business logic is not science logic
  28. Design/Science of participation: (1) Science through (platforms for mediated communication) → TMSP; (2) Science on (social science contributions about fundamentals of collaboration/cooperation) → “Hubble telescope” of social science
  29. Reflections on requirements ▪ Stability: the existence of content in an accessible (and hopefully the same) format over time ▪ Science requires ▪ Consistency: consistently re-coding the same data in the same way over a period of time ▪ Reproducibility: the tendency for a group of coders to classify category membership in the same way ▪ Accuracy: the extent to which the classification of a text corresponds statistically to a standard or norm ▪ Validity: correspondence of the categories to the conclusions, avoiding ambiguity and addressing multiple possible classifications ▪ Proof: trust in the inferential procedures and clarity about what level of implication is allowed, i.e. do the conclusions follow from the data, or are they explainable by some other phenomenon? ▪ Generalizability of results to a theory ▪ Cross-setting comparative interventions
  30. On (2) – Sciences of the social ▪ Data quality ▪ descriptive/predictive; observed/understood; local/universal; reactive/proactive; stand-alone/replicated ▪ Science quality ▪ Data stability/longevity, TOS, content and social responsibility. WE NEED TO ADDRESS THE DESIGN OF DATA (FOR) SCIENCE ISSUE DIRECTLY. Designers : Statisticians : Computer scientists : Data Scientists : Social scientists
  31. Questions? churchill@acm.org; xeeliz on Twitter
  32. Acknowledgements. On dating: Elizabeth Goodman; on Flickr: Shyong (Tony) Lam; on instrumentation and analysis: David Ayman Shamma & M. Cameron Jones; on Flickr Commons: George Oates. Flickr photographers: Marina Noordegraaf (Verbeeldingskr8), Tim Jagenberg, Nicolas Nova
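
Slide 29 names reproducibility, meaning that a group of coders classifies the same items in the same way, as a requirement for science. One common way to quantify agreement between two coders is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is an illustration only and is not taken from the talk; the function name `cohens_kappa` and the sample labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders' category labels.

    Both arguments are equal-length sequences holding one label per item.
    Returns 1.0 for perfect agreement and 0.0 for chance-level agreement.
    """
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("both coders must label the same non-empty item set")

    n = len(coder_a)

    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Expected agreement if each coder labelled at random while keeping
    # their own label frequencies.
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for four pieces of content, coded twice.
first_pass = ["status", "memory", "status", "learning"]
second_pass = ["status", "memory", "status", "learning"]
kappa = cohens_kappa(first_pass, second_pass)
```

Applying the same function to one coder's labels at two points in time gives a rough check on the consistency requirement as well, assuming the underlying content is still accessible in the same form.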
