How is Data Made? From Dataset Literacy to Data Infrastructure Literacy


Slides for presentation at Data Literacy Workshop at Web Science 2015, 30th June 2015.

Published in: Education

  1. 1. How is Data Made? From Dataset Literacy to Data Infrastructure Literacy. 30th June, Web Science 2015, University of Oxford. Jonathan Gray | @jwyg
  2. 2. Some thoughts on data literacy beyond the dataset.
  3. 3. Not just reading and using datasets.
  4. 4. Thinking critically and constructively about their contexts of production.
  5. 5. Bigger picture: role of data in society.
  6. 6. Not just literacies to read and use datasets, but literacies to read and shape data infrastructures.
  7. 7. What is data literacy?
  8. 8. What is data?
  9. 9. A metaphor.
  10. 10. Data and photography.
  11. 11. Jonathan Gray (2012) “What Data Can and Cannot Do”. The Guardian. Available at:
  12. 12. Jonathan Gray (2012) “What Data Can and Cannot Do”. The Guardian. Available at:
  13. 13. Early optimism about veracity and fidelity of photography.
  14. 14. – Franklin v. State of Georgia, 69 Ga. 36; 1882 Ga “We cannot conceive of a more impartial and truthful witness than the sun, as its light stamps and seals the similitude of the wound on the photograph put before the jury; it would be more accurate than the memory of witnesses, and as the object of all evidence is to show truth, why should not this dumb witness show it?”
  15. 15. Critical literacy around photography.
  16. 16. Critical literacy to read images:
      • How is the camera set up to take shots?
      • What is captured and how?
      • What is not captured?
      • How does equipment mediate the image?
      • Selection, framing, arrangement, post-production?
  17. 17. Instead of the camera, the elaborate sprawl of public information systems.
  18. 18. Data infrastructures as socio-technical systems.
  19. 19. What do they measure or capture, and how?
  20. 20. But datasets are not photographs.
  21. 21. Specificities of data infrastructures.
  22. 22. Datasets are heterogeneous.
  23. 23. Datasets are generated by a mixture of social and technical processes, including e.g.:
      • Laws and policies
      • Administrative protocols
      • Registration procedures
      • Instruments and equipment
      • Software systems
      • Financial audits
      • Feedback systems
      • Management systems
      • Metadata from digital services
      • Standards bodies/standardisation procedures
  24. 24. Data literacy is not just about knowing how to use data analysis software or understanding statistics.
  25. 25. But also understanding methods, rationales, assumptions, definitions, technologies, institutions, through which datasets were generated.
  26. 26. Democratising the data revolution.
  27. 27. Not just liberalising access to the informational by-products of public institutions.
  28. 28. But also bringing data infrastructures back into realm of democratic political life.
  29. 29. Recent examples.
  30. 30. 1. Beneficial ownership advocacy. 2. “Statactivism” and counting the uncounted.
  31. 31. 1. Beneficial ownership advocacy. 2. “Statactivism” and counting the uncounted.
  32. 32. Gray, J. & Davies, T. (2015) “Fighting Phantom Firms in the UK: From Opening Up Datasets to Reshaping Data Infrastructures?”. Available at SSRN:
  33. 33. In case of campaigning around company ownership, the disclosure of existing datasets was not enough.
  34. 34. Civil society organisations had to undertake a more creative, sustained and holistic engagement with shaping and influencing the development of data infrastructures as socio-technical systems.
  35. 35. This included research and advocacy around:
      • Costs, functionalities and user interfaces of software systems that would run the register;
      • Changes to primary and secondary legislation;
      • Additional administrative requirements and their impacts on different actors inside and outside the public sector.
  36. 36. Campaigners had to look beyond the question of what information is released, towards the question of what information is collected and generated by the public sector in the first place, and how this information is generated through data infrastructures.
  37. 37. 1. Beneficial ownership advocacy. 2. “Statactivism” and counting the uncounted.
  38. 38. 1. Beneficial ownership advocacy. 2. “Statactivism” and counting the uncounted.
  39. 39. “Statactivism”
  40. 40. Bruno, I., Didier, E. & Vitale, T. (2014) “Statactivism: Forms of Action between Disclosure and Affirmation”. Available at SSRN:
  41. 41. Not just blanket critique of, or withdrawal from, quantification and “metrification”.
  42. 42. Highlighting limitations of existing forms of measurement and proposing alternatives.
  43. 43. For example, gender equality, climate change, working conditions and health.
  44. 44. What should be measured and how?
  45. 45. What is not currently being measured?
  46. 46. Recent examples from data journalism.
  47. 47. New “action repertoires” for civil society actors to shape data infrastructures.
  48. 48. To what extent do data infrastructures address needs and interests of civil society actors?
  49. 49. How to broaden the publics that shape data as well as the publics that use it?
  50. 50. Legal, social and technical measures for making open data initiatives more responsive to concerns of civil society?
  51. 51. ROUTE TO PA:
  53. 53. Question of what is measured and how.
  54. 54. But also who uses information, and how information acts.
  55. 55. From “information as resource” to “information as agent”. (Sandra Braman, Change of State, MIT Press, 2009)
  56. 56. “Participatory data infrastructures”
  57. 57. In conclusion…
  58. 58. Going beyond focus on literacy with datasets, towards literacy with data infrastructures through which they are generated.
  59. 59. Role of data infrastructures in addressing global challenges - from climate change to tax base erosion.
  60. 60. Data infrastructures as crucial part of democratic politics in 21st century.
  61. 61. Jonathan Gray | @jwyg