Data stories


Engaging with data in a post-truth world

Published in: Data & Analytics

  1. 1. DATA STORIES: ENGAGING WITH DATA IN A POST-TRUTH WORLD
    Elena Simperl (@esimperl)
    Data science seminar, Feb 19th 2018
  2. 2. “One of the interpretations of the EU referendum result and the rise of Donald Trump in the US is that we are now living in a post-truth society - a world in which anecdotes shared on social media and invented numbers thrown on the sides of buses are more trusted and influential than official statistics, extensive research, and proven expertise. In this world, scientists, statisticians, analysts, and journalists must find new ways to bring hard, factual data to citizens.” “Data must entertain as well as inform, excite as well as educate. It must be built with social media sharing in mind, and become part of our everyday activities and digital interactions with others.”
  3. 3. Data Stories looks at frameworks and technology to bring data closer to people through art, games, and storytelling. It examines the impact that varying levels of localisation, topicalisation, participation, and shareability have on the engagement of the public with factual evidence. It delivers tools and guidance for communities and civic groups to achieve wider participation and support for their initiatives, and empowers artists, designers, statisticians, analysts, and journalists to communicate through data in inspiring, informative ways.
  4. 4. “Data is infrastructure. It underpins transparency, accountability, public services, business innovation and civil society.”
  5. 5. How do we help people tell their data stories? What data stories do people share and why? How do we make data more engaging?
  6. 6. HUMAN DATA INTERACTION
    • Term originally introduced in (Crabtree and Mortier, 2015) in the context of personal data
    • A multidisciplinary field that places human factors at the centre of attention in everything data
    • Considers the whole interaction process between people and data, and the context in which such interactions take place
  8. 8. RESEARCH QUESTIONS
    • Who searches for data and why?
    • How do people search for data?
    • What sort of queries do they write?
    • Do they need query writing support?
    • How should results be displayed?
    • Do they need one or more search sessions to find what they are looking for?
    • Is the search exploratory?
    • How do people pick the best results?
  9. 9. CONCEPTUAL FRAMEWORKS FOR INTERACTING WITH DATA HELP SYSTEM DESIGNERS IDENTIFY USER TASKS AND TAILOR FEATURES
    Existing frameworks
    • Belkin et al. introduced a faceted approach to conceptualizing tasks in information seeking (Belkin et al., 2008)
    • Yi et al. introduced a taxonomy of tasks in information visualisation (Yi et al., 2007)
    • We introduced an interaction framework for structured data (Koesten et al., 2017)
  10. 10. INTERACTING WITH STRUCTURED DATA
    [Diagram labels: Goal or process oriented; Web, Data portals, People, FoI; Relevance, Usability, Quality; Visual scan, Obvious errors, Basic stats, Headers, Metadata]
    Koesten, L.M., Kacprzak, E., Tennison, J.F. and Simperl, E., 2017, May. The Trials and Tribulations of Working with Structured Data: a Study on Information Seeking Behaviour. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (pp. 1277-1289). ACM.
  11. 11. ANALYSIS OF SEARCH BEHAVIOUR INFORMS THE DESIGN OF DATA SEARCH ENGINES
    • Four national open governmental data portals, 2.2 million queries from 2013-2016 (Kacprzak et al., 2017)
    • Shorter queries, which include temporal and location information
    • Explorative search
    • Difference in topics between queries issued directly to portals and web search engines
    • Ongoing work: comparison to data requests
    Kacprzak, E., Koesten, L.M., Ibáñez, L.D., Simperl, E. and Tennison, J., 2017. A Query Log Analysis of Dataset Search. In International Conference on Web Engineering (pp. 429-436). Springer.
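The kind of measurement behind "shorter queries" that "include temporal information" can be sketched in a few lines. This is only an illustration, not the method used in the cited study: the sample queries are invented, and the rule that a four-digit number between 1900 and 2099 counts as temporal information is an assumption made for this sketch.

```python
import re
from statistics import mean

# Hypothetical sample queries; the real portal logs are not reproduced here.
SAMPLE_QUERIES = [
    "population 2015",
    "crime statistics london",
    "school performance tables 2014",
    "flood risk",
]

# Assumption: a four-digit number from 1900 to 2099 signals temporal information.
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")


def query_length(query: str) -> int:
    """Number of whitespace-separated terms in a query."""
    return len(query.split())


def has_year(query: str) -> bool:
    """True if the query contains a year-like token."""
    return YEAR_PATTERN.search(query) is not None


def summarise(queries: list[str]) -> dict[str, float]:
    """Mean query length in terms and share of queries mentioning a year."""
    lengths = [query_length(q) for q in queries]
    with_year = sum(has_year(q) for q in queries)
    return {
        "mean_length": mean(lengths),
        "share_with_year": with_year / len(queries),
    }


if __name__ == "__main__":
    print(summarise(SAMPLE_QUERIES))
```

Detecting location information would need a gazetteer or a named-entity tagger, which is left out of this sketch.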
  12. 12. DATA SUMMARIES HELP PEOPLE MAKE SENSE OF DATA EFFECTIVELY
    Study with experts and novices, 20 datasets
    • Task: Write a summary (100 words) about the data
    • Analysis: thematic analysis, comparison with existing summaries and metadata schemas
    Automatically generating text from structured data
    • Neural network architecture
    • Tested on DBpedia/Wikidata triples in English, Arabic, Esperanto
    • Text reused by editors to start new articles
    Vougiouklis, P., Elsahar, H., Kaffee, L.A., Gravier, C., Laforest, F., Hare, J. and Simperl, E., 2017. Neural Wikipedian: Generating Textual Summaries from Knowledge Base Triples. arXiv preprint arXiv:1711.00155.
  14. 14. See
  16. 16. VIRAL DATA HELPS (ALTERNATIVE) FACTS SPREAD FASTER
    • How does data travel? E.g. on social media
    • What makes data go viral? Visualisations? Subject matter/topic?
    • “Transmission vectors”: journalists, celebrities, grassroots, botnets?
  17. 17. CURRENT DATA SHARING PRACTICES ON TWITTER
    • What evidence can we see of data sharing activities?
    • What form is data being shared in?
    • How are the various stages of the data science pipeline represented?
    • Does anyone share raw data?
    • Do narratives explicitly reference the data that they are built on?
    • How common is data sharing?
    • Who is it done by?
    • How do they do it?
    • What kind of data is (not) being shared?
    • Who makes use of the data for what purposes?
  18. 18. OFFICIAL DATA
    • Six-week Twitter study of
    • 1186 original tweets made by 898 people, with 4906 subsequent retweets
    • Of the 15 most active tweeters, half work for the ONS or are official ONS accounts
    • Most retweeted tweet (503 times) is by a BBC journalist mentioning an ONS data visualisation
    • One of the 64 separate tweets about this ONS data release
  19. 19. OPEN DATA
    • Six-week Twitter study of
    • 113 original tweets made by 87 different accounts, with 258 subsequent retweets
    • No bias towards organisational affiliation is present in the set of active retweeters
    • The single most retweeted tweet (121 times) is by a Joint Nature Conservation Committee earth observation specialist. Mentions a crop map visualisation from
  20. 20. SHARING SPREADSHEETS
    • No XLSX, but Google Sheets
    • 1475 original tweets from 1067 unique accounts with 6923 retweets
    • Most retweeted spreadsheet (1188 times)
      • Schedule for the timings of INKIGAYO broadcasts (famous Korean livestreamed pop music program with live voting)
      • Sent by account promoting BTS, a recent high profile K-pop band (the first to win a Billboard newcomers award in the US)
      • Gives detailed song broadcast timings
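The three Twitter studies above report the same kinds of counts, so they can be put side by side with some simple ratios. The sketch below only re-derives figures from the slide numbers; the `Study` class and its method names are hypothetical, and treating the most retweeted spreadsheet (1188 retweets) as a single tweet is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    """Counts reported on one slide (class and field names are illustrative)."""
    name: str
    original_tweets: int
    accounts: int
    retweets: int
    top_item_retweets: int

    def retweets_per_tweet(self) -> float:
        return self.retweets / self.original_tweets

    def tweets_per_account(self) -> float:
        return self.original_tweets / self.accounts

    def top_item_share(self) -> float:
        # Share of all retweets that went to the single most retweeted item.
        return self.top_item_retweets / self.retweets


# Figures copied from the slides on official data, open data and spreadsheets.
STUDIES = [
    Study("Official data", 1186, 898, 4906, 503),
    Study("Open data", 113, 87, 258, 121),
    # Assumption: the 1188 retweets of the top spreadsheet belong to one tweet.
    Study("Spreadsheets", 1475, 1067, 6923, 1188),
]

if __name__ == "__main__":
    for s in STUDIES:
        print(
            f"{s.name}: {s.retweets_per_tweet():.2f} retweets per tweet, "
            f"{s.tweets_per_account():.2f} tweets per account, "
            f"top item share {s.top_item_share():.0%}"
        )
```

On these numbers, the open data sample depends far more on a single tweet than the other two: its most retweeted tweet accounts for roughly 47% of all retweets.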
  21. 21. SPREADSHEET CATEGORIES AND USE
    • Visual inspection of 100 highly retweeted sheets
      • sports statistics (including gambling analysis)
      • computer games statistics
      • catalogues of resources/assets (including artist’s videos or a series of TV episodes)
      • selling goods/artwork/services for a trader or fan group
      • coordinating donations/volunteers, political info
      • coordinating political activity
      • music voting
      • buying on behalf of an artist
      • monitoring cryptocurrency offerings
    Simple list: 10%
    Rich data: 40%
    Data analysis: 10%
    Promoting action: 15%
    Coordinating crowd action: 20%
    Other: 5%
  22. 22. USE OF CHARTS
    • 5% (29) of sheets contained charts
    • 4 charts intended to promote subsequent use and discussion
      • Survey of fanfic community from NYC festival attendees
      • A maths teacher who takes part in Maths Teaching discussion groups tweeted a Google form to record preferences for banana ripeness
      • A study on the citation of Registered Reports in Cognitive Neuroscience
      • Historic weather data collected by a local citizen offered to a “sports weather” journalist
    Games (trading, playing, curation): 7
    Politics (monitoring, organising, arguing): 6
    Surveys (attitudes, phenomena): 4
    Financial investment analysis: 3
    Personal list of assets/achievements: 2
    TV/radio (voting/ratings): 2
    Trading (orders): 1
    Miscellaneous data collection (historic weather data; Boeing 787 production data (hobbyist); Google Analytics audit of Udemy; academic citation analysis): 4
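The figures on this slide can be cross-checked against each other. The sketch below assumes that the per-topic counts refer to the 29 sheets with charts, and that "5%" is a rounded share, so the implied number of inspected sheets is only approximate. The function names are illustrative.

```python
# Per-topic counts copied from the slide; assumed to describe the 29 charts.
CHART_TOPICS = {
    "Games (trading, playing, curation)": 7,
    "Politics (monitoring, organising, arguing)": 6,
    "Surveys (attitudes, phenomena)": 4,
    "Financial investment analysis": 3,
    "Personal list of assets/achievements": 2,
    "TV/radio (voting/ratings)": 2,
    "Trading (orders)": 1,
    "Miscellaneous data collection": 4,
}

CHARTS_REPORTED = 29  # "5% (29) of sheets contained charts"
CHART_SHARE = 0.05    # rounded on the slide, so treat as approximate


def total_charts(topics: dict[str, int]) -> int:
    """Sum of the per-topic chart counts."""
    return sum(topics.values())


def implied_sheet_count(charts: int, share: float) -> int:
    """Approximate number of sheets implied by a chart count and its share."""
    return round(charts / share)


if __name__ == "__main__":
    print("Topic counts sum to:", total_charts(CHART_TOPICS))
    print("Implied sheets (approx.):", implied_sheet_count(CHARTS_REPORTED, CHART_SHARE))
```

The topic counts add up to the 29 charts reported, and a 5% share implies a sample of roughly 580 sheets.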
  23. 23. USE OF CHARTS (2)
    • 2 charts support an argument or discussion
      • UN data on firearms. Discussion thread between pro- and anti-NRA positions. Sent by the author, a senior technologist at Microsoft.
      • Use of the Physics GRE in North American university physics admission processes. Sent by a delegate at the Conference for Undergraduate Underrepresented Minorities in Physics, not the spreadsheet author.
  25. 25. Can games help people get familiar with data?
  26. 26. DATA GAMES HELP PEOPLE EXPLORE FACTS
    • Minecraft maps generated using LIDAR data
    • Demonstrate effects of global warming
    • Create/model archaeological digs over different time periods
    C. Gutteridge, Magical Minecraft Map Maker, 2015
  27. 27. Alexa, what are our discount levels on those sales?
  28. 28. DATA AS CULTURE, 2018
    Exhibition at the Open Data Institute, London
    Launched January 23rd 2018
    Curated by Julie Freeman and Hannah Redler Hawes
  29. 29. Dan Hett, Lee Montgomery, Pip Thornton, Riita Oittinen
  30. 30. WE’RE HIRING @esimperl