20120921 storytelling withopendata
 

Upload Details

Uploaded via as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

  • My name is Rhiannon Coppin and I’m supposed to be here today to talk about Storytelling with Open Data,
  • But instead I’m going to be talking more about Storytelling with Open-ISH Data, because I find that, overall, there isn’t at this time a great variety of ready-to-use, restriction-free, accessible data. But that’s okay. There’s still a lot of shareable, usable, interesting data, if you know how to find it, or recognize it when it’s staring you in the face.
  • And that’s what I hope I can help with – introducing some of you to ideas and possibilities, so that when you do come across a unique data set, you see the opportunity for how it could be shared. And so I’ll add the subtitle “An incomplete survey of cool stuff”, which is a sampling of apps and visualisations by journalists and coders that caught my eye.
  • A bit about me. Some days I wake up and write and edit news stories for the CBC website. Other days I engage in nerdly hobbies, which also tend to involve sitting in front of the computer.
    Recently I was able to combine the two identities by working on an open(ish) data project for the CBC. We started with one and a half million attendance records from the Vancouver School Board and asked the question: does the data show that students are in fact having a hard time making it to their first class? The answer, which you’d expect, is yes. But sometimes the data told stories we hadn’t thought of. For one, I didn’t expect to see that Grade 12s were having the hardest time being on time. Another query that didn’t make it into the story was about the most-missed classes, and would you believe PLANNING 10 was in there near the top?
    I am particularly excited about three areas of computer science that I see leaking out into the wild, because even journalists are using them. I will touch on these sometimes explicitly, sometimes implicitly, and if you have specific questions about any of the content or links please come see me after.
    Also, I studied journalism in New York and this presentation is a bit U.S.-centric, mainly because of my familiarity with the work of other American journalists and coders I stay in touch with and keep tabs on.
  • So on to storytelling. I’m going to divide the entire corpus of stories into two categories. There are the stories we know and like to tell over and over again, things like “U.S. cities are segregated,” which most of us know to some degree, if only through movies. With the 2010 census data, the New York Times published a block-by-block point map of the dominant racial/ethnic origin as recorded in the census. There are also the stories we maybe don’t know, which the map on the right shows. Using 2010 and year-2000 census data (albeit in larger chunks), the urban research group shows that over the past decade the segregation has become less pronounced.
  • Staying with the topic of census data, which is probably the most popular single “open data” source among journalists, I wanted to show you a set of stacked charts designed by a Stanford Human-Computer Interaction guy named Jeff Heer, who spoke at a computer-assisted reporting conference last year. This is a story we already know, the existence of the baby boomers, but sometimes seeing it put another way really drives home the demographic anomaly we’re facing in our current population.
  • The previous slide was put together with an open source JavaScript library called Data-Driven Documents (D3.js), and when I was browsing a list of examples I came across this fellow’s visualization of the authors and dates of publication for papers that came out of his team or department. It’s interesting to note the change in topic areas over the past 20 years, and the story of who came in and out of the department. I imagine a visualization like this could be done for any list of citations.
  • Another D3 example is this chart showing the interactions between characters in Victor Hugo’s Les Misérables. It’s open(ish) because not many of these data sets exist, and there may be copyright issues for those that do.
  • Open(ish) because, in this example, the documents about the Iraq War, which are available here and there online, were initially leaked to the media. 11,000 separate documents…
  • Edwin Chen, a data scientist at Twitter, analyzed
  • Fernanda Viégas and Martin Wattenberg lead Google's "Big Picture" visualization research group in Cambridge, Massachusetts. Before joining Google, the two founded Flowing Media, Inc., a visualization studio focused on media and consumer-oriented projects. Prior to Flowing Media, they led IBM’s Visual Communication Lab, where they created the ground-breaking public visualization platform Many Eyes.
  • And I’ll add the subtitle: “An incomplete survey of cool stuff I’m working towards” (as soon as I find time... the thing I really need to find a conference about).
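
The attendance questions described in the notes above (which grade is late most often for first class, and which classes are missed most) come down to a group-and-count over the records. Here is a minimal sketch of that idea in Python. The field names (`grade`, `course`, `period`, `status`) and the sample rows are invented for illustration; they are not the Vancouver School Board's actual schema or figures.

```python
from collections import Counter, defaultdict

# Hypothetical records: field names and values are made up for illustration only.
SAMPLE_RECORDS = [
    {"grade": 12, "course": "ENGLISH 12", "period": 1, "status": "late"},
    {"grade": 12, "course": "ENGLISH 12", "period": 1, "status": "present"},
    {"grade": 10, "course": "PLANNING 10", "period": 1, "status": "absent"},
    {"grade": 10, "course": "PLANNING 10", "period": 3, "status": "absent"},
    {"grade": 10, "course": "MATH 10", "period": 1, "status": "present"},
    {"grade": 8, "course": "SCIENCE 8", "period": 1, "status": "present"},
]


def late_rate_by_grade(records, period=1):
    """Return {grade: share of records in the given period marked 'late'}."""
    totals = defaultdict(int)
    lates = defaultdict(int)
    for rec in records:
        if rec["period"] != period:
            continue  # only look at the class period we are asking about
        totals[rec["grade"]] += 1
        if rec["status"] == "late":
            lates[rec["grade"]] += 1
    return {grade: lates[grade] / totals[grade] for grade in totals}


def most_missed_courses(records, top_n=3):
    """Return the top_n (course, absence count) pairs, most absences first."""
    absences = Counter(r["course"] for r in records if r["status"] == "absent")
    return absences.most_common(top_n)


if __name__ == "__main__":
    print(late_rate_by_grade(SAMPLE_RECORDS))
    print(most_missed_courses(SAMPLE_RECORDS))
```

With a million and a half rows the same two questions would more likely be asked in a database or in a tool such as R, but the shape of the query is the same: filter, group, count, compare.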

20120921 storytelling withopendata Presentation Transcript

  • Storytelling with Open Data Rhiannon Coppin Journalist, coder @coppinr
  • Storytelling with Open Data (An incomplete survey of cool stuff) Rhiannon Coppin Journalist, coder @coppinr
  • A bit about me…
    Web Writer for CBC.ca (part-time) & Dabble in code, math, stats
    • Excited about:
      – Natural Language Processing (NLP)
      – Machine Learning (clustering, regression)
      – Automation (feeds)
  • Types of stories we tell:
    1) Things we already know
    2) Things we don’t already know
    http://projects.nytimes.com/census/2010/explorer?ref=us
    http://www.urbanresearchmaps.org/comparinator/pluralitymap.htm
  • Open data
    • Government data
    http://vis.stanford.edu/jheer/d3/pyramid/shift.html
  • Open data
    • Publication information
    http://www.cs.umd.edu/~bederson/papers/index.html
  • Open(ish) data
    • Publication content – things we know
    http://bost.ocks.org/mike/miserables/
  • Open(ish) data
    • Publication content – things we don’t know
    http://overview.ap.org/
  • Twitter: Things we know…
    http://blog.echen.me/2012/07/06/soda-vs-pop-with-twitter/
  • Twitter: Things we don’t know…
    http://www.guardian.co.uk/uk/interactive/2011/dec/07/london-riots-twitter
  • More animation
    • Motion Charts
    http://www.gapminder.org/world
  • More animation (automated)
    • Imagine… circulation data
    http://www.wunderground.com/US/Region/US/2xWindSpeed.html
  • More animation (automated)
    • Imagine… circulation data
    http://hint.fm/wind/
  • (some) Tools
    Open Source
    • D3.js (also: jQuery)
    • Overview
    • R
    • PANDA
    Free
    • Google Refine, Fusion Tables, Motion Charts
    Other: Excel, SPSS
  • Where to go for more
    • Open news – Ideas, people, code:
      – open.blogs.nytimes.com
      – propublica.org/nerds
      – blog.apps.chicagotribune.com
      – datadesk.latimes.com
      – guardian.co.uk/data
      – mozillaopennews.org
    • Coursera (free!) – stats, data analysis, NLP
  • Thank you!
    Storytelling with Open Data (An incomplete survey of cool stuff) Rhiannon Coppin Journalist, coder @coppinr
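
For anyone who wants to try something like the character-interaction chart linked in the transcript (bost.ocks.org/mike/miserables/), the counting step behind a chart of that kind can be sketched in a few lines of Python. This assumes the input is a list of scenes, each given as a list of character names, and that an "interaction" means two characters appearing in the same scene. The scene list below is a toy placeholder and the function name is hypothetical; neither comes from the actual data set or from D3.

```python
from collections import Counter
from itertools import combinations

# Toy placeholder input: each inner list names the characters in one scene.
# These scenes are invented for illustration, not taken from a real data set.
TOY_SCENES = [
    ["Valjean", "Javert"],
    ["Valjean", "Cosette", "Marius"],
    ["Valjean", "Javert"],
]


def cooccurrence_counts(scenes):
    """Count how many scenes each pair of characters shares."""
    counts = Counter()
    for scene in scenes:
        # sorted(set(...)) drops repeats within a scene and keeps pair order stable,
        # so ("Javert", "Valjean") and ("Valjean", "Javert") land on the same key.
        for a, b in combinations(sorted(set(scene)), 2):
            counts[(a, b)] += 1
    return counts


if __name__ == "__main__":
    for (a, b), n in cooccurrence_counts(TOY_SCENES).most_common():
        print(a, b, n)
```

The resulting pair counts are the kind of input a network or matrix visualization needs: each pair is a link, and the count is its weight.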