A presentation at Data Driven Connecticut 2014: Progress and Possibilities. Moving from Data to Action: A Connecticut Data Collaborative Conference on Friday, November 24, 2014 at Yale School of Management, Evans Hall, New Haven, Connecticut. See Notes for presentation script.
Seek truth and report it
Minimize harm - Balance the public’s need for information against
potential harm or discomfort.
Act independently
Be accountable and transparent
Ethical challenges
Editor's Notes
Data has always played a role in journalism. When I was a college student 20+ years ago, journalists called it COMPUTER ASSISTED REPORTING. Now all reporting involves computers.
Data used to be hard to get. News organizations rarely collected their own data sets, so journalists were dependent upon public agencies, academics and think tanks. It was the journalists' job to find out what data public agencies had in their possession and
use FREEDOM OF INFORMATION requests to get access to that information. Getting the data often was an arduous process. Cleaning and parsing the data was equally difficult and time-consuming.
But now, thanks to the internet and the push for OPEN DATA, that scenario has changed for the most part. Data is more abundant than ever, and much of it is publicly available online.
The software used to format, analyze and visualize data quickly is more abundant, too.
We still have to file FOI requests, and government agencies often post data online so they DON’T have to talk to journalists. But journalists still have to ask questions and do reporting to understand the data.
Why should the public care whether journalists use data?
Public opinion is often driven by news coverage. Storytelling is a central part of news.
Journalists have an important role in shaping how the public perceives certain issues, by virtue of their large audiences and their choices about what information to frame and amplify.
Data journalism begins in one of two ways: either there is a question that needs data to answer, or a dataset in need of questioning.
You have to be able to trust the quality of the data, and that means cleaning it. Cleaning typically takes two forms: removing human error, and converting the data into a format that is consistent with other data you are using.
For example, datasets will often include some or all of the following: duplicate entries, empty entries, bad formatting, and multiple names for the same thing (UConn vs. University of Connecticut).
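A minimal Python sketch of that kind of cleaning might look like the following. The column names (`school`, `year`), the alias table and the sample rows are all made up for illustration; a real dataset would need its own list of variant names.

```python
import re

# Hypothetical alias table: maps known variant spellings to one canonical name.
ALIASES = {
    "uconn": "University of Connecticut",
    "university of connecticut": "University of Connecticut",
}

def normalize_name(raw):
    """Collapse stray whitespace, then map known variants to one canonical name."""
    cleaned = re.sub(r"\s+", " ", raw).strip()
    return ALIASES.get(cleaned.lower(), cleaned)

def clean_rows(rows):
    """Drop empty entries and exact duplicates; standardize school names."""
    seen = set()
    cleaned = []
    for row in rows:
        school = normalize_name(row.get("school") or "")
        if not school:
            continue  # empty entry: nothing to report on
        record = {**row, "school": school}
        key = tuple(sorted(record.items()))
        if key in seen:
            continue  # duplicate entry once names are standardized
        seen.add(key)
        cleaned.append(record)
    return cleaned

# Made-up sample rows, for illustration only.
sample = [
    {"school": "UConn", "year": "2014"},
    {"school": "University of  Connecticut ", "year": "2014"},  # same thing, bad formatting
    {"school": "", "year": "2014"},                              # empty entry
    {"school": "UConn", "year": "2013"},
]
result = clean_rows(sample)
```

Note that the second sample row only shows up as a duplicate after the name is standardized, which is why the name fix has to happen before the duplicate check.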
Context: who gathered it, when, and for what purpose? How was it gathered? (The methodology). What exactly do they mean by that?
Combine two or more datasets with a common data point. That might be a politician’s name, for example, or a school, or a location.
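Combining datasets on a common data point can be sketched in a few lines of Python. Everything here is hypothetical: the `join_on` helper, the `school` key, and the placeholder school names and figures are for illustration, not real data.

```python
def join_on(left, right, key):
    """Inner join: pair each left row with every right row sharing the same key value."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined

# Made-up placeholder datasets that share a "school" field.
enrollment = [
    {"school": "School A", "students": 500},
    {"school": "School B", "students": 320},
]
budgets = [
    {"school": "School A", "budget": 1000000},
    {"school": "School C", "budget": 750000},
]
merged = join_on(enrollment, budgets, "school")
```

Only rows whose key appears in both datasets survive the join, so the names need to be cleaned first: "UConn" in one file will not match "University of Connecticut" in the other.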
Data provides the foundation for a story.
Journalists can think about data as clues to a story.
In the past month, my journalism students at UConn have analyzed specific data sets from the University’s Office of Institutional Research and created data visualizations from those sets that reveal trends and story ideas.
Biological sciences is now the largest and most popular liberal arts degree at UConn.
The number of international students at UConn has risen 278 percent in the last 10 years. Last year, half of those students came from China.
African American women enrolling at UConn outpace African American men.
Start with that trend or outlier and then ask more questions.
Journalists armed with data have the ability to highlight issues that may have slipped under the radar of policymakers.
And anecdotal evidence backed by a data set is more powerful than either one alone.
In 2008, Nate Silver, a relatively unknown baseball statistician, correctly predicted every Senate race and all but one state in the presidential election. His blog, FiveThirtyEight, became very popular. Last year, he shopped it around and was picked up by ESPN.
Vox – data & explanatory journalism
ProPublica – nonprofit, gives away data sets it obtains through FOI, sells other data it had to work hard to crunch
Crowdsourced data: ask the public to participate and help in collecting data. It has inherent issues of accuracy, but can still be helpful, especially during natural disasters. Examples: which gas stations were open during Snow-pocalypse; Google Person Finder during Typhoon Haiyan.
If data journalism means the analysis of and reporting on data sets that already exist, sensor journalism goes a step further: organizations and journalists use sensor technology to create their own real-time data and then report on it.
What sensors do best is detect characteristics of the physical world — properties such as light, heat, sound, pressure, vibration, air quality and moisture.
Journalists are thinking about stories they could tell if only they had some data. They can get that data with relatively little money and effort, if they’re willing to veer well outside the usual tools journalists work with.
“take human observations and impressions and make them specific, so that they might be used for comparisons.”
Public sensors – members of the public complained of speeding cops, and there was a high-profile incident in which a child was killed. Using data from toll transponders, reporters were able to prove cops WERE speeding. They determined speed based on how quickly a car reached the next toll.
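The arithmetic behind the toll transponder approach is simply distance divided by elapsed time. A rough Python sketch follows, assuming you know the road distance between two toll points and have a timestamp for each pass. The function name, the 15-mile distance and the timestamps are invented for illustration and are not figures from the actual investigation.

```python
from datetime import datetime

def average_speed_mph(distance_miles, passed_first, passed_second):
    """Average speed between two toll points, from the two transponder timestamps."""
    elapsed_hours = (passed_second - passed_first).total_seconds() / 3600
    if elapsed_hours <= 0:
        raise ValueError("second timestamp must be later than the first")
    return distance_miles / elapsed_hours

# Hypothetical example: two toll points 15 miles apart, passed 10 minutes apart.
first_toll = datetime(2014, 1, 1, 8, 0, 0)
second_toll = datetime(2014, 1, 1, 8, 10, 0)
speed = average_speed_mph(15.0, first_toll, second_toll)  # roughly 90 mph
```

This gives an average over the whole stretch, so it is a conservative measure: a driver who averaged 90 mph was going at least that fast at some point between the two tolls.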
Pursuit of the news is not a license for arrogance or undue intrusiveness.
LoHud gun map – Journal News, Westchester, NY
Realize that private people have a greater right to control information about themselves than public figures and others who seek power, influence or attention. Weigh the consequences of publishing or broadcasting personal information.
Make an error – correct it.
Journalism can overstate data conclusions to the public. Most journalists aren’t statisticians, so we need Even worse, other sites and analysts intentionally misrepresent data in order to confuse the public.
It is possible to find data saying almost anything. Data analysis must be performed to determine what the “truth” is or to make predictions, but this analysis has assumptions built into the model.
Organizations should make their assumptions publicly available and easily understandable.
Transparency and privacy are constantly in tension. “We’re very close to being able to gather data on what people do most of the time,” so choice becomes paramount. “When you can survey everything, what do you report?”
Journalists must ask questions about why we collect data, who controls this data and what it means. Doing so “will effectively empower individuals without subverting newsgathering.”
Who collects the data affects data collection methods, analysis of the data, and accessibility of the data.