Page 1 of 6
Data-driven enterprise off your beat
Matt Wynn | @mattwynn | matt.wynn@owh.com
Why learn this stuff?
- It’s the native language. Processes that used to happen in person or on
paper are more and more likely to take place digitally. If you can’t use
data, you’re missing things.
- Credibility. We all know the one-two-trend joke. Data take you away
from he-said/she-said and allow you to report the big picture.
- Golden ticket to enterprise. Data let us ask and answer our own
questions in a way that is beyond reproach. We can learn things even our
sources don’t know. Instead of reacting to news or newsmakers, data can
put us in the driver’s seat.
- Competitive edge. Knowing even a few simple tricks will lead to fewer
phone calls and follow-up emails. You can publish faster and better than
the competition.
- Gateway drug to a different kind of journalism. It’s not quite in our
wheelhouse today, but understanding data is a stepping stone toward Web
and mobile applications that allow us to tell stories in brand-new ways.
How do you start?
- Just do it! Pick a topic -- maybe one of those highlighted below -- and get
to it. Data journalism is like painting: you can study all you want, but you
won’t get any better until you pick up a brush. Request a data set and start
digging for a story.
- (Re)learn how to learn. There are a daunting number of tools and
methods to tackle any story. Data journalism is an awful lot of muddling
through new tools, failing and trying again. How-tos and tutorials are free
and plentiful. Googling is a skill, not an embarrassment.
- Pick something easy. There are fascinating stories that will let you
stretch your CAR (computer-assisted reporting) muscles without taking
too much time or causing too many headaches. Start with those. Your
audience and editors will be impressed, and you’ll earn more freedom to
pursue the big stories you want to chase.
- Ask questions. The data journalism community is notorious for helping
its members out. Join the NICAR-L mailing list (http://bit.ly/nicarsubscribe);
email reporters whose work you want to emulate; ask questions online.
People have been in your shoes, and they are eager to offer a hand.
- Work it into your process. And do it now. Take advantage of what you
learn, and use it to do a story as soon as possible. CAR muscles atrophy
quickly.
- Keep pushing yourself. Once you’ve got some victories under your belt,
keep pushing yourself to learn new techniques. Spreadsheets, databases,
mapping, statistics and programming all let us tell different stories, and all
of them have their place.
Daddy, where do data come from?
There are all kinds of ways to track down data. Each one makes sense for a
certain kind of story, a certain kind of project.
- Google. Use Boolean logic and some search tricks to find data sets on an
agency’s website. Search for “site:nebraska.gov filetype:xlsx” and you’ll
find evidence of contract records, tax-hearing schedules and all sorts of
stuff you otherwise might not know about. (Note: Still vet any records with
the agency.)
- Check records-retention schedules. Most government entities have a
records-retention schedule that outlines what records they have, and how
long they have to be kept. It can be a nice index to what’s available.
- See what others are up to. www.ire.org/extraextra is a great resource
for data stories. Many will tip you off to interesting data sets that you can
obtain on your own.
- Fear not the FOIA (Freedom of Information Act). There is nothing
wrong with a formal records request. Years ago, reporter Mike McGraw
talked about filing at least one request every Friday, whether or not he
actually had an interest in the records. You never know what you’ll get,
and it trains agencies to accept that this is a tool in your toolbox. FOIAs
are especially helpful in CAR because they can ensure you get the exact
data you want, saving time and money. (If you’re in Nebraska, use our
generator at http://dataomaha.com/media/news/records/)
- Ask your sources. Often, the only way to find what you’re after is to ask.
Especially for one-off stories, especially for obscure agencies or processes,
there are more data than any reasonable person would expect. (Pro tip:
Don’t bother going through flaks for this kind of question. Find a person in
IT or with “data” in the title, and call them up.)
- Walk through the chain of command. Often, data collection is
required by agencies responsible for a number of smaller agencies. A
school reports to a school district, which reports to the state, which reports
to the feds. If the federal government can report a fact about a school, then
it must be reported by the school. The same method can hit pay dirt on any
beat.
- Read annual reports, board reports, and the like. The totals and
averages that are contained in regular reports don’t come from thin air.
Asking where the numbers came from can tip you off to the existence of
important, unadvertised data sets.
- Make your own. Sometimes you have to make a data set yourself.
Consult with experts to make sure you’re not doing anything unwise.
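The Google tip above is easy to script if you want to run the same search against several agencies. What follows is a minimal sketch, not a definitive tool: the function name build_search_url and the optional keywords argument are made up for illustration, and it assumes the standard Google search URL with a "q" parameter. Only the site: and filetype: operators come from the tip itself.

```python
from urllib.parse import urlencode


def build_search_url(site: str, filetype: str, keywords: str = "") -> str:
    """Build a Google search URL limited to one website and one file type.

    build_search_url is a hypothetical helper name; the site: and
    filetype: operators are the ones described in the Google tip.
    """
    terms = [f"site:{site}", f"filetype:{filetype}"]
    if keywords:
        # Optional extra words, e.g. "contracts" (an assumption, not from the tip).
        terms.append(keywords)
    query = " ".join(terms)
    # urlencode percent-escapes the colons and turns spaces into plus signs.
    return "https://www.google.com/search?" + urlencode({"q": query})


url = build_search_url("nebraska.gov", "xlsx")
print(url)
```

Paste the printed URL into a browser, then swap in other domains or file types (csv, xls, pdf) to see what else an agency has posted. As the note says, still vet whatever you find with the agency.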
So what can you do on your beat?
Government
- Tax assessments. Gold mine of stories that people want to read.
Potential graft or quid pro quo.
- Salaries. Topline numbers can be predictable and boring. But
breakdowns -- overtime, special incentive pay, etc. -- can offer a new layer
worth exploring.
- Tax scofflaws. People and businesses that don’t pay up often make their
own special list. Worth getting.
- Budgets. The more detailed, the better. Changes over time can be
especially interesting.
- Licenses. Any licenses awarded by your agency can be telling.
- Inspections. Restaurants, weights and measures, cleanliness, safety, etc.
- Purchase records. What’s getting purchased, from whom and for how
much?
- P Cards. Similar to purchase records, but often with less oversight. Get
credit-card statements for cardholders within the agency.
- Campaign finance. Who’s giving? Who’s getting? When?
- Literally anything. Government is excellent because it has to answer to
the public. Nebraska’s laws are strong -- if an agency is spending money to do
a thing (which, I would argue, happens by virtue of a paid employee
tracking it), the records are available to us.
Education
- Test scores. Over time, by race and poverty or special education status.
- Campus crime. Compiled by IRE and available for cheap, breaks down
crimes on every college campus in the US. Cross-checked with police
records, can lead to some valuable results.
(http://ire.org/nicar/database-library/databases/doe-campus-crime/)
- Teacher rosters. Which schools get the greenest teachers? Which has
the most experienced, the most educated?
- Repairs, maintenance and repair requests. Maintaining so many
buildings and so much equipment costs money. How is your district
keeping up, and is every school treated fairly?
- Teacher discipline in Nebraska.
https://dc2.education.ne.gov/tc_lookup/
Cops and courts
- Sex-offender registration. Do they live near schools, day cares? Are
they all living in the same area? Are the registration records even
accurate?
- Crime logs. Where are burglaries most likely? Is there a time of year
that’s most dangerous? If your department has data with narratives, all the
better. Can be compared to FBI Uniform Crime Reporting (UCR) data to
see if reports are accurate.
- Jail/prison logs. Who’s been in longest? Who’s been in most often?
- Police discipline.
- Court records. Which judge is harshest? How does your county court
differ from those around the state in terms of sentencing? Which attorney
pleads down the most?
Health
- Vaccination rates
- Various inspections and complaints. Nursing homes, hospitals,
home health-care agencies.
- CMS (Centers for Medicare & Medicaid Services) data sets. The
feds have interesting data grading and comparing hospitals, nursing
homes and the like along a variety of measures. For example:
https://www.medicare.gov/hospitalcompare/search.html.
- Prescription-drug data. As of Jan. 1, 2017, Nebraska will have a
comprehensive data set of prescriptions. Names will be omitted, but the
topline figures should still prove interesting.
- Mortality. The Centers for Disease Control and Prevention (CDC) has
data reporting how every single American dies. What are the highest rates
in your area? Why? http://www.cdc.gov/nchs/deaths.htm
Sports
- Minor league baseball: http://www.milb.com/milb/stats/
- College athletic department salaries. See USA Today for top coaches’
salaries in football: http://www.usatoday.com/sports/college/salaries/
- and basketball:
http://www.usatoday.com/sports/college/salaries/ncaab/coach/
- NCAA research data: http://www.ncaa.org/about/resources/research
- Major NCAA infractions:
https://web1.ncaa.org/LSDBi/exec/miSearch
- Academic-progress rates for college athletes:
http://www.icpsr.umich.edu/icpsrweb/content/NCAA/data.html
- High school sports-participation rates:
http://www.nfhs.org/ParticipationStatics/ParticipationStatics.aspx/
Grab bag
- Potholes. Cities should have databases reflecting pothole locations, the
date they were notified of the issue and the day it was fixed. Comparing all
those issues can show you hard-hit areas of town, how well your city
responds in general, or if services differ depending on where in the city a
pothole is located.
- Lawsuits and claims against the city. Pretty much just potholes, part
two. Can let you see if potholes are getting more severe, based on the
number of claims or actual value paid out. Might also indicate other
persistent issues in your city or its infrastructure.
- Pet names: Many local governments track this when issuing dog tags; if
not, the humane society or others may have the info. You can look at the
most popular name, the most popular name by breed or by type of pet.
You can break down into ZIP codes or cities, if you have a large enough
coverage area. You can also look at the most unusual names.
- Gas-pump inspections: An excellent first CAR story. The state
department of weights and measures tests every gas pump to see whether,
when it reports pumping a gallon, it indeed does just that. Those data are
recorded and available for a nominal fee. You can see the worst pump in
your town, how your town compares to others and so on.
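The pothole idea above boils down to simple date arithmetic: subtract the day the city was notified from the day the hole was fixed, then compare areas. Here is a rough sketch of that calculation. The sample rows, the area labels and the function name median_days_to_fix are all invented for illustration; a real city export will have its own columns and date formats.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical sample rows: (area of town, date reported, date fixed).
POTHOLES = [
    ("North", date(2016, 1, 4), date(2016, 1, 6)),
    ("North", date(2016, 1, 10), date(2016, 1, 14)),
    ("South", date(2016, 1, 5), date(2016, 1, 20)),
    ("South", date(2016, 2, 1), date(2016, 2, 10)),
    ("South", date(2016, 2, 3), date(2016, 2, 24)),
]


def median_days_to_fix(rows):
    """Return {area: median number of days between report and repair}."""
    days_by_area = defaultdict(list)
    for area, reported, fixed in rows:
        # Subtracting two dates gives a timedelta; .days is the whole-day gap.
        days_by_area[area].append((fixed - reported).days)
    return {area: median(days) for area, days in days_by_area.items()}


result = median_days_to_fix(POTHOLES)
for area, days in sorted(result.items()):
    print(f"{area}: median {days} days to fix")
```

The median is used here instead of the average so that one pothole that sat for months doesn't swamp the comparison. That is a judgment call; check both, and look at the individual outliers too.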
Where can you learn more?
- Kate Willson set up an awesome tutorial for the building blocks of
data journalism in Excel. Includes filtering, sorting and pivot tables:
http://www.icij.org/resources/simple-excel-functions-data-analysis.
- Computer-assisted reporting boot camps through Investigative
Reporters & Editors: http://ire.org/events-and-training/boot-camps/.
Fellowships available:
http://ire.org/events-and-training/fellowships-and-scholarships
- While we’re at it, join Investigative Reporters & Editors (IRE):
$70 a year/$25 for students. 3,500 tip sheets; 30 low-cost, cleaned-up
government databases; 25,000 stories; NICAR-Learn short-video training
for $25 for members/$40 for non-members.
https://www.ire.org/membership/
- Webinar replay: The Basics of Data Journalism -
http://stateimpact.npr.org/toolbox/2013/04/15/webinar-the-basics-of-data-journalism/
Created by the NPR StateImpact pilot project.
- Interactive tutorial: Spreadsheet Basics – Learn the main parts of
the spreadsheet, how to enter data and use formulas and functions:
http://bit.ly/sheetbasics Created by the Berkeley Advanced Media Institute.
Things I wish someone had told me
1) One year’s worth of data rarely gets you where you’re going. For
example, consider a salary database. With one year of data, you can tell
which department gets the most, who is the highest paid and ... well, that’s
about it. With two years, you can dig into who has increased the most,
which department saw its budget cut or increased and the like. Those
findings are more likely to lead somewhere intriguing.
2) You always want detailed data. Often, agencies think you want
summary statistics -- the number of enforcement actions or the average
salary in a department, for example. We almost always want the detailed
data -- every enforcement action, every salary. We can create our own
summary stats, and more importantly, can see the outliers that might be
masked otherwise.
3) Journalism is way behind the curve. Chances are you’ll be one of a
handful of data-savvy people in your newsroom. It can seem lonely. But in
other industries, data tools are old hat. Lots of local experts can help you
learn whatever you want to know. Universities are chock-full of people
happy to double-check your work or teach you a new skill.
4) Verify everything. Keep a log of whatever you do to a database. Don’t
delete data. Be super open about how you achieved the results you did. If
you’ve made a logical leap you shouldn’t have, calling around to experts in
advance will keep you from an embarrassing retraction.
5) This is not rocket science! At its core, journalism is about learning. We
learn new stuff every day and distill it down for our readers. Treat the tools
of data journalism like a new part of your beat and see how far they take you.
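To make the first two points concrete, here is a small sketch of the two-year salary comparison, assuming you got detailed, row-level data for both years. Everything in it is hypothetical: the names, the pay figures, the employee IDs and the function name pay_changes are made up for illustration, and a real salary database will need cleaning before the years line up.

```python
# Hypothetical salary records keyed by employee ID: (name, department, pay).
SALARIES_2015 = {
    "E1": ("A. Smith", "Police", 60000),
    "E2": ("B. Jones", "Parks", 40000),
    "E3": ("C. Lee", "Police", 55000),  # gone in 2016
}
SALARIES_2016 = {
    "E1": ("A. Smith", "Police", 75000),
    "E2": ("B. Jones", "Parks", 41000),
    "E4": ("D. Cruz", "Parks", 38000),  # new hire, no prior year
}


def pay_changes(before, after):
    """List (name, dollar change, percent change) for people in both years.

    Rows are sorted so the largest dollar increase comes first.
    """
    changes = []
    for emp_id, (name, _dept, new_pay) in after.items():
        if emp_id not in before:
            continue  # no change to compute without both years
        old_pay = before[emp_id][2]
        diff = new_pay - old_pay
        pct = round(100 * diff / old_pay, 1)
        changes.append((name, diff, pct))
    return sorted(changes, key=lambda row: row[1], reverse=True)


changes = pay_changes(SALARIES_2015, SALARIES_2016)
for name, diff, pct in changes:
    print(f"{name}: {diff:+} dollars ({pct:+}%)")
```

Matching on an ID instead of a name is deliberate, since names get misspelled and people share them. If the agency won't release IDs, expect to spend time reconciling names by hand, and keep a log of every match you make.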
Want to know more, or need help?
Get at me.
Matt Wynn
Omaha World-Herald
(402) 444-3144 | matt.wynn@owh.com | @mattwynn

Data-Driven Enterprise off Your Beat - Matt Wynn - Lincoln, Nebraska, NewsTrain - April 9, 2016
