A session that challenges professional and student journalists to dig deeper, deliver more accountability and bring an enterprising/investigative mindset to their work. Training will include examples of using records, documents, data and experiments to bring more impactful reporting. No matter what the size of your team, your journalism can go deeper. Bring your laptop for the exercises. No previous data experience is required. Trainer Aaron Mendelson is the data reporter at KPCC, the NPR affiliate in Los Angeles.
Data-driven enterprise off your beat
Aaron Mendelson | @a_mendelson | amendelson@scpr.org
Have other questions? Just ask me.
Handout adapted from 2017 version created by the Boston Globe’s Todd Wallack
Years of slides and handouts from NewsTrain sessions are available at the following
link: bit.ly/2oGAOF2
Why data journalism?
Discover new stories. Using data, you can sometimes uncover trends or
stories that might go unreported.
Find great examples. Using a database, you can find a human
source to give life to your story.
Get the perfect stat. With data, you can do more than present the
anecdote. You can give a more complete picture.
Make better graphics. Use data to create charts, maps or other visuals.
Reference material. Once you obtain a database, you can keep it handy
for breaking news. Create your own data library.
Burnish your resume. Increasingly, managers want journalists who
are comfortable with data.
How to get started
Learn Excel or Google Sheets. You can do at least 90% of data stories
with spreadsheets.
Start small. Do something simple. Analyze a spreadsheet someone sent
you. Look at the payroll or budget for an agency you cover. Find a personal
project outside of work.
Hunt for data related to your beat. It is data you’ll use right away.
File records requests to get retention schedules for agencies you cover, and
use those to file more records requests.
Copy others. Look for simple stories others have done nationally or at
other local news organizations. Then try to localize them for your market. Try
emailing the reporter; oftentimes you’ll get a response.
Learn one skill at a time. It’s tempting to try to take on everything at
once: spreadsheets, databases, mapping, programming. But it’s more
manageable to master one thing at a time.
Practice, practice, practice. It’s easy to forget how to use Excel or
another tool if you don’t use it for months. So, use it for anything you can
think of, even keeping track of FOIA requests.
Consider taking a class (even a free or cheap one online). A class
is one way to force yourself to learn a little each week.
Find someone who can help. Find someone in the newsroom who
uses data. Join NICAR-L, an email list of journalists interested in data. If
you get stuck, someone on NICAR-L can probably help you within hours.
bit.ly/subscribeNICAR-L
Google it. The answer to 99.9% of technical questions is a quick search
away. Google it, click on the first couple links, take a deep breath, and dive
in.
Finding data
Ask sources. Ask watchdogs, government agencies, think tanks – anyone
– to point you to good data sources.
Use Google. You can easily find interesting data by searching websites
for agencies you cover or doing a broader search of the web.
Use a government data portal. Most states have gathered a portion of
their data sets in one place.
Check IRE tip sheets. IRE has a library of hundreds of tip sheets, many
of which include suggestions on data.
Examine the retention schedule for your city/state. It’s supposed
to be a guide to how long agencies must hold on to records. But it can also
be a tip sheet for records that agencies have.
Work backwards from annual reports. See a stat? See a table? That means
the agency probably has a database that generated it. Ask for the data.
Look at forms. Most agencies enter every box of a form into a database.
That means you can probably obtain the database with a FOIA request.
FOIA. Sometimes data is on the web. Sometimes you can get it just by
asking. But don’t be afraid to file a public-records request for data. Even
open data portals may not have the entire database that you can get
through a records request.
Build your own. Sometimes, you just can’t find the data you need. Or it’s
only in paper form. In that case, it might be worth the effort to enter the
data into a spreadsheet so you can analyze it.
Open data portals
The federal government and many states operate websites where they feature
some of their databases.
U.S.: data.gov
New Mexico Sunshine Portal: https://ssp2.sunshineportalnm.com/#budget
Arizona Financial Transparency Portal: https://openbooks.az.gov/
Colorado Information Marketplace: https://data.colorado.gov/
Utah open data catalog: https://opendata.utah.gov/
Albuquerque Open Data: https://www.cabq.gov/abq-data
More Google tricks
Google’s new Dataset Search:
https://toolbox.google.com/datasetsearch
Use the advanced-search page: google.com/advanced_search.
Search by file type. Example: filetype:XLS. Some common file types are:
XLS (old Excel), XLSX (new Excel), CSV (comma-separated values – Excel
can open it), TSV (tab-separated values – you can import it into Excel).
Search a particular website or domain. Examples: site:gov or
site:boston.gov. Even if a website has a search box, sometimes Google
works better. Try it both ways.
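If you like to script your searches, the two operators above can be combined into a single query string. Here is a minimal Python sketch. The helper name build_query and the search term "payroll" are made up for illustration; only the site: and filetype: operators come from the tips above.

```python
def build_query(terms, site=None, filetype=None):
    """Combine search terms with the site: and filetype: operators."""
    parts = [terms]
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


# Hypothetical search: spreadsheets mentioning payroll on boston.gov.
query = build_query("payroll", site="boston.gov", filetype="xlsx")
print(query)
```

Paste the printed string into the Google search box, then try a different file type or domain to widen the hunt.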
Examples of databases to ask for
Payroll/salary data
Budgets
Parking tickets
Business/occupation licenses
Census
School test scores
Crime reports
Law enforcement officer employment history/training
Purchase data
Campaign finance
Tips on finding databases on non-governmental beats:
bit.ly/otherbeats
Public-records tips
Be positive. Assume it’s public. If you don’t ask, you won’t get it.
Ask for the documentation. The technical documentation for
databases can be called many things: a record layout, field list or data
dictionary. But it’s helpful to ask for it. That way, you’ll know what data
the agency keeps. And you’ll notice if something is missing.
Ask for the data in Excel, CSV or “machine-readable format”
(not PDFs). PDFs are designed to print out or look at – not analyze. You
want the data in a format a database can use.
Ask for more than one year of data. You want to see trends. I
typically ask for five years.
Talk to the data people. Sometimes, the PR people are friendly but
don’t know anything about the data and what is possible.
Appeal if rejected. Go up the chain of command. Or follow the
appeals process (if one exists in your state).
Be polite, but be a pest. Sometimes agencies will simply hope you
forget about the request. But if you are persistent – even going to the
offices in person – they are less likely to blow you off.
Learn public records law. Make sure to counter any denial you get; it’ll
carry more weight if you know the ins and outs of the law.
Caution!!!
Watch out for dirty data. Typos. Mistakes. Missing data. If something
in the data seems crazy, it just might be an error. So, verify it with the
original documents or with sources.
Save a copy of your spreadsheet. Set aside the original, and do your
work on the copy. That way you preserve a record of the data to go back
and check later.
Double-check your calculations. It’s usually a good idea to run them
by the agency or another trusted source before publication. Or ask a
colleague to check your math.
Still need to do reporting. Data provides great examples and powerful
numbers. But you still have to do reporting to make sure the data is
reliable and confirm what it means. Make that phone call.
Don’t overload stories with numbers. A temptation with data stories
is to jam in every cool stat you find. But your stories will be stronger if you
use only the numbers that matter most. Instead, tell the data stories
through people, anecdotes, quotes and traditional storytelling that
reinforces your findings.
Beware of working with new data on deadline. Every database has
quirks. Sometimes codes don’t mean what you think they mean.
Sometimes they’re incomplete. Try to avoid working with databases for the
first time on a tight deadline.
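Part of the dirty-data check described in this section can be automated. The sketch below is one possible approach, not a complete cleaning routine. The sample rows are made up, and the 500,000 cutoff is an assumed threshold you would adjust for your own data. It flags blank salaries and suspiciously large ones so you know which rows to verify against original documents or sources.

```python
import csv
import io

# Made-up sample standing in for a real CSV file from an agency.
sample = io.StringIO(
    "name,salary\n"
    "A. Smith,52000\n"
    "B. Jones,\n"
    "C. Lee,6100000\n"
)

suspect = []
# start=2 because line 1 of the file is the header row.
for line_number, row in enumerate(csv.DictReader(sample), start=2):
    value = row["salary"].strip()
    if not value:
        suspect.append((line_number, row["name"], "missing salary"))
    elif int(value) > 500000:  # assumed threshold; pick one that fits your data
        suspect.append((line_number, row["name"], "unusually high salary"))

for item in suspect:
    print(item)
```

A flagged row is not necessarily wrong. It is simply a row worth a phone call.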
Where can you learn more
IRE/NICAR conferences/workshops/tip sheets. IRE costs $70 ($25 for
students) a year: bit.ly/joinire. But membership gives you access to
thousands of tip sheets and stories. Plus, you can listen to recordings of
past conferences. IRE’s trainings and conferences are also a great
resource. ire.org/events-and-training/
This tutorial from Berkeley Advanced Media Institute is for those who’ve
never opened a spreadsheet before: bit.ly/sheetbasics
The Data Journalism Handbook from the European Journalism Centre
and Open Knowledge Foundation is free to read online: bit.ly/datajbook
ICIJ reporter Kate Willson has four short videos on how to use Excel to
sort and filter, concatenate (link together), auto fill and make pivot tables:
bit.ly/ICIJexcel
Whether you cover education or anything else, this online guide to Excel
from the Education Writers Association will teach you everything you need
to know: ewa.org/reporter-guide/reporters-guide-excel
OpenNews: The website of the OpenNews project features helpful tutorials
and is a great way to follow what’s going on in the “news nerd”
community: https://source.opennews.org/
Knight Science Journalism at MIT has a fantastic resource for data work
that covers basics to programming: ksj.mit.edu/data-journalism-tools/
Spreadsheet basics
SAVE INITIAL FILE
Save the initial file somewhere safe, and make a new copy to work with. That way,
no matter what you do, you can go back to the original source of the data.
Google Sheets: Click on File in upper left corner, choose “Make a Copy” option.
(Google also saves a “Version History” that you can refer back to, but this is best
done only in emergencies. Find it in the File menu.)
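If you work with CSV files outside a spreadsheet program, the same habit can be scripted. This is a minimal Python sketch. The file names are hypothetical, and the files are created in a temporary folder only so the example runs anywhere.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical file names in a temporary folder, so the sketch runs anywhere.
folder = Path(tempfile.mkdtemp())
original = folder / "payroll_original.csv"
original.write_text("name,salary\nA. Smith,52000\n")

# Copy the original, then do all further work on the copy.
working = folder / "payroll_working.csv"
shutil.copy(original, working)

print("Working copy matches original:", working.read_text() == original.read_text())
```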
SAVE AS YOU GO
Always be saving.
Excel Windows 2007/2016: Ctrl-S or hit the disk icon in upper left-hand
corner
Google Sheets: Saves automatically.
CHECK OUT DATA FIRST
Try to find the “four corners” of the spreadsheet.
Excel Windows 2007/2016: Use the CTRL + Arrow keys to go up, down, left
and right.
Google Sheets: Use the CTRL + Arrow keys to go up, down, left and right.
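For a CSV file too large to scroll through comfortably, a short script can report the same “four corners” information. This is a rough sketch with made-up sample data; with a real file you would open it with open() instead of using io.StringIO.

```python
import csv
import io

# Made-up sample data standing in for a real CSV export.
sample = io.StringIO(
    "name,department,salary\n"
    "A. Smith,Parks,52000\n"
    "B. Jones,Police,88000\n"
    "C. Lee,Parks,61000\n"
)

rows = list(csv.reader(sample))
header, data = rows[0], rows[1:]

print("columns:", len(header))
print("rows:", len(data))
print("first row:", data[0])
print("last row:", data[-1])
```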
UNDO
Sometimes, we all hit the wrong button. Here’s how to fix it.
Excel Windows 2007/2016: CTRL-Z (can hit more than once)
Google Sheets: CTRL-Z (can hit more than once). OR go back to an earlier
version by using the “version history” (under the File menu, or hit
CTRL-ALT-SHIFT-H). Then click on the version you want on the right, then click
“restore this version.” To cancel, click on the left arrow in the upper
left-hand corner.
MULTIPLE SHEETS?
Check to see whether the worksheet contains multiple “sheets” or “tabs.” Look at
the bottom left-hand corner. To create a new one:
Excel Windows 2007: Click on the curled piece of paper on the lower left-hand
corner, next to the existing tabs. Excel 2016: Click the plus-sign-in-a-circle icon
on the lower left-hand corner, right of the existing tabs.
Google Sheets: Click on the plus sign on the lower left-hand corner, next to the
existing tabs.
FREEZE HEADERS
This is a handy command that lets you scroll through the data while still seeing
the headers/labels at the top.
Excel Windows 2007/2016: Hit the View tab at the top, select freeze panes
(middle right of the tool bar).
Google Sheets: Go to the View menu, select freeze.
WIDEN COLUMNS
Sometimes, columns are too narrow to read. (You will sometimes see ####s
when columns are too narrow to show a string of numbers.)
Excel Windows 2007/2016: Hover cursor between the two letters marking
the columns until the cursor changes to a cross. Press and hold down the left
mouse key and drag the mouse left and right until it is the right width. Release.
Google Sheets: Hover cursor between the two letters marking the columns
until the cursor changes to a cross. Press and hold down the left mouse key and
drag the mouse left and right until it is the right width. Release.
SORT COLUMNS
Use this command when you want to sort from high to low (or in alphabetical
order).
Excel Windows 2007/2016: Click on any cell within the column you want to
sort. (Note: Do NOT highlight the entire column.) Click the Data tab at the top,
then click on the Sort tool icon in the middle of the toolbar. Make sure the right
column is selected, Sort by Values, and then pick either A to Z (low to high) or Z
to A (high to low). Note: Be sure the headers box is checked correctly.
Short cut: Instead of using the Sort tool, you can also just click on the
A-Z or Z-A buttons in the toolbar after hitting the Data tab. This will usually
work, but sometimes Excel gets confused and sorts the headers along with
the rest of the data. To fix this, click on the Sort tool under the Data menu,
then make sure the headers box is checked. (Yet another option: Highlight
the area you want to sort first.)
Google Sheets: Click on the Data menu, select Sort Sheet by Column _, A→Z or
Z→A. (The blank is for the letter of the Column.)
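The same sort can be sketched in Python, which may help if you later move beyond spreadsheets. The payroll rows are made up for illustration. Because each row is kept together as one record, the names always travel with their salaries, which is the same reason you should not highlight a single column before sorting in Excel.

```python
# Made-up payroll rows for illustration.
payroll = [
    {"name": "A. Smith", "department": "Parks", "salary": 52000},
    {"name": "B. Jones", "department": "Police", "salary": 88000},
    {"name": "C. Lee", "department": "Parks", "salary": 61000},
]

# High to low by salary (the Z to A option); whole rows move together.
by_salary = sorted(payroll, key=lambda row: row["salary"], reverse=True)

for row in by_salary:
    print(row["name"], row["salary"])
```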
FILTER COLUMNS
Use this command when you want to select rows that meet certain criteria, such
as all salaries from a certain department or all voters in a certain ZIP code.
Excel Windows 2007/2016: Make sure you click on a cell somewhere in the
data you are using. Click the Data menu button, hit the funnel button on the
Tools ribbon.
Little arrows should appear next to the columns. Click the arrow next to the
column you want to filter. Then select the criteria you want to use.
Google Sheets: Click the funnel on the upper right of the tool bar.
Mini funnels should appear next to the columns. Click the funnel next to the
column you want to filter. Then select the criteria you want to use.
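For comparison, here is a minimal Python sketch of the same idea: keep only the rows that meet a condition. The rows and the "Parks" department are made up for illustration.

```python
# Made-up payroll rows for illustration.
payroll = [
    {"name": "A. Smith", "department": "Parks", "salary": 52000},
    {"name": "B. Jones", "department": "Police", "salary": 88000},
    {"name": "C. Lee", "department": "Parks", "salary": 61000},
]

# Keep only rows from one department, like picking a value in the filter menu.
parks = [row for row in payroll if row["department"] == "Parks"]

for row in parks:
    print(row["name"], row["salary"])
```

As in the spreadsheet, filtering does not change the underlying data. The full payroll list is still there.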
INSERT COLUMNS, ROWS
It’s easy to add another column or row.
Excel Windows 2007/2016: Highlight row/column you want by clicking on
the letter or number that marks each row/column. Then right click, and then
click on insert.
Google Sheets: Highlight row/column you want by clicking on the letter or
number that marks each row/column. Then right click, and then click on insert.
OR
Hit the Insert menu option at the top, then choose either the column/row above
or below.
BASIC MATH
Formulas generally start with an = sign.
Addition: =SUM(cell range)
Example: =SUM(B2:B9)
Subtraction (change/difference): = New - Old
Example: =B2-C2
Percentage change: =(New - Old)/Old Way to remember: NOO!
Example: =(C2-B2)/B2
Then highlight the cell or column and hit the % button on the left-hand side of
the tool bar to convert to percent.
Percent of a total: = Part/Total
Example: =B2/$B$11
Note: Use the dollar signs to keep the second part of the formula from changing
when you copy the formula.
Average: =AVERAGE(cell range)
Example: =AVERAGE(B2:B10)
Median: =MEDIAN(cell range)
Example: =MEDIAN(B2:B10)
Maximum: =MAX(cell range)
Example: =MAX(B2:B10)
Minimum: =MIN(cell range)
Example: =MIN(B2:B10)
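To check a spreadsheet result by hand, or to see what the formulas above are doing, the same arithmetic can be written out in Python. All the numbers here are made up for illustration.

```python
from statistics import mean, median

# Made-up budget figures: last year (old) and this year (new).
old, new = 200, 250

change = new - old              # like =New - Old
pct_change = (new - old) / old  # like =(New - Old)/Old; multiply by 100 for percent

# Made-up salaries, including one very high value.
salaries = [40000, 50000, 60000, 250000]
total = sum(salaries)           # like =SUM(cell range)
share = salaries[0] / total     # like =Part/Total

print("change:", change)
print("percent change:", pct_change * 100)
print("share of total:", share * 100)
print("average:", mean(salaries))
print("median:", median(salaries))
print("max:", max(salaries), "min:", min(salaries))
```

In this sample, the single large salary pulls the average well above the median, which is one reason to calculate both.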
MORE ADVANCED FORMULAS
If/Then
=IF(comparison,"print this if true","print this if false")
Example: =IF(B2>100000,"High Earner","Low/Medium earner")
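The logic of the If/Then example can be sketched in Python as follows. The function name label is hypothetical. Note that a salary of exactly 100000 is not greater than 100000, so it lands in the second group, just as it would in the spreadsheet.

```python
def label(salary):
    """Mirror the spreadsheet test: is the salary greater than 100000?"""
    if salary > 100000:
        return "High Earner"
    return "Low/Medium earner"


for amount in (150000, 100000, 45000):
    print(amount, label(amount))
```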
Dates
=YEAR(CELL)
=WEEKDAY(CELL)
=CHOOSE(WEEKDAY(CELL), "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")
=MONTH(CELL)
Some formulas are slightly different in Excel and Google sheets.
For instance, to find the difference in dates:
Google: =CELL - CELL
Excel: =DATEDIF(A1,A2,"d") (for days). Use "m" for months or "y" for
years.
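The date formulas above have close equivalents in Python's standard datetime module. This is a small sketch with made-up dates. The day-name list follows the same Sunday-first order as the CHOOSE example.

```python
from datetime import date

# Made-up dates for illustration.
start = date(2019, 1, 15)
end = date(2019, 3, 1)

day_names = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]

year = end.year                              # like =YEAR(CELL)
month = end.month                            # like =MONTH(CELL)
day_name = day_names[end.isoweekday() % 7]   # isoweekday(): Monday=1 ... Sunday=7
days_between = (end - start).days            # difference between two dates, in days

print(year, month, day_name, days_between)
```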
COPY A FORMULA DOWN AN ENTIRE COLUMN
Move the mouse to the formula, position the mouse in the lower right-hand
corner of the cell until you see the cursor change to a plus sign and double-click.
This will copy the formula down until it hits a blank row.
ANOTHER WAY TO COPY A FORMULA TO AN ENTIRE COLUMN
The above method will only work if the formula is next to a column with all the
rows filled out. Otherwise, it will only copy formulas down until the data next to
the column stops. If that is a problem, you can scroll to the bottom of the column
where you want to stick the formulas, enter some text - anything will do. Then go
back to the formula you want to copy, click on that cell, hit ctrl-c to copy, then
hold down the shift key, then hit ctrl-down-arrow to highlight the column (up
until the point where you typed in your random text), then hit ctrl-V to paste.