Becca Aaronson: "Visualizing Health Data," 7.23.15

Visualizing
Health Data
Becca Aaronson, The Texas Tribune
baaronson@texastribune.org
@becca_aa

Accurate representation
As journalists, we have an obligation to ensure
that our stories — and visualizations — don’t
mislead or misinform our readers.

A good source
The source of your data, meaning the entity and
individuals who collected, analyzed and
published the data, must be reliable. And as
a reporter, you must consider any bias that
may have occurred during the collection, analysis or
interpretation of the data, just as you would
when considering a human source.

Census Data
Here’s how the U.S. Census categorizes “White” people:
“White. A person having origins in any of the original
peoples of Europe, the Middle East, or North Africa. It
includes people who indicate their race as ‘White’ or report
entries such as Irish, German, Italian, Lebanese, Arab,
Moroccan, or Caucasian.”

The “Lie Factor”
“Lie factor” = Size of the effect shown on the graphic / Size
of the effect in the data
Every graphic’s “lie factor” should have a value between
0.95 and 1.05.
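As a rough sketch of the arithmetic, the ratio can be computed in a few lines of Python. The function names and sample numbers are illustrative assumptions, not taken from any real graphic. "Size of the effect" is treated here as the relative change between two values, which is how Tufte usually defines it.

```python
def effect_size(first, second):
    """Relative change from the first value to the second."""
    return (second - first) / first


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Effect shown in the graphic divided by the effect in the data."""
    shown = effect_size(graphic_first, graphic_second)
    actual = effect_size(data_first, data_second)
    return shown / actual


def is_acceptable(factor, low=0.95, high=1.05):
    """True when the lie factor falls inside the 0.95 to 1.05 range."""
    return low <= factor <= high


# Hypothetical example: the data doubles (10 to 20), but the bars
# are drawn 1.0 inch and 4.0 inches tall.
factor = lie_factor(1.0, 4.0, 10, 20)
print(factor, is_acceptable(factor))
```

In this made-up case the graphic shows a 300 percent increase for a 100 percent increase in the data, so the lie factor is well outside the acceptable range.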


This infographic has a “lie factor” of 2.8

Design variation
Changing the design of a graphic can confuse or mislead
readers, especially if graphics with varying designs are
side-by-side or not clearly labelled.

From Visualizing Health gallery

Tufte’s 6 principles:
1. The representation of numbers, as physically measured on the surface of the
graphic itself, should be directly proportional to the numerical quantities
represented.
2. Clear, detailed, and thorough labeling should be used to defeat graphical distortion
and ambiguity.
3. Show data variation, not design variation.
4. In time-series displays of money, deflated and standardized units of monetary
measurement are nearly always better than nominal units.
5. The number of information-carrying dimensions depicted should not exceed the
number of dimensions in the data.
6. Graphics must not quote data out of context.
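Principle 4 is easier to see with numbers. The sketch below converts nominal dollars to constant dollars using a price index. The index values and spending figures are hypothetical placeholders, and `to_real_dollars` is an illustrative name, not part of any library.

```python
def to_real_dollars(nominal, index_for_year, index_for_base_year):
    """Express a nominal amount in the dollars of the base year."""
    return nominal * index_for_base_year / index_for_year


# Hypothetical price index and nominal spending by year.
price_index = {2005: 100.0, 2010: 110.0, 2015: 125.0}
nominal_spending = {2005: 200.0, 2010: 220.0, 2015: 250.0}

real_spending = {
    year: to_real_dollars(amount, price_index[year], price_index[2015])
    for year, amount in nominal_spending.items()
}
print(real_spending)
```

With these made-up figures, nominal spending rises 25 percent while spending in constant 2015 dollars is flat, which is the kind of difference a chart in nominal units would hide.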

The population map
WARNING: Don’t accidentally make a
population map.
Here’s a good article on when maps shouldn’t
be maps. And check out Darla Cameron’s 2015
NICAR lightning talk on alternative solutions.

Map: 2011 American Community Survey, poverty levels
By raw number...
By percent...
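The difference between the two maps comes down to dividing by population. Here is a minimal sketch with two hypothetical counties (the names and counts are invented for illustration):

```python
# Hypothetical counties: one large, one small.
counties = [
    {"name": "County A", "population": 1_000_000, "below_poverty": 150_000},
    {"name": "County B", "population": 20_000, "below_poverty": 6_000},
]


def poverty_rate(county):
    """Percent of the county's residents below the poverty level."""
    return 100 * county["below_poverty"] / county["population"]


# Ranking by raw number mostly rewards having a large population.
highest_by_count = max(counties, key=lambda c: c["below_poverty"])
# Ranking by percent shows where poverty is most concentrated.
highest_by_rate = max(counties, key=poverty_rate)

print(highest_by_count["name"], highest_by_rate["name"])
```

A map shaded by the raw number would highlight County A simply because more people live there; a map shaded by percent would highlight County B.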

Politics of Prevention
Let’s take a look at my fellowship project
together, and the data visualizations created to
support the series.
Map: Find Texas Remaining Abortion Clinics

How much is a limb worth?
ProPublica’s “How Much Is a Limb Worth?” is
amazing for many reasons, particularly how
they visualized the data. Let’s watch this short
clip of Scott Klein and Lena Groeger explaining
how they built it.

Stories + Graphics
Example 1
Example 2

Build your own
Datawrapper.de: Built by journalists in Germany
Chartbuilder: Built by Quartz

Los Angeles children
Let’s say you’re working on a story about how poverty affects
children with disabilities living in the Los Angeles area.
According to 2013 American Community Survey data, a
greater proportion of Los Angeles children with disabilities
than children without disabilities had incomes below the
federal poverty level in the past 12 months.
Let’s visualize it!

1. Go to Chartbuilder. Click “Chart grid” in Step
1, then delete the default data in Step 2.
2. Open this spreadsheet.
3. Copy the data on the worksheet “Under 18 -
Poverty Level.”
4. Paste your data in Step 2.

So far, it should
look like this.

5. Under Step 3, select 2 rows and 1 column.
6. You still need to label the data. On Step 4, add
“%” or “ percent” as a suffix.
7. Add a title and source information on Step 5.
8. Download your image, and you’re ready to go.

Let’s take a step back...
The data you just copied and pasted was cleaned up from
American Community Survey data. Let’s go through how
we got the data ready for presentation.

Get the data
● Download the raw data or view the original data in our spreadsheet on
the worksheet labelled “ACS_13_1YR_B18130_with_ann.” The
fields we want to use are highlighted in blue.
● Copy all of the columns from the first blue column to the last blue
column — HD01_VD02 to HD01_VD15. In your own
spreadsheet, create a new worksheet titled “Data fields.” Right
click on cell A1 and select “Paste special > Paste transpose.”
● Put your cursor on B1, click the arrow on the top right corner,
and select “Sort Sheet A-Z”
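If you would rather script the transpose-and-sort step than do it by hand, a minimal Python sketch looks like this. Only the field codes come from the worksheet; the labels in the second row are placeholders, not the real ACS descriptions.

```python
# Two rows as they might sit in the raw worksheet: field codes on top,
# descriptions underneath. The labels are placeholders for illustration.
wide = [
    ["HD01_VD02", "HD01_VD03", "HD01_VD04"],
    ["label C", "label A", "label B"],
]

# "Paste transpose": each column becomes a row of [code, description].
tall = [list(pair) for pair in zip(*wide)]

# "Sort Sheet A-Z" on column B: order the rows by description.
tall.sort(key=lambda row: row[1])

for code, description in tall:
    print(code, description)
```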

Here’s a screenshot of the worksheet “Data Fields,” which
includes the information we want to analyze to see how many Los
Angeles children with disabilities are also below poverty level,
compared to children without disabilities.

Next, we want to add the children “Under 5 Years” and “5-17 Years”
by disability and income status to get an estimate for all children in
each of the sub-groups. Create a new worksheet titled
“Calculations” and set up the following structure:

Now, copy the data you’ll be working with and paste it underneath. For ease of
reference, I’ve just brought over the 4 data fields we need to add together, and
organized them by age group. You can bring over all of your data, but make sure to
widen the header on the data description column, so that you can double-check you’re
referencing the correct fields.

Sum the two age groups with corresponding disability and income status
to fill in your spreadsheet:

Next, calculate what percent of children with disabilities fall into each
income range. Then do the same for children with no disabilities.
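The same two calculations can be sketched in Python. The counts below are hypothetical, not the actual ACS estimates, and the sketch assumes just two income categories to keep it short.

```python
# Hypothetical counts by disability status, age group and income status.
counts = {
    "With a disability": {
        "Under 5 years": {"Below poverty level": 100, "At or above poverty level": 300},
        "5-17 years": {"Below poverty level": 500, "At or above poverty level": 1100},
    },
    "No disability": {
        "Under 5 years": {"Below poverty level": 2000, "At or above poverty level": 8000},
        "5-17 years": {"Below poverty level": 4000, "At or above poverty level": 26000},
    },
}


def combine_age_groups(by_age):
    """Add the age groups together for each income status."""
    totals = {}
    for income_counts in by_age.values():
        for status, number in income_counts.items():
            totals[status] = totals.get(status, 0) + number
    return totals


def to_percents(totals):
    """Convert each income status count to a percent of the group total."""
    group_total = sum(totals.values())
    return {status: round(100 * number / group_total, 1)
            for status, number in totals.items()}


percents = {group: to_percents(combine_age_groups(by_age))
            for group, by_age in counts.items()}
print(percents)
```

Each group's percentages are calculated against that group's own total, so they add up to 100 within a group and can be compared across groups.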

Now you’re ready to chart!
Let’s try another charting tool, Datawrapper.de.
Open the website and select “+New Chart” in the
upper right corner. You can copy/paste the
estimated totals that we just calculated. If you
have a large dataset, you can also save it as a
.csv and upload the file.
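For the .csv route, here is a minimal sketch using Python's standard csv module. The rows are hypothetical percentages, and `write_chart_csv` is an illustrative helper name.

```python
import csv
import io


def write_chart_csv(rows, file_obj):
    """Write a header row plus data rows as comma-separated text."""
    writer = csv.writer(file_obj)
    writer.writerows(rows)


# Hypothetical percentages, one row per group.
rows = [
    ["Group", "Below poverty level", "At or above poverty level"],
    ["With a disability", 30.0, 70.0],
    ["No disability", 15.0, 85.0],
]

# Written to an in-memory buffer here so the output is easy to inspect.
buffer = io.StringIO()
write_chart_csv(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To produce a file you can upload, pass a file opened with `open("chart_data.csv", "w", newline="")` instead of the buffer; the file name is only an example.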

Click on a column header to change the format, add prefixes or
suffixes (like ‘$’ or ‘%’) or hide a column of data from the
visualization.

Test different layouts
See what the data looks like as a “bar chart” or
“column chart” and other views.
Click on “2 Check & Refine” to go back a step.
On the top left corner of your data table, click
“Transpose,” then “Proceed.” Now look at the
various chart types. Notice a difference?

Prepare to publish
Click “Refine” to choose different colors for your
chart. Click “Annotate” to add a title and source
information about your data.
When you’re ready, hit “Publish.” You’ll need an
account to save your graphic.
