Presentation on data journalism given at the spring National Scholastic Press Association/Journalism Education Association convention in San Diego, April 2014.
1. Lies, damned lies ... and surveys
Beatrice Motamedi
The Urban School of San Francisco
Spring JEA/NSPA National High School Journalism Convention
April 2014
Wednesday, April 9, 14
2.
“Lies, damned lies, and statistics” is a phrase popularized by Mark Twain, the American humorist, who
wrote in “Chapters from My Autobiography” that "(f)igures often beguile me ... particularly when I have
the arranging of them myself; in which case the remark attributed to Disraeli would often apply with
justice and force: 'There are three kinds of lies: lies, damned lies, and statistics.'"
3. What’s the last survey you took for your newspaper or website?
How many people responded?
What did you learn?
What would you like to do better?
4. Let’s BYOS (build your own survey)
• Find a person whose eye color is the same as yours and who has the same kind of smartphone as you do (iPhone or Android)
• Brainstorm a question you’d like to ask right here in San Diego. How many students surf? Who believes in global warming? Should texting while driving be a crime? Choose a topic that actually interests YOU.
• Take out your smartphone and download an app: Instasurvey (Android) or SurveyMonkey (iPhone)
• Set up a free account and go out with your partner/ask your question. Try to capture at least one answer using your app. Be back in 5 minutes.
6. Glossary
• Poll: you ask one question, usually yes/no or multiple choice (what you just did)
• Survey: you ask several questions with various question types (multiple-choice, open/closed, short-answer essay, etc.)
• Random sample: a selection from a population, based on chance, in which every member has an equal probability of being selected
• Respondents: those who actually respond to the survey (not the same as those you sample)
• Time/manner: when and how a survey is conducted, e.g., a SurveyMonkey survey run April 7-10
• Response bias: a skew that results from problems in the survey process, e.g., using leading questions, or not allowing answers to sensitive q’s to be confidential
• Sampling error: the variation in data among samples
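The last two terms are easiest to see in action. A minimal simulation sketch (the 1,000-student population and its 40 percent “yes” share are invented for illustration):

```python
import random

random.seed(7)  # fixed seed so the result is reproducible

# Made-up population: 1,000 students, 400 of whom would answer "yes".
population = ["yes"] * 400 + ["no"] * 600

# Draw several random samples of the same size and measure the
# "yes" share in each; the draw-to-draw variation is sampling error.
shares = []
for _ in range(5):
    sample = random.sample(population, k=100)  # equal chance for everyone
    shares.append(sample.count("yes") / 100)

print(shares)  # each share hovers near the true 0.40, but the draws vary
```

Each draw is a legitimate random sample, yet no two give exactly the same number; that spread is what pollsters report as margin of error.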
7. How not to do a survey*
“RNC Young Voters Survey,” Stephen Colbert, 4/4/13
*with apologies to young Republicans in the audience (young Dems probably have a survey just as bad as this one)
This video captures many typical survey mistakes, including non-response bias (you can’t
find the survey), response bias (poorly written or leading questions) and no control over the
sample (Colbert is obviously over 30 years of age but takes the survey anyway).
8. What’s wrong here?
too many q’s
too many assumptions (bias)
closed q, not open
you woke this guy up ... for this?
Too many questions, too much bias, the wrong question types (leading) and poor methodology can all affect how a person answers a survey, generating results that are just plain ... wrong.
13. Our goal: A baby survey you can use right here
• Population sample — WHO to ask
• One closed question — WHAT to ask
• One open question — ditto
• A platform — HOW you will ask your questions
Our goal for this presentation: You will leave with the beginnings of a survey, including a
topic, two types of questions, a plan on how you will collect your sample, and a web-based
platform that makes it all easier.
14. A nation of question-askers
Examples of some of the long-running, established ways in which Americans ask each other questions. George Gallup began systematically surveying Americans almost a hundred years ago.
18. My school of question-askers
February 2013 Urban Legend
Examples from my school, including a student cartoon that expresses frustration and fatigue
over constant question-asking.
19. A new industry: Web-based surveys and polls
21. ... and live by polls
Poll data at nyt.com (left) and realclearpolitics.com (above) as accessed on 12/18/13.
22. Just one day ... at Real Clear Politics.com
23. Selections from the Pew “Fact Tank”
Pew regularly surveys teens on technology, religion, social trends, sexuality, and more.
24. Making surveys sexy
Nate Silver used his column to predict results in the 2008 and 2012 presidential elections
with astonishing accuracy, correctly predicting the winner in all 50 states in 2012. Analyzing
survey data is his forte. Now at ESPN, Silver says he practices “data journalism.”
25. Why surveys are powerful — and scary
“When they verify the popularity of an idea or
proposal, surveys and polls provide persuasive
appeals because, in a democracy, majority
opinion offers a compelling warrant:
A government should do what most people
want.”
Source: “Everything’s An Argument” (2007: St. Martin’s Press)
26. “It always makes sense, though, to push back ... especially (when a poll) supports your own point of view.”
27. Questions to ask
• Have you surveyed enough people to be accurate?
• Are those people representative of the selected population as a whole?
• Did you choose them randomly — not selecting those likely to say what you want to hear?
• Does the wording of your questions intentionally or unintentionally create bias or skew results?
• Have you described the results accurately and fairly? Does your story stick to what you asked, and only what you asked?
Here are questions you should ask yourself, whether you’re doing or reading a survey.
Questions adapted from “Arguments Based on Facts and Reason” in “Everything’s An
Argument” (2007: St. Martin’s Press).
30. Three tips (and three deadly sins)
• Get a random sample. It should also be representative — not just people who will answer the way you want them to. (The deadly sin here: undercoverage.)
• Avoid bias. Ask a variety of questions (open/closed, multiple choice, short answer). Question order matters; begin with basics. Avoid leading questions. (The deadly sin here: response bias.)
• Maximize response rate. Write good questions, not too long. Use a platform that makes it easy. Analytics help. (The deadly sin here: non-response bias.)
36. #1: Get a random sample
This analysis by Jeanne Curran and Susan R. Takata (February 2002) shows how Americans — and the
media — relied on telephone polls that failed to pick up Truman’s strength among voters who were
less likely to be reached that way (in 1948, a majority of Americans). The resulting Chicago Tribune
headline is a journalism classic, but for all the wrong reasons.
37. SurveyMonkey Blog tips
Source: SurveyMonkey Blog
Avoiding non-response bias means getting people to answer your survey. First you have to
figure out how many people that needs to be. The more people you survey, the lower your
margin of error.
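That last point can be made concrete with the standard margin of error formula for a simple random sample; a quick sketch (the sample sizes below are arbitrary):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a simple random sample.

    n: number of respondents
    p: expected proportion saying "yes" (0.5 is the most conservative choice)
    z: z-score for the confidence level (1.96 is roughly 95% confidence)
    """
    return z * math.sqrt(p * (1 - p) / n)

# More respondents -> smaller margin of error.
for n in (100, 400, 1000):
    print(f"n={n}: +/- {margin_of_error(n) * 100:.1f} points")
```

Note the diminishing returns: quadrupling the sample only halves the margin of error.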
38. How much error will you tolerate? How would you find that many people?
In order to get a solid result here at the convention, how many students would you have to survey? More than one in two. Can you do that? How? What strategies might you use?
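How many is “enough” depends on the size of the crowd. A sketch of the standard sample size calculation with the finite population correction (the crowd sizes below are hypothetical):

```python
import math

def sample_size(N, e=0.05, p=0.5, z=1.96):
    """Respondents needed for margin of error e in a population of N,
    applying the finite population correction.

    N: population size; e: desired margin of error;
    p: expected proportion (0.5 is conservative); z: 1.96 ~ 95% confidence.
    """
    n0 = (z ** 2) * p * (1 - p) / (e ** 2)   # infinite-population sample size
    return math.ceil(n0 / (1 + (n0 - 1) / N))

# Hypothetical crowd sizes: small populations demand a large *share* surveyed.
for N in (200, 500, 5000):
    n = sample_size(N)
    print(f"N={N}: survey {n} people ({n / N:.0%} of the population)")
```

The smaller the population, the larger the share of it you need to survey, which is why a convention-sized crowd can demand something close to one student in two.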
39. Strategies
• Get a “concrete sampling frame” — a list of all of the students here (or at your school). Use Excel to randomize/select your sample, e.g., every other student.
• Design a random reporting strategy: Get five reporters to stand in five places here (or at your school). Each reporter spends one hour surveying every other person who walks by.
• Use advisory: Ask teachers to pass out surveys so that all grade groups are covered (can’t use this one very often).
• Be social: Embed your survey on Facebook, or use your smartphone and go to the dance (and ask every other person who shows up to take your survey).
Potential problems are double-counting, failure to randomize for age/gender, non-response
bias (some people don’t answer email surveys; some won’t answer surveys unless they’re
emailed), etc.
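The “concrete sampling frame” strategy doesn’t require Excel — a few lines of code do the same job. A sketch, assuming you have the roster as a simple list (the names and counts here are made up):

```python
import random

# Hypothetical roster: in practice, load your sampling frame
# (e.g., the convention's student list) from a spreadsheet export.
roster = [f"Student {i}" for i in range(1, 501)]

random.seed(42)  # fixed seed so the draw can be reproduced and audited

# A simple random sample: every member of the frame has an equal
# chance of selection, and no one can be picked twice.
sample = random.sample(roster, k=100)

print(len(sample))        # 100 students to survey
print(len(set(sample)))   # 100 distinct names -> no double-counting
```

Fixing the seed means you can show exactly how the sample was drawn if anyone questions your methodology.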
40. #2: Avoid bias
Voluntary response bias happens when respondents self-select, e.g., American Idol viewers texting in their votes. In season 3, Fantasia Barrino beat Diana DeGarmo by 1.3 million out of a record 65 million votes cast. But it was third-place finisher Jennifer Hudson who really won. A student journalist might get similarly bad results by surveying only athletes about school athletics. You have to cast a wider net.
44. Question types: Choose a variety
closed
open
also: filter questions, e.g., have you ever smoked marijuana? (if not, then you can’t take the marijuana smokers’ survey)
47. Same question, three types
multiple: good for a poll/quick react, especially if localizing
matrix: more depth/better for a feature on tech and teens vs. other groups
text: could go either way (thoughtful, or tossed off). If you get names, you could follow up.
52. Watch your words
Question #1 emphasizes cost (“at public expense”) while Question #2 emphasizes choice (“any school, public or private”). Changing the emphasis reverses the results — if you emphasize cost, most will say no, but if you emphasize choice, most will say yes. Source: Friedman Foundation School Choice poll, 2005.
55. “Obamacare” or “Affordable Care Act”?
Source: “Polling Matters,” Gallup.com, 11/20/13
57. #3. Maximize response rate
analytics here
There’s always a tradeoff between anonymity and analytics. Anonymity can boost your response rate, especially with controversial topics, but analytics can suffer (you can’t nag someone you can’t find). The partial surveys above on online censorship will stay that way because I can’t follow up with the respondents. Rats!
58. Analyzing results
• Be humble. You have not surveyed the universe, just one group of people, and your questions could have included unintentional bias.
• Write the breadbox: When and where the survey was conducted, how many in the sample, how many responded, margin of error (if you know).
• Never make a survey your one-and-only piece of evidence. Support/explain with quotes, anecdotes, observation also.
• Compare/contrast your results to other surveys — five years ago, by another group, in another state, by your same paper, etc. Put the data in perspective.
59. Putting results into words — rhetorical strategies
• Set up/attribute: “According to a DATE survey of HOW MANY (people, students, marijuana smokers) conducted WHEN, xx PERCENT of the HOW MANY respondents (answered, replied, responded) that ...” (and now tell us what the survey says).
• Use the right verbs — surveys show, indicate, hint, suggest, point towards, reveal, reflect, appear to or contrast with ... but they rarely prove.
• Repeat the exact language you used in your questions, because that is what the respondent answered (not a paraphrase).
• State the biggest finding first, then dig into the lesser ones. Don’t rush.
60. Beware of false comparisons
A 40-year-old single woman “more likely to be killed by a terrorist” than to marry?
“In ‘The Marriage Crunch’ (1986), Newsweek reported on new demographic research from Harvard and Yale predicting that white, college-educated women who failed to marry in their 20s faced abysmal odds of ever tying the knot. According to the research, a woman who remained single at 30 had only a 20 percent chance of ever marrying. By 35, the probability dropped to 5 percent. In the story's most infamous line, NEWSWEEK reported that a 40-year-old single woman was "more likely to be killed by a terrorist" than to ever marry. That comparison wasn't in the study, and even in those pre-9/11 days, it struck many people as offensive. Nonetheless, it quickly became entrenched in pop culture.” —Daniel McGinn, June 2006, Salon.com
65. Recommended reading/survey websites
• Michael Traugott and Paul Lavrakas, The Voter’s Guide to Election Polls
• Ronald Czaja and Johnny Blair, Designing Surveys
• David Moore, The Super Pollsters
• LimeSurvey
• Snap Surveys
• Polldaddy
66. Beatrice Motamedi
Feel free to email q’s:
bymotamedi@gmail.com
beatrice@newsroombythebay.com
bmotamedi@urbanschool.org