Peter Orszag is the Director of the Office of Management and Budget (OMB), where the Chief Statistician of
the United States, Katherine Wallman, works. Orszag has been a prominent voice in the Healthcare Reform
debate, which he connects to 'evidence-based policy'. Surprisingly, Orszag's evidence for Healthcare
Reform comes from a Dartmouth College research center (called ATLAS), not from the Federal Statistical System,
which Wallman oversees. The Federal Statistical System is composed of dozens of agencies that operate with
varying degrees of independence within Cabinet departments. The System faces considerable cultural and
technical challenges in meeting the Obama Administration's goals for transparency.
0.02% of the Federal Budget is for Statistics
The annual cost to taxpayers for Government Statistics is less than $25, probably closer to $10
In addition to Healthcare, critical areas like Climate Change, Education, and Immigration (which is also
Homeland Security) need better statistics. Things are looking up to some degree. The 2010 Census had its
funding raised by $1B to $15B. Other statistical agencies have seen their budgets increase or stabilize. Yet
Congressional Hearings into how to improve the Federal Statistical System started two weeks ago (see
http://jec.senate.gov). Yes, Federal Statistics can be improved, but it is also important to mention that
Federal Statistics have been grossly underfunded.
“The return on Federal investment in statistics is almost infinite.”
-Andrew Reamer, Brookings Institution
Andrew Reamer is the one person I've come across in a Washington DC think tank whose focus is the Federal
Statistical System. His observation is based on the fact that you cannot put a price tag on crucial
information at a critical time.
Stories of Federal Statistics
World War II
In WW2 the United States had detailed information about the nation's industrial capacity, with records that
were at least as detailed as Germany's and far better than Japan's. Given how far the volume of US material
exceeded that of those other nations, the country's statistical services certainly helped, rather than hurt,
the War effort.
Bureau of Economic Analysis, FY2007
There was a budget showdown for FY2007. Congress approved a budget that exceeded the President's by a few
hundred million dollars. Rather than risk a veto, Congress opted to cut that money from its budget. The
Bureau of Economic Analysis (BEA) took a 5% hit from what was supposed to be $80 million. Among the programs
that BEA sacrificed were measures of specific industries in US Metro regions. So now, when the auto industry
needs bailing out, there are no recent numbers with which to gauge the auto industry's impact on Metro Detroit.
2010 American Community Survey
The American Community Survey (ACS) replaced the Decennial Census Long Form as the main provider of detailed
demographic information. The last Long Form sampled ~17% of Americans. The ACS currently samples 10-12%,
not enough for reliable conclusions at the neighborhood level. The concern is that press and politicians will
either base decisions on potentially faulty information or feel misrepresented by the inaccuracy.
Kenneth Prewitt was the Director of the 2000 Census, and currently consults for the Census Bureau. During the
recent Congressional Hearing he expressed strong support for making the Census Bureau an independent agency,
separate from the Department of Commerce where it currently resides. He is also an advocate for boosting the
scientific credentials of the Census Bureau and the Federal Statistical System, referring to their social
scientists and statisticians as Government science's step-children. Prewitt also sees a payoff from a more
scientific Census Bureau: better acquisition and use of redundant and complementary information from sources
like administrative records and government transactions (he calls it 'swipe data'), a challenge worth
undertaking.
How do government data
help to produce important
Today's science is more computational and data-driven. It makes sense that the government has a role to play,
but what role? And if the Federal Statistical System seeks to become 'more scientific', where can it look for
examples to guide its process? The example I have comes from conservation biology, and it is just one of what
are probably many cases worth considering.
Landscape Ecology, Circuitscape
In general, conservation science depends on reliable, free access to Federal government data: maps,
elevation data, and satellite images that show land cover. Ready access to the data frees
computationally-inclined scientists to develop better software tools. Here, Brad McRae of the Nature
Conservancy has applied electronic circuit theory to help determine pathways that predict the movement and
genetic differentiation of plant and animal populations in areas subject to development. The software is open
source and available at Circuitscape.org, and it has been used to investigate mountain lions, mountain goats,
and frogs in the Western US.
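To make the circuit analogy concrete, here is a minimal sketch of the underlying idea, not Circuitscape's
actual code or API. It assumes a landscape simplified to a handful of nodes (habitat patches) joined by
resistors (how hard the land between them is to cross), and computes the effective resistance between two
patches. The function name, the toy network, and the resistance values are all hypothetical, and the network
is assumed to be connected.

```python
# Hypothetical illustration only: this is not Circuitscape's code or API.
from fractions import Fraction


def effective_resistance(n_nodes, edges, source, ground):
    """Effective resistance between two nodes of a connected resistor network.

    edges is an iterable of (u, v, resistance) tuples. The ground node is held
    at 0 volts and 1 amp is injected at the source node.
    """
    # Build the weighted graph Laplacian from conductances (1 / resistance).
    lap = [[Fraction(0)] * n_nodes for _ in range(n_nodes)]
    for u, v, resistance in edges:
        g = 1 / Fraction(resistance)
        lap[u][u] += g
        lap[v][v] += g
        lap[u][v] -= g
        lap[v][u] -= g

    # Drop the grounded node's row and column so the system has a unique
    # solution, and append the injected-current vector as the last column.
    keep = [k for k in range(n_nodes) if k != ground]
    aug = [
        [lap[i][j] for j in keep] + [Fraction(1 if i == source else 0)]
        for i in keep
    ]

    # Gauss-Jordan elimination with exact rational arithmetic.
    size = len(aug)
    for col in range(size):
        pivot = next(row for row in range(col, size) if aug[row][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for row in range(size):
            if row != col and aug[row][col] != 0:
                factor = aug[row][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]

    # With 1 amp injected, the source voltage equals the effective resistance.
    return aug[keep.index(source)][size]


# Toy landscape: patches 0 and 3 are linked by an easy corridor through
# node 1 (resistances 1 + 1) and a harder one through node 2 (2 + 2).
two_corridors = [(0, 1, 1), (1, 3, 1), (0, 2, 2), (2, 3, 2)]
print(effective_resistance(4, two_corridors, source=0, ground=3))  # 4/3
```

The point of the toy example is that the two corridors together give a lower resistance (4/3) than the easy
corridor alone (2): in the circuit view, every additional pathway improves connectivity, which a single
least-cost path would not capture.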
Partners in Improving
Q: Can data provided by Federal Statistical Agencies + software innovation = Real World Impact?
I think yes. But I would be more confident if journalists, programmers, statisticians, and policy-makers
could work across disciplines and collaborate the way scientists do. In the case of Circuitscape, Brad McRae
trained as an engineer before getting his PhD in Conservation Biology with Brett Dickson at Northern Arizona.
Collaborator Viral Shah works on Interactive Supercomputing at UC Santa Barbara. Brad now works for the
Nature Conservancy in Seattle applying his theory & tools to real-world conservation planning.
Q: Why isn't it a front-page story that the Federal Government can't solve the political conflicts in order
to get good
A: "I wish one of my colleagues from the Washington Post was here. I'd turn (this question) over to them.
The Washington Post does a wonderful job of covering those issues, better than the USA Today does. Because
that's much closer to the heart of what the Post is and what its
For effective collaboration there has to be willing participation. I don't think Government statisticians
trust the media. I don't think press and policy-makers know how they'll benefit from a better Federal
Statistical System. None of the groups has good outreach mechanisms to involve programmers. Leaders who
cross disciplines should emerge.
Local Employment Dynamics
Andrew Reamer describes Data.gov as a microdata collection: data on individual 'somethings' such as people,
transactions, and permits, made available to the public so that people can add value. He points to examples
like the work going on with Local Employment Dynamics, a collaboration between the Bureau of Labor Statistics
and the Census Bureau on Metro-scale employment migration trends. It combines state-provided administrative
records on employment with Federal statistics. It uses 'synthetic data', a way to alter records so that
individuals cannot be identified. It has a public online interface. And it's leading to interesting research
outcomes like 'labor sheds' and 'community sheds' that express ongoing relationships between where workers
live and where they work.
Dealing with Complexity
There are not all that many ways in which the complex world gets processed by human brains. Narrative,
statistics, algorithms, interfaces, and laws are some. The opportunity in having Journalists, Statisticians,
Programmers, and Policy-makers collaborate lies in creating a whole that is greater than the sum of its parts.
Here, 1 + 1 + 1 + 1 > 4.
Norms of Transparency
Credibility: Federal Statistical Agencies have decades of non-partisan service to policy-makers and represent
a Gold Standard in terms of the credibility of their data and analysis. Analysis and data coming from outside
don't have the same 'inside-the-Beltway' credibility. It is an open question how the Federal Statistical
Agencies can collaborate widely and maintain a solid reputation for non-partisanship.
Norms of Transparency: If somehow, some way, transparency becomes the political norm, the watchdog job (whether
it's the press's job or someone else's) gets easier. This is where, I think, buy-in from the Statistical
agencies meets evidence-based policy. In a manner similar to how scientists reproduce experiments, watchdogs
would have the means to work back from policy decisions to the evidence and, if need be, down to source data.
Protecting Privacy: Federal Laws govern the agencies' use of data and protect individuals' privacy. One risk of
advancing technology is that privacy safeguards might in the future be compromised by technologies that have
not yet been invented.
Folks in Washington DC at the Congressional Joint Economic Committee, at the Office of the Chief Statistician
in OMB, and at the Census Bureau are interested in learning your thoughts. What can I tell them? If you are
interested in continuing this conversation, let me know. Thank you.