Democratizing Data


My talk at O'Reilly IgniteBoston on 2/12/09, arguing that the passage of the economic stimulus package, combined with the current economy, makes this the perfect time to introduce a data-centric "democratizing data" approach that gives workers, regulators, the public, and watchdogs real-time access to critical information. Video version:

Published in: News & Politics
  • what are we doing wrong?
  • Great presentation David ... look forward to seeing you in NZ in May.


  1. 1. Democratizing Data to transform government, workplaces & our lives. IgniteBoston, Feb. 12, 2009. W. David Stephenson, Stephenson Strategies. Hello. Tonight I’d like to give you a preview of a book I’m writing for O’Reilly Media, “Democratizing Data to transform government, workplaces, and our lives,” which is scheduled for July publication. Originally I was to co-author the book with Vivek Kundra, Chief Technology Officer of the District of Columbia, and a true trailblazer in this field. However, fortunately for the US and unfortunately for me, President Obama has chosen Vivek to become the Deputy Director of the Office of Management and Budget, in charge of all federal e-government initiatives, so what you see tonight is what you get!
  2. 2. Now, I don’t know about you, but for me, data used to be good for one thing, and one thing only: figuring the Sox’ batting averages. I’m a right-brained, creative type, and row upon row of numbers left me absolutely cold.
  3. 3. But increasingly, numbers started to intrude on my life, and I couldn’t ignore them anymore. Numbers such as how much the local aid my town gets was going to be cut...
  4. 4. … how much the damn war was costing, and how much it was diverting from things we should be doing, such as providing quality health care …
  5. 5. … and, recently, documenting exactly how dire our situation had become.
  6. 6. Democratizing Data: How free access will transform our lives. Ignite Boston, Feb. 12, 2009. W. David Stephenson, Stephenson Strategies. But when I got interested in data, I found it was pretty hard to get at. Remember the end of “Raiders of the Lost Ark,” when the Ark of the Covenant was moved to a government warehouse? You knew it would never be seen again. That’s what seems to happen with a lot of data. We pay taxes so government can collect data, and you can bet companies know all about our shopping habits. Our activities and lives are their raw material. But once the data are collected, most citizens -- and a lot of employees, for that matter -- don’t have a clue where they are stored or how they’re used. Even worse, that robs us of important tools that could improve organizations’ performance and cut their operating costs.
  7. 7. Fast forward, and lo and behold, in the latest Indiana Jones sequel, Indy retrieved the Ark! In my book, that’s an omen that you can’t keep things hidden forever! Similarly, closely controlled and long-lost data are being liberated by a growing demand for transparency from watchdog groups, the media -- and us. That demand is driven by outrage about how TARP money was or was not spent, and by concern that the stimulus package be as effective as possible. The time has come to democratize data -- to make it available when and where people need it to do their jobs or to improve their lives. The result will be change and benefits in every aspect of our lives.
  8. 8. Beyond shedding light on how government operates, far-reaching and unprecedented change can result when we make reams of data available, plus tools to portray them visually. Edward Tufte, generally acknowledged as the leading thinker on data graphics, says that even the most skilled statisticians often find that representing data visually is the most insightful way of making sense of them: “Modern data graphics can do much more than simply substitute for small statistical tables. At their best, graphics are instruments for reasoning about quantitative information. Often the most effective way to describe, explore and summarize a set of numbers -- even a very large set -- is to look at pictures of those numbers. Furthermore, of all methods for analyzing and communicating statistical information, well-designed data graphics are usually the simplest and at the same time the most powerful.” This example is a Google mashup Jon Udell whipped up quickly to highlight pothole complaints to the DC Department of Public Works, and to track -- on a real-time basis (because the city releases that data automatically) -- the repairs’ status. Sure, you might find that information in a chart, but who’d sift through pages of records in hopes of possibly finding the one or two that applied to their neighborhood? By contrast, if you saw this map, and lived near one of the pointers, wouldn’t curiosity compel you to click on it? Wouldn’t the fact that it includes not only where the pothole is and when the complaint was made, but also the repair status TODAY, both fascinate you -- and provoke you to call the DPW if it’s now 3 months later and the map shows the repair still hasn’t been made? Thus, a simple map can be the impetus for citizen awareness -- and greater agency accountability.
Incidentally, this example also illustrates an important aspect of data visualizations, reflected in the democratizing data concept: while many are official organization projects, many more are done by individuals or groups with a passion for a specific issue, such as …
  9. 9. … Rami Tabello’s, documenting illegal billboards in Toronto ….
  10. 10. …. Adrian Holovaty & Dan O’Neill’s EveryBlock …
  11. 11. …. and Jacqueline DuPree’s documentation of neighborhood issues in Southeast D.C.
  12. 12. Some visualizations combine various databases to illustrate convergence, contrasts or possible causality. This example is Neighborhood Knowledge Los Angeles, a collaboration between UCLA and community activists. Their motto: “neighborhood improvement and recovery is not just for the experts.” This is a great example of democratizing data’s impact, because it combines and maps data on 7 “problem indicators” (including code violations, property tax delinquencies, and fire records) that might otherwise have remained isolated in various databases in various agencies within city government. However, when the data are brought together and displayed on a map of a single block, that’s a red flag to city officials to intervene NOW with coordinated services to halt the decline. For the first time, it’s really possible to break down old barriers and work smart!
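The idea of bringing separate agencies' records together block by block can be sketched in a few lines of Python. This is only an illustration, not how Neighborhood Knowledge Los Angeles is actually built: the block IDs, the three sample record lists, and the `threshold` cutoff are all invented for the example.

```python
from collections import defaultdict

# Hypothetical extracts from three separate agency databases.
# Each entry is the ID of the block where one record was filed.
CODE_VIOLATIONS = ["block-12", "block-12", "block-40"]
TAX_DELINQUENCIES = ["block-12", "block-07"]
FIRE_RECORDS = ["block-12", "block-40"]


def indicators_by_block(sources):
    """Map each block to the set of indicator names with at least one record there."""
    result = defaultdict(set)
    for name, block_ids in sources.items():
        for block in block_ids:
            result[block].add(name)
    return dict(result)


def flag_blocks(sources, threshold):
    """Return, sorted, the blocks where at least `threshold` different indicators converge."""
    combined = indicators_by_block(sources)
    return sorted(b for b, names in combined.items() if len(names) >= threshold)


sources = {
    "code_violations": CODE_VIOLATIONS,
    "tax_delinquencies": TAX_DELINQUENCIES,
    "fire_records": FIRE_RECORDS,
}

print(flag_blocks(sources, threshold=3))  # → ['block-12']
```

No single list singles out block-12; it stands out only once the three sources are read side by side, which is the point the slide makes about data that would otherwise stay isolated.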
  13. 13. “… put together big enough and diverse enough groups of people & ask them to make decisions affecting [the] general interest, [and] that group's decisions will, over time, be intellectually superior to the isolated individual, no matter how smart or well-informed he is.” -- The Wisdom of Crowds. Equally important, web-based data visualization sites often include a variety of community-building Web 2.0 tools such as topic hubs, tags, and discussion areas. They make it easy to focus many individuals’ and groups’ attention on a policy issue, increasing the chance that new insights will emerge precisely because of the interplay of so many perspectives. What could be more democratic? As James Surowiecki wrote in “The Wisdom of Crowds,” “… put together big enough and diverse enough groups of people & ask them to make decisions affecting matters of general interest, [and] that group's decisions will, over time, be intellectually superior to the isolated individual, no matter how smart or well-informed he is.”
  14. 14. <food> <name>Homestyle Breakfast</name> <price>$6.95</price> <description> two eggs, bacon or sausage, toast, and our ever-popular hash b </description> <calories>950</calories> </food> 1st: tag & syndicate data. But the devil’s in the details. As of now, there’s far too little data available, in a timely and usable fashion, to workers, regulators, and/or the public. It’s time to switch to a data-centric approach, in which usable data is accessible to all sorts of applications and devices, automatically. The first, and most important, step is to structure data in formats such as XML or KML that allow the data to be identified and read by both programs and devices. Equally important, the data must be syndicated in streams such as RSS or Atom, so that it is delivered automatically without any additional effort on users’ part. In fact, Princeton researchers last year released a paper making a startling assertion. They said the single most important step government can take to make web sites that really serve the public is to concentrate its attention on data streams: “Rather than struggling, as it currently does, to design sites that meet each end-user’s need, we argue that the executive branch should focus on creating a simple, reliable and publicly accessible infrastructure that exposes the underlying data.”
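To make "tag and syndicate" concrete, here is a minimal Python sketch that turns a few records into an RSS 2.0 feed and then reads the feed back the way a subscribing program might. It uses only the standard library. The service-request records, the feed title, and the example.org URL are invented for illustration and do not describe any real city feed.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime

# Hypothetical service-request records; every field and value is made up.
REQUESTS = [
    {"id": "SR-1001", "type": "Pothole", "address": "100 Example St",
     "status": "Open", "reported": datetime(2009, 2, 1, 9, 30, tzinfo=timezone.utc)},
    {"id": "SR-1002", "type": "Pothole", "address": "200 Sample Ave",
     "status": "Closed", "reported": datetime(2009, 1, 15, 14, 0, tzinfo=timezone.utc)},
]


def build_feed(requests, title, link):
    """Return an RSS 2.0 document (as a string) with one <item> per record."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Structured service-request data"
    for r in requests:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = f"{r['type']} at {r['address']}"
        ET.SubElement(item, "guid", isPermaLink="false").text = r["id"]
        ET.SubElement(item, "pubDate").text = format_datetime(r["reported"])
        ET.SubElement(item, "category").text = r["status"]
    return ET.tostring(rss, encoding="unicode")


def open_requests(feed_xml):
    """Parse the feed and return the IDs of items still marked Open."""
    root = ET.fromstring(feed_xml)
    return [item.findtext("guid") for item in root.iter("item")
            if item.findtext("category") == "Open"]


feed = build_feed(REQUESTS, "Example city service requests",
                  "https://example.org/feeds/requests")
print(open_requests(feed))  # → ['SR-1001']
```

Because each value sits in a named tag, the consumer never has to scrape a web page: any program that understands RSS can pick out the open requests, which is what lets third parties build maps and alerts on top of the same stream.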
  15. 15. Transparency begins at home. 2nd: give workers data they need. Curiously, although a growing range of government agencies release public data streams, almost none provide them to their own workforces, to give workers actionable data precisely when and where they need it to do their work more efficiently. Agencies -- and corporations -- need to follow the District of Columbia's lead, and apply the same strategy behind the firewall first. After all, agencies’ employees may be struggling with incompatible databases, or may need to reach across agency “silos” to see if there might be synergies between programs. And employees from another agency may be able to provide new insights simply because of their differing life experiences and expertise. Also, as more young workers, who have never known life without the Web, join governmental workforces, they’ll naturally ask why tools they’ve used can’t be used in government. A data graphics project can empower them and tap their expertise.
  16. 16. 3rd: release the data. Several federal and state agencies now publish a variety of data feeds. The most exciting model in the US is the District of Columbia’s Citywide Data Warehouse. It provides real-time numerical and geospatial feeds, drawn from more than 250 data sets, ranging from crime reports to building permits to all purchase orders over $2,000. Anyone may access the feeds. In fact, a major reason why they are issued is to invite the media, community groups and watchdog organizations to examine -- closely -- the District’s internal operations, and to hold them accountable. After a long legacy of corruption, the DC government is earning public confidence, not through patronizing platitudes, but through a transparent “don’t trust us, track us” invitation to check the facts. Given the loss of confidence in the federal government and industry in the wake of the financial collapse, it is urgent that they follow the District of Columbia’s lead.
  17. 17. •$50,000 •30 days •47 apps •4,000% ROI! 4th: make public co-creators. Finally, the cutting edge of democratizing data is using it to invite your customers or citizens to become co-creators. That’s what my co-author, Vivek Kundra, did as Chief Technology Officer of the District of Columbia. His Apps for Democracy contest was open to any developer, anywhere. They were invited to use one or more of DC’s data feeds and create an open source app that would benefit the public. In one month, developers created 47 different usable apps, at a total cost to DC of $50,000 -- $20,000 of that for prizes -- for an estimated ROI of 4,000%. Now that Vivek has been named director of e-gov for the Obama Administration, look for this same sort of innovative public partnership to be replicated nationwide.
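The slide's figures can be unpacked with a little arithmetic. The talk does not say how the 4,000% estimate was calculated, so the sketch below assumes the common definition of ROI as net gain divided by cost. Under that assumption, it works backwards from the stated cost and ROI to the value the apps would have to represent.

```python
def roi_percent(value, cost):
    """ROI as a percentage of cost, assuming ROI = (value - cost) / cost."""
    return (value - cost) / cost * 100


def implied_value(cost, roi_pct):
    """Value implied by a given ROI percentage under the same definition."""
    return cost * (1 + roi_pct / 100)


COST = 50_000    # total cost to DC, as stated in the talk
APPS = 47        # apps produced, as stated in the talk
ROI_PCT = 4_000  # estimated ROI, as stated in the talk

value = implied_value(COST, ROI_PCT)
print(f"Implied value of the apps: ${value:,.0f}")    # → $2,050,000
print(f"Average cost per app: ${COST / APPS:,.2f}")   # → $1,063.83
```

In other words, if the ROI was computed this way, the estimate treats the 47 apps as worth roughly $2 million of development work, bought for about $1,064 apiece.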
  18. 18. Benefits: •More informed policy debate •Consensus building •Better legislation •Transparency •Less corruption •Efficiency •Lower costs •Co-creating. The potential benefits of democratizing data are many and varied: • more informed policy debate, grounded in fact rather than rhetoric • consensus building • better legislation • greater transparency, less corruption, and greater accountability • optimized program efficiency and reduced costs • new perspectives, especially when “the wisdom of crowds” emerges. Who would have believed that dry data -- with a healthy dose of Web 2.0 magic -- could become the engine to involve the public in governmental transformation!
  19. 19. To learn more about democratizing data, contact: W. David Stephenson, Stephenson Strategies, 335 Main Street, Medfield, MA 02052, 508 740-8918 … and watch for “Democratizing Data,” coming in July from O’Reilly Media! To learn more about transparent government and how to create the processes and policies to make it a reality, contact: Stephenson Strategies, 335 Main Street, Medfield, MA 02052, (617) 314-7858