Let’s begin with this man, born in San Francisco, raised in Italy, Virginia, and the Bay Area, and, most importantly from my point of view, a fellow UC Berkeley alum. Go Bears! Here in Redmond, he is perhaps best remembered as a Technical Fellow who joined Microsoft in 1995. Not to get too nerdy, but among his best-known achievements are granular database locking, two-tier transaction commit semantics, the "five-minute rule" for allocating storage, and the data cube operator for data warehousing applications. See http://en.wikipedia.org/wiki/Jim_Gray_%28computer_scientist%29
Before Jim, there were three paradigms of Science.
The shift from explaining our surroundings through the supernatural or mythological to explaining them through natural laws.
Photo Credit: http://www.flickr.com/photos/marymaddux/4801937864/sizes/l/in/photostream/
Rather than solving theoretical problems to understand the world around us, we start with the data and direct software to mine enormous databases for relationships. We discover the rules by studying the outcomes.
Photo Credit: http://www.flickr.com/photos/bopuc/1771812/sizes/o/in/photostream/
Eric Horvitz was working at a VA hospital and realized that patients with congestive heart failure seemed to flood the hospital during the holidays. The reason? All that salty food (gravy!). It got him wondering – what do the patients who keep ending up in the emergency room over and over again have in common? So he developed a program that scanned 300,000 patient records, involving hundreds of thousands of variables, to “learn” patient profiles. Adding new patient data – like whether they live alone – allows the program to determine the probability that the patient bounces back into the hospital system.
In the data mining paradigm, more data is almost always better. A patient’s health, especially for a patient with congestive heart failure, is certainly tied to the medical system. But health is not determined solely by medicine. It’s a multi-dimensional problem that is influenced by the food we eat, what we do for a living, who we live with, and how we live. These are all types of data that the medical profession is not usually privy to, but that are collected in all kinds of ways – in Weight Watchers databases, social worker files, HUD housing information, and more. If we could tie all these data sets together, what could we learn about these same patients?
Pre-req: Nonprofits have to understand what data they have, and they need tools to move and report on data. I’d say that the sector is getting ready for this.
Pre-req: Easily ask questions of disparate data. Understand that data about oceans may be important in understanding the spread of infectious diseases. We have to think about data beyond both our organizations and our sectors. We have to get very multidisciplinary here.
The New Data Imperative
The Data Imperative
Holly Ross, NTEN
Kurt Voelker, Forum One Communications
Ideas from:
The Big Idea: The Next Scientific Revolution
HBR, November 2010, by Tony Hey
Discussion Q’s:
How do we raise the importance of data across the sector?
Funders: Can you make grantee reporting data valuable for more than your annual report?
What’s the sector’s data sharing manifesto? What would you sign?
What’s the low hanging fruit?
Can we structure unstructured data?