NOVEMBER 30, 2011

Why We Chose Open Science

To accelerate research breakthroughs on brain diseases, the Allen Institute puts all its data online for use without fees.

By PAUL ALLEN

The Allen Institute for Brain Science in Seattle grew out of a simple question I posed in 2002 to a constellation of top people in the field: What's the most useful thing we could do to propel neuroscience forward? The consensus became our inaugural project: a comprehensive, molecular-level, three-dimensional map of the mouse brain to show precisely where every gene is active, or "expressed." It was the first step on a long road to understand how genes function in the human brain, knowledge that will point to ways to better diagnose and treat brain ailments.

A crucial aspect of this project, and of others the Allen Institute has pursued over the last eight years, is an "open science" research model. Early on, we considered charging commercial users for access to our online data. From a strictly financial standpoint, it made sense to reap front-end fees and, down the line, intellectual property royalties. The revenue could cover the high costs of maintenance and development to keep the resource current and useful.

But our mission was to spark breakthroughs, and we didn't want to exclude underfunded neuroscientists who just might be the ones to make the next leap. And so we made all of our data free, with no registration required. The Institute would have no gatekeeper. Our terms-of-use agreement is about 10% as long as the one governing iTunes.

Our facility is neither the first nor the last to use a shared database to embrace "open science" and reject the competitive, single-lab R&D paradigm. Traditional research incentives, where journal publications are the coin of the realm, tend to discourage vital sharing.

In 1982, even before the dawn of the Internet, a consortium of government agencies established the open-access GenBank.
Maintained by a division of the National Institutes of Health (NIH), GenBank now houses the sequence data from the Human Genome Project, the inspiration for our brain mapping.
In recent years the NIH has sponsored other data-sharing portals, including the Alzheimer's Disease Neuroimaging Initiative and the Neuroscience Information Framework. Private nonprofits like the Pistoia Alliance and Sage Bionetworks are curating their own open-source repositories.

But the Allen Institute remains distinct in conducting industrial-scale big science that is fundamentally collaborative. Internally, our team of scientists and support staff works together to meet the time lines and milestones that frame each large project. The team released the initial data set from a ground-breaking human whole-brain atlas last year, and it is now midstream on a project to define the circuitry between neurons and how it affects human behavior.

Most important, we generate data for the purpose of sharing it. Since opening shop in 2003, we've had 23 public releases, or about three per year. We don't wait to analyze our raw data and publish in the literature. We pour it onto the public website as soon as it passes our quality control checks. Our goal is to speed others' discoveries as much as to springboard our own future research.

The databases currently provide tens of millions of high-resolution images. The initial mouse brain atlas alone involved 600 terabytes of data, or 600 trillion bytes, more than half the total content of the Internet when we started. Since data of this volume would be of little use without effective search and navigation tools, the Institute developed a free online viewing application as well as the downloadable Brain Explorer 3D viewer, which illuminates how expressed genes are distributed throughout the brain.

Open science is a long-term and pricey proposition. It demands consistent curating, maintenance and updating of databases, and regular software and hardware upgrades. The institute offers online video tutorials on a YouTube channel and in a tutorial library.
For those seeking in-person walk-throughs or forums, it hosts training workshops and user group sessions in several areas around the country each year. These services, too, are free of charge.

It is a modest cost that is paying off as the scientific community embraces the open access model. In October, the institute's suite of databases received more than 45,000 visits, from six continents and from research organizations of every stripe: universities, government laboratories, independent institutes and biotech and pharmaceutical companies. Institute brain atlases are accelerating research on the underlying biology of a broad range of diseases, from Alzheimer's and Parkinson's to autism and schizophrenia. Growing numbers of college educators, from UCLA to the Radboud University Nijmegen in the Netherlands, are building curricular modules around our online resources.

What I've concluded is that foundations and other private funders who support scientific research also can help promote wider sharing of scientific data. Before funders write a check to a university, they should ask about the researchers' policies and track record on sharing.

On the federal level, the NIH now has strong policies on sharing data. But I'd like to see the agency do even more to put its funding where its directives are. I propose that the NIH, along with the National Science Foundation and the U.S. Department of Education, direct funding into grant awards for management and curation of existing research data of special value.

That would siphon some money from traditional research grants for new work. But I think we'd get more bang for our buck by making more data more useful to more scientists and, by extension, to the world community that will benefit from their work.

Mr. Allen, the co-founder of Microsoft with Bill Gates, launched the nonprofit Allen Institute for Brain Science in 2003.