Moritz A Universe Of Data
Some Notes on Digital Data – with a suggestion
Tom Moritz / Internet Archive February, 2009
A UNIVERSE OF DATA???
What is “data”? The US NSF DataNet solicitation defines “data” as: “Any
information that can be stored in digital form and accessed electronically, including,
but not limited to, numeric data, text, publications, sensor streams, video, audio,
algorithms, software, models and simulations, images, etc.” i This definition is
technically acceptable but not scientifically epistemic. In fact, it is useful to think of
“data” in two distinct ways. “Data” refers (as in the DataNet definition) to the
computer readable code that is stored in, accessed from or flows between
computers. “Data” also means precise, well‐defined representations of observations,
descriptions or measurements of a referent (object or event) recorded in some
standard, well‐specified way.
The more inclusive DataNet definition has the virtue of forcing us to consider a
unified, holistic approach to knowledge and to the formal resources that inform and
express it; we are forced to confront the Web as it exists today.
HOW MUCH DATA?
In a now famous quip, Lewis Carroll noted that the perfect scale for maps was 1:1
but that farmers tend to become disgruntled when such maps are unrolled over
their fields. The notion that we could theoretically record “everything” in real time
– “1:1 capture” – leaves us to ponder the limits of “data” collection, management
and longevity – full‐life‐cycle ii curation and stewardship. With the evolution of
satellite coverages, nanotechnology, robotics and embedded network sensors, it is
possible, for example, to systematically record presence/absence data for birds at a
nesting site – at every nesting site in a given area ‐‐ 24‐7, forever [SEE for example:
http://www.jamesreserve.edu/webcams.lasso?CameraID=Cam14 ] iii or for that
matter to record every human heartbeat. iv And to archive these data in perpetuity?
(The casual assumption that we might comprehensively save all data is belied by a
recent forecast projecting that in 2007, the total data produced on earth for the first
time exceeded the available storage. v)
i
Digital Data Preservation and Access Network Partners (DataNet) Program Solicitation NSF 07-601, p. 5.
iii
Or as another instance see the recent NYT article: Natalie Angier, “Tracking forest creatures on the move.”
NYT Feb 2, 2009
http://www.nytimes.com/2009/02/03/science/03angier.html?_r=1&scp=1&sq=tracking%20mammals&st=cse
iv
The California poet William Everson once asked poignantly: “And when the last coyote has been
tagged…?”
v
“…the amount of information created, captured or replicated exceeded available storage for the first time in
2007. Not all information created and transmitted gets stored, but by 2011, almost half of the digital
universe will not have a permanent home.” John Gantz et al. (IDC) The diverse and exploding digital
universe; an updated forecast of worldwide information growth through 2011. (March, 2008)
www.emc.com/collateral/analyst-reports/diverse-exploding-digital-universe.pdf
vi
Serge Bloch in NYT: Natalie Angier, “Tracking forest creatures on the move.” NYT Feb 2, 2009 SEE:
http://www.nytimes.com/2009/02/03/science/03angier.html?_r=1&scp=1&sq=tracking%20mammals&st=cse
vii HISTORIC BUDGET SUPPORT FOR NLM
viii
R. Lewontin, The Triple Helix: Gene, Organism, Environment
ix
“Property rights in science are whittled down to a bare minimum by the rationale of the scientific ethic.
The scientist’s claim to “his” intellectual “property” is limited to that of recognition and esteem which, if
the institution functions with a modicum of efficiency, is roughly commensurate with the significance of
the increments brought to the common fund of knowledge.” Robert K. Merton, “A Note on Science and
Democracy,” Journal of Legal and Political Sociology 1 (1942): 121.
x
SEE for example: Peter Galison, “The Collective Author,” in M. Biagioli and P. Galison (eds.) Scientific
Authorship: Credit and Intellectual Property in Science. NY, Routledge, 2003.
xi
SEE: THE ROLE OF SCIENTIFIC AND TECHNICAL DATA AND INFORMATION IN THE
PUBLIC DOMAIN: PROCEEDINGS OF A SYMPOSIUM. J.M. Esanu and P.F. Uhlir (eds.), Steering
Committee on the Role of Scientific and Technical Data and Information in the Public Domain, Office of
International Scientific and Technical Information Programs, Board on International Scientific
Organizations, Policy and Global Affairs Division, National Research Council of the National Academies.
xii
SEE L. Lessig, Code
xiii
SEE Julian Birkinshaw and Tony Sheehan, “Managing the Knowledge Life Cycle,” MIT Sloan
Management Review, 44 (2) Fall, 2002: 77.
xiv SEE for ex.:
xv
A short list is relatively easy to compose…