Facing Up to ARMA-geddon: Preserving Cultural Records in the Digital Age
ARMA-CT, New Haven, CT. May 22, 2013
Greg Colati, University of Connecticut
NOTE: The presentation was far more informal than this text indicates, with lots of discussion
throughout. In fact, while all of the main ideas contained here were discussed, very little of what
I said followed this script. I make it available here for those of you who might be interested in a
more orderly development of the ideas that I presented.
Slide 1
I want to thank the organizers for giving me a few moments to speak about sustaining digital
content over the long term. It is a topic that archivists and records managers are
sometimes seen as viewing from opposite sides of the same coin. But I don’t think that is
necessarily true. At the highest level, we have common interests. I will speak tonight from my
perspective as an archivist, and I hope it will resonate a bit with your vision of the world. And I
sincerely hope it begins a dialog tonight and beyond about the commonalities of our two related
professions.
For centuries, the desire to document human activity and culture was hampered by the lack of
records. Governments and other official entities controlled information, and the historical
record was filled with the stories of the privileged. Underdocumented communities had no voice
and made few appearances in the story of human history.
Over time, technology has made it more possible for unofficial voices to be heard and to endure.
The printing press, home photography, wood pulp paper, the phonograph, the photocopier,
videotape, and digital cameras all have increased the output and convenience of information
creation and distribution. Since the arrival of the Internet, overabundance has replaced scarcity
as the primary challenge to preserving and making sense of the cultural record.
With so much being generated, how do we decide what to preserve, how do we ensure that it will
endure, and how can we help people of the future make sense of it?
Slide 2
From the beginning of recorded information, culture has depended on the sustainability and
authenticity of the historical record. Myriad dystopian novels and films based on the distortion
and manipulation of the historical record echo George Orwell’s oft-quoted “Who controls the
past controls the future: who controls the present controls the past.”
However, it may be true that as much if not more culture has been lost through the more
mundane causes of neglect, inefficiency, and apathy than through purposeful destruction for nefarious
goals. And the principal culprit has been our own culture and human attitudes. The very things
that make a modern free society possible are the things that make it difficult to document a free
society. Decentralization, the lack of controls, the ability to think and do as you wish. This has
perhaps never been truer than it is today, when everyone can be a publisher, a chronicler, and a
documentarian with the click of a mouse. Yet those publications, chronicles, and documents
exist in an environment that is potentially a more ephemeral recording medium than at any time
since the invention of writing.
Slide 3
We can fast forward through the technological development of record keeping and cultural
documentation with two thoughts in mind. In nearly every case, technological advances over the
centuries have expanded the ability of people to create and distribute their own records. And given the
choice, society always chooses the recording medium that is most convenient over the one that is
most permanent.
Some years ago, Paul Conway, now at the University of Michigan, created what has become a
much reproduced graphic representation (or what we would now call a data visualization) of
what he called the Dilemma of Modern Media—this was before cell phone cameras, Facebook,
and YouTube—that shows the inverse relationship between information density of recorded
media and the longevity of that media. Papyrus replaces clay tablets, scrolls replace papyrus,
the Codex—by the way the first random access memory device—replaces the scroll. On through
moveable type, wood pulp paper, newsprint, microfilm, compact disks and other optical media,
and by extension to the cloud services that are nearly ubiquitous today.
But as we see in the graph, as early as the end of the 19th century, the medium of the historical
record began to become as much of an issue as the contents of the record itself.
As long as history was recorded with scratch marks on a physical medium or to a lesser extent
photographs on glass or film, it was not only permanent, but interpretable with the human eye.
Slide 4
The widespread use of “coded” information transfer began to change all that. Information
transfer that requires intervening technology (whether it be a telegraph operator or an optical
drive) becomes inaccessible to the average human without the reading or interpreting device.
This occurs perhaps as early as the telegraph, and certainly in popular culture by the time of the
wax cylinder. I cannot simply look at a wax cylinder of a spoken word recording and “read” the
words on it. But even the wax cylinder and the flat LP disk are analog renderings of actual sound
waves. The advent of magnetic, optical, and digital media further changed the landscape. Now
even the written word was subject to an intervening technology, and the permanence of the
documentary record was subject not only to the vagaries of humidity and temperature, but also
much more to the marketplace of technology.
Slide 5: Cultural Armageddon, 1980s: Brittle Books
Even so, the primacy of print persisted well into the 20th century, to the point where, by the
1980s, the crisis of cultural documentation was brittle books, and the solution was “mass
deacidification” and the advent of mass produced, acid-free paper as the “permanent” solution
to the crisis of preserving the cultural record of the printed word.
Slide 6: Cultural Armageddon, 1990s: Media Obsolescence
At the same time, but much less noticed, typewriters began to give way to word processors, and film
to videotape and then digital recording. By the 1990s the historical record had become dependent
upon the recording medium as never before. Marshall McLuhan famously said that “the medium
is the message.” For archivists, he could have said that the “medium of the medium is the
message.” By the 1990s we were looking at a NEW cultural Armageddon of media obsolescence.
Mass migration and emulation began to replace mass deacidification in an attempt to preserve the
cultural record of the early Information Age. We began to be faced with questions about what
exactly the “record” was.
For example, early in my career, archivists spent a good bit of time determining the “record
copy” of archival records, tracing “originality” to something called the “ribbon copy” of a letter
or document. The ribbon copy was, of course, the piece of paper that had come into contact with the
typewriter ribbon, rather than a carbon copy or even a xerographic reproduction. Today the
concept of originality is much less clear, as every copy of a digital file is in some respects an
identical twin of every other copy, and the viewing experience of anyone interacting with the
information in that digital file is dependent not only on the characteristics of the file itself, but also on
the viewing environment of any one particular user. This question of originality and authenticity
is out of our scope today, but is nonetheless an important topic of conversation.
Digital content preserved on high-quality carrier media was the standard solution for the time,
and migration to new forms of carriers was the permanent solution to the problem.
Slide 7: Cultural Armageddon, 2000s: File Format Obsolescence and the Digital Dark Age
By the turn of the current century, there were fears that the ephemerality of not just media but
also digital file formats would lead to a “Digital Dark Age” when obsolete media and the
inability of modern equipment to interpret old formats would render mute the voices of the
computer age. Again, thanks to the work of archivists, computer scientists, librarians and many
others, these fears were shown to be largely unfounded. File format normalization and
identification of “archival” or sustainable file formats was the NEW permanent solution.
While we continue to lose the historical record at an alarming rate for traditional reasons like
natural disasters, societal collapse, and media obsolescence, we lose much less of it due to file
format obsolescence.
These solutions addressed what was and is a backward looking problem: How do we sustain
access to scarce information resources? They did not prepare us to address the next great challenge
to preserving human culture.
Slide 8: Content overload: the NEW cultural Armageddon
The new cultural Armageddon is not how we can sustain access to scarce information, but how
we can collect, manage, and make sense of the explosion of information to be collected. Not long
ago the BBC reported an estimate that by 2007, 94% of stored information was in digital form and the total
amount of stored information at that time was in excess of 295 exabytes. Researchers at the
University of Southern California estimated that the sheer quantity of digital information has
increased exponentially in the last 25 years, and shows no signs of slowing.
An exabyte is one billion gigabytes, or 1,000 petabytes, or a 1 followed by 18 zeros (10^18 bytes).
Slide 9
More recently, the Digital Universe report estimated that in 2011 alone more than 1.8 zettabytes
(1 ZB = 1,000 exabytes) were created, and that in the near future the number of files created will
increase by a factor of 75. But most importantly it found that 75% of all that information will be
created by INDIVIDUALS, not associated with any formal or organizational records
management system. And even more significant, much of this information will be stored in
systems that are NOT dedicated to preservation or recordkeeping.
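To put the figures quoted above on a common scale, here is a minimal sketch of the unit arithmetic. It assumes the decimal (SI) definitions of the units; some systems report binary multiples instead, which would give slightly different numbers. The function name and the table of units are illustrative only.

```python
from fractions import Fraction

# Decimal (SI) storage units expressed in bytes. This is an assumption:
# binary multiples (powers of 1024) are also in common use.
UNITS = {
    "GB": 10**9,   # gigabyte
    "PB": 10**15,  # petabyte
    "EB": 10**18,  # exabyte
    "ZB": 10**21,  # zettabyte
}


def convert(amount, from_unit, to_unit):
    """Convert an amount between units using exact rational arithmetic.

    `amount` may be an int or a decimal string such as "1.8"; using
    Fraction avoids floating-point rounding in the result.
    """
    return Fraction(amount) * UNITS[from_unit] / UNITS[to_unit]


if __name__ == "__main__":
    # The 295 exabytes stored by 2007, expressed in gigabytes.
    print(convert(295, "EB", "GB"))
    # The 1.8 zettabytes created in 2011, expressed in exabytes.
    print(convert("1.8", "ZB", "EB"))
```

On these definitions, the amount created in 2011 alone is roughly six times the total that had been stored by 2007.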
Slide 10
MANY of those individuals will use third-party web-based applications that have end-user license agreements (EULAs) that are seldom read
or understood by users of those systems.
Slide 11 Digital Attic
Others will simply not bother to clean out the digital attic.
Slide 12 Life Documented
Beyond this, the technology available makes it potentially possible to record virtually everything.
From lifelogging to lifecasting, we can all be part of the 24/7 social community.
Slide 13 Google Glass
It just keeps getting easier and easier to record every look, every activity, either in person…
Slide 14
…or remotely
In the sense of information creation, we have truly crossed the Digital Divide, and there is no
going back.
Slide 15
More information than we need?
Slide 16 Lifecycles
I think there is at least one thing we can take away from this: If we are not able to select,
manage and preserve digital content, we are missing out on more than 90% of the material that
could constitute the historical record.
And THAT is a lot more than is being lost through any other means. Archivists have adopted the
idea of the curation lifecycle—something that records managers should find familiar—to think
about how to deal with the mountains of digital content. It is not much different than what we
have always done, just with better graphics.
The way forward is becoming clearer, however. Rather than developing strategies for managing
different material types, carrier media, and file formats, we are beginning to think in terms of
separating the informational content from its storage and even delivery medium.
Slide 17
We want to think of these cultural artifacts less as objects and more as data. Data that is
maintained in a way that makes it possible to be used, arranged, and rearranged to tell stories.
Slide 18 Five Equations
The flood of information is just one of the challenges of the current age. What users expect from
us also drives what we do and how we do it. The desire for permanence is just one of the factors.
This evolution in content creation is coupled with changing attitudes toward research.
Researchers and archivists are applying sophisticated tools and applications to digital
objects that are seen as pieces of data. Data, in an archival context, is any information suitable
for manipulation, use, or reuse in an electronic environment. This includes metadata, which is
the “sum total of anything we know about an object,” as well as digital objects themselves, which,
by their binary nature, are inherently data.
Combining our collections into aggregations of data that can be manipulated, used, and reused
without losing their authenticity is a way to transform our digital objects into useful and
valuable data.
Visualizations are modern ways to tell stories that are not all that different from traditional
storytelling; they just use information in a different way. If primary sources are the raw material
of storytelling, our primary sources must support modern visualization.
Slide 19:
The ephemerality of digital archives and the systems that manage and deliver them poses a
dilemma for scholarship that is based on citation. How do we ensure that something I cite today
is going to be there tomorrow?
Slide 20 Digital Repositories
As we have seen, the Humanities community has embraced digital scholarship. And like any
scholarship, digital humanities scholarship is dependent on the availability of the resources. We
know that today’s resources require not only an intervening technology to experience, but an
intervening technology infrastructure--that we call cyberinfrastructure--to make it possible for
scholars to interact with our data and to turn it into stories. Collaborating with our partners on
the other side of the reference desk is a way for us to help them tell their stories.
Slides 21, 22, 23, 24: Use and Reuse
If all of our content has become data, our mission and activities nevertheless remain the same,
even if our tools are changing. We continue to appraise, collect, contextualize, and make
available our collections in ways suitable to each and all of our communities of interest.
Slide 25: Four “-Itys”
With the evolution of the material in our care, the tools required to manage it have also
evolved. Right now, these tools mean systems that can support preservation activities like error
checking for detecting “bit rot,” multiple redundant storage arrays, and automated extraction of
technical metadata that can be used to plan format migration. Creating and maintaining systems to
preserve digital assets is expensive, and usually beyond the reach of all but the largest
organizations. However, in aggregate, the so-called long tail of small to medium organizations
probably contains more historical content than all but the very biggest collections.
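As an illustration of the kind of error checking mentioned above, here is a minimal sketch of a fixity check: record a checksum for every file when it is deposited, then recompute the checksums later and report any file that is missing or has changed. The function names, the in-memory manifest, and the choice of SHA-256 are assumptions made for this example; they do not describe how any particular repository system works.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def audit(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or no longer match."""
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems


def demo() -> list[str]:
    """Deposit two made-up files, corrupt one, and run an audit."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "letter.txt").write_bytes(b"ribbon copy")
        (root / "photo.tif").write_bytes(b"\x00\x01\x02\x03")
        manifest = build_manifest(root)
        # Simulate bit rot by altering a single byte of one file.
        (root / "photo.tif").write_bytes(b"\x00\x01\x02\x07")
        return audit(root, manifest)


if __name__ == "__main__":
    print(demo())
```

A real system would store the manifest separately from the content and compare against redundant copies, so that a damaged file can be replaced as well as detected.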
Slide 26
If we can collaborate to build a cyberinfrastructure for digital culture in Connecticut, we will
accomplish a number of things together that we cannot do alone, no matter what funding is
available:
support sustainability of digital assets for all
create coherent and managed digital collections that are comprehensive rather than exemplary
reflect a commitment to share digital assets on a fair and equitable basis for everyone
This has a number of advantages:
The University of Connecticut, the Connecticut State Library, and other cultural
heritage organizations in Connecticut are working together to make it possible to connect the
assets of even the smallest historical society with colleagues, scholars, and enthusiasts not only
locally but globally.
The Connecticut Digital Archive (CTDA) will aggregate not only access derivatives but also digital masters
for preservation from cultural heritage organizations based in CT.
Contributing to a content aggregator like the CTDA will make it possible to connect to even
larger aggregators like the Digital Public Library of America.
Collaborative digital preservation works to sustain Connecticut’s digital heritage, because it
makes it possible for each organization to prove its worth and sustain its own collections.
Secondarily, but perhaps more importantly in a larger sense, it supports a community of
knowledge that is larger than any one organization.
Slide 27
We have the opportunity, today, to do something that many of us in this room have dreamed
about. The reality, of course, will be imperfect, the details will be messy, and progress will seem
glacial at times. As an archivist, I see that as standard operations, and it is not daunting. From
the first archives of clay tablets, to the digital repositories of the future, we are part of a long and
respected tradition. We have solved so many other challenges of preserving and making
available the historical record; this is just the next one, and the one we have been given in our
lifetime.