An ERA Update 26 June 2007


Electronic Records Archives Presentation 26 June 2007

Published in: Technology, Education, Business
  • Overview. Handouts: Timeline; Risk Brochure (records at risk); 2005 Prologue article: ERA as a tool for archivists; Research Partners sheet
  • SLIDE 3: NARA’s Mission
    Before NARA was established in 1935, there was no central repository for ALL of the records created by the Federal Government’s agencies and bureaus. Federal agencies were responsible for storing their records in locations around the country, which meant that records may have been stored in attics scattered across regional locations and exposed to environmental threats such as fires, flooding, and tornadoes. Researchers would have had to travel to that agency to conduct research, or wait weeks or months for records to arrive at the agency’s Headquarters building. So around 1935, the Treasurer of the US (Andrew Mellon) went to Congress to request approval to establish and fund the National Archives, which would be a central repository for Federal records.
    NARA’s MISSION: To ensure access to records of the three branches of the U.S. Government. Records that:
    - Protect citizens’ rights
    - Hold Government officials accountable
    - Facilitate historical understanding of our national experience
    Key: NARA must provide access to records regardless of format (paper AND electronic). To ensure a PERSISTENT ARCHIVES for the Life of the Republic: no other entity has this mission.
    Fast forward to the 1960s: The increased use of computers by a growing number of Federal agencies presented the same concern for records being held at the agency level. Electronic records were being created on stovepipe systems that became obsolete; records were often not included in the agency’s Records Schedules (SF-115s) and therefore were not sent to NARA. This introduced a new risk of losing important agency records that may have historical value for researchers (genealogists, historians, attorneys, etc.) or for use by other Government officials (e.g. during Tobacco litigation).
  • SLIDE 4: ERA HISTORY
    PRESERVING ELECTRONIC RECORDS @ NARA IS NOT NEW:
    - NARA has 40 years of experience preserving and providing access to e-records.
    - The types of files have been limited primarily to structured data files: fielded, fixed-length, comma-separated ASCII files, early-generation database files, relational databases, etc.
    - Other records may be represented in a highly structured tabular manner (rows/columns).
    - NARA provides access to approx. 200K flat files.
    - Other types of electronic files (e.g. email) have received minimal processing (bit stream preservation only).
    - We have gained valuable experience in understanding the issues that need to be addressed in order to deal with the more complex files being created by Federal agencies.
    1968: NARA was already concerned about preservation of data created by agencies on computers.
    1970: The first electronic records were transferred to NARA.
    1995: Ken Thibodeau, head of the Center for Electronic Records (today: NWME), approached the Archivist of the U.S. (Carlin) with concerns that NARA was heading for mission failure; i.e. NARA/NWME was limited to accepting and preserving only the electronic records formats mentioned above, and did not have the capability to take in and preserve every electronic record being created by Federal Government agencies (PRESENT) and, potentially, more complicated formats that have not even been invented yet (FUTURE).
    1998: Ken was asked to create a Research + Development Team to understand the issues and potential solutions. (A later slide will discuss these partnerships.)
    2000: Results were presented to the Archivist, who supported creation of a new system. Separate appropriations were created to establish the ERA Program, to ensure that funding would be available to develop the system. Requirements were developed (based on an Open Archival Information System/OAIS model) and contractors were asked to bid on developing the system.
  • This is a LONG-range vision tied to NARA’s mission of being able to preserve and provide access to any Federal government records for the life of the republic…
    Several years ago, we began talking about building a system that would be able to preserve and provide continuing access to ANY type of electronic record, free from dependence on hardware or software. There were big challenges (next slide), and funding realities hit (9-11, Hurricane Katrina, Iraq war, etc.). So, we had to scale down the vision to limit the TYPES of records that the system could handle, and the NUMBER OF AGENCIES that we could work with during the initial stages.
    We’re MANAGING EXPECTATIONS: In the first few years, the initial system will have limited capabilities and functions, not as many people will use it, and only certain records will be stored in the system. The system will be built to take advantage of new technologies that become available later on, which can be added onto the existing system over time.
    So this is the VISION: EVENTUALLY (10-20 years from now?) ERA will have these capabilities, but it will always be possible to build upon it in response to new challenges and technologies.
    NEXT SLIDE: CHALLENGES
  • SLIDE 6: NARA’S Challenge
    Scope
    - Approx. 400-500 Federal agencies and their bureaus (e.g. Dept. of Agriculture: Food Safety Divisions, Agricultural Statistical Services, etc.)
    - Government agencies are very complex; the diversity of objects covers a wide range.
    - Each agency creates separate (“stovepipe”) systems that most likely do not talk to each other. NARA has to deal with ALL agencies.
    Variety
    - Over 4500 formats of electronic records (e.g. .jpg, TIFF, .mpg, .avi, .xls, .doc, .wmf, etc.) PLUS emerging formats that may be developed in the future.
    Obsolescence
    - (Media: CD-ROMs, floppies: this problem is manageable.)
    - Electronic records and systems become obsolete approx. every 18 months.
    - Transforming objects into a persistent form:
      - Need to identify the essential characteristics of the object that are to be preserved.
      - Records are transformed by tagging or encapsulating them in metadata.
      - Wrap objects in their native format.
      - To counteract obsolescence, the archive updates software mediators to translate models/metadata into forms that current technology can interpret.
      - Technology migration is an integral part of the repository’s architecture.
    - Government programs/activities will extend over multiple generations of information technology.
    Access
    - Government activities could be crippled if records were not accessible because of obsolescence.
    - Will computers be able to read documents created in the year 2000 by MS Word 75 years later? Probably not. NARA cannot become a computer museum (parts break down, space issues, spare parts, etc.).
    - We must expect changes in information technology and still maintain performance and customer service.
    Volume
    - Some agencies, like NASA, are creating large volumes of eRecords on a daily/hourly basis (e.g. satellite data, weather maps, GIS data, etc.).
    - There is already a backlog of electronic records that have arrived at NARA and cannot be processed (because we don’t have the technologies).
    - More records will be arriving as the ERA system is turned on in 2007. It will have the capability of taking in records, so the number of records in the system will increase exponentially.
    NEXT SLIDE: Examples of these challenges: the 9-11 Commission records
  • SLIDE 7: EXAMPLES OF VARIETY & COMPLEXITY
    The 9-11 Commission records are ONE example of the challenge of VARIETY and COMPLEXITY of Permanent records: they contain a wide variety of record types (wav files, image files, document files, GIS, etc.), both paper and electronic.
    - Will future researchers be able to access these records in 2075?
    - Some records are born digital and are not available “in hard copy” (right column: digital audio files, HDTV).
    - Some records are LARGE files (i.e. scanned paper documents, satellite imagery: from megabytes to terabytes).
    - Some files, such as Web pages, are dynamic and may change from day to day.
    - GIS records are VERY large and complex. There may be hundreds of views of the data that would not make sense in printed form (for example, if the object is 3-dimensional). There are also files that would be too large to send electronically from one system to another.
    The new challenge is to balance current needs with preservation.
  • SLIDE 8: RECORDS ALREADY AT NARA: Examples of VOLUME
    These are just four examples of hundreds of series of permanent electronic records that are ALREADY at NARA:
    - Clinton emails: 40 million email messages
      - These are Presidential Records.
      - They had complex attachments: image files, spreadsheets, decision documents, etc.
    - State Dept.: 25 million electronic diplomatic messages
      - The department creates and sends a large volume of electronic messages daily; these are PERMANENT records.
    - DoD: 54 million images from electronic official military personnel files annually
      - NARA needs to be able to accept Military OPFs from ALL services before 2007 (they will be scanned files).
      - Will military employees be able to access their service records in the future?
      - These records will need to be mapped to related records such as health benefits, medical records, etc.
    - Census Bureau: 600-800 million image (TIFF) files
      - Paper forms were SCANNED, creating 600-800 million image files.
      - These scanned files are in TIFF format: very LARGE files!
  • SLIDE 10: THREE COMPONENTS OF ERA PROGRAM
    So, the solution was to build the “Archives of the future.” Using NARA’s 30 years of experience, coupled with external partnerships, NARA took a very comprehensive look at what was needed to build this system. There are 3 parts to the ERA Program:
    - Research and Exploratory Development (1998 to present)
      - NARA had to look beyond itself; it realized that these were not just NARA issues/challenges. They were challenges faced by (e.g.) computer scientists and engineers who had spent their entire careers studying these issues and possible solutions.
      - The first solution was to build partnerships with communities who were already studying the continuing changes in IT and how these technologies were being used in government.
    - Acquiring a System that meets our requirements and our mission
      - Requirements were developed by studying NARA business processes (i.e. what does an archivist do to schedule, transfer, and preserve records?).
      - Requirements spelled out WHAT tasks a computer would need to be able to do to match up with the primary NARA business functions: SCHEDULING/APPRAISAL, TRANSFER/INGEST, PRESERVATION, and ACCESS.
      - There are over 800 requirements, written as “Shall Statements”: “The system shall do (archival function),” etc. (NOTE: Requirements are available on the ERA web site, under system documentation, as a comma-separated value/CSV file.)
    - Organizational Change Management
      - Ensuring that NARA (internal staff, who have traditionally processed PAPER) can successfully implement the system.
      - Ensuring that (external) users will be able to use the system. NARA wants to be sure that, if it spends millions of dollars to build this system, people will be able to USE it.
  • SLIDE 11: ERA RESEARCH PARTNERS
    Addressing the challenges requires a new relationship between the scientific and archival/records communities.
    - Until recently, engineers and archivists didn’t interact much.
    - But technology has changed and presented a common interest with common problems.
    - Engineers look for absolute accuracy and empirical results.
    - Archivists look for authenticity.
    - Collaborations have produced a major change in NARA’s perspective.
    COMPUTER ENGINEERS: They have spent their entire careers focusing on complex problems, and they jumped at the chance to work closely with NARA to find solutions.
    - Univ. of MD/UMIACS and SDSC (top right) were the first ERA partners. SDSC is examining LARGE VOLUMES of data shared by multiple (both private + public sector) agencies.
    - NNSA: a research partner AND one of the Increment 1 agencies (Slide 11)
    - IEEE, GGF, NCSA, etc.
    LIBRARY COMMUNITIES: Library of Congress, Digital Library Federation, National Agricultural Library/USDA
    ACADEMIC: Univ. of MD; Georgia Tech (they are building tools/technologies to deal with Presidential records); Massachusetts Institute of Technology; plus a 2006 MOU with West Virginia Univ. for students to do electronic records research at Allegany Ballistics Lab (ERA at Rocket Center, WV)
    GOVT AGENCIES:
    - USDA: National Agricultural Library
    - Army Research Lab
    - National Science Foundation
    - National Institute of Standards and Technology (NIST)
    - Dept. of Energy (NNSA): see slide 11
    - Dept. of Defense: US Naval Oceanographic Office (slide 11)
    INTERNATIONAL COMMUNITIES: International Research on Permanent Authentic Records in Electronic Systems/InterPARES (examining issues related to electronic records authenticity)
  • SLIDE 14: MAKING THE CONNECTIONS
    To ensure a PERSISTENT ARCHIVES for the Life of the Republic: no other entity has this mission.
    A PERSISTENT ARCHIVES needs to be:
    - EVOLVABLE: able to change over time (plug in new technologies)
    - SCALABLE: able to manage (Ingest, Access…) large/small volumes of records
    - EXTENSIBLE: able to accept varieties of formats, plus formats that will be used in the future
    Making sense of the 0s and 1s depends on a web of connections:
    - The information that allows us to interpret the 0s and 1s
    - The information that adds context
    - The connections or linkages between these
  • Recap:
    2000: ERA becomes an official Program, finalizing the Requirements Document and Concept of Operations (ConOps paper: “how ERA will work”).
    2004: Competition between Harris Corporation and Lockheed to develop a prototype of the system, showing NARA that they understood our requirements and processes.
    2005: Lockheed wins the competition (Harris is currently building an electronic information system for GPO).
    Inc 1 (2005-2007): Lockheed has two years to build the foundational system.
    - No Classified records will be in the system
    - Only 4 Federal agencies at first
    - Only limited formats (the six formats for which NARA has provided transfer guidance)
    See the chart on the next slide, which shows system functions for each increment.
  • The four Increment 1 agencies:
    - Naval Oceanographic Office (NAVO): Battleship drawings (CAD files), ship records created by TRIM
    - Bureau of Labor Statistics: Large data sets, e-journals
    - US Patent & Trademark Office: Digital patent applications / case files
      - The patent process used to be manual; it is now fully automated and creates electronic records: CAD drawings (large files!), electronic forms, and other associated records.
      - These are records that protect inventors’ legal rights: complex files with privacy and legal issues.
    - NNSA (Kansas City Plant records): Ability to test REGIONAL records/storage
      - NNSA was one of the early ERA Research partners studying long-term Temp Scientific Data.

    1. Building the Archives of the Future: An Update on the Electronic Records Archives (ERA). ERA Program, National Archives and Records Administration, 26 June 2007
    2. Overview • NARA’s History and Mission • ERA history and vision • Electronic Records Challenges & Strategies • Three components of the ERA Program • The ERA Timeline: Where we are now 05/04/09 National Archives and Records Administration
    3. NARA’s Mission: To ensure access to records of the three branches of the U.S. Government. Records that: • Protect citizens’ rights • Hold Government officials accountable • Facilitate historical understanding of our national experience
    4. ERA History • 1970: First electronic records transferred to NARA • 1995: NARA is heading for Mission failure • 1998: Begin ERA Research Partnerships • 2000: ERA becomes an Official Program
    5. ERA’s Long-range Vision: ERA will be a comprehensive, systematic, and dynamic means of preserving and providing continuing access to any type of electronic record, free from dependence on any specific hardware or software, created anywhere in the U.S. Federal Government, enabling NARA to carry out its mission into the future
    6. NARA’s Challenges • Scope: The entire U.S. Federal Government • Variety and Complexity: Different/Complex Types of Records and Records Formats • Obsolescence: Constantly Changing Technology • Access: Ability to use records over time • Volume: Large numbers of records arriving at NARA
    7. Examples of Variety & Complexity: The 9-11 Commission Records • Office Automation Files: Word processing documents; Spreadsheets; Presentations; E-mail w/ attachments; Scanned paper documents; Databases • Complex Formats: Digital Photography; Satellite Imagery; Digital audio files; HD Video; Web pages; Geospatial Information Systems
    8. Records ALREADY at NARA: Examples of Volume • Clinton Administration: 40 million email messages • State Department: 25 million electronic diplomatic messages • Department of Defense: 54 million images from electronic official military personnel files annually • Census Bureau (2000 Census): 600-800 million image (TIFF) files
    9. THREE Components of the ERA Program • Research and Exploratory Development • Acquiring a System that meets our requirements and our mission • Organizational Change Management
    10. Some of the ERA Research Partners • Global Grid Forum • National Science Foundation • San Diego Supercomputer Center • National Computational Science Alliance • Army Research Laboratory • NIST • National Agricultural Library • …and many other Federal Agencies and their Records Officers
    11. Making the Connections: Making sense of the 0s and 1s is dependent on a web of connections. [Diagram; recoverable labels: Persistent Archives, Binary Sequence, Record, Data Type, Context, Documentation, Template, Encoding, Archival, Data Processing, Format] The Persistence of an Archives is Only as Strong as its Weakest Link
    12. The ERA Development Timeline: ERA will be developed and released in five phases (or increments) spanning FY2005 - FY2011 • Sep 2005: NARA Awarded ERA Contract to Lockheed Martin • Inc 1: 2 years • Inc 2, Inc 3, Inc 4, Inc 5: 1 year each • FY 2007: Initial Operating Capability (IOC) • 2011: Full Operating Capability
    13. The Four Increment 1 Agencies