SlideShare a Scribd company logo
1 of 17
Electronic Records Transfer
         Guidance
          at NARA
      Kevin De Vorsey
  kevin.devorsey@nara.gov
Current Guidance (Developed Between 2002-2004)




Reflect NARA’s
capabilities at the time.
                 National Archives and Records Administration
Current Guidance: limited scope that does not
address all format types




               National Archives and Records Administration
Current Guidance Products
 Demonstrate a preference for open, standards-
    based formats.
   Require that agencies transform or normalize data
    into acceptable formats prior to transfer*.
   Have proven an obstacle to the steady transfer of
    records.
   Are referred to at many points of the lifecycle.
   Cannot adapt to records with different retention
    periods.

                  National Archives and Records Administration
Project Scope
 In scope:
    This project seeks to support the work of federal agencies by
     providing flexible and realistic electronic records file format
     guidance on all electronic records types for use when transferring
     permanent records to NARA in accordance with the Federal
     Records Act.
    This project will identify and recommend changes but the
     execution of any additional guidance including guidance for all
     types of metadata as well as the revision of business
     processes, or development of standard operating procedures is
     beyond the scope of this project.

 Out of scope:
   Format guidance for other areas of the record lifecycle other than
    transfer to NARA
   Guidance on physical media
   Records of the Executive Office of the President and the records
    of the United States Congress. These branches are not covered
    by the Federal Records Act and are therefore excluded from
    consideration for this project
Project Phases
 Phase 1: Planning and Preparation
   July 1– December 30, 2011
 Phase 2: Conduct Informational Meetings
   February 6– August 12, 2012
   Internal NARA SMEs
   Future Perfect Conference
   Agency representatives
 Phase 3: Develop and Publish Guidance Product
   May 29 – September 7, 2012
 Phase 4: Evaluation and Completion
   December 12 - December 21, 2012
Electronic Records Lifecycle
              Migration       Decommissioning


                                                                            Processing
                      Maintenance
                                                                 Transfer
                                                                                                                 Transformation
                                                                                Preservation & Access Planning
System                                    Transfer Planning
design/planning




                                          Destruction Planning
        Record Creation

                                                                                                           Access
                                                                                     Preservation/
                                                                       Ingest
                          Scheduling/Appraisal                                       Maintenance


                                                                                            Regular
                                                                                                                    Public
                                                                                            Requests
                                                                                            (FOIA, etc.)




                                             National Archives and Records Administration
Revised Guidance Content Categories

  Electronic Textual Records
  Digital Still Images
  Digital Audio Records
  Digital Moving Image Records
  Structured Data
  Geospatial Records
  CAD and Vector Graphics
  Web Records
  E-mail Records


               National Archives and Records Administration
Relevant Content Categories Definitions
     Structured Data – includes the broad category of data that is stored in defined fields and
      includes:
        Databases – Database formats are organized collections of associated data that conform
          to a logical structure. Database formats are determined by “data models” that describe
          specific data structures used to model an application and generally include navigational,
          relational, and hybrid models.
        Spreadsheets – Spreadsheets are electronic simulations of paper accounting
          worksheets for financial plans, budgets, etc. Personal computer and server based
          spreadsheet programs [e.g. Microsoft Excel, Lotus 1-2-3, Open Office Calc.] can create
          both proprietary files as well as software independent files including text or XML. Cloud-
          based spreadsheets [e.g. Google Documents] include format export options such as .xls,
          .csv, .txt, .ods, PDF and HTML files as well as import and conversion options for common
          spreadsheet formats including .xls, .csv, and .ods.
        Statistical Data – Statistical Data is the result of scientific quantitative research and
          analyses. Statistical data formats contain collections of data presented in both tabular
          and non-tabular form. Datasets are formatted as strings of characters contained within a
          markup language [e.g. XML] or as software dependent proprietary files by commercial
          statistical and qualitative data analysis software tools (e.g. SAS and SPSS).
        Scientific data refers to research data collected by instrumentation tools during the
          scientific process. Scientific data formats are either domain specific such as those used
          within a single field of study [e.g. Flexible Image Transport System (FITS)] or are multi-
          domain formats useful for transfer of scientific data between domains [e.g. Common Data
          Format (CDF), HDF5].
Relevant Content Categories Definitions
  Geospatial – Geospatial data includes files created by
   geographic information systems (GIS) or other software
   applications for spatial analysis using computer systems.
   The data may be contained within a database to enable
   analysis across the datasets (e.g. geo-database), united
   within a complex file format structure where one geospatial
   file is comprised of several distinct, but related, formats
   (e.g. shapefile), or contained within a single file (e.g.
   GML).
  Computer Aided Design (CAD) and Vector Graphics–
   Non-raster Vector graphics formats use mathematical
   expressions to create and manipulate computer graphics
   and animations. Computer Aided Design (CAD) are
   vector programs used in engineering and manufacturing
   design to create animations and represent three-
   dimensional surfaces of inanimate objects. CAD and
   Vector graphics programs can output binary and XML
Record Categories Held in
Systems

                                                    Geospatial Data
                                 Geospatial
                                                    System Records
CADCAM System
                    CAD/CAM
Generated Records


                              Database        Database System
                                              Generated Records




                       NARA/ERA
Considerations*
 What part(s) of the system represents the record?
 Do we want to bring in the entire system?
 Could ERA cope with the formats, file size, and/or
  volume or files?
 If we only want a subset or can only accept an export
  then what is the “best” format for the electronic record
  type in question?
 What additional information should accompany the
  data?
 How should we validate and verify this data?


*These influence the transfer guidance but changes to existing work
  processes are out of the scope of this project.
Goals
  Provide clear, concise, and consistent direction to
   agencies regarding formats that are acceptable
   for use when transferring records to NARA.
  Develop a flexible and extensible framework that
   can adapt to future needs.
  Balance preference for open formats with the
   business needs of agencies and NARA.
  Support digital continuity across the lifecycle of
   electronic records.



              National Archives and Records Administration
Stress Sustainability*
   Disclosure: the degree to which complete specifications and technical
      integrity tools exist.
     Adoption: the degree to which the format is used by
      creators, disseminators, or users.
     Transparency: the degree to which the digital representation is open to
      direct analysis with basic tools, including human readability using a text-
      only editor.
     Self-documentation: formats that contain all the metadata needed to
      render the data as usable information.
     External dependencies: refers to the degree to which a format
      depends on particular hardware, operating system, or software for
      rendering or use.
     Impact of patents: Patents related to a digital format may inhibit the
      ability of archival institutions to sustain content in that format.
     Technical protection mechanisms: To preserve digital content and
      provide service to users and designated communities decades
      hence, NARA must be able to replicate the content on new
      media, migrate and normalize it in the face of changing technology, and
      disseminate it to researchers.

                       National Archives and Records Administration
  *adapted from http://www.digitalpreservation.gov/formats/
Recognize That Formats …

   Influence usability
   Affect behavior and performance
   Influence NARA’s capability to preserve
   complex records like databases, video, and
   GIS




            National Archives and Records Administration
Concluding Thoughts
 NARA should:
  expand the types of formats that NARA accepts
  balance the business requirements of agencies with
   NARA’s preservation and access needs
  minimize the need for agencies to transform records
   prior to transfer
  develop guidance across the lifecycle of electronic
   records to support digital continuity




                 National Archives and Records Administration
Thank You!

               Kevin L. De Vorsey
Supervisory Electronic Records Format Specialist
       Electronic Records Format Section
   Policy Analysis and Enforcement Division
       Office of the Chief Records Officer
                 Agency Services
  National Archive and Records Administration
       kevin.devorsey@nara.gov

More Related Content

Viewers also liked

Viewers also liked (12)

Fdtd
FdtdFdtd
Fdtd
 
Bunny booktemplate1
Bunny booktemplate1Bunny booktemplate1
Bunny booktemplate1
 
Grace Currie Ann Jebson First Things First
Grace Currie Ann Jebson First Things FirstGrace Currie Ann Jebson First Things First
Grace Currie Ann Jebson First Things First
 
Tourismo filipino1
Tourismo filipino1Tourismo filipino1
Tourismo filipino1
 
Dave Pearson The Adventures of Digi
Dave Pearson The Adventures of DigiDave Pearson The Adventures of Digi
Dave Pearson The Adventures of Digi
 
Ageofdiscovery
AgeofdiscoveryAgeofdiscovery
Ageofdiscovery
 
Steve Mc Eachern Australian Data Archive
Steve Mc Eachern Australian Data ArchiveSteve Mc Eachern Australian Data Archive
Steve Mc Eachern Australian Data Archive
 
Jeff Rothenberg Digital Preservation Perspective
Jeff Rothenberg Digital Preservation PerspectiveJeff Rothenberg Digital Preservation Perspective
Jeff Rothenberg Digital Preservation Perspective
 
OGD Wien - Ideensammlung
OGD Wien - IdeensammlungOGD Wien - Ideensammlung
OGD Wien - Ideensammlung
 
Bigger Hard Drive Jamie Lean
Bigger Hard Drive Jamie LeanBigger Hard Drive Jamie Lean
Bigger Hard Drive Jamie Lean
 
Steve Knight by Design
Steve Knight by DesignSteve Knight by Design
Steve Knight by Design
 
Michael Parsons Passion
Michael Parsons PassionMichael Parsons Passion
Michael Parsons Passion
 

Similar to Kevin De Vorsey Past is Prologue

Muench,John Res21 Jul2010 It Dr Pm San Compliance For It Pm Op (17 Nov 2010)
Muench,John Res21 Jul2010 It Dr Pm San Compliance For  It Pm Op (17 Nov 2010)Muench,John Res21 Jul2010 It Dr Pm San Compliance For  It Pm Op (17 Nov 2010)
Muench,John Res21 Jul2010 It Dr Pm San Compliance For It Pm Op (17 Nov 2010)JFMuench
 
Lumbs Latest Resume
Lumbs Latest ResumeLumbs Latest Resume
Lumbs Latest ResumeJohn Lumb
 
2005-03-17 Air Quality Cluster TechTrack
2005-03-17 Air Quality Cluster TechTrack2005-03-17 Air Quality Cluster TechTrack
2005-03-17 Air Quality Cluster TechTrackRudolf Husar
 
Database Systems.ppt
Database Systems.pptDatabase Systems.ppt
Database Systems.pptArbazAli27
 
Resume of Stacy Lauren Hendricks
Resume of Stacy Lauren HendricksResume of Stacy Lauren Hendricks
Resume of Stacy Lauren HendricksStacy Hendricks
 
Technical Comms Planning Nf
Technical Comms Planning NfTechnical Comms Planning Nf
Technical Comms Planning NfJohn_Wilson
 
WQD2011 - INNOVATION - DEWA - Tracking Management System
WQD2011 - INNOVATION - DEWA - Tracking Management SystemWQD2011 - INNOVATION - DEWA - Tracking Management System
WQD2011 - INNOVATION - DEWA - Tracking Management SystemDubai Quality Group
 
Rev_3 Components of a Data Warehouse
Rev_3 Components of a Data WarehouseRev_3 Components of a Data Warehouse
Rev_3 Components of a Data WarehouseRyan Andhavarapu
 
DATAWAREHOUSE MAIn under data mining for
DATAWAREHOUSE MAIn under data mining forDATAWAREHOUSE MAIn under data mining for
DATAWAREHOUSE MAIn under data mining forAyushMeraki1
 
Scott R. Stadelhofer December-2022.pdf
Scott R. Stadelhofer December-2022.pdfScott R. Stadelhofer December-2022.pdf
Scott R. Stadelhofer December-2022.pdfIUI
 
Performance Enhancement using Appropriate File Formats in Big Data Hadoop Eco...
Performance Enhancement using Appropriate File Formats in Big Data Hadoop Eco...Performance Enhancement using Appropriate File Formats in Big Data Hadoop Eco...
Performance Enhancement using Appropriate File Formats in Big Data Hadoop Eco...IRJET Journal
 
Reverse Engineering of Software Architecture
Reverse Engineering of Software ArchitectureReverse Engineering of Software Architecture
Reverse Engineering of Software ArchitectureDharmalingam Ganesan
 

Similar to Kevin De Vorsey Past is Prologue (20)

Muench,John Res21 Jul2010 It Dr Pm San Compliance For It Pm Op (17 Nov 2010)
Muench,John Res21 Jul2010 It Dr Pm San Compliance For  It Pm Op (17 Nov 2010)Muench,John Res21 Jul2010 It Dr Pm San Compliance For  It Pm Op (17 Nov 2010)
Muench,John Res21 Jul2010 It Dr Pm San Compliance For It Pm Op (17 Nov 2010)
 
Lumbs Latest Resume
Lumbs Latest ResumeLumbs Latest Resume
Lumbs Latest Resume
 
2005-03-17 Air Quality Cluster TechTrack
2005-03-17 Air Quality Cluster TechTrack2005-03-17 Air Quality Cluster TechTrack
2005-03-17 Air Quality Cluster TechTrack
 
Ws Stuff
Ws StuffWs Stuff
Ws Stuff
 
Database Systems.ppt
Database Systems.pptDatabase Systems.ppt
Database Systems.ppt
 
Resume of Stacy Lauren Hendricks
Resume of Stacy Lauren HendricksResume of Stacy Lauren Hendricks
Resume of Stacy Lauren Hendricks
 
Technical Comms Planning Nf
Technical Comms Planning NfTechnical Comms Planning Nf
Technical Comms Planning Nf
 
System design
System designSystem design
System design
 
WQD2011 - INNOVATION - DEWA - Tracking Management System
WQD2011 - INNOVATION - DEWA - Tracking Management SystemWQD2011 - INNOVATION - DEWA - Tracking Management System
WQD2011 - INNOVATION - DEWA - Tracking Management System
 
Good Practice in Research Data Management
Good Practice in Research Data ManagementGood Practice in Research Data Management
Good Practice in Research Data Management
 
Rev_3 Components of a Data Warehouse
Rev_3 Components of a Data WarehouseRev_3 Components of a Data Warehouse
Rev_3 Components of a Data Warehouse
 
DATAWAREHOUSE MAIn under data mining for
DATAWAREHOUSE MAIn under data mining forDATAWAREHOUSE MAIn under data mining for
DATAWAREHOUSE MAIn under data mining for
 
Scott R. Stadelhofer December-2022.pdf
Scott R. Stadelhofer December-2022.pdfScott R. Stadelhofer December-2022.pdf
Scott R. Stadelhofer December-2022.pdf
 
Performance Enhancement using Appropriate File Formats in Big Data Hadoop Eco...
Performance Enhancement using Appropriate File Formats in Big Data Hadoop Eco...Performance Enhancement using Appropriate File Formats in Big Data Hadoop Eco...
Performance Enhancement using Appropriate File Formats in Big Data Hadoop Eco...
 
Reverse Engineering of Software Architecture
Reverse Engineering of Software ArchitectureReverse Engineering of Software Architecture
Reverse Engineering of Software Architecture
 
Mis module ii
Mis module iiMis module ii
Mis module ii
 
Chapter01
Chapter01Chapter01
Chapter01
 
Ena ch01
Ena ch01Ena ch01
Ena ch01
 
Ena ch01
Ena ch01Ena ch01
Ena ch01
 
charles finch CV
charles finch CVcharles finch CV
charles finch CV
 

More from Future Perfect 2012

Working Across Organizations white paper
Working Across Organizations white paperWorking Across Organizations white paper
Working Across Organizations white paperFuture Perfect 2012
 
Ensuring Data Integrity white paper
Ensuring Data Integrity white paperEnsuring Data Integrity white paper
Ensuring Data Integrity white paperFuture Perfect 2012
 
Kris Carpenter Negulescu Gordon Paynter Archiving the National Web of New Zea...
Kris Carpenter Negulescu Gordon Paynter Archiving the National Web of New Zea...Kris Carpenter Negulescu Gordon Paynter Archiving the National Web of New Zea...
Kris Carpenter Negulescu Gordon Paynter Archiving the National Web of New Zea...Future Perfect 2012
 
Joe Coleman Biodiversity Heritage Library
Joe Coleman Biodiversity Heritage LibraryJoe Coleman Biodiversity Heritage Library
Joe Coleman Biodiversity Heritage LibraryFuture Perfect 2012
 
James Smithies Academic Earthquake Research
James Smithies Academic Earthquake ResearchJames Smithies Academic Earthquake Research
James Smithies Academic Earthquake ResearchFuture Perfect 2012
 
Shaun Hendy Innovation Ecosystem
Shaun Hendy Innovation EcosystemShaun Hendy Innovation Ecosystem
Shaun Hendy Innovation EcosystemFuture Perfect 2012
 
Martin Donnelly Sarah Jones DMP Online
Martin Donnelly Sarah Jones DMP OnlineMartin Donnelly Sarah Jones DMP Online
Martin Donnelly Sarah Jones DMP OnlineFuture Perfect 2012
 
Parul Sharma Sally Vermaaten Right Combination
Parul Sharma Sally Vermaaten Right CombinationParul Sharma Sally Vermaaten Right Combination
Parul Sharma Sally Vermaaten Right CombinationFuture Perfect 2012
 
Jay Gattuso Persistently Identifying Formats
Jay Gattuso Persistently Identifying FormatsJay Gattuso Persistently Identifying Formats
Jay Gattuso Persistently Identifying FormatsFuture Perfect 2012
 
Stuart Wakefield Cloud Computing
Stuart Wakefield Cloud ComputingStuart Wakefield Cloud Computing
Stuart Wakefield Cloud ComputingFuture Perfect 2012
 
Cassie Findlay Digital Transformation SRNSW
Cassie Findlay Digital Transformation SRNSWCassie Findlay Digital Transformation SRNSW
Cassie Findlay Digital Transformation SRNSWFuture Perfect 2012
 
T Bahr M Lindlar Goportis Digital Preservation Pilot
T Bahr M Lindlar Goportis Digital Preservation PilotT Bahr M Lindlar Goportis Digital Preservation Pilot
T Bahr M Lindlar Goportis Digital Preservation PilotFuture Perfect 2012
 
Dennis Phillips Cooperative Digital Preservation
Dennis Phillips Cooperative Digital PreservationDennis Phillips Cooperative Digital Preservation
Dennis Phillips Cooperative Digital PreservationFuture Perfect 2012
 

More from Future Perfect 2012 (15)

Working Across Organizations white paper
Working Across Organizations white paperWorking Across Organizations white paper
Working Across Organizations white paper
 
Ensuring Data Integrity white paper
Ensuring Data Integrity white paperEnsuring Data Integrity white paper
Ensuring Data Integrity white paper
 
Kris Carpenter Negulescu Gordon Paynter Archiving the National Web of New Zea...
Kris Carpenter Negulescu Gordon Paynter Archiving the National Web of New Zea...Kris Carpenter Negulescu Gordon Paynter Archiving the National Web of New Zea...
Kris Carpenter Negulescu Gordon Paynter Archiving the National Web of New Zea...
 
Joe Coleman Biodiversity Heritage Library
Joe Coleman Biodiversity Heritage LibraryJoe Coleman Biodiversity Heritage Library
Joe Coleman Biodiversity Heritage Library
 
James Smithies Academic Earthquake Research
James Smithies Academic Earthquake ResearchJames Smithies Academic Earthquake Research
James Smithies Academic Earthquake Research
 
Shaun Hendy Innovation Ecosystem
Shaun Hendy Innovation EcosystemShaun Hendy Innovation Ecosystem
Shaun Hendy Innovation Ecosystem
 
Martin Donnelly Sarah Jones DMP Online
Martin Donnelly Sarah Jones DMP OnlineMartin Donnelly Sarah Jones DMP Online
Martin Donnelly Sarah Jones DMP Online
 
Parul Sharma Sally Vermaaten Right Combination
Parul Sharma Sally Vermaaten Right CombinationParul Sharma Sally Vermaaten Right Combination
Parul Sharma Sally Vermaaten Right Combination
 
Gabe Nault Data Integrity
Gabe Nault Data IntegrityGabe Nault Data Integrity
Gabe Nault Data Integrity
 
Jay Gattuso Persistently Identifying Formats
Jay Gattuso Persistently Identifying FormatsJay Gattuso Persistently Identifying Formats
Jay Gattuso Persistently Identifying Formats
 
Stuart Wakefield Cloud Computing
Stuart Wakefield Cloud ComputingStuart Wakefield Cloud Computing
Stuart Wakefield Cloud Computing
 
Cassie Findlay Digital Transformation SRNSW
Cassie Findlay Digital Transformation SRNSWCassie Findlay Digital Transformation SRNSW
Cassie Findlay Digital Transformation SRNSW
 
T Bahr M Lindlar Goportis Digital Preservation Pilot
T Bahr M Lindlar Goportis Digital Preservation PilotT Bahr M Lindlar Goportis Digital Preservation Pilot
T Bahr M Lindlar Goportis Digital Preservation Pilot
 
Dennis Phillips Cooperative Digital Preservation
Dennis Phillips Cooperative Digital PreservationDennis Phillips Cooperative Digital Preservation
Dennis Phillips Cooperative Digital Preservation
 
Bedrich Vychodil DIFFER
Bedrich Vychodil DIFFERBedrich Vychodil DIFFER
Bedrich Vychodil DIFFER
 

Recently uploaded

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Recently uploaded (20)

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Kevin De Vorsey Past is Prologue

  • 1. Electronic Records Transfer Guidance at NARA Kevin De Vorsey kevin.devorsey@nara.gov
  • 2. Current Guidance (Developed Between 2002-2004) Reflect NARA’s capabilities at the time. National Archives and Records Administration
  • 3. Current Guidance: limited scope that does not address all format types National Archives and Records Administration
  • 4. Current Guidance Products  Demonstrate a preference for open, standards- based formats.  Require that agencies transform or normalize data into acceptable formats prior to transfer*.  Have proven an obstacle to the steady transfer of records.  Are referred to at many points of the lifecycle.  Cannot adapt to records with different retention periods. National Archives and Records Administration
  • 5. Project Scope  In scope:  This project seeks to support the work of federal agencies by providing flexible and realistic electronic records file format guidance on all electronic records types for use when transferring permanent records to NARA in accordance with the Federal Records Act.  This project will identify and recommend changes but the execution of any additional guidance including guidance for all types of metadata as well as the revision of business processes, or development of standard operating procedures is beyond the scope of this project.  Out of scope:  Format guidance for other areas of the record lifecycle other than transfer to NARA  Guidance on physical media  Records of the Executive Office of the President and the records of the United States Congress. These branches are not covered by the Federal Records Act and are therefore excluded from consideration for this project
  • 6. Project Phases  Phase 1: Planning and Preparation  July 1– December 30, 2011  Phase 2: Conduct Informational Meetings  February 6– August 12, 2012  Internal NARA SMEs  Future Perfect Conference  Agency representatives  Phase 3: Develop and Publish Guidance Product  May 29 – September 7, 2012  Phase 4: Evaluation and Completion  December 12 - December 21, 2012
  • 7. Electronic Records Lifecycle Migration Decommissioning Processing Maintenance Transfer Transformation Preservation & Access Planning System Transfer Planning design/planning Destruction Planning Record Creation Access Preservation/ Ingest Scheduling/Appraisal Maintenance Regular Public Requests (FOIA, etc.) National Archives and Records Administration
  • 8. Revised Guidance Content Categories  Electronic Textual Records  Digital Still Images  Digital Audio Records  Digital Moving Image Records  Structured Data  Geospatial Records  CAD and Vector Graphics  Web Records  E-mail Records National Archives and Records Administration
  • 9. Relevant Content Categories Definitions  Structured Data – includes the broad category of data that is stored in defined fields and includes:  Databases – Database formats are organized collections of associated data that conform to a logical structure. Database formats are determined by “data models” that describe specific data structures used to model an application and generally include navigational, relational, and hybrid models.  Spreadsheets – Spreadsheets are electronic simulations of paper accounting worksheets for financial plans, budgets, etc. Personal computer and server based spreadsheet programs [e.g. Microsoft Excel, Lotus 1-2-3, Open Office Calc.] can create both proprietary files as well as software independent files including text or XML. Cloud- based spreadsheets [e.g. Google Documents] include format export options such as .xls, .csv, .txt, .ods, PDF and HTML files as well as import and conversion options for common spreadsheet formats including .xls, .csv, and .ods.  Statistical Data – Statistical Data is the result of scientific quantitative research and analyses. Statistical data formats contain collections of data presented in both tabular and non-tabular form. Datasets are formatted as strings of characters contained within a markup language [e.g. XML] or as software dependent proprietary files by commercial statistical and qualitative data analysis software tools (e.g. SAS and SPSS).  Scientific data refers to research data collected by instrumentation tools during the scientific process. Scientific data formats are either domain specific such as those used within a single field of study [e.g. Flexible Image Transport System (FITS)] or are multi- domain formats useful for transfer of scientific data between domains [e.g. Common Data Format (CDF), HDF5].
  • 10. Relevant Content Categories Definitions  Geospatial – Geospatial data includes files created by geographic information systems (GIS) or other software applications for spatial analysis using computer systems. The data may be contained within a database to enable analysis across the datasets (e.g. geo-database), united within a complex file format structure where one geospatial file is comprised of several distinct, but related, formats (e.g. shapefile), or contained within a single file (e.g. GML).  Computer Aided Design (CAD) and Vector Graphics– Non-raster Vector graphics formats use mathematical expressions to create and manipulate computer graphics and animations. Computer Aided Design (CAD) are vector programs used in engineering and manufacturing design to create animations and represent three- dimensional surfaces of inanimate objects. CAD and Vector graphics programs can output binary and XML
  • 11. Record Categories Held in Systems Geospatial Data Geospatial System Records CADCAM System CAD/CAM Generated Records Database Database System Generated Records NARA/ERA
  • 12. Considerations*  What part(s) of the system represents the record?  Do we want to bring in the entire system?  Could ERA cope with the formats, file size, and/or volume or files?  If we only want a subset or can only accept an export then what is the “best” format for the electronic record type in question?  What additional information should accompany the data?  How should we validate and verify this data? *These influence the transfer guidance but changes to existing work processes are out of the scope of this project.
  • 13. Goals  Provide clear, concise, and consistent direction to agencies regarding formats that are acceptable for use when transferring records to NARA.  Develop a flexible and extensible framework that can adapt to future needs.  Balance preference for open formats with the business needs of agencies and NARA.  Support digital continuity across the lifecycle of electronic records. National Archives and Records Administration
  • 14. Stress Sustainability*  Disclosure: the degree to which complete specifications and technical integrity tools exist.  Adoption: the degree to which the format is used by creators, disseminators, or users.  Transparency: the degree to which the digital representation is open to direct analysis with basic tools, including human readability using a text- only editor.  Self-documentation: formats that contain all the metadata needed to render the data as usable information.  External dependencies: refers to the degree to which a format depends on particular hardware, operating system, or software for rendering or use.  Impact of patents: Patents related to a digital format may inhibit the ability of archival institutions to sustain content in that format.  Technical protection mechanisms: To preserve digital content and provide service to users and designated communities decades hence, NARA must be able to replicate the content on new media, migrate and normalize it in the face of changing technology, and disseminate it to researchers. National Archives and Records Administration *adapted from http://www.digitalpreservation.gov/formats/
  • 15. Recognize That Formats …  Influence usability  Affect behavior and performance  Influence NARA’s capability to preserve complex records like databases, video, and GIS National Archives and Records Administration
  • 16. Concluding Thoughts  NARA should:  expand the types of formats that NARA accepts  balance the business requirements of agencies with NARA’s preservation and access needs  minimize the need for agencies to transform records prior to transfer  develop guidance across the lifecycle of electronic records to support digital continuity National Archives and Records Administration
  • 17. Thank You! Kevin L. De Vorsey Supervisory Electronic Records Format Specialist Electronic Records Format Section Policy Analysis and Enforcement Division Office of the Chief Records Officer Agency Services National Archive and Records Administration kevin.devorsey@nara.gov