Scalable and Secure Cloud-Based Data Archiving for Digital Libraries, Compliance, and Public Records

Mike Davis, AWS Storage Business Development
Tres Vance, Senior Solutions Architect
10/17/18

  1. 1. Raise the BAR on Data Protection and Continuity: Long-Term Archival with AWS
      • October 17, 2018
      • Mike Davis, AWS Storage Business Development
      • Tres Vance, Senior Solutions Architect
      • © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  2. 2. Our Mission as Public Sector Archivists
      • Preserve
      • Leverage content value: Access, Context, Agility
      • Offload non-differentiated activities: Scaling, provisioning, durability, networking...
      • Assure security and compliance
      • Achieve our goals at the lowest possible cost
  3. 3. The Fun is Just Beginning
      • Metadata (descriptions about records)
      • Repository (a safe place for records)
      • Submission (accepting records)
      • Access (getting records and descriptions out)
  4. 4. NARA – ERA 2.0 Architecture – Just Launched!
      • Why AWS: Pay as you go; Storage; High availability; Elasticity; First NARA Agile dev project, “Just in time” design; Services
  6. 6. But let’s back up to the Basics: Durability, security, scale
      • “…the scale at which AWS operates its public cloud storage services dwarfs the other vendors in this Magic Quadrant.” - Gartner Magic Quadrant for Public Cloud Storage Services, Worldwide; Raj Bala, Arun Chandrasekaran, John McArthur; July 24, 2017
  7. 7. 18 (+4) REGIONS, 55 AVAILABILITY ZONES, 136 EDGE LOCATIONS, 142 SERVICES, MORE THAN 1 MILLION CUSTOMERS
      • GovCloud for CUI, FedRAMP, CJIS, ITAR, DoD, FIPS-140, etc.
  8. 8. A view of good enterprise protection: DR, Primary
  9. 9. The AWS Durability Model: What is 99.999999999%?
      • A single AWS “Region”: AZ1, AZ2, AZ3
      • Availability Zones separated in power, network, flood plain
      • Objects striped/coded across AZs
      • Data integrity checks
      • Tolerant of concurrent failures: disks, nodes, racks; networks, WAN providers; datacenters
      • Search: “James Hamilton re:Invent 2016”
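
To put slide 9's durability figure in concrete terms, here is a minimal arithmetic sketch. It assumes the eleven nines are an annual, per-object figure and that object losses are independent. Both are simplifications made for illustration, not claims about how AWS derives the number. The function names are hypothetical.

```python
def annual_loss_probability(nines: int = 11) -> float:
    """Per-object annual loss probability implied by a durability of N nines.

    Assumption: durability is quoted per object, per year.
    """
    return 10.0 ** -nines


def expected_annual_losses(object_count: int, nines: int = 11) -> float:
    """Average number of objects lost per year, assuming independent losses."""
    return object_count * annual_loss_probability(nines)


def years_per_expected_loss(object_count: int, nines: int = 11) -> float:
    """Average number of years between single-object losses."""
    return 1.0 / expected_annual_losses(object_count, nines)


if __name__ == "__main__":
    objects = 10_000_000  # illustrative archive size
    print(f"{expected_annual_losses(objects):.6f} objects lost per year on average")
    print(f"roughly one loss every {years_per_expected_loss(objects):,.0f} years")
```

Under these assumptions, an archive of ten million objects would on average lose a single object about once every 10,000 years.
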
  10. 10. What’s in Your Threat Matrix Today? Do Priorities Change When Storage is 99.999999999% Durable?
      • Example Threat Matrix, On-Premises versus AWS: physical security, media failure, system fault, external attacker, malicious insider, application bug
      • See Well-Architected Framework (blog, podcast, webinar, white paper)
  11. 11. Security – Core to the Well-Architected Framework
      • Identity and Access Management (IAM) and Security Token Service (STS)
      • Data classification & tagging
      • Data Loss Prevention (DLP) using ML: “Macie”
      • CloudTrail: track users and API activity
      • Encryption: over the wire and in the Snowball; at rest with CSE, SSE, SSE-KMS, SSE-CloudHSM
      • MFA
      • Networking/VPC
      • Configuration monitoring, management, remediation
      • https://d1.awsstatic.com/whitepapers/architecture/AWS-Security-Pillar.pdf
  12. 12. Compliance and audit breadth benefits all AWS customers: aws.amazon.com/compliance/
  13. 13. The Broadest Range of Storage Services
      • Amazon EFS, Amazon EBS, Amazon S3, Amazon Glacier
      • Data security and management: AWS KMS, AWS IAM, AWS CloudWatch, AWS CloudTrail, AWS CloudFormation, AWS Lambda, Amazon Macie, AWS QuickSight
      • Data movement (online, offline): AWS Snowball, AWS Storage Gateways, AWS Direct Connect, EFS File Sync, S3 Transfer Acceleration, 3rd Party Applications, Amazon Kinesis Firehose
      • S3 today hosts trillions of objects, exabytes of capacity, and delivers 99.999999999% durability
  14. 14. Glacier: For Long Term Low-Cost Archival
      • AWS’s lowest-cost persistent storage tier for Digital Asset Management, preservation archival, and backup
      • Film studios and broadcasters storing millions of hours of media: Fox pictures, Fox Sports, Discovery Channel, Sony, Turner/CNN, etc.
      • Federal agencies
      • SEC-regulated archives
      • HIPAA-regulated healthcare
      • Heritage archival
  15. 15. Glacier: For Long Term Low-Cost Archival
      • Object storage with tape-equivalent TCO and performance
      • 3-zone data placement
      • Accessed via S3 APIs
      • Used in conjunction with S3 for index, distribution, analytics/ML, etc.
      • Versioning, WORM/retention, encryption
      • Multiple retrieval methods: expedited/standard/bulk
      • Fixity checking: upon transfer, on read operations, and ongoing
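
Slide 15 mentions fixity checking upon transfer, on read, and ongoing. As a rough client-side illustration of the idea, the sketch below records a SHA-256 digest at ingest and compares it on a later read. It does not describe how Glacier implements fixity internally. The function names and the choice of SHA-256 are assumptions made for this example.

```python
import hashlib
import io
from typing import BinaryIO


def sha256_of_stream(stream: BinaryIO, chunk_size: int = 1024 * 1024) -> str:
    """Hash a binary stream in chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(stream: BinaryIO, expected_hex: str) -> bool:
    """Return True when the stream's digest matches the digest recorded earlier."""
    return sha256_of_stream(stream) == expected_hex.lower()


if __name__ == "__main__":
    record = b"example archival record"
    recorded_digest = sha256_of_stream(io.BytesIO(record))  # stored at ingest
    print(verify_fixity(io.BytesIO(record), recorded_digest))         # unchanged copy
    print(verify_fixity(io.BytesIO(record + b"!"), recorded_digest))  # altered copy
```

In practice the recorded digest would live in the archive's own metadata store, so a copy can be re-verified after every transfer or retrieval.
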
  16. 16. Managing your data: Governance, compliance, and content management
  17. 17. Common Services for Archival
      • S3 and Glacier object storage
      • Lambda: Automatically trigger programmable functions based on an event or software call (“serverless compute”)
      • EC2 Spot: Off-peak compute capacity at large discounts
      • DynamoDB: Scalable NoSQL serverless key-value store
      • RDS: SQL/MySQL serverless database
      • SQS (Simple Queue Service) and SNS (Simple Notification Service)
      • AWS Media Analytics solution (facial/object recognition, transcribe)
      • (Visit us at re:Invent in November for other things coming)
  18. 18. AWS Partner Community for Complete Solutions
      • 4,200 Marketplace listings
      • AWS Partner Network (APN): over 10,000 new APN partners, just in 2017; 550 partners focused on Federal, SLG, EDU, NPO
      • Wide portfolio of backup, DR, primary storage, and archive technology partners
  19. 19. Moving Your Data
  20. 20. Network transfer times: data moved per month (30 days) by line speed
      • 100 Mbps: 32,400,000 MB = 31,641 GB = 31 TB
      • 1 Gbps: 324,000 GB = 316 TB
      • 10 Gbps: 3,240,000 GB = 3,164 TB = 3 PB
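
The figures on slide 20 can be reproduced with simple arithmetic. The sketch below assumes the link runs at full line rate for the whole 30 days with no protocol overhead. It follows what appears to be the slide's convention: the first figure is in the decimal unit matching the line speed (Mbps gives MB, Gbps gives GB), and each larger unit divides by 1024. The function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400
BITS_PER_BYTE = 8


def volume_per_period(line_speed: float, days: int = 30) -> float:
    """Data moved in `days` at full line rate.

    The result is in the byte unit matching the speed unit:
    a speed in Mbps yields MB, a speed in Gbps yields GB.
    """
    return line_speed * SECONDS_PER_DAY * days / BITS_PER_BYTE


def next_unit_up(amount: float) -> float:
    """Convert to the next larger unit using a factor of 1024."""
    return amount / 1024


if __name__ == "__main__":
    mb = volume_per_period(100)  # 100 Mbps -> MB
    gb = next_unit_up(mb)
    tb = next_unit_up(gb)
    print(f"100 Mbps: {mb:,.0f} MB = {gb:,.0f} GB = {tb:,.0f} TB")

    for gbps in (1, 10):
        gb = volume_per_period(gbps)  # Gbps -> GB
        tb = next_unit_up(gb)
        print(f"{gbps} Gbps: {gb:,.0f} GB = {tb:,.0f} TB")
```

Real transfers rarely sustain full line rate, so these values are best read as upper bounds when deciding between a network transfer and an offline device.
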
  21. 21. Migration: multiple paths available
      • AWS Direct Connect
      • AWS Snowball
      • ISV Connectors
      • S3 Transfer Acceleration
      • AWS Storage Gateway
      • AWS Snowmobile
  22. 22. Snowball: Offline transport for bulk data moves
      • E-ink shipping label
      • “8.5G Impact” case
      • Encrypted end-to-end
      • 100 TB
      • 10G-40Gb network
      • Rain & dust resistant
      • Tamper-resistant
      • Onboard EC2 & Lambda
  23. 23. Maximizing migration bandwidth: 100 Petabytes-per-month
  24. 24. Migration can be complex, let us help
      • Export, Normalization & Augmentation, Import & Operationalize
      • Offline tape migration, Egress optimization, Proprietary conversion, Metadata export
      • Asset QC & registry, Index/Analyze, Proxy generation, Rekognition, Transcribe, Comprehend
      • DAM/CMS modernization, Metadata import, Security & DR, Lifecycle and cost optimization
      • Records, Access, Curation, Workflow, Import/Quarantine
  25. 25. Evaluating storage cost: Pay-as-you-go, follow commodity curves
  26. 26. AWS Archival Pricing
      • Pricing based on: Capacity; Transactions; Data volume transferred out
      • A storage component
      • A networking component (“egress”)
      • Cost-following model
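
The three pricing dimensions on slide 26 (capacity, transactions, data transferred out) combine into a simple monthly estimate. The sketch below shows only the shape of that calculation. The class, the function, and every rate in it are hypothetical placeholders, not AWS prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveRates:
    """Hypothetical unit prices; replace with figures from a current price list."""
    storage_per_gb_month: float
    per_1000_requests: float
    egress_per_gb: float


def monthly_cost(rates: ArchiveRates, stored_gb: float,
                 requests: int, egress_gb: float) -> float:
    """Sum the storage, transaction, and egress components for one month."""
    storage = stored_gb * rates.storage_per_gb_month
    transactions = (requests / 1000) * rates.per_1000_requests
    egress = egress_gb * rates.egress_per_gb
    return storage + transactions + egress


if __name__ == "__main__":
    # Placeholder rates chosen only to make the arithmetic easy to follow.
    rates = ArchiveRates(storage_per_gb_month=0.01,
                         per_1000_requests=0.05,
                         egress_per_gb=0.10)
    print(f"${monthly_cost(rates, stored_gb=1000, requests=2000, egress_gb=50):.2f}")
```

For an archive that is written once and rarely read, the storage term usually dominates, which is why per-GB capacity pricing gets the most attention in archival comparisons.
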
  27. 27. AWS cost transparency: commodity savings are passed to our customers
      • Chart: S3-Std Storage Price Reductions (1TB tier), price axis $0.00 to $0.16, time axis May-05 to Sep-17
      • -15.7% average annual decline
      • March 14, 2006
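
Slide 27 quotes a 15.7% average annual price decline. One way to reason about such a figure is as a compound rate, which is an assumption here because the slide does not say how the average was computed. The function names and the sample inputs below are hypothetical.

```python
def price_after(start_price: float, annual_decline: float, years: float) -> float:
    """Price after `years` of compounding at a fixed annual decline (0.157 = 15.7%)."""
    return start_price * (1.0 - annual_decline) ** years


def implied_annual_change(start_price: float, end_price: float, years: float) -> float:
    """Compound annual rate of change between two prices (negative means a decline)."""
    return (end_price / start_price) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # Hypothetical starting price of 1.00 per unit, declining 15.7% per year.
    for year in (1, 5, 10):
        print(f"year {year}: {price_after(1.00, 0.157, year):.3f}")
```

The same two functions can be pointed at an on-premises cost history to compare how quickly each option has tracked commodity storage prices.
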
  28. 28. Economics and Risks in Large-Scale Tape Archival
      • Economics of tape, core BOM: big libraries (>$2 million) and the media to fill them; drives, robotics, networking; software, media servers, support contracts
      • Additional costs: 2 or 3 copies of tapes for DR and offsite vaulting; costs of recurring media migration; bit rot, mechanical failure, and “fell off the truck”; the risk of a shrinking vendor base
  29. 29. Amazon Glacier vs. on-premises cost comparison (Sony data) @SonyDADCNMS
  30. 30. AWS Enhances Long Term Archival
      • Economies-of-scale deliver more than low cost: 99.999999999% durability; scale, global footprint, service breadth; physical, data security, compliance, GovCloud
      • Focus your teams on core competencies: new access and usage models; spend less time on datacenter and K.T.L.O. activities
      • AWS improves asset value: Availability (global, distribution, data mobility); Context (discovery, catalog, search, learn, discover); Agility (workflow, transcode, normalize, package)
      • Courtesy SDSC
  31. 31. Partner Storage Integration
  32. 32. Public sector: requirements for long-term digital preservation
      • Federal, state and local mandates to retain records for decades to permanent
      • FOI/FOIA and increased transparency
      • Digital transformation and cloud-first initiatives
      • GDPR compliance for long-term PII
      • Litigation support
      • Long-term government records: social care; planning & infrastructure; environment & transportation; democratic process; historical & cultural; genealogical
  33. 33. Digital content older than 10 years is at risk
      • Cannot read media
      • Corrupted, altered or deleted files
      • Obsolete file formats
      • No software license
      • Incorrect software versions
      • Website decommissioned
      • Legacy application migration
      • “As formats change, software is retired and hardware becomes obsolete, the data that organisations might want to keep can be lost forever.”
  34. 34. Proven global partnership in public sector
      • Preservica public sector customers: 22 US state archives; 15 national archives; pan-national organizations; UK central government; US & UK county & city government
      • EU (Ireland), Asia Pacific (Sydney), US East (N. Virginia), Canada (Central), AWS GovCloud (US)
      • Preservica Enterprise Private Cloud: can be deployed in any AWS region
      • AWS S3 and Glacier storage
      • AWS Snowball for bulk transfer
      • AWS encryption
      • Preservica SaaS Active Digital Preservation
      • Long-term accessibility & authenticity of digital records
      • Preservica Cloud Edition: available in multiple AWS regions...with more to come
  35. 35. Proven preservation of long-term and permanent digital records
      • Active Preservation
      • Migration to new file formats
      • Information management
      • Content acquisition
      • OAIS ISO 14721, ISO 27001, ISO 9001
      • Discovery & secure access
      • Secure cloud infrastructure & durable storage
      • Intelligent storage
      • GDPR
  36. 36. Texas State Library & Archives Commission (TSLAC)
      • Background: Official library and archives of Texas, received large volume of Governor Rick Perry’s digital records in 2014
      • Customer Concerns: legal mandate to ensure future access and FOI/transparency; prohibitive cost of using own state data center
      • Solution Offered: Preservica hosted on AWS GovCloud (US), with S3 and Glacier
      • Success Metrics: launch of the Texas Digital Archive (online public access); minimal local IT resources with scalable cloud deployment; geographical dispersal of multiple data copies
  37. 37. Kentucky Department for Libraries & Archives (KDLA)
      • Background: One of the largest US State Archive collections; pioneered electronic record archiving in the 1980s
      • Customer Concerns: state mandate to preserve and provide access to state records; automating a previous manual process and operating at scale
      • Solution Offered: Preservica hosted on AWS US East (N. Virginia) region
      • Success Metrics: reduced storage cost and need for local IT resource; ingesting 1,000+ records a week from over 240 state agencies; launch of Kentucky State Digital Archives (online public access)
  38. 38. Meeting FOI/FOIA and citizen access requirements: Texas Digital Archive, Kentucky State Digital Archives
  39. 39. Boston City Archives
      • Background: mandate to preserve historical and administrative records; established in 1989, prior to which records were spread across the city
      • Customer Concerns: no central digital repository; automated snapshots of city & department websites; no local IT resource
      • Solution Offered: Preservica hosted on AWS US East (N. Virginia) region
      • Success Metrics: Boston City Archives Digital Repository for online public access; records discoverable through the Digital Public Library of America; elevated profile & value of the archive, for sustainable funding
  40. 40. Transport for London (TfL)
      • Background: operate London’s public transport network; 29,000 employees across 300+ sites
      • Customer Concerns: no centralized digital repository, with 120+ million files to manage; different levels of secure access required
      • Solution Offered: Preservica hosted on AWS EU (Ireland) region; S3 and Glacier storage for optimized cost and security
      • Success Metrics: secure access to critical long-term digital records; protection of brand heritage and history; long-term maintenance of extensive infrastructure; responsive litigation & compliance support
  41. 41. Storage Use Cases (aws.amazon.com/backup-recovery/partner-solutions)
      • Primary Storage: Solutions that leverage file, block, object, and streamed data formats as an extension to on-premises storage
      • Backup and Recovery: Solutions that leverage Amazon S3 for durable data backup
      • Archive: Solutions that leverage Amazon Glacier for durable and cost-effective long-term data backup
      • BCDR: Solutions that utilize AWS to enable recovery strategies focused on RTO and RPO requirements
      • Consulting: Consulting services that provide implementation capabilities in one or more core storage categories
  42. 42. Any Questions?
      • Mike Davis: mikdav@amazon.com
      • Tres Vance: tresvanc@amazon.com
      • Dolly Isaac: dollyi@amazon.com
      • Follow us: @aws_gov
      • More information, visit https://aws.amazon.com/government-education/raisethebar/
