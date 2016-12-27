
AWS re:Invent 2016: Strategic Planning for Long-Term Data Archiving with Amazon Glacier (STG209)

Without careful planning, data management can quickly become complex, with a runaway cost structure. Enterprise customers are turning to the cloud to meet long-term data archive needs such as reliability, compliance, and agility while optimizing overall cost. Come to this session and hear how AWS customers are using Amazon Glacier to simplify their archiving strategy. Learn how customers architect their cloud archiving applications and integrations to streamline their organization's data management and establish successful IT best practices.

  1. 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. STG209: Strategic Planning for Long-Term Data Archiving with Amazon Glacier. Henry Zhang, Senior Product Manager, AWS; Rich Sutton, VP of Engineering, Digital Risk, Proofpoint. November 30, 2016.
  2. 2. AWS storage maturity • File: Amazon EFS • Block: Amazon Elastic Block Store, Amazon EC2 Instance Store • Object: Amazon S3, Amazon Glacier • Data Transfer: AWS Direct Connect, AWS Snowball, ISV Connectors, Amazon Kinesis Firehose, Amazon S3 Transfer Acceleration, AWS Storage Gateway
  3. 3. Video archives • Media distribution backbone (Ve.nue platform) • Over-The-Top (OTT) broadcast service • 20 PB of media assets, 800,000 hours of high-res content • Assets to be archived and retained for decades
  4. 4. Patient data–Philips Healthcare • HealthSuite digital platform powered by AWS • 15 petabytes of patient data • Archived for decades (beyond the lifetime of patients) • Uses AWS HIPAA-eligible services in the BAA
  5. 5. Public sector–King County • Most populous county in Washington state • Replaced tape solution for backup from 17 agencies • Meets compliance requirement • Saved $1MM in first year; no more tape refresh or management churn
  6. 6. Archive: data retained for the long term, for compliance or potential future reference. Data archiving needs are growing everywhere: • Media assets, 4K, 8K • Health care/life sciences • Financial services • Regulated industries • Oil and gas/geospatial • Digital preservation • Long-term backups • Logs
  7. 7. Consideration 1 – Total Archive Cost
  8. 8. Traditional archiving approaches • Tape libraries, robots, drives, media • Onsite (online and offline) • Offsite tape out/vaulting • Specialized software and personnel • Tape refresh every 3-5 years
  9. 9. How can AWS help with your archival? • Metered usage: pay as you go • No capital investment • No commitment • No risky capacity planning • Avoid risks of physical media handling • Control your geographic locality for performance and compliance
  10. 10. Storage pricing: pay only for what you use • 1 PB raw storage • 800 TB usable storage • 600 TB allocated storage • 400 TB application data • AWS Cloud Storage: Amazon Glacier starts at $0.004/GB/month (price dropped by 43% on 11/21)
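The gap between provisioned and actually used capacity on this slide can be made concrete with a little arithmetic. The sketch below assumes decimal units (1 TB = 1,000 GB) and applies the slide's $0.004/GB/month Glacier rate to both figures purely for comparison. The function name is illustrative, and retrieval and request charges are ignored.

```python
# Glacier storage price quoted on the slide, in USD per GB per month.
GLACIER_PRICE_PER_GB_MONTH = 0.004


def monthly_storage_cost(terabytes, price_per_gb=GLACIER_PRICE_PER_GB_MONTH, gb_per_tb=1000):
    """Return the monthly storage cost in USD for the given number of terabytes.

    Assumes decimal units (1 TB = 1,000 GB); this is a simplification.
    """
    return round(terabytes * gb_per_tb * price_per_gb, 2)


# Paying only for the 400 TB of application data ...
used_cost = monthly_storage_cost(400)
# ... versus paying the same rate for the full 1 PB (1,000 TB) of raw capacity.
raw_cost = monthly_storage_cost(1000)

print(f"400 TB used: ${used_cost:,.2f}/month")
print(f"1 PB raw:    ${raw_cost:,.2f}/month")
```

Under these assumptions, 400 TB of application data comes to about $1,600 per month, against about $4,000 per month if the full 1 PB of raw capacity were billed at the same rate.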
  11. 11. Consideration 2 – Durability
  12. 12. Durability for long-term preservation • 99.999999999% durability • Built-in fixity checking • Automatic recovery
  13. 13. Consideration 3 – Accessibility
  14. 14. Amazon Glacier – Data Retrieval Tiers • Expedited Retrieval ($0.03/GB): emergency access, 1-5 minutes, e.g. last-minute play-out schedule swap • Standard Retrieval ($0.01/GB): current model, 3-5 hours, e.g. disaster recovery • Bulk Retrieval ($0.0025/GB): batch/bulk access, 5-12 hours, e.g. PB-scale re-transcoding or video/image analysis • Positioning labels on the slide: on-site tape replacement, off-site tape replacement
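One way to read the three tiers is as a trade-off between deadline and price. The sketch below picks the cheapest tier whose worst-case time on the slide still fits a deadline, then builds a job-parameters dictionary in the shape the Glacier InitiateJob operation accepts (`Type`, `ArchiveId`, `Tier`). The pairing of prices to tiers is an assumption read off the slide layout, and the function names are illustrative.

```python
# (tier name, worst-case hours from the slide, assumed USD per GB retrieved)
RETRIEVAL_TIERS = [
    ("Bulk", 12.0, 0.0025),
    ("Standard", 5.0, 0.01),
    ("Expedited", 5.0 / 60.0, 0.03),
]


def cheapest_tier(deadline_hours):
    """Return the cheapest tier whose worst-case time fits the deadline, or None."""
    candidates = [tier for tier in RETRIEVAL_TIERS if tier[1] <= deadline_hours]
    if not candidates:
        return None
    return min(candidates, key=lambda tier: tier[2])[0]


def retrieval_job_parameters(archive_id, deadline_hours):
    """Build an archive-retrieval job description for the chosen tier."""
    tier = cheapest_tier(deadline_hours)
    if tier is None:
        raise ValueError("no retrieval tier can meet this deadline")
    return {"Type": "archive-retrieval", "ArchiveId": archive_id, "Tier": tier}


# Hypothetical archive ID, used only for illustration.
print(retrieval_job_parameters("example-archive-id", deadline_hours=24))
```

A batch re-transcoding job with a day of slack lands on Bulk, while a request that must finish within the hour is pushed to Expedited.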
  15. 15. Consideration 4 - Application & Data Management
  16. 16. Amazon Glacier – 3 ways to Access •Direct Glacier API/SDK •S3 lifecycle integration •Third party tools and gateways
  17. 17. Amazon Glacier – Direct access/APIs • Data Upload: Create Vault, Configure Access, Upload Archives, Register Archive ID • Data Retrieval: Initiate Retrieval, Async Retrieval Completion, Completion Notification, Download Data
  18. 18. Use Glacier via S3 Object Lifecycle • S3 Standard: active data, synchronous access, $0.023/GB/mo. • S3 - Infrequent Access: infrequently accessed data, synchronous access, $0.0125/GB/mo. • Amazon Glacier: archive data, async access, $0.004/GB/mo.
  19. 19. Data lifecycle management • Transition Standard to Standard-IA • Transition Standard-IA to Amazon Glacier • Transition based on object tags • Expiration and versioning • [Chart: data access frequency over time, from T through T + 365 days]
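The upload and retrieval steps on this slide can be sketched with the method names of the boto3 Glacier client (`create_vault`, `upload_archive`, `initiate_job`, `describe_job`, `get_job_output`). This is a minimal sketch, not a production implementation: the client is passed in, so in real use it would be `boto3.client("glacier")`. The helper names are hypothetical, the "Configure Access" step is omitted, and polling stands in for the completion notification.

```python
import time


def upload_archive(client, vault_name, data, description=""):
    """Create the vault, upload one archive, and return its archive ID.

    The caller is expected to record the returned ID in its own index
    (the "Register Archive ID" step), since it is needed for retrieval.
    """
    client.create_vault(vaultName=vault_name)
    response = client.upload_archive(
        vaultName=vault_name,
        archiveDescription=description,
        body=data,
    )
    return response["archiveId"]


def retrieve_archive(client, vault_name, archive_id, poll_seconds=900):
    """Start an asynchronous retrieval job, wait for it, and return the bytes."""
    job = client.initiate_job(
        vaultName=vault_name,
        jobParameters={"Type": "archive-retrieval", "ArchiveId": archive_id},
    )
    job_id = job["jobId"]
    # Retrieval is asynchronous: poll until the job reports completion.
    while not client.describe_job(vaultName=vault_name, jobId=job_id)["Completed"]:
        time.sleep(poll_seconds)
    output = client.get_job_output(vaultName=vault_name, jobId=job_id)
    return output["body"].read()
```

Passing the client as an argument keeps the sketch testable with a stand-in object. A longer-running system would more likely subscribe to the completion notification than poll.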
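The transitions listed on this slide map onto an S3 lifecycle configuration. The sketch below builds one rule as a plain dictionary in the shape accepted by boto3's `put_bucket_lifecycle_configuration`. The rule ID, the `videos/` prefix, and the day counts are illustrative assumptions, not values from the talk.

```python
def lifecycle_rule(rule_id, prefix, ia_after_days, glacier_after_days, expire_after_days=None):
    """Build one lifecycle rule: Standard -> Standard-IA -> Glacier, then optional expiry."""
    if glacier_after_days <= ia_after_days:
        raise ValueError("the Glacier transition must come after the Standard-IA transition")
    rule = {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
    }
    if expire_after_days is not None:
        rule["Expiration"] = {"Days": expire_after_days}
    return rule


# Illustrative values: move to Standard-IA after 30 days, to Glacier after 90,
# and expire after roughly ten years.
lifecycle_configuration = {
    "Rules": [lifecycle_rule("archive-old-videos", "videos/", 30, 90, 3650)]
}
print(lifecycle_configuration)
```

With boto3, a dictionary like this would be passed as `LifecycleConfiguration` to `put_bucket_lifecycle_configuration`. For a tag-based transition, the `Filter` would hold a `Tag` entry with `Key` and `Value` instead of a prefix.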
  20. 20. Transition older videos to Standard-IA
  21. 21. Save money on storage 45% saving over S3 Standard 44% saving over S3 Standard-IA * Assumes the highest public pricing tier
  22. 22. Amazon Glacier – Third-party tools and gateways • Consumer grade: less than $50 • Example: Cloudberry, FastGlacier, Arq (Haystack Software) • Small / medium business: $500 - $1,000 • Example: Synology, Veeam, QNap • Enterprise gateway and data management software • Example: NetApp AltaVault, CommVault, StorNext, Vidispine
  23. 23. Which option should I choose? • Use S3 lifecycle managed Amazon Glacier if the S3 object keys are sufficient for index/search capability • Use Amazon Glacier directly if you already plan to store more metadata/indices in a database • Use 3rd party tools to minimize coding • Does the tool write data in proprietary or native format in AWS?
  24. 24. Media Archive and Metadata (cloud transition). [Diagram components: corporate data center; Onsite Archive; Offsite Tape Archive; On-Premise Tape; Hierarchical Storage Manager; Metadata (Asset Manager); Processing Tasks]
  25. 25. Media Archive (transition to the cloud). [Diagram components: corporate data center; Onsite Archive; Offsite Tape Archive; On-Premise Tape; Hierarchical Storage Manager; Metadata (Asset Manager); Processing Tasks; AWS Direct Connect; AWS Region; Amazon Glacier; Cloud DAM (syncing metadata from on-prem)]
  26. 26. Media Archive (transition to the cloud). [Diagram components: corporate data center; Onsite Archive; Offsite Tape Archive; On-Premise Tape; Hierarchical Storage Manager; Metadata (Asset Manager); Processing Tasks; AWS Direct Connect; AWS Region; Amazon Glacier; Cloud DAM (syncing metadata from on-prem); Amazon S3; Cloud Based Processing Tasks]
  27. 27. Media Archive (transition to the cloud). [Diagram components: corporate data center; Onsite Archive; Onsite Cache; Offsite Tape Archive; On-Premise Tape; Hierarchical Storage Manager; Metadata (Asset Manager); Processing Tasks; AWS Direct Connect; AWS Region; Amazon Glacier; Cloud DAM (syncing metadata from on-prem); Amazon S3; Cloud Based Processing Tasks]
  28. 28. Consideration 5 - Compliance and Retention
  29. 29. Compliance storage with Vault Lock: Amazon Glacier Vault Lock allows you to easily set compliance controls on individual vaults and enforce them via a lockable policy • Time-based retention • MFA authentication • Controls govern all records in a vault • Immutable policy • Two-step locking
  30. 30. Vault Lock for compliance storage • Non-overwrite, non-erasable records • Time-based retention with “ArchiveAgeInDays” control • Policy lockdown (strong governance) • Legal hold with vault-level tags • Configure optional designated third-party access and grant temporary access
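The controls on this slide can be expressed as a Vault Lock policy document. The sketch below builds one that denies archive deletion until `glacier:ArchiveAgeInDays` reaches the retention period, and also denies deletion while a vault-level legal-hold tag is set. The vault ARN, the `LegalHold` tag name, and the statement IDs are hypothetical placeholders.

```python
import json


def vault_lock_policy(vault_arn, retention_days, legal_hold_tag="LegalHold"):
    """Return a Vault Lock policy as a JSON string."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Time-based retention: no deletes while the archive is too young.
                "Sid": "deny-delete-before-retention-ends",
                "Principal": "*",
                "Effect": "Deny",
                "Action": "glacier:DeleteArchive",
                "Resource": [vault_arn],
                "Condition": {
                    "NumericLessThan": {"glacier:ArchiveAgeInDays": str(retention_days)}
                },
            },
            {
                # Legal hold: no deletes while the vault carries the hold tag.
                "Sid": "deny-delete-under-legal-hold",
                "Principal": "*",
                "Effect": "Deny",
                "Action": "glacier:DeleteArchive",
                "Resource": [vault_arn],
                "Condition": {
                    "StringLike": {"glacier:ResourceTag/" + legal_hold_tag: ["true", ""]}
                },
            },
        ],
    }
    return json.dumps(policy, indent=2)


# Hypothetical ARN and a 365-day retention period, for illustration only.
print(vault_lock_policy("arn:aws:glacier:us-west-2:123456789012:vaults/examplevault", 365))
```

With boto3, the two-step locking would correspond to `initiate_vault_lock(vaultName=..., policy={"Policy": policy_json})`, which returns a `lockId`, followed by `complete_vault_lock(vaultName=..., lockId=...)` once the policy has been validated.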
  31. 31. Amazon Glacier received a third-party assessment from Cohasset Associates on how Amazon Glacier with Vault Lock can be used to meet the requirements of SEC Rule 17a-4(f) and CFTC 1.31(b)-(c).
  32. 32. Proofpoint SocialPatrol Archive: AWS Glacier and Vault Lock Use Case. Rich Sutton, VP of Engineering, Digital Risk, Social Media Security, and Compliance, Proofpoint
  33. 33. Proofpoint • Cloud-based security and compliance for the enterprise: threat research, email, mobile, social, digital risk • Founded 2002, public in 2012 • $350M annual revenue, $3B market cap • Huge AWS user
  34. 34. Proofpoint SocialPatrol: policy controls and enforcement for social • Combats fraudulent brand impersonation • Moderates content at scale • Ensures compliance in publishing • Integrates with social APIs • 150+ classifiers using NLP and ML • Text, links, images, metadata • Ingesting >1M social posts per day • Built in AWS
  35. 35. Proofpoint SocialPatrol: how it works. [Diagram components: Social; PFPT in AWS (policy engine, MySQL/C*/Solr); Enterprise Archive] Customer request: “Awesome. Help me with retention by integrating with my existing email archive.”
  36. 36. Proofpoint SocialPatrol archiving integration: imperfect … • Social != Email • Every archive is different • Requires internal collaboration
  37. 37. Proofpoint SocialPatrol Archive: SEC Rule 17a-4(f)-compliant archive, purpose-built for social, enabled by Amazon Glacier and Vault Lock. [Diagram components: Social; PFPT in AWS (policy engine, MySQL/C*/Solr); Amazon Glacier & Vault Lock]
  38. 38. Proofpoint SocialPatrol Archive The customer specifies the retention period in Proofpoint Social:
  39. 39. Proofpoint SocialPatrol Archive Via AWS API we create a vault for that customer:
  40. 40. Proofpoint SocialPatrol Archive Via AWS API, we lock the vault, and specify policy to observe a legal hold via a tag.
  41. 41. Proofpoint SocialPatrol Archive As social content flows in, we record its purge date and surface that to the user. Each piece of social content is an archive in the vault.
  42. 42. Proofpoint SocialPatrol Archive Search UI uses the copy of the data we already had. As archives expire, we purge them.
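The bookkeeping described on the last two slides (record a purge date per item, then purge expired archives) can be sketched with plain date arithmetic. This is a hypothetical data model for illustration, not Proofpoint's implementation: the record fields and function names are assumptions.

```python
from datetime import date, timedelta


def purge_date(archived_on, retention_days):
    """Return the first date on which an archive may be purged."""
    return archived_on + timedelta(days=retention_days)


def expired_archives(records, today):
    """Return the archive IDs whose purge date has passed and that are not on legal hold."""
    return [
        record["archive_id"]
        for record in records
        if record["purge_date"] <= today and not record.get("legal_hold", False)
    ]


# Hypothetical records: one expired, one still retained, one expired but on hold.
records = [
    {"archive_id": "a1", "purge_date": purge_date(date(2015, 1, 1), 365)},
    {"archive_id": "a2", "purge_date": purge_date(date(2016, 11, 30), 365)},
    {"archive_id": "a3", "purge_date": purge_date(date(2015, 1, 1), 365), "legal_hold": True},
]
print(expired_archives(records, today=date(2016, 12, 27)))
```

Keeping the purge date alongside each archive ID in the application's own database means the purge job never has to query the vault to decide what is eligible for deletion.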
  43. 43. Proofpoint SocialPatrol Archive • Legal hold can be put in place by Proofpoint Support • Data can be exported from Amazon Glacier by Proofpoint Support when necessary • Amazon Glacier with Vault Lock allowed us to build a product that complies with SEC Rule 17a-4(f) and CFTC Rule 1.31(b)-(c) What would it have cost for us to build a WORM data store, get it certified, and scale it … ?
  44. 44. Data ingestion into AWS storage services • Snowball Edge: accelerate PBs with AWS-provided appliances; NEW 100 TB model with compute • Storage Gateway: instant hybrid cloud; up to 120 MB/s cloud upload rate (4x improvement) • Firehose: ingest data streams directly into AWS data stores • Direct Connect: COLO to AWS • ISV Connectors: Commvault, Veritas, etcetera • NEW S3 Transfer Acceleration: accelerate object transfer up to 300% using AWS’s private network
  45. 45. Related Sessions STG302 - Deep Dive on Amazon Glacier STG210 - Simplified Data Center Migration—Lessons Learned by Live Nation STG312 - Workshop: Working with AWS Snowball - Accelerating Data Ingest into the Cloud
  47. 47. Remember to complete your evaluations!
  48. 48. Thank you!

