This document discusses architecting genomic data security and compliance on Amazon Web Services (AWS). It reviews guidelines from agencies such as the NIH on genomic privacy and security. It describes the AWS shared responsibility model, under which AWS manages the underlying cloud infrastructure while customers remain responsible for their own data and access controls within their AWS accounts. The document provides recommendations for using AWS services to meet the requirements for controlled-access genomic datasets from dbGaP and other repositories.
This document provides an overview of Amazon Web Services (AWS) and its portfolio of cloud computing services. It discusses the breadth and depth of the AWS platform, which includes over 100 services across computing, storage, databases, analytics, mobile, developer tools, management tools, IoT, and enterprise applications. The document also summarizes several core AWS services like Amazon EC2 for virtual servers, Amazon S3 for object storage, Amazon DynamoDB for NoSQL databases, and Amazon RDS for SQL databases. It highlights how AWS services provide scalability, reliability, security, and low costs compared to on-premises infrastructure.
This document provides an introduction to high performance computing (HPC) on Amazon Web Services (AWS). It discusses what HPC is and how grids and clusters are commonly used to enable parallel processing. It also outlines a wide range of HPC applications and how they map to AWS features. Key factors that make AWS compelling for HPC are its scalability, agility, ability to enable global collaboration, and cost optimization options. Sample HPC architectures that can be implemented on AWS including grid and cluster computing are also presented.
Fortinet Automates Migration onto Layered Secure Workloads (Amazon Web Services)
A primary concern for many of today’s organizations is how to securely migrate their data and workloads to the cloud. To mitigate the associated risks, multi-layered protection needs to be in place at all points along the path of data: entering, exiting, and within the cloud. Join Fortinet and AWS to learn how you can enable robust and effective security for your AWS Cloud-based applications and services. Fortinet provides a comprehensive security solution for your hybrid workloads, allowing you to secure them effectively with simplified, automated migration.
Join us to learn:
- The best practices for enabling visibility and control against advanced threats
- How to identify and enable the right security architecture for your applications and services
- How to protect your data along each step of the migration process
Who should attend: CTOs, CIOs, CISOs, IT Administrators, IT Architects and IT Security Engineers
This document provides an overview of database scaling strategies on AWS. It begins with a single EC2 instance hosting a full stack application and database. It then progresses through separating components, adding redundancy, implementing sharding and database federation to handle increasing user loads from 1 to over 1 million users. Key strategies discussed include moving to managed database services like RDS, adding read replicas, distributing load with services like S3, CloudFront, DynamoDB and SQS, and splitting databases by function or key using sharding or federation.
Session Sponsored by Tableau: Transforming Data Into Valuable Insights (Amazon Web Services)
Want to transform your data into valuable insights that can help make your business more productive, profitable and secure? Come learn about Splunk Cloud which delivers Operational Intelligence as a cloud service, enabling you to gain critical insights from your machine data without the need to manage any infrastructure.
Speaker: Jason Oakes, Sales Consultant, Tableau
Next-Generation Security Operations with AWS | AWS Public Sector Summit 2016 (Amazon Web Services)
With the upswell of cloud adoption, many traditional infrastructure paradigms are shifting. Security is no different. The cloud service provider industry is discovering new ways to tackle security, including automation, bottomless logging, scalable analysis clusters, and pluggable security tools. This session presents a case study in extending a traditional infrastructure operation into AWS. We provide a practical look into the technical challenges and benefits of operating in this new paradigm, explore incident response automation (Alexa integration), and provide various examples of shifting an on-premises security operation to a scalable, hybrid model. Through lessons learned and analysis, we show why your data is safer in the cloud than in that rack you can touch in your data center.
AWS is hosting the first FSI Cloud Symposium in Hong Kong, which will take place on Thursday, March 23, 2017 at Grand Hyatt Hotel. The event will bring together FSI customers, industry professionals and AWS experts, to explore how to turn the dream of transformation, innovation and acceleration into reality by exploiting Cloud, Voice to Text and IoT technologies. The packed agenda includes expert sessions on a host of pressing issues, such as security and compliance, as well as customer experience sharing on how cloud computing is benefiting the industry.
Speaker: Olivier Klein, Solutions Architect, AWS
1) The document discusses serverless real-time analytics using AWS services like Kinesis, Kinesis Firehose, Kinesis Analytics and Athena. It provides examples of how to ingest raw JSON data from sensors in real-time, convert to CSV, aggregate data and generate alerts.
2) The document compares Kinesis Streams and Firehose for ingesting streaming data and explains how to process and analyze real-time data using Kinesis Analytics with SQL. It also discusses visualizing results using QuickSight.
3) The examples show how to model streaming data with schemas, apply windowing functions to aggregate data, detect anomalies with Lambda/SNS, and enable real-time predictions with Machine Learning. (A minimal ingestion sketch follows this list.)
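As a companion to item 1, here is a minimal sketch of the ingestion step: putting raw JSON sensor readings onto a Kinesis stream with boto3. The stream name, region, and record fields are illustrative assumptions, not names from the talk.

```python
import json

import boto3

STREAM_NAME = "sensor-events"  # hypothetical stream name

kinesis = boto3.client("kinesis", region_name="us-east-1")

def send_reading(sensor_id: str, temperature: float) -> None:
    """Put one raw JSON sensor reading onto the Kinesis stream."""
    record = {"sensor_id": sensor_id, "temperature": temperature}
    kinesis.put_record(
        StreamName=STREAM_NAME,
        Data=json.dumps(record),
        PartitionKey=sensor_id,  # keeps one sensor's readings ordered
    )

if __name__ == "__main__":
    send_reading("sensor-001", 21.7)
```

Downstream, a Kinesis Analytics application (or a Firehose delivery stream) would consume these records for the windowed aggregations the talk describes.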
by Jon Handler, Principal Solutions Architect and Sanjay Dhar, Solutions Architect, AWS
Nearly everything in IT - servers, applications, websites, connected devices, and other things - generates discrete, time-stamped records of events called logs. Processing and analyzing these logs to gain actionable insights is log analytics. We'll look at how to use centralized log analytics across multiple sources with Amazon Elasticsearch Service.
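To make "centralized" concrete, here is a minimal sketch that indexes a single log event into a daily index on an Amazon Elasticsearch Service domain. The endpoint URL and field names are assumptions, and the domain's access policy (or a SigV4-signing client) must authorize the request.

```python
import json
from datetime import datetime, timezone

import requests

# Hypothetical domain endpoint; substitute your own.
ES_ENDPOINT = "https://search-logs-example.us-east-1.es.amazonaws.com"

def index_log_event(source: str, message: str) -> None:
    """Index one time-stamped log event into a daily index."""
    now = datetime.now(timezone.utc)
    event = {
        "@timestamp": now.isoformat(),
        "source": source,      # e.g., "web-01" or "api-gateway"
        "message": message,
    }
    index = "logs-" + now.strftime("%Y.%m.%d")  # one index per day
    resp = requests.post(
        f"{ES_ENDPOINT}/{index}/_doc",
        data=json.dumps(event),
        headers={"Content-Type": "application/json"},
        timeout=10,
    )
    resp.raise_for_status()

index_log_event("web-01", "GET /health 200 3ms")
```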
Best Practices Using Big Data on AWS | AWS Public Sector Summit 2017 (Amazon Web Services)
Join us for this general session where AWS big data experts present an in-depth look at the current state of big data. Learn about the latest big data trends and industry use cases. Hear how other organizations are using the AWS big data platform to innovate and remain competitive. Take a look at some of the most recent AWS big data developments. Learn More: https://aws.amazon.com/government-education/
ENT314 Automate Best Practices and Operational Health for Your AWS Resources (Amazon Web Services)
Trusted Advisor and AWS Health can help automate best practices and operational health for AWS resources. Trusted Advisor provides recommendations across cost optimization, security, fault tolerance, and performance, and has delivered over $500M in savings. AWS Health provides visibility into service health, and the Personal Health Dashboard provides customized alerts and automations. Examples shown include notifying Slack of issues, or stopping deployments, using Lambda and CloudWatch Events when issues arise. Automation helps ensure operational best practices, but safety measures are needed when automating actions on resources.
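The Slack notification example could look something like the following Lambda handler, triggered by a CloudWatch Events rule matching aws.health events. This is a sketch under assumptions: the webhook URL is supplied via an environment variable, and the event fields follow the documented AWS Health event shape.

```python
import json
import os
import urllib.request

# Hypothetical webhook URL supplied through a Lambda environment variable.
SLACK_WEBHOOK_URL = os.environ["SLACK_WEBHOOK_URL"]

def handler(event, context):
    """Triggered by a CloudWatch Events rule matching source=aws.health."""
    detail = event.get("detail", {})
    description = detail.get("eventDescription", [{}])[0].get(
        "latestDescription", ""
    )
    text = (
        f"AWS Health event: {detail.get('eventTypeCode', 'unknown')} "
        f"({detail.get('service', '?')}) - {description}"
    )
    body = json.dumps({"text": text}).encode("utf-8")
    req = urllib.request.Request(
        SLACK_WEBHOOK_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)  # Slack replies "ok" on success
```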
February 2016 Webinar Series - Introduction to AWS Database Migration Service (Amazon Web Services)
AWS Database Migration Service helps you migrate databases to AWS easily and securely with minimal downtime to the source database. AWS Database Migration Service can be used for both homogeneous and heterogeneous database migrations from on-premises to RDS or EC2, as well as from EC2 to RDS.
In this webinar, we will provide an introduction to AWS Database Migration Service and go through the details of how you can use it today for your database migration projects. We will also discuss the AWS Schema Conversion Tool, which helps you convert your database schema and code for cross-database (heterogeneous) migrations. A minimal task-creation sketch follows the attendee list below.
Learning Objectives:
Understand what AWS Database Migration Service is
Learn how to start using AWS Database Migration Service
Understand homogenous and heterogeneous migrations
Learn about the AWS Schema Conversion Tool
Who Should Attend:
IT Managers, DBAs, Solution Architects, Engineers and Developers
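As referenced above, here is a hedged sketch of creating and starting a DMS replication task with boto3. All ARNs and identifiers are placeholders; the source/target endpoints and the replication instance would be created beforehand.

```python
import json

import boto3

dms = boto3.client("dms")

# Select every table in every schema; placeholders throughout.
table_mappings = {
    "rules": [{
        "rule-type": "selection",
        "rule-id": "1",
        "rule-name": "include-all",
        "object-locator": {"schema-name": "%", "table-name": "%"},
        "rule-action": "include",
    }]
}

task = dms.create_replication_task(
    ReplicationTaskIdentifier="orders-db-migration",
    SourceEndpointArn="arn:aws:dms:us-east-1:123456789012:endpoint:SRC",
    TargetEndpointArn="arn:aws:dms:us-east-1:123456789012:endpoint:TGT",
    ReplicationInstanceArn="arn:aws:dms:us-east-1:123456789012:rep:INST",
    # full-load-and-cdc copies existing data, then streams ongoing
    # changes, which is what keeps source downtime minimal.
    MigrationType="full-load-and-cdc",
    TableMappings=json.dumps(table_mappings),
)

dms.start_replication_task(
    ReplicationTaskArn=task["ReplicationTask"]["ReplicationTaskArn"],
    StartReplicationTaskType="start-replication",
)
```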
The document discusses Amazon Web Services (AWS) and why customers use AWS. It notes that the top reasons are: 1) Agility and scalability allowing customers to quickly scale resources, 2) The broad platform of services offered by AWS, and 3) AWS's ability to support innovation through frequent new feature and service launches. It provides examples of how customers from startups to large enterprises are using AWS across different industry domains.
Get Started Today with Cloud-Ready Contracts | AWS Public Sector Summit 2017 (Amazon Web Services)
In this session, we provide an overview of existing cloud-ready contracts, such as cooperative, federal, and state directed contracts, and walk through steps on how to choose the right one for your procurement. We compare various cloud-ready contracts by identifying scope, end-user eligibility, and primary service offerings to help you make the right choice for your mission needs. Learn More: https://aws.amazon.com/government-education/
Optimizing the Data Tier for Serverless Web Applications - March 2017 Online ... (Amazon Web Services)
This document summarizes an AWS webinar on optimizing the data tier for serverless web applications. The webinar covered the anatomy of serverless apps and data tier options on AWS, including DynamoDB, RDS, and ElastiCache. It discussed NoSQL vs. SQL considerations and best practices for using AWS Lambda with each data store. Additional best practices covered caching, retries, and event ordering for serverless architectures.
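One commonly cited Lambda best practice in this area is creating the data store client outside the handler so warm invocations reuse the connection. A minimal sketch with DynamoDB follows; the table name, key, and event shape are assumptions, not details from the webinar.

```python
import boto3

# Created once at module load so warm invocations reuse the connection.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("sessions")  # hypothetical table name

def handler(event, context):
    """Read or write a session item keyed by user_id."""
    user_id = event["user_id"]
    if event.get("action") == "put":
        table.put_item(Item={"user_id": user_id, "state": event["state"]})
        return {"ok": True}
    resp = table.get_item(Key={"user_id": user_id})
    return resp.get("Item", {})
```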
This document provides an overview of big data concepts and architectures, as well as AWS big data services. It begins with introducing big data challenges around variety, volume, and velocity of data. It then covers the Hadoop ecosystem including HDFS, MapReduce, Hive, Pig and Spark. The document also discusses data lake architectures and how AWS services like S3, Glue, Athena, EMR, Redshift, QuickSight can be used to build them. Specific services covered in more detail include Kinesis, MSK, Glue, EMR and Redshift. Real-world examples of big data usage are also presented.
AWS re:Invent 2016: IoT: Build, Test, and Securely Scale (GPST302) (Amazon Web Services)
With the rapid adoption of IoT services on AWS, how do partners and organizations effectively build, test, scale, and secure these highly transactional, data-laden systems? This session is a deep dive on the API, SDK, device gateway, rules engine, and device shadows. Consulting and Technology Partner customers share their experiences as we highlight lessons learned and best practices to increase audience efficacy.
AWS re:Invent 2016: Case Study: Data-Heavy Healthcare: UPMCe’s Transformative... (Amazon Web Services)
Today's health care systems generate massive amounts of protected health information (PHI) — patient electronic health records, imaging, prescriptions, genomic profiles, insurance records, even data from wearable devices. In this session, UPMCe dives deep into two efforts: their "Data Liberation Project," a next-gen petabyte-scale software solution that provides responsible management of PHI within their own environments as well as externally, and "Neutrino," a real-time medical document aggregator that utilizes natural language processing techniques to unlock hidden value from unstructured narratives. UPMC Enterprises (UPMCe), a division of University of Pittsburgh Medical Center, builds technology and invests in health care companies, from new startups to large established partners, with an eye toward revolutionizing healthcare. They embody the startup mentality with a focus on innovation and creating new data-heavy applications—all in support of new spin-off companies, furthering economic development, and disrupting healthcare. Join us to learn how security management and governance using Amazon S3, Amazon EC2, AWS Config, AWS CloudTrail, and other AWS services helps UPMCe think big about healthcare data in the public sector.
This document discusses best practices for building a data lake architecture on AWS. It recommends using Amazon S3 as the centralized data lake storage and decoupling storage from compute. This allows for cheaper, more efficient operation and the ability to evolve to clusterless analytics tools like Amazon Athena. The document provides guidance on security, ingestion, cataloging, cost optimization, analytics tools and building a sample pipeline to analyze data in the lake.
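To make the "clusterless analytics" point concrete, here is a minimal Athena query submission via boto3. The database, table, and results bucket are placeholder names; the table would be defined over files already stored in the S3 data lake (for example, via Glue or DDL).

```python
import boto3

athena = boto3.client("athena")

# Database, table, and output bucket are placeholders.
resp = athena.start_query_execution(
    QueryString=(
        "SELECT status, COUNT(*) AS hits "
        "FROM access_logs GROUP BY status"
    ),
    QueryExecutionContext={"Database": "datalake"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
# Poll get_query_execution with this ID to track completion.
print("query id:", resp["QueryExecutionId"])
```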
The document discusses Amazon Relational Database Service (Amazon RDS), which allows users to set up and manage relational databases in the cloud. Some key points:
- Amazon RDS provides a managed database service running MySQL, Oracle, SQL Server, PostgreSQL, MariaDB, or Aurora databases. It handles time-consuming administration tasks.
- Databases can be easily launched and scaled up or down as needed. Additional storage, compute power, and throughput can also be provisioned.
- Automated backups provide point-in-time recovery. Multi-AZ deployments provide high availability and durability. Security features include encryption and IAM access control.
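A hedged sketch of provisioning an instance with those safeguards enabled (Multi-AZ, storage encryption, automated backups) follows; the identifier, engine, credentials, and sizes are illustrative only.

```python
import boto3

rds = boto3.client("rds")

# Identifier, engine, credentials, and sizes are illustrative only.
rds.create_db_instance(
    DBInstanceIdentifier="app-db",
    Engine="postgres",
    DBInstanceClass="db.t3.medium",
    AllocatedStorage=100,          # GiB; storage can be scaled later
    MasterUsername="dbadmin",
    MasterUserPassword="REPLACE_ME",
    MultiAZ=True,                  # synchronous standby in a second AZ
    StorageEncrypted=True,         # encryption at rest via KMS
    BackupRetentionPeriod=7,       # enables point-in-time recovery
)
```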
AWS re:Invent 2016: How to Build a Big Data Analytics Data Lake (LFS303) (Amazon Web Services)
This document discusses building a big data analytics data lake. It begins with an overview of what a data lake is and the benefits it provides like quick data ingestion without schemas and storing all data in one centralized location. It then discusses important capabilities like ingestion, storage, cataloging, search, security and access controls. The document provides an example of how biotech company AMGEN built their own data lake on AWS. It concludes with a demonstration of an AWS data lake solution package that can be deployed via CloudFormation to build an initial data lake.
Amazon Web Services offers a wide range of tools and features to help you meet your security objectives. These tools mirror the familiar controls you deploy within your on-premises environments. Amazon Web Services provides security-specific tools and features across network security, configuration management, access control and data security. In addition, Amazon Web Services offers monitoring and logging tools that give full visibility into what is happening in your environment. In this session, you will be introduced to the range of security tools and features that Amazon Web Services offers, and the latest security innovations coming from Amazon Web Services.
Andrew Watts-Curnow, Cloud Architect - Professional Services, ASEAN
ENT316 Keeping Pace With The Cloud: Managing and Optimizing as You Scale (Amazon Web Services)
"With cloud maturity comes operational efficiencies and endless potential for innovation and business growth. However, the complexities of governing cloud infrastructure are impeding without the right strategy. Visibility, accountability, and actionable insights are some of the most invaluable considerations. The AWS cloud clearly enables convenience and cost savings for organizations that know how to leverage its full potential. Amazon EC2 Reserved Instances (RIs) in particular, present a tremendous opportunity when scaling to save significantly on capacity but there are many considerations to fully reaping the benefits of RIs. In this session, CloudCheckr CTO Patrick Gartlan will present issues that every organization runs into when scaling, provide best practices for how to combat them and help you show your boss how RIs help you save money and move faster.
This session is brought to you by AWS Summit New York City sponsor, CloudCheckr. "
by Gautam Srinivasan, Solutions Architect and Karan Desai, Solutions Architect, AWS
We'll import data from multiple sources and run analytics on our Data Lake. You’ll need a laptop with a Firefox or Chrome browser.
The document provides an overview of leading big data companies in 2021 and the Apache Hadoop stack, including related Apache software and the NIST big data reference architecture. It lists over 50 big data companies, including Accenture, Actian, Aerospike, Alluxio, Amazon Web Services, Cambridge Semantics, Cloudera, Cloudian, Cockroach Labs, Collibra, Couchbase, Databricks, DataKitchen, DataStax, Denodo, Dremio, Franz, Gigaspaces, Google Cloud, GridGain, HPE, HVR, IBM, Immuta, InfluxData, Informatica, IRI, MariaDB, Matillion, Melissa Data
When evaluating and planning migrating your data from on premises to the Cloud, you might encounter physical limitations. Amazon offers a suite of tools to help you surmount these limitations by moving data using networks, roads, and technology partners. In this session, we discuss how to move large amounts of data into and out of the Cloud in batches, increments, and streams.
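For the network path specifically, large objects are usually moved in parallel parts. A minimal sketch using boto3's managed transfer follows; the bucket, key, file path, and thresholds are all illustrative assumptions.

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Thresholds are illustrative; files above multipart_threshold are
# split into parts and uploaded on several concurrent connections.
config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,  # 64 MiB
    multipart_chunksize=64 * 1024 * 1024,
    max_concurrency=8,
)

s3.upload_file(
    Filename="/data/export/batch-0001.tar",  # hypothetical local file
    Bucket="my-ingest-bucket",               # hypothetical bucket
    Key="batches/batch-0001.tar",
    Config=config,
)
```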
This document discusses implementing Windows workloads on AWS. It provides a brief history of Windows support on AWS since 2008. It describes how line of business applications and corporate applications can be deployed on AWS through self-managed EC2 instances or managed RDS. It discusses why customers choose AWS for security, availability, performance, familiar environment, cost effectiveness and licensing options. The document concludes with next steps to get started with Windows workloads on AWS.
Emind’s Architecture for Enterprise with AWS Integration (Lahav Savir)
This document outlines Emind's architecture for integrating enterprise systems with AWS. The key goals are to reduce wait times, enable easy scaling of computational pipelines, and provide access to cloud services. The architecture covers integrating billing, identity management, networking, security, applications, monitoring, and automation between on-premises systems and AWS. It also describes managed services and self-service options.
This document provides instructions for building an HPC cluster on AWS using the cfnCluster tool in 10 minutes. It discusses establishing an AWS account, pulling the cfnCluster source code from GitHub, generating SSH keys, configuring cfnCluster, and other configuration options to consider like using low-cost t2.micro instance types when first experimenting with cfnCluster functionality. The overall process demonstrated allows provisioning an HPC cluster within AWS that includes a head node and auto-scaling compute nodes connected over a 10G network, using CloudFormation templates managed by cfnCluster.
This document provides an overview of how Amazon Web Services (AWS) can be used for high-performance computing (HPC). It discusses how AWS's large scale and flexibility allows users to provision resources on demand and pay only for what they use. Tools like AWS CloudFormation and cfnCluster make it easy to provision HPC clusters in AWS that utilize services like EC2, EBS, and VPC.
AWS Educate is Amazon's global initiative to accelerate cloud learning for educational institutions, educators, and students. It provides four pillars of support: labs and open course content on AWS products and cloud topics, grants for free AWS services, educator collaboration tools, and communities for sharing best practices. The program is self-service, automated, and available globally in English. Over 1,300 educators and 380 institutions from over 50 countries have already joined, along with over 9,400 students. The initiative aims to graduate students ready for cloud-enabled careers.
The essay analyzes Ferdinand Lassalle's definition of a constitution. Lassalle argues that a constitution is the sum of the real factors of power that govern a state, even though these forces are often omitted from written constitutions. He explains that a constitution should reflect the interests of all members of a society by establishing the powers of the state and the rights of citizens. He concludes that a constitution is more than a simple written law: it is the fundamental law that must be organized with...
This document describes predictive maintenance programs, including their definition, advantages, techniques, disadvantages, and the steps for implementing an effective program. It explains that predictive maintenance is based on determining the condition of machinery in operation in order to detect symptoms of incipient failures. Techniques include oil analysis, vibration analysis, and temperature monitoring. Implementing a predictive maintenance program can increase machinery reliability and reduce costs.
Automating Compliance Defense in the Cloud - September 2016 Webinar Series (Amazon Web Services)
This document discusses how to automate compliance when using AWS cloud services. It recommends five steps: 1) Partner with cloud technology and security experts; 2) Integrate industry standards and regulatory requirements; 3) Create a master design that meets requirements; 4) Enforce deployment according to the design; and 5) Mechanize scalable governance and auditing programs. Following best practices like leveraging CIS benchmarks, creating a "golden environment" configuration, and using AWS Service Catalog can help automate controls and achieve continuous compliance defense in the cloud.
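Continuous compliance checks of this kind are commonly expressed as AWS Config rules. A hedged sketch enabling one AWS-managed rule follows; the rule name is our own label, and the Config recorder must already be running in the account.

```python
import boto3

config = boto3.client("config")

# Enables the AWS-managed rule that flags S3 buckets without default
# server-side encryption.
config.put_config_rule(
    ConfigRule={
        "ConfigRuleName": "s3-encryption-required",  # our own label
        "Source": {
            "Owner": "AWS",
            "SourceIdentifier": "S3_BUCKET_SERVER_SIDE_ENCRYPTION_ENABLED",
        },
    }
)
```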
Rackspace provides a comprehensive set of tooling and expertise on AWS that further unlocks your ability to secure your environment efficiently and cost effectively. The dynamic environment of data, applications, and infrastructure can pose challenges for businesses trying to manage security while following compliance regulations. To mitigate these challenges, businesses need a scalable security solution to ensure their data is safe, secure, and stable. In this webinar, Brad Schulteis, Jarret Raim and Todd Gleason will discuss the topic of security control requirements on AWS through the lens of three common compliance scenarios: HIPAA, PCI-DSS, and generalized security compliance based on the NIST Risk Management Framework. Watch our webinar to learn how Rackspace combines AWS and security expertise with tools like AWS CloudFormation, AWS CodeCommit and AWS CodeDeploy to help customers meet their security and compliance needs.
Join us to learn:
• Best practices for securely operating workloads on the AWS Cloud
• Architecting a secure environment for dynamic workloads
• How to incorporate Security by Design principles to address compliance needs across 3 use cases: HIPAA, PCI-DSS and generalized security compliance based on the NIST Risk Management Framework
Who should attend: Directors and Managers of Security, IT Administrators, IT Architects, and IT Security Engineers
Automating Compliance Defense in the Cloud - Toronto FSI Symposium - October ... (Amazon Web Services)
Jodi Scrofani, Global Financial Services Compliance Strategist for AWS, takes us on a journey through the security and compliance mechanisms that are mandatory in the Financial Services Industry, and explains how customers address them today on the AWS Cloud. She explains the AWS Shared Security Model, gives a detailed overview of audits and certifications achieved by AWS, and shows best practices and steps that FSI customers should take to ensure compliance and security.
This document provides guidance on how to architect applications on Amazon Web Services (AWS) to be compliant with the Health Insurance Portability and Accountability Act (HIPAA). It discusses how AWS services can be used to encrypt protected health information (PHI) at rest and in transit. It also addresses how AWS features support HIPAA requirements for auditing, backups, and disaster recovery. The document focuses on encryption capabilities of specific AWS services like Amazon EC2, Amazon S3, Amazon RDS, and how AWS Key Management Service can help manage encryption keys to encrypt PHI.
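A minimal sketch of the encryption-at-rest pattern the document focuses on: writing an object to S3 under a KMS key. The bucket, object key, and KMS alias are placeholders for a PHI store, not names from the whitepaper.

```python
import boto3

s3 = boto3.client("s3")

# Bucket, key, and KMS alias are placeholders for a PHI store.
s3.put_object(
    Bucket="phi-records",
    Key="patients/12345/labs.json",
    Body=b'{"result": "..."}',
    ServerSideEncryption="aws:kms",      # encrypt at rest with KMS
    SSEKMSKeyId="alias/phi-master-key",  # customer-managed key
)
```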
The document discusses various backup and recovery approaches using AWS services. It describes using AWS storage services like S3, Glacier and Storage Gateway for backup data storage. It outlines designing backup solutions for cloud-native, on-premises and hybrid environments. Specifically, it recommends using EBS snapshots to back up cloud-native EC2 instance volumes for block-level recovery of databases and file systems.
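The block-level backup it recommends boils down to snapshotting the EBS volume. A hedged sketch follows; the volume ID is a placeholder, and for databases you would quiesce writes or freeze the filesystem first so the snapshot is consistent.

```python
import boto3

ec2 = boto3.client("ec2")

# The volume ID is a placeholder.
snap = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",
    Description="nightly backup of app data volume",
)
# Tag the snapshot so lifecycle tooling can find and expire it.
ec2.create_tags(
    Resources=[snap["SnapshotId"]],
    Tags=[{"Key": "backup", "Value": "nightly"}],
)
```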
Journey Through the Cloud - Security Best Practices on AWS (Amazon Web Services)
Amazon Web Services (AWS) delivers a scalable cloud computing platform with high availability and dependability, offering flexibility for customers to build a wide range of applications. Helping to protect the security of our customers' content is of utmost importance to AWS, as is maintaining customer trust and confidence. Under the AWS shared responsibility model, AWS provides a secure global infrastructure, including compute, storage, networking and database services, as well as a range of high level services.
AWS provides a range of security services and features that AWS customers can use to secure their content and meet their own specific business requirements for security. This presentation focuses on how you can make use of AWS security features to meet your own organization's security and compliance objectives.
Topics covered include:
• The AWS approach to security and how responsibilities are shared between AWS and our customers
• How to build your own secure virtual private cloud and integrate it with your existing solutions
• How to use AWS Identity and Access Management to securely manage and operate your applications
• Best practices for securing your AWS account, your content and your applications
View a recording of this webinar here: http://youtu.be/Ihe_8o00-WI
The document discusses best practices for the security pillar of the AWS Well-Architected Framework. It covers five areas of security: identity and access management, detective controls, infrastructure protection, data protection, and incident response. For identity and access management, the document emphasizes protecting AWS credentials through practices like multi-factor authentication, fine-grained authorization using IAM roles and policies, and integrating external identity providers. It also stresses automating security practices and protecting data at rest and in transit through encryption and classification.
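Fine-grained authorization of the kind described usually takes the form of a scoped IAM policy. A minimal sketch follows; the policy name and bucket ARN are illustrative assumptions.

```python
import json

import boto3

iam = boto3.client("iam")

# A least-privilege policy scoped to one bucket; names are illustrative.
policy_document = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:PutObject"],
        "Resource": "arn:aws:s3:::app-data/*",
    }],
}

iam.create_policy(
    PolicyName="app-data-readwrite",
    PolicyDocument=json.dumps(policy_document),
)
```

The created policy would then be attached to an IAM role rather than to long-lived user credentials, in line with the role-based practices the document emphasizes.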
This document discusses how AWS features can help customers achieve governance goals around managing IT resources, security, and performance. It outlines common IT governance domains and challenges with on-premise environments. For each domain, it provides the associated AWS features that enable governance, how they provide value, and links to learn more. The features can help customers more easily control costs, access, security and monitor resources compared to traditional on-premise IT environments.
The document discusses best practices for implementing detective controls in AWS, including capturing and analyzing logs and integrating auditing controls with notification and workflow. It recommends customizing AWS CloudTrail delivery to capture API activity globally and centralizing logs for storage and analysis. Services like Amazon GuardDuty and AWS Config can help monitor for threats or detect configuration changes. Notifications and workflows should be integrated to help security teams automatically respond to potential security events.
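The "capture API activity globally and centralize logs" recommendation maps to a multi-region CloudTrail trail delivering to one bucket. A hedged sketch follows; the trail and bucket names are placeholders, and the bucket needs a policy permitting CloudTrail log delivery.

```python
import boto3

cloudtrail = boto3.client("cloudtrail")

# Trail and bucket names are placeholders.
cloudtrail.create_trail(
    Name="org-audit-trail",
    S3BucketName="central-audit-logs",
    IsMultiRegionTrail=True,           # capture API activity in every region
    IncludeGlobalServiceEvents=True,   # IAM, STS, and other global services
)
cloudtrail.start_logging(Name="org-audit-trail")
```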
Updating Security Operations for the Cloud - AWS Symposium 2014 - Washington ... (Amazon Web Services)
Learn how to increase the effectiveness of your security operations as you move to the Cloud. We will discuss how your current incident response, monitoring, and audit response tactics have to change in the Cloud. Drawing from experiences helping clients move to the Cloud, industry research, and the 'school of hard knocks', this talk will help provide practical advice you can apply today. This session is recommended for technical users who want to know how the day-to-day work of securing their on-premises workloads should change when moving to the Cloud.
The AWS Shared Responsibility Model in Practice (Alert Logic)
This document discusses the AWS shared responsibility model and how it divides security responsibilities between AWS and customers. It provides examples of how the responsibilities are divided for different types of AWS services, including infrastructure services, container services, and abstract services. It also promotes the security tools and services available in AWS that can help customers automate security tasks, gain visibility, and protect their infrastructure, data, and applications.
The AWS Shared Responsibility Model in Practice (Alert Logic)
The document discusses security in the cloud with Amazon Web Services (AWS). It highlights that AWS provides tools to automate security, inherit global controls, and scale with visibility and control. It also discusses the shared responsibility model where AWS manages security of the cloud infrastructure and customers manage security in the cloud. Finally, it provides examples of AWS security services for identity and access management, detective controls, infrastructure security, data protection, and incident response.
AWS re:Invent 2016: Common Considerations for Data Integrity Controls in Heal... (Amazon Web Services)
This document provides an overview of common considerations for data integrity controls in healthcare and life sciences. It discusses key concepts like data integrity, applicable regulations, and risks to integrity. It also outlines 10 key data integrity controls and how AWS features and customer guidance can help address them, including risk-based design, access restrictions, audit trails, validation, and senior management responsibilities.
With the growing importance of cloud security, many cloud professionals are choosing a security career. If you are one of them, you should go through these frequently asked AWS security interview questions and answers to land a job in AWS security.
Cloud security is one of the most critical aspects of the cloud today. More evolved threats emerge every day, and qualified cloud security professionals are in short supply. Therefore, a career in AWS cloud security can be a trustworthy choice for many. If you want to pursue a career in AWS security, you may be worried about AWS security interview questions.
https://www.infosectrain.com/blog/top-15-aws-security-interview-questions/
This document provides an introduction to auditing the use of AWS. It discusses key concepts such as the shared responsibility model between AWS and customers, and how auditors can leverage AWS tools and services for auditing. The document also outlines some common AWS assets that may need to be identified and inventoried for auditing purposes, such as EC2 instances, S3 buckets, and more. AWS account identifiers like account IDs and ARNs help uniquely identify resources for auditing.
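An auditor's first inventory pass over two of the asset types mentioned (EC2 instances and S3 buckets) can be scripted with boto3. A minimal sketch follows; note that describe_instances paginates, which is omitted here for brevity.

```python
import boto3

ec2 = boto3.client("ec2")
s3 = boto3.client("s3")

# EC2 instances (pagination omitted for brevity).
for reservation in ec2.describe_instances()["Reservations"]:
    for instance in reservation["Instances"]:
        print("EC2:", instance["InstanceId"], instance["State"]["Name"])

# S3 buckets (list_buckets returns all buckets in the account).
for bucket in s3.list_buckets()["Buckets"]:
    print("S3:", bucket["Name"])
```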
Security Assurance and Governance in AWS (SEC203) | AWS re:Invent 2013 (Amazon Web Services)
With the rapid increase of complexity in managing security for distributed IT and cloud computing, security, and compliance managers can innovate in how to ensure a high level of security is practiced to manage AWS resources. In this session, Chad Woolf, Director of Compliance for AWS will discuss which AWS service features can be leveraged to achieve a high level of security assurance over AWS resources, giving you more control of the security of your data and preparing you for a wide range of audits. Attendees will also learn first-hand what some AWS customers have accomplished by leveraging AWS features to meet specific industry compliance requirements.
AWS provides a range of security services and features that AWS customers can use to secure their content and applications and meet their own specific business requirements for security. This presentation focuses on how you can make use of AWS security features to meet your own organization's security and compliance objectives.
View a recording of the webinar based on this presentation on YouTube here: http://youtu.be/rXPyGDWKHIo
Similar to How to Secure Genomic Data in the Cloud
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf (Paige Cruz)
Monitoring and observability aren’t traditionally found in software curricula, and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is part of our current company’s observability stack.
While the dev and ops silo continues to crumble, many organizations still relegate monitoring and observability to the purview of ops, infra, and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor... (SOFTTECHHUB)
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Pushing the limits of ePRTC: 100ns holdover for 100 days (Adtran)
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0! (SOFTTECHHUB)
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
A tale of scale & speed: How the US Navy is enabling software delivery from l... (sonjaschweigert1)
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATOs (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed to release software to market, along with traditionally slow and manual security checks, has caused gaps in continuous security as an important piece of the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work, along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Full-RAG: A modern architecture for hyper-personalization (Zilliz)
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI (Vladimir Iglovikov, Ph.D.)
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster and former Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024 (Neo4j)
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Removing Uninteresting Bytes in Software Fuzzing (Aftab Hussain)
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speed up fuzzing campaigns by pinpointing and eliminating uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux tools: libxml2's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security-analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are the slides of a talk given at the IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW 2022).
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer's life and facilitate a rapid transition from concept to production-ready applications. He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you... (Zilliz)
Join us to introduce Milvus Lite, a vector database that can run on notebooks and laptops, share the same API with Milvus, and integrate with every popular GenAI framework. This webinar is perfect for developers seeking easy-to-use, well-integrated vector databases for their GenAI apps.
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack (shyamraj55)
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 (Albert Hoitingh)
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
How to Secure Genomic Data in the Cloud
Architecting for Genomic Data Security and Compliance in AWS
Working with Controlled-Access Datasets from dbGaP, GWAS, and Other Individual-Level Genomic Research Repositories
Angel Pizarro
Chris Whalley
December 2014
Table of Contents
Overview
Scope
Considerations for Genomic Data Privacy and Security in Human Research
AWS Approach to Shared Security Responsibilities
Architecting for Compliance with dbGaP Security Best Practices in AWS
- Deployment Model
- Data Location
- Physical Server Access
- Portable Storage Media
- User Accounts, Passwords, and Access Control Lists
- Internet, Networking, and Data Transfers
- Data Encryption
- File Systems and Storage Volumes
- Operating Systems and Applications
- Auditing, Logging, and Monitoring
- Authorizing Access to Data
- Cleaning Up Data and Retaining Results
Conclusion
Overview
Researchers who plan to work with genomic sequence data on Amazon Web Services (AWS) often have questions about security and compliance, specifically about how to meet the guidelines and best practices set by government and grant-funding agencies such as the National Institutes of Health. In this whitepaper, we review the current set of guidelines and discuss which AWS services you can use to meet particular requirements, as well as how to go about evaluating those services.
Scope
This whitepaper focuses on common issues raised by Amazon Web Services (AWS) customers
about security best practices for human genomic data and controlled access datasets, such as
those from National Institutes of Health (NIH) repositories like Database of Genotypes and
Phenotypes (dbGaP) and genome-wide association studies (GWAS). Our intention is to provide
you with helpful guidance that you can use to address common privacy and security
requirements. However, we caution you not to rely on this whitepaper as legal advice for your
specific use of AWS. We strongly encourage you to obtain appropriate compliance advice about
your specific data privacy and security requirements, as well as applicable laws relevant to your
human research projects and datasets.
Considerations for Genomic Data Privacy and Security in Human Research
Research involving individual-level genotype and phenotype data and de-identified controlled
access datasets continues to increase. The data has grown so fast in volume and utility that the
availability of adequate data processing, storage, and security technologies has become a
critical constraint on genomic research. The global research community is recognizing the
practical benefits of the AWS cloud, and scientific investigators, institutional signing officials, IT
directors, ethics committees, and data access committees must answer privacy and security
questions as they evaluate the use of AWS in connection with individual-level genomic data and
other controlled access datasets. Some common questions include: Are data protected on
secure servers? Where are data located? How is access to data controlled? Are data
protections appropriate for the Data Use Certification?
These considerations are not new and are not cloud-specific. Whether data reside in an
investigator lab, an institutional network, an agency-hosted data repository or within the AWS
cloud, the essential considerations for human genomic data are the same. You must correctly
implement data protection and security controls in the system by first defining the system
requirements and then architecting the security controls to meet those requirements, paying particular attention to the shared responsibilities among the parties who use and maintain the system.
AWS Approach to Shared Security Responsibilities
AWS delivers a robust web services platform with features that enable research teams around
the world to create and control their own private area in the AWS cloud so they can quickly
build, install, and use their data analysis applications and data stores without having to
purchase or maintain the necessary hardware and facilities. As a researcher, you can create
your private AWS environment yourself using a self-service signup process that establishes a
unique AWS account ID, creates a root user account and account ID, and provides you with
access to the AWS Management Console and Application Programming Interfaces (APIs),
allowing control and management of the private AWS environment.
Because AWS does not access or manage your private AWS environment or the data in it, you
retain responsibility and accountability for the configuration and security controls you implement
in your AWS account. This customer accountability for your private AWS environment is
fundamental to understanding the respective roles of AWS and our customers in the context of
data protections and security practices for human genomic data. Figure 1 depicts the AWS
Shared Responsibility Model.
Figure 1 - Shared Responsibility Model
In order to deliver and maintain the features available within every customer’s private AWS
environment, AWS works vigorously to enhance the security features of the platform and ensure
that the feature delivery operations are secure and of high quality. AWS defines quality and
security as confidentiality, integrity, and availability of our services, and AWS seeks to provide
researchers with visibility and assurance of our quality and security practices in four important
ways.
First, AWS infrastructure is designed and managed in alignment with a set of internationally
recognized security and quality accreditations, standards and best-practices, including industry
standards ISO 27001, ISO 9001, AT 801 and 101 (formerly SSAE 16), as well as government
standards NIST, FISMA, and FedRAMP. Independent third parties perform accreditation
assessments of AWS. These third parties are auditing experts in cloud computing environments
and each brings a unique perspective from their compliance backgrounds in a wide range of
industries including healthcare, life sciences, financial services, government and defense, and
others. Because each accreditation carries a unique audit schedule, including continuous
monitoring, AWS security and quality controls are constantly audited and improved for the
benefit of all AWS customers including those with dbGaP, HIPAA, and other health data
protection requirements.
Second, AWS provides transparency by making these ISO, SOC, FedRAMP and other
compliance reports available to customers upon request. Customers can use these reports to
evaluate AWS for their particular needs. You can request AWS compliance reports at
https://aws.amazon.com/compliance/contact, and you can find more information on AWS
compliance certifications, customer case studies, and alignment with best practices and
standards at the AWS compliance website, http://aws.amazon.com/compliance/.
Third, as a controlled U.S. subsidiary of Amazon.com Inc., Amazon Web Services Inc.
participates in the Safe Harbor programs developed by the U.S. Department of Commerce with the European Union and Switzerland. Amazon.com and its controlled U.S. subsidiaries have certified that they adhere to the Safe Harbor Privacy Principles agreed upon by the U.S. with the E.U. and with Switzerland. You can view the Safe Harbor certification
for Amazon.com and its controlled U.S. subsidiaries on the U.S. Department of Commerce’s
Safe Harbor website. The Safe Harbor Principles require Amazon and its controlled U.S.
subsidiaries to take reasonable precautions to protect the personal information that our
customers give us in order to create their account. This certification is an illustration of our
dedication to security, privacy, and customer trust.
Lastly, AWS respects the rights of our customers to have a choice in their use of the AWS
platform. The AWS Account Management Console and Customer Agreement are designed to
ensure that every customer can stop using the AWS platform and export all their data at any
time and for any reason. This not only helps customers maintain control of their private AWS
environment from creation to deletion, but it also ensures that AWS must continuously work to
earn and keep the trust of our customers.
Architecting for Compliance with dbGaP Security Best Practices in AWS
A primary principle of the dbGaP security best practices is that researchers should download data to a secure computer or server, not to unsecured network drives or servers.1 The remainder of the dbGaP security best practices can be broken into a set of three IT security control domains that you must address to ensure that you meet the primary principle:

1 http://www.ncbi.nlm.nih.gov/projects/gap/pdf/dbgap_2b_security_procedures.pdf
Physical Security refers to both physical access to resources, whether they are located in
a data center or in your desk drawer, and to remote administrative access to the underlying
computational resources.
Electronic Security refers to configuration and use of networks, servers, operating
systems, and application-level resources that hold and analyze dbGaP data.
Data Access Security refers to managing user authentication and authorization of access
to the data, how copies of the data are tracked and managed, and having policies and
processes in place to manage the data lifecycle.
Within each of these control domains are a number of control areas, which are summarized in
Table 1.
Table 1 - Summary of dbGaP Security Best Practices

Control Domain: Physical Security
- Deployment Model
- Data Location
- Physical Server Access
- Portable Storage Media

Control Domain: Electronic Security
- User Accounts, Passwords, and Access Control Lists
- Internet, Networking, and Data Transfers
- Data Encryption
- File Systems and Storage Volumes
- Operating Systems and Applications
- Auditing, Logging, and Monitoring

Control Domain: Data Access Security
- Authorizing Access to Data
- Cleaning Up Data and Retaining Results
The remainder of this paper focuses on the control areas involved in architecting for security
and compliance in AWS.
Deployment Model
A basic architectural consideration for dbGaP compliance in AWS is determining whether the
system will run entirely on AWS or as a hybrid deployment with a mix of AWS and non-AWS
resources. This paper focuses on the control areas for the AWS resources. If you are
architecting for hybrid deployments, you must also account for your non-AWS resources, such
as the local workstations you might download data to and from your AWS environment, any
institutional or external networks you connect to your AWS environment, or any third-party
applications you purchase and install in your AWS environment.
Data Location
The AWS cloud is a globally available platform in which you can choose the geographic region
in which your data is located. AWS data centers are built in clusters in various global regions.
AWS calls these data center clusters Availability Zones (AZs). As of December 2014, AWS
maintains 28 AZs organized into 11 regions globally. As an AWS customer you can choose to
use one region, all regions, or any combination of regions using built-in features available within
the AWS Management Console.
AWS regions and Availability Zones ensure that if you have location-specific requirements or
regional data privacy policies, you can establish and maintain your private AWS environment in
the appropriate location. You can choose to replicate and back up content in more than one
region, but you can be assured that AWS does not move customer data outside the region(s)
you configure.
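As a concrete illustration, the following minimal sketch (using the AWS SDK for Python, boto3; the bucket name and region are hypothetical) pins an Amazon S3 bucket to a single region at creation time:

```python
import boto3

# Create an S3 bucket pinned to a specific region (eu-west-1 here).
# Objects stored in it stay in that region unless you copy them out.
s3 = boto3.client("s3", region_name="eu-west-1")
s3.create_bucket(
    Bucket="example-dbgap-study-data",  # hypothetical bucket name
    CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},
)

# Verify where the bucket lives.
location = s3.get_bucket_location(Bucket="example-dbgap-study-data")
print(location["LocationConstraint"])  # -> "eu-west-1"
```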
Physical Server Access
Unlike traditional laboratory or institutional server systems where researchers install and control
their applications and data directly on a specific physical server, the applications and data in a
private AWS account are decoupled from a specific physical server. This decoupling occurs
through the built-in features of the AWS Foundation Services layer (see Figure 1 - Shared
Responsibility Model) and is a key attribute that differentiates the AWS cloud from traditional
server systems or even traditional server virtualization. Practically, this means that every
resource (virtual servers, firewalls, databases, genomic data, etc.) within your private AWS
environment is reduced to a single set of software files that are orchestrated by the
Foundational Services layer across multiple physical servers. Even if a physical server fails,
your private AWS resources and data maintain confidentiality, integrity, and availability. This
attribute of the AWS cloud also adds a significant measure of security, because even if
someone were to gain access to a single physical server, they would not have access to all the
files needed to recreate the genomic data within your private AWS account.
AWS owns and operates its physical servers and network hardware in highly-secure, state-of-
the-art data centers that are included in the scope of independent third-party security
assessments of AWS for ISO 27001, Service Organization Controls 2 (SOC 2), NIST’s federal
information system security standards, and other security accreditations. Physical access to
AWS data centers and hardware is based on the least privilege principle, and access is
authorized only for essential personnel who have experience in cloud computing operating
environments and who are required to maintain the physical environment. When individuals are
authorized to access a data center they are not given logical access to the servers within the
data center. When anyone with data center access no longer has a legitimate need for it,
access is immediately revoked even if they remain an employee of Amazon or Amazon Web
Services.
Physical entry into AWS data centers is controlled at the building perimeter and ingress points
by professional security staff who use video surveillance, intrusion detection systems, and other
electronic means. Authorized staff must pass two-factor authentication a minimum of two times
to enter data center floors, and all physical access to AWS data centers is logged, monitored,
and audited routinely.
Portable Storage Media
The decision to run entirely on AWS or in a hybrid deployment model has an impact on your
system security plans for portable storage media. Whenever data are downloaded to a portable
device, such as a laptop or smartphone, the data should be encrypted and hardcopy printouts
controlled. When genomic data are stored or processed in AWS, customers can encrypt their
data, but there is no portable storage media to consider because all AWS customer data resides
on controlled storage media covered under AWS’s accredited security practices. When
controlled storage media reach the end of their useful life, AWS procedures include a
decommissioning and media sanitization process that is designed to prevent customer data
from being exposed to unauthorized individuals. AWS uses the techniques detailed in DoD
5220.22-M (“National Industrial Security Program Operating Manual”) or NIST 800-88
(“Guidelines for Media Sanitization”) to destroy data as part of the decommissioning process. All
decommissioned magnetic storage devices are degaussed and physically destroyed in
accordance with industry-standard practices.
For more information, see Overview of Security Processes.2
User Accounts, Passwords, and Access Control Lists
Managing user access under dbGaP requirements relies on the principle of least privilege, ensuring that individuals and/or processes are granted only the rights and permissions needed to perform their assigned tasks and functions, but no more.3 When you use AWS, there are two types of user accounts that you must address:
- Accounts with direct access to AWS resources, and
- Accounts at the operating system or application level.
Managing user accounts with direct access to AWS resources is centralized in a service called
AWS Identity and Access Management (IAM). After you establish your root AWS account using
the self-service signup process, you can use IAM to create and manage additional users and
groups within your private AWS environment. In adherence to the least privilege principle, new
users and groups have no permissions by default until you associate them with an IAM policy.
IAM policies allow access to AWS resources and support fine-grained permissions allowing
operation-specific access to AWS resources. For example, you can define an IAM policy that
restricts an Amazon S3 bucket to read-only access by specific IAM users coming from specific
IP addresses. In addition to the users you define within your private AWS environment, you can
define IAM roles to grant temporary credentials for use by externally authenticated users or
applications running on Amazon EC2 servers.
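As a sketch of the bucket example above (the user name, bucket name, and IP range are hypothetical, and the policy is illustrative rather than a vetted template), an IAM user policy granting read-only access to one bucket, only from a specific source IP range, might look like this in boto3:

```python
import json
import boto3

iam = boto3.client("iam")

# Read-only access to one bucket, allowed only from the lab's IP range.
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-dbgap-study-data",
                "arn:aws:s3:::example-dbgap-study-data/*",
            ],
            "Condition": {"IpAddress": {"aws:SourceIp": "192.0.2.0/24"}},
        }
    ],
}

iam.put_user_policy(
    UserName="research-collaborator",  # hypothetical IAM user
    PolicyName="dbgap-bucket-read-only",
    PolicyDocument=json.dumps(policy_document),
)
```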
Within IAM, you can assign users individual credentials such as passwords or access keys.
Multi-factor authentication (MFA) provides an extra level of user account security by prompting
users to enter an additional authentication code each time they log in to AWS. dbGaP also
requires that users not share their passwords and recommends that researchers communicate
a written password policy to any users with permissions to controlled access data. Additionally,
dbGaP recommends certain password complexity rules for file access. IAM provides robust
features to manage password complexity, reuse, and reset rules.
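For example, a minimal sketch (boto3; the specific complexity values are placeholders, not dbGaP-mandated numbers) that sets account-wide password complexity, expiry, and reuse rules:

```python
import boto3

iam = boto3.client("iam")

# Account-wide password policy: complexity, expiry, and reuse rules.
iam.update_account_password_policy(
    MinimumPasswordLength=12,
    RequireSymbols=True,
    RequireNumbers=True,
    RequireUppercaseCharacters=True,
    RequireLowercaseCharacters=True,
    MaxPasswordAge=90,          # force a password reset every 90 days
    PasswordReusePrevention=5,  # disallow reuse of the last 5 passwords
)
```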
How you manage user accounts at the operating system or application level depends largely on
which operating systems and applications you choose. For example, applications developed
specifically for the AWS cloud might leverage IAM users and groups, whereas you'll need to
assess and plan the compatibility of third-party applications and operating systems with IAM on
a case-by-case basis. You should always configure password-enabled screen savers on any local workstations that you use to access your private AWS environment, and configure virtual server instances within the AWS cloud environment with OS-level password-enabled screen savers to provide an additional layer of protection.

2 http://media.amazonwebservices.com/pdf/AWS_Security_Whitepaper.pdf
3 http://www.ncbi.nlm.nih.gov/projects/gap/pdf/dbgap_2b_security_procedures.pdf
More information on IAM is available in the IAM documentation and IAM Best Practices guide,
as well as on the Multi-Factor Authentication page.
Internet, Networking, and Data Transfers
The AWS cloud is a set of web services delivered over the Internet, but data within each
customer’s private AWS account is not exposed directly to the Internet unless you specifically
configure your security features to allow it. This is a critical element of compliance with dbGaP
security best practices, and the AWS cloud has a number of built-in features that prevent direct
Internet exposure of genomic data.
Processing genomic data in AWS typically involves the Amazon Elastic Compute Cloud
(Amazon EC2). Amazon EC2 is a service you can use to create virtual server instances that run
operating systems like Linux and Microsoft Windows. When you create new Amazon EC2
instances for downloading and processing genomic data, by default those instances are
accessible only by authorized users within the private AWS account. The instances are not
discoverable or directly accessible on the Internet unless you configure them otherwise.
Additionally, genomic data within an Amazon EC2 instance resides in the operating system’s file
directory, which requires that you set OS-specific configurations before any data can be
accessible outside of the instance. When you need clusters of Amazon EC2 instances to
process large volumes of data, a Hadoop framework service called Amazon Elastic MapReduce
(Amazon EMR) allows you to create multiple, identical Amazon EC2 instances that follow the
same basic rule of least privilege unless you change the configuration otherwise.
Storing genomic data in AWS typically involves object stores and file systems like Amazon
Simple Storage Service (Amazon S3) and Amazon Elastic Block Store (Amazon EBS), as well
as database stores like Amazon Relational Database Service (Amazon RDS), Amazon Redshift,
Amazon DynamoDB, and Amazon ElastiCache. Like Amazon EC2, all of these storage and
database services default to least privilege access and are not discoverable or directly
accessible from the Internet unless you configure them to be so.
Individual compute instances and storage volumes are the basic building blocks that
researchers use to architect and build genomic data processing systems in AWS. Individually
these building blocks are private by default and networking them together within the AWS
environment can provide additional layers of security and data protections. Using Amazon
Virtual Private Cloud (Amazon VPC), you can create private, isolated networks within the AWS
cloud where you retain complete control over the virtual network environment, including
definition of the IP address range, creation of subnets, and configuration of network route tables
and network gateways. Amazon VPC also offers stateless firewall capabilities through the use
of Network Access Control Lists (NACLs) that control the source and destination network traffic
endpoints and ports, giving you robust security controls that are independent of the
computational resources launched within Amazon VPC subnets. In addition to the stateless
firewalling capabilities of Amazon VPC NACLs, Amazon EC2 instances and some services are
launched within the context of AWS Security Groups. Security groups define network-level
stateful firewall rules to protect computational resources at the Amazon EC2 instance or service
layer level. Using security groups, you can lock down compute, storage, or application services
to strict subsets of resources running within an Amazon VPC subnet, adhering to the principle of
least privilege.
Figure 2 - Protecting data from direct Internet access using Amazon VPC
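As a sketch of the security group controls described above (boto3; the VPC ID and the institutional CIDR range are hypothetical), you might lock SSH access down to a single trusted network range:

```python
import boto3

ec2 = boto3.client("ec2")

# Stateful firewall rule: allow inbound SSH only from the institution's network.
sg = ec2.create_security_group(
    GroupName="dbgap-analysis-ssh",
    Description="SSH access restricted to the institutional network",
    VpcId="vpc-0123456789abcdef0",  # hypothetical VPC ID
)
ec2.authorize_security_group_ingress(
    GroupId=sg["GroupId"],
    IpPermissions=[
        {
            "IpProtocol": "tcp",
            "FromPort": 22,
            "ToPort": 22,
            "IpRanges": [{"CidrIp": "198.51.100.0/24"}],  # institutional range
        }
    ],
)
```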
In addition to networking and securing the virtual infrastructure within the AWS cloud, Amazon
VPC provides several options for connecting to your AWS resources. The first and simplest
option is providing secure public endpoints to access resources, such as SSH bastion servers.
A second option is to create a secure Virtual Private Network (VPN) connection that uses
Internet Protocol Security (IPSec) by defining a virtual private gateway into the Amazon VPC.
You can use the connection to establish encrypted network connectivity over the Internet
between an Amazon VPC and your institutional network.
Lastly, research institutions can establish a dedicated and private network connection to AWS
using AWS Direct Connect. AWS Direct Connect lets you establish a dedicated, high-bandwidth
(1 Gbps to 10 Gbps) network connection between your network and one of the AWS Direct
Connect locations. Using industry standard 802.1q VLANs, this dedicated connection can be
partitioned into multiple virtual interfaces, allowing you to use the same connection to access
public resources such as objects stored in Amazon S3 using public IP address space, and
private resources such as Amazon EC2 instances running within an Amazon Virtual Private
Cloud (Amazon VPC) using private IP space, while maintaining network separation between the
public and private environments. You can reconfigure virtual interfaces at any time to meet your
changing needs.
Figure 2 callouts: (1) dbGaP data in an Amazon S3 bucket, accessible only by the Amazon EC2 instance within the VPC security group; (2) an Amazon EC2 instance hosting the Aspera Connect download software, running within the VPC security group; (3) an Amazon VPC network configured with a private subnet requiring an SSH client, VPN gateway, or other encrypted connection.
Using a combination of hosted and self-managed services, you can take advantage of secure,
robust networking services within a VPC and secure connectivity with another trusted network.
To learn more about the finer details, see our Amazon VPC whitepaper, the Amazon VPC
documentation, and the Amazon VPC Connectivity Options Whitepaper.
Data Encryption
Encrypting data in-transit and at rest is one of the most common methods of securing controlled
access datasets. As an Internet-based service provider, AWS understands that many
institutional IT security policies consider the Internet to be an insecure communications medium
and, consequently, AWS has invested considerable effort in the security and encryption features
you need in order to use the AWS cloud platform for highly sensitive data, including protected
health information under HIPAA and controlled access genomic datasets from the National
Institutes of Health (NIH). AWS uses encryption in three areas:
Service management traffic
Data within AWS services
Hardware security modules
As an AWS customer, you use the AWS Management Console to manage and configure your
private environment. Each time you use the AWS Management Console an SSL/TLS4
connection is made between your web browser and the console endpoints. Service
management traffic is encrypted, data integrity is authenticated, and the client browser
authenticates the identity of the console service endpoint using an X.509 certificate. After this
encrypted connection is established, all subsequent HTTP traffic, including data in transit over
the Internet, is protected within the SSL/TLS session. Each AWS service is also enabled with
application programming interfaces (APIs) that you can use to manage services directly from applications or third-party tools, via Software Development Kits (SDKs), or via the AWS command line tools. AWS APIs are web services over HTTPS and protect commands within an SSL/TLS-encrypted session.
Within AWS there are several options for encrypting genomic data, ranging from completely
automated AWS encryption solutions (server-side) to manual, client-side options. Your decision
to use a particular encryption model may be based on a variety of factors, including the AWS
service(s) being used, your institutional policies, your technical capability, specific requirements
of the data use certificate, and other factors. As you architect your systems for controlled access
datasets, it’s important to identify each AWS service and encryption model you will use with the
genomic data. There are three different models for how you and/or AWS provide the encryption
method and work with the key management infrastructure (KMI), as illustrated in Figure 3.
4 Secure Sockets Layer (SSL)/Transport Layer Security (TLS)
Model A (customer managed): The researcher manages the encryption method and the entire KMI.

Model B (customer managed, with AWS-provided storage): The researcher manages the encryption method; AWS provides the storage component of the KMI while the researcher provides the management layer of the KMI.

Model C (AWS managed): AWS manages the encryption method and the entire KMI.

Figure 3 - Encryption Models in AWS
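As an illustration of Model C (a minimal boto3 sketch; the bucket and object names are hypothetical), uploading an object with fully AWS-managed server-side encryption takes a single parameter:

```python
import boto3

s3 = boto3.client("s3")

# Model C: AWS manages the encryption method and the entire KMI.
# SSE-KMS encrypts the object at rest under an AWS-managed key.
with open("aligned.bam", "rb") as data:
    s3.put_object(
        Bucket="example-dbgap-study-data",
        Key="sample-001/aligned.bam",
        Body=data,
        ServerSideEncryption="aws:kms",
    )
```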
In addition to the client-side and server-side encryption features built-in to many AWS services,
another common way to protect keys in a KMI is to use a dedicated storage and data
processing device that performs cryptographic operations using keys on the devices. These
devices, called hardware security modules (HSMs), typically provide tamper evidence or
resistance to protect keys from unauthorized use. If you choose to use AWS encryption capabilities for your controlled access datasets, the AWS CloudHSM service is another encryption option within your AWS environment, giving you use of HSMs that are designed and validated to government standards (NIST FIPS 140-2) for secure key management.
If you want to manage the keys that control encryption of data in Amazon S3 and Amazon EBS
volumes, but don’t want to manage the needed KMI resources either within or external to AWS,
you can leverage the AWS Key Management Service (AWS KMS). AWS Key Management
Service is a managed service that makes it easy for you to create and control the encryption
keys used to encrypt your data, and uses HSMs to protect the security of your keys. AWS Key
Management Service is integrated with other AWS services including Amazon EBS, Amazon
S3, and Amazon Redshift. AWS Key Management Service is also integrated with AWS
CloudTrail, discussed later, to provide you with logs of all key usage to help meet your
regulatory and compliance needs. AWS KMS also allows you to implement key creation,
rotation, and usage policies. AWS KMS is designed so that no one has access to your master
keys. The service is built on systems that are designed to protect your master keys with
extensive hardening techniques such as never storing plaintext master keys on disk, not
persisting them in memory, and limiting which systems can connect to the device. All access to
update software on the service is controlled by a multi-level approval process that is audited and
reviewed by an independent group within Amazon.
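A minimal sketch of those key lifecycle capabilities in boto3 (the key description and the sample plaintext are illustrative):

```python
import boto3

kms = boto3.client("kms")

# Create a customer master key (CMK) and enable automatic key rotation.
key = kms.create_key(Description="CMK for controlled-access genomic data")
key_id = key["KeyMetadata"]["KeyId"]
kms.enable_key_rotation(KeyId=key_id)

# Encrypt and decrypt a small secret directly with the CMK; every use of
# the key is recorded by AWS CloudTrail for auditing.
ciphertext = kms.encrypt(KeyId=key_id, Plaintext=b"download token")
plaintext = kms.decrypt(CiphertextBlob=ciphertext["CiphertextBlob"])
assert plaintext["Plaintext"] == b"download token"
```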
As mentioned in the Internet, Networking, and Data Transfers section of this paper, you can protect data transfers between your AWS environment and an external network with a number of encryption-ready security features, such as VPN.
For more information about encryption options within the AWS environment, see Securing Data at Rest with Encryption, as well as the AWS CloudHSM product details page. To learn more about how AWS KMS works, you can read the AWS Key Management Service whitepaper.5
File Systems and Storage Volumes
Analyzing and securing large datasets like whole genome sequences requires a variety of
storage capabilities that allow you to make use of that data. Within your private AWS account,
you can configure your storage services and security features to limit access to authorized
users. Additionally, when research collaborators are authorized to access the data, you can
configure your access controls to safely share data between your private AWS account and
your collaborator’s private AWS account.
When saving and securing data within your private AWS account, you have several options.
Amazon Web Services offers two flexible and powerful storage options. The first is Amazon
Simple Storage Service (Amazon S3), a highly scalable web-based object store. Amazon S3
provides HTTP/HTTPS REST endpoints to upload and download data objects in an Amazon S3
bucket. Individual Amazon S3 objects can range from 1 byte to 5 terabytes. Amazon S3 is
designed for 99.99% availability and 99.999999999% object durability; it thus provides a highly durable storage infrastructure for mission-critical and primary data
storage. The service redundantly stores data in multiple data centers within the Region you
designate, and Amazon S3 calculates checksums on all network traffic to detect corruption of
data packets when storing or retrieving data. Unlike traditional systems, which can require
laborious data verification and manual repair, Amazon S3 performs regular, systematic data
integrity checks and is built to be automatically self-healing.
Amazon S3 provides a base level of security whereby, by default, only bucket and object owners have access to the Amazon S3 resources they create. In addition, you can write security policies to further restrict access to Amazon S3 objects. For example, dbGaP recommendations call for all data to be encrypted while in flight. With an Amazon S3 bucket policy, you can restrict a bucket so that it accepts only requests made over the secure HTTPS protocol, which fulfills this requirement (see the sketch below). Amazon S3 bucket policies are best used to define broad permissions across sets of objects within a single bucket; the examples of restricting the allowed protocols or source IP ranges are indicative of best practices. For data that need more variable permissions based on who is trying to access them, IAM user policies are more appropriate. As discussed previously, IAM enables organizations with multiple employees to create and manage multiple users under a single AWS account. With IAM user policies, you can grant these IAM users fine-grained control over your Amazon S3 bucket or the data objects contained within it.
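A sketch of such a bucket policy in boto3 (the bucket name is hypothetical): any request that does not arrive over HTTPS is denied outright.

```python
import json
import boto3

s3 = boto3.client("s3")

# Deny any request to the bucket that does not use HTTPS, so data are
# always encrypted in flight.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::example-dbgap-study-data",
                "arn:aws:s3:::example-dbgap-study-data/*",
            ],
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }
    ],
}

s3.put_bucket_policy(
    Bucket="example-dbgap-study-data",
    Policy=json.dumps(policy),
)
```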
Amazon S3 is a great tool for genomics analysis and is well suited for analytical applications that are purpose-built for the cloud. However, many legacy genomic algorithms and applications cannot work directly with files stored in an HTTP-based object store like Amazon S3, but rather need a traditional file system. In contrast to the Amazon S3 object-based storage approach,
5 https://d0.awsstatic.com/whitepapers/KMS-Cryptographic-Details.pdf
Amazon Elastic Block Store (Amazon EBS) provides network-attached storage volumes that
can be formatted with traditional file systems. This means that a legacy application running in an
Amazon EC2 instance can access genomic data in an Amazon EBS volume as if that data were
stored locally in the Amazon EC2 instance. Additionally, Amazon EBS offers whole-volume
encryption without the need for you to build, maintain, and secure your own key management
infrastructure. When you create an encrypted Amazon EBS volume and attach it to a supported
instance type, data stored at rest on the volume, disk I/O, and snapshots created from the
volume are all encrypted. The encryption occurs on the servers that host Amazon EC2
instances, providing encryption of data-in-transit from Amazon EC2 instances to Amazon EBS
storage. Amazon EBS encryption uses AWS Key Management Service (AWS KMS) Customer
Master Keys (CMKs) when creating encrypted volumes and any snapshots created from your
encrypted volumes. The first time you create an encrypted Amazon EBS volume in a region, a
default CMK is created for you automatically. This key is used for Amazon EBS encryption
unless you select a CMK that you created separately using AWS Key Management Service.
Creating your own CMK gives you more flexibility, including the ability to create, rotate, disable,
define access controls, and audit the encryption keys used to protect your data. For more
information, see the AWS Key Management Service Developer Guide.
There are three options for Amazon EBS volumes:
- Magnetic volumes are backed by magnetic drives and are ideal for workloads where data are accessed infrequently and for scenarios where the lowest storage cost is important.
- General Purpose (SSD) volumes are backed by solid-state drives (SSDs) and are suitable for a broad range of workloads, including small- to medium-sized databases, development and test environments, and boot volumes.
- Provisioned IOPS (SSD) volumes are also backed by SSDs and are designed for applications with I/O-intensive workloads such as databases. Provisioned IOPS volumes offer consistent, low-latency performance and support up to 30 IOPS per GB, which enables you to provision 4,000 IOPS on a volume as small as 134 GB. You can also achieve up to 128 MBps of throughput per volume with as few as 500 provisioned IOPS. Additionally, you can stripe multiple volumes together to achieve up to 48,000 IOPS or 800 MBps when attached to larger Amazon EC2 instances.
While general-purpose Amazon EBS volumes represent a great value in terms of performance
and cost, and can support a diverse set of genomics applications, you should choose which
Amazon EBS volume type to use based on the particular algorithm you're going to run. A benefit
of scalable on-demand infrastructure is that you can provision a diverse set of resources, each
tuned to a particular workload.
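Pulling the encryption and volume-type choices together, here is a minimal boto3 sketch (the Availability Zone, size, and key alias are placeholders) that creates an encrypted Provisioned IOPS volume under a customer-created CMK:

```python
import boto3

ec2 = boto3.client("ec2")

# Encrypted Provisioned IOPS (io1) volume: data at rest, disk I/O, and
# snapshots are all encrypted under the specified KMS master key.
volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",
    Size=500,                       # GB
    VolumeType="io1",
    Iops=4000,                      # within the 30 IOPS-per-GB limit
    Encrypted=True,
    KmsKeyId="alias/genomics-ebs",  # hypothetical CMK alias
)
print(volume["VolumeId"])
```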
For more information on the security features available in Amazon S3, see the Access Control
and Using Data Encryption topics in the Amazon S3 Developer Guide. For an overview on
security on AWS, including Amazon S3, see Amazon Web Services: Overview of Security
Processes. For more information about Amazon EBS security features, see Amazon EBS
Encryption and Amazon Elastic Block Store (Amazon EBS).
Operating Systems and Applications
Recipients of controlled-access data need their operating systems and applications to follow
predefined configuration standards. Operating systems should align with standards, such as
NIST 800-53, dbGaP Security Best Practices Appendix A, or other regionally accepted criteria.
Software should also be configured according to application-specific best practices, and OS and
software patches should be kept up-to-date. When you run operating systems and applications
in AWS, you are responsible for configuring and maintaining your operating systems and
applications, as well as the feature configurations in the associated AWS services such as
Amazon EC2 and Amazon S3.
As a concrete example, imagine that a security vulnerability in the standard SSL/TLS shared
library is discovered. In this scenario, AWS will review and remediate the vulnerability in the
foundation services (see Figure 1), and you will review and remediate the operating systems
and applications, as well as any service configuration updates needed for hybrid deployments.
You must also take care to properly configure the OS and applications to restrict remote access
to the instances and applications. Examples include locking down security groups to only allow
SSH or RDP from certain IP ranges, ensuring strong password or other authentication policies,
and restricting user administrative rights on OS and applications.
Auditing, Logging, and Monitoring
Researchers who manage controlled access data are required to report, in accordance with the terms of the Data Use Certification, any inadvertent data release, breach of data security, or other data management incident contrary to the terms of data access.
The dbGaP security best practices recommend the use of security auditing and intrusion detection software that regularly scans for and detects potential data intrusions. Within the AWS ecosystem, you have the option to use built-in monitoring tools, such as Amazon CloudWatch, as well as a rich partner ecosystem of security and monitoring software built specifically for AWS cloud services. The AWS Partner Network lists a variety of system integrators and software vendors that can help you meet security and compliance requirements. For more information, see the AWS Life Sciences Partner webpage.6
Amazon CloudWatch is a monitoring service for AWS cloud resources and the applications you run on AWS. You can use Amazon CloudWatch to collect and track metrics, collect and monitor log files, and set alarms. Amazon CloudWatch provides performance metrics at the individual resource level, such as Amazon EC2 instance CPU load and network I/O, and lets you set thresholds on these metrics that raise alarms when a threshold is crossed. For example, you can set an alarm to detect unusual spikes in network traffic from an Amazon EC2 instance that may indicate a compromised system (see the sketch below). CloudWatch alarms can integrate with other AWS services to send alerts simultaneously to multiple destinations. Example destinations include a message queue in Amazon Simple Queue Service (Amazon SQS) that is continuously monitored by watchdog processes that automatically quarantine a system; a mobile text message to security and operations staff who need to react to immediate threats; and an email to security and compliance teams who audit the event and take action as needed.
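As a sketch (boto3; the instance ID, threshold, and SNS topic are hypothetical), an alarm that flags unusually high outbound network traffic from an instance and notifies an Amazon SNS topic:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when an instance's outbound traffic exceeds a threshold, which
# could indicate a compromised system exfiltrating data.
cloudwatch.put_metric_alarm(
    AlarmName="unusual-network-egress",
    Namespace="AWS/EC2",
    MetricName="NetworkOut",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    Statistic="Sum",
    Period=300,                # evaluate 5-minute windows
    EvaluationPeriods=1,
    Threshold=5 * 1024 ** 3,   # 5 GB per window (placeholder threshold)
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:111122223333:security-alerts"],
)
```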
Within Amazon CloudWatch you can also define custom metrics and populate these with
whatever information is useful, even outside of a security and compliance requirement. For
instance, an Amazon CloudWatch metric can monitor the size of a data ingest queue to trigger the scaling up (or down) of the computational resources that process the data, handling variable rates of data acquisition.

6 http://aws.amazon.com/partners/competencies/life-sciences/
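A minimal sketch of publishing such a custom metric (the namespace and metric names are made up for illustration):

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Publish the current depth of a data-ingest queue as a custom metric;
# an alarm on this metric can then scale processing resources up or down.
cloudwatch.put_metric_data(
    Namespace="GenomicsPipeline",
    MetricData=[
        {
            "MetricName": "IngestQueueDepth",
            "Value": 42,   # e.g., messages currently waiting
            "Unit": "Count",
        }
    ],
)
```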
AWS CloudTrail and AWS Config are two services that enable you to monitor and audit all of
the operations performed against the AWS product APIs. AWS CloudTrail is a web service that records
AWS API calls for your account and delivers log files to you. The recorded information includes
the identity of the API caller, the time of the API call, the source IP address of the API caller, the
request parameters, and the response elements returned by the AWS service. With AWS
CloudTrail, you can get a history of AWS API calls for your account, including API calls made
via the AWS Management Console, AWS SDKs, command line tools, and higher-level AWS
services (such as AWS CloudFormation). The AWS API call history produced by AWS
CloudTrail enables security analysis, resource change tracking, and compliance auditing.
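For example, a hedged sketch (boto3; the event-name filter is illustrative) that pulls recent API activity from CloudTrail for security review:

```python
from datetime import datetime, timedelta

import boto3

cloudtrail = boto3.client("cloudtrail")

# Review who created IAM users in the last 24 hours.
events = cloudtrail.lookup_events(
    LookupAttributes=[
        {"AttributeKey": "EventName", "AttributeValue": "CreateUser"}
    ],
    StartTime=datetime.utcnow() - timedelta(days=1),
    EndTime=datetime.utcnow(),
)
for event in events["Events"]:
    print(event["EventTime"], event.get("Username"), event["EventName"])
```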
AWS Config builds upon the functionality of AWS CloudTrail, and provides you with an AWS
resource inventory, configuration history, and configuration change notifications to enable
security and governance. With AWS Config you can discover existing AWS resources, export a
complete inventory of your AWS resources with all configuration details, and determine how a
resource was configured at any point in time. These capabilities enable compliance auditing,
security analysis, resource change tracking, and troubleshooting.
Lastly, AWS has implemented various methods of external communication to support all customers in the event of security or operational issues that may impact them.
Mechanisms are in place to allow the customer support team to be notified of operational and
security issues that impact each customer’s account. The AWS incident management team
employs industry-standard diagnostic procedures to drive resolution during business-impacting
events within the AWS cloud platform. The operational systems that support the platform are
extensively instrumented to monitor key operational metrics, and alarms are configured to
automatically notify operations and management personnel when early warning thresholds are
crossed on those key metrics. Staff operators provide 24 x 7 x 365 coverage to detect incidents
and to manage their impact and resolution. An on-call schedule is used so that personnel are
always available to respond to operational issues.
Authorizing Access to Data
Researchers using AWS in connection with controlled access datasets must only allow
authorized users to access the data. Authorization is typically obtained either by approval from
the Data Access Committee (DAC) or within the terms of the researcher’s existing Data Use
Certification (DUC).
Once access is authorized, you can grant that access in one or more ways, depending on
where the data reside and where the collaborator requiring access is located. The scenarios
below cover the situations that typically arise:
- Provide the collaborator access within your AWS account via an IAM user (see User Accounts, Passwords, and Access Control Lists)
- Grant the collaborator's own AWS account access to the data, as sketched below (see File Systems and Storage Volumes)
- Open access to the AWS environment to an external network (see Internet, Networking, and Data Transfers)
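For the second scenario, a minimal sketch (boto3; the collaborator's account ID and the bucket name are hypothetical) of a bucket policy granting a DAC-authorized collaborator's AWS account read-only access to a shared dataset:

```python
import json
import boto3

s3 = boto3.client("s3")

# Grant a collaborator's AWS account read-only access to the dataset bucket.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::444455556666:root"},
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-dbgap-study-data",
                "arn:aws:s3:::example-dbgap-study-data/*",
            ],
        }
    ],
}

s3.put_bucket_policy(
    Bucket="example-dbgap-study-data",
    Policy=json.dumps(policy),
)
```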