1
Cloud On-Ramp Project
August 27th 2015
Overview, Concepts and Capabilities
Contact: Robert McDermott rmcdermo@fredhutch.o...
Agenda
 High-level project goals
 Why AWS?
 Overview of AWS services
 Virtual Private Clouds (VPC) and networking
 Ac...
High-Level Project Goals
 Gain experience and competency operating securely in a public cloud
environment
 Design and im...
Cloud Basics
4
We Manage They Manage
Server rooms
Physical servers
Virtual servers
Network gear
Storage systems
Amazon AWS
R...
Why Amazon Web Services?
 Market Leader
o Overwhelming market share (Gartner)
o 10 times the compute capacity of all comp...
2014
2012
2013
2010
2015
2011
Gartner Magic Quadrant for IaaS 2010 - 2015
6
Are we locked in to AWS?
 We are not locked in; it’s possible to use other IaaS providers at the same time as
AWS if ther...
Overview of AWS Services
Used during project
Compute
Storage and Content Delivery
Databases
Networking
Administration & Secur...
Overview of AWS Services
Research Computing
Compute
Storage and Content Delivery
Databases
Networking
Administration & Securit...
AWS is similar to Lego Mindstorm…
Imagination
& Code
+ =
Imagination
& Code+ =
Lego Mindstorm Building Blocks A Sudoku Sol...
Region Location AZs
us-east-1 N. Virginia 5
us-west-1 N. California 3
us-west-2 Oregon 3
eu-west-1 Ireland 3
eu-central-1 ...
Availability Zone Basics
Availability
Zone B
Availability
Zone C
US-WEST-2: Oregon Region
Availability
Zone A
 Each region...
Virtual Private Cloud (VPC) Overview
 VPCs are customer defined private networks that provide isolation from other VPCs (ev...
Private Only
Subnets with
VPN connection
Internet
Corporate Datacenter
High-Level Virtual Private Cloud Patterns
P...
Example Account, VPC, Peering and Billing Configurations
 Single account; single bill
 No isolation
 Very simple but no ...
Example of a Possible Account & VPC Architecture
Production
Environment
Account
Benefits
 Infrastructure testing in isola...
AWS VPC
Hutch Network
192.168.0.0/16
10.168.0.0/16
Internet
Hutch firewalls in
active/standby
configuration
Hutch Network
...
Hutch Network
192.168.0.0/16
10.111.0.0/16
Internet
VPN tunnel extends
the Hutch network
into the VPC.
External
Systems
Hu...
Networks
192.168.0.0/16
10.168.0.0/16
DC1
US-WEST-2A: 172.16.160.x
AD Site: AWS_USW2
DC2
DC3
DC-USW2A
DC-USW2B
AD Site: SE...
 Tags are key/value annotations that can be attached to every type of object in AWS
 Tags are used for inventory, security,...
Metadata Tagging Scheme Example
Name : skyshield
InstanceId : i-32b8edc1
InstanceType : m3.medium
ImageId : ami-b9c98181
S...
Cost Accounting During the Project
 79.7% of AWS costs on average could be directly tied to an owner/department
 After ac...
Cost Reporting and Potential Chargebacks
owner invoice_date bill
----- ------------ -------
_adm/custserv 2015-07-01 $401....
Elastic Compute Cloud (EC2) Overview
 EC2 is Amazon’s virtual server (instance) service
 38 instance types are available...
EC2: On-Demand Pricing
 Zero upfront costs with no long term commitments
 Charged hourly (fractional hours, rounded up) ...
EC2: Reserved Instances
 Reserved instances are commitments for 1 or 3 years and provide guaranteed availability
 All up...
EC2: Spot Market Pricing
 Save up to 90% by bidding on unused capacity
 Spot instances are functionally identical to on-de...
Infrastructure as
Code
28
Infrastructure as Code
 Networks, servers, storage, security, databases, monitoring, etc… can be defined in code
 Code “...
Infrastructure as Code Demo
Q: Management
How long would it take to migrate the Center's public web site to
the cloud whil...
Infrastructure as Code Example
This code does the following:
 Creates a new VPC wherever you like
 Creates 2 public subn...
Availability Zone
Public Subnet
Private subnet
Public Security Group
security group
Web
Server
NAT
GitHub
Availability Zon...
Custom Cloud Automation Code
 EC2 instance provisioning to build and configure Windows and Linux instances
o Tags instance...
Custom Cloud Automation Code (cont)
34
 EC2 backups
o Finds all volumes that are tagged with a backup retention and snaps...
Cloud Security
35
Amazon AWS Certifications and Accreditations
PCI DSS Level 1, SOC 1 / ISAE 3402, SOC 2, SOC 3, FIPS 140-2, CSA, FedRAMP,
DIACA...
Proposed Security Choices
37
 What we think are good ideas based on what we learned
during the project
 See appendix for...
Proposed Security Choice Highlights
38
 2-factor authentication for AWS administrator accounts with
passwords
 For servi...
Security-wise, AWS adoption brings…
39
 Potential benefits
 Challenges and uncertainties
 For certain areas of IT opera...
Potential Security Benefits
40
 Complete audit trail of infrastructure access & changes
 Improved detection and alerting...
Potential Security Benefits (cont)
41
 Relatively easy to compartmentalize IT resources with
well-defined technical and a...
Security Challenges and Uncertainties
42
 The “everything-as-code” paradigm bundles the different layers of
the IT stack ...
Security Challenges and Uncertainties (cont)
43
 Rapid evolution of AWS features—long-term investment in
staff time for l...
Security - No Change
44
 OS and application patching (but we may end up maintaining
fewer servers, if we purchase things ...
Cloud Computing for
Scientific Applications
45
Ad-Hoc Capacity
46
 When compute capacity needs (cores, memory,
storage) exceed in-house
o Reduce time-to-solution
o Scal...
Ad-Hoc Capability
47
 Use of technologies not currently available in-
house
o GPU
o Low-latency interconnect (AWS “enhanc...
Sandbox
48
 Provide a sandbox for prototyping and
evaluation
o Easily provisioned ephemeral environment
o Allows researche...
Container Solutions
49
 Containers are
o “a server-virtualization method where the kernel of
an operating system allows f...
Science DMZ
 Transferring data into the cloud is
free
 Transferring data out of the cloud is
charged by the GB
 Download lar...
Collaboration
51
 Environments providing compute, application, and storage for collaborations
between Hutch and others
o Re...
Meet-Me
52
 A self-contained VPC for collaboration
o Custom environment
o Isolated from other Hutch
resources
o Limits ne...
Illumina Data for External Customers
 Upload from HiSeq into S3
(implemented today)
 Processing in EC2
 Download by cus...
 The Proteomics Lab is currently testing Proteome Discoverer in the cloud
 Run time with current local system takes 150 ...
AWS HIPAA BAA Details
55
 We must identify the AWS account IDs that we want covered by the BAA
 We are responsible for i...
NIH Security Best Practices for Controlled-Access Data Subject to the NIH
Genomic Data Sharing (GDS) Policy *
 Information...
AWS Support Options
57
Basic Developer Business Enterprise
AWS Trusted
Advisor
4 checks 4 checks 41 checks 41 checks
Acces...
 An opportunity to build a high security computing environment for our current and
future security needs.
 A complete, u...
AWS Challenges
59
 The highly abstracted nature of the cloud combined with the infrastructure as code
paradigm results in ...
Not Everything Works Well in the Cloud
 Is physically connected to an instrument
 Requires a licensing “dongle”
 Has ver...
Capabilities Gained During This Project
 Create complex cloud based datacenters and networks
 Logically integrate cloud ...
FHCRC AWS Roadmap
 Determine cloud operations model (who and what)
 Develop cloud governance model
 Select production a...
Key Takeaways
 The cloud is no longer just hype, it’s a very capable, mature
platform that can offer increased agility, f...
Thank You!
64
Questions?
Contact: Robert McDermott rmcdermo@fredhutch.org
Appendix
65
Proposed Security Choices
66
Identity and Access Management for AWS
 API-based access to AWS for automation tasks should ...
Proposed Security Choices (cont)
67
VPC-Level Security
 The permissions to modify VPC-level configurations must be limited...
VPC Peering Example
[10.99.1.50]$ traceroute 192.168.1.156
1 192.168.1.156 1.472 ms
[10.99.1.50]$ curl http://192.168.1.15...
albite01.fhcrc.org
DNS Views
Internal: 192.168/16, 10.168/16, 172.16/16 (AWS VPC)
External: !(internal)
AWS VPC Resolv.con...
Project Scope
In Scope
 Develop functional and security requirements
 Design and implement virtual datacenter architectur...
 Secure, Logical network extension of FHCRC IP space into the cloud
 Encrypted transport between FHCRC and cloud
 Desig...

Cloud On-Ramp Project Briefing

4,158 views


An overview of our cloud computing on-ramp project, what we've learned about the cloud and the capabilities we've gained during this project.

Published in: Technology

Cloud On-Ramp Project Briefing

  1. 1. 1 Cloud On-Ramp Project August 27th 2015 Overview, Concepts and Capabilities Contact: Robert McDermott rmcdermo@fredhutch.org
  2. 2. Agenda  High-level project goals  Why AWS?  Overview of AWS services  Virtual Private Clouds (VPC) and networking  Account and VPC options  Metadata tagging  Cost accountability and reporting  Elastic Compute Cloud (EC2) overview  Infrastructure as code  FHCRC directory services integration (AD/DNS)  Cloud security overview  Cloud computing for scientific applications overview  AWS support options  HIPAA BAA details  Cloud Benefits  Cloud Challenges  Capabilities we gained during this project  high-level AWS roadmap 2
  3. 3. High-Level Project Goals  Gain experience and competency operating securely in a public cloud environment  Design and implement a cloud based virtual datacenter  Logically extend the Center’s internal IP network to secured subnets in the cloud datacenter  Explore various use cases (Servers, HPC, application hosting, database hosting, etc…)  Stand up at least one production server/service by the conclusion of this project  Develop a roadmap for future use and enhancements of the architecture  Gain operational flexibility to respond quickly to emerging needs 3
  4. 4. Cloud Basics 4 We Manage They Manage Server rooms Physical servers Virtual servers Network gear Storage systems Amazon AWS Rackspace Microsoft Azure Google Compute Office 365 DNAnexus DropBox Google Docs Amazon AWS Microsoft Azure Google App Engine Heroku
  5. 5. Why Amazon Web Services?  Market Leader o Overwhelming market share (Gartner) o 10 times the compute capacity of all competitors combined (Gartner)  Greatest breadth and depth of services  Maturity: IaaS service launched in 2006 o Microsoft announced the Azure IaaS (VMs) service preview on June 7, 2012 o Google Compute Engine limited preview started June 28, 2012  Rapid pace of innovation: 449 new services and feature rollouts in 2014 alone  Cost competitive o 48 price reductions since 2006 o Reserved instances and the spot market  Broad Adoption o Government, Financial, Healthcare, Education, Research, Entertainment, etc…  Security Certifications o FISMA, HIPAA, SOC, PCI DSS, ITAR, DOD CSM, FedRAMP, ISO, FERPA, etc…  Most reliable cloud provider (CloudHarmony.com) o AWS: 2.4 hours of total downtime in 2014 (zero downtime in our region) o Google Compute Engine: 4.4 hours of total downtime in 2014 o Azure: 39.6 hours of total downtime in 2014 Why AWS? Why Not? 5
  6. 6. 2014 2012 2013 2010 2015 2011 Gartner Magic Quadrant for IaaS 2010 - 2015 6
  7. 7. Are we locked in to AWS?  We are not locked in; it’s possible to use other IaaS providers at the same time as AWS if there are compelling reasons to do so.  Using both Microsoft Office 365 (SaaS) and Amazon AWS (IaaS) at the same time will work with no overlap in effort or waste; it’s likely even a great strategy – Microsoft Office 365 for SaaS and Amazon AWS for IaaS.  Dumping AWS to move to another IaaS provider (Azure, Google, Rackspace) would require reworking everything (networking, automation, chargebacks, etc…). It’s possible, but it would be a lot of work to recreate the architecture. General Electric COO of IT Chris Drumgoole: …we really view ourselves to be a service provider to our businesses, so our businesses can buy [AWS cloud services] from us or they can buy from others. You can go to Amazon directly or you can go to Azure directly. If you want to come through me, by definition, you’re going to live and operate in this safe environment. I have already taken care of the things that GE holds dear and our requirements around regulation, security, data privacy and so on. I pre-built and pre-instrumented the environment so that those things are not something you have to worry about. That’s the benefit of coming to me. If you decide to go on your own, you certainly can. We’re never going to stop you, but understand that now those things are on you and you have to take care of them. http://www.infoworld.com/article/2824508/cloud-computing/ges-head-of-it-were-going-all-in-with-the-public-cloud.html 7
  8. 8. Overview of AWS Services Used during project Compute Storage and Content Delivery Databases Networking Administration & Security Deployment & Management Analytics Enterprise Applications Mobile Services Application Services 8
  9. 9. Overview of AWS Services Research Computing Compute Storage and Content Delivery Databases Networking Administration & Security Deployment & Management Analytics Enterprise Applications Mobile Services Application Services 9
  10. 10. AWS is similar to Lego Mindstorm… Imagination & Code + = Imagination & Code + = Lego Mindstorm Building Blocks A Sudoku Solving Robot Amazon AWS Building Blocks A service with $5.5 billion in annual revenue that consumes 37% of Internet traffic at peak 10
  11. 11. Regions and Availability Zones
       Region          Location           AZs
       us-east-1       N. Virginia        5
       us-west-1       N. California      3
       us-west-2       Oregon             3
       eu-west-1       Ireland            3
       eu-central-1    Frankfurt          2
       ap-southeast-1  Singapore          2
       ap-southeast-2  Sydney             2
       ap-northeast-1  Tokyo              3
       sa-east-1       Sao Paulo          3
       us-gov-west-1   Pacific northwest  2
       cn-north-1      Beijing            2
       Atlanta, GA - Ashburn, VA (3) - Dallas/Fort Worth, TX (2) - Hayward, CA - Jacksonville, FL - Los Angeles, CA (2) - Miami, FL - New York, NY (3) - Newark, NJ - Palo Alto, CA - San Jose, CA - Seattle, WA - South Bend, IN - St. Louis, MO - Amsterdam, The Netherlands (2) - Dublin, Ireland - Frankfurt, Germany (3) - London, England (3) - Madrid, Spain - Marseille, France - Milan, Italy - Paris, France (2) - Stockholm, Sweden - Warsaw, Poland - Chennai, India - Hong Kong (2) - Mumbai, India - Manila, the Philippines - Osaka, Japan - Seoul, Korea (2) - Singapore (2) - Taipei, Taiwan - Tokyo, Japan (2) - Australia - Melbourne, Australia - Sydney, Australia - São Paulo, Brazil - Rio de Janeiro, Brazil
       In addition to regions and zones there are currently 64 edge locations 11
  12. 12. Availability Zone Basics Availability Zone B Availability Zone C US-WEST-2: Oregon Region Availability Zone A  Each region has at least 2 availability zones  Each availability zone is in a separate location miles apart that shares nothing with other zones  Latency between availability zones in the same region is less than 2ms  Systems requiring high-availability should be designed to take advantage of multiple AZs  The elastic network load balancers (ELBs) live at the region level across all AZs  A typical HA design pattern is shown here: ELB App App DB DB App.com AZ A AZ B 12
  13. 13. Virtual Private Cloud (VPC) Overview  VPCs are customer defined private networks that provide isolation from other VPCs (even your own) and other customers.  A VPC can only reside in one region, but can span all AZs in that region.  A VPC is subdivided into subnets.  Subnets are located in AZs; subnets can’t span multiple AZs.  Subnets can be completely isolated, connected to the internet or connected to your corporate network via a VPN or a direct connection offered by a number of AWS partners  VPCs can be peered with other VPCs, even other customers’ VPCs for collaboration.  The VPC service provides the following building blocks to design a network to fit your needs: VPCs Routers Internet Gateways Customer Gateways VPNs Virtual Private Gateways VPC Peering Subnets Route Tables DHCP Network ACLs Security Groups Stateful Firewall NAT NAT 13
  14. 14. High-Level Virtual Private Cloud Patterns (diagram, panels A B C D): Private Only Subnets with VPN connection (Corporate Datacenter); Public Only Subnets (Internet); Public and Private Subnets (Internet); Public and Private Subnets with VPN connection (Corporate Datacenter, Internet) 14
  15. 15. Example Account, VPC, Peering and Billing Configurations  Single account; single bill  No isolation  Very simple but no flexibility Account VPC Account VPC 1 VPC 2 VPC 3 VPC 4 Account 1 VPC 1 VPC 2 VPC 3 VPC 4 Account 2  Single account; single bill  Isolation between environments  No account isolation  Still simple with some flexibility  Multiple accounts; single bill  High level of isolation  Intra account VPC peering  Flexible but moderately complex Billing  Multiple accounts; separate billing  High level of isolation  Intra and Inter organization peering  Most flexible but very complex Account 1 VPC 1 Account 2 VPC 2 Org B Account VPC X peer peer peer 15 A B C D Account 3 VPC Z
  16. 16. Example of a Possible Account & VPC Architecture Production Environment Account Benefits  Infrastructure testing in isolated test account  Network level isolation between environments for flexibility and some independence  Single consolidated bill covering both accounts Test VPC Enterprise VPC Research VPC High Security VPC Collaboration VPC peer Test Environment Account Test VPC Enterprise VPC Research VPC High Security VPC Collaboration VPC peer Consolidated Billing Org A VPC Org B VPC Dev/test systems Administrative computing Research Computing Sensitive systems (PHI) Scientific collaboration 16
  17. 17. AWS VPC Hutch Network 192.168.0.0/16 10.168.0.0/16 Internet Hutch firewalls in active/standby configuration Hutch Network 192.168.0.0/16 10.168.0.0/16 Internet Current Topology Hutch Network extended to our AWS VPC Firewall Instance filters and logs all traffic in and out of the VPC via the AWS Internet Gateway. Virtual systems residing in VPC subnets are referred to as Instances of the machine images from which they are launched. AWS Internet Gateway provides direct access between the VPC and the Internet (avoids having to route traffic through the Hutch campus – AKA “hairpinning”). Extension of the Hutch Network Traffic between the Hutch network and our VPC must pass through the Center’s Internet firewall, which is an endpoint for the VPN connection between the two networks. Cloud On-Ramp Datacenter 17
  18. 18. Hutch Network 192.168.0.0/16 10.111.0.0/16 Internet VPN tunnel extends the Hutch network into the VPC. External Systems Hutch firewalls serve as the VPN Tunnel Endpoint (Customer Gateway) Cloud On-Ramp – Virtual Private Cloud (VPC) 172.16.0.0/16 + AWS Elastic IP (EIP) assigned to Firewall us-west-2a us-west-2b us-west-2c Internal Subnet 172.16.160.0/24 Instances directly accessible via Hutchnet AWS Virtual Private Gateway Internal Subnet 172.16.168.0/24 Instances directly accessible via Hutchnet Internal Subnet 172.16.176.0/24 Instances directly accessible via Hutchnet Firewall Inside 172.16.8.0/24 Connectivity from internal Hutch systems in the VPC to the Internet via the Firewall / UTM Instance and AWS Internet Gateway Notes: • Internal subnets are logical extensions of the Hutch network. • Instances in the internal subnets can communicate freely with other systems in the Hutch network via the VPN tunnel using their private IP addresses (172.16.x.x). • The Firewall Outside subnet is the only subnet with direct access to the AWS Internet Gateway. • The Firewall is a FortiGate instance with interfaces in both the Firewall Outside and Inside subnets. It can communicate with any VPC subnet via its inside interface. • The Firewall performs NAT for outbound-initiated Internet access. AWS Internet Gateway Post Phase 1 Firewall Outside 172.16.0.0/24 FW Cloud On-Ramp Datacenter – Detailed 18 Server for software updates
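An address plan like the one on this slide can be sanity-checked in a few lines. The sketch below uses only the CIDR ranges listed above; the subnet labels (and the pairing of the third internal subnet with us-west-2c) are illustrative assumptions, and the check itself is not part of the original project code.

```python
import ipaddress

# CIDR ranges as listed on the slide; the dictionary keys are hypothetical labels.
VPC = ipaddress.ip_network("172.16.0.0/16")
HUTCH_NETWORKS = [ipaddress.ip_network(n) for n in ("192.168.0.0/16", "10.111.0.0/16")]
SUBNETS = {
    "firewall-outside": "172.16.0.0/24",
    "firewall-inside": "172.16.8.0/24",
    "internal-a": "172.16.160.0/24",
    "internal-b": "172.16.168.0/24",
    "internal-c": "172.16.176.0/24",
}


def check_plan(vpc, on_prem, subnets):
    """Return a list of problems found in the address plan (empty if none)."""
    problems = []
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    for name, net in nets.items():
        # Every subnet must sit inside the VPC range.
        if not net.subnet_of(vpc):
            problems.append(f"{name} {net} is outside the VPC {vpc}")
        # No subnet may collide with an on-premises range reachable over the VPN.
        for other in on_prem:
            if net.overlaps(other):
                problems.append(f"{name} {net} overlaps on-premises network {other}")
    # Subnets must not overlap each other.
    names = sorted(nets)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if nets[first].overlaps(nets[second]):
                problems.append(f"{first} overlaps {second}")
    return problems


print(check_plan(VPC, HUTCH_NETWORKS, SUBNETS) or "address plan is consistent")
```

Non-overlapping ranges matter here because the VPN tunnel routes between the VPC and the Hutch network by private IP address.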
  19. 19. Networks 192.168.0.0/16 10.168.0.0/16 DC1 US-WEST-2A: 172.16.160.x AD Site: AWS_USW2 DC2 DC3 DC-USW2A DC-USW2B AD Site: SELU Networks 172.16.0.0/16 US-WEST-2B: 172.16.168.x Intra Site Replication  Create a new AD “site” named for the AWS region “AWS_USW2”  Create two new domain controllers in AWS, each in a different availability zone  This architecture was tested during the project (twice) using test domains Active Directory Cloud Integration Design 19
  20. 20.  Tags are key/value annotations that can be attached to every type of object in AWS  Tags are used for inventory, security, cost accounting, backups and automation Tag Restrictions  Maximum number of tags per resource: 10  Maximum key length: 127 Unicode characters  Maximum value length: 255 Unicode characters  Tag keys and values are case sensitive. Mandatory Cloud On-Ramp Tagging Scheme  Name: name of the server  Owner: department/customer that owns/pays for the server <_div/dept>  Technical_contact: who provides technical support for this system; who to send alerts and reports to  Billing_contact: who to send chargeback invoices to  Description: short description of the server’s purpose  SLE: business_hours=? / grant_critical=? / publicly_accessible=? Metadata Tagging Scheme 20
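The tag restrictions and the mandatory scheme can be expressed as a small validation routine. This is a minimal sketch, not the project's own code: the function name is hypothetical, and the lowercase spelling of the mandatory keys is an assumption taken from the example slide (tag keys are case sensitive, so the real scheme has to pick one spelling).

```python
# Limits quoted on the slide (2015 values).
MAX_TAGS = 10
MAX_KEY_LEN = 127
MAX_VALUE_LEN = 255

# Assumed key spelling; the slide capitalises some of these.
MANDATORY_KEYS = ("Name", "owner", "technical_contact",
                  "billing_contact", "description", "sle")


def validate_tags(tags):
    """Return a list of problems with a resource's tag dictionary."""
    errors = []
    if len(tags) > MAX_TAGS:
        errors.append(f"too many tags: {len(tags)} > {MAX_TAGS}")
    for key, value in tags.items():
        if len(key) > MAX_KEY_LEN:
            errors.append(f"key too long: {key[:20]}...")
        if len(value) > MAX_VALUE_LEN:
            errors.append(f"value too long for key: {key}")
    for key in MANDATORY_KEYS:
        # A mandatory tag must be present and non-empty.
        if not tags.get(key):
            errors.append(f"missing mandatory tag: {key}")
    return errors


example = {
    "Name": "skyshield",
    "owner": "_adm/infosec",
    "technical_contact": "xzy@fredhutch.org",
    "billing_contact": "abc@fredhutch.org",
    "description": "firewall - inside interface in dedicated subnet",
    "sle": "business_hours=24x7 / grant_critical=no / publicly_accessible=no",
}
print(validate_tags(example) or "tags are valid")
```

With six mandatory keys and a limit of ten tags per resource, only four tags remain for other uses.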
  21. 21. Metadata Tagging Scheme Example
       Name              : skyshield
       InstanceId        : i-32b8edc1
       InstanceType      : m3.medium
       ImageId           : ami-b9c98181
       State             : running
       PrivateIpAddress  : 172.16.0.18
       PublicIpAddress   : 52.16.139.222
       SecurityGrps      : sg-8e7221e1
       AvailabilityZone  : us-west-2a
       SubnetId          : subnet-29549451
       VpcId             : vpc-f3e23491
       owner             : _adm/infosec
       technical_contact : xzy@fredhutch.org
       billing_contact   : abc@fredhutch.org
       description       : firewall - inside interface in dedicated subnet
       sle               : business_hours=24x7 / grant_critical=no / publicly_accessible=no
       Tenancy           : default
       LaunchTime        : 6/3/2015 1:34:59 PM
       KeyName           : cloud on-ramp test keypair
       Platform          : linux
       21
  22. 22. Cost Accounting During the Project  79.7% of AWS costs on average could be directly tied to an owner/department  After accounting for sales tax and a proportional amount of AWS support costs we would have been able to assign 89.2% of AWS costs to owners/departments for potential chargebacks in the future  Strict tagging of servers, network interfaces, volumes, snapshots, etc… is critical  Resources attached to servers (volumes & NICs) need to automatically inherit tags from their parent to ensure all costs are captured  Tag creation, maintenance and enforcement need to be fully automated End of Month Taxes Daily Spend Report – Invoiced vs. Chargebacks Excessive “Leakage” 22
  23. 23. Cost Reporting and Potential Chargebacks
       owner          invoice_date  bill
       -----          ------------  -------
       _adm/custserv  2015-07-01    $401.96
       _adm/iops      2015-07-01    $370.46
       _adm/solarch   2015-07-01    $213.45
       _adm/infosec   2015-07-01    $102.43
       _adm/ess       2015-07-01    $77.04
       _adm/scicomp   2015-07-01    $1.71
       Chargebacks: Monthly Charges by Owner Tag
       Monthly Charges by Owner Tag – Last Month vs This Month
       Pulled via API 23
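A chargeback report of this kind comes down to grouping billing line items by their owner tag and tracking whatever is left untagged. The sketch below illustrates that grouping under stated assumptions: the line-item structure and the sample figures are hypothetical, and it does not reproduce how the project pulled data from the AWS billing API.

```python
from collections import defaultdict
from decimal import Decimal


def charges_by_owner(line_items):
    """Sum costs per owner tag; items without an owner land in 'untagged'."""
    totals = defaultdict(Decimal)
    for item in line_items:
        owner = item.get("owner") or "untagged"
        totals[owner] += Decimal(item["cost"])
    return dict(totals)


def tagged_share(totals):
    """Percentage of total spend that can be tied to an owner."""
    total = sum(totals.values())
    if total == 0:
        return Decimal(0)
    tagged = total - totals.get("untagged", Decimal(0))
    return tagged / total * 100


# Hypothetical line items, for illustration only.
items = [
    {"owner": "_adm/infosec", "cost": "60.00"},
    {"owner": "_adm/iops", "cost": "30.00"},
    {"cost": "10.00"},  # e.g. an untagged volume or network interface
]
totals = charges_by_owner(items)
print(totals, f"{tagged_share(totals):.1f}% attributable")
```

Decimal is used instead of float so that invoice amounts add up exactly.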
  24. 24. Elastic Compute Cloud (EC2) Overview  EC2 is Amazon’s virtual server (instance) service  38 instance types are available, ranging from 10% of a CPU core and 1GB of RAM to 40 CPU cores and 244GB of RAM.  General purpose, fractional CPU (burstable), compute optimized, memory optimized, storage optimized (both high IO and high density) and GPU instance classes.  On-demand, Reserved and Spot market pricing options  Some instances come with “free” ephemeral storage  General purpose SSD EBS volumes up to 16TB each  Provisioned IOPS volumes can provide up to 20,000 IO operations per second per volume  Shared or dedicated tenancy models available Instances SSD IOPs Snap Auto Scaling AMI (Images) Network Interface Mag Magnetic Disks GP SSD Disks Provisioned IOps Disks Encrypted Disks Snapshots Monitoring Alerting Load Balancing 24
  25. 25. EC2: On-Demand Pricing  Zero upfront costs with no long term commitments  Charged hourly (fractional hours, rounded up) for the time the instance is running  Each instance type has a different hourly rate  Availability of specific instance types in specific AZs can fluctuate with demand  Best for short term workloads that can’t be interrupted while running  Best for systems that can be shut down when not in use (test, monthly jobs, experiments)  Most flexible option but also most expensive
       Type         CPUs  RAM     Temp Storage       Rate (per hour)  Annual
       t2.micro     1*    1GB     none               $0.013           $114
       m3.medium    1     3.75GB  1 x 4GB SSD        $0.067           $587
       t2.large     2*    8GB     none               $0.104           $911
       m4.large     2     8GB     none               $0.126           $1,104
       m4.2xlarge   4     16GB    none               $0.252           $2,207
       c3.4xlarge   16    30GB    2 x 160GB SSD      $0.84            $7,359
       m4.10xlarge  40    160GB   none               $2.52            $22,075
       i2.8xlarge   32    244GB   8 x 800GB SSD**    $6.82            $59,743
       * Fractional CPU with credit based burst
       ** 365,000 random read IOPS total
       25
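The Annual column appears to be the hourly rate multiplied by the 8,760 hours in a year. That is an inference from the figures, which match most rows to within a dollar. A minimal sketch of that arithmetic and of the round-up-to-the-hour billing rule described above follows; the function names are illustrative.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760


def billed_hours(seconds_running):
    """Fractional hours are rounded up to the next whole hour."""
    return math.ceil(seconds_running / 3600)


def on_demand_cost(hourly_rate, seconds_running):
    """Cost of one run of an instance under hourly on-demand billing."""
    return hourly_rate * billed_hours(seconds_running)


def annual_cost(hourly_rate):
    """Cost of leaving an instance running for a full year."""
    return hourly_rate * HOURS_PER_YEAR


# A 90-minute test run on an m4.10xlarge is billed as 2 full hours.
print(billed_hours(90 * 60), round(on_demand_cost(2.52, 90 * 60), 2))
# A t2.micro running all year comes to about $114.
print(round(annual_cost(0.013)))
```

The gap between these two numbers is why the slide recommends on-demand for systems that can be shut down when not in use.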
  26. 26. EC2: Reserved Instances  Reserved instances are commitments for 1 or 3 years and provide guaranteed availability  All upfront, partial upfront and no upfront purchasing options  Purchased for specific availability zones  Best for long term production servers  Unwanted reserved instances can be sold on the reserved instance market  Loss of flexibility but cost savings can be significant (up to 75%) T2.Large Example 26
  27. 27. EC2: Spot Market Pricing  Save up to 90% by bidding on unused capacity  Spot instances are functionally identical to on-demand and reserved instances  Requested instances are launched when your bid matches or exceeds the market rate  Market rate fluctuates based on current supply and demand in a particular zone  When the market rate exceeds the bid, instances are terminated after a two-minute notice  Good for short running jobs or long running processes that can checkpoint their state  HPC and ad-hoc testing are good candidates for spot instances M4.10xlarge: 40 CPUs, 160GB RAM  On-demand rate: $2.52/hour  Current Spot rate: $0.27 (us-west-2a)  89% cost savings 27 Spot Market Rate History for M4.10xlarge instance in Oregon
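The 89% figure and the bid rule can be checked with a short calculation. The sketch below models the rule as stated on the slide (run while the bid matches or exceeds the market rate). The price history passed in is made up, and real spot behaviour involves more than this simple comparison.

```python
def spot_savings_percent(on_demand_rate, spot_rate):
    """Savings of the spot rate relative to the on-demand rate, in percent."""
    return (on_demand_rate - spot_rate) / on_demand_rate * 100


def hours_until_outbid(bid, hourly_market_rates):
    """Count consecutive hours the instance keeps running before the
    market rate first exceeds the bid."""
    hours = 0
    for rate in hourly_market_rates:
        if rate > bid:
            break
        hours += 1
    return hours


# Rates quoted on the slide for an m4.10xlarge in us-west-2a.
print(round(spot_savings_percent(2.52, 0.27)))
# Hypothetical price history: outbid in the third hour.
print(hours_until_outbid(0.30, [0.27, 0.28, 0.35, 0.20]))
```

Because an instance can be outbid at any point, jobs that checkpoint their state lose at most the work done since the last checkpoint.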
  28. 28. Infrastructure as Code 28
  29. 29. Infrastructure as Code  Networks, servers, storage, security, databases, monitoring, etc… can be defined in code  Code “stacks” can be written to build entire complex infrastructures or multi-tier application stacks  Your infrastructure or application stack is always documented, versioned and consistently repeatable  Infrastructure code is strictly managed and tracked via a source code management system  Disaster recovery and business continuity can be an order of magnitude faster/simpler  This is the future of IT operations  Requires a different set of skills than traditional IT operations Example: Infrastructure changes are automatically documented and versioned 29
  30. 30. Infrastructure as Code Demo Q: Management How long would it take to migrate the Center's public web site to the cloud while increasing security, performance and availability? Will 6 months and a budget of $250K work? A: DevOps Engineer I can do it in 30 minutes for less than $5 per day. Where would you like it to reside? Oregon, California, or North Virginia? 30
  31. 31. Infrastructure as Code Example This code does the following:  Creates a new VPC wherever you like  Creates 2 public subnets (each in a different AZ)  Creates 2 private subnets (each in a different AZ)  Creates an Internet gateway  Creates a route table to route public traffic to the Internet gateway  Creates a NAT instance in a public subnet  Creates a route table to route Internet bound traffic from the private subnets to the NAT instance  Creates Security groups and network ACLs for both the private and public subnets  Creates two Linux instances, one in each private subnet  Installs and configures the NGINX web server on each Linux instance (via user-data script pulled from a github repo and provided to each instance at boot)  Each Linux web server pulls (via the NAT connection) a 1.4GB tarball containing the public www.fredhutch.org website from an S3 bucket and extracts into the NGINX web root  Creates an elastic load balancer (ELB) and attaches interfaces to the public subnets  Creates ELB health checks to verify the health of the new Fredhutch web servers  Adds the web server instances to the load balancer  Adds a DNS CNAME to the fredhutch.center DNS zone that points to the ELB public DNS name  Sends an SMS text message to my iPhone when it's done. 31
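The first few steps in the list above (VPC, two public and two private subnets in different AZs, an Internet gateway) can be sketched as a generated template. The deck does not say which tool the demo used, so the CloudFormation-style JSON here is an assumption, as are the CIDR ranges and resource names. NAT, routing, security groups, instances, ELB, DNS and SMS are left out.

```python
import json


def build_template(vpc_cidr, public_cidrs, private_cidrs):
    """Build a CloudFormation-style template as a plain dictionary."""
    resources = {
        "Vpc": {"Type": "AWS::EC2::VPC", "Properties": {"CidrBlock": vpc_cidr}},
        "InternetGateway": {"Type": "AWS::EC2::InternetGateway"},
        "GatewayAttachment": {
            "Type": "AWS::EC2::VPCGatewayAttachment",
            "Properties": {
                "VpcId": {"Ref": "Vpc"},
                "InternetGatewayId": {"Ref": "InternetGateway"},
            },
        },
    }
    for kind, cidrs in (("Public", public_cidrs), ("Private", private_cidrs)):
        for index, cidr in enumerate(cidrs):
            resources[f"{kind}Subnet{index + 1}"] = {
                "Type": "AWS::EC2::Subnet",
                "Properties": {
                    "VpcId": {"Ref": "Vpc"},
                    "CidrBlock": cidr,
                    # Pick the index-th AZ of whichever region the stack is
                    # launched in, so the template is not tied to one region.
                    "AvailabilityZone": {
                        "Fn::Select": [str(index), {"Fn::GetAZs": ""}]
                    },
                },
            }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


# Hypothetical address ranges, for illustration only.
template = build_template(
    "10.0.0.0/16",
    public_cidrs=["10.0.0.0/24", "10.0.1.0/24"],
    private_cidrs=["10.0.10.0/24", "10.0.11.0/24"],
)
print(json.dumps(sorted(template["Resources"]), indent=2))
```

Selecting AZs by index rather than by name is what lets the same code build the environment "wherever you like".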
  32. 32. Availability Zone Public Subnet Private subnet Public Security Group security group Web Server NAT GitHub Availability Zone Public subnet Private subnet Public Security Group security group Web Server S3 www.fredhutch.org www.fredhutch.center SNS ELB Health Checks EIP Orchestration Code Webserver config code Base Images (AMIs) Internet Gateway Website Archive R53 PubRT PrivRT NAT Linux Config Config Nimbus Robert’s iPhone NACL NACL DHCP Key SMS Text Fredhutch.org Website Migration Demo 32
  33. 33. Custom Cloud Automation Code  EC2 instance provisioning to build and configure Windows and Linux instances o Tags instance with all mandatory tags o Configures OS during bootstrap o Optionally enables monitoring and alerting o Optionally enables daily backup rotation for all attached volumes o Optionally registers instance in DNS o Optionally creates and attaches additional “data” disk o Optionally configures data disk for encryption o Optionally configures instance for scheduled retirement  EC2 instance reporting o Gathers all information on instances to find, filter and report on instances  EC2 tag primer o Finds all instances without tags and creates all tag key stubs o Reports to ops group that instances without tags were found and tagged  EC2 tag inheritance o Ensures that EBS volumes and NICs attached to an instance inherit the parent instance’s tags  EC2 tag enforcer o Finds instances that are missing the mandatory tags o Reports them to ops group and optionally shuts them down 33
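The tag enforcer and tag inheritance tools described above reduce to two small pieces of logic. The sketch below works on plain dictionaries rather than live AWS objects and is not the project's code. The record layout, function names and key spelling are assumptions, and the choice not to overwrite tags a volume already carries is a design decision of this sketch.

```python
# Assumed spelling of the mandatory tag keys.
MANDATORY_TAGS = ("Name", "owner", "technical_contact",
                  "billing_contact", "description", "sle")


def missing_mandatory(instances, mandatory=MANDATORY_TAGS):
    """Map instance id -> mandatory tag keys that are absent or empty."""
    report = {}
    for instance in instances:
        missing = [key for key in mandatory if not instance["tags"].get(key)]
        if missing:
            report[instance["id"]] = missing
    return report


def inherit_tags(instance, attached):
    """Copy the parent's tags onto attached volumes/NICs, keeping any tag
    the resource already has. Returns the ids of resources that changed."""
    changed = []
    for resource in attached:
        new = {k: v for k, v in instance["tags"].items()
               if k not in resource["tags"]}
        if new:
            resource["tags"].update(new)
            changed.append(resource["id"])
    return changed


# Hypothetical records, for illustration only.
server = {"id": "i-0001", "tags": {"Name": "web1", "owner": "_adm/iops"}}
volume = {"id": "vol-0001", "tags": {}}
print(inherit_tags(server, [volume]), volume["tags"])
print(missing_mandatory([server]))
```

A real enforcer would feed the report to the ops group and optionally stop the listed instances, as the slide describes.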
  34. 34. Custom Cloud Automation Code (cont) 34  EC2 backups o Finds all volumes that are tagged with a backup retention and snapshots them o Tags the snapshots to identify the owner, parent instance and retention date o Purges snapshots that are past their designated retention date  EC2 instance lifecycle o Finds instances that are scheduled for retirement in the next 30 days and reports on them o Retires instances that have reached their retirement date  Virtual datacenter creation o Creates a VPC, subnets, security group, NACLs, gateway, routing tables, DHCP options  “GrabCloudNode” research compute node provisioning o Researcher-facing tool to provision a cloud based HPC node o Similar to existing “grabnode” functionality that researchers use to access on-premises compute resources o Tags the instance with all mandatory metadata tags o Facilitates transferring data to and from the cloud node o Sets up monitoring to automatically shut down the node if it’s idle for more than 1 hour
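The backup rotation described above hinges on a retention date stamped onto each snapshot. A minimal sketch of that bookkeeping follows. The tag names (`backup_retention_days`, `retain_until`, `parent_instance`) and the record layout are hypothetical, and the actual snapshot and delete calls against AWS are omitted.

```python
from datetime import date, timedelta


def snapshot_tags(volume, today):
    """Build the tags for a new snapshot of a volume that carries a
    retention period (in days) in its own tags."""
    retention_days = int(volume["tags"]["backup_retention_days"])
    return {
        "owner": volume["tags"].get("owner", ""),
        "parent_instance": volume["instance_id"],
        "retain_until": (today + timedelta(days=retention_days)).isoformat(),
    }


def expired_snapshots(snapshots, today):
    """Return ids of snapshots whose retention date has passed."""
    return [
        snap["id"]
        for snap in snapshots
        if date.fromisoformat(snap["tags"]["retain_until"]) < today
    ]


# Hypothetical volume with a 7-day retention tag.
volume = {
    "id": "vol-0001",
    "instance_id": "i-0001",
    "tags": {"owner": "_adm/iops", "backup_retention_days": "7"},
}
tags = snapshot_tags(volume, date(2015, 8, 1))
snapshots = [{"id": "snap-0001", "tags": tags}]
print(tags["retain_until"], expired_snapshots(snapshots, date(2015, 8, 9)))
```

Storing the retention date on the snapshot itself means the purge job needs no external database: the tags are the schedule.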
  35. 35. Cloud Security
  36. 36. Amazon AWS Certifications and Accreditations PCI DSS Level 1, SOC 1 / ISAE 3402, SOC 2, SOC 3, FIPS 140-2, CSA, FedRAMP, DIACAP, FISMA, ISO 27001, MPAA, Section 508 / VPAT, HIPAA, DOD CSM Levels 1-2, 3-5, ISO 9001, CJIS, FERPA, G-Cloud, IT-Grundschutz, IRAP (Australia), MTCS Tier 3 Certification, ITAR  Amazon is responsible for the security of the Cloud  We are responsible for our security in the Cloud Figure: AWS Shared Responsibility Model (Our Responsibility / Their Responsibility)
  37. 37. Proposed Security Choices  What we think are good ideas based on what we learned during the project  See appendix for details  These are not policies o We do not yet understand our actual AWS use cases well enough to have policies and procedures o We have not tested these proposed choices in real operations mode
  38. 38. Proposed Security Choice Highlights  2-factor authentication for AWS administrator accounts with passwords  For service accounts making API calls with access keys, implement IP restrictions  Maintain team-level (as opposed to individual) access key repositories  Turn on CloudTrail and AWS Config auditing everywhere. Send logs to Splunk  Clearly defined governance model for VPC-level design and changes: subnets, ACLs, EIPs, VPNs, VPC-peering…  Protect traffic between our EC2 instances and the Internet with a virtual firewall/IPS appliance, or some host-level alternative
  39. 39. Security-wise, AWS adoption brings…  Potential benefits  Challenges and uncertainties  For certain areas of IT operations, no change
  40. 40. Potential Security Benefits  Complete audit trail of infrastructure access and changes  Improved detection and alerting of security exceptions at the infrastructure level: faster, more precise incident response and recovery  Security goals such as physical security of the data center, protection of backup media, and secure disposal of unwanted storage media are simply easier to accomplish with AWS
  41. 41. Potential Security Benefits (cont)  Relatively easy to compartmentalize IT resources with well-defined technical and administrative boundaries o Optimized to create discrete computing/storage/application instances on demand without having to maintain a common infrastructure o Create separate “networks” with VPCs o Buy storage space in the form of S3 buckets o Separate database instances with RDS o Control admin access via granular access control rules  IP and port filtering on a per-server basis with Security Groups
  42. 42. Security Challenges and Uncertainties  The “everything-as-code” paradigm bundles the different layers of the IT stack together in a way that is not necessarily compatible with our current separation of duties. Combining services such as networking, server, OS, apps, and security filtering into one set of code blurs the lines between different teams’ responsibilities, which can lead to confusion, unmet expectations, and lost opportunities for cross-checking.  Our current team structure in Center IT is optimized for our physical IT environment. It is not necessarily efficient for managing the software-defined world of AWS. AWS is not a virtualized copy of our IT infrastructure.  New frontier. It will take time to establish new policies, expectations, and norms in order to operate AWS securely and smoothly. Potential for friction and dropped balls.
  43. 43. Security Challenges and Uncertainties (cont)  Rapid evolution of AWS features: long-term investment in staff time for learning.  AWS represents not a replacement of existing infrastructure, but a parallel one. We must duplicate resources to secure it.
  44. 44. Security - No Change  OS and application patching (but we may end up maintaining fewer servers, if we purchase things like storage and database “as-services” from AWS, instead of running our own)  Need for firewall/IPS/WAF protections (must be purchased via 3rd party vendors).
  45. 45. Cloud Computing for Scientific Applications
  46. 46. Ad-Hoc Capacity  When compute capacity needs (cores, memory, storage) exceed in-house capacity o Reduce time-to-solution o Scale wide/short: • 100 cores for 10 hours has the same cost as 1000 cores for 1 hour o Rent-a-terabyte: • Short-term analyses and interim storage options won’t require large capital investment
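The “scale wide/short” point is simple core-hour arithmetic, sketched below. It assumes pricing that is linear in core-hours and a workload that parallelizes cleanly; the price per core-hour is a made-up placeholder, not an AWS rate.

```python
def core_hours(cores, hours):
    """Total core-hours consumed by a job."""
    return cores * hours


def on_demand_cost(cores, hours, price_per_core_hour):
    """Cost under the assumption that price scales linearly with core-hours."""
    return core_hours(cores, hours) * price_per_core_hour


if __name__ == "__main__":
    PRICE = 0.05  # hypothetical dollars per core-hour, for illustration only
    narrow = on_demand_cost(cores=100, hours=10, price_per_core_hour=PRICE)
    wide = on_demand_cost(cores=1000, hours=1, price_per_core_hour=PRICE)
    print("100 cores x 10 h:", narrow)
    print("1000 cores x 1 h:", wide)
    print("same cost:", narrow == wide)
```

Both configurations consume 1000 core-hours, so the bill is identical while the wide run finishes ten times sooner.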
  47. 47. Ad-Hoc Capability  Use of technologies not currently available in-house o GPU o Low-latency interconnect (AWS “enhanced networking”) o Short-term or one-off analyses won’t require large capital investment
  48. 48. Sandbox  Provide a sandbox for prototyping and evaluation o Easily provisioned ephemeral environment o Allows researchers to try new algorithms and evaluate methods without constraints o Docker and AMIs are popular mechanisms for distributing data, tools, and pipelines
  49. 49. Container Solutions  Containers are o “a server-virtualization method where the kernel of an operating system allows for multiple isolated user-space instances” o Docker containers and AMIs allow distribution of tools and data in a portable container o Reproducibility and distribution of results o Difficult and cumbersome (security) to deploy in-house • Easy to pop into a sandbox in the cloud!
  50. 50. Science DMZ  Transferring data into the cloud is free  Transferring data out of the cloud is charged by the GB  Download large datasets quickly and inexpensively using Amazon’s big network pipes  Analyze and process data in the cloud using cloud resources (EC2, EMR, …)  Download the results of the analysis or experiment to the Hutch “…designed such that the equipment, configuration, and security policies are optimized for high-performance scientific applications rather than for general purpose business systems” -ESnet Figure labels: Data Repository (dbGaP, EMBL-ENA), Fred Hutch Amazon VPC, S3 Storage, EC2 Compute, Store, Analyze, Retrieve, Fred Hutch Campus, Researcher, Results, /fh/fast
  51. 51. Collaboration  Environments providing compute, application, and storage for collaborations between Hutch and others o Resources independent o Access from one to other via peering o Uses AWS high-throughput networking o Data transfer does incur cost o Good for bringing in outside expertise In this example, a group with compute expertise provides their computational resources, accessing Hutch-produced data via a VPC peering relationship. Figure labels: Hutch Intercloud VPC, Partner VPC, EC2 Compute, S3 Storage, Hutch, Partner
  52. 52. Meet-Me  A self-contained VPC for collaboration o Custom environment o Isolated from other Hutch resources o Limits need for shipping data between organizations and VPCs (cf. intercloud) o IAM controls access and authorization Figure labels: Meet-Me VPC, EC2 Compute, S3 Storage, Hutch, Partner
  53. 53. Illumina Data for External Customers  Upload from HiSeq into S3 (implemented today)  Processing in EC2  Download by customer or transfer of bucket to customer via VPC peering or S3 copy This is simply an example of a possibility; no plans or proposals are in place at this time! Figure labels: Shared Resources Genomics, Fred Hutch Amazon VPC, “Raw” data storage, S3 Storage, EC2 Compute, Gerald/Bustard/etc., Basecalls & Alignments, Glacier Archive (raw & cooked), External SR Customer, External SR Customer VPC
  54. 54. Proteomics in the Cloud  The Proteomics Lab is currently testing Proteome Discoverer in the cloud  A run on the current local system takes 150 hours Cloud Comparison  8 CPU cloud server o Run time: 123.24 hours o Cost: $124.99  36 CPU cloud server o Run time: 42.47 hours o Cost: $132.91 Key Concepts  1 server running for 100 hours costs about the same as running 100 servers for 1 hour  Running an 8 CPU system for 4 hours costs about the same as running a 32 CPU system for 1 hour
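The Proteome Discoverer comparison can be reduced to two derived figures per server: speedup over the 150-hour local run and effective cost per hour. The sketch below uses only the run times and costs quoted on the slide; the function and variable names are illustrative.

```python
LOCAL_HOURS = 150.0  # run time on the current local system, from the slide

# (run time in hours, total cost in dollars) as quoted on the slide
RUNS = {
    "8 CPU": (123.24, 124.99),
    "36 CPU": (42.47, 132.91),
}


def summarize(hours, cost):
    """Derive speedup relative to the local run and effective hourly cost."""
    return {
        "speedup_vs_local": LOCAL_HOURS / hours,
        "dollars_per_hour": cost / hours,
    }


if __name__ == "__main__":
    for name, (hours, cost) in RUNS.items():
        s = summarize(hours, cost)
        print(f"{name}: {s['speedup_vs_local']:.2f}x faster than local, "
              f"${s['dollars_per_hour']:.2f}/hour, ${cost:.2f} total")
```

The larger server costs roughly three times as much per hour but finishes in about a third of the time, which is why the two totals differ by only about eight dollars.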
  55. 55. AWS HIPAA BAA Details  We must identify the AWS account IDs that we want covered by the BAA  We are responsible for implementing appropriate privacy and security safeguards in order to protect our PHI in compliance with HIPAA  The following are the current HIPAA eligible services: o Amazon Elastic Compute Cloud (EC2) o Amazon Simple Storage Service (S3) o Amazon Elastic Block Store (EBS) o Amazon Glacier o Amazon Redshift o Amazon RDS (MySQL and Oracle engines only) o Amazon Elastic MapReduce (EMR) o Amazon DynamoDB o Elastic Load Balancing  All compute instances processing, storing, or transmitting PHI must be dedicated instances  Dedicated instances won’t share a hypervisor host with any other customers  Dedicated tenancy costs an extra $2 per hour but covers all EC2 instances in a region  AWS will report all security incidents and breaches to us  We must enable all auditing and logging (CloudTrail, AWS Config)  All PHI data must be encrypted at rest and in transit  Set the ELB protocol to TCP for sessions containing PHI, and the TCP session must be encrypted end-to-end (no SSL termination on the ELB)
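Several of the BAA requirements above are mechanical enough to check in code. The sketch below is an illustration only: the record layout (`tenancy`, `volumes`, `elb_protocol`) is a hypothetical simplification, not the shape of any AWS API response, and it covers just three of the listed rules.

```python
def phi_violations(instance):
    """Return a list of BAA rule violations for one PHI-handling instance record."""
    problems = []
    # PHI workloads must run on dedicated instances.
    if instance.get("tenancy") != "dedicated":
        problems.append("instance is not on dedicated tenancy")
    # All attached volumes must be encrypted at rest.
    if not all(v.get("encrypted") for v in instance.get("volumes", [])):
        problems.append("one or more volumes are not encrypted at rest")
    # If a load balancer fronts the instance, it must pass TCP through untouched.
    if instance.get("elb_protocol") not in (None, "TCP"):
        problems.append("ELB listener must use TCP (no SSL termination)")
    return problems


if __name__ == "__main__":
    # Made-up records for illustration.
    compliant = {"tenancy": "dedicated", "volumes": [{"encrypted": True}], "elb_protocol": "TCP"}
    risky = {"tenancy": "default", "volumes": [{"encrypted": False}], "elb_protocol": "HTTPS"}
    print(phi_violations(compliant))
    for problem in phi_violations(risky):
        print("violation:", problem)
```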
  56. 56. dbGaP Data in the Cloud NIH Security Best Practices for Controlled-Access Data Subject to the NIH Genomic Data Sharing (GDS) Policy *  Information security in cloud environments is still the responsibility of the institution; the implementation of that security is shared between the institution and the cloud service provider  You and your institution are accountable for ensuring the security of this data, not the cloud service provider.  The NIH strongly recommends that investigators consult with institutional IT leaders, including the Chief Information Officer (CIO) and the institutional Information Systems Security Officer (ISSO) Whitepaper: “Architecting for Genomic Data Security and Compliance in AWS” **  Guidance for working with controlled-access datasets from dbGaP, GWAS, and other individual-level genomic research repositories  Co-authored by Chris Whalley, formerly of the Fred Hutch regulatory compliance office * http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/GetPdf.cgi?document_name=dbgap_2b_security_procedures.pdf ** https://d0.awsstatic.com/whitepapers/compliance/AWS_dBGaP_Genomics_on_AWS_Best_Practices.pdf
  57. 57. AWS Support Options Support tiers: Basic, Developer, Business, Enterprise o AWS Trusted Advisor: Basic 4 checks; Developer 4 checks; Business 41 checks; Enterprise 41 checks o Access to Support: Basic support for health checks; Developer email (local business hours); Business phone, chat, email, live screen sharing (24/7); Enterprise phone, chat, email, live screen sharing, TAM (24/7) o Primary case handling: Basic Technical Customer Service Associate; Developer Cloud Support Associate; Business Cloud Support Engineer; Enterprise Sr. Cloud Support Engineer o Users who can create support requests: Developer 1; Business unlimited (IAM supported); Enterprise unlimited (IAM supported) o Response time: Developer <12 hours; Business <1 hour; Enterprise <15 minutes o Cost: Basic free; Developer $49/month; Business 10% of monthly bill (rate goes down at higher spending tiers); Enterprise $15,000 (rate goes down at higher spending tiers)  We used “Basic” support during the cloud on-ramp project  AWS has excellent documentation so we didn’t need to contact support during the project  I recommend that we upgrade to a Business support plan prior to production use
  58. 58. AWS Benefits  An opportunity to build a high security computing environment for our current and future security needs.  A complete, up-to-the-minute inventory of all cloud resources  Complete visibility and accountability of all IT costs associated with the Cloud  We can accurately calculate chargebacks for almost 90% of cloud costs  Disaster recovery and business continuity are a reality  Rapidly respond to urgent or unplanned IT needs  Audit all access and configuration changes  A documented, versioned, repeatable IT infrastructure is possible  Everything can be automated via well documented APIs and SDKs  No physical infrastructure equipment (servers, switches, routers, PDUs, etc…) to maintain  Collaborate with other institutions in the cloud via VPC peering  Take advantage of Amazon’s fat network pipes to download large datasets to the cloud  With the brokering layer in place, we could offer self-service IT to CIT, divisional IT and research staff across the Center.
  59. 59. AWS Challenges  The highly abstracted nature of the cloud combined with the infrastructure-as-code paradigm results in a blurring/elimination of the boundaries between traditional IT roles and separation of duties  Our current team structure in Center IT is optimized for our physical IT environment. It is not necessarily efficient for managing the software-defined world of the Cloud. The Cloud is not a virtualized copy of our IT infrastructure.  AWS is evolving at a very rapid pace. IT staff responsible for the cloud infrastructure/service will need to stay abreast of all changes and incorporate these changes into our architecture/service when beneficial  There is currently ~10ms of network latency between the Center and our Oregon VPC. Data and compute should be co-located for the best performance o On campus: ~1ms o Campus to Oregon AWS VPC: ~10ms o Campus to Europe: ~150ms o Campus to Africa: ~300ms  Our VPN can currently only encrypt network traffic to and from our VPC at a rate of 300Mb/s (37.5MB/s). Moving large data sets between campus and our VPC will be very slow until we upgrade to a dedicated direct connect or other solution.
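To put the 300Mb/s VPN limit in perspective, the sketch below estimates how long a bulk transfer would take at a given link rate. It is a best-case calculation that ignores protocol overhead and contention; the 1 TB dataset size and the 10 Gb/s comparison rate are illustrative choices, not measurements from the project.

```python
def transfer_hours(dataset_bytes, link_megabits_per_s):
    """Best-case hours to move a dataset over a link of the given rate."""
    bytes_per_second = link_megabits_per_s * 1_000_000 / 8  # 300 Mb/s -> 37.5 MB/s
    return dataset_bytes / bytes_per_second / 3600


if __name__ == "__main__":
    ONE_TB = 10**12  # decimal terabyte, illustrative dataset size
    print(f"1 TB over the 300 Mb/s VPN: {transfer_hours(ONE_TB, 300):.1f} hours")
    print(f"1 TB over a 10 Gb/s link:   {transfer_hours(ONE_TB, 10_000):.2f} hours")
```

At the VPN rate a single terabyte takes most of a working day, which is the motivation for co-locating data with compute or upgrading the link.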
  60. 60. Not Everything Works Well in the Cloud The cloud is not possible for anything that…  Is physically connected to an instrument  Requires a licensing “dongle”  Has very specific hardware requirements (model XYZ only) Examples: Aperio, OnBase The cloud is not currently a good fit for anything that…  Requires very low latency access to systems/data on Campus  Requires high throughput access to systems/data on Campus * Examples: BI, PeopleSoft, Varonis, Hyperion Might not be cost effective for anything that…  Requires a large server (many CPUs) and runs 24/7 * This limitation can be removed by implementing a 10GbE direct connect (costs $5-7K / month)
  61. 61. Capabilities Gained During This Project  Create complex cloud-based datacenters and networks  Logically integrate cloud networks with our campus network  Secure cloud resources with security groups, NACLs and third party firewalls  Limit access with fine-grained security policies and multi-factor authentication  Consistently provision and configure Windows and Linux server instances  Audit configuration changes to enforce change management  Audit all AWS access (web console, CLI, API)  Back up and recover servers in the cloud  Implement and enforce a metadata tagging scheme  Cost accounting and reporting to facilitate chargebacks  Monitor systems, trend metrics and alert support staff  Load balance private and public network traffic  Vertically scale (up/down) or horizontally scale (out/in) systems  Log all network traffic in/out of the VPC  Peer with other organizations for collaboration in the cloud  Automate everything via AWS APIs, CLI tools, CloudFormation, Packer
  62. 62. FHCRC AWS Roadmap  Determine cloud operations model (who and what)  Develop cloud governance model  Select production account and VPC architecture  Extend the production FHCRC Active Directory with an AWS AD site  Integrate AWS user authentication (IAM) with FHCRC AD via SAML  Implement chargebacks (not decided yet)  Offer brokered self service to Center IT departments  Implement a direct connect network to AWS or other solution (Internet2)  Integrate EC2 into scientific computing service offering for researchers  Offer brokered self service to the research community
  63. 63. Key Takeaways  The cloud is no longer just hype; it’s a very capable, mature platform that can offer increased agility, flexibility, security and capabilities  The cloud is not a traditional IT operating environment and requires a different approach to operate effectively  Not every server or application can or should move to the cloud  It won’t happen overnight; the journey to the cloud will take several years  Center IT is not currently offering the cloud as a service, but we may in the future
  64. 64. Thank You! Questions? Contact: Robert McDermott rmcdermo@fredhutch.org
  65. 65. Appendix
  66. 66. Proposed Security Choices Identity and Access Management for AWS  API-based access to AWS for automation tasks should be done using service accounts instead of the individual accounts of technicians.  All AWS user accounts belonging to humans must use two-factor authentication.  Permissions granted to service accounts should be restricted to the source IPs or subnets of servers needing those permissions. In AWS parlance, IAM policies granting permissions to service accounts should use source IP as a condition.  Service accounts should be used with access keys only. They should not be associated with passwords. Access keys must be stored in encrypted, team-level key repositories, not in the personal storage space of individual technicians.  The ability to modify IAM settings should be restricted to the ITSO. Exceptions (e.g. service accounts requiring IAM permissions) must be approved by the ITSO. Logging and Auditing  CloudTrail must be turned on for all regions. CloudTrail logs must be forwarded to Splunk.  Splunk should be set up to monitor CloudTrail logs and alert [the cloud operations team] of notable activities, including activities that have security implications, such as account creation and permission changes. The exact set of events to be monitored will be defined as we operationalize our AWS environment, and continuously updated as we accumulate knowledge of AWS.  CloudTrail logs must be retained in Splunk for at least a year.
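The source-IP restriction for service accounts maps onto the `aws:SourceIp` condition key in an IAM policy document. The sketch below builds such a policy as a Python dictionary and prints it as JSON. The actions and the CIDR range are placeholders chosen for illustration (203.0.113.0/24 is a documentation-only range), not the project's actual policy.

```python
import json


def source_ip_policy(actions, cidrs):
    """Build an IAM policy document that allows actions only from the given CIDRs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": "*",
                # Requests from any other source address do not match this statement.
                "Condition": {"IpAddress": {"aws:SourceIp": list(cidrs)}},
            }
        ],
    }


if __name__ == "__main__":
    policy = source_ip_policy(
        actions=["ec2:DescribeInstances", "ec2:CreateSnapshot"],  # example actions
        cidrs=["203.0.113.0/24"],  # hypothetical address range of the automation servers
    )
    print(json.dumps(policy, indent=2))
```

One design note: `aws:SourceIp` is evaluated against the address AWS sees for the caller, so the range should reflect how the automation servers actually reach the AWS API endpoints.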
  67. 67. Proposed Security Choices (cont) VPC-Level Security  The permissions to modify VPC-level configurations must be limited to data-ops staff. Changes should be made in consultation with ITSO. VPC-level configurations include, but are not limited to:  Creation/removal of subnets  Assignment/removal of Elastic IPs (EIPs)  Changes related to Access Control Lists (ACLs)  Changes related to internet gateways, VPNs, and VPC-peering  Changes related to VPC Endpoints.  All traffic between the FHCRC campus and the private IP space of our AWS VPC will go through a VPN tunnel. This represents the scenario where hosts on the FHCRC campus access non-publicly accessible hosts in our VPC, and vice versa.  A Fortigate virtual appliance will be deployed within our VPC and managed by ITSO. Firewall, IPS, anti-virus, and application control features will be enabled on the Fortigate. The appliance will inspect traffic in the following scenarios:  All traffic originating from the VPC to non-FHCRC addresses. This represents the scenario where hosts within the VPC need to initiate connections to the internet at large, for reasons such as patching.  All connections to publicly accessible hosts within the VPC, including connections originating from our campus network. EC2-level security  By default, when EC2 instances are created they should be associated with one of the pre-defined security groups created by ITSO. Network administrators should not create new security groups unless there are specific needs to do so, and it should be done in consultation with ITSO.
  68. 68. VPC Peering Example [10.99.1.50]$ traceroute 192.168.1.156 1 192.168.1.156 1.472 ms [10.99.1.50]$ curl http://192.168.1.156/ Welcome to Organization A!! [192.168.1.156]$ traceroute 10.99.1.50 1 10.99.1.50 1.417 ms [192.168.1.156]$ curl http://10.99.1.50/ Welcome to Organization B! It works! Figure labels: Organization A, Organization B, Peering connection, Route Table, 123456789, 987654321
  69. 69. DNS Cloud Integration albite01.fhcrc.org DNS Views Internal: 192.168/16, 10.168/16, 172.16/16 (AWS VPC) External: !(internal) AWS VPC resolv.conf: 172.16.160.11, 172.16.168.11, 192.168.116.A US-WEST-2A: 172.16.160.11 US-WEST-2B: 172.16.168.11 Figure labels: AWS VPC, albite10.fhcrc.org, IBX0 DNS Master, albite01.fhcrc.org, IBX-J4, IBX-E2, Grid Updates
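The internal/external split in the DNS views is a membership test on the client's source address. The sketch below reproduces that test with Python's standard `ipaddress` module, expanding the slide's shorthand (192.168/16, 10.168/16, 172.16/16) into full CIDR notation. It illustrates the matching rule only and is not the DNS server's actual configuration.

```python
import ipaddress

# Internal view networks as listed on the slide, written out in full CIDR form.
INTERNAL_NETS = [
    ipaddress.ip_network(net)
    for net in ("192.168.0.0/16", "10.168.0.0/16", "172.16.0.0/16")
]


def dns_view(client_ip):
    """Return which view a client would be served: internal, else external."""
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in INTERNAL_NETS):
        return "internal"
    return "external"  # the slide defines external as !(internal)


if __name__ == "__main__":
    for ip in ("172.16.160.11", "192.168.1.156", "8.8.8.8"):
        print(ip, "->", dns_view(ip))
```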
  70. 70. Project Scope In Scope  Develop functional and security requirements  Design and implement virtual datacenter architecture (regions, zones, subnets, etc)  Extend the Center’s IP network to the virtual datacenter (VPC)  Develop security policies and/or guidelines on appropriate use of the environment  Active Directory and DNS services  Create FHCRC server templates (AMIs) and standards  Server pricing strategy (on-demand vs. reserved instances)  Develop and test various use cases  Train operational staff on the use of the environment  Determine RBAC/account strategy  Develop accounting strategy to support future chargeback functionality  Select and pilot at least one production server/service in the new virtual datacenter  Pilot researcher use of EC2  Develop a roadmap for this environment Out of Scope  Implementing a high-speed “Direct Connect” network connection  Implementation of chargebacks  Chef automated server builds (Chef implementation project still in progress)  Customer self-service
  71. 71. Cloud Architecture Requirements During this project we’ve determined that we can satisfy all the following “must have” requirements  Secure, logical network extension of FHCRC IP space into the cloud  Encrypted transport between FHCRC and cloud  Design that allows HA architected services  Separate public network to run services outside of the FHCRC network  Support both enterprise and research computing  Cost tracking and reporting  System metrics, monitoring and logging  FHCRC Active Directory access/integration for servers  FHCRC DNS access/integration for servers  Ability for servers to log to Splunk  Role-based administrative access (RBAC)  Ability to back up / restore servers  Secure storage media wipe on deletion  Support fully automated provisioning and configuration  Support Windows 2012, CentOS 6/7 and Ubuntu 14.04 operating systems  Pre-cooked FHCRC server templates (CentOS, Ubuntu and Windows 2012)  Stateful ingress/egress firewall capability  Advanced intrusion prevention firewall  Vendor support  Granular cost reporting (per server, application type, owner) to support future chargeback implementation  Metadata tagging capability to identify and group AWS objects
