Practical Cloud Security
Upcoming SlideShare
Loading in...5
×
 

Practical Cloud Security

on

  • 4,027 views

 

Statistics

Views

Total Views
4,027
Views on SlideShare
4,027
Embed Views
0

Actions

Likes
2
Downloads
94
Comments
0

0 Embeds 0

No embeds

Accessibility

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

Practical Cloud Security Practical Cloud Security Presentation Transcript

  • Practical Cloud Security Jason Chan chan@netflix.comTuesday, October 11, 2011
  • Agenda • Background and Disclaimers • Netflix in the Cloud • Model-Driven Deployment Architecture • APIs, Automation, and the Security Monkey • Cloud Firewall and Connectivity Analysis • Practical Cloud Security GapsTuesday, October 11, 2011
  • Background and DisclaimersTuesday, October 11, 2011
  • Background and DisclaimersTuesday, October 11, 2011
  • Background and Disclaimers • No cloud definitions, but . . .Tuesday, October 11, 2011
  • Background and Disclaimers • No cloud definitions, but . . . • Focus on IaaSTuesday, October 11, 2011
  • Background and Disclaimers • No cloud definitions, but . . . • Focus on IaaS • Netflix uses Amazon Web ServicesTuesday, October 11, 2011
  • Background and Disclaimers • No cloud definitions, but . . . • Focus on IaaS • Netflix uses Amazon Web Services • Guidance should be generally applicableTuesday, October 11, 2011
  • Background and Disclaimers • No cloud definitions, but . . . • Focus on IaaS • Netflix uses Amazon Web Services • Guidance should be generally applicable • Works in progress, still many problems to solve . . .Tuesday, October 11, 2011
  • Netflix in the CloudTuesday, October 11, 2011
  • Why is Netflix Using Cloud?Tuesday, October 11, 2011
  • !"#"$%&#&($Tuesday, October 11, 2011
  • !"#"$%&#&($ Netflix could not build data centers fast enoughTuesday, October 11, 2011
  • !"#"$%&#&($ Netflix could not build data centers fast enough Capacity requirements accelerating, unpredictableTuesday, October 11, 2011
  • !"#"$%&#&($ Netflix could not build data centers fast enough Capacity requirements accelerating, unpredictable Product launch spikes - iPhone, Wii, PS2, XBoxTuesday, October 11, 2011
  • Outgrowing Data Center http://techblog.netflix.com/2011/02/redesigning-netflix-api.html Netflix API: Growth in RequestsTuesday, October 11, 2011
  • Outgrowing Data Center http://techblog.netflix.com/2011/02/redesigning-netflix-api.html Netflix API: Growth in Requests 37x Growth 1/10 - 1/11Tuesday, October 11, 2011
  • Outgrowing Data Center http://techblog.netflix.com/2011/02/redesigning-netflix-api.html Netflix API: Growth in Requests 37x Growth 1/10 - 1/11Tuesday, October 11, 2011
  • Outgrowing Data Center http://techblog.netflix.com/2011/02/redesigning-netflix-api.html Netflix API: Growth in Requests 37x Growth 1/10 - 1/11 !"#"$%&#%( )"*"$+#,(Tuesday, October 11, 2011
  • netflix.com is now ~100% CloudTuesday, October 11, 2011
  • netflix.com is now ~100% Cloud Remaining components being migratedTuesday, October 11, 2011
  • Netflix Model-Driven ArchitectureTuesday, October 11, 2011
  • Data Center PatternsTuesday, October 11, 2011
  • Data Center Patterns • Long-lived, non-elastic systemsTuesday, October 11, 2011
  • Data Center Patterns • Long-lived, non-elastic systems • Push code and config to running systemsTuesday, October 11, 2011
  • Data Center Patterns • Long-lived, non-elastic systems • Push code and config to running systems • Difficult to enforce deployment patternsTuesday, October 11, 2011
  • Data Center Patterns • Long-lived, non-elastic systems • Push code and config to running systems • Difficult to enforce deployment patterns • ‘Snowflake phenomenon’Tuesday, October 11, 2011
  • Data Center Patterns • Long-lived, non-elastic systems • Push code and config to running systems • Difficult to enforce deployment patterns • ‘Snowflake phenomenon’ • Difficult to sync or reproduce environments (e.g. test and prod)Tuesday, October 11, 2011
  • Cloud PatternsTuesday, October 11, 2011
  • Cloud Patterns • Ephemeral nodesTuesday, October 11, 2011
  • Cloud Patterns • Ephemeral nodes • Dynamic scalingTuesday, October 11, 2011
  • Cloud Patterns • Ephemeral nodes • Dynamic scaling • Hardware is abstractedTuesday, October 11, 2011
  • Cloud Patterns • Ephemeral nodes • Dynamic scaling • Hardware is abstracted • Orchestration vs. manual stepsTuesday, October 11, 2011
  • Cloud Patterns • Ephemeral nodes • Dynamic scaling • Hardware is abstracted • Orchestration vs. manual steps • Trivial to clone environmentsTuesday, October 11, 2011
  • When Moving to the Cloud, Leave Old Ways Behind . . .Tuesday, October 11, 2011
  • When Moving to the Cloud, Leave Old Ways Behind . . . Generic forklift is generally a mistakeTuesday, October 11, 2011
  • When Moving to the Cloud, Leave Old Ways Behind . . . Generic forklift is generally a mistake Adapt development, deployment, and management models appropriatelyTuesday, October 11, 2011
  • When Moving to the Cloud, Leave Old Ways Behind . . . Generic forklift is generally a mistake Adapt development, deployment, and management models appropriatelyTuesday, October 11, 2011
  • Netflix Build and Deploy http://techblog.netflix.com/2011/08/building-with-legos.htmlTuesday, October 11, 2011
  • Netflix Build and Deploy http://techblog.netflix.com/2011/08/building-with-legos.html Perforce SCMTuesday, October 11, 2011
  • Netflix Build and Deploy http://techblog.netflix.com/2011/08/building-with-legos.html Continuous Integration Jenkins Perforce SCMTuesday, October 11, 2011
  • Netflix Build and Deploy http://techblog.netflix.com/2011/08/building-with-legos.html Continuous Integration Jenkins Perforce Artifactory SCM Binary RepositoryTuesday, October 11, 2011
  • Netflix Build and Deploy http://techblog.netflix.com/2011/08/building-with-legos.html App-Specific Continuous Packages and Integration Configuration Jenkins Yum Perforce Artifactory SCM Binary RepositoryTuesday, October 11, 2011
  • Netflix Build and Deploy http://techblog.netflix.com/2011/08/building-with-legos.html App-Specific Continuous Packages and Integration Configuration Jenkins Yum Perforce Artifactory Bakery SCM Binary Combine Base and Repository App-Specific ConfigurationTuesday, October 11, 2011
  • Netflix Build and Deploy http://techblog.netflix.com/2011/08/building-with-legos.html App-Specific Customized, Continuous Packages and Cloud-Ready Integration Configuration Image Jenkins Yum AMI Perforce Artifactory Bakery SCM Binary Combine Base and Repository App-Specific ConfigurationTuesday, October 11, 2011
  • Netflix Build and Deploy http://techblog.netflix.com/2011/08/building-with-legos.html App-Specific Customized, Continuous Packages and Cloud-Ready Integration Configuration Image Jenkins Yum AMI Perforce Artifactory Bakery ASG SCM Binary Combine Base and Dynamic Repository App-Specific Scaling ConfigurationTuesday, October 11, 2011
  • Netflix Build and Deploy http://techblog.netflix.com/2011/08/building-with-legos.html App-Specific Customized, Continuous Packages and Cloud-Ready Integration Image Live System! Configuration Jenkins Yum AMI Instance Perforce Artifactory Bakery ASG SCM Binary Combine Base and Dynamic Repository App-Specific Scaling ConfigurationTuesday, October 11, 2011
  • Netflix Build and Deploy http://techblog.netflix.com/2011/08/building-with-legos.html App-Specific Customized, Continuous Packages and Cloud-Ready Integration Image Live System! Configuration Jenkins Yum AMI Instance Perforce Artifactory Bakery ASG SCM Binary Combine Base and Dynamic Repository App-Specific Scaling Configuration Every change is a new pushTuesday, October 11, 2011
  • ResultsTuesday, October 11, 2011
  • Results • No changes to running systemsTuesday, October 11, 2011
  • Results • No changes to running systems • No CMDBTuesday, October 11, 2011
  • Results • No changes to running systems • No CMDB • No systems management infrastructureTuesday, October 11, 2011
  • Results • No changes to running systems • No CMDB • No systems management infrastructure • Fewer logins to prod systemsTuesday, October 11, 2011
  • Impact on SecurityTuesday, October 11, 2011
  • Impact on Security • File integrity monitoringTuesday, October 11, 2011
  • Impact on Security • File integrity monitoring • User activity monitoringTuesday, October 11, 2011
  • Impact on Security • File integrity monitoring • User activity monitoring • Vulnerability managementTuesday, October 11, 2011
  • Impact on Security • File integrity monitoring • User activity monitoring • Vulnerability management • Patch managementTuesday, October 11, 2011
  • APIs, Automation, and the Security MonkeyTuesday, October 11, 2011
  • Common Challenges for Security EngineersTuesday, October 11, 2011
  • Common Challenges for Security Engineers • Lots of data from different sources, in different formatsTuesday, October 11, 2011
  • Common Challenges for Security Engineers • Lots of data from different sources, in different formats • Too many administrative interfaces and disconnected systemsTuesday, October 11, 2011
  • Common Challenges for Security Engineers • Lots of data from different sources, in different formats • Too many administrative interfaces and disconnected systems • Too few options for scalable automationTuesday, October 11, 2011
  • Enter the Cloud . . .Tuesday, October 11, 2011
  • How do you . . .Tuesday, October 11, 2011
  • How do you . . . • Add a user account?Tuesday, October 11, 2011
  • How do you . . . • Add a user account? • Inventory systems?Tuesday, October 11, 2011
  • How do you . . . • Add a user account? • Inventory systems? • Change a firewall config?Tuesday, October 11, 2011
  • How do you . . . • Add a user account? • Inventory systems? • Change a firewall config? • Snapshot a drive for forensic analysis?Tuesday, October 11, 2011
  • How do you . . . • Add a user account? • Inventory systems? • Change a firewall config? • Snapshot a drive for forensic analysis? • Disable a multi-factor authentication token?Tuesday, October 11, 2011
  • How do you . . . • Add a user account? • CreateUser() • Inventory systems? • Change a firewall config? • Snapshot a drive for forensic analysis? • Disable a multi-factor authentication token?Tuesday, October 11, 2011
  • How do you . . . • Add a user account? • CreateUser() • Inventory systems? • DescribeInstances() • Change a firewall config? • Snapshot a drive for forensic analysis? • Disable a multi-factor authentication token?Tuesday, October 11, 2011
  • How do you . . . • Add a user account? • CreateUser() • Inventory systems? • DescribeInstances() • Change a firewall config? • AuthorizeSecurityGroup Ingress() • Snapshot a drive for forensic analysis? • Disable a multi-factor authentication token?Tuesday, October 11, 2011
  • How do you . . . • Add a user account? • CreateUser() • Inventory systems? • DescribeInstances() • Change a firewall config? • AuthorizeSecurityGroup Ingress() • Snapshot a drive for forensic analysis? • CreateSnapshot() • Disable a multi-factor authentication token?Tuesday, October 11, 2011
  • How do you . . . • Add a user account? • CreateUser() • Inventory systems? • DescribeInstances() • Change a firewall config? • AuthorizeSecurityGroup Ingress() • Snapshot a drive for forensic analysis? • CreateSnapshot() • Disable a multi-factor • DeactivateMFADevice() authentication token?Tuesday, October 11, 2011
  • Security Monkey http://techblog.netflix.com/2011/07/netflix-simian-army.htmlTuesday, October 11, 2011
  • Security Monkey http://techblog.netflix.com/2011/07/netflix-simian-army.html • Leverages cloud APIsTuesday, October 11, 2011
  • Security Monkey http://techblog.netflix.com/2011/07/netflix-simian-army.html • Leverages cloud APIs • Centralized framework for cloud security monitoring and analysisTuesday, October 11, 2011
  • Security Monkey http://techblog.netflix.com/2011/07/netflix-simian-army.html • Leverages cloud APIs • Centralized framework for cloud security monitoring and analysis • Certificate and cipher monitoringTuesday, October 11, 2011
  • Security Monkey http://techblog.netflix.com/2011/07/netflix-simian-army.html • Leverages cloud APIs • Centralized framework for cloud security monitoring and analysis • Certificate and cipher monitoring • Firewall configuration checksTuesday, October 11, 2011
  • Security Monkey http://techblog.netflix.com/2011/07/netflix-simian-army.html • Leverages cloud APIs • Centralized framework for cloud security monitoring and analysis • Certificate and cipher monitoring • Firewall configuration checks • User/group/policy monitoringTuesday, October 11, 2011
  • Cloud Firewall and Connectivity AnalysisTuesday, October 11, 2011
  • Analyzing Traditional FirewallsTuesday, October 11, 2011
  • Analyzing Traditional Firewalls • Positioned at network chokepoints, providing optimal internetwork visibilityTuesday, October 11, 2011
  • Analyzing Traditional Firewalls • Positioned at network chokepoints, providing optimal internetwork visibility • Use tools like tcpdump, NetFlow, centralized logging to gather dataTuesday, October 11, 2011
  • Analyzing Traditional Firewalls • Positioned at network chokepoints, providing optimal internetwork visibility • Use tools like tcpdump, NetFlow, centralized logging to gather data • Review traffic patterns and optimizeTuesday, October 11, 2011
  • AWS Firewalls (Briefly)Tuesday, October 11, 2011
  • AWS Firewalls (Briefly) • “Security Group” is unit of measure for firewallingTuesday, October 11, 2011
  • AWS Firewalls (Briefly) • “Security Group” is unit of measure for firewalling • Policy-driven and network-agnostic, configuration follows an instanceTuesday, October 11, 2011
  • AWS Firewalls (Briefly) • “Security Group” is unit of measure for firewalling • Policy-driven and network-agnostic, configuration follows an instance • Network diagram irrelevantTuesday, October 11, 2011
  • AWS Firewalls (Briefly) • “Security Group” is unit of measure for firewalling • Policy-driven and network-agnostic, configuration follows an instance • Network diagram irrelevant • Chokepoints and sniffing are not possibleTuesday, October 11, 2011
  • AWS Firewalls (Briefly) • “Security Group” is unit of measure for firewalling • Policy-driven and network-agnostic, configuration follows an instance • Network diagram irrelevant • Chokepoints and sniffing are not possible • Outbound connections not filterable (!)Tuesday, October 11, 2011
  • Security Group AnalysisTuesday, October 11, 2011
  • Security Group Analysis • Use config and inventory to map reachabilityTuesday, October 11, 2011
  • Security Group Analysis • Use config and inventory to map reachability • Leverage APIs to evaluate reachability and detect violations:Tuesday, October 11, 2011
  • Security Group Analysis • Use config and inventory to map reachability • Leverage APIs to evaluate reachability and detect violations: • Security groups with no membersTuesday, October 11, 2011
  • Security Group Analysis • Use config and inventory to map reachability • Leverage APIs to evaluate reachability and detect violations: • Security groups with no members • “Insecure” services (e.g. Telnet, FTP)Tuesday, October 11, 2011
  • Security Group Analysis • Use config and inventory to map reachability • Leverage APIs to evaluate reachability and detect violations: • Security groups with no members • “Insecure” services (e.g. Telnet, FTP) • Rules that use “any” keywordTuesday, October 11, 2011
  • Security Group Analysis • Use config and inventory to map reachability • Leverage APIs to evaluate reachability and detect violations: • Security groups with no members • “Insecure” services (e.g. Telnet, FTP) • Rules that use “any” keyword • Visualize config into data flow diagramTuesday, October 11, 2011
  • Reachability & Violation AnalysisTuesday, October 11, 2011
  • Connectivity AnalysisTuesday, October 11, 2011
  • Connectivity Analysis • Reachability shows what “can” communicateTuesday, October 11, 2011
  • Connectivity Analysis • Reachability shows what “can” communicate • What about what is communicating?Tuesday, October 11, 2011
  • Connectivity Analysis • Reachability shows what “can” communicate • What about what is communicating? • Take same approach, leverage APIs for firewall and inventory and combine with host dataTuesday, October 11, 2011
  • Connectivity Analysis • Reachability shows what “can” communicate • What about what is communicating? • Take same approach, leverage APIs for firewall and inventory and combine with host data • Visualize data into connectivity diagramTuesday, October 11, 2011
  • Connectivity AnalysisTuesday, October 11, 2011
  • ‘Practical’ Cloud Security GapsTuesday, October 11, 2011
  • Common Security Product ModelTuesday, October 11, 2011
  • Common Security Product Model • Examples - AV, FIM, etc.Tuesday, October 11, 2011
  • Common Security Product Model • Examples - AV, FIM, etc. • “Management” station with client “nodes”Tuesday, October 11, 2011
  • Common Security Product Model • Examples - AV, FIM, etc. • “Management” station with client “nodes” • Limited tagging or abstractionTuesday, October 11, 2011
  • Common Security Product Model • Examples - AV, FIM, etc. • “Management” station with client “nodes” • Limited tagging or abstraction • Strong “manager” and “managed” modelTuesday, October 11, 2011
  • Common Security Product Model • Examples - AV, FIM, etc. • “Management” station with client “nodes” • Limited tagging or abstraction • Strong “manager” and “managed” model • Push and pull approachesTuesday, October 11, 2011
  • Common Security Product Model • Examples - AV, FIM, etc. • “Management” station with client “nodes” • Limited tagging or abstraction • Strong “manager” and “managed” model • Push and pull approaches • Per node licensingTuesday, October 11, 2011
  • “Thundering Herd”Tuesday, October 11, 2011
  • “Thundering Herd” • Mass deploymentsTuesday, October 11, 2011
  • “Thundering Herd” • Mass deployments • “Red/Black” push - concurrent clusters of 500+ nodesTuesday, October 11, 2011
  • “Thundering Herd” • Mass deployments • “Red/Black” push - concurrent clusters of 500+ nodes • Elasticity related to traffic spikesTuesday, October 11, 2011
  • “Thundering Herd” • Mass deployments • “Red/Black” push - concurrent clusters of 500+ nodes • Elasticity related to traffic spikes • Licensing constraintsTuesday, October 11, 2011
  • Node Ephemerality and Service AbstractionTuesday, October 11, 2011
  • Node Ephemerality and Service Abstraction • Data related to individual nodes becomes less importantTuesday, October 11, 2011
  • Node Ephemerality and Service Abstraction • Data related to individual nodes becomes less important • Dealing with short-lived systems, IP and ID reuseTuesday, October 11, 2011
  • Node Ephemerality and Service Abstraction • Data related to individual nodes becomes less important • Dealing with short-lived systems, IP and ID reuse • Event and log archives and data relationshipsTuesday, October 11, 2011
  • Resource Usage Logging and AuditingTuesday, October 11, 2011
  • Resource Usage Logging and Auditing • Public-facing APIs make access controls more difficult and more importantTuesday, October 11, 2011
  • Resource Usage Logging and Auditing • Public-facing APIs make access controls more difficult and more important • Programmable infrastructure needs robust logging and auditing capabilitiesTuesday, October 11, 2011
  • Resource Usage Logging and Auditing • Public-facing APIs make access controls more difficult and more important • Programmable infrastructure needs robust logging and auditing capabilities • Can metering data be repurposed?Tuesday, October 11, 2011
  • Identity IntegrationTuesday, October 11, 2011
  • Identity Integration • Federation use casesTuesday, October 11, 2011
  • Identity Integration • Federation use cases • On-instance credentialsTuesday, October 11, 2011
  • “Trusted Cloud”Tuesday, October 11, 2011
  • “Trusted Cloud” • Various components related to providing higher assurance/trust levels in the cloudTuesday, October 11, 2011
  • “Trusted Cloud” • Various components related to providing higher assurance/trust levels in the cloud • Virtual TPM / hardware root of trustTuesday, October 11, 2011
  • “Trusted Cloud” • Various components related to providing higher assurance/trust levels in the cloud • Virtual TPM / hardware root of trust • Controlled executionTuesday, October 11, 2011
  • “Trusted Cloud” • Various components related to providing higher assurance/trust levels in the cloud • Virtual TPM / hardware root of trust • Controlled execution • HSM in the cloudTuesday, October 11, 2011
  • Thanks! Questions? chan@netflix.com (I’m hiring!)Tuesday, October 11, 2011
  • References • http://www.slideshare.net/adrianco • http://aws.amazon.com • http://techblog.netflix.com • http://nordsecmob.tkk.fi/Thesisworks/Soren %20Bleikertz.pdf • https://cloudsecurityalliance.org/ • http://www.nist.gov/itl/cloud/index.cfmTuesday, October 11, 2011