Cloud orchestration risks
Glib Pakharenko
Cloud orchestration stack
Cloud services (TOP layer)
Kubernetes architecture
Risk #1: Compromise in the cloud
A compromise to the degree described earlier is effectively
irreparable, and the standard advice to "flatten and rebuild" every
compromised system is simply not feasible or even possible if
Active Directory has been compromised or destroyed.
© https://docs.microsoft.com/en-us/windows-server/identity/ad-ds/plan/security-best-practices/planning-for-compromise
Recovery from a full compromise is extremely difficult requiring
vast amounts of time. A likely scenario is that a victim may need
to rebuild the domain from scratch.
© https://securelink.net/insights/pentesting-stories-errors-limited-hygiene/
How to quickly rebuild infrastructure?
• manual deployment guides that run to xx pages are not feasible
• automate deployment, even for small deployments:
  • Ansible/Chef/Puppet for nodes
  • Terraform for Kubernetes deployments
  • Helm for application deployments
  • CI/CD tools for deployments
• use managed services for quick setup (e.g. a cloud database engine
instead of a database on IaaS)
How to do incident response (IR) in the cloud?
• dedicated cloud account and infrastructure for security
• log collection and monitoring:
  • container compromise monitoring (who runs a shell there?)
  • cloud object logs
  • Kubernetes object logs
• netflow/sflow collection and monitoring
• compliance monitoring (e.g. Forseti)
• budget monitoring (e.g. unexpected spending may mean someone is
mining cryptocurrency on your infrastructure)
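The "who runs a shell there?" question can be sketched as a filter over Kubernetes API server audit events. This is a minimal illustration, not a detection product: it assumes audit logging is enabled and that events arrive as parsed JSON dictionaries with the usual `objectRef`, `user` and `requestURI` fields. The function name `find_shell_execs` and the list of shell names are assumptions made for this example.

```python
from urllib.parse import urlparse, parse_qs

# Shell binaries we treat as "interactive access" (illustrative list, an assumption).
SHELL_NAMES = {"sh", "bash", "ash", "dash", "zsh"}


def find_shell_execs(events):
    """Return audit events where a user ran a shell via `pods/exec`.

    `events` is an iterable of dicts shaped like Kubernetes audit events.
    """
    hits = []
    for ev in events:
        ref = ev.get("objectRef") or {}
        # Only exec requests against pods are of interest here.
        if ref.get("resource") != "pods" or ref.get("subresource") != "exec":
            continue
        # The executed command is carried as repeated `command=` query parameters.
        query = parse_qs(urlparse(ev.get("requestURI", "")).query)
        command = query.get("command", [])
        if not command:
            continue
        binary = command[0].rsplit("/", 1)[-1]  # "/bin/sh" -> "sh"
        if binary in SHELL_NAMES:
            hits.append({
                "user": (ev.get("user") or {}).get("username"),
                "namespace": ref.get("namespace"),
                "pod": ref.get("name"),
                "command": command,
            })
    return hits
```

In practice the output would be forwarded as alerts to the dedicated security account rather than inspected by hand.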
How to manage access in the cloud? To have good assurance
that access has really been revoked, you
need to redeploy the assets!
© https://cloud.google.com/security/data-loss-prevention/revoking-user-access
If you're concerned about the code integrity of your
deployed apps, you may want to redeploy them
(including any modules) with a known-good checkout
from your version control system.
How to manage access in the cloud?
© https://cloud.google.com/security/compromised-credentials
• Replace Service Account Private Keys (JSON and p12 files)
• Regenerate API Keys
• Reset OAuth2 Client ID Secrets
• Remove Google Cloud SDK Credentials
• Remove SSH keys from VMs
• Revoke access to Cloud SQL databases
• Invalidate Browser Cookies
• Delete all unauthorized resources
• Many other potential backdoors in cloud and Kubernetes objects could exist
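The checklist above can be turned into a simple triage over a credential inventory. The sketch below is hypothetical: the `Credential` record, the `triage` function and the `approved` set are invented for illustration, and it assumes you can export creation timestamps for keys and secrets from your cloud provider. Credentials that existed before the compromise started are queued for rotation; credentials created afterwards that nobody approved are treated as potential backdoors.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Credential:
    kind: str          # e.g. "service-account-key", "api-key", "ssh-key"
    name: str
    created: datetime


def triage(credentials, compromise_start, approved=frozenset()):
    """Split credentials into (rotate, investigate) lists.

    rotate:      existed before the compromise, so may have been stolen.
    investigate: created during or after the compromise and not in `approved`,
                 so may have been planted by the attacker.
    """
    rotate, investigate = [], []
    for cred in credentials:
        if cred.created < compromise_start:
            rotate.append(cred)
        elif cred.name not in approved:
            investigate.append(cred)
    return rotate, investigate
```

A triage like this only covers credentials you can enumerate, which is why redeploying from a known-good state remains the stronger assurance.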
Risk #2: Issues with governance in the cloud
Consider the following issues with access management and
resource provisioning:
1. Access management:
a. Admins in the cloud objects.
b. Admins in the Kubernetes objects.
c. Admins in the applications and Docker images (hardcoded credentials).
2. Kubernetes has more than a hundred types of objects:
a. How to authorize creation of new K8 objects?
b. How to authorize creation of new Docker images?
c. How to authorize creation of new cloud objects (e.g. storage buckets)?
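One way to answer "how to authorize creation of new K8 objects" is to express the rule as code and evaluate it on every create request. The sketch below shows only the decision logic for a Pod manifest. In a real cluster this would sit behind a validating admission webhook or a policy engine. The registry name `registry.example.com/` and the function `review_pod` are assumptions made for the example.

```python
# Hypothetical allow-list of image registries; replace with your own.
ALLOWED_REGISTRIES = ("registry.example.com/",)


def review_pod(pod):
    """Return a list of policy violations for a Pod manifest (as a dict).

    An empty list means the object may be created.
    """
    problems = []
    spec = pod.get("spec") or {}
    containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])
    for container in containers:
        name = container.get("name", "<unnamed>")
        image = container.get("image", "")
        # Rule 1: images must come from an approved registry.
        if not image.startswith(ALLOWED_REGISTRIES):
            problems.append(f"{name}: image '{image}' is not from an allowed registry")
        # Rule 2: privileged containers are rejected.
        if (container.get("securityContext") or {}).get("privileged"):
            problems.append(f"{name}: privileged containers are not allowed")
    return problems
```

Keeping the rules in version control gives them the same review process as application code.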
Key DevOps problem:
1. The old paradigm (InfoSec mandates through policy -> IT
implements) no longer works.
2. InfoSec -> transforms into -> DevSecOps:
a. DevSecOps implements a one-click solution and IT just deploys it
(through Terraform/Helm/CI etc.)
3. More responsibility falls on DevSecOps:
a. What if the CPU is overloaded because of improper web-firewall rules?
b. Is there a proper workaround for a security bug (CVE) that can be used
instead of updating the software?
c. DevSecOps should follow the technology policy (say MongoDB is
banned, and we need to use Azure Cassandra).
Risk #3: Issues with segregation
Segregation should be implemented on many tiers:
1. Consider a cryptocurrency exchange. If we run the
different blockchain nodes (BTC/LTC/Monero) in the same cluster, then
we’re very vulnerable if one is compromised.
2. Implement Pod<->Pod network-level filtering (and encryption if possible).
3. Implement message queue security between microservices (in many
deployments only access to the bus itself is authorized).
4. Limit access from Pods to cloud service accounts (and consider the
risk of SSRF bugs!).
5. Do not use shared secrets (say for OAuth2).
6. Limit the ability to commit to the main branch (e.g. only a few seniors can handle
pull requests).
7. Separate Prod/Test/Dev clusters and their cloud accounts.
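Pod<->Pod filtering usually starts from a default-deny NetworkPolicy in each namespace, with explicit allow rules added on top. The sketch below checks which namespaces lack one. It assumes the policies have already been loaded as dictionaries, for example from exported manifests. The function names are made up for this illustration.

```python
def is_default_deny_ingress(policy):
    """True if the NetworkPolicy selects all pods and allows no ingress."""
    spec = policy.get("spec") or {}
    return (
        policy.get("kind") == "NetworkPolicy"
        # An empty podSelector matches every pod in the namespace.
        and spec.get("podSelector") == {}
        # When policyTypes is omitted, Kubernetes treats the policy as Ingress.
        and "Ingress" in spec.get("policyTypes", ["Ingress"])
        # No ingress rules means nothing is allowed in.
        and not spec.get("ingress")
    )


def namespaces_without_default_deny(namespaces, policies):
    """Return the namespaces that have no default-deny ingress policy."""
    covered = {
        (p.get("metadata") or {}).get("namespace", "default")
        for p in policies
        if is_default_deny_ingress(p)
    }
    return sorted(set(namespaces) - covered)
```

A check like this fits in a CI job, so a new namespace without a baseline policy is flagged before it reaches production.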
Other important points
1. Do not hardcode credentials in any way. Use Kubernetes secrets at least.
2. Watch for vulnerabilities in all infrastructure layers:
a. Docker images (e.g. how many vulnerabilities are there in an image that has
not been rebuilt for a year?).
b. Cloud.
c. K8
d. etc.
3. Implement hardening on all levels (Application/OS/Docker/K8/..)
4. Lack of knowledge about modern clouds inside teams leads to a
security nightmare.
5. Use signed Docker images, and sign code and packages.
6. Use technology-specific tools for compliance (e.g. K8 admission
controllers, or WP core::integrity checking).
7. Manual operations lead to mistakes (e.g. publishing a service on an
external IP -> a firewall must be implemented too).
8. High development speed leads to a lack of documentation
(e.g. all software requirements live only in Jira tickets). If
you do not have order now, then you won’t have it in the
cloud either.
9. A plain copy->cloud of the enterprise infrastructure is costly and
often ineffective. You need to shift the whole
development & deployment process to the cloud paradigm.
10. It is very hard to move a monolithic application to K8
and cloud microservices.
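The first point in this list (no hardcoded credentials) is one of the easier ones to check automatically. The following is a minimal sketch of a scanner that could run in CI over source files, Dockerfiles or manifests. The rule names and the `scan_text` function are assumptions for illustration, and the three patterns are examples only, far from a complete rule set.

```python
import re

# Illustrative patterns only; real scanners ship far larger rule sets.
RULES = {
    "password_assignment": re.compile(
        r"(?i)\b(password|passwd|secret|api_key|token)\b\s*[:=]\s*[\"'][^\"']+[\"']"
    ),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_text(text):
    """Return (line_number, rule_name) pairs for suspected hardcoded secrets."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in RULES.items():
            if pattern.search(line):
                findings.append((line_no, rule_name))
    return findings
```

Failing the build on any finding is a simple way to keep secrets out of images before they are published.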
