Unexpected Leaks in AWS Transit Gateways
Cloud Village @ DEFCON32
William Taylor
Introduction
• William Taylor
• Security Consultant @ WithSecure
• Cloud, Kubernetes, Mobile Security
The Scenario
• Client has a new deployment of sensitive compute resources
• Isolation of compute stated as primary security concern
• New deployment is temporarily connected to old deployment during transition phase (1 to 2 years)
• WithSecure to perform a security assessment to assess efficacy of the design’s security controls
The Design
[Diagram: Isolated Account VPC with Compute and Egress subnets in Availability Zones A, B, and C, plus AWS PrivateLink; a TGW on the isolated side is peered with a TGW attached to the Legacy Account VPC (Subnet 1, Subnet 2, ..., Subnet N)]
The Expectation
[Diagram: Isolated Account VPC (Compute and Egress subnets in Availability Zones A, B, and C) connected through two peered TGWs to the Legacy Account VPC (Subnet 1, Subnet 2, ..., Subnet N)]
The Reality
[Diagram: Isolated Account VPC (Compute and Egress subnets in Availability Zones A, B, and C) connected through two peered TGWs to the Legacy Account VPC (Subnet 1, Subnet 2, ..., Subnet N)]
nmap found ALL hosts up
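The kind of reachability check behind this slide can be approximated without nmap. The following is a minimal sketch, not the tooling used in the assessment: it attempts plain TCP connections from a test instance, and the function names, the port, and the CIDR range passed in are all assumptions chosen by whoever runs it.

```python
import socket
from ipaddress import ip_network


def tcp_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def sweep(cidr: str, port: int, timeout: float = 1.0) -> list[str]:
    """Return every host address in cidr that accepts a TCP connection on port."""
    return [
        str(ip)
        for ip in ip_network(cidr).hosts()
        if tcp_reachable(str(ip), port, timeout)
    ]
```

Calling `sweep` with a legacy subnet range and a port known to be open there should return an empty list if isolation holds. Any address in the result means traffic is getting through. A filtered host and a closed port both report as unreachable here, so an empty result is weaker evidence than a full nmap scan.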
The Investigation
• The design makes sense, but the evidence proves otherwise. What’s going on?
• Reviewed AWS account
• Reviewed IaC
• Double/triple checked NACLs
• Google
• Blog post on AWS support forum
• “…we use multiple subnets by AZ. Our standard VPC configuration includes two subnets in AZ … two subnets in AZ B ... [a]ccording to my test and documentation, it is impossible to link two or more subnets to a Transit Gateway Attachment.”
• “The subnet association is simply the subnet WITHIN THE ENTIRE AZ … it will be able to communicate to any subnet in that AZ, as long as your routing rules and security groups allow it.”
The Explanation
[Diagram: the same architecture, revisited to correct the position of the TGW attachments: Isolated Account VPC (Compute and Egress subnets in Availability Zones A, B, and C) connected through two peered TGWs to the Legacy Account VPC (Subnet 1, Subnet 2, ..., Subnet N)]
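The gap between the expected and the observed behaviour can be expressed as a small model. The sketch below is a toy illustration of the explanation given in this talk, not a statement of how AWS evaluates NACLs in general. The `Subnet` class, the `"*"` marker for a default allow-all NACL, and all subnet names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subnet:
    name: str
    az: str
    nacl_allowed: frozenset  # destinations the subnet's NACL allows; "*" = allow all


def allows(subnet, dest):
    """True if the subnet's NACL permits traffic to dest."""
    return "*" in subnet.nacl_allowed or dest in subnet.nacl_allowed


def attachment_in_az(source, subnets, attachment_names):
    """Return the TGW attachment subnet that shares an AZ with source, if any."""
    for s in subnets:
        if s.name in attachment_names and s.az == source.az:
            return s
    return None


def expected_reach(source, dest, subnets, attachment_names):
    """Design assumption: traffic must also pass the attachment subnet's NACL."""
    att = attachment_in_az(source, subnets, attachment_names)
    return att is not None and allows(source, dest) and allows(att, dest)


def observed_reach(source, dest, subnets, attachment_names):
    """Behaviour described in the talk: the attachment serves the whole AZ,
    so only the source subnet's own NACL is in the path."""
    att = attachment_in_az(source, subnets, attachment_names)
    return att is not None and allows(source, dest)
```

With a compute subnet on a default allow-all NACL and an egress subnet that only allows `legacy-subnet-1`, `expected_reach` returns `False` for `legacy-subnet-2` while `observed_reach` returns `True`. That matches the nmap results above.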
The Fix
• Solutions and recommendations are guidance only
• Apply restrictive NACLs to compute subnets
• If using NACLs, don’t keep the default allow all
• VPC peering as an alternative to TGW peering
• Separate Compute and Egress VPCs
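One way to act on the NACL recommendations is to audit which subnets still sit behind an allow-all rule. The following is a minimal sketch, assuming the NACL data is already available as dictionaries shaped like the `NetworkAcls` list returned by the EC2 DescribeNetworkAcls API (`Entries`, `Associations`, `RuleAction`, `Protocol`, `CidrBlock`, `Ipv6CidrBlock`, `RuleNumber`). The function names and the sample IDs are hypothetical.

```python
def allow_all_rule_numbers(nacl: dict) -> list[int]:
    """Rule numbers in one NACL that allow every protocol to or from any address."""
    return sorted(
        entry["RuleNumber"]
        for entry in nacl.get("Entries", [])
        if entry.get("RuleAction") == "allow"
        and entry.get("Protocol") == "-1"  # "-1" means all protocols
        and (
            entry.get("CidrBlock") == "0.0.0.0/0"
            or entry.get("Ipv6CidrBlock") == "::/0"
        )
    )


def subnets_with_allow_all(nacls: list[dict]) -> dict[str, list[int]]:
    """Map each associated subnet ID to the allow-all rule numbers of its NACL."""
    findings = {}
    for nacl in nacls:
        rules = allow_all_rule_numbers(nacl)
        if not rules:
            continue
        for assoc in nacl.get("Associations", []):
            findings[assoc["SubnetId"]] = rules
    return findings
```

Any compute subnet appearing in the result is relying on something other than its own NACL for isolation, which is the situation this assessment uncovered. The check covers ingress and egress entries alike. Filter on the `Egress` flag if only one direction matters.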
The Conclusion
• Strong design, strong start
• Security design review is valuable
• Practical testing to verify is critical
• Dangers of dodgy documentation
The Others
• AWS EC2 Deployment
  • Public IPs; security group set to allow all
  • iptables rules used to prevent TCP connections
  • Scanning showed 1 of 800 was publicly exposed
  • Error in the init script, rule never set
  • Identified with security assessment
• Azure subscription with sensitive compute
  • Large number of NSGs; granular permissions
  • Outbound rule used AzureCloud service tag
  • Permitted outbound connection to all Azure compute IPs
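The EC2 case suggests verifying that a host-level rule was actually loaded rather than trusting the init script. The sketch below is one possible check, assuming the output of `iptables -S` has already been captured as text. The function name and the example rule are hypothetical, since the slides do not state the exact rule used.

```python
import shlex


def rule_present(iptables_s_output: str, expected_rule: str) -> bool:
    """True if expected_rule appears as a line of `iptables -S` output.

    Lines are compared token by token, so differences in spacing are ignored.
    """
    expected = shlex.split(expected_rule)
    return any(
        shlex.split(line) == expected
        for line in iptables_s_output.splitlines()
    )
```

For example, `rule_present(output, "-A INPUT -p tcp -j DROP")` returns `False` when the init script failed and only the default policies are listed. iptables may print a rule in a normalised form that differs from how it was entered, so the expected string should be copied from a host where the rule is known to be set.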

Editor's Notes

  • #1 Hi everyone, thanks for being here. Great effort making it to the final day and the final talk, looking fresher than I feel. I’m here to talk through a security assessment that threw up strange behaviour in TGW: the issue, the investigation, and the lessons learned.
  • #2 I’m Will, a security consultant at WithSecure. I perform security reviews and offensive security testing for a range of clients: mobile, K8s, and, for this talk, cloud.
  • #3 Sensitive compute, deploying to a new region. Compute is deployed automatically as needed when customers request it, with no direct control over compute deployment. Isolation is the key concern: from the internet, from other regions, and from other deployments. Connected to the old region during the transition; auto deployment will place compute in both regions, with connectivity only when required. WithSecure to perform a security review, with a clear focus on isolation of resources.
  • #4 Isolated account with a VPC and some subnets: compute subnets, where the instances will be deployed, and egress subnets for connecting to other services. Subnets are spread across three AZs. Routing tables forward most traffic to PrivateLinks, but we are not looking at that. We are interested in the link to the legacy region, a similar deployment of compute VPCs and subnets. The link uses TGW: a TGW attachment is applied to each egress subnet, and routing tables are configured to send the private IP ranges of the legacy subnets to the TGW. The TGW is peered to another TGW, which in turn is attached to the legacy subnets (simplified legacy details, not reviewed in the assessment). Finally, and importantly, NACLs are applied in the egress subnets, granularly configured for each legacy subnet, plus a deny all.
  • #5 Testing from an instance deployed to one compute subnet. Attempting to hit legacy will hit the NACLs.
  • #6 Simple nmap to a known host in the legacy region: HIT. Nmap to a known subnet: HIT, HIT, HIT. Nmap to all known private IPs in legacy: CLICK. Something is wrong here.
  • #7 Simplified arch diagram matches the design docs, seems OK. The instance is in the compute subnet and the routing tables make sense. Checked the IaC: can see the TGW attachments to the egress subnets, no clear signs of what is amiss. Checked the NACLs. And then again. And again. But it’s not just one instance, all can be hit, and there is no ALLOW all with high priority. Time for a Google. The search turned up an interesting post on the AWS Discussion Forum. Not exactly the same problem: someone is trying to connect TGWs to multiple subnets in the AZ but can’t; the docs and testing confirm you can’t, only one subnet. But the response tells us what we need to know.
  • #8 The association is in THE ENTIRE AZ. So even if we connect to Egress… we are not ONLY connecting to the Egress.
  • #9 Revisit our diagram: there is a mistake, the TGW attachments should really be here. And if the TGW is here, then it also means the Egress subnets are basically not there… And with our Egress subnets, we also lose our NACLs, and so… Now we can see the root of the problem. It’s not that the NACLs were wrong. It’s that the traffic never even hit the NACLs. There were no NACLs in the Compute subnet, and that is not a diagram oversight.
  • #10 So what can we do to fix it? Caveat: solutions are context dependent. But NACLs should really be applied to the subnet where compute is deployed; also, don’t use default NACLs with default allow. Maybe VPC peering? But that is a design change. Maybe separate VPCs for compute and Egress? Again, context and design needs have to be considered.