Connecting Kafka Across Multiple AWS VPCs
Benoit Carrière, Expedia Group
Topics
• Our use case

• Alternatives

• Our solution
Use Case
• Expedia site

• Hotel pages and more

• Different AWS accounts/VPCs
VPC Peering ?
• Overlapping IP address spaces

• Exposes more than what is needed

• Too many peering connections to manage (one per client VPC)

• Strongly discouraged by our architects
[Figure: client VPCs each asking "Can I haz topics, plz?"; reply: "Help :("]
Internet-Facing ?
• Judged to be too risky

• Had to manage whitelist of private IPs

• Firewall limitation (4 Gbps cap)

• Traffic going out of AWS private network

• Data transfer cost incurred
Kinesis ?
• Requires a redesign

• Payload size limit

• Retention limit

• Not a drop-in replacement for us

Photo: Robert Couse-Baker (CC-BY-2.0)
AWS VPC Endpoint
• Released November 2017

• PrivateLink service in a VPC

• Connect through an interface VPC endpoint

• Use NLB to expose service
Client Side
• Bootstrap servers

SSL://kafka.usw2.exp.com:6000

• Security settings

security.protocol=SSL
ssl.enabled.protocols=TLSv1.2
ssl.protocol=TLSv1.2
ssl.truststore.location=truststore.jks
ssl.truststore.password=...
ssl.keystore.location=keystore.jks
ssl.keystore.password=...
ssl.key.password=...
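The client settings above can be assembled programmatically. The following is a minimal sketch, not the presenter's tooling: the function names and the password placeholders are hypothetical (the slides elide the passwords), and `bootstrap.servers` is written in the documented `host:port` form rather than with the `SSL://` prefix shown on the slide.

```python
def build_client_properties(bootstrap_host, bootstrap_port,
                            truststore_path, keystore_path):
    """Return the non-secret Kafka client settings for two-way TLS."""
    return {
        # Documented form is host:port; the slide writes it with an SSL:// prefix.
        "bootstrap.servers": f"{bootstrap_host}:{bootstrap_port}",
        "security.protocol": "SSL",
        "ssl.enabled.protocols": "TLSv1.2",
        "ssl.protocol": "TLSv1.2",
        "ssl.truststore.location": truststore_path,
        "ssl.keystore.location": keystore_path,
    }


def render_properties(props, secrets):
    """Render settings plus secrets as key=value lines (no escaping applied)."""
    merged = {**props, **secrets}
    return "\n".join(f"{key}={value}" for key, value in merged.items())


props = build_client_properties("kafka.usw2.exp.com", 6000,
                                "truststore.jks", "keystore.jks")

# Passwords are elided in the slides; these placeholders are hypothetical
# and would normally come from a secret store, not from source code.
secrets = {
    "ssl.truststore.password": "<truststore-password>",
    "ssl.keystore.password": "<keystore-password>",
    "ssl.key.password": "<key-password>",
}

client_properties_text = render_properties(props, secrets)
```

Keeping the secrets in a separate mapping makes it harder to log or commit them by accident along with the rest of the configuration.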
Client Side
• kafka.usw2.exp.com

• Route 53

• CNAME

• Hosted Zone
Client Side
• SSL://kafka.usw2.exp.com:6000

• Only one server?

• All brokers behind Load Balancer

• Security

• SSL/TLS certificates

• Two-way authentication
Client Side
kafka.usw2.exp.com ???

Image: Finding Nemo - Pixar Animation Studios via Know Your Meme ®
AWS VPC Endpoint
• Powered by PrivateLink

• Essentially exposes an NLB to another VPC

• Access managed using IAM principals

• DNS hostname

vpce-afghnkxcptx0oefx0-98a1tpfp.vpce-svc-jw9zwsubz7cbnxkg.us-west-2.vpce.amazonaws.com

• https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints.html
Client Side
kafka.usw2.exp.com ➞ vpce-afghnkxcptx0oefx0-98a1tpfp.vpce-svc-jw9zwsubz7cbnxkg.us-west-2.vpce.amazonaws.com
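The CNAME from the friendly name to the endpoint's DNS hostname can be described as a Route 53 change batch. This is a sketch under assumptions: the helper name, the comment text, and the TTL of 300 seconds are illustrative and not taken from the talk. The resulting structure is the `ChangeBatch` shape accepted by the Route 53 `ChangeResourceRecordSets` API, to be applied against the hosted zone in the client account.

```python
import json


def cname_change_batch(record_name, endpoint_dns_name, ttl=300):
    """Build a Route 53 ChangeBatch that upserts one CNAME record."""
    return {
        "Comment": "Point the Kafka name at the interface VPC endpoint",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "CNAME",
                    "TTL": ttl,  # assumed value, tune to taste
                    "ResourceRecords": [{"Value": endpoint_dns_name}],
                },
            }
        ],
    }


batch = cname_change_batch(
    "kafka.usw2.exp.com",
    "vpce-afghnkxcptx0oefx0-98a1tpfp.vpce-svc-jw9zwsubz7cbnxkg"
    ".us-west-2.vpce.amazonaws.com",
)

# JSON form, suitable for saving to a file and submitting with the AWS tooling.
change_batch_json = json.dumps(batch, indent=2)
```

Because clients only ever see `kafka.usw2.exp.com`, the endpoint behind the CNAME can be recreated without touching client configuration.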
Server Side
• advertised.listeners

• Listeners for clients to use

• Different from the interface to which the broker binds (0.0.0.0)

• CNAME: kafka.usw2.exp.com

• Unique network port per broker

• 6001, 6002, etc.
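The per-broker listener settings can be derived from a broker index. The sketch below makes two assumptions not stated on the slide: brokers are numbered from 1, and each broker binds locally on the same port it advertises. The helper name is hypothetical; `listeners` and `advertised.listeners` are the standard Kafka broker settings.

```python
def broker_listener_config(broker_index, cname="kafka.usw2.exp.com",
                           base_port=6000):
    """Return listener settings for one broker (index 1 -> port 6001)."""
    if broker_index < 1:
        raise ValueError("broker_index starts at 1; port 6000 is for bootstrapping")
    port = base_port + broker_index
    return {
        # Bind on all interfaces (assumption: same port as the advertised one).
        "listeners": f"SSL://0.0.0.0:{port}",
        # What clients are told to connect to: the shared CNAME, unique port.
        "advertised.listeners": f"SSL://{cname}:{port}",
    }


for index in (1, 2, 3):
    config = broker_listener_config(index)
    print(f"broker {index}: {config['advertised.listeners']}")
```

Every broker advertises the same hostname, so the port is the only thing that lets the load balancer tell brokers apart.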
Server Side
• Target Groups

• One per broker

• One partition leader at any time

• Group 6000 for bootstrapping
Server Side
• Network Load Balancer

• Acts as a reverse proxy

• Can’t balance requests

• One leader per partition

• Cross-Zone load balancing

• One-to-one mapping

• Listener ➞ Group ➞ Instance
Server Side
• Example

Listener 6001 ➞ Target Group 6001 ➞ EC2 Instance for broker 6001

• Keep things aligned!
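The one-to-one Listener ➞ Group ➞ Instance mapping can be planned and checked before touching the load balancer. This is a sketch only: the `Route` type, the `kafka-<port>` target group names, and the instance IDs are hypothetical, and it assumes the bootstrap group on port 6000 contains every broker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    listener_port: int
    target_group: str
    instance_ids: tuple


def build_routes(broker_instances, bootstrap_port=6000):
    """broker_instances maps each broker port to its EC2 instance ID."""
    ports = sorted(broker_instances)
    # Assumption: the bootstrap group forwards to all brokers.
    routes = [Route(bootstrap_port, f"kafka-{bootstrap_port}",
                    tuple(broker_instances[p] for p in ports))]
    # One listener and one target group per broker, holding one instance.
    for port in ports:
        routes.append(Route(port, f"kafka-{port}", (broker_instances[port],)))
    return routes


def find_misalignments(routes, broker_instances, bootstrap_port=6000):
    """Return a message for each broker listener that reaches the wrong target."""
    problems = []
    for route in routes:
        if route.listener_port == bootstrap_port:
            continue
        expected = broker_instances.get(route.listener_port)
        if route.instance_ids != (expected,):
            problems.append(
                f"listener {route.listener_port} forwards to "
                f"{route.instance_ids}, expected {expected}"
            )
    return problems


brokers = {6001: "i-broker1", 6002: "i-broker2"}  # hypothetical instance IDs
routes = build_routes(brokers)
```

A misaligned route would send a client that was told "6002 is leader" to a broker that is not the leader, which is why the check compares each listener port against the instance that advertises it.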
VPC Endpoint
1. VPC Endpoint Service - Provider

2. VPC Endpoint - Consumer

3. Consumer requests connection

4. Provider accepts connection
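The four steps above can be sketched as one function. It assumes `provider_ec2` and `consumer_ec2` behave like boto3 EC2 clients created with credentials for the provider and consumer accounts; the function name and its parameters are hypothetical, and the permission step reflects the "access managed using IAM principals" point rather than anything shown in the talk.

```python
def connect_consumer_to_provider(provider_ec2, consumer_ec2, nlb_arn,
                                 consumer_principal_arn, consumer_vpc_id,
                                 consumer_subnet_ids,
                                 consumer_security_group_ids):
    """Expose an NLB as an endpoint service and attach one consumer VPC."""
    # 1. Provider: create the VPC endpoint service in front of the NLB.
    service = provider_ec2.create_vpc_endpoint_service_configuration(
        NetworkLoadBalancerArns=[nlb_arn],
        AcceptanceRequired=True,
    )["ServiceConfiguration"]

    # Allow the consumer's IAM principal to connect to the service.
    provider_ec2.modify_vpc_endpoint_service_permissions(
        ServiceId=service["ServiceId"],
        AddAllowedPrincipals=[consumer_principal_arn],
    )

    # 2 and 3. Consumer: creating the interface endpoint is the request.
    endpoint = consumer_ec2.create_vpc_endpoint(
        VpcEndpointType="Interface",
        VpcId=consumer_vpc_id,
        ServiceName=service["ServiceName"],
        SubnetIds=consumer_subnet_ids,
        SecurityGroupIds=consumer_security_group_ids,
    )["VpcEndpoint"]

    # 4. Provider: accept the pending connection.
    provider_ec2.accept_vpc_endpoint_connections(
        ServiceId=service["ServiceId"],
        VpcEndpointIds=[endpoint["VpcEndpointId"]],
    )
    return service["ServiceName"], endpoint["VpcEndpointId"]
```

In practice the two halves usually run separately, since provider and consumer are different accounts; they are combined here only to show the order of operations.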
[Figure: the client bootstraps through SSL://...:6000; the broker answers "6002 is leader on partition topic-0."; the client replies "Roger that." and connects to SSL://...:6002]
And it works!
Simple!
[Figure: Activity on the VPC Endpoint NLB]
[Figure: Topics size progression during data load]
Conclusion
• Kafka over VPC Endpoint

• No VPC Peering, no Internet-facing

• Secure, resilient, efficient

• Takeaways

• VPC Endpoint can work for distributed systems

• NLB as a reverse proxy

• Kafka + NLB = unique broker endpoint
Q & A