VPC Configuration
- VPC CIDR: `10.0.0.0/16`
- Region 1: `10.0.0.0/24` (public), `10.0.64.0/24` (private)
- Region 2: `10.0.16.0/24` (public), `10.0.80.0/24` (private)
- Region 3: `10.0.32.0/24` (public), `10.0.96.0/24` (private)
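For reference, a minimal AWS CLI sketch of this layout, assuming the three "Regions" above correspond to three availability zones within the single VPC (a VPC cannot span AWS regions); the VPC ID and zone names are placeholders:

```bash
# Hypothetical IDs/zones: vpc-0abc123 and us-east-1a/b/c are placeholders.
aws ec2 create-vpc --cidr-block 10.0.0.0/16

# AZ 1 ("Region 1"): public and private subnets
aws ec2 create-subnet --vpc-id vpc-0abc123 --availability-zone us-east-1a --cidr-block 10.0.0.0/24
aws ec2 create-subnet --vpc-id vpc-0abc123 --availability-zone us-east-1a --cidr-block 10.0.64.0/24

# AZ 2 ("Region 2")
aws ec2 create-subnet --vpc-id vpc-0abc123 --availability-zone us-east-1b --cidr-block 10.0.16.0/24
aws ec2 create-subnet --vpc-id vpc-0abc123 --availability-zone us-east-1b --cidr-block 10.0.80.0/24

# AZ 3 ("Region 3")
aws ec2 create-subnet --vpc-id vpc-0abc123 --availability-zone us-east-1c --cidr-block 10.0.32.0/24
aws ec2 create-subnet --vpc-id vpc-0abc123 --availability-zone us-east-1c --cidr-block 10.0.96.0/24
```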
EC2 Configuration
- Instance A: Deployed on private subnet `10.0.64.0/24` in Region 1. Acts as the control-plane node in my Kubernetes cluster.
- Instance B: Deployed on private subnet `10.0.96.0/24` in Region 3. Acts as a worker node in my Kubernetes cluster.
- Instance C: Deployed on public subnet `10.0.16.0/24` in Region 2. Acts as a worker node in my Kubernetes cluster.
- Instance D: Deployed on public subnet `10.0.0.0/24` in Region 1. Acts as the test machine.
Kubernetes Setup
I've set up a Kubernetes cluster with Instance A as the master (control-plane) node, Instance B as the private worker node, and Instance C as the public worker node. I'm using the Cilium CNI with VXLAN routing and I've enabled Cilium's L2Announcements feature. I've deployed an nginx Deployment with an nginx Service called `nginx-svc` of type `LoadBalancer`. I've also created a `CiliumLoadBalancerIPPool` resource in my cluster that grants an External IP from the subnet `10.0.128.0/24` to services of type `LoadBalancer`. I chose `10.0.128.0/24` because it was unused and wouldn't conflict with my existing VPC subnets.
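For completeness, a rough sketch of the Cilium resources and the service; the resource names are placeholders, and depending on the Cilium version the IP pool field may be `blocks` rather than `cidrs`:

```bash
kubectl apply -f - <<'EOF'
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: nginx-pool            # placeholder name
spec:
  cidrs:
    - cidr: "10.0.128.0/24"   # virtual range used for LoadBalancer External IPs
---
apiVersion: cilium.io/v2alpha1
kind: CiliumL2AnnouncementPolicy
metadata:
  name: nginx-l2-policy       # placeholder name
spec:
  loadBalancerIPs: true       # announce LoadBalancer IPs via ARP
  interfaces:
    - ^eth[0-9]+              # interfaces to announce on
EOF

# nginx Deployment plus a LoadBalancer Service that picks up an IP from the pool.
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --name=nginx-svc --port=80 --type=LoadBalancer
```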
Problem
As expected, my `nginx-svc` received an External IP from the virtual subnet `10.0.128.0/24`. Let's say this External IP was `10.0.128.1`. When I run `curl http://10.0.128.1` from Instance A, Instance B or Instance C, I'm able to access `nginx-svc`. However, when I run `curl http://10.0.128.1`
on Instance D, which isn't joined to my Kubernetes cluster, I'm unable to access the service and the request times out. This is the problem. I've read into how the L2Announcements feature works: the node announcing the service replies to ARP requests for the virtual IP `10.0.128.1`, so that other instances (Instance A, B, C, D) within the same LAN (`10.0.0.0/16` in the AWS case) resolve `10.0.128.1` to the MAC address of the instance running the service.
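One way I can illustrate the difference, assuming Instance D's primary interface is `eth0` and `arping` is installed (both assumptions on my part):

```bash
# From Instance D: does anything answer ARP for the virtual IP?
sudo arping -I eth0 -c 3 10.0.128.1

# Does the kernel end up with a usable neighbour entry for it?
ip neigh show 10.0.128.1

# The same request that succeeds from the cluster nodes A, B and C:
curl -v --connect-timeout 5 http://10.0.128.1
```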
This exact same setup works locally when I run it on QEMU: I'm able to access `10.0.128.1` even from virtual machines that are not joined to the cluster. However, it fails when running on an AWS VPC, and I am not entirely sure why.
The reason I want to access the service running on the virtual IP `10.0.128.1` from Instance D, which isn't joined to the cluster, is so I can create DNAT/SNAT rules on Instance D via iptables and have it forward traffic from/to its public IPv4 address to/from the private service address `10.0.128.1` reachable over the LAN/VPC, thereby simulating a public-facing service.
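Roughly what I have in mind on Instance D, assuming its primary interface is `eth0` and the service only needs port 80 (both assumptions):

```bash
# Let Instance D forward packets between its public side and the VPC.
sudo sysctl -w net.ipv4.ip_forward=1

# DNAT: traffic hitting Instance D on port 80 is redirected to the service IP.
sudo iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 \
  -j DNAT --to-destination 10.0.128.1:80

# SNAT/masquerade the forwarded traffic so replies return via Instance D.
sudo iptables -t nat -A POSTROUTING -d 10.0.128.1 -p tcp --dport 80 \
  -j MASQUERADE
```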