Modern enterprise networks are complex and dynamic, designed to support a wide range of applications and services while ensuring security, scalability, and reliability. Kubernetes has become a go-to platform for enterprises looking to modernize their application deployment and management processes. With multi-cloud and hybrid cloud support, enterprises can leverage multiple cloud providers or combine on-premises and cloud resources, optimizing cost and performance for their specific needs. Now imagine a team creating a sandbox environment in which some Pods end up with the same IP addresses as Pods running on another cloud provider, leading to network outages and application downtime. Isovalent Enterprise for Cilium addresses this problem by connecting multiple Kubernetes clusters, or swarms of clusters, across cloud providers and hybrid cloud set-ups while mitigating the overlapping Pod IP addressing problem. This tutorial will guide you through setting up Isovalent Enterprise for Cilium’s Cluster Mesh with overlapping Pod CIDRs.
What leads to this scenario in an Enterprise Network?
Overlapping IP addresses occur when identical IP ranges are allocated across different networks or applications, creating serious communication hurdles. Here are the key causes:
Merger and Acquisition
When organizations merge, they may have existing networks with overlapping CIDR blocks. Integrating these networks without proper planning can lead to conflicts.
Dynamic Environment
In dynamic environments (e.g., cloud-native applications), rapid resource provisioning can lead to overlapping CIDRs if not managed carefully, especially in auto-scaling or microservices architectures.
Multiple Cloud Providers
Using different cloud providers without a unified IP addressing strategy can lead to conflicts when the same CIDR ranges are employed across providers.
Legacy Systems
Legacy systems may have fixed IP addresses or ranges that conflict with newer allocations, especially if the legacy systems are poorly documented.
Suboptimal IP Address Management (IPAM)
Failure to use robust IP address management tools can result in misallocations and overlaps, particularly in large or complex networks.
Testing Environments
Duplicate IP ranges may be used in test or staging environments that mirror production setups, leading to conflicts when integrating these environments.
How does Isovalent address this issue?
You can minimize these interruptions by using Isovalent Enterprise for Cilium’s Cluster Mesh support for Overlapping Pod CIDR or deploying an Egress Gateway. Let’s look at the Overlapping Pod CIDR support from Isovalent.
How do packets traverse a Cluster Mesh with Overlapping Pod CIDRs?
The following diagram provides an overview of the packet flow of Cluster Mesh with overlapping Pod CIDR support.
apiVersion: v1
kind: Service
metadata:
  name: httpbin-service
  annotations:
    service.cilium.io/global: "true"
In this mode, inter-cluster communication must go through a Global or a Phantom service.
Global Service: Isovalent can load balance traffic to Pods across all clusters in a Cluster Mesh. This is achieved by using global services. A global service is a service that is created with the same spec in each cluster and annotated with service.cilium.io/global: "true".
By default, a global service load balances across all available backends in all clusters.
Phantom Service: Global services mandate that an identical service be present in each cluster from which the service is accessed. Phantom services lift this requirement, allowing a given service to be accessed from remote clusters even if it is not defined there (a sketch of a phantom service appears after this list).
A phantom service is a LoadBalancer service associated with at least one VIP and annotated with service.isovalent.com/phantom: "true".
This makes the phantom service LoadBalancer IP address accessible from all clusters in the Cluster Mesh. Source IP addresses and identities are preserved for cross-cluster communication.
When traffic crosses the cluster boundary, the source IP address is translated to the node IP.
For intra-cluster communication, the source IP address is preserved.
Intra-cluster communication via a service leads to a destination IP translation.
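For illustration, a minimal sketch of a phantom service is shown below, assuming an httpbin Deployment whose Pods carry the label app: httpbin (the service name and selector are assumptions, not taken from this tutorial):
apiVersion: v1
kind: Service
metadata:
  name: httpbin-phantom
  annotations:
    service.isovalent.com/phantom: "true"   # marks the service as a phantom service
spec:
  type: LoadBalancer                        # phantom services are LoadBalancer services
  selector:
    app: httpbin                            # assumed Pod label
  ports:
  - port: 80
    targetPort: 80
Once the LoadBalancer IP is assigned, workloads in the other clusters of the mesh can reach it even though no matching service is defined there.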
What is Isovalent Enterprise for Cilium?
Isovalent Cilium Enterprise is an enterprise-grade, hardened distribution of open-source projects Cilium, Hubble, and Tetragon, built and supported by the Cilium creators. Cilium enhances networking and security at the network layer, while Hubble ensures thorough network observability and tracing. Tetragon ties it all together with runtime enforcement and security observability, offering a well-rounded solution for connectivity, compliance, multi-cloud, and security concerns.
Why Isovalent Enterprise for Cilium?
For enterprise customers requiring support and usage of Advanced Networking, Security, and Observability features, “Isovalent Enterprise for Cilium” is recommended with the following benefits:
Advanced network policy: advanced network policy capabilities that enable fine-grained control over network traffic for micro-segmentation and improved security.
Hubble flow observability + User Interface: real-time network traffic flow, policy visualization, and a powerful User Interface for easy troubleshooting and network management.
Multi-cluster connectivity via Cluster Mesh: seamless networking and security across multiple cloud providers like AWS, Azure, Google, and on-premises environments.
Advanced Security Capabilities via Tetragon: Tetragon provides advanced security capabilities such as protocol enforcement, IP and port whitelisting, and automatic application-aware policy generation to protect against the most sophisticated threats. Built on eBPF, Tetragon can easily scale to meet the needs of the most demanding cloud-native environments.
Service Mesh: Isovalent Cilium Enterprise provides sidecar-free, seamless service-to-service communication and advanced load balancing, making deploying and managing complex microservices architectures easy.
Enterprise-grade support: Enterprise-grade support from Isovalent’s experienced team of experts ensures that issues are resolved promptly and efficiently. Additionally, professional services help organizations deploy and manage Cilium in production environments.
Overlapping Pod CIDR Feature Compatibility Matrix
Feature                               Status
Inter-cluster service communication   Supported
L3/L4 network policy enforcement      Supported
L7 Network Policies                   Roadmap
Transparent Encryption                Roadmap
Endpoint Health Checking              Roadmap
Socket-based Load Balancing           Roadmap
Pre-Requisites
The following prerequisites need to be taken into account before you proceed with this tutorial:
Two up-and-running Kubernetes clusters. For this tutorial, we will create two Azure Kubernetes Service (AKS) clusters with BYOCNI as the network plugin.
Cluster Mesh with overlapping Pod CIDRs requires Isovalent Enterprise for Cilium 1.13 or later.
Users can contact their partner Sales/SE representative(s) at sales@isovalent.com for more detailed insights into the features below and to obtain access to the requisite documentation and Hubble CLI software images.
Creating the AKS clusters
Let’s briefly see the commands to create AKS clusters with the network plugin BYOCNI.
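A hedged sketch of the kind of commands involved is shown below; the resource group names, cluster names, and regions are placeholders, and BYOCNI corresponds to --network-plugin none in the Azure CLI. In this walkthrough the clusters sit in distinct node subnets (192.168.10.0/24 and 192.168.20.0/24), which you can control by pre-creating the VNets and passing --vnet-subnet-id.
# Cluster 1 (placeholder names and region)
az group create --name cluster1-rg --location westus2
az aks create \
  --resource-group cluster1-rg \
  --name cluster1 \
  --network-plugin none
az aks get-credentials --resource-group cluster1-rg --name cluster1

# Cluster 2 (placeholder names and region)
az group create --name cluster2-rg --location eastus2
az aks create \
  --resource-group cluster2-rg \
  --name cluster2 \
  --network-plugin none
az aks get-credentials --resource-group cluster2-rg --name cluster2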
What are the Pod IPs across clusters?
Once the AKS clusters are created, you can verify that Pods in the two clusters are assigned IP addresses from the same (overlapping) Pod CIDR.
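For example (the context names are placeholders for the two clusters' kubectl contexts):
kubectl get pods -A -o wide --context <cluster1-context>
kubectl get pods -A -o wide --context <cluster2-context>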
How can you Peer the AKS clusters?
Use VNet peering to peer the AKS clusters’ virtual networks across the two chosen regions. This step only needs to be performed from one side in the Azure portal; the peering is established in both directions automatically (a CLI equivalent is sketched after the steps below).
Login to the Azure Portal
Click Home
Click Virtual Network
Select the respective Virtual Network
Click Peerings
Click Add
Give the local peer a name
Select “Allow cluster1 to access cluster2”
Give the remote peer a name
Select the virtual network deployment model as “Resource Manager”
Select the subscription
Select the virtual network of the remote peer
Select “Allow cluster2 to access cluster1”
Click Add
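If you prefer the Azure CLI over the portal, the same peering can be created with az network vnet peering create. Unlike the portal flow, the CLI needs a peering to be created from each side; the resource group and VNet names below are placeholders.
# Peering from cluster1's VNet to cluster2's VNet (placeholder names)
az network vnet peering create \
  --name cluster1-to-cluster2 \
  --resource-group cluster1-rg \
  --vnet-name cluster1-vnet \
  --remote-vnet $(az network vnet show --resource-group cluster2-rg --name cluster2-vnet --query id -o tsv) \
  --allow-vnet-access

# Peering from cluster2's VNet back to cluster1's VNet (placeholder names)
az network vnet peering create \
  --name cluster2-to-cluster1 \
  --resource-group cluster2-rg \
  --vnet-name cluster2-vnet \
  --remote-vnet $(az network vnet show --resource-group cluster1-rg --name cluster1-vnet --query id -o tsv) \
  --allow-vnet-access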
How can you enable Cluster Mesh with overlapping Pod CIDR?
Each cluster must be identified by a unique Cluster ID and Cluster Name.
All clusters must be configured with the same datapath mode, in other words, either native routing or encapsulation (using the same encapsulation protocol).
Install the cert-manager CRDs and set up the cilium issuer associated with the same Certification Authority in all clusters.
This doesn’t have to be done via cert-manager, but it is highly recommended, as manually copying and pasting CA certificates is error-prone.
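As an illustration, a CA-backed cert-manager Issuer named cilium could look like the sketch below. It assumes a secret (here called cilium-ca) holding the shared CA certificate and key has already been created in kube-system on both clusters; the secret name is an assumption.
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: cilium
  namespace: kube-system
spec:
  ca:
    secretName: cilium-ca   # secret containing the shared CA certificate and key (assumed name)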
Create a sample yaml file. (Unique per cluster)
The yaml configuration file contains the basic properties to set up Cilium, Cluster Mesh, and Hubble (a sketch of such a file follows the list below):
Configures Cilium in CRD identity allocation mode.
Enables Hubble and Hubble Relay.
Enables the Cluster Mesh API Server and exposes it using a service of Type LoadBalancer. Cloud-provider-specific annotations are added to force the usage of private IP addresses.
Enables the automatic generation of the certificates using cert-manager, leveraging the existing cilium Issuer associated with the shared certificate authority.
Configures the most granular cross-cluster authentication scheme for improved segregation.
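For reference, a minimal sketch of such a values file is shown below. It is based on the options exposed by the upstream Cilium Helm chart; the cluster name and ID are placeholders, and the Isovalent-specific settings that enable overlapping Pod CIDR support come from the enterprise documentation and are not shown here.
cluster:
  name: cluster1                     # placeholder; must be unique per cluster
  id: 1                              # placeholder; must be unique per cluster
identityAllocationMode: crd          # CRD identity allocation mode
hubble:
  enabled: true                      # enable Hubble
  relay:
    enabled: true                    # enable Hubble Relay
clustermesh:
  useAPIServer: true                 # enable the Cluster Mesh API Server
  apiserver:
    service:
      type: LoadBalancer             # expose it via a LoadBalancer service
      annotations:
        service.beta.kubernetes.io/azure-load-balancer-internal: "true"   # force a private IP on AKS
    tls:
      authMode: cluster              # most granular cross-cluster authentication scheme
      auto:
        enabled: true                # generate certificates automatically...
        method: certmanager          # ...via cert-manager
        certManagerIssuerRef:
          group: cert-manager.io
          kind: Issuer
          name: cilium               # the shared-CA issuer created earlier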
Install Isovalent Enterprise for Cilium and connect the clusters using the Cluster Mesh documentation.
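As a hedged illustration, the install-and-connect step might look like the commands below; the enterprise Helm chart reference and the kubectl context names are placeholders to be replaced with the values from the Isovalent documentation and your environment.
# Install Cilium on each cluster with that cluster's values file (chart reference is a placeholder)
helm upgrade --install cilium <isovalent-enterprise-cilium-chart> \
  --namespace kube-system \
  --values cluster1-values.yaml

# Connect the two clusters (needs to be run only once for the pair)
cilium clustermesh connect --context <cluster1-context> --destination-context <cluster2-context>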
How can you verify Cluster Mesh status?
Check the status of the clusters by running cilium clustermesh status on either of the clusters. If you use a service of type LoadBalancer, it will also wait for the LoadBalancer to be assigned an IP.
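For example (the context name is a placeholder for your cluster’s kubectl context):
cilium clustermesh status --context <cluster1-context> --wait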
What’s the status of the clustermesh-api pod?
The Cluster Mesh API Server contains an etcd instance to keep track of the cluster’s state. The state from multiple clusters is never mixed. Cilium agents in other clusters connect to the Cluster Mesh API Server to watch for changes and replicate the multi-cluster state into their cluster. Access to the Cluster Mesh API Server is protected using TLS certificates. Access from one cluster to another is always read-only, ensuring failure domains remain unchanged. A failure in one cluster never propagates to other clusters.
Ensure that the clustermesh-api pod is running on both clusters.
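For example, assuming the default deployment name used by the Cilium Helm chart:
kubectl -n kube-system get deployment clustermesh-apiserver --context <cluster1-context>
kubectl -n kube-system get deployment clustermesh-apiserver --context <cluster2-context>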
How can you use L3/L4 policies in an Overlapping Pod CIDR scenario?
When using Cilium, endpoint IP addresses are irrelevant when defining security policies. Instead, you can use the labels assigned to the Pods to define security policies. The policies will be applied to the right Pods based on the labels, irrespective of where or when they run within the cluster.
The layer 3 policy establishes the base connectivity rules regarding which endpoints can talk to each other.
The layer 4 policy can be specified independently of, or in addition to, the layer 3 policies. It restricts an endpoint’s ability to emit and/or receive packets on a particular port using a particular protocol.
A Cilium network policy is only ever applied to a single cluster; it is not automatically synced to the remote cluster when Cluster Mesh is enabled.
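As an illustration, a minimal L3/L4 CiliumNetworkPolicy might look like the sketch below. It assumes the httpbin and netshoot Pods carry app: httpbin and app: netshoot labels (an assumption, since the labels are not shown in this tutorial), and it would have to be applied in each cluster where enforcement is desired.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-netshoot-to-httpbin
spec:
  endpointSelector:
    matchLabels:
      app: httpbin              # L3: the policy applies to httpbin endpoints (assumed label)
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: netshoot           # L3: only netshoot endpoints may connect (assumed label)
    toPorts:
    - ports:
      - port: "80"              # L4: and only on TCP port 80
        protocol: TCP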
You can use the following commands to troubleshoot Cluster Mesh-related deployments.
Once the clusters are connected via Cluster Mesh, you can check the health of Nodes from either cluster.
Notice Nodes for both clusters are displayed.
kubectl -n kube-system exec ds/cilium -- cilium-health status
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), wait-for-node-init (init), clean-cilium-state (init), install-cni-binaries (init)
Probe time: 2024-09-26T07:32:50Z
Nodes:
amitgag-18499/aks-nodepool1-28392065-vmss000000 (localhost):
Host connectivity to 192.168.10.5:
ICMP to stack: OK, RTT=339.001µs
HTTP to agent: OK, RTT=201.801µs
amitgag-18499/aks-nodepool1-28392065-vmss000001:
Host connectivity to 192.168.10.6:
ICMP to stack: OK, RTT=1.212905ms
HTTP to agent: OK, RTT=1.192004ms
amitgag-18499/aks-nodepool1-28392065-vmss000002:
Host connectivity to 192.168.10.4:
ICMP to stack: OK, RTT=1.196305ms
HTTP to agent: OK, RTT=643.202µs
amitgag-30384/aks-nodepool1-15397795-vmss000000:
Host connectivity to 192.168.20.4:
ICMP to stack: OK, RTT=73.772068ms
HTTP to agent: OK, RTT=73.182465ms
amitgag-30384/aks-nodepool1-15397795-vmss000001:
Host connectivity to 192.168.20.5:
ICMP to stack: OK, RTT=74.130569ms
HTTP to agent: OK, RTT=73.496767ms
amitgag-30384/aks-nodepool1-15397795-vmss000002:
Host connectivity to 192.168.20.6:
ICMP to stack: OK, RTT=68.362948ms
HTTP to agent: OK, RTT=67.507044ms
Check the service endpoints from cluster 1; the remote endpoints are marked as preferred with the @2 suffix, which denotes the Cluster ID of the remote cluster.
Notice the httpbin-service has cluster-IP 10.11.42.230, and the remote backend Pods in cluster-2 are the preferred Pods with IPs 10.10.0.252@2 and 10.10.4.129@2.
kubectl exec -n kube-system -ti ds/cilium -- cilium service list --clustermesh-affinity
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), wait-for-node-init (init), clean-cilium-state (init), install-cni-binaries (init)
ID   Frontend              Service Type   Backend
1    10.11.42.230:80       ClusterIP      1 => 10.10.0.252@2:80 (active)
                                          2 => 10.10.4.129@2:80 (active)
                                          3 => 10.10.3.174:80 (active)
                                          4 => 10.10.0.119:80 (active)
2    10.11.0.1:443         ClusterIP      1 => 4.155.152.37:443 (active)
3    10.11.155.247:2379    ClusterIP      1 => 10.10.0.218:2379 (active)
4    192.168.10.7:2379     LoadBalancer   1 => 10.10.0.218:2379 (active)
5    192.168.10.5:31755    NodePort       1 => 10.10.0.218:2379 (active)
6    0.0.0.0:31755         NodePort       1 => 10.10.0.218:2379 (active)
7    10.11.155.89:443      ClusterIP      1 => 192.168.10.5:4244 (active)
8    10.11.0.10:53         ClusterIP      1 => 10.10.3.145:53 (active)
                                          2 => 10.10.0.188:53 (active)
9    10.11.234.73:443      ClusterIP      1 => 10.10.3.243:4443 (active)
                                          2 => 10.10.1.250:4443 (active)
kubectl get svc
NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
httpbin-service   ClusterIP   10.11.42.230   <none>        80/TCP    8d
kubernetes        ClusterIP   10.11.0.1      <none>        443/TCP   9d
kubectl get pods -o wide
NAME                        READY   STATUS    RESTARTS   AGE   IP            NODE                                NOMINATED NODE   READINESS GATES
httpbin-66c877d7d-fj7jc     1/1     Running   0          31h   10.10.0.119   aks-nodepool1-28392065-vmss000000   <none>           <none>
httpbin-66c877d7d-wpn5z     1/1     Running   0          31h   10.10.3.174   aks-nodepool1-28392065-vmss000001   <none>           <none>
netshoot-7cd4fdf959-4fvr6   1/1     Running   0          31h   10.10.3.68    aks-nodepool1-28392065-vmss000001   <none>           <none>
kubectl get pods -o wide --context=amitgag-30384
NAME                        READY   STATUS    RESTARTS   AGE   IP            NODE                                NOMINATED NODE   READINESS GATES
httpbin-66c877d7d-pvxtp     1/1     Running   0          32h   10.10.4.129   aks-nodepool1-15397795-vmss000002   <none>           <none>
httpbin-66c877d7d-sshvf     1/1     Running   0          32h   10.10.0.252   aks-nodepool1-15397795-vmss000000   <none>           <none>
Verify whether Cilium agents are successfully connected to all remote clusters.
kubectl exec -n kube-system -ti ds/cilium -- cilium-dbg status --all-clusters
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), wait-for-node-init (init), clean-cilium-state (init), install-cni-binaries (init)
KVStore:                 Ok   Disabled
Kubernetes:              Ok   1.29 (v1.29.7) [linux/amd64]
Kubernetes APIs:         ["EndpointSliceOrEndpoint", "cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "cilium/v2alpha1::CiliumCIDRGroup", "core/v1::Namespace", "core/v1::Pods", "core/v1::Service", "networking.k8s.io/v1::NetworkPolicy"]
KubeProxyReplacement:    True   [eth0 192.168.10.5 fe80::222:48ff:fec0:9a09 (Direct Routing)]
Host firewall:           Disabled
SRv6:                    Disabled
CNI Chaining:            none
CNI Config file:         successfully wrote CNI configuration file to /host/etc/cni/net.d/05-cilium.conflist
Cilium:                  Ok   1.15.8-cee.1 (v1.15.8-cee.1-44b1b109)
NodeMonitor:             Listening for events on 2 CPUs with 64x4096 of shared memory
Cilium health daemon:    Ok
IPAM:                    IPv4: 5/254 allocated from 10.10.0.0/24,
ClusterMesh:             1/1 remote clusters ready, 1 global-services
  amitgag-30384: ready, 3 nodes, 8 endpoints, 5 identities, 1 services, 0 reconnections (last: never)
  └ etcd: 1/1 connected, leases=0, lock leases=1, has-quorum=true: endpoint status checks are disabled, ID: e5cffb6071101644
  └ remote configuration: expected=true, retrieved=true, cluster-id=2, kvstoremesh=false, sync-canaries=true
  └ synchronization status: nodes=true, endpoints=true, identities=true, services=true
IPv4 BIG TCP:            Disabled
IPv6 BIG TCP:            Disabled
BandwidthManager:        Disabled
Host Routing:            Legacy
Masquerading:            IPTables [IPv4: Enabled, IPv6: Disabled]
Controller Status:       38/38 healthy
Proxy Status:            OK, ip 10.10.0.70, 0 redirects active on ports 10000-20000, Envoy: embedded
Global Identity Range:   min 65536, max 131071
Hubble:                  Ok   Current/Max Flows: 4095/4095 (100.00%), Flows/s: 19.52   Metrics: Disabled
Encryption:              Disabled
Cluster health:          Probe disabled
kubectl exec -n kube-system -ti ds/cilium -- cilium-dbg status --all-clusters
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), wait-for-node-init (init), clean-cilium-state (init), install-cni-binaries (init)
KVStore:                 Ok   Disabled
Kubernetes:              Ok   1.29 (v1.29.7) [linux/amd64]
Kubernetes APIs:         ["EndpointSliceOrEndpoint", "cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "cilium/v2alpha1::CiliumCIDRGroup", "core/v1::Namespace", "core/v1::Pods", "core/v1::Service", "networking.k8s.io/v1::NetworkPolicy"]
KubeProxyReplacement:    True   [eth0 192.168.20.4 fe80::7e1e:52ff:fe43:809c (Direct Routing)]
Host firewall:           Disabled
SRv6:                    Disabled
CNI Chaining:            none
CNI Config file:         successfully wrote CNI configuration file to /host/etc/cni/net.d/05-cilium.conflist
Cilium:                  Ok   1.15.8-cee.1 (v1.15.8-cee.1-44b1b109)
NodeMonitor:             Listening for events on 2 CPUs with 64x4096 of shared memory
Cilium health daemon:    Ok
IPAM:                    IPv4: 4/254 allocated from 10.10.0.0/24,
ClusterMesh:             1/1 remote clusters ready, 1 global-services
  amitgag-18499: ready, 3 nodes, 9 endpoints, 6 identities, 1 services, 2 reconnections (last: 31h58m18s ago)
  └ etcd: 1/1 connected, leases=0, lock leases=0, has-quorum=true: endpoint status checks are disabled, ID: f4644e0a228aaf04
  └ remote configuration: expected=true, retrieved=true, cluster-id=1, kvstoremesh=false, sync-canaries=true
  └ synchronization status: nodes=true, endpoints=true, identities=true, services=true
IPv4 BIG TCP:            Disabled
IPv6 BIG TCP:            Disabled
BandwidthManager:        Disabled
Host Routing:            Legacy
Masquerading:            IPTables [IPv4: Enabled, IPv6: Disabled]
Controller Status:       33/33 healthy
Proxy Status:            OK, ip 10.10.0.153, 0 redirects active on ports 10000-20000, Envoy: embedded
Global Identity Range:   min 131072, max 196607
Hubble:                  Ok   Current/Max Flows: 4095/4095 (100.00%), Flows/s: 16.78   Metrics: Disabled
Encryption:              Disabled
Cluster health:          Probe disabled
Multiple causes can prevent Cilium agents (or KVStoreMesh, when enabled) from correctly connecting to the remote etcd cluster, whether that is the sidecar instance that is part of the clustermesh-apiserver or a separate etcd cluster when Cilium operates in KVStore mode.
Cilium features a cilium-dbg troubleshoot clustermesh command, which performs automatic checks to validate DNS resolution, network connectivity, mTLS authentication, etcd authorization, and more, and reports the output in a user-friendly format.
kubectl exec -n kube-system -ti ds/cilium -- cilium-dbg troubleshoot clustermesh
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), wait-for-node-init (init), clean-cilium-state (init), install-cni-binaries (init)
Found 1 cluster configurations
Cluster "amitgag-30384":
📄 Configuration path: /var/lib/cilium/clustermesh/amitgag-30384
🔌 Endpoints:
- https://amitgag-30384.mesh.cilium.io:2379
✅ Hostname resolved to: 192.168.20.7
✅ TCP connection successfully established to 192.168.20.7:2379
✅ TLS connection successfully established to 192.168.20.7:2379
ℹ️ Negotiated TLS version: TLS 1.3, ciphersuite TLS_AES_128_GCM_SHA256
ℹ️ Etcd server version: 3.5.15
🔑 Digital certificates:
✅ TLS Root CA certificates:
- Serial number: ###################################
  Subject: CN=Cilium CA
Issuer: CN=Cilium CA
Validity:
Not before: 2024-09-17 07:54:58 +0000 UTC
Not after: 2027-09-17 07:54:58 +0000 UTC
✅ TLS client certificates:
- Serial number: ##############################################
  Subject: CN=remote
Issuer: CN=Cilium CA
Validity:
Not before: 2024-09-17 07:54:00 +0000 UTC
Not after: 2027-09-17 07:54:00 +0000 UTC
⚙️ Etcd client:
✅ Etcd connection successfully established
ℹ️ Etcd cluster ID: e5cffb6071101644
kubectl exec -n kube-system -ti ds/cilium -- cilium-dbg troubleshoot clustermesh
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), wait-for-node-init (init), clean-cilium-state (init), install-cni-binaries (init)
Found 1 cluster configurations
Cluster "amitgag-18499":
📄 Configuration path: /var/lib/cilium/clustermesh/amitgag-18499
🔌 Endpoints:
- https://amitgag-18499.mesh.cilium.io:2379
✅ Hostname resolved to: 192.168.10.7
✅ TCP connection successfully established to 192.168.10.7:2379
✅ TLS connection successfully established to 192.168.10.7:2379
ℹ️ Negotiated TLS version: TLS 1.3, ciphersuite TLS_AES_128_GCM_SHA256
ℹ️ Etcd server version: 3.5.15
🔑 Digital certificates:
✅ TLS Root CA certificates:
- Serial number: ###################################
  Subject: CN=Cilium CA
Issuer: CN=Cilium CA
Validity:
Not before: 2024-09-17 07:53:19 +0000 UTC
Not after: 2027-09-17 07:53:19 +0000 UTC
✅ TLS client certificates:
- Serial number: ############################################
  Subject: CN=remote
Issuer: CN=Cilium CA
Validity:
Not before: 2024-09-17 07:52:00 +0000 UTC
Not after: 2027-09-17 07:52:00 +0000 UTC
⚙️ Etcd client:
✅ Etcd connection successfully established
ℹ️ Etcd cluster ID: f4644e0a228aaf04
Conclusion
The evolution of network architectures poses challenges, and the Isovalent team is here to help you work through them. Overlapping Pod CIDRs are one such challenge, and as you have seen, they can be readily overcome. Hopefully, this post gave you an overview of setting up Isovalent Enterprise for Cilium’s Cluster Mesh with overlapping Pod CIDRs. You can schedule a demo with our experts if you’d like to learn more.
Amit Gupta is a senior technical marketing engineer at Isovalent, powering eBPF cloud-native networking and security. Amit has 21+ years of experience in Networking, Telecommunications, Cloud, Security, and Open-Source. He has previously worked with Motorola, Juniper, Avi Networks (acquired by VMware), and Prosimo. He is keen to learn and try out new technologies that aid in solving day-to-day problems for operators and customers.
He has worked in the Indian start-up ecosystem for a long time and helps new folks in that area outside of work. Amit is an avid runner and cyclist and also spends considerable time helping kids in orphanages.