Cilium Cluster Mesh in AKS

Amit Gupta

What is Cilium Cluster Mesh?

Cilium Cluster Mesh can be described as “global services made accessible across a fleet of clusters.” These clusters can sit within a single cloud provider, span multiple cloud providers or availability zones, or stretch across on-premises and cloud environments. This tutorial guides you through enabling Cilium Cluster Mesh (using the Azure CLI) on AKS clusters (in Dynamic IP allocation mode) running Isovalent Enterprise for Cilium from the Azure Marketplace.

What is Isovalent Enterprise for Cilium?

Isovalent Cilium Enterprise is an enterprise-grade, hardened distribution of open-source projects Cilium, Hubble, and Tetragon, built and supported by the Cilium creators. Cilium enhances networking and security at the network layer, while Hubble ensures thorough network observability and tracing. Tetragon ties it all together with runtime enforcement and security observability, offering a well-rounded solution for connectivity, compliance, multi-cloud, and security concerns.

Why Isovalent Enterprise for Cilium?

For enterprise customers requiring support and usage of Advanced Networking, Security, and Observability features, “Isovalent Enterprise for Cilium” is recommended with the following benefits:

  • Advanced network policy: Isovalent Cilium Enterprise provides advanced network policy capabilities, including DNS-aware policy, L7 policy, and deny policy, enabling fine-grained control over network traffic for micro-segmentation and improved security.
  • Hubble flow observability + User Interface: Isovalent Cilium Enterprise’s Hubble observability feature provides real-time network traffic flow visibility, policy visualization, and a powerful user interface for easy troubleshooting and network management.
  • Multi-cluster connectivity via Cluster Mesh: Isovalent Cilium Enterprise provides seamless networking and security across multiple clouds, including public cloud providers like AWS, Azure, and Google Cloud Platform, as well as on-premises environments.
  • Advanced Security Capabilities via Tetragon: Tetragon provides advanced security capabilities such as protocol enforcement, IP and port whitelisting, and automatic application-aware policy generation to protect against the most sophisticated threats. Built on eBPF, Tetragon can easily scale to meet the needs of the most demanding cloud-native environments.
  • Service Mesh: Isovalent Cilium Enterprise provides seamless service-to-service communication that’s sidecar-free and advanced load balancing, making it easy to deploy and manage complex microservices architectures.
  • Enterprise-grade support: Isovalent Cilium Enterprise includes enterprise-grade support from Isovalent’s experienced team of experts, ensuring that any issues are resolved promptly and efficiently. Additionally, professional services help organizations deploy and manage Cilium in production environments.

Pre-Requisites

The following prerequisites need to be taken into account before you proceed with this tutorial:

  • You should have an Azure Subscription.
  • Install kubectl
  • Install Cilium CLI.
    • Users can contact their partner Sales/SE representative(s) at sales@isovalent.com for more detailed insights into the features below and to access the requisite documentation and Cilium CLI software images.
  • Azure CLI version 2.48.1 or later. Run az --version to see the currently installed version. If you need to install or upgrade, see Install Azure CLI.
  • VNet peering is in place across both clusters. You can also use Private Link Service or Azure VWAN.
  • Ensure you have enough quota resources to create an AKS cluster. Go to the Subscription blade, navigate to “Usage + Quotas”, and make sure you have enough quota for the following resources (a CLI-based quota check is sketched after this list):
    • Regional vCPUs
    • Standard Dv4 Family vCPUs
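
The same quota check can be done from the command line. This is a minimal sketch, assuming canadacentral as the target region; adjust the region and the filter to the regions and VM families you plan to use:

az vm list-usage --location canadacentral -o table | grep -Ei 'Regional vCPUs|Dv4'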

Creating the AKS clusters

Let’s briefly walk through the commands you would use to bring up the AKS clusters running Azure CNI powered by Cilium.

Create an AKS cluster with Azure CNI powered by Cilium in Region A.

az group create --name cluster1 --location canadacentral

az network vnet create -g cluster1 --location canadacentral --name cluster1 --address-prefixes 192.168.16.0/22 -o none

az network vnet subnet create -g cluster1 --vnet-name cluster1 --name nodesubnet --address-prefixes 192.168.16.0/24 -o none

az network vnet subnet create -g cluster1 --vnet-name cluster1 --name podsubnet --address-prefixes 192.168.17.0/24 -o none

az aks create -n cluster1 -g cluster1 -l canadacentral \
  --max-pods 250 \
  --network-plugin azure \
  --vnet-subnet-id /subscriptions/<subscription>/resourceGroups/cluster1/providers/Microsoft.Network/virtualNetworks/cluster1/subnets/nodesubnet \
  --pod-subnet-id /subscriptions/<subscription>/resourceGroups/cluster1/providers/Microsoft.Network/virtualNetworks/cluster1/subnets/podsubnet \
  --network-dataplane cilium

az aks get-credentials --resource-group cluster1 --name cluster1
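
Create the second AKS cluster the same way, with a non-overlapping address space. The commands below are an illustrative sketch: westeurope is only an example second region, and the 192.168.8.0/22 VNet (node subnet 192.168.9.0/24, pod subnet 192.168.10.0/24) is chosen so that it does not overlap with cluster1 and matches the node and pod addresses shown later in this post.

az group create --name cluster2 --location westeurope

az network vnet create -g cluster2 --location westeurope --name cluster2 --address-prefixes 192.168.8.0/22 -o none

az network vnet subnet create -g cluster2 --vnet-name cluster2 --name nodesubnet --address-prefixes 192.168.9.0/24 -o none

az network vnet subnet create -g cluster2 --vnet-name cluster2 --name podsubnet --address-prefixes 192.168.10.0/24 -o none

az aks create -n cluster2 -g cluster2 -l westeurope \
  --max-pods 250 \
  --network-plugin azure \
  --vnet-subnet-id /subscriptions/<subscription>/resourceGroups/cluster2/providers/Microsoft.Network/virtualNetworks/cluster2/subnets/nodesubnet \
  --pod-subnet-id /subscriptions/<subscription>/resourceGroups/cluster2/providers/Microsoft.Network/virtualNetworks/cluster2/subnets/podsubnet \
  --network-dataplane cilium

az aks get-credentials --resource-group cluster2 --name cluster2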

How can you Upgrade your AKS clusters to Isovalent Enterprise for Cilium?

To enable Cluster Mesh on an AKS cluster, you need to upgrade the AKS clusters to Isovalent Enterprise for Cilium. In this tutorial, we will upgrade the AKS clusters using Azure Marketplace.

Note- You can also upgrade your AKS clusters via the Azure CLI or Azure ARM templates; an illustrative CLI sketch follows the marketplace steps below.

  • Login to the Azure Portal
  • Search for Marketplace
  • Search for Isovalent Enterprise for Cilium and select the container app.
  • Click Create
  • Select the Subscription, Resource Group, and Region where the cluster was created
  • Select “No” for the option that says “Create New Dev Cluster”
  • Click Next
  • Select the AKS cluster that was created in the previous section.
  • Click Next
  • The AKS cluster will now be upgraded to Isovalent Enterprise for Cilium.
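
For the Azure CLI route mentioned in the note above, the upgrade boils down to installing the Isovalent Enterprise for Cilium cluster extension on the AKS cluster. The command below is only a sketch of the general shape: the --extension-type value is a placeholder, and the exact string should be taken from the marketplace listing or the Azure CLI instructions referenced later in this post. The upgrade (marketplace or CLI) has to be performed on both clusters.

az k8s-extension create -c cluster1 -t managedClusters -g cluster1 -n cilium --extension-type <isovalent-extension-type>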

What are the Node and Pod IPs across clusters?

Once the AKS clusters are upgraded to Isovalent Enterprise for Cilium, verify that the nodes and pods of the two clusters use distinct, non-overlapping IP addresses.

Note- This is a mandatory requirement. Support for overlapping pod CIDRs will be available in a future release of Isovalent Enterprise for Cilium from the Azure Marketplace.

Run kubectl get nodes and kubectl get pods in each cluster and note the node and pod IP addresses in use.

kubectl get nodes -o wide

NAME                                STATUS   ROLES   AGE    VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
aks-nodepool1-13443004-vmss000000   Ready    agent   2d1h   v1.27.7   192.168.16.4   <none>        Ubuntu 22.04.3 LTS   5.15.0-1053-azure   containerd://1.7.5-1
aks-nodepool1-13443004-vmss000001   Ready    agent   2d1h   v1.27.7   192.168.16.6   <none>        Ubuntu 22.04.3 LTS   5.15.0-1053-azure   containerd://1.7.5-1
aks-nodepool1-13443004-vmss000002   Ready    agent   2d1h   v1.27.7   192.168.16.5   <none>        Ubuntu 22.04.3 LTS   5.15.0-1053-azure   containerd://1.7.5-1

kubectl get pods -o wide -A

NAMESPACE                       NAME                                     READY   STATUS    RESTARTS      AGE    IP              NODE                                NOMINATED NODE   READINESS GATES
azure-extensions-usage-system   billing-operator-84cd55c557-qvgr2        5/5     Running   0             2d     192.168.17.32   aks-nodepool1-13443004-vmss000002   <none>           <none>
kube-system                     azure-cns-dqspr                          1/1     Running   0             114m   192.168.16.5    aks-nodepool1-13443004-vmss000002   <none>           <none>
kube-system                     azure-cns-fxdzd                          1/1     Running   0             115m   192.168.16.6    aks-nodepool1-13443004-vmss000001   <none>           <none>
kube-system                     azure-cns-nrldk                          1/1     Running   0             115m   192.168.16.4    aks-nodepool1-13443004-vmss000000   <none>           <none>
kube-system                     cilium-48wg7                             1/1     Running   0             2d     192.168.16.4    aks-nodepool1-13443004-vmss000000   <none>           <none>
kube-system                     cilium-operator-d78f778f7-kq4hd          1/1     Running   0             2d     192.168.16.4    aks-nodepool1-13443004-vmss000000   <none>           <none>
kube-system                     cilium-operator-d78f778f7-qswpm          1/1     Running   0             2d     192.168.16.5    aks-nodepool1-13443004-vmss000002   <none>           <none>
kube-system                     cilium-wgs4b                             1/1     Running   0             2d     192.168.16.6    aks-nodepool1-13443004-vmss000001   <none>           <none>
kube-system                     cilium-xrpjl                             1/1     Running   1 (33h ago)   2d     192.168.16.5    aks-nodepool1-13443004-vmss000002   <none>           <none>
kube-system                     cloud-node-manager-6skgj                 1/1     Running   0             2d1h   192.168.16.6    aks-nodepool1-13443004-vmss000001   <none>           <none>
kube-system                     cloud-node-manager-7nkgb                 1/1     Running   0             2d1h   192.168.16.5    aks-nodepool1-13443004-vmss000002   <none>           <none>
kube-system                     cloud-node-manager-f4zbj                 1/1     Running   0             2d1h   192.168.16.4    aks-nodepool1-13443004-vmss000000   <none>           <none>
kube-system                     clustermesh-apiserver-555ccb8654-xg68h   2/2     Running   0             2d     192.168.17.39   aks-nodepool1-13443004-vmss000001   <none>           <none>
kube-system                     coredns-789789675-5745t                  1/1     Running   0             2d1h   192.168.17.42   aks-nodepool1-13443004-vmss000001   <none>           <none>
kube-system                     coredns-789789675-8mm2h                  1/1     Running   0             2d1h   192.168.17.27   aks-nodepool1-13443004-vmss000002   <none>           <none>
kube-system                     coredns-autoscaler-649b947bbd-zn96n      1/1     Running   0             2d1h   192.168.17.34   aks-nodepool1-13443004-vmss000002   <none>           <none>
kube-system                     csi-azuredisk-node-f6l7d                 3/3     Running   0             2d1h   192.168.16.5    aks-nodepool1-13443004-vmss000002   <none>           <none>
kube-system                     csi-azuredisk-node-sspm4                 3/3     Running   0             2d1h   192.168.16.6    aks-nodepool1-13443004-vmss000001   <none>           <none>
kube-system                     csi-azuredisk-node-stljp                 3/3     Running   0             2d1h   192.168.16.4    aks-nodepool1-13443004-vmss000000   <none>           <none>
kube-system                     csi-azurefile-node-c5dqz                 3/3     Running   0             2d1h   192.168.16.4    aks-nodepool1-13443004-vmss000000   <none>           <none>
kube-system                     csi-azurefile-node-hdxxv                 3/3     Running   0             2d1h   192.168.16.6    aks-nodepool1-13443004-vmss000001   <none>           <none>
kube-system                     csi-azurefile-node-wf8ww                 3/3     Running   0             2d1h   192.168.16.5    aks-nodepool1-13443004-vmss000002   <none>           <none>
kube-system                     extension-agent-59ff6f87bc-6wsdv         2/2     Running   0             2d     192.168.17.6    aks-nodepool1-13443004-vmss000000   <none>           <none>
kube-system                     extension-operator-59fcdc5cdc-zbgxt      2/2     Running   0             2d     192.168.17.20   aks-nodepool1-13443004-vmss000000   <none>           <none>
kube-system                     hubble-relay-76ff659b59-vk4kp            1/1     Running   0             2d     192.168.17.5    aks-nodepool1-13443004-vmss000000   <none>           <none>
kube-system                     konnectivity-agent-6854bdfcf9-f74gw      1/1     Running   0             2d     192.168.17.50   aks-nodepool1-13443004-vmss000001   <none>           <none>
kube-system                     konnectivity-agent-6854bdfcf9-j6b9r      1/1     Running   0             2d     192.168.17.10   aks-nodepool1-13443004-vmss000000   <none>           <none>
kube-system                     metrics-server-5467676b76-ngmn7          2/2     Running   0             2d1h   192.168.17.40   aks-nodepool1-13443004-vmss000001   <none>           <none>
kube-system                     metrics-server-5467676b76-rtvrx          2/2     Running   0             2d1h   192.168.17.16   aks-nodepool1-13443004-vmss000000   <none>           <none>

How can you Peer the AKS clusters?

Use VNet peering to peer the virtual networks of the two AKS clusters across the chosen regions. In the Azure portal, this is configured once from one virtual network, and the peering links are created in both directions as part of the same step. An equivalent Azure CLI sketch follows the portal steps below.

  • Login to the Azure Portal
  • Click Home
  • Click Virtual Network
  • Select the respective Virtual Network
  • Click Peerings
  • Click Add
  • Give the local peer a name
  • Select “Allow cluster1 to access cluster2”
  • Give the remote peer a name
  • Select the virtual network deployment model as “Resource Manager”
  • Select the subscription
  • Select the virtual network of the remote peer
  • Select “Allow cluster2 to access cluster1”
  • Click Add
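
If you prefer the Azure CLI over the portal, the following is a minimal sketch, assuming the resource group and VNet names used in this tutorial. Unlike the portal flow, the CLI creates one peering link at a time, so it is run from both sides and then verified:

az network vnet peering create -g cluster1 -n cluster1-to-cluster2 --vnet-name cluster1 --remote-vnet /subscriptions/<subscription>/resourceGroups/cluster2/providers/Microsoft.Network/virtualNetworks/cluster2 --allow-vnet-access

az network vnet peering create -g cluster2 -n cluster2-to-cluster1 --vnet-name cluster2 --remote-vnet /subscriptions/<subscription>/resourceGroups/cluster1/providers/Microsoft.Network/virtualNetworks/cluster1 --allow-vnet-access

az network vnet peering list -g cluster1 --vnet-name cluster1 -o table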

Are the pods and nodes across clusters reachable?

With VNet peering in place, ensure that pods and nodes across the clusters can reach each other. As shown below, the clustermesh-apiserver pod in one cluster is reachable from a node of the other cluster, even though the AKS clusters run in different regions:

Clustermesh pods in Cluster 2 should be reachable from Cluster 1.

root@aks-nodepool1-13443004-vmss000000:/# ping 192.168.10.44
PING 192.168.10.44 (192.168.10.44) 56(84) bytes of data.
64 bytes from 192.168.10.44: icmp_seq=1 ttl=63 time=82.8 ms
64 bytes from 192.168.10.44: icmp_seq=2 ttl=63 time=82.5 ms
64 bytes from 192.168.10.44: icmp_seq=3 ttl=63 time=82.0 ms
64 bytes from 192.168.10.44: icmp_seq=4 ttl=63 time=82.3 ms
64 bytes from 192.168.10.44: icmp_seq=5 ttl=63 time=86.0 ms
64 bytes from 192.168.10.44: icmp_seq=6 ttl=63 time=82.2 ms
64 bytes from 192.168.10.44: icmp_seq=7 ttl=63 time=82.3 ms
--- 192.168.10.44 ping statistics ---
7 packets transmitted, 7 received, 0% packet loss, time 6007ms
rtt min/avg/max/mdev = 81.981/82.855/85.965/1.289 ms

How can you enable Cluster Mesh?

You can enable Cluster Mesh via Azure CLI or an ARM template.

Note- Altering the cluster ID causes all Cilium identities to be recreated. In a production cluster with existing connections, this should be planned and handled appropriately.

Azure CLI

Follow the detailed instructions on Azure CLI to create and upgrade a cluster on AKS running Isovalent Enterprise for Cilium.

You can enable Cluster Mesh using the Azure CLI. The commands below apply the configuration on the AKS cluster running Isovalent Enterprise for Cilium and set its cluster name to cluster1.

az k8s-extension update -c cluster1 -t managedClusters -g cluster1 -n cilium --configuration-settings cluster.name=cluster1

az k8s-extension update -c cluster1 -t managedClusters -g cluster1 -n cilium --configuration-settings cluster.id=1

az k8s-extension update -c cluster1 -t managedClusters -g cluster1 -n cilium --configuration-settings clustermesh.config.enabled=false

az k8s-extension update -c cluster1 -t managedClusters -g cluster1 -n cilium --configuration-settings clustermesh.useAPIServer=true

az k8s-extension update -c cluster1 -t managedClusters -g cluster1 -n cilium --configuration-settings clustermesh.apiserver.service.type=LoadBalancer
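
The same settings have to be applied on the second cluster with a unique cluster name and ID. The commands below are a sketch that assumes the second AKS cluster and its resource group are both named cluster2:

az k8s-extension update -c cluster2 -t managedClusters -g cluster2 -n cilium --configuration-settings cluster.name=cluster2

az k8s-extension update -c cluster2 -t managedClusters -g cluster2 -n cilium --configuration-settings cluster.id=2

az k8s-extension update -c cluster2 -t managedClusters -g cluster2 -n cilium --configuration-settings clustermesh.config.enabled=false

az k8s-extension update -c cluster2 -t managedClusters -g cluster2 -n cilium --configuration-settings clustermesh.useAPIServer=true

az k8s-extension update -c cluster2 -t managedClusters -g cluster2 -n cilium --configuration-settings clustermesh.apiserver.service.type=LoadBalancer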

How can you verify Cluster Mesh status?

You can check the status of the clusters by running cilium clustermesh status on either of the clusters.

cilium clustermesh status --context cluster1 --wait

cilium clustermesh status --context cluster2 --wait

The output will look something like this:

Cluster Mesh Output in Cluster 1

cilium clustermesh status --context cluster1 --wait
✅ Service "clustermesh-apiserver" of type "LoadBalancer" found
✅ Cluster access information is available:
  - 20.116.185.121:2379
✅ Deployment clustermesh-apiserver is ready
✅ All 3 nodes are connected to all clusters [min:1 / avg:1.0 / max:1]
🔌 Cluster Connections:
  - cluster2: 3/3 configured, 3/3 connected
🔀 Global services: [ min:10 / avg:10.0 / max:10 ]

What’s the status of the clustermesh-apiserver pod?

The Cluster Mesh API Server contains an etcd instance to keep track of the cluster’s state. The state from multiple clusters is never mixed. Cilium agents in other clusters connect to the Cluster Mesh API Server to watch for changes and replicate the multi-cluster state into their cluster. Access to the Cluster Mesh API Server is protected using TLS certificates. 

Access from one cluster into another is always read-only. This ensures failure domains remain unchanged. A failure in one cluster never propagates into other clusters.

You must ensure that the clustermesh-apiserver pod is running on both clusters.

kubectl get pods -A -o wide

NAMESPACE                       NAME                                     READY   STATUS    RESTARTS   AGE   IP              NODE                                NOMINATED NODE   READINESS GATES
azure-extensions-usage-system   billing-operator-84cd55c557-k7spc        5/5     Running   0          2d    192.168.10.25   aks-nodepool1-42826393-vmss000000   <none>           <none>
kube-system                     azure-cns-5psrk                          1/1     Running   0          2d    192.168.9.6     aks-nodepool1-42826393-vmss000002   <none>           <none>
kube-system                     azure-cns-krcfn                          1/1     Running   0          2d    192.168.9.5     aks-nodepool1-42826393-vmss000000   <none>           <none>
kube-system                     azure-cns-nmfkl                          1/1     Running   0          2d    192.168.9.4     aks-nodepool1-42826393-vmss000001   <none>           <none>
kube-system                     cilium-59zm4                             1/1     Running   0          47h   192.168.9.6     aks-nodepool1-42826393-vmss000002   <none>           <none>
kube-system                     cilium-operator-d78f778f7-mhkg8          1/1     Running   0          2d    192.168.9.4     aks-nodepool1-42826393-vmss000001   <none>           <none>
kube-system                     cilium-operator-d78f778f7-s5r92          1/1     Running   0          2d    192.168.9.5     aks-nodepool1-42826393-vmss000000   <none>           <none>
kube-system                     cilium-rcmbd                             1/1     Running   0          47h   192.168.9.4     aks-nodepool1-42826393-vmss000001   <none>           <none>
kube-system                     cilium-s9czz                             1/1     Running   0          47h   192.168.9.5     aks-nodepool1-42826393-vmss000000   <none>           <none>
kube-system                     cloud-node-manager-7t5cz                 1/1     Running   0          2d    192.168.9.5     aks-nodepool1-42826393-vmss000000   <none>           <none>
kube-system                     cloud-node-manager-c66ft                 1/1     Running   0          2d    192.168.9.4     aks-nodepool1-42826393-vmss000001   <none>           <none>
kube-system                     cloud-node-manager-dtdqk                 1/1     Running   0          2d    192.168.9.6     aks-nodepool1-42826393-vmss000002   <none>           <none>
kube-system                     clustermesh-apiserver-555ccb8654-grxtt   2/2     Running   0          47h   192.168.10.44   aks-nodepool1-42826393-vmss000002   <none>           <none>
kube-system                     coredns-789789675-7k5zv                  1/1     Running   0          2d    192.168.10.17   aks-nodepool1-42826393-vmss000001   <none>           <none>
kube-system                     coredns-789789675-hb5rr                  1/1     Running   0          2d    192.168.10.49   aks-nodepool1-42826393-vmss000002   <none>           <none>
kube-system                     coredns-autoscaler-649b947bbd-rggpl      1/1     Running   0          2d    192.168.10.18   aks-nodepool1-42826393-vmss000001   <none>           <none>
kube-system                     csi-azuredisk-node-hrdj2                 3/3     Running   0          2d    192.168.9.6     aks-nodepool1-42826393-vmss000002   <none>           <none>
kube-system                     csi-azuredisk-node-sngjz                 3/3     Running   0          2d    192.168.9.5     aks-nodepool1-42826393-vmss000000   <none>           <none>
kube-system                     csi-azuredisk-node-zn2ft                 3/3     Running   0          2d    192.168.9.4     aks-nodepool1-42826393-vmss000001   <none>           <none>
kube-system                     csi-azurefile-node-46st2                 3/3     Running   0          2d    192.168.9.5     aks-nodepool1-42826393-vmss000000   <none>           <none>
kube-system                     csi-azurefile-node-4t2lg                 3/3     Running   0          2d    192.168.9.6     aks-nodepool1-42826393-vmss000002   <none>           <none>
kube-system                     csi-azurefile-node-v7kdj                 3/3     Running   0          2d    192.168.9.4     aks-nodepool1-42826393-vmss000001   <none>           <none>
kube-system                     extension-agent-59ff6f87bc-k9rb8         2/2     Running   0          2d    192.168.10.5    aks-nodepool1-42826393-vmss000001   <none>           <none>
kube-system                     extension-operator-59fcdc5cdc-cgx2t      2/2     Running   0          2d    192.168.10.29   aks-nodepool1-42826393-vmss000000   <none>           <none>
kube-system                     hubble-relay-76ff659b59-h9vh6            1/1     Running   0          2d    192.168.10.41   aks-nodepool1-42826393-vmss000002   <none>           <none>
kube-system                     konnectivity-agent-9b76986cd-96qhh       1/1     Running   0          47h   192.168.10.26   aks-nodepool1-42826393-vmss000000   <none>           <none>
kube-system                     konnectivity-agent-9b76986cd-kncrq       1/1     Running   0          47h   192.168.10.40   aks-nodepool1-42826393-vmss000002   <none>           <none>
kube-system                     metrics-server-5467676b76-6tjfv          2/2     Running   0          2d    192.168.10.24   aks-nodepool1-42826393-vmss000000   <none>           <none>
kube-system                     metrics-server-5467676b76-8vc95          2/2     Running   0          2d    192.168.10.48   aks-nodepool1-42826393-vmss000002   <none>           <none>
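
It is also worth confirming that the clustermesh-apiserver service of type LoadBalancer has been assigned an external IP in each cluster, since that is the address remote Cilium agents connect to:

kubectl get svc clustermesh-apiserver -n kube-system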

How can you connect the clusters?

Note- Since the Isovalent image available from the Azure Marketplace is based on Cilium 1.12, it is recommended to use the Cilium CLI to connect the clusters.

You can connect the AKS clusters using Cilium CLI. This step only needs to be done in one direction. The connection will automatically be established in both directions:

cilium clustermesh connect --context cluster1 --destination-context cluster2

✅ Detected Helm release with Cilium version 1.12.17
⚠️ Cilium Version is less than 1.14.0. Continuing in classic mode.
✨ Extracting access information of cluster cluster2...
🔑 Extracting secrets from cluster cluster2...
ℹ️  Found ClusterMesh service IPs: [ 20.116.185.121]
✨ Extracting access information of cluster cluster1...
🔑 Extracting secrets from cluster cluster1...
ℹ️  Found ClusterMesh service IPs: [52.142.113.148]
✨ Connecting cluster cluster1 -> cluster2...
🔑 Secret cilium-clustermesh does not exist yet, creating it...
🔑 Patching existing secret cilium-clustermesh...
✨ Patching DaemonSet with IP aliases cilium-clustermesh...
✨ Connecting cluster cluster2 -> cluster1...
🔑 Secret cilium-clustermesh does not exist yet, creating it...
🔑 Patching existing secret cilium-clustermesh...
✨ Patching DaemonSet with IP aliases cilium-clustermesh...
✅ Connected cluster cluster1 and cluster2!

What can you do now that cluster mesh is enabled?

Load-balancing & Service Discovery

The global service discovery of Cilium’s multi-cluster model is built using standard Kubernetes services and designed to be completely transparent to existing Kubernetes application deployments.

You can create a global service that can be accessed from both clusters. Establishing load-balancing between clusters is achieved by defining a Kubernetes service with an identical name and namespace in each cluster and adding the annotation io.cilium/global-service: "true" to declare it global. Cilium will automatically perform load-balancing to pods in both clusters.

apiVersion: v1
kind: Service
metadata:
  name: rebel-base
  annotations:
    io.cilium/global-service: "true"
spec:
  type: ClusterIP
  ports:
  - port: 80
  selector:
    name: rebel-base

Deploying a Service

  • In Cluster 1, let’s deploy a service
kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/v1.12/examples/kubernetes/clustermesh/global-service-example/rebel-base-global-shared.yaml

kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/v1.12/examples/kubernetes/clustermesh/global-service-example/cluster1.yaml
  • In Cluster 2, let’s deploy a service
kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/v1.12/examples/kubernetes/clustermesh/global-service-example/rebel-base-global-shared.yaml

kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/v1.12/examples/kubernetes/clustermesh/global-service-example/cluster2.yaml
  • From either cluster, access the global service. Notice that responses are received from pods in both clusters.
kubectl exec -ti deployments/x-wing -- /bin/sh -c 'for i in $(seq 1 10); do curl rebel-base; done'

{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}

Service Affinity

Load-balancing across all clusters is not always desirable. Using the local service affinity annotation, traffic can be load-balanced to local endpoints only, falling back to endpoints in remote clusters when no healthy local endpoints are available. The annotation io.cilium/service-affinity: "local|remote|none" specifies the preferred endpoint destination.

For example, if the annotation io.cilium/service-affinity is set to local, the global service will load-balance across healthy local backends, and only use remote endpoints when all of the local backends are unavailable or unhealthy.

  • In cluster 1, add the annotation io.cilium/service-affinity="local" to the existing global service.
kubectl annotate service rebel-base io.cilium/service-affinity=local --overwrite
  • From cluster 1, access the global service. You will see replies from pods in cluster 1 only.
kubectl exec -ti deployments/x-wing -- /bin/sh -c 'for i in $(seq 1 10); do curl rebel-base; done'

{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
  • From cluster 2, access the global service. You will see replies from pods in both clusters as usual.
kubectl exec -ti deployments/x-wing -- /bin/sh -c 'for i in $(seq 1 10); do curl rebel-base; done'

{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-1"}
  • From cluster 1, check the service endpoints; the local endpoints are marked as preferred.
kubectl exec -n kube-system -ti ds/cilium -- cilium service list --clustermesh-affinity

Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init), block-wireserver (init)
ID   Frontend              Service Type   Backend
1    10.0.0.1:443          ClusterIP      1 => 20.123.98.50:443 (active)
2    10.0.0.10:53          ClusterIP      1 => 192.168.10.17:53 (active)
                                          2 => 192.168.10.49:53 (active)
3    10.0.116.42:443       ClusterIP      1 => 192.168.10.24:4443 (active)
                                          2 => 192.168.10.48:4443 (active)
4    10.0.202.68:8443      ClusterIP
5    10.0.124.112:8443     ClusterIP
6    10.0.26.200:443       ClusterIP      1 => 192.168.10.25:8443 (active)
7    10.0.181.227:80       ClusterIP      1 => 192.168.10.41:4245 (active)
8    10.0.11.10:443        ClusterIP      1 => 192.168.9.5:4244 (terminating)
                                          2 => 192.168.9.4:4244 (active)
                                          3 => 192.168.9.6:4244 (active)
9    10.0.191.205:2379     ClusterIP      1 => 192.168.10.44:2379 (active)
10   192.168.9.5:32379     NodePort       1 => 192.168.10.44:2379 (active)
11   0.0.0.0:32379         NodePort       1 => 192.168.10.44:2379 (active)
12   52.142.113.148:2379   LoadBalancer   1 => 192.168.10.44:2379 (active)
18   10.0.201.56:80        ClusterIP      1 => 192.168.10.19:80 (active) (preferred)
                                          2 => 192.168.10.30:80 (active) (preferred)
                                          3 => 192.168.17.18:80 (active)
                                          4 => 192.168.17.23:80 (active)
  • In cluster 1, change the io.cilium/service-affinity value to remote for the existing global service.
kubectl annotate service rebel-base io.cilium/service-affinity=remote --overwrite
  • From cluster 1, access the global service. This time, the replies are coming from pods in cluster 2 only.
kubectl exec -ti deployments/x-wing -- /bin/sh -c 'for i in $(seq 1 10); do curl rebel-base; done'

{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}
{"Galaxy": "Alderaan", "Cluster": "Cluster-2"}

Conclusion

Hopefully, this post gave you a good overview of deploying Cluster Mesh across AKS clusters running Isovalent Enterprise for Cilium. If you have any feedback on the solution, please share it with us. You’ll find us on the Cilium Slack channel.

Try it Out

Isovalent Enterprise for Cilium on the Azure marketplace.
