
All Azure Network plugins lead to Cilium

Amit Gupta
Azure Plugin(s) Upgrade Path

“Your mind is your software; always update and upgrade it to experience its functionality in a better version”- Mike Ssendikwanawa. As soon as I read that quote, the first thing that came to my mind was how you could have a single pane of glass that reduces the complexity of networking plugins. A unified approach decreases the potential for human error, whether changes are made manually or through automation at scale. Azure CNI powered by Cilium is that single pane of glass to which you can upgrade all your Azure network plugins.

As a part of this tutorial, you will learn how to upgrade existing clusters in Azure Kubernetes Service (AKS) using different network plugins (supported by Azure) to Azure CNI powered by Cilium.

What is Azure CNI powered by Cilium? 

By making use of eBPF programs loaded into the Linux kernel, Azure CNI Powered by Cilium provides the following benefits:

  • Functionality equivalent to existing Azure CNI and Azure CNI Overlay plugins
  • Faster service routing
  • More efficient network policy enforcement
  • Better observability of cluster traffic
  • Support for clusters with more nodes, pods, and services.

How and What of Azure Networking?

In AKS, you can deploy a cluster that uses one of the following network models:

Kubenet networking

You can create AKS clusters using kubenet and create a virtual network and subnet. With kubenet, nodes get an IP address from a virtual network subnet. Network address translation (NAT) is configured on the nodes, and pods receive an IP address hidden behind the node IP. This approach reduces the number of IP addresses you must reserve in your network space for pods.

Azure CNI networking

With Azure Container Networking Interface (CNI), every pod gets an IP address from the subnet and can be accessed directly. Systems in the same virtual network as the AKS cluster see the pod IP as the source address for any traffic from the pod. Systems outside the AKS cluster virtual network see the node IP as the source address for any traffic from the pod. These IP addresses must be unique across your network space and must be planned. Each node has a configuration parameter for the maximum number of pods it supports. The equivalent number of IP addresses per node is then reserved upfront for that node. This approach requires more planning and often leads to IP address exhaustion or the need to rebuild clusters in a larger subnet as your application demands grow. Azure CNI networking has been further bifurcated into offerings you can opt for based on your requirements.
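As a quick way to see this per-node reservation in practice, you can inspect the maximum-pods setting on each node pool; the cluster and resource group names below are placeholders:

az aks nodepool list --cluster-name <clusterName> --resource-group <resourceGroup> \
  --query "[].{name:name, maxPods:maxPods, count:count}" -o table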

Azure CNI (Advanced) networking for Dynamic Allocation of IPs and enhanced subnet support

In Azure CNI (Legacy mode), every pod gets an IP address from the subnet and can be accessed directly. These IP addresses must be planned and unique across your network space. Each node has a configuration parameter for the maximum number of pods it supports. The equivalent number of IP addresses per node is then reserved upfront. This approach can lead to IP address exhaustion. To avoid these planning challenges, it is recommended that Azure CNI networking be enabled for the dynamic allocation of IPs and enhanced subnet support.

Azure CNI overlay networking

Azure CNI Overlay represents an evolution of Azure CNI, addressing scalability and planning challenges arising from assigning VNet IPs to pods. It achieves this by assigning private CIDR IPs to pods, which are separate from the VNet and can be reused across multiple clusters. The Azure VNet stack routes packets without encapsulation. Unlike Kubenet, where the traffic dataplane is handled by user-defined routes configured in the subnet’s route table, Azure CNI Overlay delegates this responsibility to Azure networking.

Azure CNI Powered by Cilium

In Azure CNI Powered by Cilium, Cilium leverages eBPF programs in the Linux kernel to accelerate packet processing. Azure CNI powered by Cilium is a convenient “out-of-the-box” approach that gets you started with Cilium in an AKS environment, but if you wish to use advanced Cilium features, you can opt for Isovalent Enterprise for Cilium. Azure CNI powered by Cilium AKS clusters can be created (as explained above) in the following ways:

  • Overlay Mode
  • Dynamic IP allocation mode

Isovalent Enterprise for Cilium

Isovalent Enterprise for Cilium provides a one-click seamless upgrade from Azure CNI powered by Cilium. You can leverage the rich feature set (Network Policy, Encryption, Hubble-UI, Clustermesh, etc.) and get access to Microsoft and Isovalent support.

Bring your own CNI (BYOCNI)

AKS allows you to install any third-party CNI plugin, such as Cilium. You can install Cilium using the Bring Your Own CNI (BYOCNI) feature and leverage the rich Enterprise feature set from Isovalent.
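A minimal sketch of the BYOCNI flow, assuming placeholder names for the cluster, resource group, and location, is to create the cluster with no CNI plugin and then install Cilium with the Cilium CLI (an Isovalent Enterprise for Cilium installation follows the same pattern with the enterprise artifacts):

az aks create -n <clusterName> -g <resourceGroup> -l <location> --network-plugin none
az aks get-credentials --resource-group <resourceGroup> --name <clusterName>
cilium install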

The Upgrade Matrix

Each plugin provides key features (as discussed above), but what stands out for Azure CNI powered by Cilium is that Cilium users benefit from the Azure CNI control plane, with improved IP Address Management (IPAM) and even better integration into Azure Kubernetes Service (AKS), while AKS users benefit from the feature set of Cilium. Providing an upgrade path is key to helping you in this progressive journey with Cilium.

Note

  • If you upgrade to Azure CNI powered by Cilium, your AKS clusters will no longer run the kube-proxy-based iptables implementation. Your clusters are migrated automatically as part of the upgrade process (a quick way to verify this is shown after this list).
  • Scenarios 1-3 below assume that Cilium was not installed on these clusters before the upgrade.
  • Scenario 4 talks about upgrading from Legacy Azure IPAM with Cilium OSS to Azure CNI powered by Cilium. If you have more questions about it, contact sales@isovalent.com
  • Scenario 5 talks about upgrading from Kubenet to Azure CNI powered by Cilium.
  • Scenario 6 discusses upgrading from Kubenet to Azure CNI powered by Cilium (disabling Network Policy on an existing AKS cluster with Kubenet).
  • Scenario 7 below briefly touches upon the Upgrade from Azure CNI powered by Cilium to Isovalent Enterprise for Cilium, and it’s recommended to read Isovalent in Azure Kubernetes Service to get full insights into Isovalent Enterprise for Cilium. Reach out to support@isovalent.com for any support-related queries.
  • Upgrading from BYOCNI OSS to BYOCNI Isovalent Enterprise for Cilium (CEE) is available by contacting sales@isovalent.com
  • Upgrading from Azure CNI OSS to Azure CNI Isovalent Enterprise for Cilium (CEE) is available by contacting sales@isovalent.com
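As a quick way to confirm that kube-proxy replacement is active after any of these upgrades, you can query the Cilium agent's status. The grep filter below is just a convenience; the daemonset name cilium matches what AKS deploys:

kubectl -n kube-system exec ds/cilium -- cilium status | grep KubeProxyReplacement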

Pre-Requisites

The following prerequisites need to be taken into account before you proceed with this tutorial:

  • An Azure account with an active subscription- Create an account for free
  • Azure CLI version 2.48.1 or later. Run az --version to see the currently installed version. If you need to install or upgrade, see Install Azure CLI.
  • If using ARM templates or the REST API, the AKS API version must be 2022-09-02-preview or later.
  • The kubectl command line tool is installed on your device. The version can be the same as or up to one minor version earlier or later than the Kubernetes version of your cluster. For example, if your cluster version is 1.26, you can use kubectl version 1.25, 1.26, or 1.27 with it. To install or upgrade kubectl, see Installing or updating kubectl.
  • Subscription to Azure Monitor (Optional).
  • Install Cilium CLI.
  • Install Helm
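To confirm the tooling is in place before you begin, a quick set of version checks looks like this:

az --version
kubectl version --client
cilium version
helm version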

Scenario 1: Upgrade an AKS cluster on Azure CNI Overlay to Azure CNI powered by Cilium

Create an Azure CNI Overlay cluster and upgrade it to Azure CNI powered by Cilium. This step is optional; if you have an existing cluster, you can go directly to the upgrade section.

Set the subscription

Choose the subscription you want to use if you have multiple Azure subscriptions.

  • Replace SubscriptionName with your subscription name.
  • You can also use your subscription ID instead of your subscription name.
az account set --subscription SubscriptionName

AKS Resource Group Creation

Create a Resource Group

clusterName="overlaytocilium"
resourceGroup="overlaytocilium"
vnet="overlaytocilium"
location="westcentralus"

az group create --name $resourceGroup --location $location
{
  "id": "/subscriptions/##########################/resourceGroups/overlaytocilium",
  "location": "westcentralus",
  "managedBy": null,
  "name": "overlaytocilium",
  "properties": {
    "provisioningState": "Succeeded"
  },
  "tags": null,
  "type": "Microsoft.Resources/resourceGroups"
}

AKS Cluster creation

Create a cluster with Azure CNI Overlay, and use the argument --network-plugin-mode to specify an overlay cluster. If the pod CIDR isn’t specified, AKS assigns a default space: 10.244.0.0/16.

Output Truncated:

az aks create -n $clusterName -g $resourceGroup --location $location --network-plugin azure --network-plugin-mode overlay --pod-cidr 192.168.0.0/16

"loadBalancerSku": "Standard",
    "monitoring": null,
    "natGatewayProfile": null,
    "networkDataplane": "azure",
    "networkMode": null,
    "networkPlugin": "azure",
    "networkPluginMode": "overlay",
    "networkPolicy": null,
    "outboundType": "loadBalancer",
    "podCidr": "192.168.0.0/16",
    "podCidrs": [
      "192.168.0.0/16"
    ],
    "serviceCidr": "10.0.0.0/16",
    "serviceCidrs": [
      "10.0.0.0/16"
    ]
  },

Set the Kubernetes Context

Log in to the Azure portal, browse to Kubernetes Services, select the Kubernetes service you created (the AKS cluster), and click Connect. This will help you connect to your AKS cluster and set the respective Kubernetes context.

az aks get-credentials --resource-group $resourceGroup --name $clusterName

Create a sample application.

  • Use the sample manifest below for an application to see how the pod and node addresses are distinct.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
spec:
  selector:
    matchLabels:
      run: my-nginx
  replicas: 1
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      containers:
      - name: my-nginx
        image: nginx
        ports:
        - containerPort: 80
  • Apply the manifest
kubectl apply -f nginx_deployment_kpr.yaml
deployment.apps/my-nginx created

kubectl get pods -o wide
NAME                        READY   STATUS    RESTARTS   AGE   IP              NODE                                NOMINATED NODE   READINESS GATES
my-nginx-754c4d44b4-7qw2z   1/1     Running   0          14s   192.168.2.195   aks-nodepool1-20463487-vmss000002   <none>           <none>

kubectl get nodes -o wide
NAME                                STATUS   ROLES   AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
aks-nodepool1-20463487-vmss000000   Ready    agent   10m   v1.26.6   10.224.0.5    <none>        Ubuntu 22.04.3 LTS   5.15.0-1041-azure   containerd://1.7.5-1
aks-nodepool1-20463487-vmss000001   Ready    agent   10m   v1.26.6   10.224.0.4    <none>        Ubuntu 22.04.3 LTS   5.15.0-1041-azure   containerd://1.7.5-1
aks-nodepool1-20463487-vmss000002   Ready    agent   10m   v1.26.6   10.224.0.6    <none>        Ubuntu 22.04.3 LTS   5.15.0-1041-azure   containerd://1.7.5-1

Upgrade the cluster to Azure CNI Powered by Cilium

You can update an existing cluster in place to Azure CNI Powered by Cilium.

The upgrade process triggers each node pool to be re-imaged simultaneously. Upgrading each node pool separately to Overlay isn’t supported. Any disruptions to cluster networking are similar to a node image upgrade or Kubernetes version upgrade, where each node in a node pool is re-imaged.

Output Truncated:

az aks update -n $clusterName -g $resourceGroup --network-plugin azure --network-plugin-mode overlay --network-dataplane cilium --network-policy cilium

"loadBalancerSku": "Standard",
    "monitoring": null,
    "natGatewayProfile": null,
    "networkDataplane": "cilium",
    "networkMode": null,
    "networkPlugin": "azure",
    "networkPluginMode": "overlay",
    "networkPolicy": "cilium",
    "outboundType": "loadBalancer",
    "podCidr": "192.168.0.0/16",
    "podCidrs": [
      "192.168.0.0/16"
    ],
    "serviceCidr": "10.0.0.0/16",
    "serviceCidrs": [
      "10.0.0.0/16"
    ]

Kube-Proxy Before Upgrade

Notice the kube-proxy daemonset is running.
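If you prefer the CLI over the portal, a quick pre-upgrade check for the daemonset looks like this:

kubectl get daemonset kube-proxy -n kube-system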

Azure-CNS During Upgrade

Notice a pod that gets created with the prefix azure-cns-transition. The job of CNS is to manage IP allocation for Pods per node and serve requests from Azure IPAM. Azure CNS works differently for Overlay than it does for Azure CNI powered by Cilium, so a transition pod is spun up to handle the migration during the upgrade.
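You can watch for the transition pod during the upgrade with a simple filter; only the azure-cns name prefix is assumed here:

kubectl get pods -n kube-system | grep azure-cns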

No Kube-Proxy After Upgrade

After the upgrade, the kube-proxy daemonset is no longer there, and Cilium completely takes over.
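A quick post-upgrade check should show the kube-proxy daemonset gone and the Cilium daemonset running:

kubectl get daemonset kube-proxy -n kube-system   # expected: Error from server (NotFound)
kubectl get daemonset cilium -n kube-system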

Scenario 2: Upgrade an AKS cluster on Azure CNI for Dynamic Allocation of IPs and enhanced subnet support to Azure CNI powered by Cilium

Create an Azure CNI cluster with dynamic allocation of IPs and enhanced subnet support, and upgrade it to Azure CNI powered by Cilium. This step is optional; if you have an existing cluster, you can go directly to the upgrade section.

Set the subscription

Choose the subscription you want to use if you have multiple Azure subscriptions.

  • Replace SubscriptionName with your subscription name.
  • You can also use your subscription ID instead of your subscription name.
az account set --subscription SubscriptionName

AKS Resource Group creation

Create a Resource Group

clusterName="azurecnidynamictocilium"
resourceGroup="azurecnidynamictocilium"
vnet="azurecnidynamictocilium"
location="australiacentral"

az group create --name $resourceGroup --location $location
{
  "id": "/subscriptions/###########################/resourceGroups/azurecnidynamictocilium",
  "location": "australiacentral",
  "managedBy": null,
  "name": "azurecnidynamictocilium",
  "properties": {
    "provisioningState": "Succeeded"
  },
  "tags": null,
  "type": "Microsoft.Resources/resourceGroups"
}

AKS Network creation

Create a virtual network with a subnet for nodes and pods and retrieve the subnetID.

az network vnet create -g $resourceGroup --location $location --name $vnet --address-prefixes 10.0.0.0/8 -o none 

az network vnet subnet create -g $resourceGroup --vnet-name $vnet --name nodesubnet --address-prefixes 10.240.0.0/16 -o none 

az network vnet subnet create -g $resourceGroup --vnet-name $vnet --name podsubnet --address-prefixes 10.241.0.0/16 -o none
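Optionally, you can capture the two subnet IDs into variables and pass them to the cluster create command below via --vnet-subnet-id and --pod-subnet-id instead of building the resource IDs by hand:

nodeSubnetId=$(az network vnet subnet show -g $resourceGroup --vnet-name $vnet --name nodesubnet --query id -o tsv)
podSubnetId=$(az network vnet subnet show -g $resourceGroup --vnet-name $vnet --name podsubnet --query id -o tsv)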

AKS Cluster creation

Create an AKS cluster, referencing the node subnet with --vnet-subnet-id and the pod subnet with --pod-subnet-id. Make sure to set --network-plugin to azure.

Output Truncated:

az aks create -n $clusterName -g $resourceGroup -l $location \
    --max-pods 250 \
    --node-count 2 \
    --network-plugin azure \
    --vnet-subnet-id /subscriptions/$subscription/resourceGroups/$resourceGroup/providers/Microsoft.Network/virtualNetworks/$vnet/subnets/nodesubnet \
    --pod-subnet-id /subscriptions/$subscription/resourceGroups/$resourceGroup/providers/Microsoft.Network/virtualNetworks/$vnet/subnets/podsubnet

"loadBalancerSku": "Standard",
    "monitoring": null,
    "natGatewayProfile": null,
    "networkDataplane": "azure",
    "networkMode": null,
    "networkPlugin": "azure",
    "networkPluginMode": null,
    "networkPolicy": null,
    "outboundType": "loadBalancer",
    "podCidr": null,
    "podCidrs": null,
    "serviceCidr": "10.0.0.0/16",
    "serviceCidrs": [
      "10.0.0.0/16"
    ]
  },

Note

IPs are allocated to nodes in batches of 16. Pod subnet IP allocation should be planned with a minimum of 16 IPs per node in the cluster; nodes will request 16 IPs on startup and another batch of 16 any time there are <8 IPs unallocated in their allotment.

Set the Kubernetes Context

Log in to the Azure portal, browse to Kubernetes Services, select the Kubernetes service you created (the AKS cluster), and click Connect. This will help you connect to your AKS cluster and set the respective Kubernetes context.

az aks get-credentials --resource-group $resourceGroup --name $clusterName

Create a sample application. 

  • Use the sample manifest below for an application to see how the pod and node addresses are distinct.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
spec:
  selector:
    matchLabels:
      run: my-nginx
  replicas: 1
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      containers:
      - name: my-nginx
        image: nginx
        ports:
        - containerPort: 80
  • Apply the manifest
kubectl apply -f nginx_deployment_kpr.yaml
deployment.apps/my-nginx created

kubectl get pods -o wide
NAME                        READY   STATUS    RESTARTS   AGE   IP           NODE                                NOMINATED NODE   READINESS GATES
my-nginx-754c4d44b4-8hvkt   1/1     Running   0          13s   10.241.0.9   aks-nodepool1-35671130-vmss000001   <none>           <none>

kubectl get nodes -o wide
NAME                                STATUS   ROLES   AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
aks-nodepool1-35671130-vmss000000   Ready    agent   7m28s   v1.26.6   10.240.0.5    <none>        Ubuntu 22.04.3 LTS   5.15.0-1041-azure   containerd://1.7.5-1
aks-nodepool1-35671130-vmss000001   Ready    agent   7m40s   v1.26.6   10.240.0.4    <none>        Ubuntu 22.04.3 LTS   5.15.0-1041-azure   containerd://1.7.5-1

Upgrade the cluster to Azure CNI Powered by Cilium

You can update an existing cluster in place to Azure CNI Powered by Cilium.

The upgrade process triggers each node pool to be re-imaged simultaneously. Any disruptions to cluster networking are similar to a node image upgrade or Kubernetes version upgrade, where each node in a node pool is re-imaged.

Output Truncated:

az aks update -n $clusterName -g $resourceGroup --network-plugin azure --network-plugin-mode overlay --network-dataplane cilium --network-policy cilium

"loadBalancerSku": "Standard",
    "monitoring": null,
    "natGatewayProfile": null,
    "networkDataplane": "cilium",
    "networkMode": null,
    "networkPlugin": "azure",
    "networkPluginMode": null,
    "networkPolicy": "cilium",
    "outboundType": "loadBalancer",
    "podCidr": null,
    "podCidrs": null,
    "serviceCidr": "10.0.0.0/16",
    "serviceCidrs": [
      "10.0.0.0/16"
    ]
  },

Kube-Proxy Before Upgrade

Notice the kube-proxy daemonset is running.

Azure CNS During Upgrade

Notice a pod that gets created with the prefix azure-cns-transition. The job of CNS is to manage IP allocation for Pods per node and serve requests from Azure IPAM. Azure CNS works differently for Azure CNI with dynamic allocation of IPs than it does for Azure CNI powered by Cilium, so a transition pod is spun up to handle the migration during the upgrade.

No Kube-Proxy After Upgrade

After the upgrade, the kube-proxy daemonset is no longer there and Cilium completely takes over.

Scenario 3: Upgrade an AKS cluster on Azure CNI to Azure CNI powered by Cilium

This is a three-step upgrade. In this upgrade, an existing cluster on Azure CNI is upgraded to Azure CNI Overlay and eventually upgraded to Azure CNI powered by Cilium.

Set the subscription

Choose the subscription you want to use if you have multiple Azure subscriptions.

  • Replace SubscriptionName with your subscription name.
  • You can also use your subscription ID instead of your subscription name.
az account set --subscription SubscriptionName

Step 1- Create a cluster on Azure CNI

Create an Azure CNI cluster and upgrade it to Azure CNI Overlay. This step is optional; if you have an existing cluster, you can go directly to the upgrade section.

AKS Resource Group creation

Create a Resource Group

clusterName="azurecnitocilium"
resourceGroup="azurecnitocilium"
vnet="azurecnitocilium"
location="canadacentral"

az group create --name $resourceGroup --location $location
{
  "id": "/subscriptions/###########################/resourceGroups/azurecnitocilium",
  "location": "canadacentral",
  "managedBy": null,
  "name": "azurecnitocilium",
  "properties": {
    "provisioningState": "Succeeded"
  },
  "tags": null,
  "type": "Microsoft.Resources/resourceGroups"
}

AKS Network creation

Create a virtual network with a subnet for nodes and retrieve the subnet ID.

az network vnet create -g $resourceGroup --location $location --name $vnet --address-prefixes 10.0.0.0/8 -o none

az network vnet subnet create -g $resourceGroup --vnet-name $vnet --name $clusterName --address-prefixes 10.240.0.0/16 -o none 

subnetid=$(az network vnet subnet show --resource-group $resourceGroup --vnet-name $vnet --name $clusterName --query id -o tsv)

AKS Cluster creation

Create an AKS cluster, and make sure to use the argument --network-plugin as azure.

Output Truncated:

az aks create -n $clusterName -g $resourceGroup -l $location --network-plugin azure --vnet-subnet-id $subnetid

"loadBalancerSku": "Standard",
    "monitoring": null,
    "natGatewayProfile": null,
    "networkDataplane": "azure",
    "networkMode": null,
    "networkPlugin": "azure",
    "networkPluginMode": null,
    "networkPolicy": null,
    "outboundType": "loadBalancer",
    "podCidr": null,
    "podCidrs": null,
    "serviceCidr": "10.0.0.0/16",
    "serviceCidrs": [
      "10.0.0.0/16"
    ]
  },

Set the Kubernetes Context

Log in to the Azure portal, browse to Kubernetes Services, select the Kubernetes service you created (the AKS cluster), and click Connect. This will help you connect to your AKS cluster and set the respective Kubernetes context.

az aks get-credentials --resource-group $resourceGroup --name $clusterName

Step 2- Upgrade the cluster to Azure CNI Overlay

You can update an existing Azure CNI cluster to Overlay if the cluster meets the following criteria:

  • The cluster is on Kubernetes version 1.22+.
  • Doesn’t use the dynamic pod IP allocation feature.
  • Doesn’t have network policies enabled.
  • Doesn’t use any Windows node pools with docker as the container runtime.

The upgrade process triggers each node pool to be re-imaged simultaneously. Upgrading each node pool separately to Overlay isn’t supported. Any disruptions to cluster networking are similar to a node image upgrade or Kubernetes version upgrade, where each node in a node pool is re-imaged.
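Before upgrading, you can verify these settings in a single call; the query fields below come straight from the az aks show output:

az aks show -n $clusterName -g $resourceGroup \
  --query "{kubernetesVersion:kubernetesVersion, networkPlugin:networkProfile.networkPlugin, networkPluginMode:networkProfile.networkPluginMode, networkPolicy:networkProfile.networkPolicy, networkDataplane:networkProfile.networkDataplane}" \
  -o table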

Output Truncated:

az aks update --name $clusterName --resource-group $resourceGroup --network-plugin-mode overlay --pod-cidr 192.168.0.0/16

"loadBalancerSku": "Standard",
    "monitoring": null,
    "natGatewayProfile": null,
    "networkDataplane": "azure",
    "networkMode": null,
    "networkPlugin": "azure",
    "networkPluginMode": "overlay",
    "networkPolicy": null,
    "outboundType": "loadBalancer",
    "podCidr": "192.168.0.0/16",
    "podCidrs": [
      "192.168.0.0/16"
    ],
    "serviceCidr": "10.0.0.0/16",
    "serviceCidrs": [
      "10.0.0.0/16"
    ]
  },

Step 3- Upgrade the cluster to Azure CNI Powered by Cilium

You can update an existing cluster in place to Azure CNI Powered by Cilium.

The upgrade process triggers each node pool to be re-imaged simultaneously. Upgrading each node pool separately to Overlay isn’t supported. Any disruptions to cluster networking are similar to a node image upgrade or Kubernetes version upgrade, where each node in a node pool is re-imaged.

Output Truncated:

az aks update -n $clusterName -g $resourceGroup --network-plugin azure --network-plugin-mode overlay --network-dataplane cilium --network-policy cilium

"loadBalancerSku": "Standard",
    "monitoring": null,
    "natGatewayProfile": null,
    "networkDataplane": "cilium",
    "networkMode": null,
    "networkPlugin": "azure",
    "networkPluginMode": "overlay",
    "networkPolicy": "cilium",
    "outboundType": "loadBalancer",
    "podCidr": "192.168.0.0/16",
    "podCidrs": [
      "192.168.0.0/16"
    ],
    "serviceCidr": "10.0.0.0/16",
    "serviceCidrs": [
      "10.0.0.0/16"
    ]
  },

Kube-Proxy Before Upgrade

Notice the kube-proxy daemonset is running.

Azure-CNS During Upgrade

Notice a pod that gets created with the prefix azure-cns-transition. The job of CNS is to manage IP allocation for Pods per node and serve requests from Azure IPAM. Azure CNS works differently for Overlay than it does for Azure CNI powered by Cilium, so a transition pod is spun up to handle the migration during the upgrade.

No Kube-Proxy After Upgrade

After the upgrade, the kube-proxy daemonset is no longer there, and Cilium completely takes over.

Scenario 4: Upgrade an AKS cluster on Legacy Azure IPAM with Cilium OSS to Azure CNI powered by Cilium

This is a three-step upgrade. In this upgrade, an existing cluster on Legacy Azure IPAM with Cilium OSS is upgraded to Azure CNI Overlay and eventually upgraded to Azure CNI powered by Cilium.

Note- There could be a potential loss of services and deployments, so it is recommended that you contact support@isovalent.com before proceeding with this upgrade; they can advise you on the migration path.

Set the subscription

Choose the subscription you want to use if you have multiple Azure subscriptions.

  • Replace SubscriptionName with your subscription name.
  • You can also use your subscription ID instead of your subscription name.
az account set --subscription SubscriptionName

Step 1- Create a cluster on Legacy Azure IPAM

Create a cluster on Legacy Azure IPAM and upgrade it to Azure CNI Overlay. This step is optional; if you have an existing cluster, you can go directly to the upgrade section.

AKS Resource Group creation

Create a Resource Group

clusterName="legacyipam"
resourceGroup="legacyipam"
location="westus3"

az group create -l $location -n $resourceGroup

{
  "id": "/subscriptions/##########################/resourceGroups/legacyipam",
  "location": "westus3",
  "managedBy": null,
  "name": "overlaytocilium",
  "properties": {
    "provisioningState": "Succeeded"
  },
  "tags": null,
  "type": "Microsoft.Resources/resourceGroups"
}

AKS Cluster creation

Create an AKS cluster and make sure to use the argument --network-plugin as azure.

Output Truncated:

az aks create -l $location -g $resourceGroup -n $clusterName --network-plugin azure

"loadBalancerSku": "Standard",
    "monitoring": null,
    "natGatewayProfile": null,
    "networkDataplane": "azure",
    "networkMode": null,
    "networkPlugin": "azure",
    "networkPluginMode": null,
    "networkPolicy": "none",
    "outboundType": "loadBalancer",
    "podCidr": null,
    "podCidrs": null,
    "serviceCidr": "10.0.0.0/16",
    "serviceCidrs": [
      "10.0.0.0/16"

Set the Kubernetes Context

Log in to the Azure portal, browse to Kubernetes Services, select the Kubernetes service you created (the AKS cluster), and click Connect. This will help you connect to your AKS cluster and set the respective Kubernetes context.

az aks get-credentials --resource-group $resourceGroup --name $clusterName

Create a Service Principal:

To allow the cilium-operator to interact with the Azure API, a Service Principal with Contributor privileges over the AKS cluster is required (see Azure IPAM required privileges for more details). It is recommended to create a dedicated Service Principal for each Cilium installation with minimal privileges over the AKS node resource group:

Note- The AZURE_NODE_RESOURCE_GROUP is the MC_* node resource group that contains all of the infrastructure resources associated with the cluster.

AZURE_SUBSCRIPTION_ID=$(az account show --query "id" --output tsv)
AZURE_NODE_RESOURCE_GROUP=$(az aks show --resource-group $resourceGroup --name $clusterName --query "nodeResourceGroup" --output tsv)
AZURE_SERVICE_PRINCIPAL=$(az ad sp create-for-rbac --scopes /subscriptions/${AZURE_SUBSCRIPTION_ID}/resourceGroups/${AZURE_NODE_RESOURCE_GROUP} --role Contributor --output json --only-show-errors)
AZURE_TENANT_ID=$(echo ${AZURE_SERVICE_PRINCIPAL} | jq -r '.tenant')
AZURE_CLIENT_ID=$(echo ${AZURE_SERVICE_PRINCIPAL} | jq -r '.appId')
AZURE_CLIENT_SECRET=$(echo ${AZURE_SERVICE_PRINCIPAL} | jq -r '.password')

Setup Helm repository:

Add the Cilium repo

helm repo add cilium https://helm.cilium.io/

Install Cilium

helm install cilium cilium/cilium --version 1.14.0 \
  --namespace kube-system \
  --set azure.enabled=true \
  --set azure.resourceGroup=$AZURE_NODE_RESOURCE_GROUP \
  --set azure.subscriptionID=$AZURE_SUBSCRIPTION_ID \
  --set azure.tenantID=$AZURE_TENANT_ID \
  --set azure.clientID=$AZURE_CLIENT_ID \
  --set azure.clientSecret=$AZURE_CLIENT_SECRET \
  --set tunnel=disabled \
  --set ipam.mode=azure \
  --set enableIPv4Masquerade=false \
  --set nodeinit.enabled=true
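Once the Helm install completes, you can wait for the Cilium agents to become ready before continuing:

cilium status --wait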

Step 2- Upgrade the cluster to Azure CNI Overlay

You can update an existing Azure CNI cluster to Overlay if the cluster meets the following criteria:

  • The cluster is on Kubernetes version 1.22+.
  • Doesn’t use the dynamic pod IP allocation feature.
  • Doesn’t have network policies enabled.
  • Doesn’t use any Windows node pools with docker as the container runtime.

The upgrade process triggers each node pool to be re-imaged simultaneously. Upgrading each node pool separately to Overlay isn’t supported. Any disruptions to cluster networking are similar to a node image upgrade or Kubernetes version upgrade, where each node in a node pool is re-imaged.

Output Truncated:

az aks update --name $clusterName \
--resource-group $resourceGroup \
--network-plugin-mode overlay \
--pod-cidr 192.168.0.0/16


"loadBalancerSku": "Standard",
    "monitoring": null,
    "natGatewayProfile": null,
    "networkDataplane": "azure",
    "networkMode": null,
    "networkPlugin": "azure",
    "networkPluginMode": "overlay",
    "networkPolicy": null,
    "outboundType": "loadBalancer",
    "podCidr": "192.168.0.0/16",
    "podCidrs": [
      "192.168.0.0/16"
    ],
    "serviceCidr": "10.0.0.0/16",
    "serviceCidrs": [
      "10.0.0.0/16"
    ]
  },

Step 3- Upgrade the cluster to Azure CNI powered by Cilium

You can update an existing cluster in place to Azure CNI Powered by Cilium.

The upgrade process triggers each node pool to be re-imaged simultaneously. Upgrading each node pool separately to Overlay isn’t supported. Any disruptions to cluster networking are similar to a node image upgrade or Kubernetes version upgrade, where each node in a node pool is re-imaged.

Output Truncated:

az aks update -n $clusterName -g $resourceGroup --network-plugin azure --network-plugin-mode overlay --network-dataplane cilium --network-policy cilium

    "loadBalancerSku": "Standard",
    "monitoring": null,
    "natGatewayProfile": null,
    "networkDataplane": "cilium",
    "networkMode": null,
    "networkPlugin": "azure",
    "networkPluginMode": "overlay",
    "networkPolicy": "cilium",
    "outboundType": "loadBalancer",
    "podCidr": "192.168.0.0/16",
    "podCidrs": [
      "192.168.0.0/16"
    ],
    "serviceCidr": "10.0.0.0/16",
    "serviceCidrs": [
      "10.0.0.0/16"
    ]
  },

Kube-Proxy Before Upgrade

Notice the kube-proxy daemonset is running.

Azure-CNS During Upgrade

Notice a pod that gets created with the prefix azure-cns-transition. The job of CNS is to manage IP allocation for Pods per node and serve requests from Azure IPAM. Azure CNS works differently for Overlay than it does for Azure CNI powered by Cilium, so a transition pod is spun up to handle the migration during the upgrade.

No Kube-Proxy After Upgrade

After the upgrade, the kube-proxy daemonset is no longer there, and Cilium completely takes over.

Scenario 5: Upgrade an AKS cluster on Kubenet to Azure CNI powered by Cilium

This is a three-step upgrade. In this upgrade, an existing cluster on Kubenet is upgraded to Azure CNI Overlay and eventually upgraded to Azure CNI powered by Cilium.

Set the subscription

Choose the subscription you want to use if you have multiple Azure subscriptions.

  • Replace SubscriptionName with your subscription name.
  • You can also use your subscription ID instead of your subscription name.
az account set --subscription SubscriptionName

Step 1- Create a cluster on Kubenet

Create a cluster on Kubenet and upgrade it to Azure CNI Overlay. This step is optional; if you have an existing cluster, you can go directly to the upgrade section.

AKS Resource Group creation

Create a Resource Group

clusterName="kubenetnopolicy"
resourceGroup="kubenetnopolicy"
location="westus2"

az group create -l $location -n $resourceGroup

{
  "id": "/subscriptions/###########################/resourceGroups/kubenetnopolicy",
  "location": "westus2",
  "managedBy": null,
  "name": "kubenetnopolicy",
  "properties": {
    "provisioningState": "Succeeded"
  },
  "tags": null,
  "type": "Microsoft.Resources/resourceGroups"
}

Create a Managed Identity

Create a Managed Identity for the AKS cluster. Take note of the principalId, as it is required when assigning a role to the managed identity.

az identity create --name myIdentity --resource-group kubenetnopolicy

{
  "clientId": "###########################",
  "id": "/subscriptions/###########################/resourcegroups/kubenetnopolicy/providers/Microsoft.ManagedIdentity/userAssignedIdentities/myIdentity",
  "location": "westus2",
  "name": "myIdentity",
  "principalId": "###########################",
  "resourceGroup": "kubenetnopolicy",
  "systemData": null,
  "tags": {},
  "tenantId": "###########################",
  "type": "Microsoft.ManagedIdentity/userAssignedIdentities"
}

AKS Network creation

Create a virtual network with a subnet for nodes and retrieve the subnet ID.

az network vnet create \
    --resource-group kubenetnopolicy \
    --name kubenetnopolicy \
    --address-prefixes 192.168.0.0/16 \
    --subnet-name kubenetnopolicy \
    --subnet-prefix 192.168.1.0/24

{
  "newVNet": {
    "addressSpace": {
      "addressPrefixes": [
        "192.168.0.0/16"
      ]
    },
    "enableDdosProtection": false,
    "etag": "W/\"###########################\"",
    "id": "/subscriptions/###########################/resourceGroups/kubenetpolicy/providers/Microsoft.Network/virtualNetworks/kubenetpolicy",
    "location": "westus3",
    "name": "kubenetpolicy",
    "provisioningState": "Succeeded",
    "resourceGroup": "kubenetpolicy",
    "resourceGuid": "###########################",
    "subnets": [
      {
        "addressPrefix": "192.168.1.0/24",
        "delegations": [],
        "etag": "W/\"###########################\"",
        "id": "/subscriptions/###########################/resourceGroups/kubenetpolicy/providers/Microsoft.Network/virtualNetworks/kubenetpolicy/subnets/kubenetpolicy",
        "name": "kubenetpolicy",
        "privateEndpointNetworkPolicies": "Disabled",
        "privateLinkServiceNetworkPolicies": "Enabled",
        "provisioningState": "Succeeded",
        "resourceGroup": "kubenetpolicy",
        "type": "Microsoft.Network/virtualNetworks/subnets"
      }
    ],
    "type": "Microsoft.Network/virtualNetworks",
    "virtualNetworkPeerings": []
  }
}

VNET_ID=$(az network vnet show --resource-group kubenetnopolicy --name kubenetnopolicy --query id -o tsv)

Role assignment for User-Managed Identity

Assign a network contributor role for the user-managed identity using principalId in the previous step. Also, fetch the subnet ID.

az role assignment create --assignee <principalId> --scope $VNET_ID --role "Network Contributor"
az role assignment create --assignee ########################### --scope $VNET_ID --role "Network Contributor"

{
  "condition": null,
  "conditionVersion": null,
  "createdBy": "###########################",
  "createdOn": "2023-12-02T08:46:40.033976+00:00",
  "delegatedManagedIdentityResourceId": null,
  "description": null,
  "id": "/subscriptions/###########################/resourceGroups/kubenetnopolicy/providers/Microsoft.Network/virtualNetworks/kubenetnopolicy/providers/Microsoft.Authorization/roleAssignments/###########################",
  "name": "###########################",
  "principalId": "###########################",
  "principalName": "###########################",
  "principalType": "ServicePrincipal",
  "resourceGroup": "kubenetnopolicy",
  "roleDefinitionId": "/subscriptions/###########################/providers/Microsoft.Authorization/roleDefinitions/###########################",
  "roleDefinitionName": "Network Contributor",
  "scope": "/subscriptions/###########################/resourceGroups/kubenetnopolicy/providers/Microsoft.Network/virtualNetworks/kubenetnopolicy",
  "type": "Microsoft.Authorization/roleAssignments",
  "updatedBy": "###########################",
  "updatedOn": "2023-12-02T08:46:40.033976+00:00"
}

SUBNET_ID=$(az network vnet subnet show --resource-group kubenetnopolicy --vnet-name kubenetnopolicy --name kubenetnopolicy --query id -o tsv)

AKS Cluster creation

Create an AKS cluster with managed identities.

Output Truncated:

az aks create \
    --resource-group $resourceGroup \
    --name $clusterName \
    --network-plugin kubenet \
    --service-cidr 10.0.0.0/16 \
    --dns-service-ip 10.0.0.10 \
    --pod-cidr 10.244.0.0/16 \
    --docker-bridge-address 172.17.0.1/16 \
    --vnet-subnet-id $SUBNET_ID \
    --assign-identity /subscriptions/##########################/resourcegroups/kubenetnopolicy/providers/Microsoft.ManagedIdentity/userAssignedIdentities/myIdentity

Set the Kubernetes Context

Log in to the Azure portal, browse to Kubernetes Services, select the Kubernetes service you created (the AKS cluster), and click Connect. This will help you connect to your AKS cluster and set the respective Kubernetes context.

az aks get-credentials --resource-group $resourceGroup --name $clusterName

Step 2- Upgrade the cluster to Azure CNI Overlay

You can update an existing Azure CNI cluster to Overlay if the cluster meets the following criteria:

  • The cluster is on Kubernetes version 1.22+.
  • Doesn’t use the dynamic pod IP allocation feature.
  • Doesn’t have network policies enabled.
  • Doesn’t use any Windows node pools with docker as the container runtime.

The upgrade process triggers each node pool to be re-imaged simultaneously. Upgrading each node pool separately to Overlay isn’t supported. Any disruptions to cluster networking are similar to a node image upgrade or Kubernetes version upgrade, where each node in a node pool is re-imaged.

Output Truncated:

az aks update --name $clusterName \
--resource-group $resourceGroup \
--network-plugin azure \
--network-plugin-mode overlay

Step 3- Upgrade the cluster to Azure CNI powered by Cilium

You can update an existing cluster in place to Azure CNI Powered by Cilium.

The upgrade process triggers each node pool to be re-imaged simultaneously. Upgrading each node pool separately to Overlay isn’t supported. Any disruptions to cluster networking are similar to a node image upgrade or Kubernetes version upgrade, where each node in a node pool is re-imaged.

Output Truncated:

az aks update -n $clusterName -g $resourceGroup --network-plugin azure --network-plugin-mode overlay --network-dataplane cilium --network-policy cilium

"loadBalancerSku": "Standard",
    "monitoring": null,
    "natGatewayProfile": null,
    "networkDataplane": "cilium",
    "networkMode": null,
    "networkPlugin": "azure",
    "networkPluginMode": "overlay",
    "networkPolicy": "cilium",
    "outboundType": "loadBalancer",
    "podCidr": "10.244.0.0/16",
    "podCidrs": [
      "10.244.0.0/16"
    ],
    "serviceCidr": "10.0.0.0/16",
    "serviceCidrs": [
      "10.0.0.0/16"
    ]

Kube-Proxy Before Upgrade

Notice the kube-proxy daemonset is running.

Azure-CNS During Upgrade

Notice a pod that gets created with the prefix azure-cns-transition. The job of CNS is to manage IP allocation for Pods per node and serve requests from Azure IPAM. Azure CNS works differently for Overlay than it does for Azure CNI powered by Cilium, so a transition pod is spun up to handle the migration during the upgrade.

No Kube-Proxy After Upgrade

After the upgrade, the kube-proxy daemonset is no longer there, and Cilium completely takes over.

Scenario 6: Upgrade an AKS cluster on Kubenet to Azure CNI powered by Cilium (disabling Network Policy)

This is a three-step upgrade. In this upgrade, an existing cluster on Kubenet with network policy (Calico) is upgraded to Azure CNI Overlay and eventually upgraded to Azure CNI powered by Cilium.

Note

  • The uninstall process does not remove the Custom Resource Definitions (CRDs) and Custom Resources (CRs) used by Calico. These CRDs and CRs all have names ending with either “projectcalico.org” or “tigera.io.” They can be manually deleted after Calico is successfully uninstalled (deleting the CRDs before removing Calico breaks the cluster); a cleanup sketch follows this note.
  • The upgrade will not remove any NetworkPolicy resources in the cluster, but these policies will no longer be enforced once Calico is uninstalled.
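Once the upgrade in Step 3 has completed and Calico is fully uninstalled, the leftover Calico CRDs (and their CRs) can be listed and removed with something like the following; run this only after the upgrade, never before:

kubectl get crd -o name | grep -E 'projectcalico.org|tigera.io'
kubectl delete $(kubectl get crd -o name | grep -E 'projectcalico.org|tigera.io')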

Set the subscription

Choose the subscription you want to use if you have multiple Azure subscriptions.

  • Replace SubscriptionName with your subscription name.
  • You can also use your subscription ID instead of your subscription name.
az account set --subscription SubscriptionName

Step 1- Create a cluster on Kubenet

Create a cluster on Kubenet and upgrade it to Azure CNI Overlay. This step is optional; if you have an existing cluster, you can go directly to the upgrade section.

AKS Resource Group creation

Create a Resource Group

clusterName="kubenetpolicy"
resourceGroup="kubenetpolicy"
location="australiacentral"

az group create -l $location -n $resourceGroup

{
  "id": "/subscriptions/###########################/resourceGroups/kubenetpolicy",
  "location": "australiacentral",
  "managedBy": null,
  "name": "kubenetpolicy",
  "properties": {
    "provisioningState": "Succeeded"
  },
  "tags": null,
  "type": "Microsoft.Resources/resourceGroups"
}

Create a Managed Identity

Create a Managed Identity for the AKS cluster. Take note of the principalId, which is required when assigning a role to the managed identity.

az identity create --name myIdentity --resource-group kubenetpolicy

{
  "clientId": "###########################",
  "id": "/subscriptions/###########################/resourcegroups/kubenetpolicy/providers/Microsoft.ManagedIdentity/userAssignedIdentities/myIdentity",
  "location": "australiacentral",
  "name": "myIdentity",
  "principalId": "###########################",
  "resourceGroup": "kubenetpolicy",
  "systemData": null,
  "tags": {},
  "tenantId": "###########################",
  "type": "Microsoft.ManagedIdentity/userAssignedIdentities"
}

AKS Network creation

Create a virtual network with a subnet for nodes and retrieve the subnet ID.

az network vnet create \
    --resource-group kubenetpolicy \
    --name kubenetpolicy \
    --address-prefixes 192.168.0.0/16 \
    --subnet-name kubenetpolicy \
    --subnet-prefix 192.168.1.0/24

{
  "newVNet": {
    "addressSpace": {
      "addressPrefixes": [
        "192.168.0.0/16"
      ]
    },
    "enableDdosProtection": false,
    "etag": "W/\"###########################\"",
    "id": "/subscriptions/###########################/resourceGroups/kubenetpolicy/providers/Microsoft.Network/virtualNetworks/kubenetpolicy",
    "location": "australiacentral",
    "name": "kubenetpolicy",
    "provisioningState": "Succeeded",
    "resourceGroup": "kubenetpolicy",
    "resourceGuid": "###########################",
    "subnets": [
      {
        "addressPrefix": "192.168.1.0/24",
        "delegations": [],
        "etag": "W/\"###########################\"",
        "id": "/subscriptions/###########################/resourceGroups/kubenetpolicy/providers/Microsoft.Network/virtualNetworks/kubenetpolicy/subnets/kubenetpolicy",
        "name": "kubenetpolicy",
        "privateEndpointNetworkPolicies": "Disabled",
        "privateLinkServiceNetworkPolicies": "Enabled",
        "provisioningState": "Succeeded",
        "resourceGroup": "kubenetpolicy",
        "type": "Microsoft.Network/virtualNetworks/subnets"
      }
    ],
    "type": "Microsoft.Network/virtualNetworks",
    "virtualNetworkPeerings": []
  }
}

VNET_ID=$(az network vnet show --resource-group kubenetpolicy --name kubenetpolicy --query id -o tsv)

Role assignment for User-Managed Identity

Assign a network contributor role for the user-managed identity using principalId in the previous step. Also, fetch the subnet ID.

az role assignment create --assignee <principalId> --scope $VNET_ID --role "Network Contributor"
az role assignment create --assignee ########################### --scope $VNET_ID --role "Network Contributor"

{
  "condition": null,
  "conditionVersion": null,
  "createdBy": "###########################",
  "createdOn": "2023-12-02T08:46:40.033976+00:00",
  "delegatedManagedIdentityResourceId": null,
  "description": null,
  "id": "/subscriptions/###########################/resourceGroups/kubenetnopolicy/providers/Microsoft.Network/virtualNetworks/kubenetnopolicy/providers/Microsoft.Authorization/roleAssignments/###########################",
  "name": "###########################",
  "principalId": "###########################",
  "principalName": "###########################",
  "principalType": "ServicePrincipal",
  "resourceGroup": "kubenetnopolicy",
  "roleDefinitionId": "/subscriptions/###########################/providers/Microsoft.Authorization/roleDefinitions/###########################",
  "roleDefinitionName": "Network Contributor",
  "scope": "/subscriptions/###########################/resourceGroups/kubenetnopolicy/providers/Microsoft.Network/virtualNetworks/kubenetnopolicy",
  "type": "Microsoft.Authorization/roleAssignments",
  "updatedBy": "###########################",
  "updatedOn": "2023-12-02T08:46:40.033976+00:00"
}

SUBNET_ID=$(az network vnet subnet show --resource-group kubenetpolicy --vnet-name kubenetpolicy --name kubenetpolicy --query id -o tsv)

AKS Cluster creation

Create an AKS cluster with managed identities and network policy set to calico.

Output Truncated:

az aks create \
    --resource-group $resourceGroup \
    --name $clusterName \
    --network-plugin kubenet \
    --network-policy calico \
    --service-cidr 10.0.0.0/16 \
    --dns-service-ip 10.0.0.10 \
    --pod-cidr 10.244.0.0/16 \
    --docker-bridge-address 172.17.0.1/16 \
    --vnet-subnet-id $SUBNET_ID \
    --assign-identity /subscriptions/##########################/resourcegroups/kubenetpolicy/providers/Microsoft.ManagedIdentity/userAssignedIdentities/myIdentity

Set the Kubernetes Context

Log in to the Azure portal, browse to Kubernetes Services, select the Kubernetes service you created (the AKS cluster), and click Connect. This will help you connect to your AKS cluster and set the respective Kubernetes context.

az aks get-credentials --resource-group $resourceGroup --name $clusterName
kubectl get pods -A -o wide

NAMESPACE         NAME                                      READY   STATUS        RESTARTS   AGE     IP            NODE                                NOMINATED NODE   READINESS GATES
calico-system     calico-kube-controllers-554ffc48c-nj8hj   1/1     Running       0          2m5s    10.244.0.2    aks-nodepool1-11723128-vmss000001   <none>           <none>
calico-system     calico-node-kflv8                         1/1     Running       0          2m5s    192.168.1.4   aks-nodepool1-11723128-vmss000002   <none>           <none>
calico-system     calico-node-nsvvt                         1/1     Running       0          2m5s    192.168.1.5   aks-nodepool1-11723128-vmss000000   <none>           <none>
calico-system     calico-node-sx66w                         1/1     Running       0          2m5s    192.168.1.6   aks-nodepool1-11723128-vmss000001   <none>           <none>
calico-system     calico-typha-6d6f885dbc-ht6k9             1/1     Running       0          2m5s    192.168.1.6   aks-nodepool1-11723128-vmss000001   <none>           <none>
calico-system     calico-typha-6d6f885dbc-tcnhh             1/1     Running       0          118s    192.168.1.4   aks-nodepool1-11723128-vmss000002   <none>           <none>
kube-system       cloud-node-manager-fn4zd                  1/1     Running       0          3m4s    192.168.1.4   aks-nodepool1-11723128-vmss000002   <none>           <none>
kube-system       cloud-node-manager-ld2x8                  1/1     Running       0          3m16s   192.168.1.6   aks-nodepool1-11723128-vmss000001   <none>           <none>
kube-system       cloud-node-manager-mvct2                  1/1     Running       0          3m6s    192.168.1.5   aks-nodepool1-11723128-vmss000000   <none>           <none>
kube-system       coredns-789789675-7r2n8                   1/1     Running       0          3m20s   10.244.0.7    aks-nodepool1-11723128-vmss000001   <none>           <none>
kube-system       coredns-789789675-rfdqp                   1/1     Running       0          78s     10.244.1.2    aks-nodepool1-11723128-vmss000000   <none>           <none>
kube-system       coredns-autoscaler-649b947bbd-wzwkh       1/1     Running       0          3m20s   10.244.0.8    aks-nodepool1-11723128-vmss000001   <none>           <none>
kube-system       csi-azuredisk-node-2vjfr                  3/3     Running       0          3m6s    192.168.1.5   aks-nodepool1-11723128-vmss000000   <none>           <none>
kube-system       csi-azuredisk-node-7lr6x                  3/3     Running       0          3m4s    192.168.1.4   aks-nodepool1-11723128-vmss000002   <none>           <none>
kube-system       csi-azuredisk-node-9mfr6                  3/3     Running       0          3m16s   192.168.1.6   aks-nodepool1-11723128-vmss000001   <none>           <none>
kube-system       csi-azurefile-node-4r8dk                  3/3     Running       0          3m4s    192.168.1.4   aks-nodepool1-11723128-vmss000002   <none>           <none>
kube-system       csi-azurefile-node-cn5qh                  3/3     Running       0          3m16s   192.168.1.6   aks-nodepool1-11723128-vmss000001   <none>           <none>
kube-system       csi-azurefile-node-pnmcj                  3/3     Running       0          3m6s    192.168.1.5   aks-nodepool1-11723128-vmss000000   <none>           <none>
kube-system       konnectivity-agent-6b8fc9f4cc-9n2g4       1/1     Running       0          3m19s   10.244.0.4    aks-nodepool1-11723128-vmss000001   <none>           <none>
kube-system       konnectivity-agent-6b8fc9f4cc-qstfv       1/1     Running       0          3m19s   10.244.0.6    aks-nodepool1-11723128-vmss000001   <none>           <none>
kube-system       kube-proxy-6lq5l                          1/1     Running       0          3m16s   192.168.1.6   aks-nodepool1-11723128-vmss000001   <none>           <none>
kube-system       kube-proxy-j5lv7                          1/1     Running       0          3m4s    192.168.1.4   aks-nodepool1-11723128-vmss000002   <none>           <none>
kube-system       kube-proxy-rrrqk                          1/1     Running       0          3m6s    192.168.1.5   aks-nodepool1-11723128-vmss000000   <none>           <none>
kube-system       metrics-server-5467676b76-6dlsp           2/2     Running       0          71s     10.244.2.2    aks-nodepool1-11723128-vmss000002   <none>           <none>
kube-system       metrics-server-5467676b76-dl9pw           2/2     Running       0          71s     10.244.1.3    aks-nodepool1-11723128-vmss000000   <none>           <none>
kube-system       metrics-server-7557c5798-2hbtw            2/2     Terminating   0          3m19s   10.244.0.3    aks-nodepool1-11723128-vmss000001   <none>           <none>
tigera-operator   tigera-operator-65ff6ffb6d-xxklk          1/1     Running       0          3m17s   192.168.1.6   aks-nodepool1-11723128-vmss000001   <none>           <none>

Step 2- Upgrade the cluster to Azure CNI Overlay

You can update an existing Azure CNI cluster to Overlay if the cluster meets the following criteria:

  • The cluster is on Kubernetes version 1.22+.
  • Doesn’t use the dynamic pod IP allocation feature.
  • Doesn’t have network policies enabled.
  • Doesn’t use any Windows node pools with docker as the container runtime.

The upgrade process triggers each node pool to be re-imaged simultaneously. Upgrading each node pool separately to Overlay isn’t supported. Any disruptions to cluster networking are similar to a node image upgrade or Kubernetes version upgrade, where each node in a node pool is re-imaged.

Note- Make sure that network-policy is set to none.

Output Truncated:

az aks update --name kubenetpolicy \
--resource-group kubenetpolicy \
--network-plugin azure \
--network-policy none \
--network-plugin-mode overlay

Step 3- Upgrade the cluster to Azure CNI powered by Cilium

You can update an existing cluster in place to Azure CNI Powered by Cilium.

The upgrade process triggers each node pool to be re-imaged simultaneously. Upgrading each node pool separately to Overlay isn’t supported. Any disruptions to cluster networking are similar to a node image upgrade or Kubernetes version upgrade, where each node in a node pool is re-imaged.

Output Truncated:

az aks update -n $clusterName -g $resourceGroup --network-plugin azure --network-plugin-mode overlay --network-dataplane cilium --network-policy cilium

"loadBalancerSku": "Standard",
    "monitoring": null,
    "natGatewayProfile": null,
    "networkDataplane": "cilium",
    "networkMode": null,
    "networkPlugin": "azure",
    "networkPluginMode": "overlay",
    "networkPolicy": "cilium",
    "outboundType": "loadBalancer",
    "podCidr": "10.244.0.0/16",
    "podCidrs": [
      "10.244.0.0/16"
    ],
    "serviceCidr": "10.0.0.0/16",
    "serviceCidrs": [
      "10.0.0.0/16"
    ]

Kube-Proxy Before Upgrade

Notice the kube-proxy daemonset is running.

Azure-CNS During Upgrade

Notice a pod that gets created with the prefix azure-cns-transition. The job of CNS is to manage IP allocation for Pods per node and serve requests from Azure IPAM. Azure CNS works differently for Overlay than it does for Azure CNI powered by Cilium, so a transition pod is spun up to handle the migration during the upgrade.

No Kube-Proxy After Upgrade

After the upgrade, the kube-proxy daemonset is no longer there, and Cilium completely takes over.

Scenario 7: Upgrade to Isovalent Enterprise for Cilium

You can upgrade your AKS cluster in any of the scenarios above to Isovalent Enterprise for Cilium. For brevity, this tutorial follows one such upgrade, as described below.

You can follow this blog and the steps to upgrade an existing AKS cluster to Isovalent Enterprise for Cilium.

  • In the Azure portal, search for Marketplace on the top search bar. In the results, under Services, select Marketplace.
  • Type ‘Isovalent’ In the search window and select the offer.
  • On the Plans + Pricing tab, select an option. Ensure that the terms are acceptable, and then select Create.
  • Select the resource group containing the cluster that will be upgraded.
  • Under ‘Create New Dev Cluster’, select ‘No,’ and click Next: Cluster Details.
  • As ‘No’ was selected, this will upgrade an existing cluster in that region.
  • Select the name of the AKS cluster from the drop-down; it is auto-populated.
  • Click ‘Next: Review + Create’.
  • Once the final validation is complete, click ‘Create’.
  • When the application is deployed, the portal will show ‘Your deployment is complete’, along with deployment details.
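After the deployment completes, a quick way to confirm that the Cilium agents are up is to list the agent pods; the k8s-app=cilium label is the standard Cilium agent label and is assumed here:

kubectl get pods -n kube-system -l k8s-app=cilium -o wide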

Failure Scenario

Upgrade from Azure CNI to Azure CNI powered by Cilium

As you learned in Scenario 3, an AKS cluster on Azure CNI can be upgraded to Azure CNI powered by Cilium via a three-step procedure. If you attempt to upgrade directly, you will see an error message.

In the example below, a user attempts to upgrade an AKS cluster (clusterName=azurecni) on Azure CNI directly to Azure CNI powered by Cilium and is prompted with an error indicating that the operation cannot proceed.

az aks update -n azurecni -g azurecni  --network-dataplane cilium

(BadRequest) Cilium dataplane requires either network plugin mode overlay or pod subnet
Code: BadRequest
Message: Cilium dataplane requires either network plugin mode overlay or pod subnet
Target: networkProfile.networkPluginMode
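
In broad strokes, the fix is the staged approach covered in Scenario 3: first move the cluster to overlay mode (or a pod subnet), then enable the Cilium dataplane. A hedged sketch of those two updates, reusing the cluster name from the failing command (as noted earlier, the network policy may first need to be set to none):

# Move the cluster to Azure CNI Overlay first
az aks update -n azurecni -g azurecni \
  --network-plugin azure \
  --network-plugin-mode overlay

# Then enable the Cilium dataplane and network policy engine
az aks update -n azurecni -g azurecni \
  --network-dataplane cilium \
  --network-policy cilium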

Validation- Isovalent Enterprise for Cilium

Cluster status check

Check the status of the nodes and make sure they are in a “Ready” state:

kubectl get nodes -o wide
NAME                                STATUS   ROLES   AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
aks-nodepool1-20463487-vmss000000   Ready    agent   15m   v1.26.6   10.224.0.5    <none>        Ubuntu 22.04.3 LTS   5.15.0-1041-azure   containerd://1.7.5-1
aks-nodepool1-20463487-vmss000001   Ready    agent   13m   v1.26.6   10.224.0.4    <none>        Ubuntu 22.04.3 LTS   5.15.0-1041-azure   containerd://1.7.5-1
aks-nodepool1-20463487-vmss000002   Ready    agent   11m   v1.26.6   10.224.0.6    <none>        Ubuntu 22.04.3 LTS   5.15.0-1041-azure   containerd://1.7.5-1

Validate Cilium version

Check the version of Cilium with cilium version:

kubectl -n kube-system exec ds/cilium -- cilium version
Defaulted container "cilium-agent" out of: cilium-agent, install-cni-binaries (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), systemd-networkd-overrides (init), block-wireserver (init)
Client: 1.12.10 628b5209ef 2023-05-16T12:50:44-04:00 go version go1.18.10 linux/amd64
Daemon: 1.12.10 628b5209ef 2023-05-16T12:50:44-04:00 go version go1.18.10 linux/amd64

Cilium Health Check

cilium-health is a tool available in Cilium that provides visibility into the overall health of the cluster’s networking connectivity. You can check node-to-node health with cilium-health status:

kubectl -n kube-system exec ds/cilium -- cilium-health status
Defaulted container "cilium-agent" out of: cilium-agent, install-cni-binaries (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), systemd-networkd-overrides (init), block-wireserver (init)
Probe time:   2023-09-28T14:06:14Z
Nodes:
  aks-nodepool1-20463487-vmss000002 (localhost):
    Host connectivity to 10.224.0.6:
      ICMP to stack:   OK, RTT=180.007µs
      HTTP to agent:   OK, RTT=207.908µs
  aks-nodepool1-20463487-vmss000000:
    Host connectivity to 10.224.0.5:
      ICMP to stack:   OK, RTT=877.833µs
      HTTP to agent:   OK, RTT=452.617µs
  aks-nodepool1-20463487-vmss000001:
    Host connectivity to 10.224.0.4:
      ICMP to stack:   OK, RTT=812.63µs
      HTTP to agent:   OK, RTT=450.417µs

Cilium Connectivity Test

The Cilium connectivity test deploys a series of services, deployments, and CiliumNetworkPolicies, and then exercises various connectivity paths between them. The connectivity paths include those with and without service load-balancing and with various network policy combinations.

The cilium connectivity test was run for all of the above scenarios, and all tests passed. A truncated output from one such test run:

Output Truncated:

cilium connectivity test

ℹ️  Monitor aggregation detected, will skip some flow validation steps
[overlaytocilium] Creating namespace cilium-test for connectivity check...
[overlaytocilium] Deploying echo-same-node service...
[overlaytocilium] Deploying DNS test server configmap...
[overlaytocilium] Deploying same-node deployment...
[overlaytocilium] Deploying client deployment...
[overlaytocilium] Deploying client2 deployment...
[overlaytocilium] Deploying echo-other-node service...
[overlaytocilium] Deploying other-node deployment...
[host-netns] Deploying azurecni daemonset...
[host-netns-non-cilium] Deploying overlaytocilium daemonset...
[overlaytocilium] Deploying echo-external-node deployment...
[overlaytocilium] Waiting for deployments [client client2 echo-same-node] to become ready...
[overlaytocilium] Waiting for deployments [echo-other-node] to become ready...
[overlaytocilium] Waiting for CiliumEndpoint for pod cilium-test/client-6f6788d7cc-6nxql to appear...
[overlaytocilium] Waiting for CiliumEndpoint for pod cilium-test/client2-bc59f56d5-j64dj to appear...
✅ All 34 tests (218 actions) successful, 20 tests skipped, 0 scenarios skipped.

Azure Monitor (Optional)

When critical applications and business processes rely on Azure resources, you want to monitor those resources for availability, performance, and operation. You can monitor the data generated by AKS and analyze it with Azure Monitor by following the portal steps below (a CLI alternative is sketched after the list).

  • Log in to the Azure portal.
  • Select the resource group in which the AKS cluster was created.
  • Select Monitoring > Insights > Configure Monitoring.
  • Select Enable Container Logs.
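
If you prefer the CLI over the portal, the monitoring add-on can be enabled with a single command (a sketch, reusing the $clusterName and $resourceGroup variables from earlier):

# Enable the Azure Monitor (Container Insights) add-on on the cluster.
# A default Log Analytics workspace is created if one is not specified with --workspace-resource-id.
az aks enable-addons \
  --addons monitoring \
  --name $clusterName \
  --resource-group $resourceGroup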

Azure Monitor in action

During the upgrade, Azure Monitor surfaces events such as:
  • Creation of an azure-cns-transition pod
  • Deletion of a node during the upgrade process
  • Removal of a surge node
  • Reimage of node during the upgrade process

Events

When you upgrade your cluster, the following Kubernetes events may occur on each node:

  • Surge: Creates a surge node.
  • Drain: Evicts pods from the node. Each pod has a 30-second timeout to complete the eviction.
  • Update: An update of a node succeeds or fails.
  • Delete: Deletes a surge node.

Use kubectl get events to show events in the default namespace while running an upgrade (the -w flag watches for new events as they occur).

kubectl get events -w 

Activity Logs

The Azure Monitor activity log is a platform log in Azure that provides insight into subscription-level events, such as when a resource is modified or a virtual machine is started. A CLI sketch for pulling recent activity-log entries follows the list below.

  • An upgrade alert (Kubernetes Services > Activity Log) indicates a cluster upgrade from Azure CNI to Azure CNI powered by Cilium.
  • You can also look at the change analysis (Kubernetes Services > Activity Log > Changed Properties) to dive deeper into the change set.
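
A hedged sketch for listing recent activity-log entries for the cluster's resource group from the CLI (reusing the $resourceGroup variable from earlier; adjust the --offset window as needed):

# Pull the last hour of activity-log entries and show operation, status, and timestamp
az monitor activity-log list \
  --resource-group $resourceGroup \
  --offset 1h \
  --query "[].{operation:operationName.localizedValue, status:status.value, time:eventTimestamp}" \
  --output table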

Conclusion

Hopefully, this tutorial gave you a good overview of how to upgrade your existing AKS clusters in Azure to Azure CNI powered by Cilium. If you have any feedback on the solution, please share it with us. You’ll find us on the Cilium Slack channel.

Further Reading

  • Tutorial: Azure CNI Powered by Cilium (Nico Vibert) – learn how to use Azure CNI Powered by Cilium and explore the various AKS networking options.
  • Cilium in Azure Kubernetes Service (AKS) (Amit Gupta) – deploy Isovalent Enterprise for Cilium on a new AKS cluster from the Azure Marketplace, or upgrade an existing cluster running Azure CNI powered by Cilium to Isovalent Enterprise for Cilium.
  • AKS Bring Your Own CNI (BYOCNI) and Cilium (Nico Vibert) – a short video on deploying an AKS cluster without a CNI to ease the installation of Cilium.
