
Cilium on a Private AKS cluster

Amit Gupta

Security is at the forefront of most cloud architects’ minds. Engineers in businesses with stringent security requirements often turn to isolated and private environments. Due to the nature of the cloud, where resources are commonly shared, the options to cordon off your environment are limited. For Azure Kubernetes Service users, however, there is a robust option: private AKS clusters. In a private cluster, the managed API server has an internal IP address, and your Kubernetes nodes talk to it over your VNet. Private AKS cluster operators will want to benefit from Isovalent Enterprise for Cilium’s zero-trust tooling to optimize their security posture. This tutorial will guide you through creating a private AKS cluster with Azure CNI Powered by Cilium and upgrading it to Isovalent Enterprise for Cilium.

What is Isovalent Enterprise for Cilium?

Isovalent Cilium Enterprise is an enterprise-grade, hardened distribution of open-source projects Cilium, Hubble, and Tetragon, built and supported by the Cilium creators. Cilium enhances networking and security at the network layer, while Hubble ensures thorough network observability and tracing. Tetragon ties it all together with runtime enforcement and security observability, offering a well-rounded solution for connectivity, compliance, multi-cloud, and security concerns.

Why Isovalent Enterprise for Cilium?

For enterprise customers who require support and the use of advanced networking, security, and observability features, “Isovalent Enterprise for Cilium” is recommended, with the following benefits:

  • Advanced network policy: Isovalent Cilium Enterprise provides advanced network policy capabilities, including DNS-aware policy, L7 policy, and deny policy, enabling fine-grained control over network traffic for micro-segmentation and improved security.
  • Hubble flow observability + User Interface: Isovalent Cilium Enterprise Hubble observability feature provides real-time network traffic flow, policy visualization, and a powerful User Interface for easy troubleshooting and network management.
  • Multi-cluster connectivity via Cluster Mesh: Isovalent Cilium Enterprise provides seamless networking and security across multiple clouds, including public cloud providers like AWS, Azure, and Google Cloud Platform, as well as on-premises environments.
  • Advanced Security Capabilities via Tetragon: Tetragon provides advanced security capabilities such as protocol enforcement, IP and port whitelisting, and automatic application-aware policy generation to protect against the most sophisticated threats. Built on eBPF, Tetragon can easily scale to meet the needs of the most demanding cloud-native environments.
  • Service Mesh: Isovalent Cilium Enterprise provides seamless service-to-service communication that’s sidecar-free and advanced load balancing, making it easy to deploy and manage complex microservices architectures.
  • Enterprise-grade support: Isovalent Cilium Enterprise includes enterprise-grade support from Isovalent’s experienced team of experts, ensuring that any issues are resolved promptly and efficiently. Additionally, professional services help organizations deploy and manage Cilium in production environments.

Pre-Requisites

The following prerequisites need to be taken into account before you proceed with this tutorial:

  • You should have an Azure Subscription.
  • Install kubectl
  • Install Cilium CLI.
    • Users can contact their partner Sales/SE representative(s) at sales@isovalent.com for more detailed insights into the features below and to access the requisite documentation and Cilium CLI software images.
  • Azure CLI version 2.48.1 or later. Run az --version to see the currently installed version. If you need to install or upgrade, see Install Azure CLI.
  • Permission to configure Azure Firewall.
  • Familiarity with Azure Hub and Spoke Architecture.
  • VNet peering is in place across both VNets.
  • Ensure that the network and FQDN/application rules required for AKS cluster extensions are in place so the extensions work seamlessly.
  • In the case of a private AKS cluster, configure rules to access an Azure container registry behind a firewall. The endpoints could be:
    • REST endpoints.
    • Dedicated data endpoints.
      • Example: arcmktplaceprod.westeurope.data.azurecr.io
    • Registry FQDNs.

Note: this is a mandatory requirement, as the images for the cilium-* and cilium-operator-* pods are pulled from these Microsoft repositories.

  • Ensure you have enough quota resources to create an AKS cluster. Go to the Subscription blade, navigate to “Usage + Quotas,” and make sure you have enough quota for the following resources:
    • Regional vCPUs
    • Standard Dv4 Family vCPUs

What does our deployment consist of?

In this deployment example, you will create an AKS cluster in a Spoke VNet and an Azure Firewall in the Hub VNet. Using Azure Private Link, the AKS control plane, which runs in an Azure-managed virtual network, is exposed through a private endpoint with an IP address that is part of your AKS subnet. To execute all Kubernetes and Azure CLI-related commands, you will create a Jumpbox VM in the Hub VNet.

You will also set up the Azure Firewall to lock down inbound and outbound traffic to and from the Kubernetes cluster in Azure.
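The commands in this tutorial use literal resource names. If you prefer variables, you can capture the names once in your shell; this is only a convenience sketch using the names from this tutorial, so adjust it to your environment.

# Resource names and region used throughout this tutorial
LOCATION="westeurope"
KUBE_GROUP="prvaksvnet"      # resource group holding the hub/spoke network resources and the cluster
KUBE_NAME="prvaks"           # AKS cluster (and firewall/route table) name
HUB_VNET="hub1-firewalvnet"
SPOKE_VNET="spoke1-kubevnet"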

Creating the Azure Resources

Set the subscription

Choose the subscription you want to use if you have multiple Azure subscriptions.

  • Replace SubscriptionName with your subscription name.
  • You can also use your subscription ID instead of your subscription name.
az account set --subscription SubscriptionName

AKS Resource Group creation

  • Create a Resource Group for the Kubernetes Resources
 az group create -n prvaks -l westeurope
{
  "id": "/subscriptions/###########################/resourceGroups/prvaks",
  "location": "westeurope",
  "managedBy": null,
  "name": "prvaks",
  "properties": {
    "provisioningState": "Succeeded"
  },
  "tags": null,
  "type": "Microsoft.Resources/resourceGroups"
}
  • Create a Resource Group for the hub and spoke network resources
az group create -n prvaksvnet -l westeurope
{
  "id": "/subscriptions/###########################/resourceGroups/prvaksvnet",
  "location": "westeurope",
  "managedBy": null,
  "name": "prvaksvnet",
  "properties": {
    "provisioningState": "Succeeded"
  },
  "tags": null,
  "type": "Microsoft.Resources/resourceGroups"
}

AKS Network creation

  • Create a virtual network for the hub.
az network vnet create -g prvaksvnet -n hub1-firewalvnet  --address-prefixes 10.0.0.0/22
  • Create a subnet in the hub VNet.
az network vnet subnet create -g prvaksvnet --vnet-name hub1-firewalvnet -n AzureFirewallSubnet --address-prefix 10.0.0.0/24
  • Create a subnet for the Jumpbox host in the hub VNet
az network vnet subnet create -g prvaksvnet --vnet-name hub1-firewalvnet -n jumpbox-subnet --address-prefix 10.0.1.0/24
  • Create a virtual network for the spoke.
az network vnet create -g prvaksvnet -n spoke1-kubevnet --address-prefixes 10.0.4.0/22
  • Create an ingress subnet in the spoke VNet
az network vnet subnet create -g prvaksvnet --vnet-name spoke1-kubevnet -n ing-1-subnet --address-prefix 10.0.4.0/24
  • Create an AKS subnet in the spoke VNet
az network vnet subnet create -g prvaksvnet --vnet-name spoke1-kubevnet -n aks-2-subnet --address-prefix 10.0.5.0/24

VNet Peering

Peer the hub and spoke VNets.

  • Hub to Spoke
az network vnet peering create -g prvaksvnet -n HubToSpoke1 --vnet-name hub1-firewalvnet --remote-vnet spoke1-kubevnet --allow-vnet-access
  • Spoke to Hub
az network vnet peering create -g prvaksvnet -n Spoke1ToHub --vnet-name spoke1-kubevnet --remote-vnet hub1-firewalvnet --allow-vnet-access
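To confirm the peering is established on both sides, you can list the peerings and check that peeringState shows Connected (a quick verification step, assuming the names used above):

# Both peerings should report peeringState as Connected
az network vnet peering list -g prvaksvnet --vnet-name hub1-firewalvnet -o table
az network vnet peering list -g prvaksvnet --vnet-name spoke1-kubevnet -o table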

Azure Firewall

A common question Azure users have is how to lock down inbound and outbound traffic for a Kubernetes cluster in Azure. Azure Firewall is one option that can be set up relatively easily but is not documented in detail.

  • Install the Azure Firewall extension for the Azure CLI.
 az extension add --name azure-firewall
  • Configure a public IP for the Azure Firewall
az network public-ip create -g prvaksvnet -n prvaks --sku Standard
  • Create the Azure Firewall
az network firewall create --name prvaks --resource-group prvaksvnet --location westeurope

az network firewall ip-config create --firewall-name prvaks --name prvaks --public-ip-address prvaks --resource-group prvaksvnet --vnet-name hub1-firewalvnet
  • To observe outbound network traffic, you can set up Azure Monitor and configure a Log Analytics workspace. Note: this is optional.
az monitor log-analytics workspace create --resource-group prvaksvnet --workspace-name prvaks --location westeurope
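To actually route the firewall logs into that workspace, you can attach a diagnostic setting to the firewall. This is a hedged sketch: the helper variables FW_ID and WS_ID are introduced here for readability, and the log categories shown are the standard Azure Firewall application- and network-rule categories; verify the categories available in your subscription.

# Look up the firewall and workspace resource IDs
FW_ID=$(az network firewall show -g prvaksvnet -n prvaks --query id -o tsv)
WS_ID=$(az monitor log-analytics workspace show -g prvaksvnet --workspace-name prvaks --query id -o tsv)

# Send application-rule and network-rule logs to the Log Analytics workspace
az monitor diagnostic-settings create --name prvaks-fw-diag --resource $FW_ID --workspace $WS_ID --logs '[{"category":"AzureFirewallApplicationRule","enabled":true},{"category":"AzureFirewallNetworkRule","enabled":true}]'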

Create a User-defined Route

  • Create a user-defined route that will force all traffic from the AKS subnet to the internal IP of the Azure firewall.
KUBE_AGENT_SUBNET_ID="/subscriptions/##########################/resourceGroups/prvaksvnet/providers/Microsoft.Network/virtualNetworks/spoke1-kubevnet/subnets/aks-2-subnet"

az network route-table create -g prvaksvnet --name prvaks

az network route-table route create --resource-group prvaksvnet --name prvaks --route-table-name prvaks --address-prefix 0.0.0.0/0 --next-hop-type VirtualAppliance --next-hop-ip-address 10.0.0.4 --subscription ##########################

az network vnet subnet update --route-table prvaks --ids $KUBE_AGENT_SUBNET_ID
  • Check if the route table has been created successfully.
az network route-table route list --resource-group prvaksvnet --route-table-name prvaks

Create Exception Rules

Create the necessary exception rules for the AKS-required network dependencies so that the worker nodes can sync their system time, retrieve Ubuntu updates (optional), and pull the AKS kube-system container images from the Microsoft Container Registry.

Note: The exception rules below carry a wildcard “*” entry for source and destination. Review them carefully and narrow them to the specific source(s) from which traffic should be allowed and, similarly, to the specific destination(s). These firewall rules are examples; adjust them after checking with your network and system administrators.

az network firewall network-rule create --firewall-name prvaks --collection-name "time" --destination-addresses "*"  --destination-ports 123 --name "allow network" --protocols "UDP" --resource-group prvaksvnet --source-addresses "*" --action "Allow" --description "aks node time sync rule" --priority 106

az network firewall network-rule create --firewall-name prvaks --collection-name "dns" --destination-addresses "*"  --destination-ports 53 --name "allow network" --protocols "UDP" "TCP" --resource-group prvaksvnet --source-addresses "*" --action "Allow" --description "aks node dns rule" --priority 105

az network firewall network-rule create --firewall-name prvaks --collection-name "servicetags" --destination-addresses "AzureContainerRegistry" "MicrosoftContainerRegistry" "AzureActiveDirectory" "AzureMonitor" --destination-ports "*" --name "allow service tags" --protocols "Any" --resource-group prvaksvnet --source-addresses "*" --action "Allow" --description "allow service tags" --priority 104

az network firewall application-rule create --firewall-name prvaks --resource-group prvaksvnet --collection-name 'aksfwar' -n 'fqdn' --source-addresses '*' --protocols 'http=80' 'https=443' --fqdn-tags "AzureKubernetesService" --action allow --priority 103

az network firewall application-rule create  --firewall-name prvaks --collection-name "osupdates" --name "allow network" --protocols http=80 https=443 --source-addresses "*" --resource-group prvaksvnet --action "Allow" --target-fqdns "download.opensuse.org" "security.ubuntu.com" "packages.microsoft.com" "azure.archive.ubuntu.com" "changelogs.ubuntu.com" "snapcraft.io" "api.snapcraft.io" "motd.ubuntu.com"  --priority 102

az network firewall application-rule create  --firewall-name prvaks --collection-name "mktplace" --name "allow network" --protocols https=443 --source-addresses "*" --resource-group prvaksvnet --action "Allow" --target-fqdns "westeurope.dp.kubernetesconfiguration.azure.com" "mcr.microsoft.com" "*.data.mcr.microsoft.com" "arcmktplaceprod.azurecr.io" "*.ingestion.msftcloudes.com" "*.microsoftmetrics.com" "marketplaceapi.microsoft.com" "eus2azreplstore162.blob.core.windows.net" --priority 100

az network firewall application-rule create  --firewall-name prvaks --collection-name "dockerhub" --name "allow network" --protocols http=80 https=443 --source-addresses "*" --resource-group prvaksvnet --action "Allow" --target-fqdns "auth.docker.io" "registry-1.docker.io" --priority 110

You have now created the virtual networks and subnets, the peering between them, and a route that sends traffic from the AKS subnet to the internet through the Azure Firewall, which only allows the required Azure Kubernetes Service dependencies.

An AKS cluster, by default, will attach a public IP to the standard load balancer. In this case, we want to prevent this by setting the outboundType to userDefinedRouting in our deployment, configuring a private cluster with a managed identity, and assigning the correct permissions.

  • Create a managed identity for the kubelet.
az identity create --name myKubeletIdentity --resource-group prvaksvnet
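The cluster creation command below also references a user-assigned control-plane identity named prvaks (via --assign-identity), and with outboundType set to userDefinedRouting that identity needs network permissions on the spoke VNet and the route table. The following is a hedged sketch of creating it and assigning the Network Contributor role; IDENTITY_ID, VNET_ID, and RT_ID are helper variables introduced here, and you should confirm the exact scope and role required for your setup against the AKS documentation.

# Create the control-plane identity referenced by --assign-identity
az identity create --name prvaks --resource-group prvaksvnet

# Look up the identity principal and the network resources it needs access to
IDENTITY_ID=$(az identity show -g prvaksvnet -n prvaks --query principalId -o tsv)
VNET_ID=$(az network vnet show -g prvaksvnet -n spoke1-kubevnet --query id -o tsv)
RT_ID=$(az network route-table show -g prvaksvnet -n prvaks --query id -o tsv)

# Grant Network Contributor on the spoke VNet and the route table
az role assignment create --assignee $IDENTITY_ID --scope $VNET_ID --role "Network Contributor"
az role assignment create --assignee $IDENTITY_ID --scope $RT_ID --role "Network Contributor"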
  • Create an AKS cluster with Azure CNI powered by Cilium in overlay mode.
KUBE_NAME="prvaks"
KUBE_GROUP="prvaksvnet"

az aks create --resource-group $KUBE_GROUP --name $KUBE_NAME --load-balancer-sku standard --vm-set-type VirtualMachineScaleSets --enable-private-cluster --network-plugin azure --vnet-subnet-id /subscriptions/##########################/resourceGroups/prvaksvnet/providers/Microsoft.Network/virtualNetworks/spoke1-kubevnet/subnets/aks-2-subnet --network-plugin-mode overlay --pod-cidr 192.168.0.0/16 --network-dataplane cilium --docker-bridge-address 172.17.0.1/16 --dns-service-ip 10.2.0.10 --service-cidr 10.2.0.0/24 --enable-managed-identity --assign-identity /subscriptions/##########################/resourcegroups/prvaksvnet/providers/Microsoft.ManagedIdentity/userAssignedIdentities/prvaks --assign-kubelet-identity /subscriptions/##########################/resourcegroups/prvaksvnet/providers/Microsoft.ManagedIdentity/userAssignedIdentities/myKubeletIdentity --outbound-type userDefinedRouting --disable-public-fqdn

Output truncated:

  "aiToolchainOperatorProfile": null,
  "apiServerAccessProfile": {
    "authorizedIpRanges": null,
    "disableRunCommand": null,
    "enablePrivateCluster": true,
    "enablePrivateClusterPublicFqdn": false,
    "enableVnetIntegration": null,
    "privateDnsZone": "system",
    "subnetId": null

"loadBalancerSku": "Standard",
    "monitoring": null,
    "natGatewayProfile": null,
    "networkDataplane": "cilium",
    "networkMode": null,
    "networkPlugin": "azure",
    "networkPluginMode": "overlay",
    "networkPolicy": "cilium",
    "outboundType": "userDefinedRouting",
    "podCidr": "192.168.0.0/16",
    "podCidrs": [
      "192.168.0.0/16"
    ],
    "serviceCidr": "10.2.0.0/24",
    "serviceCidrs": [
      "10.2.0.0/24"
    ]
  },

Once the AKS cluster is created, create a link from the cluster’s private DNS zone to the hub VNet so that your jumpbox can resolve the DNS name of the private link-enabled API server via Azure DNS and reach it with kubectl.

NODE_GROUP=$(az aks show --resource-group prvaksvnet --name prvaks --query nodeResourceGroup -o tsv)

DNS_ZONE_NAME=$(az network private-dns zone list --resource-group $NODE_GROUP --query "[0].name" -o tsv)

HUB_VNET_ID=$(az network vnet show -g prvaksvnet -n hub1-firewalvnet --query id -o tsv)

az network private-dns link vnet create --name "hubnetdnsconfig" --registration-enabled false --resource-group $NODE_GROUP --virtual-network $HUB_VNET_ID --zone-name $DNS_ZONE_NAME
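You can verify that the link was created against the cluster’s private DNS zone with a quick check, reusing the variables defined above:

az network private-dns link vnet list --resource-group $NODE_GROUP --zone-name $DNS_ZONE_NAME -o table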

Create a Jumpbox VM

  • Deploy an Ubuntu Jumpbox VM into the jumpbox-subnet so that you can issue all Azure CLI and kubectl commands for the private cluster; a sketch of the VM creation command is shown below. Log in to the Jumpbox VM and install the Azure CLI and kubectl. Make sure you set the correct subscription.
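A minimal sketch of the jumpbox deployment, assuming the resource names used in this tutorial; the VM name (prvaks-jumpbox), size, image alias, and admin username are placeholders, and you should secure access to the VM (for example with Azure Bastion or restricted SSH access) according to your own requirements.

# Hypothetical jumpbox VM in the hub VNet's jumpbox-subnet
az vm create --resource-group prvaksvnet --name prvaks-jumpbox --image Ubuntu2204 --vnet-name hub1-firewalvnet --subnet jumpbox-subnet --size Standard_B2s --admin-username azureuser --generate-ssh-keys

Once the VM is reachable and the Azure CLI and kubectl are installed on it, log in and select the subscription from the jumpbox: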
az login --use-device-code


az account set --subscription "##########################"
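Before running kubectl from the jumpbox, merge the cluster credentials into your kubeconfig (this assumes your account has sufficient permissions on the cluster’s resource group):

az aks get-credentials --resource-group prvaksvnet --name prvaks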
  • Issue kubectl commands to see the status of all the pods and nodes in your private AKS cluster running Azure CNI powered by Cilium in overlay mode.
kubectl get pods -A -o wide
NAMESPACE     NAME                                  READY   STATUS    RESTARTS   AGE   IP              NODE                                NOMINATED NODE   READINESS GATES
kube-system   azure-cns-2tww9                       1/1     Running   0          13h   10.0.5.7        aks-nodepool1-36216811-vmss000000   <none>           <none>
kube-system   azure-cns-pqgwl                       1/1     Running   0          13h   10.0.5.6        aks-nodepool1-36216811-vmss000001   <none>           <none>
kube-system   azure-cns-vj92m                       1/1     Running   0          13h   10.0.5.5        aks-nodepool1-36216811-vmss000002   <none>           <none>
kube-system   azure-ip-masq-agent-9q7cv             1/1     Running   0          13h   10.0.5.7        aks-nodepool1-36216811-vmss000000   <none>           <none>
kube-system   azure-ip-masq-agent-gxnnm             1/1     Running   0          13h   10.0.5.6        aks-nodepool1-36216811-vmss000001   <none>           <none>
kube-system   azure-ip-masq-agent-k6b4j             1/1     Running   0          13h   10.0.5.5        aks-nodepool1-36216811-vmss000002   <none>           <none>
kube-system   cilium-5cdgl                          1/1     Running   0          13h   10.0.5.7        aks-nodepool1-36216811-vmss000000   <none>           <none>
kube-system   cilium-7dzvg                          1/1     Running   0          13h   10.0.5.5        aks-nodepool1-36216811-vmss000002   <none>           <none>
kube-system   cilium-mkvk5                          1/1     Running   0          13h   10.0.5.6        aks-nodepool1-36216811-vmss000001   <none>           <none>
kube-system   cilium-operator-8cff7865b-jjzvx       1/1     Running   0          13h   10.0.5.5        aks-nodepool1-36216811-vmss000002   <none>           <none>
kube-system   cilium-operator-8cff7865b-qf556       1/1     Running   0          13h   10.0.5.6        aks-nodepool1-36216811-vmss000001   <none>           <none>
kube-system   cloud-node-manager-9kqql              1/1     Running   0          13h   10.0.5.5        aks-nodepool1-36216811-vmss000002   <none>           <none>
kube-system   cloud-node-manager-fmb5h              1/1     Running   0          13h   10.0.5.7        aks-nodepool1-36216811-vmss000000   <none>           <none>
kube-system   cloud-node-manager-s7gbd              1/1     Running   0          13h   10.0.5.6        aks-nodepool1-36216811-vmss000001   <none>           <none>
kube-system   coredns-76b9877f49-dcbdg              1/1     Running   0          13h   192.168.2.121   aks-nodepool1-36216811-vmss000001   <none>           <none>
kube-system   coredns-76b9877f49-sfn8b              1/1     Running   0          13h   192.168.0.77    aks-nodepool1-36216811-vmss000000   <none>           <none>
kube-system   coredns-autoscaler-85f7d6b75d-2hcxp   1/1     Running   0          13h   192.168.0.222   aks-nodepool1-36216811-vmss000000   <none>           <none>
kube-system   csi-azuredisk-node-4rd6c              3/3     Running   0          13h   10.0.5.7        aks-nodepool1-36216811-vmss000000   <none>           <none>
kube-system   csi-azuredisk-node-pqg7g              3/3     Running   0          13h   10.0.5.5        aks-nodepool1-36216811-vmss000002   <none>           <none>
kube-system   csi-azuredisk-node-z887z              3/3     Running   0          13h   10.0.5.6        aks-nodepool1-36216811-vmss000001   <none>           <none>
kube-system   csi-azurefile-node-665gg              3/3     Running   0          13h   10.0.5.5        aks-nodepool1-36216811-vmss000002   <none>           <none>
kube-system   csi-azurefile-node-fqkqc              3/3     Running   0          13h   10.0.5.7        aks-nodepool1-36216811-vmss000000   <none>           <none>
kube-system   csi-azurefile-node-h67hc              3/3     Running   0          13h   10.0.5.6        aks-nodepool1-36216811-vmss000001   <none>           <none>
kube-system   extension-agent-5d85679847-tdc5p      2/2     Running   0          13h   192.168.0.164   aks-nodepool1-36216811-vmss000000   <none>           <none>
kube-system   extension-operator-74d6488fd7-n6j7b   2/2     Running   0          13h   192.168.0.119   aks-nodepool1-36216811-vmss000000   <none>           <none>
kube-system   konnectivity-agent-846597cfbc-ggx94   1/1     Running   0          13h   192.168.1.193   aks-nodepool1-36216811-vmss000002   <none>           <none>
kube-system   konnectivity-agent-846597cfbc-kkxlq   1/1     Running   0          13h   192.168.2.14    aks-nodepool1-36216811-vmss000001   <none>           <none>
kube-system   metrics-server-5654598dc8-7bv4w       2/2     Running   0          13h   192.168.1.109   aks-nodepool1-36216811-vmss000002   <none>           <none>
kube-system   metrics-server-5654598dc8-psq45       2/2     Running   0          13h   192.168.2.126   aks-nodepool1-36216811-vmss000001   <none>           <none>

Upgrade to Isovalent Enterprise for Cilium

  • Upgrade your cluster to Isovalent Enterprise for Cilium. For this tutorial, you will upgrade the cluster using the Azure CLI.

Note: Refer to this section to obtain the values for plan-name, plan-product, and plan-publisher.

az k8s-extension create --name cilium --extension-type Isovalent.CiliumEnterprise.One --scope cluster --cluster-name prvaks  --resource-group prvaksvnet --cluster-type managedClusters --plan-name isovalent-cilium-enterprise-base-edition --plan-product isovalent-cilium-enterprise --plan-publisher isovalentinc1234567890123
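You can track the extension rollout and confirm it reaches the Succeeded provisioning state (a quick check, using the same names as above):

az k8s-extension show --name cilium --cluster-name prvaks --resource-group prvaksvnet --cluster-type managedClusters --query provisioningState -o tsv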
  • Issue kubectl commands to see the status of all the pods and nodes in your private AKS cluster running Isovalent Enterprise for Cilium in overlay mode.
kubectl get pods -A -o wide

NAMESPACE                       NAME                                  READY   STATUS    RESTARTS        AGE    IP              NODE                                NOMINATED NODE   READINESS GATES
azure-extensions-usage-system   billing-operator-84cd55c557-b7hqq     5/5     Running   0               5d3h   192.168.0.201   aks-nodepool1-39806308-vmss000002   <none>           <none>
kube-system                     azure-cns-5w2zk                       1/1     Running   0               9d     10.0.5.7        aks-nodepool1-39806308-vmss000001   <none>           <none>
kube-system                     azure-cns-82bgw                       1/1     Running   0               9d     10.0.5.6        aks-nodepool1-39806308-vmss000000   <none>           <none>
kube-system                     azure-cns-vlc9g                       1/1     Running   0               9d     10.0.5.5        aks-nodepool1-39806308-vmss000002   <none>           <none>
kube-system                     azure-ip-masq-agent-bmwzb             1/1     Running   0               9d     10.0.5.5        aks-nodepool1-39806308-vmss000002   <none>           <none>
kube-system                     azure-ip-masq-agent-sr4hk             1/1     Running   0               9d     10.0.5.6        aks-nodepool1-39806308-vmss000000   <none>           <none>
kube-system                     azure-ip-masq-agent-x89f6             1/1     Running   0               9d     10.0.5.7        aks-nodepool1-39806308-vmss000001   <none>           <none>
kube-system                     cilium-kd5tj                          1/1     Running   0               5d3h   10.0.5.6        aks-nodepool1-39806308-vmss000000   <none>           <none>
kube-system                     cilium-mjjbk                          1/1     Running   0               5d3h   10.0.5.5        aks-nodepool1-39806308-vmss000002   <none>           <none>
kube-system                     cilium-operator-d78f778f7-7zvs9       1/1     Running   0               5d3h   10.0.5.7        aks-nodepool1-39806308-vmss000001   <none>           <none>
kube-system                     cilium-operator-d78f778f7-ghm5g       1/1     Running   1 (4d13h ago)   5d3h   10.0.5.6        aks-nodepool1-39806308-vmss000000   <none>           <none>
kube-system                     cilium-vftrz                          1/1     Running   0               5d3h   10.0.5.7        aks-nodepool1-39806308-vmss000001   <none>           <none>
kube-system                     cloud-node-manager-8b74n              1/1     Running   0               9d     10.0.5.5        aks-nodepool1-39806308-vmss000002   <none>           <none>
kube-system                     cloud-node-manager-d55lx              1/1     Running   0               9d     10.0.5.7        aks-nodepool1-39806308-vmss000001   <none>           <none>
kube-system                     cloud-node-manager-ngmf4              1/1     Running   0               9d     10.0.5.6        aks-nodepool1-39806308-vmss000000   <none>           <none>
kube-system                     coredns-789789675-qc9kn               1/1     Running   0               9d     192.168.2.127   aks-nodepool1-39806308-vmss000000   <none>           <none>
kube-system                     coredns-789789675-w4v84               1/1     Running   0               9d     192.168.1.236   aks-nodepool1-39806308-vmss000001   <none>           <none>
kube-system                     coredns-autoscaler-649b947bbd-hxqwr   1/1     Running   0               9d     192.168.2.130   aks-nodepool1-39806308-vmss000000   <none>           <none>
kube-system                     csi-azuredisk-node-j4jkg              3/3     Running   0               9d     10.0.5.5        aks-nodepool1-39806308-vmss000002   <none>           <none>
kube-system                     csi-azuredisk-node-l42qt              3/3     Running   0               9d     10.0.5.7        aks-nodepool1-39806308-vmss000001   <none>           <none>
kube-system                     csi-azuredisk-node-vjp2q              3/3     Running   0               9d     10.0.5.6        aks-nodepool1-39806308-vmss000000   <none>           <none>
kube-system                     csi-azurefile-node-m565p              3/3     Running   0               9d     10.0.5.6        aks-nodepool1-39806308-vmss000000   <none>           <none>
kube-system                     csi-azurefile-node-xr75n              3/3     Running   0               9d     10.0.5.7        aks-nodepool1-39806308-vmss000001   <none>           <none>
kube-system                     csi-azurefile-node-xzwhs              3/3     Running   0               9d     10.0.5.5        aks-nodepool1-39806308-vmss000002   <none>           <none>
kube-system                     extension-agent-55d4f4795f-pktc6      2/2     Running   0               5d3h   192.168.0.44    aks-nodepool1-39806308-vmss000002   <none>           <none>
kube-system                     extension-operator-56c8d5f96c-9sgqx   2/2     Running   0               5d3h   192.168.0.150   aks-nodepool1-39806308-vmss000002   <none>           <none>
kube-system                     hubble-relay-76ff659b59-vkrsf         1/1     Running   0               5d3h   192.168.1.232   aks-nodepool1-39806308-vmss000001   <none>           <none>
kube-system                     konnectivity-agent-54c85967cb-7tbxr   1/1     Running   0               9d     192.168.2.169   aks-nodepool1-39806308-vmss000000   <none>           <none>
kube-system                     konnectivity-agent-54c85967cb-mgfdg   1/1     Running   0               9d     192.168.0.127   aks-nodepool1-39806308-vmss000002   <none>           <none>
kube-system                     metrics-server-5467676b76-7xmh8       2/2     Running   0               9d     192.168.0.179   aks-nodepool1-39806308-vmss000002   <none>           <none>
kube-system                     metrics-server-5467676b76-kg9l6       2/2     Running   0               9d     192.168.2.47    aks-nodepool1-39806308-vmss000000   <none>           <none>

Validation: Isovalent Enterprise for Cilium

Cluster status check

Check the status of the nodes and make sure they are in a “Ready” state

kubectl get nodes -o wide

NAME                                STATUS   ROLES   AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
aks-nodepool1-39806308-vmss000000   Ready    agent   9d    v1.27.7   10.0.5.6      <none>        Ubuntu 22.04.3 LTS   5.15.0-1053-azure   containerd://1.7.5-1
aks-nodepool1-39806308-vmss000001   Ready    agent   9d    v1.27.7   10.0.5.7      <none>        Ubuntu 22.04.3 LTS   5.15.0-1053-azure   containerd://1.7.5-1
aks-nodepool1-39806308-vmss000002   Ready    agent   9d    v1.27.7   10.0.5.5      <none>        Ubuntu 22.04.3 LTS   5.15.0-1053-azure   containerd://1.7.5-1

Validate Cilium version

Check the version of cilium with cilium version:

kubectl -n kube-system exec ds/cilium -- cilium version

Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init), block-wireserver (init)

Client: 1.12.17-cee.1 9c32cdb5 2023-12-13T09:11:33+00:00 go version go1.20.12 linux/amd64
Daemon: 1.12.17-cee.1 9c32cdb5 2023-12-13T09:11:33+00:00 go version go1.20.12 linux/amd64
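If you installed the Cilium CLI (see the prerequisites), you can also get an overall health summary of the agents, operator, and managed components; the exact output will vary by cluster:

cilium status --wait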

Cilium Health Check

cilium-health is a tool available in Cilium that provides visibility into the overall health of the cluster’s networking and connectivity. You can check node-to-node health with cilium-health status:

kubectl -n kube-system exec ds/cilium -- cilium-health status

Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init), block-wireserver (init)
Probe time:   2024-02-20T10:25:29Z
Nodes:
  aks-nodepool1-39806308-vmss000001 (localhost):
    Host connectivity to 10.0.5.7:
      ICMP to stack:   OK, RTT=148.4µs
      HTTP to agent:   OK, RTT=380.901µs
  aks-nodepool1-39806308-vmss000000:
    Host connectivity to 10.0.5.6:
      ICMP to stack:   OK, RTT=1.147301ms
      HTTP to agent:   OK, RTT=1.855003ms
  aks-nodepool1-39806308-vmss000002:
    Host connectivity to 10.0.5.5:
      ICMP to stack:   OK, RTT=1.000901ms
      HTTP to agent:   OK, RTT=1.710202ms

Cilium Connectivity Test (Optional)

The Cilium connectivity test deploys a series of services, deployments, and CiliumNetworkPolicies, and then exercises various connectivity paths, with and without service load-balancing and with various network policy combinations.

Note-

  • This test is optional since it requires access to outbound IPs like 1.1.1.1 and 1.0.0.1 on ports 80 and 443. If administrators want to run the test, they must add rules in the Azure Firewall to enable outbound traffic toward these IPs.
  • You can also point these tests at an internal IP that has internet access (the firewall’s internal IP in this case) by overriding the external targets of cilium connectivity test with the options below:
cilium connectivity test
  --external-ip string          IP to use as external target in connectivity tests (default "1.1.1.1")
  --external-other-ip string    Other IP to use as external target in connectivity tests (default "1.0.0.1")
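For example, to point both external targets at the firewall’s internal IP used earlier in this tutorial (10.0.0.4), a hedged invocation would be the following; it assumes that IP is reachable from the cluster and forwards traffic to the internet:

cilium connectivity test --external-ip 10.0.0.4 --external-other-ip 10.0.0.4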

When you run the cilium connectivity test command, Cilium creates deployments and pods. The pods’ images are downloaded from specific registries that must be allowed in the Azure Firewall; failing to add them leaves the pods in the ErrImagePull/ImagePullBackOff state.

  • Before adding the rule in Azure Firewall, the pods are stuck pulling their images:
kubectl get pods -A -o wide

NAMESPACE                       NAME                                  READY   STATUS             RESTARTS        AGE    IP              NODE                                NOMINATED NODE   READINESS GATES
cilium-test                     client-56f8968958-tgwn9               0/1     ErrImagePull       0               12s    192.168.0.202   aks-nodepool1-39806308-vmss000002   <none>           <none>
cilium-test                     client2-5668f9f59b-n54cz              0/1     ErrImagePull       0               12s    192.168.0.164   aks-nodepool1-39806308-vmss000002   <none>           <none>
cilium-test                     client3-7557dd665c-rhww7              0/1     ErrImagePull       0               12s    192.168.2.160   aks-nodepool1-39806308-vmss000000   <none>           <none>
cilium-test                     echo-other-node-bd6cd689f-8lq85       1/2     ImagePullBackOff   0               12s    192.168.1.25    aks-nodepool1-39806308-vmss000001   <none>           <none>
cilium-test                     echo-same-node-57bb597f97-lkb9k       1/2     ImagePullBackOff   0               12s    192.168.0.177   aks-nodepool1-39806308-vmss000002   <none>           <none>
cilium-test                     host-netns-29pds                      0/1     ErrImagePull       0               12s    10.0.5.6        aks-nodepool1-39806308-vmss000000   <none>           <none>
cilium-test                     host-netns-dlzsd                      0/1     ErrImagePull       0               12s    10.0.5.5        aks-nodepool1-39806308-vmss000002   <none>           <none>
cilium-test                     host-netns-w2tmd                      0/1     ErrImagePull       0               12s    10.0.5.7        aks-nodepool1-39806308-vmss000001   <none>           <none>
  • Describe the pod(s) to see the URL(s) that are being blocked
Pod1- https://quay.io
Pod2- https://cdn03.quay.io
Pod3- https://cloudflare.docker.com
  • Create an exception rule in Azure Firewall to allow the registry FQDNs
az network firewall application-rule create  --firewall-name prvaks --collection-name "ciliumconntests" --name "allow network" --protocols http=80 https=443 --source-addresses "*" --resource-group prvaksvnet --action "Allow" --target-fqdns "quay.io" "cdn03.quay.io" "cloudflare.docker.com" --priority 120
  • Once the exception rule is added, we can see that the pods are up and running.
kubectl get pods -A -o wide

NAMESPACE                       NAME                                  READY   STATUS    RESTARTS        AGE    IP              NODE                                NOMINATED NODE   READINESS GATES
cilium-test                     client-56f8968958-8dk7c               1/1     Running   0               28m    192.168.0.124   aks-nodepool1-39806308-vmss000002   <none>           <none>
cilium-test                     client2-5668f9f59b-p4l7n              1/1     Running   0               28m    192.168.0.206   aks-nodepool1-39806308-vmss000002   <none>           <none>
cilium-test                     client3-7557dd665c-98thz              1/1     Running   0               28m    192.168.2.39    aks-nodepool1-39806308-vmss000000   <none>           <none>
cilium-test                     echo-other-node-bd6cd689f-b5fk6       2/2     Running   0               28m    192.168.1.48    aks-nodepool1-39806308-vmss000001   <none>           <none>
cilium-test                     echo-same-node-57bb597f97-kkh27       2/2     Running   0               28m    192.168.0.8     aks-nodepool1-39806308-vmss000002   <none>           <none>
cilium-test                     host-netns-jmnbp                      1/1     Running   0               28m    10.0.5.5        aks-nodepool1-39806308-vmss000002   <none>           <none>
cilium-test                     host-netns-qfjdl                      1/1     Running   0               28m    10.0.5.7        aks-nodepool1-39806308-vmss000001   <none>           <none>
cilium-test                     host-netns-z92tq                      1/1     Running   0               28m    10.0.5.6        aks-nodepool1-39806308-vmss000000   <none>           <none>
  • You will also observe that outbound traffic to 1.0.0.1 and 1.1.1.1 on ports 80 and 443 is blocked. You can create a network rule in Azure Firewall for the DNS-based tests to pass.
az network firewall network-rule create --firewall-name prvaks --collection-name "ciliumtraffic" --destination-addresses "1.0.0.1" "1.1.1.1" --destination-ports 80 443 --name "allow cilium" --protocols "UDP" "TCP" "ICMP" --resource-group prvaksvnet --source-addresses "*" --action "Allow" --description "cilium network rules" --priority 121

The cilium connectivity test was run for all of the above scenarios, and the tests passed successfully. A truncated output from one such test run is included below.

Output Truncated:

cilium connectivity test

✅ All 45 tests (472 actions) successful, 19 tests skipped, 0 scenarios skipped.

Conclusion

Hopefully, this post gave you a good overview of deploying a private AKS cluster running Isovalent Enterprise for Cilium in a hub and spoke deployment model with Azure Firewall. If you have any feedback on the solution, please share it with us. You’ll find us on the Cilium Slack channel.

Further Reading

  • Cilium in Azure Kubernetes Service (AKS): Learn how to deploy Isovalent Enterprise for Cilium on a new AKS cluster from the Azure Marketplace, and how to upgrade an existing AKS cluster running Azure CNI powered by Cilium to Isovalent Enterprise for Cilium.

  • Enabling Enterprise features for Cilium in Azure Kubernetes Service (AKS): Learn how to enable Enterprise features (Layer 3, 4 & 7 policies, DNS-based policies, and observing network flows with the Hubble CLI) in an AKS cluster running Isovalent Enterprise for Cilium.
