Security is at the forefront of most cloud architects’ minds. In businesses with stringent security requirements, engineers often turn to isolated and private environments. Due to the nature of the cloud, where resources are commonly shared, options to cordon off your environments are limited. For Azure Kubernetes Service users, however, there is a robust option: private AKS clusters. In a private cluster, the managed API server is assigned only internal IP addresses, and your Kubernetes nodes talk to it over your VNet. Private AKS cluster operators will want to benefit from Isovalent Enterprise for Cilium’s zero-trust tooling to optimize their security posture. This tutorial will guide you through creating a private AKS cluster with Azure CNI Powered by Cilium and upgrading it to Isovalent Enterprise for Cilium.
What is Isovalent Enterprise for Cilium?
Isovalent Cilium Enterprise is an enterprise-grade, hardened distribution of open-source projects Cilium, Hubble, and Tetragon, built and supported by the Cilium creators. Cilium enhances networking and security at the network layer, while Hubble ensures thorough network observability and tracing. Tetragon ties it all together with runtime enforcement and security observability, offering a well-rounded solution for connectivity, compliance, multi-cloud, and security concerns.
Why Isovalent Enterprise for Cilium?
For enterprise customers requiring support and usage of Advanced Networking, Security, and Observability features, “Isovalent Enterprise for Cilium” is recommended with the following benefits:
Advanced network policy: Isovalent Cilium Enterprise provides advanced network policy capabilities, including DNS-aware policy, L7 policy, and deny policy, enabling fine-grained control over network traffic for micro-segmentation and improved security.
Hubble flow observability + User Interface: Isovalent Cilium Enterprise’s Hubble observability feature provides real-time visibility into network traffic flows, policy visualization, and a powerful user interface for easy troubleshooting and network management.
Multi-cluster connectivity via Cluster Mesh: Isovalent Cilium Enterprise provides seamless networking and security across multiple clouds, including public cloud providers like AWS, Azure, and Google Cloud Platform, as well as on-premises environments.
Advanced Security Capabilities via Tetragon: Tetragon provides advanced security capabilities such as protocol enforcement, IP and port whitelisting, and automatic application-aware policy generation to protect against the most sophisticated threats. Built on eBPF, Tetragon can easily scale to meet the needs of the most demanding cloud-native environments.
Service Mesh: Isovalent Cilium Enterprise provides seamless, sidecar-free service-to-service communication and advanced load balancing, making it easy to deploy and manage complex microservices architectures.
Enterprise-grade support: Isovalent Cilium Enterprise includes enterprise-grade support from Isovalent’s experienced team of experts, ensuring that any issues are resolved promptly and efficiently. Additionally, professional services help organizations deploy and manage Cilium in production environments.
Prerequisites
The following prerequisites need to be taken into account before you proceed with this tutorial:
Users can contact their partner Sales/SE representative(s) at sales@isovalent.com for more detailed insights into the features below and to access the requisite documentation and Cilium CLI software images.
Azure CLI version 2.48.1 or later. Run az --version to see the currently installed version. If you need to install or upgrade, see Install Azure CLI.
Permission to configure Azure Firewall.
Familiarity with Azure Hub and Spoke Architecture.
Note: this is a mandatory requirement, as the images for the cilium-* and cilium-operator-* pods are pulled from Microsoft Container Registry repositories.
Ensure you have enough quota resources to create an AKS cluster. Go to the Subscription blade, navigate to “Usage + Quotas,” and make sure you have enough quota for the following resources:
- Regional vCPUs
- Standard Dv4 Family vCPUs
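If you prefer the CLI, you can check regional vCPU usage and limits with the command below (the region is an example; replace it with your own):

az vm list-usage --location westeurope --output table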
What does our deployment consist of?
In this deployment example, you will create an AKS cluster in a Spoke VNet and an Azure Firewall in the Hub VNet. Using Private Link, the AKS control plane, which runs in an Azure-managed virtual network, is exposed through a private endpoint with an IP address that is part of your AKS subnet. To execute all Kubernetes and Azure CLI-related commands, you will create a Jumpbox VM in the Hub VNet.
You will also set up the Azure Firewall to lock down inbound traffic to and outbound traffic from the Kubernetes cluster in Azure.
Creating the Azure Resources
Set the subscription
Choose the subscription you want to use if you have multiple Azure subscriptions.
Replace SubscriptionName with your subscription name.
You can also use your subscription ID instead of your subscription name.
az account set --subscription SubscriptionName
AKS Resource Group creation
Create a Resource Group for the Kubernetes Resources
az group create -n prvaks -l westeurope
{"id":"/subscriptions/###########################/resourceGroups/prvaks",
"location":"westeurope",
"managedBy": null,
"name":"prvaks",
"properties":{"provisioningState":"Succeeded"},
"tags": null,
"type":"Microsoft.Resources/resourceGroups"}
Create a Resource Group for the Hub and VNet Resources
az group create -n prvaksvnet -l westeurope
{"id":"/subscriptions/###########################/resourceGroups/prvaksvnet",
"location":"westeurope",
"managedBy": null,
"name":"prvaksvnet",
"properties":{"provisioningState":"Succeeded"},
"tags": null,
"type":"Microsoft.Resources/resourceGroups"}
AKS Network creation
Create a virtual network for the hub.
az network vnet create -g prvaksvnet -n hub1-firewalvnet --address-prefixes 10.0.0.0/22
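The hub VNet still needs its subnets, and the cluster needs a spoke VNet peered to the hub. The commands below are a minimal sketch: AzureFirewallSubnet is the subnet name Azure Firewall requires and JumpboxSubnet is referenced later in this tutorial, while the spoke VNet name (spoke1-aksvnet), the AKS subnet name (aks-subnet), and the address prefixes are assumptions (10.0.5.0/24 is chosen to match the node IPs shown later):

# Hub subnets: AzureFirewallSubnet (name required by Azure Firewall) and a subnet for the jumpbox
az network vnet subnet create -g prvaksvnet --vnet-name hub1-firewalvnet -n AzureFirewallSubnet --address-prefixes 10.0.0.0/24
az network vnet subnet create -g prvaksvnet --vnet-name hub1-firewalvnet -n JumpboxSubnet --address-prefixes 10.0.1.0/24

# Spoke VNet with a subnet for the AKS nodes (names and CIDRs are assumptions)
az network vnet create -g prvaksvnet -n spoke1-aksvnet --address-prefixes 10.0.4.0/22
az network vnet subnet create -g prvaksvnet --vnet-name spoke1-aksvnet -n aks-subnet --address-prefixes 10.0.5.0/24

# Peer the hub and spoke VNets in both directions
az network vnet peering create -g prvaksvnet -n hub-to-spoke --vnet-name hub1-firewalvnet --remote-vnet spoke1-aksvnet --allow-vnet-access
az network vnet peering create -g prvaksvnet -n spoke-to-hub --vnet-name spoke1-aksvnet --remote-vnet hub1-firewalvnet --allow-vnet-access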
An open question many Azure users have is how to lock down inbound traffic to and outbound traffic from a Kubernetes cluster in Azure. The Azure Firewall is one option that can be set up relatively easily but is not documented in detail.
Enable the firewall extension.
az extension add --name azure-firewall
Configure a public IP for the Azure Firewall
az network public-ip create -g prvaksvnet -n prvaks --sku Standard
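Next, deploy the firewall into the hub VNet and force all AKS egress traffic through it with a user-defined route. The sketch below reuses the names from this tutorial where they exist; the spoke VNet and subnet names are the same assumptions as above, and the JMESPath property for the private IP may vary slightly across CLI versions:

# Deploy the Azure Firewall and attach the public IP (requires the AzureFirewallSubnet created earlier);
# DNS proxy lets network rules use FQDNs later on
az network firewall create -g prvaksvnet -n prvaks -l westeurope --enable-dns-proxy true
az network firewall ip-config create -g prvaksvnet -f prvaks -n fw-config --public-ip-address prvaks --vnet-name hub1-firewalvnet

# Capture the firewall's private IP to use as the next hop
FW_PRIVATE_IP=$(az network firewall show -g prvaksvnet -n prvaks --query "ipConfigurations[0].privateIPAddress" -o tsv)

# Route table that sends all traffic from the AKS subnet through the firewall
az network route-table create -g prvaksvnet -n prvaks
az network route-table route create -g prvaksvnet --route-table-name prvaks -n to-firewall --address-prefix 0.0.0.0/0 --next-hop-type VirtualAppliance --next-hop-ip-address $FW_PRIVATE_IP

# Associate the route table with the AKS subnet
az network vnet subnet update -g prvaksvnet --vnet-name spoke1-aksvnet -n aks-subnet --route-table prvaks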
Check if the route table has been created successfully.
az network route-table route list --resource-group prvaksvnet --route-table-name prvaks
Create Exception Rules
Create the necessary exception rules for the AKS-required network dependencies, so that the worker nodes can set their system time, retrieve Ubuntu updates (optional), and pull the AKS kube-system container images from the Microsoft Container Registry.
Note: the exception rules carry a wildcard “*” entry for source and destination. Review these carefully and restrict them to the specific source(s) from which traffic will be allowed and, similarly, to the specific destination(s). These firewall rules are examples; refine them after checking with your network and system administrators.
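As an illustration only (rule names, collection names, and priorities are assumptions; adjust sources and destinations per the note above), the exception rules could look like this. The AzureKubernetesService FQDN tag covers the required AKS dependencies, including mcr.microsoft.com:

# Network rule: allow NTP so the worker nodes can set their system time
az network firewall network-rule create -g prvaksvnet -f prvaks --collection-name 'aks-time' -n 'ntp' --protocols 'UDP' --source-addresses '*' --destination-fqdns 'ntp.ubuntu.com' --destination-ports 123 --action allow --priority 100

# Application rule: AKS-required FQDNs (the AzureKubernetesService tag includes the Microsoft Container Registry)
az network firewall application-rule create -g prvaksvnet -f prvaks --collection-name 'aks-fqdn' -n 'aks' --source-addresses '*' --protocols 'http=80' 'https=443' --fqdn-tags "AzureKubernetesService" --action allow --priority 101

# Application rule: Ubuntu package updates (optional)
az network firewall application-rule create -g prvaksvnet -f prvaks --collection-name 'ubuntu-updates' -n 'ubuntu' --source-addresses '*' --protocols 'http=80' 'https=443' --target-fqdns 'security.ubuntu.com' 'archive.ubuntu.com' 'changelogs.ubuntu.com' --action allow --priority 102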
You have now created the virtual networks and subnets, the peering between them, and a route from the AKS subnet to the internet that goes through the Azure Firewall and allows only the required Azure Kubernetes Service dependencies.
Enable Private Link in the AKS cluster.
By default, an AKS cluster attaches a public IP to the standard load balancer. In this case, you prevent this by setting the outboundType to userDefinedRouting in your deployment, configuring a private cluster with a managed identity, and assigning the correct permissions.
Create a managed identity
az identity create --name myKubeletIdentity --resource-group prvaksvnet
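With outboundType set to userDefinedRouting, the cluster identity needs permission to use the pre-created route table and subnet. A sketch, assuming the identity above is used as the cluster identity and reusing the route table and (assumed) spoke VNet names from earlier:

# Look up the identity and the networking resources it must manage
IDENTITY_PRINCIPAL_ID=$(az identity show -g prvaksvnet -n myKubeletIdentity --query principalId -o tsv)
RT_ID=$(az network route-table show -g prvaksvnet -n prvaks --query id -o tsv)
VNET_ID=$(az network vnet show -g prvaksvnet -n spoke1-aksvnet --query id -o tsv)

# Grant Network Contributor on the route table and the spoke VNet
az role assignment create --assignee-object-id $IDENTITY_PRINCIPAL_ID --assignee-principal-type ServicePrincipal --role "Network Contributor" --scope $RT_ID
az role assignment create --assignee-object-id $IDENTITY_PRINCIPAL_ID --assignee-principal-type ServicePrincipal --role "Network Contributor" --scope $VNET_ID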
Create an AKS cluster with Azure CNI powered by Cilium in overlay mode.
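A minimal sketch of the cluster creation, assuming the resource group, subnet, and identity names used above (node count, location, and other sizing values are examples; on older Azure CLI versions the dataplane flag is --ebpf-dataplane cilium instead of --network-dataplane cilium):

# IDs of the (assumed) AKS subnet and the managed identity created earlier
SUBNET_ID=$(az network vnet subnet show -g prvaksvnet --vnet-name spoke1-aksvnet -n aks-subnet --query id -o tsv)
IDENTITY_ID=$(az identity show -g prvaksvnet -n myKubeletIdentity --query id -o tsv)

# Private cluster with user-defined routing and Azure CNI Powered by Cilium in overlay mode
az aks create \
  --resource-group prvaksvnet \
  --name prvaks \
  --location westeurope \
  --node-count 3 \
  --enable-private-cluster \
  --outbound-type userDefinedRouting \
  --vnet-subnet-id $SUBNET_ID \
  --network-plugin azure \
  --network-plugin-mode overlay \
  --network-dataplane cilium \
  --enable-managed-identity \
  --assign-identity $IDENTITY_ID \
  --generate-ssh-keys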
Once the AKS cluster is created, create a virtual network link from the cluster’s private DNS zone to the hub VNet. This ensures that your jumpbox can resolve the private IP of the API server using Azure DNS and reach the private link-enabled cluster with kubectl.
NODE_GROUP=$(az aks show --resource-group prvaksvnet --name prvaks --query nodeResourceGroup -o tsv)
DNS_ZONE_NAME=$(az network private-dns zone list --resource-group $NODE_GROUP --query "[0].name" -o tsv)
HUB_VNET_ID=$(az network vnet show -g prvaksvnet -n hub1-firewalvnet --query id -o tsv)
az network private-dns link vnet create --name "hubnetdnsconfig" --registration-enabled false --resource-group $NODE_GROUP --virtual-network $HUB_VNET_ID --zone-name $DNS_ZONE_NAME
Create a Jumpbox VM
Deploy an Ubuntu Jumpbox VM into the JumpboxSubnet so that you can issue all Azure CLI and kubectl commands for the private cluster. Log in to the Jumpbox VM and install the Azure CLI and kubectl. Make sure you set the correct subscription.
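A hedged example of the jumpbox deployment (the VM name, image alias, size, and admin username are assumptions; reach the VM via Azure Bastion, a firewall DNAT rule, or a public IP, depending on your security policy):

# Create the jumpbox in the hub VNet's JumpboxSubnet without a public IP
az vm create --resource-group prvaksvnet --name prvaks-jumpbox --image Ubuntu2204 --size Standard_B2s --vnet-name hub1-firewalvnet --subnet JumpboxSubnet --admin-username azureuser --generate-ssh-keys --public-ip-address ""

# On the jumpbox: install the Azure CLI and kubectl
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
sudo az aks install-cli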
az login --use-device-code
az account set --subscription "##########################"
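Then pull the cluster’s kubeconfig onto the jumpbox (the resource group below matches the one used in the DNS link step above):

az aks get-credentials --resource-group prvaksvnet --name prvaks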
Issue kubectl commands to see the status of all the pods and nodes in your private AKS cluster running Azure CNI powered by Cilium in overlay mode.
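For example:

kubectl get nodes -o wide
kubectl get pods -n kube-system -o wide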
After the cluster has been upgraded to Isovalent Enterprise for Cilium, issue the same kubectl commands to see the status of all the pods and nodes, and verify the Cilium version running in overlay mode.
kubectl -n kube-system exec ds/cilium -- cilium version
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init), block-wireserver (init)
Client: 1.12.17-cee.1 9c32cdb5 2023-12-13T09:11:33+00:00 go version go1.20.12 linux/amd64
Daemon: 1.12.17-cee.1 9c32cdb5 2023-12-13T09:11:33+00:00 go version go1.20.12 linux/amd64
Cilium Health Check
cilium-health is a tool available in Cilium that provides visibility into the overall health of the cluster’s networking and connectivity. You can check node-to-node health with cilium-health status:
kubectl -n kube-system exec ds/cilium -- cilium-health status
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init), block-wireserver (init)
Probe time: 2024-02-20T10:25:29Z
Nodes:
aks-nodepool1-39806308-vmss000001 (localhost):
Host connectivity to 10.0.5.7:
ICMP to stack: OK, RTT=148.4µs
HTTP to agent: OK, RTT=380.901µs
aks-nodepool1-39806308-vmss000000:
Host connectivity to 10.0.5.6:
ICMP to stack: OK, RTT=1.147301ms
HTTP to agent: OK, RTT=1.855003ms
aks-nodepool1-39806308-vmss000002:
Host connectivity to 10.0.5.5:
ICMP to stack: OK, RTT=1.000901ms
HTTP to agent: OK, RTT=1.710202ms
Cilium Connectivity Test (Optional)
The Cilium connectivity test deploys a series of services, deployments, and CiliumNetworkPolicies, which use various connectivity paths to connect. Connectivity paths include with and without service load-balancing and various network policy combinations.
Note: this test is optional, since it requires access to the outbound IPs 1.1.1.1 and 1.0.0.1 on ports 80 and 443. If administrators want to run the test, they must add rules in the Azure Firewall to allow outbound traffic toward those IPs.
You can also point these tests at an internal IP that has internet access, such as the firewall’s internal IP in this case, by overriding the cilium connectivity test defaults with the options below:
cilium connectivity test
  --external-ip string         IP to use as external target in connectivity tests (default "1.1.1.1")
  --external-other-ip string   Other IP to use as external target in connectivity tests (default "1.0.0.1")
When you run the cilium connectivity test command, Cilium creates deployments and pods. The pods’ images are downloaded from specific URLs that must be allowed in the Azure Firewall. Failing to add them leaves the pods stuck in the ImagePullBackOff state.
Before adding a rule in the Azure Firewall, the connectivity test pods sit in the ImagePullBackOff state.
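A hedged example of an application rule allowing the image registries (the FQDNs below are assumptions; take the exact registries from the image pull errors in the pod events):

az network firewall application-rule create -g prvaksvnet -f prvaks --collection-name 'cilium-test-images' -n 'registries' --source-addresses '*' --protocols 'https=443' --target-fqdns 'quay.io' '*.quay.io' 'docker.io' '*.docker.io' --action allow --priority 200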
You will also observe that outbound traffic to 1.0.0.1 and 1.1.1.1 on ports 80 and 443 is blocked. You can create a network rule in the Azure Firewall so that the DNS-based tests pass.
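A sketch of such a network rule (the collection name and priority are assumptions):

az network firewall network-rule create -g prvaksvnet -f prvaks --collection-name 'cilium-connectivity-test' -n 'external-targets' --protocols 'TCP' --source-addresses '*' --destination-addresses '1.1.1.1' '1.0.0.1' --destination-ports 80 443 --action allow --priority 210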
The cilium connectivity test was run for all of the above scenarios, and the tests passed successfully. A truncated output from one such test run is included.
Hopefully, this post gave you a good overview of deploying a private AKS cluster running Isovalent Enterprise for Cilium in a hub and spoke deployment model with Azure Firewall. If you have any feedback on the solution, please share it with us. You’ll find us on the Cilium Slack channel.
Amit Gupta is a senior technical marketing engineer at Isovalent, powering eBPF cloud-native networking and security. Amit has 21+ years of experience in Networking, Telecommunications, Cloud, Security, and Open-Source. He has previously worked with Motorola, Juniper, Avi Networks (acquired by VMware), and Prosimo. He is keen to learn and try out new technologies that aid in solving day-to-day problems for operators and customers.
He has worked in the Indian start-up ecosystem for a long time and helps new folks in that area outside of work. Amit is an avid runner and cyclist and also spends considerable time helping kids in orphanages.