“At any given moment, you have the power to say, this is not how the story will end.”—Christine Mason Miller. With that, we resume where we left off in the previous tutorial, Deploying Isovalent Enterprise for Cilium. This tutorial teaches you how to enable Enterprise features in an Azure Kubernetes Service (AKS) cluster running Isovalent Enterprise for Cilium.
What is Isovalent Enterprise for Cilium?
Isovalent Cilium Enterprise is an enterprise-grade, hardened distribution of the open-source projects Cilium, Hubble, and Tetragon, built and supported by the Cilium creators. Cilium enhances networking and security at the network layer, while Hubble ensures thorough network observability and tracing. Tetragon ties it all together with runtime enforcement and security observability, offering a well-rounded solution for connectivity, compliance, multi-cloud, and security concerns.
Why Isovalent Enterprise for Cilium?
For enterprise customers requiring support and usage of Advanced Networking, Security, and Observability features, “Isovalent Enterprise for Cilium” is recommended with the following benefits:
- Advanced network policy: Isovalent Cilium Enterprise provides advanced network policy capabilities, including DNS-aware policy, L7 policy, and deny policy, enabling fine-grained control over network traffic for micro-segmentation and improved security.
- Hubble flow observability + User Interface: Isovalent Cilium Enterprise Hubble observability feature provides real-time network traffic flow, policy visualization, and a powerful User Interface for easy troubleshooting and network management.
- Multi-cluster connectivity via Cluster Mesh: Isovalent Cilium Enterprise provides seamless networking and security across multiple clouds, including public cloud providers like AWS, Azure, and Google Cloud Platform, as well as on-premises environments.
- Advanced Security Capabilities via Tetragon: Tetragon provides advanced security capabilities such as protocol enforcement, IP and port whitelisting, and automatic application-aware policy generation to protect against the most sophisticated threats. Built on eBPF, Tetragon can easily scale to meet the needs of the most demanding cloud-native environments.
- Service Mesh: Isovalent Cilium Enterprise provides sidecar-free, seamless service-to-service communication and advanced load balancing, making it easy to deploy and manage complex microservices architectures.
- Enterprise-grade support: Isovalent Cilium Enterprise includes enterprise-grade support from Isovalent’s experienced team of experts, ensuring that any issues are resolved promptly and efficiently. Additionally, professional services help organizations deploy and manage Cilium in production environments.
Pre-Requisites
The following prerequisites need to be taken into account before you proceed with this tutorial:
- You should have an Azure Subscription.
- You have installed Isovalent Enterprise for Cilium.
- You can enable the features below by using the following:
- Install kubectl
- Install jq
- Users can contact their partner Sales/SE representative(s) at sales@isovalent.com for more detailed insights into the features below and to access the requisite documentation and Hubble CLI software images.
What is in it for my Enterprise?
Isovalent Enterprise provides a range of advanced enterprise features you will learn from this tutorial.
Layer 3/Layer 4 Policy
When using Cilium, endpoint IP addresses are irrelevant when defining security policies. Instead, you can use the labels assigned to the pods to define security policies. The policies will be applied to the right pods based on the labels, irrespective of where or when they run within the cluster.
The layer 3 policy establishes the base connectivity rules regarding which endpoints can talk to each other.
The layer 4 policy can be specified independently or in addition to the layer 3 policies. It restricts an endpoint’s ability to emit and/or receive packets on a particular port using a particular protocol.
You can take a Star Wars-inspired example in which there are three microservices applications: deathstar, tiefighter, and xwing. The deathstar runs an HTTP web service on port 80, which is exposed as a Kubernetes Service to load-balance requests to deathstar across two pod replicas. The deathstar service provides landing services to the empire’s spaceships so that they can request a landing port. The tiefighter pod represents a landing-request client service on a typical empire ship, and xwing represents a similar service on an alliance ship. They exist so that you can test different security policies for access control to deathstar landing services.
Validate L3/L4 Policies
- Deploy three services: deathstar, xwing, and tiefighter
- Kubernetes will deploy the pods and service in the background.
- Running `kubectl get pods,svc` will inform you about the progress of the operation.
- Check basic access: from the perspective of the deathstar service, only the ships with the label `org=empire` are allowed to connect and request landing. Since you have no rules enforced, both xwing and tiefighter can request landing.
- You can start with a basic policy restricting deathstar landing requests to only the ships that have the label `org=empire`. This will not allow any ships that don’t have the `org=empire` label to even connect with the deathstar service. This simple policy filters only on IP protocol (network layer 3) and TCP protocol (network layer 4), so it is often called an L3/L4 network security policy.
- The above policy whitelists traffic sent from any pods with the label `org=empire` to deathstar pods with the label `org=empire, class=deathstar` on TCP port 80.
- You can now apply this L3/L4 policy (an example manifest is shown after this list):
- If you run the landing requests, only the tiefighter pods with the label org=empire will succeed. The xwing pods will be blocked.
- Now, the same request run from an xwing pod will fail:
This request will hang, so press Control-C to kill the curl request or wait for it to time out.
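For reference, here is a minimal sketch of such an L3/L4 policy, following the upstream Star Wars demo (the policy name and labels assume the deployments described above):

```yaml
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "rule1"
spec:
  description: "L3-L4 policy to restrict deathstar access to empire ships only"
  endpointSelector:
    matchLabels:
      org: empire
      class: deathstar
  ingress:
  - fromEndpoints:
    - matchLabels:
        org: empire
    toPorts:
    - ports:
      - port: "80"
        protocol: TCP
```

Save it as, for example, sw_l3_l4_policy.yaml and apply it with `kubectl apply -f sw_l3_l4_policy.yaml`.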
HTTP-aware L7 Policy
Layer 7 policy rules are embedded into Layer 4 rules and can be specified for ingress and egress. A layer 7 request is permitted if at least one of the rules matches. If no rules are specified, then all traffic is permitted. If a layer 4 rule is specified in the policy, and a similar layer 4 rule with layer 7 rules is also specified, then the layer 7 portions of the latter rule will have no effect.
Note-
- The feature is currently in Beta status. For production use, you can contact support@isovalent.com and sales@isovalent.com.
- When WireGuard encryption is enabled, l7Proxy is set to false; to use L7 policies, it is recommended that users enable it again by updating the ARM template or via the Azure CLI.
- This will be available in an upcoming release.
To provide the strongest security (i.e., enforce least-privilege isolation) between microservices, each service that calls deathstar’s API should be limited to making only the set of HTTP requests it requires for legitimate operation.
For example, consider that the deathstar service exposes some maintenance APIs that random empire ships should not call.
Cilium can enforce HTTP-layer (i.e., L7) policies to limit the tiefighter pod’s ability to reach URLs.
Validate L7 Policy
- Apply L7 Policy—An example policy file (shown after this list) extends our original policy by limiting tiefighter to making only a POST /v1/request-landing API call and disallowing all other calls (including PUT /v1/exhaust-port).
- Update the existing rule (from the L3/L4 section) to apply L7-aware policy to protect deathstar
- Re-run a curl towards deathstar and the exhaust-port API.
- As this rule builds on the identity-aware rule, traffic from pods without the label `org=empire` will continue to be dropped, causing the connection to time out:
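A minimal sketch of the L7 rule referenced above, again following the upstream demo (the policy name and labels are assumptions based on the scenario described):

```yaml
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "rule1"
spec:
  description: "L7 policy to restrict access to specific HTTP call"
  endpointSelector:
    matchLabels:
      org: empire
      class: deathstar
  ingress:
  - fromEndpoints:
    - matchLabels:
        org: empire
    toPorts:
    - ports:
      - port: "80"
        protocol: TCP
      rules:
        http:
        - method: "POST"
          path: "/v1/request-landing"
```

Because it reuses the name rule1, applying it updates the existing L3/L4 rule in place.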
DNS-Based Policies
DNS-based policies are useful for controlling access to services outside the Kubernetes cluster. DNS acts as a persistent service identifier for both external services provided by Google and internal services, such as database clusters running in private subnets outside Kubernetes. CIDR or IP-based policies are cumbersome and hard to maintain as the IPs associated with external services can change frequently. The Cilium DNS-based policies provide an easy mechanism to specify access control, while Cilium manages the harder aspects of tracking DNS to IP mapping.
Validate DNS-Based Policies
- In line with our Star Wars-inspired examples, you can use a simple scenario where the Empire’s `mediabot` pods need access to GitHub to manage the Empire’s git repositories. The pods shouldn’t have access to any other external service.
- Apply DNS Egress Policy—The following Cilium network policy allows `mediabot` pods to access only `api.github.com` (an example manifest is shown after this list).
- Testing the policy, you can see that `mediabot` has access to `api.github.com` but doesn’t have access to any other external service, e.g., `support.github.com`.
This request will hang, so press Control-C to kill the curl request or wait for it to time out.
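A sketch of such a DNS-aware egress policy, mirroring the upstream toFQDNs example (the kube-dns selector assumes the default CoreDNS labels on AKS):

```yaml
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "fqdn"
spec:
  endpointSelector:
    matchLabels:
      org: empire
      class: mediabot
  egress:
  # Allow DNS lookups to kube-dns and route them through Cilium's DNS proxy
  - toEndpoints:
    - matchLabels:
        "k8s:io.kubernetes.pod.namespace": kube-system
        "k8s:k8s-app": kube-dns
    toPorts:
    - ports:
      - port: "53"
        protocol: ANY
      rules:
        dns:
        - matchPattern: "*"
  # Only allow egress to api.github.com
  - toFQDNs:
    - matchName: "api.github.com"
```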
Combining DNS, Port, and L7 Rules
The DNS-based policies can be combined with port (L4) and API (L7) rules to restrict access further. In our example, you can restrict mediabot pods to access GitHub services only on port 443. The toPorts section in the policy below achieves the port-based restrictions along with the DNS-based policies.
Validate the combination of DNS, Port, and L7-based Rules
- Applying the policy (an example manifest follows this list):
- Testing, the access to https://support.github.com on port 443 will succeed, but the access to http://support.github.com on port 80 will be denied.
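A sketch of the combined rule, extending the previous policy with a port restriction (the `*.github.com` pattern and port 443 follow the example described above):

```yaml
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "fqdn"
spec:
  endpointSelector:
    matchLabels:
      org: empire
      class: mediabot
  egress:
  - toEndpoints:
    - matchLabels:
        "k8s:io.kubernetes.pod.namespace": kube-system
        "k8s:k8s-app": kube-dns
    toPorts:
    - ports:
      - port: "53"
        protocol: ANY
      rules:
        dns:
        - matchPattern: "*"
  # Allow GitHub domains, but only over TCP 443
  - toFQDNs:
    - matchPattern: "*.github.com"
    toPorts:
    - ports:
      - port: "443"
        protocol: TCP
```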
Encryption
Cilium supports the transparent encryption of Cilium-managed host traffic and traffic between Cilium-managed endpoints using WireGuard® or IPsec. In this tutorial, you will learn about WireGuard.
Note-
- The feature is currently in Beta status. For production use, you can contact support@isovalent.com and sales@isovalent.com.
- For WireGuard encryption, l7Proxy must be set to false; users should disable it by updating the ARM template or via the Azure CLI.
WireGuard
When WireGuard is enabled in Cilium, the agent running on each cluster node will establish a secure WireGuard tunnel between it and all other known nodes.
Packets are not encrypted when destined to the same node from which they were sent. This behavior is intended. Encryption would provide no benefits, given that the raw traffic can be observed on the node anyway.
Validate Wireguard Encryption
- To demonstrate WireGuard encryption, users can create a client pod on one AKS node and a server pod on another.
- The client does a “wget” towards the server every 2 seconds.
- Run a `bash` shell in one of the Cilium pods with `kubectl -n kube-system exec -ti ds/cilium -- bash` and execute the following commands (a combined sketch follows this list):
  - Check that WireGuard has been enabled (the number of peers should correspond to the number of nodes minus one):
- Install tcpdump on the node where the server pod has been created.
- Check that traffic (HTTP requests and responses) is sent via the `cilium_wg0` tunnel device on the node where the server pod has been created:
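A combined sketch of the validation steps (the interface name cilium_wg0 is Cilium’s default WireGuard device; the exact status output varies per cluster):

```bash
# Open a shell in one of the Cilium agent pods
kubectl -n kube-system exec -ti ds/cilium -- bash

# Inside the agent: confirm WireGuard is active; Peers should equal (nodes - 1)
cilium status | grep Encryption

# On the node hosting the server pod: confirm the HTTP requests/responses
# traverse the WireGuard tunnel device
tcpdump -n -i cilium_wg0
```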
Kube-Proxy Replacement
One of the additional benefits of using Cilium is its extremely efficient data plane. It’s particularly useful at scale, as the standard kube-proxy is based on a technology – iptables – that was never designed with the churn and scale of large Kubernetes clusters in mind.
Validate Kube-Proxy Replacement
- Users can first validate that the Cilium agent is running in the desired mode, with kube-proxy replacement set to strict:
- You can also check that kube-proxy is not running as a daemonset on the AKS cluster.
- You can deploy nginx pods, create a new NodePort service, and validate that Cilium installed the service correctly (a sketch of these steps follows this list).
- The following YAML is used for the backend pods:
- Verify that the NGINX pods are up and running:
- Create a NodePort service for the instances:
- Verify that the NodePort service has been created:
- With the help of the `cilium service list` command, validate that Cilium’s eBPF kube-proxy replacement created the new NodePort service under port 31076 (truncated output):
- At the same time, you can verify, using `iptables` in the host namespace (on the node), that no `iptables` rules for the service are present:
- Last but not least, a simple `curl` test shows connectivity for the exposed NodePort `31076` as well as for the ClusterIP:
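A sketch of the steps above (the deployment name, replica count, and the NodePort value 31076 are illustrative; your cluster will assign its own port):

```bash
# 1. Confirm the agent runs with kube-proxy replacement in strict mode
kubectl -n kube-system exec ds/cilium -- cilium status | grep KubeProxyReplacement

# 2. Confirm no kube-proxy daemonset exists on the cluster
kubectl -n kube-system get ds kube-proxy

# 3. Deploy the nginx backends and expose them via a NodePort service
kubectl create deployment my-nginx --image=nginx --replicas=2
kubectl expose deployment my-nginx --type=NodePort --port=80
kubectl get pods,svc -l app=my-nginx

# 4. Validate that Cilium's eBPF kube-proxy replacement programmed the service
kubectl -n kube-system exec ds/cilium -- cilium service list

# 5. On the node: confirm no iptables rules were installed for the service
iptables-save | grep my-nginx

# 6. Test connectivity against the NodePort (replace with your node IP and assigned port)
curl http://<node-ip>:31076
```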
Upgrade AKS clusters running kube-proxy
Note- When I initially wrote this blog post, kube-proxy was enabled as a daemonset. You can now create or upgrade an AKS cluster on Isovalent Enterprise for Cilium, and your AKS clusters will no longer have a Kube-proxy-based iptables implementation.
- This example shows a service created running a Kube-Proxy-based implementation on an AKS cluster (not running Isovalent Enterprise for Cilium).
- The AKS cluster is then upgraded to Isovalent Enterprise for Cilium. As you can see, traffic works seamlessly.
- There are no more iptables rules; Cilium endpoints come into play instead.
Hubble CLI
Hubble’s CLI extends the visibility that is provided by standard kubectl commands like `kubectl get pods` to give you more network-level details about a request, such as its status and the security identities associated with its source and destination.
The Hubble CLI can be leveraged to observe network flows from Cilium agents. Users can observe the flows from their local workstation for troubleshooting or monitoring. For this tutorial, all Hubble outputs relate to the tests performed above; users can try other tests and expect similar results with different values.
Setup Hubble Relay Forwarding
Use `kubectl port-forward` to forward the hubble-relay service, then edit the Hubble CLI configuration to point at the forwarded endpoint (a sketch follows).
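A minimal sketch, assuming the default hubble-relay service in kube-system and the Hubble CLI’s default port of 4245:

```bash
# Forward the Hubble Relay service to the local workstation
kubectl -n kube-system port-forward service/hubble-relay 4245:80 &

# Point the Hubble CLI at the forwarded endpoint
hubble config set server localhost:4245

# Confirm the CLI can reach the relay
hubble status
```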
Hubble Status
`hubble status` can check the overall health of Hubble within your cluster. If using Hubble Relay, a counter for the number of connected nodes will appear in the last line of the output.
View Last N Events
`hubble observe` displays the most recent events based on the number filter. Hubble Relay will display events over all the connected nodes:
Follow Events in Real-Time
`hubble observe --follow` will follow the event stream for all connected clusters.
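For example:

```bash
# Show the last 10 flows seen across all connected nodes
hubble observe --last 10

# Stream flows in real time
hubble observe --follow
```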
Troubleshooting HTTP & DNS
- Suppose you have a CiliumNetworkPolicy that enforces DNS or HTTP policy. In that case, you can use the `--type l7` filtering option for Hubble to check your applications’ HTTP methods and DNS resolution attempts (example invocations follow this list).
- You can use `--http-status` to view specific flows with 200 HTTP responses.
- You can also show HTTP PUT methods with `--http-method`.
- To view DNS traffic for a specific FQDN, you can use the `--to-fqdn` flag.
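Example invocations (the status code, method, and FQDN values are illustrative):

```bash
# Only L7 (HTTP/DNS/Kafka) flows
hubble observe --type l7

# Flows that returned an HTTP 200
hubble observe --http-status 200

# Only HTTP PUT requests
hubble observe --http-method PUT

# DNS traffic towards a specific FQDN
hubble observe --to-fqdn "api.github.com"
```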
Filter by Verdict
Hubble provides a field called VERDICT that displays one of `FORWARDED`, `ERROR`, or `DROPPED` for each flow. `DROPPED` could indicate an unsupported protocol within the underlying platform or a network policy enforcing pod communication. Hubble can introspect the reason for `ERROR` or `DROPPED` flows and display the reason within the TYPE field of each flow.
Filter by Pod or Namespace
- To show all flows for a specific pod, filter with the `--pod` flag.
- If you are only interested in traffic from a pod to a specific destination, combine `--from-pod` and `--to-pod`.
- If you want to see all traffic from a specific namespace, specify the `--from-namespace` flag (example invocations follow this list).
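Example invocations (pod and namespace names refer to the Star Wars demo used earlier and are illustrative):

```bash
# All flows to or from a specific pod
hubble observe --pod default/tiefighter

# Traffic from one pod towards another
hubble observe --from-pod default/tiefighter --to-pod default/deathstar

# All traffic originating from a namespace
hubble observe --from-namespace default
```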
Filter Events with JQ
To filter events through the jq tool, you can swap the output to JSON mode. Piping the output through jq helps you see more metadata around the workload labels, such as the pod name/namespace assigned to both source and destination. This information is accessible to Cilium because it is encoded in the packets based on pod identities.
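A sketch of the JSON workflow (the jq filter here just pretty-prints; once you see the structure, you can drill into fields such as the source and destination labels):

```bash
# Emit flows as JSON and inspect the pod/namespace labels attached to each side
hubble observe --last 5 -o json | jq .
```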
Hubble UI
Note- To obtain the helm values to install Hubble UI and access the Enterprise documentation, you need to reach out to sales@isovalent.com and support@isovalent.com
The graphical user interface (`hubble-ui`) utilizes relay-based visibility to provide a graphical service dependency and connectivity map. Hubble UI is enabled via Helm charts; the feature is not enabled by default when you create a new cluster using Isovalent Enterprise for Cilium or upgrade to Isovalent Enterprise for Cilium.
Once the installation is complete, you will notice hubble-ui running as a daemonset, and the pods are up and running:
Validate the installation
To access Hubble UI, forward a local port to the Hubble UI service:
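Assuming the default installation into kube-system, the port-forward looks like this:

```bash
kubectl -n kube-system port-forward service/hubble-ui 12000:80
```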
Then, open http://localhost:12000 in your browser:
Select the default namespace, and you will observe a service map and network event list. In this case, the pod `mediabot` (created in the previous test case) is trying to access `support.github.com` over port `443`.
Note- You can read more about Hubble in a detailed three-part blog series.
DNS Proxy HA
Cilium Enterprise supports deploying an additional DNS proxy daemonset called `cilium-dnsproxy` that can be life-cycled independently of the Cilium daemonset.
What is Cilium DNS Proxy HA?
When Cilium Network Policies that make use of toFQDNs are installed in a Kubernetes cluster, the Cilium agent starts an in-process DNS proxy that becomes responsible for proxying all DNS requests between all pods and the Kubernetes internal kube-dns service. Whenever a Cilium agent is restarted, such as during an upgrade or due to something unexpected, DNS requests from all pods on that node do not succeed until the Cilium agent is online again.
When cilium-dnsproxy is enabled, an independently life-cycled DaemonSet is deployed. `cilium-dnsproxy` acts as a hot standby that mirrors DNS policy rules. `cilium-agent` and `cilium-dnsproxy` bind to the same port, relying on the Linux kernel to distribute DNS traffic between the two DNS proxy instances. This allows you to lifecycle either the cilium or cilium-dnsproxy daemonset without impacting DNS traffic.
Installation of DNS-Proxy HA using helm
Note-
- DNS Proxy High Availability relies on configuring the cilium-config ConfigMap with `external-dns-proxy: true` and on deploying the DNS proxy component.
- To enable DNS Proxy HA and access the Enterprise documentation, you need to reach out to sales@isovalent.com and support@isovalent.com
DNS-Proxy HA is enabled via helm charts. The feature is not enabled when you create a new cluster using Isovalent Enterprise for Cilium or upgrade to Isovalent Enterprise for Cilium.
Once the installation is complete, you will notice `cilium-dnsproxy` running as a daemonset, and the pods are up and running (a quick validation sketch follows):
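A quick sanity check after the Helm install (the ConfigMap key follows the note above; the daemonset name assumes the default chart values):

```bash
# Confirm the agent config enables the external DNS proxy
kubectl -n kube-system get configmap cilium-config -o yaml | grep external-dns-proxy

# Confirm the dedicated DNS proxy daemonset and its pods are running
kubectl -n kube-system get ds cilium-dnsproxy
kubectl -n kube-system get pods | grep cilium-dnsproxy
```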
Validate DNS-Proxy HA
- In line with our Star Wars-inspired examples, you can use a simple scenario where the Empire’s `mediabot` pods need access to GitHub to manage the Empire’s git repositories. The pods shouldn’t have access to any other external service.
- Apply DNS Egress Policy—The following Cilium network policy allows `mediabot` pods to access only `api.github.com`.
- Testing the policy, you can see that the `mediabot` pod has access to `api.github.com`.
- Send packets continuously in a loop (a sketch of this loop and the agent restart follows this list).
- Simulate a failure scenario wherein the `cilium-agent` pods are not up and running, and the traffic still goes through the `cilium-dnsproxy-*` pods.
- `cilium-agent` pods are restarted as part of the test.
- Traffic is not disrupted and continues to flow through the `cilium-dnsproxy` pods.
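A sketch of this test (the loop assumes the mediabot image ships curl, as in the upstream demo; restarting the cilium daemonset stands in for an agent failure):

```bash
# Terminal 1: request api.github.com in a loop from the mediabot pod
kubectl exec -ti mediabot -- sh -c \
  'while true; do curl -s -o /dev/null -w "%{http_code}\n" https://api.github.com; sleep 2; done'

# Terminal 2: restart the cilium agents to simulate the failure
kubectl -n kube-system rollout restart ds/cilium
kubectl -n kube-system rollout status ds/cilium

# The loop in terminal 1 should keep returning responses while the agents restart,
# because cilium-dnsproxy keeps serving DNS policy for the pod.
```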
Tetragon
Tetragon provides a powerful security observability and real-time runtime enforcement platform. The creators of Cilium built Tetragon, bringing the full power of eBPF to the world of security.
Tetragon helps platform and security teams solve the following:
Security Observability:
- Observing application and system behavior such as process, syscall, file, and network activity
- Tracing namespace, privilege, and capability escalations
- File integrity monitoring
Runtime Enforcement:
- Application of security policies to limit the privileges of applications and processes on a system (system calls, file access, network, kprobes)
Tetragon has been specifically built for Kubernetes and cloud-native infrastructure but can be run on any Linux system. Sometimes, you might want to enable process visibility in an environment without Cilium as the CNI. The security observability provided by the Hubble-Enterprise daemonset can operate in a standalone mode, decoupled from Cilium as a CNI.
Installation of Tetragon using helm
Note- To obtain the helm values to install Tetragon and access to Enterprise documentation, you need to reach out to sales@isovalent.com and support@isovalent.com
Tetragon is enabled via helm charts, as the feature is not enabled when you create a new cluster using Isovalent Enterprise for Cilium or upgrade to Isovalent Enterprise for Cilium.
Once the installation is complete, you will notice Tetragon running as a daemonset, and the pods are up and running:
Validate Tetragon
You can use our Demo Application to explore the Process and Networking Events:
- Create a namespace
- Deploy a demo application. Verify that all pods are up and running.
- Reach out to sales@isovalent.com or support@isovalent.com to get access to the respective demo applications.
- You can view all the pods in the `tenant-jobs` namespace in hubble-ui.
- You can examine the Process and Networking Events in two different ways:
  - Raw JSON events: `kubectl logs -n kube-system ds/hubble-enterprise -c export-stdout -f`
  - Enable Hubble UI: the second way is to visualize the processes running on a certain workload by observing their Process Ancestry Tree. This tree gives you rich Kubernetes API, identity-aware metadata, and OS-level process visibility about the executed binary, its parents, and the execution time up until `dockerd` has started the container.
  - While in a real-world deployment the Hubble event data would likely be exported to a SIEM or other logging datastore, in this quickstart you will access this event data by redirecting the logs of the export-stdout container of the hubble-enterprise pod: `kubectl logs -n kube-system ds/hubble-enterprise -c export-stdout > export.log`
- From the main Hubble UI screen, click on the `tenant-jobs` namespace in the list. Then, in the left navigation sidebar, click Processes.
- To upload the exported logs, click Upload on the left of the screen:
- Use the file selector dialog to choose the export.log file generated earlier and select the tenant-jobs namespace from the namespace dropdown.
- Here, you can get a brief overview of a security use case that can easily be detected and is interesting to visualize. By using Hubble UI and visualizing the Process Ancestry Tree, you can detect a shell execution in the crawler-YYYYYYYYY-ZZZZZ Pod that occurred more than 5 minutes after the container started. After clicking on the crawler-YYYYYYYY-ZZZZZ pod name from the Pods selector dropdown on the left of the screen, you will be able to see the Process Ancestry Tree for that pod:
- The Process Ancestry Tree gives us:
- Rich Kubernetes Identity-Aware Metadata: You can see the name of the team or namespace and the specific application service to be inspected in the first row.
- OS-Level Process Visibility: You can see all the processes that have been executed on the inspected service or were related to its Pod lifecycle.
- DNS Aware Metadata: You can see all the external connections with the exact DNS name as an endpoint made from specific processes of the inspected service.
Enabling DNS visibility
Outbound network traffic remains a major attack vector for many enterprises. For example, in the above example, you can see that the crawler service reaches out to one or more services outside the Kubernetes cluster on port 443. However, the identity of these external services is unknown, as the flows only show an IP address.
Cilium Enterprise can parse the DNS-layer requests emitted by services and associate that identity data with outgoing connections, enriching network connectivity logs.
To inspect the DNS lookups for pods within a namespace, you must apply a network policy that tells Cilium to inspect port 53 traffic from pods to kube-dns at Layer 7 (an example visibility policy is sketched below). With this DNS visibility, Hubble flow data will be annotated with DNS service identity for destinations outside the Kubernetes cluster. The demo app keeps reaching out to Twitter at regular intervals, and now both the service map and flows table for the `tenant-jobs` namespace show connections to `api.twitter.com`:
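A sketch of such a visibility policy, based on the common L7 DNS pattern (the namespace, the empty endpoint selector, and the kube-dns labels are assumptions; an explicit allow-all egress rule is included so the policy observes rather than restricts):

```yaml
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "dns-visibility"
  namespace: tenant-jobs
spec:
  endpointSelector: {}
  egress:
  # Send DNS traffic to kube-dns through Cilium's L7 DNS proxy so lookups are recorded
  - toEndpoints:
    - matchLabels:
        "k8s:io.kubernetes.pod.namespace": kube-system
        "k8s:k8s-app": kube-dns
    toPorts:
    - ports:
      - port: "53"
        protocol: ANY
      rules:
        dns:
        - matchPattern: "*"
  # Keep all other egress allowed; this policy is for visibility, not enforcement
  - toEntities:
    - all
```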
Enabling HTTP Visibility
Like enabling DNS visibility, you can also apply a visibility policy to instruct Cilium to inspect certain traffic flows at the application layer (e.g., HTTP) without requiring any changes to the application itself.
In this example, you’ll inspect ingress connections to services within the `tenant-jobs` namespace at the HTTP layer (an example visibility policy is sketched below). You can inspect flow details to get application-layer information. For example, you can inspect the HTTP queries that `coreapi` makes to `elasticsearch`:
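A sketch of an ingress HTTP visibility policy (the namespace follows the demo; the port is illustrative and should match the service you want to inspect; the empty http rule parses all requests without restricting them):

```yaml
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "http-ingress-visibility"
  namespace: tenant-jobs
spec:
  endpointSelector: {}
  ingress:
  - fromEntities:
    - all
    toPorts:
    - ports:
      - port: "8080"   # illustrative; use the target service's port
        protocol: TCP
      rules:
        http:
        - {}           # match every HTTP request, purely for visibility
```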
Network Policy
In this scenario, you will use the services, deployments, and pods created in the earlier step (Observing Network Flows with Hubble CLI) and then use the Hubble UI’s network policy editor and service map/flow details pages to create, troubleshoot, and update a network policy.
Apply an L3/L4 Policy
- You can start with a basic policy restricting `deathstar` landing requests to only the ships that have the label `org=empire`. This will not allow any ships that don’t have the `org=empire` label to even connect with the `deathstar` service. This simple policy filters only on IP protocol (network layer 3) and TCP protocol (network layer 4), so it is often called an L3/L4 network security policy.
- The above policy whitelists traffic sent from pods with the label `org=empire` to deathstar pods with the label `org=empire, class=deathstar` on TCP port 80.
- To apply this L3/L4 policy, run:
- If you run the landing requests again, only the `tiefighter` pods with the label `org=empire` will succeed. The `xwing` pods will be blocked.
- The same request run from an `xwing` pod will fail:
This request will hang, so press Control-C to kill the curl request or wait for it to time out. As you can see, Hubble-UI is reporting that the flow will be dropped.
- Using the policy editor, click on the denied/dropped flows and add them to your policy.
- You need to download the policy and apply/update it again.
- Once the updated policy is applied, run the same request from an `xwing` pod, and it should now pass.
Enabling TLS visibility
Cilium’s Network Security Observability provides deep insights into network events, such as `tls` events containing information about the exact TLS connections, including the negotiated cipher suite, the TLS version, the source/destination IP addresses, and the ports of the connection made by the initial process. In addition, `tls` events are also enriched by Kubernetes identity-aware metadata (Kubernetes namespace, pod/container name, labels, container image, etc.).
By default, Tetragon does not show connectivity-related events. TLS visibility requires a specific tracing policy and a CRD, which we will apply.
Apply a TLS visibility CRD:
Note- Reach out to sales@isovalent.com or support@isovalent.com to get access to the TLS visibility CRD.
- Let’s see the events we get out of the box. You can use the same pods that were created in the previous step.
- Create a TLS visibility Policy
- Apply the Policy:
- Send traffic to see the TLS ciphers and versions being used. From the `xwing` pod shell, try a simple `curl` to google.com and use `openssl` to check ciphers (a sketch of these commands follows this list):
- Check the events in a different terminal:
- You can also view the above results in a tabular format by using `jq`:
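A sketch of these commands (assuming the xwing pod image ships curl and openssl; otherwise run them from any pod that does):

```bash
# Generate TLS traffic from the xwing pod
kubectl exec -ti xwing -- curl -sI https://www.google.com

# Inspect the negotiated TLS protocol and cipher from the same pod
kubectl exec -ti xwing -- sh -c \
  'openssl s_client -connect www.google.com:443 </dev/null 2>/dev/null | grep -E "Protocol|Cipher"'

# In a second terminal, watch the exported Tetragon events for the TLS records
kubectl logs -n kube-system ds/hubble-enterprise -c export-stdout -f | jq .
```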
File Enforcement
Tetragon Enterprise uses information about the kernel’s internal structures to provide file monitoring. To configure File Enforcement, apply an example File Enforcement Policy, which blocks any write access on /etc/passwd and /etc/shadow and blocks deleting these files. This policy applies to host files and all Kubernetes pods in the default Kubernetes namespace.
Note- Reach out to sales@isovalent.com or support@isovalent.com to get access to the specific file enforcement policies.
Using one of the already existing pods, you’ll see events from a test pod in the default Kubernetes namespace when trying to edit the `/etc/passwd` file:
Process Visibility
Tetragon Enterprise uses information about the internal structures of the kernel to provide process visibility. Run kubectl to validate that Hubble Enterprise is configured with process visibility enabled:
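A hedged sketch of such a check (the daemonset name matches the hubble-enterprise daemonset used above; the exact flag controlling process visibility varies by release, so consult the Enterprise documentation for the authoritative key):

```bash
# Look for process-visibility related settings on the hubble-enterprise daemonset
kubectl -n kube-system get ds hubble-enterprise -o yaml | grep -i process

# Confirm the daemonset pods are running on every node
kubectl -n kube-system get pods -o wide | grep hubble-enterprise
```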
Conclusion
Hopefully, this post gave you a good overview of how and why you would deploy Isovalent Enterprise for Cilium from the Azure Marketplace and enable end users with Enterprise features. If you have any feedback on the solution, please share it with us. You’ll find us on the Cilium Slack channel.
Try it Out
- Isovalent Enterprise for Cilium on the Azure marketplace.
Further Reading
- Upgrade to cilium in Azure
- Azure and Isovalent main partner page
Amit Gupta is a senior technical marketing engineer at Isovalent, powering eBPF cloud-native networking and security. Amit has 21+ years of experience in Networking, Telecommunications, Cloud, Security, and Open-Source. He has previously worked with Motorola, Juniper, Avi Networks (acquired by VMware), and Prosimo. He is keen to learn and try out new technologies that aid in solving day-to-day problems for operators and customers.
He has worked in the Indian start-up ecosystem for a long time and helps new folks in that area outside of work. Amit is an avid runner and cyclist and also spends considerable time helping kids in orphanages.