
Isovalent Enterprise for Cilium 1.16 – High-Performance Networking With Per-Flow Encryption, End-To-End Multi-Cluster Visibility, BGPV2, and BFD for BGP

Dean Lewis

Isovalent Enterprise for Cilium 1.16 has landed! The theme of the release is “Faster, Stronger, Smarter”: faster for the performance gains you get from improvements to network policy handling and the new ELF loader logic; stronger for the security and operational improvements, such as Per-Flow Selective Encryption and port range support in network policies; and smarter for the new traffic engineering features, such as Kubernetes Service Traffic Distribution, Egress Gateway IPAM, and Hubble + Timescape multi-cluster support!

This blog post covers the major new additions in this enterprise release.

What Is New in Isovalent Enterprise for Cilium 1.16?

Networking

  • Cilium netkit: Container-network throughput and latency as fast as host-network (more details)
  • Service traffic distribution: Kubernetes 1.30 Service Traffic Distribution is the successor to topology-aware routing (more details)
  • BGPv2: Fresh new API for Cilium’s BGP feature (more details)
  • BGP ClusterIP advertisement: BGP advertisements of ExternalIP and Cluster IP Services (more details)
  • Node IPAM Service LB: Ability to assign IP addresses from the nodes themselves to Kubernetes services, providing alternative access to services from outside of the cluster (more details)
  • BFD for BGP: Detect link or neighbor loss faster, forcing traffic to take an alternate path and greatly reducing downtime (more details)
  • Egress Gateway IPAM: Specify a list of CIDRs to be used as Egress IPs to be assigned to the Egress Gateways (more details)
  • Per-pod fixed MAC address: Addressing use cases such as software that is licensed based on a known MAC address (more details)

Service Mesh & Ingress/Gateway API

  • Gateway API GAMMA support: East-west traffic management for the cluster via Gateway API (more details)
  • Gateway API 1.1 support: Cilium now supports Gateway API 1.1 (more details)
  • Gateway API support for more protocol options: Cilium Gateway API supports new protocol options such as proxyProtocol, ALPN and appProtocol (more details)
  • Local ExternalTrafficPolicy support for Ingress/Gateway API: External traffic can now be routed to node-local endpoints, preserving the client source IP address (more details)
  • L7 Envoy Proxy as dedicated DaemonSet: With a dedicated DaemonSet, Envoy and Cilium can have a separate life-cycle from each other. Now on by default for new installs (more details)
  • Host Network mode & Envoy listeners on a subset of nodes: Cilium Gateway API Gateway/Ingress can now be deployed on the host network and on selected nodes (more details)

Security

  • Per-Flow Encryption: Selectively encrypt traffic between workloads (more details)
  • Port Range support in Network Policies: This long-awaited feature has been implemented into Cilium (more details)
  • Network Policy validation status: kubectl describe cnp <name> will be able to tell if the Cilium Network Policy is valid or invalid (more details)
  • CIDRGroups support for Egress and Deny rules: Add support for matching CiliumCIDRGroups in Egress and Deny policy rules (more details)
  • Select nodes as the target of Cilium Network Policies: With new ToNodes/FromNodes selectors, traffic can be allowed or denied based on the labels of the target Node in the cluster (more details)
  • FIPS-Compliant Cilium Images: Isovalent customers can now use purpose-built, FIPS-compliant Cilium images (more details)

Day 2 Operations and Scale

  • Introduction of Cilium Feature Gates: Provide operators enhanced control over experimental features (more details)
  • Improved DNS-based network policy performance: Up to a 5x reduction in tail latency for DNS-based network policies (more details)
  • New ELF loader logic: With this new loader logic, the median memory usage of Cilium was decreased by 24% (more details)

Hubble & Observability

  • Hubble Timescape: New alternative deployment architecture for shorter retention of Hubble flows, plus multi-cluster and multiple-bucket support (more details)
  • Hubble UI Enterprise: Multi-cluster support, HTTP and Mutual Authentication Policies in Network Policy Editor (more details)
  • Filtering Hubble flows by node labels: Filter Hubble flows observed on nodes matching the given label (more details)
  • Improvements for egress traffic path observability: Enhancements to Cilium Metrics and Hubble flow data for traffic that traverses Egress Gateways, allowing better troubleshooting of this popular Cilium feature (more details)

Networking

Cilium netkit

Status: Beta

Containerization has always come at a performance cost, and nowhere is it more visible than in networking. A standard container networking architecture could result in a 35% drop in network performance compared to the host. How could we bridge that gap?

Over the past seven years, Isovalent developers have added several features to reduce this performance penalty.

With the introduction of Cilium netkit, you can finally achieve performance parity between host and container. 

Cilium is the first public project providing built-in support for netkit, a Linux network device introduced in the 6.7 kernel release and developed by Isovalent engineers. Expect to hear about developments of this feature in later releases of Isovalent Enterprise.
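
If you want to experiment with netkit, it is selected through the datapath mode at installation time. A minimal Helm sketch, assuming the upstream chart layout (release name, namespace, and the exact value name may differ in your environment, so check the documentation for your distribution):

helm upgrade --install cilium cilium/cilium \
  --namespace kube-system \
  --reuse-values \
  --set bpf.datapathMode=netkit   # switch the datapath from veth pairs to netkit devices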

Cilium netkit - Latency in usec, pod to pod over wire
Cilium netkit - TCP stream, single flow, pod to pod over wire

To learn more, read our deep dive into the journey into high-performance container networking, with netkit the final frontier.

Cilium netkit: The Final Frontier in Container Networking Performance

Learn how Cilium can now provide host-level performance.

Read Deep Dive

Service Traffic Distribution

Isovalent Enterprise for Cilium 1.16 will support Kubernetes’ new “Traffic Distribution for Services” model, which aims to provide topology-aware traffic engineering. It can be considered the successor to features such as Topology-Aware Routing and Topology-Aware Hints. 

Introduced in Kubernetes 1.30, Service Traffic Distribution enables users to express preferences on traffic policy for a Kubernetes Service. Currently, the only supported value is “PreferClose,” which indicates a preference for routing traffic to endpoints that are topologically proximate to the client. 

Users can optimise performance, cost, and reliability by keeping the traffic within a local zone.

Service Traffic Distribution is enabled directly in the Service specification (rather than through annotations, as was the case with Topology-Aware Hints & Routing):

apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  ports:
    - port: 80
      protocol: TCP
      targetPort: 80
  selector:
    app: web
  type: ClusterIP
  trafficDistribution: PreferClose

In the demo below, we have 6 backend pods across two Availability Zones. By default, traffic to the Service is distributed across all 6 pods, regardless of their location.

When enabling Traffic Distribution, you can see that the traffic from the client is only sent to the 3 pods in the same AZ. When no local backend is available, traffic is forwarded to backends outside the local zone.

Alongside Service Traffic Distribution support, Cilium is also introducing tooling to monitor traffic inter- and intra-zones. Given that Service Traffic Distribution and its predecessors all derive the zones from labels placed on the nodes, users can now monitor cross-zone traffic using Hubble’s new node labels capabilities, described later in this blog post, and identify ways to reduce cost and latency.

BGPv2

Most applications running in Kubernetes clusters need to talk to external networks. In self-managed Kubernetes environments, BGP is often used to advertise networks used by Kubernetes Pods and Services, to make applications accessible from traditional workloads.

Cilium has natively supported BGP since the 1.10 release. It’s quickly become a popular feature as users appreciate not having to install a separate BGP daemon to connect their clusters to the rest of the network. We’ve seen thousands of users taking our “BGP on Cilium” and “Advanced BGP on Cilium” labs and collecting badges along the way. 

What we’ve noticed recently is an acceleration of its adoption in scenarios such as:

  • Users needing to access KubeVirt-managed Virtual Machines running on Red Hat OpenShift clusters
  • Users connecting their external network fabric (such as Cisco ACI) to Kubernetes clusters

Expect some content on both architectures in the coming weeks and months.

However, the BGP feature has become a victim of its own success. The current method to deploy BGP on Cilium is through a single CustomResourceDefinition, CiliumBGPPeeringPolicy, which comes with a couple of drawbacks: 

  • We need to explicitly enumerate per-neighbor settings, even when they match between multiple peers
  • All peers currently get the same advertisements (there is no control through concepts such as prefix lists)

Simply put, while the CRD worked in simple topologies, sophisticated networking topologies require a better set of abstractions, similar to what most popular BGP daemons offer (e.g., peer templates, route maps, etc.). 

To provide users with the flexibility they need, Cilium 1.16 is introducing a new set of APIs – the BGPv2 APIs.

Instead of a single CRD, you will be able to use a new set of CRDs to define complex network policies and configurations, making management more modular and scalable within Cilium.

  • CiliumBGPClusterConfig: Defines BGP instances and peer configurations applied to multiple nodes.
  • CiliumBGPPeerConfig: A common set of BGP peering settings that can be used across multiple peers.
  • CiliumBGPAdvertisement: Defines prefixes that are injected into the BGP routing table.
  • CiliumBGPNodeConfigOverride: Defines node-specific BGP configuration to provide a finer control.
Cilium - BGPv2 Architecture

Here is a sample BGP configuration – you can find it in the containerlab examples in this repo.

apiVersion: cilium.io/v2alpha1
kind: CiliumBGPClusterConfig
metadata:
  name: cilium-bgp
spec:
  nodeSelector:
    matchLabels:
      bgp: "65001"
  bgpInstances:
  - name: "65001"
    localASN: 65001
    peers:
    - name: "65000"
      peerASN: 65000
      peerAddress: fd00:10::1
      peerConfigRef:
        name: "cilium-peer"
    - name: "65011"
      peerASN: 65011
      peerAddress: fd00:11::1
      peerConfigRef:
        name: "cilium-peer"

---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeerConfig
metadata:
  name: cilium-peer
spec:
  authSecretRef: bgp-auth-secret
  gracefulRestart:
    enabled: true
    restartTimeSeconds: 15
  families:
    - afi: ipv4
      safi: unicast
      advertisements:
        matchLabels:
          advertise: "pod-cidr"
    - afi: ipv6
      safi: unicast
      advertisements:
        matchLabels:
          advertise: "pod-cidr"

---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPAdvertisement
metadata:
  name: pod-cidr-advert
  labels:
    advertise: pod-cidr
spec:
  advertisements:
    - advertisementType: "PodCIDR"
      attributes:
        communities:
          wellKnown: [ "no-export" ]

---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPNodeConfigOverride
metadata:
  name: bgpv2-cplane-dev-multi-homing-control-plane
spec:
  bgpInstances:
    - name: "65001"
      routerID: "1.2.3.4"

---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPNodeConfigOverride
metadata:
  name: bgpv2-cplane-dev-multi-homing-worker
spec:
  bgpInstances:
    - name: "65001"
      routerID: "5.6.7.8"

For those already running the existing BGPv1 feature: note that the v1 APIs will still be available in Cilium 1.16 but they will eventually be deprecated. Migration recommendations and tooling to help you move to v2 are on the roadmap.

We recommend you start any new BGP deployments with the new v2 APIs.

BGP ClusterIP Advertisement

In addition to the new BGP APIs, Cilium 1.16 introduces support for new service advertisements. In prior releases, Cilium BGP could already announce the PodCIDR prefixes (for various IPAM scenarios) and LoadBalancer IP services. In 1.16, ExternalIP and ClusterIP Services can also be advertised. 

The latter might seem like an anti-pattern: ClusterIP Services are designed for internal access only. But there were 2 reasons why this feature was requested:

  1. Many users are migrating from other CNIs to Cilium, and some CNIs already support ClusterIP advertisements. 
  2. ClusterIP Services automatically get a cluster-internal DNS record of the form <service>.<namespace>.svc.cluster.local. So, by synchronizing your Kubernetes Services upstream, you can access your services by name from outside the cluster.

In the demo below, we start before the BGP session is configured. You can see the deathstar Service's IP, its label, and the BGP configuration. Note how we now advertise ClusterIP Services, but only those with the empire label. We finish by checking that the BGP sessions have come up and that the backbone router can see the prefix.
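
With the BGPv2 APIs described above, such a Service advertisement might look like the sketch below (the label selector mirrors the empire label from the demo; the exact values are illustrative):

apiVersion: cilium.io/v2alpha1
kind: CiliumBGPAdvertisement
metadata:
  name: service-advert
  labels:
    advertise: services
spec:
  advertisements:
    - advertisementType: "Service"
      service:
        addresses:
          - ClusterIP          # ExternalIP and LoadBalancerIP can also be listed here
      selector:
        matchLabels:
          org: empire          # only advertise Services carrying this label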

Node IPAM Service LB

For users who cannot use BGP or Cilium’s L2 Announcement (a feature particularly appreciated in development environments and a great replacement for MetalLB), Cilium 1.16 is introducing another alternative to access services from outside the cluster: Node IPAM Service LB.

Similar to the ServiceLB feature available in the lightweight K3s distribution, Node IPAM Service LB assigns IP addresses from the nodes themselves to Kubernetes Services.

Enable the node IPAM feature with the nodeIPAM.enabled=true flag and make sure your Service has the loadBalancerClass set to io.cilium/node:

apiVersion: v1
kind: Service
metadata:
  name: my-loadbalancer-service
  namespace: default
spec:
  loadBalancerClass: io.cilium/node 
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080

Once the service is created, it will receive an IP address from the node itself. This option is likely to be useful in small environments.

$ kubectl get nodes -o wide
NAME                 STATUS   ROLES           AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION   CONTAINER-RUNTIME
kind-control-plane   Ready    control-plane   77m   v1.29.2   172.18.0.4    <none>        Debian GNU/Linux 12 (bookworm)   6.8.0-1008-gcp   containerd://1.7.13
kind-worker          Ready    <none>          77m   v1.29.2   172.18.0.3    <none>        Debian GNU/Linux 12 (bookworm)   6.8.0-1008-gcp   containerd://1.7.13
kind-worker2         Ready    <none>          77m   v1.29.2   172.18.0.2    <none>        Debian GNU/Linux 12 (bookworm)   6.8.0-1008-gcp   containerd://1.7.13

$ kubectl apply -f svc.yaml 
service/my-loadbalancer-service created

$ kubectl get svc my-loadbalancer-service 
NAME                      TYPE           CLUSTER-IP     EXTERNAL-IP             PORT(S)        AGE
my-loadbalancer-service   LoadBalancer   10.96.197.50   172.18.0.2,172.18.0.3   80:31453/TCP   14s
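
For completeness, enabling the feature at installation or upgrade time might look like the following sketch (release name, chart repository, and namespace are assumptions):

helm upgrade --install cilium cilium/cilium \
  --namespace kube-system \
  --reuse-values \
  --set nodeIPAM.enabled=true   # enable Node IPAM Service LB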

BFD for BGP

Status: Beta

Bidirectional Forwarding Detection (BFD) enhances the stability of BGP (Border Gateway Protocol) by providing rapid failure detection between upstream routers.

Integrated into Isovalent Enterprise for Cilium, BFD proactively monitors BGP sessions and detects connectivity issues on bidirectional links within milliseconds. When deployed in a platform, BFD minimizes downtime by enabling swift failovers, essential for high-availability and low-latency environments. This feature ensures that even minor network disruptions are detected early, helping maintain stable, resilient connectivity across clusters, which is crucial for applications with stringent uptime requirements.

Read our technical walkthrough of this new feature, and watch the below video from Nico.

Egress Gateway IPAM

Status: Limited

This update to the Egress Gateway in Isovalent Enterprise for Cilium gives users additional control over the IP addresses assigned to their Egress Gateway nodes, removing the complexity of manually targeting each Egress Gateway node with specific IP address management configuration.

The IPAM feature allows you to specify an IP pool in the IsovalentEgressGatewayPolicy from which Cilium leases egress IPs and assigns them to the selected egress interfaces.

The IP pool should be specified in the egressCIDRs field of the IsovalentEgressGatewayPolicy and may be composed of one or more CIDRs:

In the policy example below, you can see the new egressCIDRs field.

apiVersion: isovalent.com/v1
kind: IsovalentEgressGatewayPolicy
metadata:
  name: namespace-1-netshoot
spec:
  destinationCIDRs:
  - 0.0.0.0/0
  egressCIDRs:
  - 10.100.255.50/28
  - 10.100.255.100/28
  egressGroups:
  - nodeSelector:
      matchLabels:
        node.kubernetes.io/name: a-specific-node
  selectors:
  - podSelector:
      matchLabels:
        app.kubernetes.io/name: netshoot
        io.kubernetes.pod.namespace: namespace-1

The Cilium operator is in charge of allocating the egress IPs to the active egress gateways. Each egress gateway is assigned a different egress IP.

A successful allocation from the operator is reflected in the IsovalentEgressGatewayPolicy status.

status:
  conditions:
    - lastTransitionTime: "2024-09-17T16:06:49Z"
      message: allocation requests satisfied
      observedGeneration: 1
      reason: noreason
      status: "True"
      type: isovalent.com/IPAMRequestSatisfied
  groupStatuses:
  - activeGatewayIPs:
    - 172.22.0.5
    egressIPByGatewayIP:
      172.22.0.5: 10.100.255.50
    healthyGatewayIPs:
    - 172.22.0.5

Cilium Egress Gateway

Get hands-on and learn how Cilium’s Egress Gateway enables controlled, secure connectivity from Kubernetes pods to external workloads by assigning specific nodes for outbound traffic, addressing traditional firewall constraints in enterprise environments.

Start Lab

Per-pod fixed MAC address

Some applications require software licenses to be based on network interface MAC addresses.

With Cilium 1.16, you can set a specific MAC address for your pods, which should make licensing and reporting easier.

Simply add a specific annotation to your pod with the MAC address value:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    cni.cilium.io/mac-address: be:ef:ca:fe:d0:0d
  name: pod-with-fixed-mac-address
spec:
  containers:
  - name: netshoot
    image: nicolaka/netshoot:latest
    command: ["sleep", "infinite"]

For more information, check out the very short demo below:

Service mesh & Ingress/Gateway API

Gateway API GAMMA Support

The Cilium Service Mesh announcement back in 2021 had wide ramifications. It made our industry entirely reconsider the concept of a service mesh and reflect on the widely-accepted sidecar-based architecture. Why did we need a service mesh in the first place? Was it for traffic observability? To encrypt the traffic within our cluster? Ingress and L7 load-balancing? And do we really need a sidecar proxy in each one of our pods?

It turns out that Cilium could already do a lot of these things natively: network policies, encryption, observability, tracing. When Cilium added support for Ingress and Gateway API to manage traffic coming into the cluster (North-South), it further alleviated the need to install and manage additional third-party tools, simplifying the life of platform operators.

One of the remaining areas of improvement for Cilium’s Service Mesh capabilities was traffic management within the cluster: it was possible by customizing the onboard Envoy proxy, but it required advanced knowledge of the proxy.

With Cilium 1.16, Cilium Gateway API can now be used for sophisticated East-West traffic management – within the cluster – by leveraging the standard Kubernetes Gateway API GAMMA.

GAMMA stands for “Gateway API for Mesh Management and Administration”. It provides a consistent model for east-west traffic management for the cluster, such as path-based routing and load-balancing internally within the cluster.

Let’s review a GAMMA configuration:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: gamma-route
  namespace: gamma
spec:
  parentRefs:
  - group: ""
    kind: Service
    name: echo
  rules:
  - matches:
    - path:
        type: Exact
        value: /v1
    backendRefs:
    - name: echo-v1
      port: 80
  - matches:
    - path:
        type: Exact
        value: /v2
    backendRefs:
    - name: echo-v2
      port: 80

You will notice that, instead of attaching a route to a (North/South) Gateway as we have done so far when using Gateway API for traffic entering the cluster, we can now attach the route to a parent Service, called echo, using the parentRefs field.

Traffic bound to this parent service will be intercepted by Cilium and routed through the per-node Envoy proxy.

Note how traffic to the /v1 path is forwarded to the echo-v1 Service, and likewise for /v2. This is how we can, for example, perform A/B or blue/green testing for internal apps.
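
If you prefer weighted splits over path-based routing, Gateway API backendRefs also accept a weight field. A minimal sketch, reusing the echo Services from the example above (the weights are illustrative):

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: gamma-canary
  namespace: gamma
spec:
  parentRefs:
  - group: ""
    kind: Service
    name: echo
  rules:
  - backendRefs:
    - name: echo-v1
      port: 80
      weight: 90    # 90% of requests stay on v1
    - name: echo-v2
      port: 80
      weight: 10    # 10% canaried to v2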

To learn more, try the newly updated Advanced Gateway API lab, and check out the below video:

Gateway API enhancements

In addition to GAMMA support, Cilium 1.16’s Gateway API implementation has been boosted with multiple enhancements:

Gateway API 1.1 support:

In Gateway API 1.1, several features graduate to the Standard Channel (GA), notably including GAMMA support (mentioned above) and GRPCRoute (supported since Cilium 1.15). Features on the Standard release channel denote a high level of confidence in the API surface and come with guarantees of backward compatibility.

New protocol options support:

The Cilium 1.16 Gateway API now supports new protocol options:

  • proxyProtocol: Some load balancing solutions use the HAProxy PROXY protocol to pass source IP information along. With this new feature, Cilium Gateway API can support the PROXY protocol, providing another option to preserve the source IP (another one is highlighted below).
  • ALPN: Application-Layer Protocol Negotiation is a TLS extension required for HTTP/2. As gRPC is built on HTTP/2, when you enable TLS for gRPC you will also need ALPN to negotiate whether both client and server support HTTP/2.
  • appProtocol: Kubernetes 1.20 introduced appProtocol support for Kubernetes Services, enabling users to specify an application protocol for a particular port.
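
As a reference point, appProtocol is set per port on a standard Kubernetes Service. A minimal sketch for a gRPC backend (names are illustrative; kubernetes.io/h2c signals cleartext HTTP/2 to the proxy):

apiVersion: v1
kind: Service
metadata:
  name: grpc-backend
spec:
  selector:
    app: grpc-backend
  ports:
    - name: grpc
      port: 8080
      targetPort: 8080
      protocol: TCP
      appProtocol: kubernetes.io/h2c   # advertise HTTP/2 (cleartext) for this port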

Local ExternalTrafficPolicy support for Ingress/Gateway API:

When external clients access applications running in your cluster, it’s sometimes useful to preserve the original client source IP for various reasons such as observability and security. Kubernetes Services can be configured with the externalTrafficPolicy set to Local to ensure the client source IP is maintained.

In Cilium 1.16, the external traffic policy of Cilium-managed Ingress/Gateway API LoadBalancer Services can be configured globally via Helm flags or via a dedicated Ingress annotation.
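
Under the hood, this boils down to the externalTrafficPolicy field on the LoadBalancer Service that Cilium creates for the Gateway or Ingress. A sketch of the resulting Service (the Service name is illustrative, and the exact Helm flag and annotation names are best taken from the documentation):

apiVersion: v1
kind: Service
metadata:
  name: cilium-gateway-my-gateway   # generated for a Gateway; name is illustrative
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local      # only route to node-local endpoints, preserving the client source IP
  ports:
    - port: 443
      protocol: TCP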

Envoy enhancements

Every Cilium release brings improvements to its usage of Envoy, the lightweight cloud native proxy. Envoy has been a core component of Cilium’s architecture for years and has always been relied upon to provide Layer 7 functionalities to complement eBPF’s L3/L4 capabilities.

Cilium 1.16 introduces some subtle changes to Envoy’s use within Cilium:

Envoy as a DaemonSet is now the default option:

Introduced in Cilium 1.14, the Envoy DaemonSet deployment option is an alternative to embedding Envoy within the Cilium agent. It decouples Envoy from the Cilium agent, giving the two components independent lifecycles. In Cilium 1.16, Envoy as a DaemonSet is now the default for new installations.

Host Network mode & Envoy listeners on subset of nodes

Host network mode allows you to expose the Cilium Gateway API Gateway directly on the host network. This is useful in cases where a LoadBalancer Service is unavailable, such as in development environments or environments with cluster-external load balancers.

gatewayAPI:
  enabled: true
  hostNetwork:
    enabled: true

Alongside this feature, you can use a new option to only expose the Gateway/Ingress functionality on a subset of nodes, rather than on all of them.

gatewayAPI:
  enabled: true
  hostNetwork:
    enabled: true
    nodes:
      matchLabels:
        role: infra
        component: gateway-api

This will deploy the Gateway API Envoy listener only on the Cilium nodes matching the configured labels. An empty selector selects all nodes and continues to expose the functionality on all Cilium nodes. Note that both of these features are also available for the Cilium Ingress controller.

Security

Per-Flow Selective Encryption

Status: Beta

Isovalent Enterprise for Cilium 1.16 introduces a new security and performance enhancement: Selective Encryption! Hundreds of Cilium users already leverage Cilium’s native ability to encrypt traffic flows within a Kubernetes cluster or between clusters, for confidentiality and regulatory purposes. But Cilium Transparent Encryption has traditionally been all-or-nothing: it either encrypts all the traffic (which obviously comes with overhead) or none of it. There are, however, many users who only want to encrypt traffic for specific applications and rely on application-level encryption (HTTPS/TLS) where required.

Isovalent Enterprise for Cilium 1.16 - Per-Flow Encryption Diagram

With Selective Encryption, we’re introducing the ability to selectively encrypt specific traffic based on objects like namespaces and pod labels, using an Isovalent Encryption Policy that should look and feel like a Cilium Network Policy:

apiVersion: isovalent.com/v1alpha1
kind: IsovalentClusterwideEncryptionPolicy
metadata:
  name: encrypt-client-to-nginx
spec:
   namespaceSelector:
     matchLabels:
        kubernetes.io/metadata.name: default
   podSelector:
     matchLabels:
       kind: client
   peers:
   - podSelector:
       matchLabels:
         app: nginx
     namespaceSelector:
       matchLabels:
         kubernetes.io/metadata.name: default
     ports:
     - protocol: TCP
       port: 80

To learn more, watch the below video recorded by Nico, and follow along in our updated lab “Cilium Transparent Encryption with IPSec and WireGuard”.

Cilium Transparent Encryption with IPSec and WireGuard

This lab guides you through implementing transparent pod-to-pod encryption in Kubernetes using Cilium’s IPsec and WireGuard options, simplifying compliance and security by eliminating the need for complex Service Mesh or in-application encryption.

Start Lab

Port Range Support in Network Policies

Cilium 1.16 delivers a long-awaited feature: support for port ranges in network policies.

Before this, network policies would require you to list ports one by one in your network rule, even if they were contiguous.

The Port Range feature, announced in Kubernetes 1.21 and promoted to Stable in Kubernetes 1.25, lets you target a range of ports instead of a single port in a Network Policy, using the endPort keyword.

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "range-port-rule"
spec:
  description: "L3-L4 policy to restrict range of ports"
  endpointSelector:
    matchLabels:
      app: server
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: client
    toPorts:
    - ports:
      - port: "8080"
        endPort: 8082
        protocol: TCP

In the demo below, we start by verifying that a client can access 3 servers listening on ports 8080, 8081 and 8082. Access is successful until we deploy a Cilium Network Policy allowing only the 8080-8081 range, after which access to 8082 is blocked.

Access is successful again after expanding the range to 8080-8082.

This feature is available with Kubernetes Network Policies and the more advanced Cilium Network Policies.
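
For comparison, the same range expressed as a standard Kubernetes NetworkPolicy would look like this sketch (labels are illustrative):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: range-port-rule
spec:
  podSelector:
    matchLabels:
      app: server
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: client
    ports:
    - protocol: TCP
      port: 8080
      endPort: 8082   # allow the whole 8080-8082 range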

Network Policy Validation Status

Sometimes, Cilium cannot detect and alert when Network Policies are semantically incorrect until after deployment. 

The demo below shows that even though our network policy is missing a field, it appears to be accepted. The only way to find out that it was rejected is to check the verbose agent logs, which is not an ideal user experience.

Cilium 1.16 adds information about the network policy validation condition in the operator. This means that, as you can see in the demo, you can easily find the status of the policy – valid or invalid – by checking the object with kubectl describe cnp.

CIDRGroups support for Egress and Ingress/Egress Deny Rules

Another feature designed to simplify the creation and management of network policies: in Cilium 1.16, you can now use CiliumCIDRGroups in Egress and Ingress/Egress Deny policy rules. A CiliumCIDRGroup is a list of CIDRs that can be referenced as a single entity in Cilium Network Policies and Cilium Clusterwide Network Policies.

To consume this enhancement, first create a CIDR Group as per the YAML example below:

apiVersion: cilium.io/v2alpha1
kind: CiliumCIDRGroup
metadata:
  name: example-cidr-group
spec:
  externalCIDRs:
    - "172.19.0.1/32"
    - "192.168.0.0/24"

You can then reference the CiliumCIDRGroup by name within a policy, as in the Egress example below:

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "egress-example-cidr-group-ref-policy"
  namespace: "default"
spec:
  endpointSelector:
	matchLabels:
  	app: my-service
  egress:
  - toCIDRSet:
	- cidrGroupRef: "example-cidr-group"

Select Nodes as the Target of Cilium Network Policies

This feature gives Cilium Network Policies the ability to use node labels in their selector statements. Before the Cilium 1.16 release, users who needed to filter traffic from/to Kubernetes nodes in their cluster using Network Policies had to use either the “remote-node” entity or a CIDR-based policy. Both methods had their pitfalls, such as remote-node selecting all nodes in a cluster mesh configuration.

Before Cilium 1.16, to target nodes in a Cilium Network Policy, you would use a policy as the example below:

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "from-nodes"
spec:
  endpointSelector:
    matchLabels:
  	env: prod
  ingress:
  - fromEntities:
    - remote-node

With this new feature, nodes become selectable by their labels instead of via CIDRs and/or the remote-node entity. Configure the Helm value nodeSelectorLabels=true and use a policy such as the example below, which allows traffic from control-plane nodes to pods with the label env: prod:

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "to-prod-from-control-plane-nodes"
spec:
  endpointSelector:
	matchLabels:
  	env: prod
  ingress:
	- fromNodes:
    	- matchLabels:
        	node-role.kubernetes.io/control-plane: ""

The following agent flag can also be configured (for example, through the agent's extra arguments) to select which node labels are used to determine node identity.

--node-labels strings List of label prefixes used to determine identity of a node (used only when enable-node-selector-labels is enabled)
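
If you manage these settings through Helm, turning the feature on might look like the sketch below (release name and namespace are assumptions; the node-labels list, if needed, can be passed as an agent flag as shown above):

helm upgrade --install cilium cilium/cilium \
  --namespace kube-system \
  --reuse-values \
  --set nodeSelectorLabels=true   # allow node labels in fromNodes/toNodes selectors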

FIPS-compliant Cilium Images

In today’s regulatory landscape, industries such as finance, healthcare, and government must meet rigorous compliance standards to ensure data integrity and security. Federal Information Processing Standards (FIPS) compliance is especially critical, as it certifies that cryptographic algorithms meet stringent federal security requirements. 

With the release of Isovalent Enterprise for Cilium 1.16, customers with stringent compliance needs can now confidently integrate industry-leading security measures into their operations with our FIPS-compliant Cilium images.

If your organization requires robust, FIPS-compliant solutions to meet critical security standards, contact our team to learn how we can support your compliance journey.

Speak with an Isovalent Solution Architect Today

Get connected with one of our Cilium experts today and see Isovalent Cilium Enterprise in action.

Get in touch

Day 2 Operations and Scale

Introduction of Cilium Feature Gates

In Isovalent Enterprise for Cilium 1.16, we’re introducing feature gates to give operators more control over beta, limited, and unsupported features. This change enables teams to safely experiment with new functionality in a controlled manner, ensuring that only intended features are activated within their environment. While the current release provides warnings if feature gates aren’t configured, future versions will require them for Cilium to start.

Feature gates empower operators to manage feature availability more precisely, aligning with evolving security and reliability standards. 

Below is an example of the output created when trying to install a feature that is not stable with 1.16.

$ helm install foo install/kubernetes/cilium --set localRedirectPolicy=true
<snip>
WARNING: A Limited feature LocalRedirectPolicy was enabled, but it is not an approved feature.
Please contact Isovalent Support for more information on how to grant an exception.

Furthermore, this is also captured in the Cilium Agent logs.

level=warn msg="Unsupported feature(s) enabled: EnvoyDaemonSet (Limited), LocalRedirectPolicy (Limited). Please contact Isovalent Support for more information on how to grant an exception." module=enterprise-agent.features-agent

In the future Isovalent Enterprise for Cilium 1.17 release, the feature gates warning will block the installation of features that are not approved, with a bypass needed as part of your installation configuration.

Improved DNS-Based Network Policy Performance

One of the reasons Cilium Network Policies are so popular with cluster administrators is their ability to filter based on Fully Qualified Domain Names (FQDNs). 

DNS-aware Cilium Network Policies enable users to permit traffic to specific FQDNs, for example by using the toFQDNs selector to only allow traffic to my-remote-service.com. The feature is implemented with a DNS proxy that intercepts DNS traffic and records the IP addresses seen in the responses. 
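
As a reminder of what such a policy looks like, here is a minimal sketch (the labels and DNS rule are illustrative; the DNS rule is what lets the DNS proxy observe the lookups that back the toFQDNs entry):

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "allow-my-remote-service"
spec:
  endpointSelector:
    matchLabels:
      app: client
  egress:
  # Allow DNS lookups via kube-dns so the proxy can record the resolved IPs
  - toEndpoints:
    - matchLabels:
        k8s:io.kubernetes.pod.namespace: kube-system
        k8s-app: kube-dns
    toPorts:
    - ports:
      - port: "53"
        protocol: ANY
      rules:
        dns:
        - matchPattern: "*"
  # Only permit egress to this FQDN
  - toFQDNs:
    - matchName: "my-remote-service.com"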

DNS-Based Network policies are extremely useful when implementing API security; improving this feature’s performance ultimately offers a better developer and end-user experience. 

With this latest release, Cilium 1.16 significantly improves CPU and memory usage and, even more crucially, delivers up to a 5x reduction in tail latency.

The implementation of toFQDNs selectors in policies has been overhauled to improve performance when many different IPs are observed for a selector: instead of creating CIDR identities for each allowed IP, IPs observed in DNS lookups are now labelled with the toFQDNs selectors that match them. 

This significantly reduces tail latency for FQDNs with a highly dynamic set of IPs, such as content delivery networks and cloud object storage services. As you can see from the graphs below, these enhancements yield a 5x improvement in tail latency.

Cilium DNS Latency

Elegantly, upon upgrade or downgrade, Cilium will automatically migrate its internal state for toFQDNs policy entries.

That’s not the only performance improvement in Cilium 1.16. Let’s dive into another enhancement.

New ELF Loader Logic

Theoretically, we must recompile the BPF for an endpoint whenever its configuration changes. Compiling is a fairly expensive process, so a mechanism named “ELF templating / substitution” had been developed to avoid recompilation in the most common cases. This substitution process was, however, sub-optimal. In Cilium 1.16, it has been improved, resulting in noticeable memory gains: the median memory usage of Cilium decreased by 24%.

Hubble and Observability

A New Hubble Timescape Deployment Mode

Hubble Timescape delivers in-depth observability for Cilium by tracking application interactions, security events, and dependencies, enabling users to analyze, troubleshoot, and identify anomalies within Kubernetes clusters. Designed to scale, Timescape also supports extensive historical data retention for long-term analysis.

With Hubble Timescape 1.4, the new Timescape Lite deployment mode is introduced, offering a simplified, single-StatefulSet deployment without the need for additional storage or microservices, ideal for short-term flow visibility. Users can start with Timescape Lite and seamlessly upgrade to the full Timescape deployment as data retention needs grow.


Hubble UI - Microservices demo - Timescape

Configuration is deployable via Helm, including Lite-specific parameters for quick setup.

lite:
  enabled: true
  resources:
    requests:
      cpu: 300m
      memory: 256Mi
  clickhouse:
    resources:
      requests:
        cpu: 700m
        memory: 4Gi

Hubble Timescape now supports multi-cluster environments, enabling events to be stored across one or multiple object storage buckets, with the Timescape ingester capable of processing data from multiple sources. We recommend deploying a single Hubble Timescape instance within a dedicated monitoring cluster and configuring it to ingest data from multiple buckets or specified bucket prefixes. Multi-cluster mode is not supported for use with Hubble Timescape Lite.

Hubble UI Enterprise Improvements

Starting from version 1.4, Hubble UI Enterprise introduces two major features: multi-cluster support and the ability to create Cilium Network Policies with HTTP rules.

When using Hubble for visibility across your Kubernetes environment, previous versions required a Hubble UI instance per cluster. With multi-cluster support, Hubble UI Enterprise provides visualization of service maps and flow tables from multiple clusters when used in conjunction with Hubble Timescape. 

Within Hubble UI Enterprise, the former “Flows” tab has been renamed to “Connections” and now displays connection information at three different levels:

  • Cluster level: shows all clusters and how they connect (where applicable) with each other
  • Namespace level: shows all namespaces in a cluster and connections between them and external ones (host, other cluster, etc)
  • Service level: the original service-map view that Hubble UI is known for.
Hubble UI Enterprise - Multi-Cluster

The second feature update to Hubble UI Enterprise brings the creation of Network Policies with HTTP rules, as well as Mutual Authentication, to the Network Policy Editor, further expanding policy options for users.

With this addition, the Network Policy Editor now supports key elements of Cilium’s Layer 3 to Layer 7 policy capabilities, offering users enhanced control over HTTP traffic management and simplifying policy creation for common configurations.

When using the Network Policy Editor, clicking an endpoint in the service map provides the ability to create an HTTP rule, as can be seen in the screenshot below. Clicking Add an HTTP Rule expands the dialog box with additional inputs for the HTTP rules, aligned with the configurations available in the Cilium Network Policy CRD.

Hubble UI - Network Policy Editor - Add Layer 7 Rule

Once applied, the network policy editor will display an updated Cilium Network Policy that includes the HTTP rule, which can be uploaded to the Kubernetes environment. 

Hubble UI - Network Policy Editor - Add Layer 7 Rule - YAML Output

Rules created for intra-cluster communication between workloads can also be enabled for Mutual Authentication in the Network Policy Editor with a simple toggle. Below, in a policy that allows traffic between any workloads in the namespace that carry the label app=veducate, the Mutual Authentication toggle has been enabled. The resulting network policy will have authentication.mode=required present.
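
A policy of that shape might look like the sketch below (the namespace name is an assumption; the label matches the example above):

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "allow-veducate-mutual-auth"
  namespace: "veducate"
spec:
  endpointSelector:
    matchLabels:
      app: veducate
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: veducate
    authentication:
      mode: "required"   # enforce mutual authentication for these flows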

Additional recent enhancements to Hubble UI Enterprise include updated tooltips across configurable components, providing clear explanations of accepted data formats and details within dialog value boxes.

Hubble UI - Network Policy Editor - Tooltips

Filtering Hubble Flows by Node Labels

Hubble now captures the node labels for flows, allowing you to filter by particular nodes in your cluster. This can be helpful for Kubernetes deployments across availability zones and help to identify cross-availability zone traffic between workloads regardless of their source or destination namespaces, for example. 

With Hubble providing this level of visibility, platform owners can identify misconfigured services that allow cross-AZ traffic and drive up cloud provider costs. Typically, most deployments should prefer local traffic, with remote traffic as a fallback, meaning traffic stays within the same AZ and only crosses AZ boundaries when the local backends fail. 

$ hubble observe --pod default/curl-75fd79b7-gjrgg --node-label topology.kubernetes.io/zone=az-a  --not --namespace kube-system --print-node-name 
Jul 19 11:32:35.739 [kind-kind/kind-worker2]: default/curl-75fd79b7-gjrgg:58262 (ID:13886) -> default/nginx-57fdc5ff77-qpnqp:80 (ID:1359) to-endpoint FORWARDED (TCP Flags: ACK)
Jul 19 11:32:36.747 [kind-kind/kind-worker3]: default/curl-75fd79b7-gjrgg:56786 (ID:13886) -> default/nginx-57fdc5ff77-vncdm:80 (ID:1359) to-endpoint FORWARDED (TCP Flags: ACK)
Jul 19 11:32:36.747 [kind-kind/kind-worker]: default/curl-75fd79b7-gjrgg:56786 (ID:13886) -> default/nginx-57fdc5ff77-vncdm:80 (ID:1359) to-network FORWARDED (TCP Flags: ACK)

Below is a recording showing how to use these new filters to view cross-availability zone traffic. 

Improvements for egress traffic path observability

The Cilium Egress Gateway feature allows you to select defined exit routes on your network for your containers. This feature is particularly useful when traffic leaving your cluster transits via external traffic management devices. These devices need to understand the specific endpoints from which traffic originates. The Egress Gateway feature works by implementing deterministic source NAT for all traffic that traverses a gateway node, allocating a predictable IP to traffic coming from a particular Pod or a specific namespace.

In Cilium 1.16, several enhancements are implemented to aid better observability for traffic using Egress Gateway nodes. 

The first is the addition of metrics within the Cilium agent that track the number of allocated ports for each NAT connection tuple: {source_ip, endpoint_ip, endpoint_port}. These statistics help monitor the saturation of endpoint connections based on the allocation of source ports. 

$ cilium-dbg statedb nat-stats
IPFamily   Proto    EgressIP                RemoteAddr                   Count
ipv4       TCP      10.244.1.89             10.244.3.86:4240             1
ipv4       TCP      10.244.1.89             10.244.0.170:4240            1
ipv4       TCP      172.18.0.2              172.18.0.5:4240              1
ipv4       TCP      172.18.0.2              172.18.0.3:4240              1
ipv4       TCP      172.18.0.2              172.18.0.5:6443              4
ipv4       ICMP     172.18.0.2              172.18.0.5                   6
ipv4       ICMP     172.18.0.2              172.18.0.3                   6
ipv6       TCP      [fc00:c111::2]          [fc00:c111::5]:4240          1
ipv6       TCP      [fd00:10:244:1::e8ce]   [fd00:10:244:3::6860]:4240   1
ipv6       TCP      [fc00:c111::2]          [fc00:c111::3]:4240          1
ipv6       TCP      [fd00:10:244:1::e8ce]   [fd00:10:244::b991]:4240     1
ipv6       ICMPv6   [fc00:c111::2]          [fc00:c111::5]               6
ipv6       ICMPv6   [fc00:c111::2]          [fc00:c111::3]               6

The Cilium metric nat_endpoint_max_connection has also been implemented so that these statistics can be monitored in your alerting platform.
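
As an illustration, an alerting rule built on that metric might look like the following sketch (the exported metric name, threshold, and label set are assumptions; confirm them against the metrics reference for your version):

groups:
- name: egress-gateway
  rules:
  - alert: EgressGatewaySourcePortsNearExhaustion
    # Assumes the agent exports the metric with the usual cilium_ prefix
    expr: max(cilium_nat_endpoint_max_connection) > 50000
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Egress Gateway NAT source-port allocation is approaching exhaustion"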

Hubble flow data has also been updated with Egress Gateway traffic paths in mind. Earlier in this blog post, we covered the ability to capture and filter traffic flows based on node labels, so let’s look at two further new filters.

In the below Hubble flow output, the pod xwing is contacting an external device on IP address 172.18.0.7; this traffic is subject to address translation by the Egress Gateway. 

The new fields implemented are:

  • IP.source_xlated
  • node_labels
  • interface

Here is a JSON output of a flow recorded by Hubble:

{
  "flow": {
    "time": "2024-07-18T13:58:32.826611870Z",
    "uuid": "39a6a8f3-53cf-4dc8-8cb0-ce19c558228c",
    "verdict": "FORWARDED",
    "ethernet": {
      "source": "02:42:ac:12:00:06",
      "destination": "02:42:ac:12:00:07"
    },
    "IP": {
      "source": "10.244.3.136",
      "source_xlated": "172.18.0.42",
      "destination": "172.18.0.7",
      "ipVersion": "IPv4"
    },
    "l4": {
      "TCP": {
        "source_port": 60288,
        "destination_port": 8000,
        "flags": {
          "FIN": true,
          "ACK": true
        }
      }
    },
    "source": {
      "identity": 3706,
      "cluster_name": "kind-kind",
      "namespace": "default",
      "labels": [
        "k8s:class=xwing",
        "k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=default",
        "k8s:io.cilium.k8s.policy.cluster=kind-kind",
        "k8s:io.cilium.k8s.policy.serviceaccount=default",
        "k8s:io.kubernetes.pod.namespace=default",
        "k8s:org=alliance"
      ],
      "pod_name": "xwing"
    },
    "destination": {
      "identity": 2,
      "labels": [
        "reserved:world"
      ]
    },
    "Type": "L3_L4",
    "node_name": "kind-kind/kind-worker3",
    "node_labels": [
      "beta.kubernetes.io/arch=amd64",
      "beta.kubernetes.io/os=linux",
      "egress-gw=true",
      "kubernetes.io/arch=amd64",
      "kubernetes.io/hostname=kind-worker3",
      "kubernetes.io/os=linux"
    ],
    "event_type": {
      "type": 4,
      "sub_type": 11
    },
    "traffic_direction": "EGRESS",
    "trace_observation_point": "TO_NETWORK",
    "trace_reason": "ESTABLISHED",
    "is_reply": false,
    "interface": {
      "index": 13,
      "name": "eth0"
    },
    "Summary": "TCP Flags: ACK, FIN"
  },
  "node_name": "kind-kind/kind-worker3",
  "time": "2024-07-18T13:58:32.826611870Z"
}

These additional fields are complemented by updates to the Hubble CLI, which includes the following new arguments:

hubble observe
  --interface filter           Show all flows observed at the given interface name (e.g. eth0)
  --snat-ip filter             Show all flows SNATed to the given IP address. Each of the SNAT IPs can be specified as an exact match (e.g. '1.1.1.1') or as a CIDR range (e.g.'1.1.1.0/24').

In the example below, we filter all traffic using the specific node label in our environment to identify egress gateway nodes. The traffic has been translated to the IP address 172.18.0.42.

$ hubble observe --node-label egress-gw=true  --snat-ip 172.18.0.42
Jul 18 13:58:32.439: default/xwing:60262 (ID:3706) -> 172.18.0.7:8000 (world) to-network FORWARDED (TCP Flags: SYN)
Jul 18 13:58:32.439: default/xwing:60262 (ID:3706) -> 172.18.0.7:8000 (world) to-network FORWARDED (TCP Flags: ACK)
Jul 18 13:58:32.439: default/xwing:60262 (ID:3706) -> 172.18.0.7:8000 (world) to-network FORWARDED (TCP Flags: ACK, PSH)
Jul 18 13:58:32.440: default/xwing:60262 (ID:3706) -> 172.18.0.7:8000 (world) to-network FORWARDED (TCP Flags: ACK, FIN)

In the recording below, you can see these new features to extend Egress Gateway Observability in action. 

Core Isovalent Enterprise for Cilium features

Advanced Networking Capabilities

In addition to all the core networking features available in the open source edition of Cilium, Isovalent Enterprise for Cilium also includes advanced routing and connectivity features popular with large enterprises and telcos.

Platform Observability, Forensics, and Auditing

Isovalent Enterprise for Cilium includes advanced observability and auditing capabilities designed for platform teams. With Role-Based Access Control (RBAC), teams can provide users access to network data and dashboards specific to their applications and namespaces. 

The Hubble Enterprise UI features a Network Policy Editor that enables easy creation of network policies based on actual traffic. Hubble Timescape offers time-travel analytics for historical observability data, extending beyond real-time insights available in Hubble OSS. Additionally, logs can be exported to SIEM platforms like Splunk or ELK for broader security and monitoring needs.

For more on Hubble enterprise features, explore the Hubble for the Enterprise blog or labs like the Network Policies Lab and Connectivity Visibility Lab.

Enterprise-Grade Resilience

Isovalent Enterprise for Cilium includes capabilities for organizations that require the highest level of availability. These include features such as High Availability for DNS-aware network policy (video) and High Availability for the Cilium Egress Gateway (video), as well as FIPS-compliant images for those who need to meet the most stringent security compliance requirements.

Enterprise-Grade Support

Last but certainly not least, Isovalent Enterprise for Cilium includes enterprise-grade support from Isovalent’s experienced team of experts, ensuring that any issues are resolved promptly and efficiently. Customers also benefit from the help and training from professional services to deploy and manage Cilium in production environments.

Shortening time to value with Isovalent Enterprise for Cilium Support

In this brief you can learn why many leading Fortune 500 companies including Adobe, Goldman Sachs, IBM, JPMC, and Roche picked Isovalent to partner with them on their cloud native journey.

Download Brief

Conclusions

Since the previous enterprise release, many new end users have stepped forward to tell their stories of why and how they’re using Isovalent Enterprise for Cilium in production. These use cases cover multiple industries: software (Adobe), medical (Roche), and IT service and cloud providers (Schuberg Philis).

Want to learn more about this release and Isovalent Enterprise platform? Join our webinar with Thomas Graf, creator of Cilium, and Nico Vibert, Senior Staff Technical Marketing Engineer to learn about the latest innovations in the leading Cloud Native Networking Platform Cilium and Isovalent’s Enterprise edition.

Cilium and Isovalent Enterprise for Cilium 1.16 Release Webinar

In this webinar, join Thomas Graf, founder of the Cilium project, and Nico Vibert, Senior Staff Technical Marketing Engineer to learn about the latest innovations in the leading Cloud Native Networking Platform Cilium and Isovalent’s Enterprise edition.

Register Now

To learn more about Isovalent Enterprise for Cilium 1.16, check out the following links:

Feature Status

Here is a brief definition of the Isovalent Enterprise for Cilium feature maturity levels used in this blog post:

  • Stable: A feature that is appropriate for production use in a variety of supported configurations due to significant hardening from testing and use.
  • Limited: A feature that is appropriate for production use only in specific scenarios and in close consultation with the Isovalent team.
  • Beta: A feature that is not appropriate for production use, but where user testing and feedback is requested. Customers should contact Isovalent support before considering Beta features.

Previous Releases

Find earlier release announcements on our blog.

Dean Lewis, Senior Technical Marketing Engineer
