
Cilium Hubble Series (Part 2): Hubble for the Enterprise

Dean Lewis

In part 1 of this series re-introducing Cilium Hubble to the cloud-native world, we answered many questions about Hubble – What is it? Who uses it? What are the main use cases for it?

We focused on the widely adopted open-source version of Hubble. Let’s now review some of the scenarios that enterprise users might encounter, such as role-based access control, troubleshooting with historical data, and integrations with other systems, and how Isovalent Enterprise for Cilium addresses them.

As always, this blog post includes plenty of demos and follow-up labs to take if you’d like to learn more!

Let’s get started.

How do I easily improve the security posture of my cluster with Network Policies?

One of the primary use cases for Cilium remains its support for Kubernetes Network Policies and the more granular L7-aware Cilium Network Policies. This usually comes up early on the “path to production”, when enterprises move their use of Kubernetes from a test to a production level.

Having a cluster with no internal traffic restrictions is certainly not something you’d want for critical workloads.

The challenge, as we also discussed in the Zero Trust Security Journey blog post, is no longer convincing users that Zero Trust is required (it’s widely understood that a perimeter-only approach is of little use). The real barrier to adoption is the creation of the network policies.

The Enterprise edition of Hubble is there to help, with a built-in Network Policy editor that allows you to build new policies from the live view of your environment. The Network Policy editor shows you all the flows captured for a given namespace and allows you to dynamically create a policy that permits or denies the traffic you are interested in, producing a YAML manifest that can be applied to your cluster.

The Network Policy editor is intuitive by design. You can simply select an existing flow of traffic to add it to a policy, or you can start to build out your policies by specifying the selectors and traffic directions.
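To give a feel for the kind of manifest such a workflow produces, here is a minimal sketch of a standard Kubernetes Network Policy. It is not output from the editor: the policy name, the `demo` namespace, the `app` labels, and port 8080 are all assumptions chosen for illustration.

```yaml
# Hypothetical example: name, namespace, labels, and port are assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: demo
spec:
  # Selector: the pods this policy applies to.
  podSelector:
    matchLabels:
      app: backend
  # Declaring Ingress means ingress traffic not matched below is denied
  # for the selected pods.
  policyTypes:
    - Ingress
  ingress:
    # Permit traffic from frontend pods in the same namespace on TCP 8080.
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```

Selecting a frontend-to-backend flow in the live view corresponds to the `from` and `ports` entries here, while the selector and traffic direction correspond to `podSelector` and `policyTypes`.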

Take a look at the video below to see this feature in action, then run through our Network Policies and Zero Trust Security labs to use the feature in a live environment.

This feature covers both the standard Kubernetes Network Policies and the more powerful Cilium Network Policies (which add the ability to create rulesets for layer 7 network traffic). Migrating to a zero trust model becomes much easier when you can build network policies from live network flows, set your default ingress/egress rules, and use selectors based on Kubernetes labels to identify which endpoints each rule applies to.
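For comparison, a Cilium Network Policy can narrow the same kind of rule down to layer 7. The sketch below is illustrative only: the policy name, namespace, labels, port, and HTTP path are assumptions, not values taken from the demo.

```yaml
# Hypothetical example: name, namespace, labels, port, and path are assumptions.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-frontend-get-api
  namespace: demo
spec:
  # Endpoints the policy applies to, identified by Kubernetes labels.
  endpointSelector:
    matchLabels:
      app: backend
  ingress:
    # Only endpoints labelled app=frontend may connect.
    - fromEndpoints:
        - matchLabels:
            app: frontend
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          # Layer 7 rule: only HTTP GET requests under /api/ are allowed.
          rules:
            http:
              - method: GET
                path: "/api/.*"
```

The `rules.http` section is what the standard Kubernetes Network Policy cannot express: the connection is allowed only for matching HTTP requests rather than for any traffic on the port.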

Isovalent Cilium Enterprise: Network Policies

In this hands-on demo we will walk through some of those challenges and their solutions.


Zero Trust Visibility

In this lab, you will use Hubble metrics to build a Network Policy Verdict dashboard in Grafana showing which flows need to be allowed in your policy approach.


I need to delegate network troubleshooting to application developers

In our conversations with Isovalent Enterprise for Cilium customers such as VSHN or TietoEvry, we heard a recurring goal – reducing the overall burden on the SRE team.

SRE teams tend to be relatively small. Sometimes a single SRE might support dozens, if not hundreds, of developers. This only works if the SRE can hand over some of the troubleshooting and operational tasks to the developers themselves.

And it’s not just that SREs are extremely busy – it’s also because app developers don’t want to have to raise a support ticket to investigate a connectivity issue.

While the open-source Hubble UI provides a namespace-based view of network connectivity, there is no way to restrict the view or the privileges of the Hubble UI user.


In the Enterprise edition, multi-tenant self-service access is possible through the OpenID Connect (OIDC) integration, which allows you to integrate with your existing identity and authorization platforms such as Okta, Auth0 and others.

Hubble Enterprise uses policy-based authorization settings, which allow you to control which resources and data (such as flows and metrics) can be accessed, and by whom in your organization. See the example policy code below.

rbac:
  enabled: true
  observerProxy:
    authMode: oidc
    oidcURL: https://{oidcURL}
...
    oidcClientID: {oidcClientID}
    jwtScopeField: {jwtScopeField}
  policy:
    mode: configMap
    configMap:
      bindings:
      - role: admin
        scope: email
        value: 'admin@example.com'
      - role: otel-demo-dev
        scope: email
        value: 'otel-demo-dev@example.com'
      roles:
      # admins can get flows from all namespaces.
      - name: admin
        rules:
          - actions:
              - "*"
            kind: "*"
            allowAllContexts: true
      # otel-demo-dev team can get flows and metrics from the otel-demo namespace, but no other namespace.
      - name: otel-demo-dev
        rules:
          - actions:
              - get
            contexts:
              - field: namespace
                values:
                  - otel-demo
            kind: flows
          - actions:
              - get
            contexts:
              - field: namespace
                values:
                  - otel-demo
            kind: metrics

This feature extends throughout all the components of the Hubble architecture, including the UI and Timescape.

The short demo below covers the Hubble Enterprise UI, although the Hubble Enterprise CLI offers the same RBAC features. In the demo:

  • The platform administrator can access all namespaces and views.
  • The “Otel-Demo” developer logs in and can access their namespace’s service map, but does not see any other namespaces.
  • The “Tenant-Jobs” developer logs in and can access their namespace’s service map (and we can see the application is totally different).
  • The demo finishes with an overview of the RBAC policy configuration applied via Helm.
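The example policy earlier in this section only defines the admin and otel-demo-dev roles. As a rough sketch, the “Tenant-Jobs” developer could be given equivalent access by adding a binding and a role that mirror the otel-demo-dev entries. The role name, the email address, and the `tenant-jobs` namespace name below are assumptions for illustration; the keys simply reuse the structure of the example policy.

```yaml
# Hypothetical fragment: merge these entries into the existing
# bindings and roles lists. Role name, email, and namespace are assumptions.
rbac:
  policy:
    configMap:
      bindings:
      - role: tenant-jobs-dev
        scope: email
        value: 'tenant-jobs-dev@example.com'
      roles:
      # tenant-jobs-dev team can get flows from the tenant-jobs namespace only.
      - name: tenant-jobs-dev
        rules:
          - actions:
              - get
            contexts:
              - field: namespace
                values:
                  - tenant-jobs
            kind: flows
```

Keeping one role per team and namespace makes it easy to see at a glance which group can read which flows.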

You can see a longer deep dive video on our YouTube Channel.

Taking ownership of troubleshooting not only empowers application owners but also promotes a healthy and productive work environment. It is vital for application owners to actively engage in troubleshooting their own applications to foster a culture of collaboration and eliminate the blame game during post-mortems.

We’ve had an application outage and need to find the root cause

On the subject of post-mortems, there will come a time when something goes wrong and a root cause analysis of the incident is required.

This is an integral part of the roles of both SREs and application owners. It enables them to proactively address incidents, improve system reliability, drive continuous improvement, foster collaboration, and enhance their overall understanding of the system.

As useful as the Hubble Relay component is (it provides multi-node support across the Hubble peers), it only provides real-time information, which is of limited use hours, days or weeks after the incident.

Hubble Timescape can take you back in time to the moment when the app started misbehaving, giving you full visibility into the network flow lifecycle for your application and associated services.

The video below gives a quick overview of how Timescape furthers your ability to troubleshoot. You can find a longer deep dive video on our YouTube Channel.

Let’s dive into the architecture of Hubble Timescape. It can be deployed either into the same cluster it monitors or, for larger environments, into a separate Kubernetes cluster.

Hubble Enterprise is configured to export the data into S3-compatible object storage or public cloud storage such as Google Cloud Storage and Azure Blob Storage.

Hubble Timescape is built on top of ClickHouse, an OSS columnar database, which can be deployed in-cluster or as a database external to the Kubernetes cluster, e.g. ClickHouse Cloud. The Hubble Timescape Trimmer is an optional component that enforces a predefined limit on the number of flows ingested into the database, independent of any time-based limits.

Hubble Timescape deploys an ingester to load the Hubble flows and Tetragon process events into a ClickHouse database. The Hubble Timescape server implements and serves the gRPC API, and is accessed via the Hubble UI or Hubble CLI.

You can now get hands-on with Hubble Timescape in our recently updated Isovalent Cilium Enterprise: Connectivity Visibility Lab, consuming both the Hubble UI and Hubble CLI to troubleshoot using both live flows and historical flows from Timescape.

Isovalent Cilium Enterprise: Connectivity Visibility

This lab provides an introduction to Isovalent Cilium Enterprise capabilities related to connectivity observability using Hubble Enterprise.


How can I take this meaningful data and visualize it in my existing monitoring platform?

We know that tool sprawl is a top concern for users, and it is made worse when your different tools do not interoperate with one another. Last year, we announced a strategic partnership with Grafana Labs, with the goal of providing infrastructure and developer teams with deep insights into the connectivity, security, and performance of their applications. As part of this partnership we developed the Hubble data source plugin (currently in beta in Grafana) to help monitor network and security events.

This plugin was created using the Grafana plugin development tools and integrates with three underlying data stores: Hubble Timescape, Prometheus (storing Hubble networking metrics), and Grafana Tempo (storing traces that can be correlated with different signals).
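For the Prometheus part of this picture, Hubble first has to expose its metrics. The following is a minimal sketch of Helm values, assuming the open-source Cilium Helm chart. The list of metrics and the label context are illustrative choices, and the key names should be checked against the chart version you run.

```yaml
# Sketch of Cilium Helm values, assuming the open-source chart.
# The metrics selection and label context are illustrative assumptions.
hubble:
  enabled: true
  metrics:
    # OpenMetrics format is needed for exemplars, which link metrics to traces.
    enableOpenMetrics: true
    enabled:
      - dns
      - drop
      - tcp
      - flow
      # HTTP metrics with exemplars and workload-level labels.
      - "httpV2:exemplars=true;labelsContext=source_namespace,source_workload,destination_namespace,destination_workload,traffic_direction"
```

Prometheus can then scrape these metrics, and the exemplars on the HTTP metrics are what allow a dashboard to jump from a metric to a related trace in Grafana Tempo.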

With this plugin, Hubble data can be widely adopted within your business through a common interface, helping to reduce tool sprawl between platform and application teams. In the screenshot below, you can see a representation of our application’s service map and HTTP metrics from Hubble.

As a small preview of the next Grafana plugin release, we will also be adding a Tetragon dashboard that provides a process ancestry view in Grafana, similar to the process view capability in the Hubble Enterprise UI.

In the next blog post we will dive further into the Hubble Enterprise and Grafana features, but if you are itching to take an early look at the Grafana integration, you can get started with the following resources: Anna’s guest blog post over on the Grafana website, or the Golden Signals with Hubble and Grafana lab.

In this lab you will learn how Cilium can provide metrics for an existing application with and without tracing functionality, and how you can use Grafana dashboards provided by Cilium to gain insight into how your application is behaving.

Golden Signals with Hubble and Grafana

Learn how to monitor the four Golden Signals with Cilium, Hubble & Grafana.


Where can I learn more?

If you haven’t already checked out the Part 1 blog post of this series, I urge you to head over there now! You can get started with Hubble today: head over to the official documentation and follow the steps to get Hubble installed in a few minutes once Cilium is up and running.

In the next blog post, Part 3, we will take a deep dive into the new Grafana Hubble data source plugin and the use cases that will enhance Kubernetes troubleshooting for your platforms.

Alternatively, we have a number of labs that show the benefits of Hubble’s observability and troubleshooting features alongside features of Cilium, such as the “Observability with Hubble and Cilium Service Mesh” lab.

Observability with Hubble and Cilium Service Mesh

Cilium and Hubble provide observability for Service Mesh, without the overhead of sidecars. Start the lab to find out how.

Author: Dean Lewis, Senior Technical Marketing Engineer
