Monitoring

Monitor Kyverno policy metrics with Prometheus

Introduction

As a cluster administrator, you benefit from being able to monitor the state and execution of the Kyverno policies applied to your cluster. Tracking the applied policies, the changes made to them, the incoming requests processed, and the policy results is useful for cluster observability and compliance.

In addition, flexible monitoring targets, from the rule level or policy level up to the entire cluster, give you options for extracting insights from the collected metrics.

Installation and Setup

When you install Kyverno via Helm, a service called kyverno-svc-metrics is created in the kyverno namespace. This service exposes metrics on port 8000.

    # values.yaml

    ...
    metricsService:
      create: true
      type: ClusterIP
      ## Kyverno's metrics server will be exposed at this port
      port: 8000
      ## The Node's port which will allow access to Kyverno's metrics at the host level. Only used if service.type is NodePort.
      nodePort:
      ## Provide any additional annotations which may be required. This can be used to
      ## set the LoadBalancer service type to internal only.
      ## ref: https://kubernetes.io/docs/concepts/services-networking/service/#internal-load-balancer
      ##
      annotations: {}
    ...

By default, the service type is ClusterIP, which means the metrics can only be scraped by a Prometheus server running inside the cluster.
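For the default ClusterIP case, an in-cluster Prometheus server could scrape the service through a static target. The fragment below is a minimal sketch rather than an official configuration: the job name `kyverno` and the 30-second interval are assumptions chosen for illustration, the target uses the standard `<service>.<namespace>.svc` cluster DNS form, and it relies on Prometheus's default `/metrics` path, which you should confirm against your Kyverno version.

```yaml
# prometheus.yml (fragment): illustrative sketch, not an official Kyverno config
scrape_configs:
  - job_name: kyverno            # hypothetical job name
    scrape_interval: 30s         # assumed interval; tune to your needs
    static_configs:
      - targets:
          # Service name, namespace, and port are the Helm chart defaults described above
          - kyverno-svc-metrics.kyverno.svc:8000
```

A static target keeps the example short. Clusters that already use Kubernetes service discovery in Prometheus may prefer that mechanism instead.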

In many cases, the Prometheus server runs outside the workload cluster as a shared service. In those scenarios, you will want to expose the kyverno-svc-metrics service publicly so that the metrics (available on port 8000) can reach your Prometheus server outside the cluster.

Services can be exposed to external clients via an Ingress, or using LoadBalancer or NodePort service types.

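To expose your kyverno-svc-metrics service publicly as a NodePort service on node port 8000, configure your values.yaml before the Helm installation as follows. Note that Kubernetes only accepts node ports within the API server's configured range (30000-32767 by default), so port 8000 works only if that range has been extended with the kube-apiserver flag --service-node-port-range; otherwise, pick a port inside the allowed range.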

    ...
    metricsService:
      create: true
      type: NodePort
      ## Kyverno's metrics server will be exposed at this port
      port: 8000
      ## The Node's port which will allow access to Kyverno's metrics at the host level. Only used if service.type is NodePort.
      nodePort: 8000
      ## Provide any additional annotations which may be required. This can be used to
      ## set the LoadBalancer service type to internal only.
      ## ref: https://kubernetes.io/docs/concepts/services-networking/service/#internal-load-balancer
      ##
      annotations: {}
    ...

To expose the kyverno-svc-metrics service using a LoadBalancer type, you can configure your values.yaml before Helm installation as follows:

    ...
    metricsService:
      create: true
      type: LoadBalancer
      ## Kyverno's metrics server will be exposed at this port
      port: 8000
      ## The Node's port which will allow access to Kyverno's metrics at the host level. Only used if service.type is NodePort.
      nodePort:
      ## Provide any additional annotations which may be required. This can be used to
      ## set the LoadBalancer service type to internal only.
      ## ref: https://kubernetes.io/docs/concepts/services-networking/service/#internal-load-balancer
      ##
      annotations: {}
    ...
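Once the service is reachable from outside the cluster, whether through NodePort or LoadBalancer, an external Prometheus server could be pointed at it roughly as follows. This is a sketch under stated assumptions: the host name `kyverno-metrics.example.com`, the job name, and the `cluster` label are all hypothetical placeholders, and the port must match whatever you configured in values.yaml.

```yaml
# prometheus.yml (fragment) for a Prometheus server outside the cluster: illustrative only
scrape_configs:
  - job_name: kyverno-external   # hypothetical job name
    static_configs:
      - targets:
          # Placeholder address: use a node address for NodePort,
          # or the load balancer address for LoadBalancer
          - kyverno-metrics.example.com:8000
        labels:
          cluster: workload-1    # hypothetical label to tell clusters apart
```

Attaching a per-cluster label is one way to keep metrics from several workload clusters distinguishable in a shared Prometheus server.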

Metrics and Dashboard


Policies and Rule Counts

This metric tracks the number of policies and rules present in the cluster, covering both those that are currently active and those that were created in the past but are no longer active.

Policy and Rule Execution

This metric tracks the results of rules executed as part of incoming resource requests and background scans. It can be aggregated further to track policy-level results.

Policy Rule Execution Latency

This metric tracks the latency of executing individual rules whenever they evaluate incoming resource requests or run background scans. It can be aggregated further to present policy-level latencies.

Admission Review Latency

This metric tracks the end-to-end latency of each admission review, that is, of an incoming resource request that triggers a set of policies and rules.

Admission Requests Counts

This metric tracks the number of admission requests that Kyverno handled.

Policy Change Counts

This metric tracks the history of all changes to Kyverno policies, such as policy creations, updates, and deletions.
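As a hedged illustration of how counters such as the admission request and policy change counts might be consumed, the Prometheus rule file below precomputes two aggregate series. The metric names `kyverno_admission_requests_total` and `kyverno_policy_changes_total` are assumptions, since this page does not name the metrics and names have varied across Kyverno versions; check the output of the metrics endpoint before relying on them. The group and recorded series names are hypothetical.

```yaml
# kyverno-rules.yml: illustrative Prometheus recording rules (metric names are assumptions)
groups:
  - name: kyverno-example        # hypothetical group name
    rules:
      # Per-second rate of admission requests, averaged over the last 5 minutes
      - record: kyverno:admission_requests:rate5m
        expr: sum(rate(kyverno_admission_requests_total[5m]))
      # Approximate number of policy changes observed over the last hour
      - record: kyverno:policy_changes:increase1h
        expr: sum(increase(kyverno_policy_changes_total[1h]))
```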

Grafana Dashboard

A ready-to-use dashboard for Kyverno metrics.

Last modified July 20, 2021 at 10:16 AM PST: add arch and install diagrams and shorten headings (5f8f959)