
Monitoring metrics

This page explains how to enable Prometheus metrics for the Calico components and how to expose them so that Prometheus can scrape them.

Enable component metrics

When deploying with kubespray, Felix metrics are controlled by the calico_felix_prometheusmetricsenabled parameter, which defaults to false. You can also enable the metrics manually as follows:

  1. Enable Felix metrics (calico_felix_prometheusmetricsenabled). Felix serves them on port 9091 by default:

    calicoctl patch felixconfiguration default --patch '{"spec":{"prometheusMetricsEnabled": true}}'
    

    or

    kubectl patch felixconfiguration default --type merge --patch '{"spec":{"prometheusMetricsEnabled": true}}'
    
  2. Enable kube-controllers metrics (calico_kube_controller_metrics) on port 9095:

    calicoctl patch kubecontrollersconfiguration default --patch '{"spec":{"prometheusMetricsPort": 9095}}'
    

    or

    kubectl patch kubecontrollersconfiguration default --type=merge --patch '{"spec":{"prometheusMetricsPort": 9095}}'
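
To confirm that both patches took effect, the patched fields can be read back. The following is a minimal sketch, assuming kubectl can reach the cluster and the Calico CRDs are installed; the function names are illustrative and not part of Calico.

```shell
# Print the current value of spec.prometheusMetricsEnabled (expected: true).
felix_metrics_enabled() {
  kubectl get felixconfiguration default \
    -o jsonpath='{.spec.prometheusMetricsEnabled}'
}

# Print the current value of spec.prometheusMetricsPort (expected: 9095).
kube_controllers_metrics_port() {
  kubectl get kubecontrollersconfiguration default \
    -o jsonpath='{.spec.prometheusMetricsPort}'
}
```

Calling felix_metrics_enabled and kube_controllers_metrics_port after patching should print true and 9095 respectively; empty output suggests the field is not set.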
    

Create a metrics Service for each component

calico-node-metrics Service:

apiVersion: v1
kind: Service
metadata:
  name: calico-node-metrics
  namespace: kube-system
  labels:
    app: calico-node
    role: metrics
spec:
  clusterIP: None
  selector:
    k8s-app: calico-node
  ports:
  - port: 9091
    name: metrics
    targetPort: 9091

calico-kube-controllers-metrics Service:

apiVersion: v1
kind: Service
metadata:
  name: calico-kube-controllers-metrics
  namespace: kube-system
  labels:
    app: calico-kube-controllers
    role: metrics
spec:
  clusterIP: None
  selector:
    k8s-app: calico-kube-controllers
  ports:
  - port: 9095
    name: metrics
    targetPort: 9095
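
Once both Services exist, it can help to check that each one actually selects pods and that the pods answer on the metrics port. The sketch below assumes kubectl and curl are available and that the endpoint IPs are reachable from where it runs (for example, from a cluster node); the function names are illustrative.

```shell
# Print the endpoint IPs behind a metrics Service, separated by spaces.
# $1: Service name, e.g. calico-node-metrics
service_endpoint_ips() {
  kubectl -n kube-system get endpoints "$1" \
    -o jsonpath='{.subsets[*].addresses[*].ip}'
}

# For every endpoint of a Service, fetch /metrics and print
# "<ip> <number of non-comment lines>".
# $1: Service name, $2: metrics port
count_samples() {
  for ip in $(service_endpoint_ips "$1"); do
    n=$(curl -s "http://${ip}:$2/metrics" | grep -vc '^#')
    echo "${ip} ${n}"
  done
}
```

For example, count_samples calico-node-metrics 9091 should print one line per node, and count_samples calico-kube-controllers-metrics 9095 one line for the controller pod. A count of 0 suggests that metrics are not enabled on that endpoint.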

Create the ServiceMonitor objects

calico-node ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: calico-node
  namespace: kube-system
  labels:
    app: calico-node
    operator.insight.io/managed-by: insight
spec:
  endpoints:
  - interval: 30s
    port: metrics
  selector:
    matchLabels:
      app: calico-node
      role: metrics

calico-kube-controller ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: calico-kube-controller
  namespace: kube-system
  labels:
    app: calico-kube-controller
    operator.insight.io/managed-by: insight
spec:
  endpoints:
  - interval: 30s
    port: metrics
  selector:
    matchLabels:
      app: calico-kube-controllers
      role: metrics
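
A ServiceMonitor without a namespaceSelector only picks up Services in its own namespace whose labels match spec.selector.matchLabels. As a quick sanity check, the same label selector can be run against the Services directly. This is a sketch that assumes kubectl access; selected_services is an illustrative helper name.

```shell
# List the Services in kube-system that carry the labels used in the
# ServiceMonitor selectors above.
# $1: value of the "app" label, e.g. calico-node
selected_services() {
  kubectl -n kube-system get service \
    -l "app=$1,role=metrics" -o name
}
```

Running selected_services calico-node and selected_services calico-kube-controllers should each list the matching metrics Service. Empty output means the ServiceMonitor has nothing to scrape.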

List of important metrics

| No. | Metric | Description | Importance | Collection advice | Associated issues |
|-----|--------|-------------|------------|-------------------|-------------------|
| 1 | felix_ipset_errors | Number of failed ipset-restore executions | ** | Consider collecting | An increase does not necessarily indicate a problem |
| 2 | felix_iptables_restore_calls | Number of iptables-restore executions | ***** | Collect | NA |
| 3 | felix_iptables_restore_errors | Number of iptables-restore failures | ***** | Collect | A restore failure may make Pods unreachable. It can be caused by losing the competition for the xtables lock; check whether the host has too many iptables rules |
| 4 | felix_iptables_save_calls | Number of iptables-save executions | **** | Consider collecting | NA |
| 5 | felix_iptables_save_errors | Number of failed iptables-save executions | ***** | Collect | |
| 6 | felix_log_errors | Number of errors reported in the log | ***** | Recommended | |
| 7 | ipam_allocations_per_node | Number of IP allocations on each node | **** | Suggested | |
| 8 | ipam_blocks_per_node | Number of IPAM blocks allocated on each node | ***** | Collect | |
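
To inspect only the metrics from the table above in a raw scrape, the Prometheus text output can be filtered by metric name. The following is a sketch; the variable CALICO_KEY_METRICS and the function filter_key_metrics are illustrative names, and the metric list is copied from the table.

```shell
# Metric names from the table above, separated by single spaces.
CALICO_KEY_METRICS="felix_ipset_errors felix_iptables_restore_calls \
felix_iptables_restore_errors felix_iptables_save_calls \
felix_iptables_save_errors felix_log_errors \
ipam_allocations_per_node ipam_blocks_per_node"

# Read Prometheus text-format metrics on stdin and print only the samples
# whose metric name appears in CALICO_KEY_METRICS.
filter_key_metrics() {
  awk -v names="$CALICO_KEY_METRICS" '
    BEGIN { n = split(names, a, " "); for (i = 1; i <= n; i++) want[a[i]] = 1 }
    /^#/  { next }                      # skip HELP and TYPE comment lines
    {
      name = $1
      sub(/[{].*/, "", name)            # drop the label set, if any
      if (name in want) print
    }
  '
}
```

A typical use is to pipe a scrape into it, for example curl -s http://<node-ip>:9091/metrics | filter_key_metrics, where <node-ip> is a placeholder for a node running calico-node.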
