Probe¶

A probe uses black-box monitoring to periodically test the connectivity of targets over HTTP, TCP, and other protocols, so that ongoing faults are detected quickly.

Insight uses the Prometheus Blackbox Exporter to probe targets over HTTP, HTTPS, DNS, TCP, and ICMP, and reports the probe results so that you can assess the network status.

Prerequisites¶

The insight-agent has been successfully deployed in the target cluster and is in the Running state.

View Probes¶

Go to the Insight product module.
Select Infrastructure -> Probes in the left navigation bar.
- Click the cluster or namespace dropdown in the table to switch between clusters and namespaces.
- The list displays the name, probe method, probe target, connectivity status, and creation time of the probes by default.
- The connectivity status can be:
  - Normal: The probe successfully connects to the target, and the target returns the expected response.
  - Abnormal: The probe fails to connect to the target, or the target does not return the expected response.
  - Pending: The probe is attempting to connect to the target.
- Probe names can be searched with fuzzy matching.

Create a Probe¶

Click Create Probe.
Fill in the basic information and click Next.
- Name: The name can only contain lowercase letters, numbers, and hyphens (-), and must start and end with a lowercase letter or number, with a maximum length of 63 characters.
- Cluster: Select the cluster for the probe task.
- Namespace: The namespace where the probe task is located.
Configure the probe parameters.
- Blackbox Instance: Select the blackbox instance responsible for the probe.
- Probe Method:
  - HTTP: Sends HTTP or HTTPS requests to the target URL to check its connectivity and response time. This can be used to monitor the availability and performance of websites or web applications.
  - TCP: Establishes a TCP connection to the target host and port to check its connectivity and response time. This can be used to monitor TCP-based services such as web servers and database servers.
  - Other: Custom probe methods can be defined through a ConfigMap. For more information, refer to Custom Probe Methods.
- Probe Target: The target address of the probe. Domain names and IP addresses are supported.
- Labels: Custom labels that are automatically added to the Prometheus labels of the probe's metrics.
- Probe Interval: The interval between probes.
- Probe Timeout: The maximum waiting time when probing the target.
After configuring, click OK to complete the creation.
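
The Other probe method relies on probe modules defined in the Blackbox Exporter configuration. The following is a minimal sketch of what such a ConfigMap could look like. The ConfigMap name, namespace, and data key are assumptions made for illustration and are not taken from Insight; the module names are arbitrary. Only the keys under `modules` follow the Blackbox Exporter configuration format.

```yaml
# Hypothetical ConfigMap: the name, namespace, and data key below are
# assumptions; check your Insight installation for the actual values.
apiVersion: v1
kind: ConfigMap
metadata:
  name: blackbox-exporter-config   # assumed name
  namespace: insight-system        # assumed namespace
data:
  blackbox.yml: |
    modules:
      # HTTP probe that accepts any 2xx response.
      http_2xx:
        prober: http
        timeout: 5s
        http:
          method: GET
          valid_status_codes: []   # an empty list defaults to 2xx
          preferred_ip_protocol: ip4
      # Plain TCP connect check.
      tcp_connect:
        prober: tcp
        timeout: 5s
      # ICMP echo (ping) check.
      icmp_ping:
        prober: icmp
        timeout: 5s
      # DNS lookup of a fixed record.
      dns_lookup:
        prober: dns
        timeout: 5s
        dns:
          query_name: example.com  # placeholder domain
          query_type: A
```

Each key under `modules` is a module name; a custom probe method would typically refer to one of these names.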

Warning

After the probe task is created, it takes about 3 minutes to synchronize the configuration. During this period, no probes will be performed, and probe results cannot be viewed.
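
For readers who manage monitoring resources declaratively, the form fields above map naturally onto a `VMProbe` resource of the VictoriaMetrics operator (the same API group as the `VMRule` example later in this page). Whether Insight stores probe tasks in exactly this form is an assumption; the resource name, labels, target URL, and Blackbox Exporter address below are placeholders.

```yaml
# Illustrative sketch only: all values are placeholders, and the mapping
# from the Insight form to VMProbe fields is an assumption.
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMProbe
metadata:
  name: demo-http-probe            # Name
  namespace: test1                 # Namespace
spec:
  jobName: demo-http-probe
  vmProberSpec:
    # Blackbox Instance: address of the Blackbox Exporter (assumed)
    url: blackbox-exporter.insight-system.svc:9115
  module: http_2xx                 # Probe Method: a Blackbox Exporter module name
  targets:
    staticConfig:
      targets:
        - https://example.com      # Probe Target
      labels:
        team: demo                 # Labels
  interval: 30s                    # Probe Interval
  scrapeTimeout: 10s               # Probe Timeout
```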

View Monitoring Dashboards¶

Click ... in the operations column, then click View Monitoring Dashboard.

Metric Name	Description
Current Status Response	Represents the response status code of the HTTP probe request.
Ping Status	Indicates whether the probe request was successful. 1 indicates a successful probe request, and 0 indicates a failed probe request.
IP Protocol	Indicates the IP protocol version used in the probe request.
SSL Expiry	Represents the earliest expiration time of the SSL/TLS certificate.
DNS Response (Latency)	Represents the duration of the entire probe process in seconds.
HTTP Duration	Represents the duration of the entire process from sending the request to receiving the complete response.

Custom Metrics Alert¶

After creating a probe task, you can check the health of the probe target on the monitoring dashboard, and you can also create alert rules based on the probe metrics. Prometheus Blackbox Exporter generates a series of metrics from the probe results. The most commonly used ones are:

Metric	Description
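probe_success	Whether the probe succeeded (1 for success, 0 for failure)
probe_http_ssl	Whether SSL/TLS was used for the final request (1 for yes, 0 for no)
probe_ssl_earliest_cert_expiry	Earliest SSL/TLS certificate expiration time, as a Unix timestamp
probe_ip_protocol	IP protocol version used by the probe (4 or 6)
probe_http_status_code	HTTP status code returned
probe_http_duration_seconds	Duration of the HTTP request, broken down by phase
probe_duration_seconds	Total duration of the probe in seconds
probe_http_version	HTTP version of the response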

The following example configures two commonly used alert rules:

apiVersion: operator.victoriametrics.com/v1beta1
kind: VMRule
metadata:
  labels:
    operator.insight.io/managed-by: insight
  name: probe-alert-rule
  namespace: test1
spec:
  groups:
    - name: probe
      rules:
        - alert: ProbeFailed
          annotations:
            description: Probe job {{ $labels.job }} failed to reach {{ $labels.instance }} in namespace {{ $labels.namespace }} for 15s
            value: '{{$value}}'
          expr: probe_success == 0
          for: 15s
          labels:
            severity: critical
        - alert: SlowProbe
          annotations:
            description: Probe job {{ $labels.job }} to {{ $labels.instance }} in namespace {{ $labels.namespace }} took more than 1s to complete
            value: '{{$value}}'
          expr: avg_over_time(probe_duration_seconds[1m]) > 1
          for: 1m
          labels:
            severity: warning
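
The other metrics in the table can be used in the same way. The sketch below adds two further rules: one for certificates that expire within 30 days, and one for HTTP responses outside the 2xx/3xx range. The rule names, thresholds, and durations are illustrative assumptions and should be tuned to your environment; the resource name and namespace are placeholders.

```yaml
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMRule
metadata:
  labels:
    operator.insight.io/managed-by: insight
  name: probe-alert-rule-extra      # placeholder name
  namespace: test1
spec:
  groups:
    - name: probe-extra
      rules:
        # Fires when the earliest certificate in the chain expires in
        # less than 30 days (86400 seconds per day).
        - alert: SSLCertExpiringSoon
          annotations:
            description: Certificate for {{ $labels.instance }} expires in less than 30 days
            value: '{{$value}}'
          expr: probe_ssl_earliest_cert_expiry - time() < 86400 * 30
          for: 10m
          labels:
            severity: warning
        # Fires when the HTTP probe returns a status code outside 200-399.
        - alert: ProbeHTTPStatusAbnormal
          annotations:
            description: Probe job {{ $labels.job }} received an unexpected HTTP status code from {{ $labels.instance }}
            value: '{{$value}}'
          expr: probe_http_status_code <= 199 or probe_http_status_code >= 400
          for: 1m
          labels:
            severity: critical
```

The certificate rule only produces results for HTTPS or TLS targets, because `probe_ssl_earliest_cert_expiry` is not reported for plain HTTP or TCP probes.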
