Skip to content

Features list¶

This page lists the features supported by Observability Insight.

Community Edition - Observability¶

DCE 5.0 Community Edition provides the following observable features.

Category	Subcategory	Description
Resource monitoring	Multi-cluster monitoring	Provide multi-cluster business centralized observability The administrator manages multi-cluster alarms in a unified manner, and satisfies cluster and tenant administrator data isolation Supports persistent cluster indicators and log data.
	Scenario monitoring	Provides a monitoring overview of a single cluster, allowing you to view the running status of the cluster, understand the resource usage of the cluster, and the current alarms that are occurring in the cluster
	Node monitoring	Support to view the running status of the node, etc., and understand the changes in the CPU, memory, network and other resources of the node
	Container Monitoring	Supports monitoring of resources such as stateless loads, daemon processes, container groups, etc., can monitor the running status of the workload, and can view the number of alarms and the trend chart of resource consumption such as CPU and memory
Dashboard	Platform Component Monitoring	Provide open-source selected dashboards through native Grafana, and provide built-in dashboards to support monitoring etcd, APIServer and other components
	Cluster Resource Monitoring	Provides multi-dimensional monitoring of clusters, nodes, and namespaces. The data source used by Grafana supports viewing data from multiple clusters.
Data Query	Index Query	Common Query pre-orders basic indicators, and after selecting query conditions such as cluster, type, node, and indicator name, you can query the change trend of resources Support querying indicator charts and data details through native PromQL statements
	Log query	You can query the logs of Node, Pod, Depoyment, Statefulset, etc., and you can query the context content of a single log Support searching by keyword Sort by time by default, and you can query the number of logs through the histogram Support querying detailed information and context of a single log
	Log Download	Support to download logs within a period of time according to search criteria Support exporting the content of a single log context
Alarm Center	Active Alarm	Provide a histogram to view the change trend of the alarm time Support to view all the rules and details that are alarming
	Historical alarms	You can query all alarms after automatic recovery or manual resolution
	Alert rules	Built-in 100+ alert rules, providing predefined alert rules for cluster components, container resources, etc. Administrators can create global alert rules to provide unified alerts for clusters that have installed insight-agent Support creating alarm rules through predefined indicators Support creating alarm rules by writing PromQL statements Support custom thresholds, durations and notification methods You can customize the level of alarms, support emergency, warning , Prompt three levels
	Notification configuration	On the notification configuration page, you can configure to send messages to users through email groups, corporate WeChat, DingTalk, Webhook, etc. Support simultaneous notification to multiple alarm objects
	Message template	The message template function supports customizing the content of the message template, and can notify the specified object in the form of email, corporate WeChat, DingTalk, and Webhook
Log collection and query	Unified log collection	Unified collection of log data of nodes, containers, containers, and k8s events Collect the audit operation of the global management platform, and the collection of k8s audit logs is not enabled by default
	Log persistent storage	Logs can be marked and output to middleware such as Elasticsearch for persistence
Metric collection	Metric data collection	Support to use ServiceMonitor to define the namespace scope of Pod discovery and select the listening Service through matchLabel
System configuration	System configuration	System configuration displays the default storage time of indicators, logs, and links and the default Apdex threshold Support custom modification of the storage time of indicators, logs, and link data

Commercial Edition - Observability¶

On the basis of the community edition, the commercial edition of DCE 5.0 provides more abundant and customizable observable features.

Category	Subcategory	Description
Resource monitoring	Multi-cluster monitoring	Provide multi-cluster business centralized observability The administrator manages multi-cluster alarms in a unified manner, and satisfies cluster and tenant administrator data isolation Supports persistent cluster indicators and log data.
	Cluster Monitoring	Provides an overview of the monitoring of a single cluster, allowing you to view the running status of the cluster, understand the resource usage of the cluster, and the alarms that are currently occurring in the cluster
	Node monitoring	Support to view the running status of the node, etc., and understand the changes in the CPU, memory, network and other resources of the node
	Container Monitoring	Supports monitoring of resources such as stateless loads, daemon processes, container groups, etc., can monitor the running status of the workload, and can view the number of alarms and the trend chart of resource consumption such as CPU and memory
Scenario Monitoring	Service Monitoring¹	You can view key indicators such as real-time throughput, number of requests, request delay and error rate of the service, as well as the trend of change over a period of time You can view the service's real-time performance over a period of time Requests, as well as the trend of real-time throughput, number of requests, request delay and error rate of a single request
	Topology map¹	The administrator can view the call relationship and health status between services connected to the observation platform and link collection, and quickly locate faults You can view the traffic direction and key indicators requested between services You can quickly view the real-time throughput, number of requests, request latency and error rate of a single service
Dashboard	Platform Component Monitoring	Provide open-source selected dashboards through native Grafana, and provide built-in dashboards to support monitoring etcd, APIServer and other components
	Cluster Resource Monitoring	Provides multi-dimensional monitoring of clusters, nodes, and namespaces. The data source used by Grafana supports viewing data from multiple clusters.
Data Query	Index Query	Common Query pre-orders basic indicators, and after selecting query conditions such as cluster, type, node, and indicator name, you can query the change trend of resources Support querying indicator charts and data details through native PromQL statements
	Log query	You can query the logs of Node, Pod, Depoyment, Statefulset, etc., and you can query the context content of a single log Support searching by keyword Sort by time by default, and you can query the number of logs through the histogram Support querying detailed information and context of a single log
	Log Download	Support to download logs within a period of time according to search criteria Support exporting the content of a single log context
	Link query¹	Through link query, you can view all the requests of the service within a certain period of time, support configuring clusters, namespaces, services, operations, tags, and then click Search for precise search Supports viewing a single Requested aggregated link graph for fast fault location
Alarm Center	Active Alarm	Provide a histogram to view the change trend of the alarm time Support to view all the rules and details that are alarming
	Historical alarms	You can query all alarms after automatic recovery or manual resolution
	Alert rules	Built-in 100+ alert rules, providing predefined alert rules for cluster components, container resources, etc. Administrators can create global alert rules to provide unified alerts for clusters that have installed insight-agent Support creating alarm rules through predefined indicators Support creating alarm rules by writing PromQL statements Support custom thresholds, durations and notification methods You can customize the level of alarms, support emergency, warning , Prompt three levels
	Notification configuration	On the notification configuration page, you can configure to send messages to users through email groups, corporate WeChat, DingTalk, Webhook, etc. Support simultaneous notification to multiple alarm objects
	Message template	The message template function supports customizing the content of the message template, and can notify the specified object in the form of email, corporate WeChat, DingTalk, and Webhook
Log collection and query	Unified log collection	Unified collection of log data of nodes, containers, containers, and k8s events Collect the audit operation of the global management platform, and the collection of k8s audit logs is not enabled by default
	Persistent storage of logs	Logs can be marked and output to middleware such as Elasticsearch for persistence
Metric collection	Metric data collection	Support to use ServiceMonitor to define the namespace scope of Pod discovery and select the monitored Service through matchLabel
	Component status¹	Support to view the status of the container group of the collection component, and jump to the corresponding container group details
Link Collection¹	Link Data Collection	Support link data collection by using OTEL SDK in a non-intrusive or less intrusive way Support link collection by injecting Sidecar into grid applications data
System configuration	System configuration	System configuration displays the default storage time of indicators, logs, and links and the default Apdex threshold Support custom modification of the storage time of indicators, logs, and link data

This is a feature only available in the commercial edition. ↩↩↩↩↩

Comments