GPU Scheduling Configuration (Binpack and Spread)

This page describes how to use the Binpack and Spread scheduling policies with NVIDIA vGPU to reduce GPU resource fragmentation and avoid single points of failure. The DCE 5.0 platform provides both policies at two levels, cluster and workload, to meet the requirements of different scenarios.

Prerequisites

GPU devices are correctly installed on the cluster nodes.
The gpu-operator and Nvidia-vgpu components are correctly installed in the cluster.
In the cluster's node list, the GPU mode column includes the NVIDIA-vGPU type.

Use Cases

Scheduling policies at the GPU dimension:
- Binpack: Multiple Pods preferentially share the same GPU on a node. Suitable for increasing GPU utilization and reducing resource fragmentation.
- Spread: Multiple Pods are distributed across different GPUs on a node. Suitable for high-availability scenarios, because it avoids single-card failures.
Scheduling policies at the node dimension:
- Binpack: Multiple Pods preferentially share the same node. Suitable for increasing GPU utilization and reducing resource fragmentation.
- Spread: Multiple Pods are distributed across different nodes. Suitable for high-availability scenarios, because it avoids single-node failures.
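The difference between the two policies can be illustrated with a minimal Python sketch. It is a simplified model for explanation only, not the scoring logic that the DCE 5.0 scheduler actually runs: the `Candidate` class, the `pick` function, and the use of allocated memory as the utilization measure are all hypothetical. A candidate can stand for either a GPU or a node, since the same idea applies to both dimensions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """A GPU or a node (hypothetical model, not a DCE 5.0 type)."""
    name: str
    used: int      # units already allocated, e.g. MiB of GPU memory
    capacity: int  # total units available on this candidate


def pick(candidates: List[Candidate], request: int, policy: str) -> Optional[Candidate]:
    """Choose a candidate for `request` units under the given policy."""
    # Only candidates with enough free capacity are eligible.
    fitting = [c for c in candidates if c.capacity - c.used >= request]
    if not fitting:
        return None

    def utilization(c: Candidate) -> float:
        return c.used / c.capacity

    if policy == "binpack":
        # Prefer the fullest candidate to concentrate load.
        return max(fitting, key=utilization)
    if policy == "spread":
        # Prefer the emptiest candidate to distribute load.
        return min(fitting, key=utilization)
    raise ValueError(f"unknown policy: {policy}")


gpus = [Candidate("gpu-0", 6000, 16000), Candidate("gpu-1", 0, 16000)]
print(pick(gpus, 4000, "binpack").name)  # gpu-0
print(pick(gpus, 4000, "spread").name)   # gpu-1
```

In this model, Binpack keeps `gpu-1` completely free for a later large request, while Spread ensures that the failure of one card affects fewer Pods.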

Use Binpack and Spread at the Cluster Level

Note

By default, workloads follow the cluster-level Binpack and Spread configuration. If a workload sets its own Binpack and Spread scheduling policies that differ from the cluster's, the workload's own policies take precedence.

On the Clusters page, select the cluster for which you want to adjust the Binpack and Spread scheduling policies. Click the ┇ icon on the right and select GPU Scheduling Configuration from the dropdown list.
Adjust the GPU scheduling configuration according to your business scenario, and click OK to save.

Use Binpack and Spread at the Workload Level

Note

When the Binpack and Spread scheduling policies at the workload level conflict with the cluster-level configuration, the workload-level configuration takes precedence.
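The precedence rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the platform's implementation: the function name `effective_policies` and the dictionary keys `"node"` and `"gpu"` are hypothetical, and the sketch assumes that each dimension is overridden independently, which the platform may or may not do.

```python
from typing import Dict, Optional


def effective_policies(
    cluster: Dict[str, str],
    workload: Optional[Dict[str, Optional[str]]] = None,
) -> Dict[str, str]:
    """Merge cluster-level defaults with workload-level overrides."""
    # Start from the cluster-level configuration, which applies by default.
    merged = dict(cluster)
    # A policy set on the workload replaces the cluster-level value.
    for dimension, policy in (workload or {}).items():
        if policy is not None:
            merged[dimension] = policy
    return merged


cluster_config = {"node": "binpack", "gpu": "binpack"}

# No workload-level setting: the cluster configuration applies unchanged.
print(effective_policies(cluster_config))

# The workload sets Spread at the GPU dimension (assumed per-dimension override).
print(effective_policies(cluster_config, {"gpu": "spread"}))
```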

Follow the steps below to create a deployment using an image and configure Binpack and Spread scheduling policies within the workload.

Click Clusters in the left navigation bar, then click the name of the target cluster to enter the Cluster Details page.
On the Cluster Details page, click Workloads -> Deployments in the left navigation bar, then click the Create by Image button in the upper right corner of the page.
Fill in Basic Information and Container Settings in sequence. In the Container Configuration section, enable GPU configuration and select NVIDIA vGPU as the GPU type. Click Advanced Settings, enable the Binpack / Spread scheduling policy, and adjust the GPU scheduling configuration according to your business scenario. Then click Next to proceed to Service Settings and Advanced Settings, and finally click OK at the bottom right of the page to complete the creation.