Deploy vGPU Apps Using vK8s


This document provides instructions on how to enable Virtual GPU (vGPU) for Volterra VMware sites and deploy vGPU apps using Volterra vK8s. To learn more about Volterra vK8s, see Virtual Kubernetes (vK8s).

Volterra supports enabling vGPU capability using Volterra fleet configuration. For more information on Volterra fleet, see Fleet.

After enabling vGPU capability using fleet, vGPU applications are deployed using the Volterra vK8s object associated with the same virtual site as the fleet of sites.

Using the instructions provided in this guide, you can enable vGPU capabilities in Volterra sites using fleet configuration and deploy vGPU apps using Volterra vK8s.

Note: Only Volterra VMware site and NVIDIA Tesla T4 vGPU software are supported.



Deploying vGPU applications using vK8s requires you to perform the following sequence of actions:

  • Add the vGPU device to your VM
  • Enable the vGPU capability to your site using the fleet in which the site is a member
  • Create a K8s deployment or job for vGPU app on your site using Volterra vK8s

Do the following to deploy vGPU apps on your site:

Step 1: Set your GPU device to vGPU mode.
  • Log onto your vSphere platform using the vSphere client.
  • Select your server under the Datacenter in the left menu. Select Hardware -> Graphics under the Configure tab.
  • Select the graphics device under the Host Graphics tab and click Edit.

Figure: Graphics Devices

  • Select the Shared Direct and Spread VMs Across GPUs (best performance) options and click OK.

Figure: Set GPU to vGPU

Note: Restart the host or the xorg service to enforce these settings.

Step 2: Add vGPU device for your VM.
  • Right click on your VM and select Edit Settings.
  • Click Add NEW DEVICE -> New PCI Device and select the type as NVIDIA GRID vGPU.
  • Select a profile for the NVIDIA GRID vGPU Profile field as per your requirement. This example selects grid_t4-4q.

Figure: Add vGPU to VM

  • Click OK to add the vGPU software.
Step 3: Enable GPU for your site using fleet.
  • Log into VoltConsole and go to Manage -> Site Management -> Fleets in the System namespace.
  • Click ...->Edit for the fleet in which your VMware site is a member.
  • Go to Advanced Configuration section and enable Show Advanced Fields option.
  • Select vGPU Enabled for the Enable/Disable GPU field.
  • Enter the license server IP address or FQDN in the License Server Address field.
  • Enter the license server port number in the License Server Port Number field. Port 7070 is populated by default.
  • Select a type for the Feature Type field. This example sets NVIDIA vGPU as the feature type.

Figure: Enable vGPU in Fleet

Note: You can set Unlicensed for the Feature Type field. However, unlicensed use of NVIDIA vGPU software results in performance that degrades over time since boot. See the GRID Licensing Guide for more information.

  • Go to the Enable Default Fleet Config Download section and enable the Show Advanced Fields option. Select the Enable Default Fleet Config Download option.
  • Click Save and Exit.
Step 4: Prepare vGPU deployment.

You can deploy the vGPU app using vK8s workloads or jobs. This example shows deploying a vGPU app using a vK8s job.

Prepare a vGPU application manifest in JSON or YAML format. Ensure that the manifest's resource limits specify that a vGPU is required. See the following sample manifest for NVIDIA vGPU:

apiVersion: batch/v1
kind: Job
metadata:
  name: vgpu-test1
spec:
  template:
    spec:
      containers:
      - name: cuda-container
        image: nvidia/cuda:11.0-base
        command: ["nvidia-smi"]
        resources:
          limits:
            nvidia.com/gpu: 1
            cpu: 200m
            memory: 600Mi
          requests:
            cpu: 0m
            memory: 100Mi
      restartPolicy: Never
  backoffLimit: 2

The following is the JSON version of the same job:

  "apiVersion": "batch/v1",
  "kind": "Job",
  "metadata": {
    "name": "vgpu-test1"
  "spec": {
    "template": {
      "spec": {
        "containers": [
            "name": "cuda-container",
            "image": "nvidia/cuda:11.0-base",
            "command": [
            "resources": {
              "limits": {
                "": 1,
                "cpu": "200m",
                "memory": "600Mi"
              "requests": {
                "cpu": "0m",
                "memory": "100Mi"
        "restartPolicy": "Never"
    "backoffLimits": 2

Note: For continuous vGPU use, such as video monitoring applications, it is recommended to use a Kubernetes deployment. In other cases, use a Kubernetes job so that the vGPU is released after the task completes.
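For the continuous-use case, the job above can be recast as a Kubernetes deployment. The following is a sketch only; the object name, labels, and the looping command are illustrative placeholders, not values from this guide:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vgpu-monitor        # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vgpu-monitor
  template:
    metadata:
      labels:
        app: vgpu-monitor
    spec:
      containers:
      - name: cuda-container
        image: nvidia/cuda:11.0-base   # same base image as the job example
        # Placeholder long-running command; a real app would run here
        command: ["sh", "-c", "while true; do nvidia-smi; sleep 60; done"]
        resources:
          limits:
            nvidia.com/gpu: 1
            cpu: 200m
            memory: 600Mi
```

Because a deployment keeps its pod running, the vGPU stays allocated until the deployment is scaled down or deleted.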

Step 5: Navigate to your vK8s object and deploy the vGPU app.

Note: Ensure that the vK8s object is associated with the virtual site that groups all the vGPU-enabled VMware sites, or with sites that are part of the fleet created in the previous steps.

  • Click the application namespace option in the namespace selector and select your application namespace from the dropdown list to change to that namespace.
  • Select Applications in the configuration menu and Virtual K8s in the options pane.
  • Click on the Jobs tab and click Add Job.
  • Paste the YAML prepared in previous step and click Save.
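As an alternative to the VoltConsole UI, a vK8s cluster can also be managed with standard kubectl using a kubeconfig downloaded from the console. The following is a hedged sketch; the kubeconfig filename and namespace are hypothetical placeholders:

```shell
# Point kubectl at the vK8s cluster (hypothetical kubeconfig filename)
export KUBECONFIG=./ves-vk8s-kubeconfig.yaml

# Create the job in your application namespace (placeholder namespace)
kubectl apply -f vgpu-job.yaml -n my-app-namespace

# Check job status and read the pod logs (the nvidia-smi output)
kubectl get jobs -n my-app-namespace
kubectl logs job/vgpu-test1 -n my-app-namespace
```

These commands require network access to the vK8s API endpoint and are shown for illustration only.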
Step 6: Verify that the deployment is utilizing the vGPU.

You can verify that the sites are enabled with vGPU and the application processes are consuming the enabled vGPU resources.

  • Log into your node and check the vGPU processes. This example monitors NVIDIA vGPU devices using the nvidia-smi command.

Note: The nvidia-smi command displays the information on the GPU devices and the running processes for that GPU.
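The monitoring commands below are a sketch of how nvidia-smi is typically used on the node; the exact output columns vary by driver version:

```shell
# Show GPU devices, utilization, memory use, and running processes
nvidia-smi

# Refresh the display every 2 seconds while the vGPU job runs
nvidia-smi -l 2

# Query selected fields in CSV form for scripting
nvidia-smi --query-gpu=name,utilization.gpu,memory.used --format=csv
```

If the job from the previous step has run, its cuda-container process should appear in the Processes section of the output.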

  • Log into VoltConsole and navigate to the Sites -> Site List page. Click on a site that is part of the fleet you created. This opens the site dashboard. Click the Nodes tab and click on a node to open its dashboard. Click Metrics to monitor the GPU usage, GPU temperature, and GPU throughput metrics.


API References