Using vGPU Unlock with Proxmox 7

How to configure vGPU to be used in Proxmox and Windows VM

Foreword

This is a guide to unlocking vGPU functionality on supported NVIDIA GPUs. With vGPU unlocked, we can divide one physical GPU into smaller vGPU chunks for various VMs. For example, if we have both a Linux and a Windows VM that we would like to use the GPU for hardware acceleration, we can allocate a vGPU to each of them. Furthermore, with vGPU profile overrides, we can divide the GPU into differently sized chunks for finer control over hardware resources. Say we have a Windows VM for gaming that we would like to allocate more VRAM to; profile overrides let us do exactly that.

Credits:

This guide is heavily based on PolloLoco’s NVIDIA vGPU Guide, link: https://gitlab.com/polloloco/vgpu-proxmox/-/tree/master

Thanks to the vgpu_unlock project for making all of this possible: https://github.com/mbilker/vgpu_unlock-rs

Prerequisite:

  1. The Proxmox version I used is 7.2-7 with Linux kernel 5.15.39-4.
  2. vGPU unlock only supports GPUs up to the Turing architecture. The GPU I am using in this guide is an EVGA GTX 1080 Ti SC, which is Pascal. You can verify your kernel version and GPU model with the commands below.
  3. The host KVM driver version is 510.47.03. I have confirmed that it works with Windows Quadro driver version 511.09 and a spoofed Quadro P6000 profile.
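
To check the kernel version and GPU model mentioned above, two standard commands will do:

uname -r
lspci -nn | grep -i nvidia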

Known Issues:

  1. When using a spoofed profile in Windows VMs, OpenGL performance will be greatly reduced. If you need to run OpenGL-based software (like FurMark), then don't spoof the vGPU profile and install the NVIDIA GRID drivers instead.

System Configuration:

CPU: AMD Ryzen 9 3900X

Mobo: Gigabyte X570 Master Rev.A

RAM: G.Skill 16 GB x2

GPU: EVGA GTX 1080 Ti SC

Prepping the System for vGPU unlock:

Prepping Proxmox Environment:

Run Post-Installation Script to Make Sure Everything is Up-to-Date:

bash -c "$(wget -qLO - https://github.com/tteck/Proxmox/raw/main/misc/post-install.sh)"

(PVE 7 ONLY)

It’s recommended to answer y to all options, except for “Add (Disabled) Beta/Test Repository?”.

Credits: https://tteck.github.io/Proxmox/

Install Necessary Packages:

apt install -y git build-essential dkms pve-headers mdevctl unzip

Git Repos and Rust Compiler:

NOTE: The guide in the master branch has been updated to host driver version 510.85.03. However, I found that driver hard to come by through ordinary means, so I will be using the widely available version 510.47.03.

First, clone this repo to your home folder (which, for root, is /root). We are looking for the proxmox-7.1 branch, which includes the 510.47.03 patch.

See here: https://gitlab.com/polloloco/vgpu-proxmox/-/tree/proxmox-7.1

git clone -b proxmox-7.1 https://gitlab.com/polloloco/vgpu-proxmox.git 

Rename the NVIDIA patch file for easier access:

cd vgpu-proxmox
mv NVIDIA-Linux-x86_64-510.47.03-vgpu-kvm.patch 510.47.03.patch

Also, clone the vgpu_unlock-rs repo:

cd /opt
git clone https://github.com/mbilker/vgpu_unlock-rs.git

After that, install the Rust compiler. The rustup script will handle the installation for you.

curl https://sh.rustup.rs -sSf | sh -s -- -y

After installation, you will be prompted to add Cargo's bin directory to your PATH. Just execute the following command:

source $HOME/.cargo/env

You should still be in the /opt directory; if not, simply do:

cd /opt

Next, go to the vgpu_unlock-rs directory and compile the library. This may take a while.

cd vgpu_unlock-rs/
cargo build --release

On my machine, compiling took a bit over 36 seconds.
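
You can confirm that the library built successfully by checking for the shared object we will preload in the next step:

ls -l /opt/vgpu_unlock-rs/target/release/libvgpu_unlock_rs.so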

Create files for vGPU unlock

The vgpu_unlock-rs library requires a few files and folders in order to work properly, so let's create those.

First, create the folder for your vGPU unlock config along with an empty config file:

mkdir /etc/vgpu_unlock
touch /etc/vgpu_unlock/profile_override.toml

Then, create the folders and files that tell systemd to load the vgpu_unlock-rs library when starting the NVIDIA vGPU services:

mkdir /etc/systemd/system/{nvidia-vgpud.service.d,nvidia-vgpu-mgr.service.d}
echo -e "[Service]\nEnvironment=LD_PRELOAD=/opt/vgpu_unlock-rs/target/release/libvgpu_unlock_rs.so" > /etc/systemd/system/nvidia-vgpud.service.d/vgpu_unlock.conf
echo -e "[Service]\nEnvironment=LD_PRELOAD=/opt/vgpu_unlock-rs/target/release/libvgpu_unlock_rs.so" > /etc/systemd/system/nvidia-vgpu-mgr.service.d/vgpu_unlock.conf

Loading required kernel modules and blacklisting the open source NVIDIA driver

We have to load the vfio, vfio_iommu_type1, vfio_pci and vfio_virqfd kernel modules to get vGPU working.

echo -e "vfio\nvfio_iommu_type1\nvfio_pci\nvfio_virqfd" >> /etc/modules

Proxmox ships with the open source nouveau driver for NVIDIA GPUs; however, we have to use our patched NVIDIA driver to enable vGPU. The next line prevents the nouveau driver from loading:

echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf

Applying Kernel Configuration

update-initramfs -u -k all
reboot
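
After the reboot, you can verify that the vfio modules loaded and that nouveau stayed out of the way (the second command should print nothing):

lsmod | grep vfio
lsmod | grep nouveau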

NVIDIA Drivers

Obtaining the Driver:

The version I am using is “NVIDIA-GRID-Linux-KVM-510.47.03-511.65”. A quick Google search will land you some great results.

Use wget to download the zip file to your home folder. The file size should be around 1.6G, so it may take a while to download. Go grab a cup of coffee; we will carry on once the download is done.

After downloading, extract the zip file.

unzip NVIDIA-GRID-Linux-KVM-510.47.03-511.65.zip

Run ls to check that the extracted files are present.

The file we are looking for is "NVIDIA-Linux-x86_64-510.47.03-vgpu-kvm.run". Now, make it executable:

chmod +x NVIDIA-Linux-x86_64-510.47.03-vgpu-kvm.run

Then patch it

./NVIDIA-Linux-x86_64-510.47.03-vgpu-kvm.run --apply-patch ~/vgpu-proxmox/510.47.03.patch

After that, you should see this line:

Self-extractible archive "NVIDIA-Linux-x86_64-510.47.03-vgpu-kvm-custom.run" successfully created.

With the patch applied successfully, we can now install the newly patched driver.

./NVIDIA-Linux-x86_64-510.47.03-vgpu-kvm-custom.run

When prompted about the DKMS kernel module, answer "Yes".

If everything goes to plan, you should be able to see this message:

Installation of the NVIDIA Accelerated Graphics Driver for Linux-x86_64 (version: 510.47.03) is now complete.

Then, reboot.

reboot

To check that everything works, use the nvidia-smi utility.

nvidia-smi

You should see the familiar nvidia-smi table listing your GPU.

Then, to verify that vGPU unlock worked, use this command:

mdevctl types

The output should be a long list of supported vGPU profiles; the nvidia-47 profile used later in this guide should be among them.

If the command doesn’t output anything, then vGPU unlock isn’t working. Check your steps and start over.
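
A good first place to look when debugging is the pair of NVIDIA vGPU services that load the unlock library:

systemctl status nvidia-vgpud.service nvidia-vgpu-mgr.service
journalctl -b -u nvidia-vgpud.service -u nvidia-vgpu-mgr.service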

Finishing Touches

In PolloLoco’s repo, he kindly included a wrapper for nvidia-smi to monitor vGPU usage. To install this wrapper, simply run these commands:

cp ~/vgpu-proxmox/nvidia-smi /usr/local/bin/
chmod +x /usr/local/bin/nvidia-smi

Then run this command to see if it works:

nvidia-smi vgpu

You should see a table listing the vGPUs currently in use.

Using vGPU in VMs

In this section, we will be configuring vGPUs for various VMs.

What is Spoofing and Why We Need it

Spoofing a GPU profile allows us to use normal Quadro drivers instead of GRID drivers, which require proprietary licensing. However, the drawback is abysmal OpenGL performance. My workload doesn’t rely on OpenGL, so I’ve chosen to spoof my profile to a Quadro P6000 in order to use the Quadro driver.

Furthermore, profile overrides allow us to allocate GPU resources based on VM roles. For example, suppose I run two Windows guest VMs: VM A for ordinary web browsing and light office work, and VM B for gaming. With profile overrides, I can allocate 2GB of VRAM to VM A and 6GB of VRAM to VM B in order to maximize GPU utilization.

If you want to use the vGPU in Linux guest VMs, spoofing won’t work. You will have to install the GRID driver and use a script to extend the valid license state to one day, then reboot the VM daily to keep using the vGPU functionality.
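
For reference, here is a minimal sketch of that daily reboot as a root crontab entry inside the Linux guest (the 5:00 AM schedule is an arbitrary choice of mine):

# m h dom mon dow command
0 5 * * * /sbin/shutdown -r now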

Create vGPU Profile Override

We will be editing this file for our overrides:

nano /etc/vgpu_unlock/profile_override.toml

When opened, it should be empty. Depending on your GPU architecture, you will have a variety of profiles to choose from. To list all the supported profiles, use this command:

mdevctl types

For my Pascal-based GTX 1080 Ti, I’ve chosen “nvidia-47” as my profile. Here’s my configuration:

[profile.nvidia-47]
num_displays = 1
display_width = 1920
display_height = 1080
max_pixels = 2073600
cuda_enabled = 1
frl_enabled = 0

[mdev.00000000-0000-0000-0000-000000000xxx]
pci_id = 0x1B3011A0
pci_device_id = 0x1B30
framebuffer = 0x1D8000000 # 8GB
# Other options:
# 1GB: 0x3B000000
# 2GB: 0x76000000
# 3GB: 0xB1000000
# 4GB: 0xEC000000
# 8GB: 0x1D8000000
# 16GB: 0x3B0000000

This configuration contains two blocks. The first one, “profile.nvidia-47”, is the default configuration override for all vGPUs that use this profile. I won’t go into too much detail here, as these settings rarely need to be tweaked. If you need to change the resolution, just change display_width and display_height to your liking, then multiply the two values together and set max_pixels accordingly. The “frl_enabled” option is the frame rate limiter, which lets you cap the frame rate of your guest VM; I didn’t find it particularly useful since I will be installing RivaTuner Statistics Server to limit the frame rate to 60 fps.
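
For example, if you wanted 1440p instead, the relevant lines of the first block would become (2560 x 1440 = 3,686,400):

display_width = 2560
display_height = 1440
max_pixels = 3686400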

The second block, “mdev.xxx”, applies to an individual VM; this is where we specify how much VRAM each VM gets and what graphics card we are spoofing it as. In my configuration, I am spoofing a Quadro P6000 and allocating 8GB of VRAM. Sample VRAM values are included in the configuration file as well; edit them as you see fit. The “xxx” portion of the UUID should match your VM ID. Strictly speaking this is not a requirement, but it makes operation much easier. If your VM ID is 777, change “xxx” to “777”; if you have a bigger number, say “12345”, delete two of the preceding zeros and fill in “12345” so the UUID keeps the same length. For pci_id and pci_device_id, you can look up the values here: https://pci-ids.ucw.cz/read/PC/10de/
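
To make this concrete, here is a sketch of the two-VM scenario from earlier, assuming hypothetical VM IDs 101 and 102. The 2GB value comes from the table above; the 6GB value is extrapolated from it, since each 1GB step is 0x3B000000:

[mdev.00000000-0000-0000-0000-000000000101]
pci_id = 0x1B3011A0
pci_device_id = 0x1B30
framebuffer = 0x76000000 # 2GB for VM A

[mdev.00000000-0000-0000-0000-000000000102]
pci_id = 0x1B3011A0
pci_device_id = 0x1B30
framebuffer = 0x162000000 # 6GB for VM B, 6 x 0x3B000000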

Adding this vGPU to a Guest VM

Now, assuming you already have a guest VM configured, there is a configuration file in Proxmox that you can edit. Shut down the VM you are about to configure, then open its configuration file. <VM-ID> is your numeric VM ID in Proxmox:

nano /etc/pve/qemu-server/<VM-ID>.conf

For me, it’s VM 136:

nano /etc/pve/qemu-server/136.conf

In this file, add this line at the end. Same as before, “xxx” is your numeric VM ID:

args: -uuid 00000000-0000-0000-0000-000000000xxx

Hit Ctrl+X to save and exit. Now we turn to the Proxmox web GUI. Locate your VM, go to “Hardware”, click “Add”, and select “PCI Device”. In the pop-up window, select your GPU in the “Device” drop-down, then select the profile we configured in the “MDEV Type” drop-down. Finally, click “Add” to attach this vGPU to the VM.
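
If you prefer the command line, the same attachment can be done with qm. This sketch assumes the GPU sits at PCI address 0000:0b:00.0, so substitute the address from your own lspci output:

qm set 136 -hostpci0 0000:0b:00.0,mdev=nvidia-47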

Then, start the VM and install the drivers. The Quadro driver confirmed to work in this guide is 511.09, which you can download directly from NVIDIA.

After the driver installation, I recommend setting up some remote desktop software to get into the guest VM, since we will be disabling the default Proxmox virtual display adapter. I personally use Parsec the most, and I’ve also heard good things about Moonlight/Sunshine. Once remote desktop is configured, shut down the VM, then go to “Hardware”, select “Display”, click “Edit”, and select “none” from the drop-down menu.

Validation

I’ve used UNIGINE Valley Benchmark and Yuzu Switch Emulator for validation.

First up is the Valley benchmark. The settings I used are DirectX 11, Ultra quality, 8x anti-aliasing, full screen at 1080p. I capped the frame rate to 60 fps with RivaTuner since I use Parsec to connect to this VM, so there’s no need for anything above 60 fps. GPU utilization sat comfortably below 50%, and the frame times were rather consistent.

Then I fired up Yuzu to see if Vulkan works as well as DirectX. I did not tweak any of the settings, running Triangle Strategy, one of my favorite titles, with the Vulkan API. With the frame rate locked to the default, a consistent 30 fps is easily achieved, and the GPU sits at about a quarter utilization.

If we unlock the frame rate by hitting Ctrl+U, we can reach up to about 200 fps, and GPU utilization jumps to about a third accordingly. When in-game, however, the frame rate drops to about 100 fps while GPU utilization remains at about a third.

By running watch nvidia-smi, we can see that the GPU draws about 70W of power and stays relatively cool at 57 degrees Celsius. When in-game, power draw jumps to 135W with the frame rate unlocked, and sits around 65-80W with the frame rate locked at 30 fps. RivaTuner doesn’t work with Vulkan, so the options are either running the game at its native 30 fps or unlocking it to go above 100 fps.