Hello, I have a render farm that uses Proxmox as the server, with VMs installed for the render manager and the render workers. The farm has RTX A5000 GPUs that are already passed through and assigned to the render worker VMs. But when I test a render against a render PC (bare metal, not a VM) with an RTX 3070, using the same Blender file, the difference in render time is huge: the render PC finishes in 45 minutes, while the VM worker takes 1 hour 23 minutes. That makes me wonder where the bottleneck is: the CPU, the GPU, or some other configuration in Proxmox?
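If it helps to reproduce, a single frame can be rendered headless on both machines with something like this, so the timing comparison is the same job on both sides (the file name and frame number here are just placeholders, not my actual project):
Code:
# headless Cycles render of one frame, run the same way on the VM worker and the render PC
blender -b scene.blend -E CYCLES -f 1 -- --cycles-device OPTIX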
GPU passthrough references I used:
https://pve.proxmox.com/wiki/PCI(e)_Passthrough
https://forum.proxmox.com/threads/p...x-ve-8-installation-and-configuration.130218/
Here is some information about the Proxmox server and the VMs:

Proxmox host (one server running two render-worker VMs):
Lenovo Thinksystem SRV665 V3
- CPU: 2x AMD EPYC 9254 (24 cores each)
- RAM: 128 GB DDR5
- GPU: 2x RTX A5000
- Disk: 3840 GB SSD

Render PC (bare metal, for comparison):
- OS: Windows
- CPU: Intel Core i7-11700 @ 2.50 GHz
- RAM: Kingston DDR4, 4 x 8 GB, 1600 MHz
- GPU: NVIDIA GeForce RTX 3070 8 GB
- Storage: Samsung 500 GB SSD + Seagate 2 TB HDD
Code:
pveversion -v
proxmox-ve: 8.2.0 (running kernel: 6.8.4-2-pve)
pve-manager: 8.2.2 (running version: 8.2.2/9355359cd7afbae4)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.4-2
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
ceph-fuse: 17.2.7-pve3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx8
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.6
libpve-cluster-perl: 8.0.6
libpve-common-perl: 8.2.1
libpve-guest-common-perl: 5.1.1
libpve-http-server-perl: 5.1.0
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.8
libpve-storage-perl: 8.2.1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.2.0-1
proxmox-backup-file-restore: 3.2.0-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.6
proxmox-widget-toolkit: 4.2.1
pve-cluster: 8.0.6
pve-container: 5.0.10
pve-docs: 8.2.1
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.0
pve-firewall: 5.0.5
pve-firmware: 3.11-1
pve-ha-manager: 4.0.4
pve-i18n: 3.2.2
pve-qemu-kvm: 8.1.5-5
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.1
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.3-pve2
Code:
qm config 110
balloon: 0
bios: ovmf
boot: order=scsi0;ide0;net0
cores: 32
cpu: x86-64-v2-AES
efidisk0: NAS-CD02:110/vm-110-disk-0.qcow2,efitype=4m,pre-enrolled-keys=1,size=528K
hostpci0: 0000:81:00,pcie=1,x-vga=1
machine: pc-q35-8.1
memory: 40960
meta: creation-qemu=8.1.5,ctime=1716871656
name: SRV-RENDER-WORKER02
net0: virtio=BC:24:11:5D:AD:32,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
scsi0: NAS-CD02:110/vm-110-disk-1.qcow2,iothread=1,size=1000G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=55bf8114-c1c6-4dff-b54f-deb347c4700a
sockets: 1
tags: renderfarm;win
vga: std
vmgenid: aea02d35-9c85-4f21-827f-1244bd99a88d
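For reference, these are tuning changes I am considering trying on the worker VMs, based on the passthrough guides I linked above. I have not applied them yet, so I don't know whether they are relevant to the slowdown:
Code:
# candidate tweaks only, not part of my current config
qm set 110 --cpu host     # expose the host CPU model instead of x86-64-v2-AES
qm set 110 --numa 1       # make the guest NUMA-aware (the host has 2 sockets)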
VM worker specs (both VMs are the same):
- OS : Windows
- CPU : 32 cores
- RAM : 40 GB
- GPU : NVIDIA RTX A5000 (one of the host's two A5000s passed through to each VM)
- Storage : 1 TB
Code:
lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 52 bits physical, 57 bits virtual
Byte Order: Little Endian
CPU(s): 96
On-line CPU(s) list: 0-95
Vendor ID: AuthenticAMD
BIOS Vendor ID: Advanced Micro Devices, Inc.
Model name: AMD EPYC 9254 24-Core Processor
BIOS Model name: AMD EPYC 9254 24-Core Processor Unknown CPU @ 2.9GHz
BIOS CPU family: 107
CPU family: 25
Model: 17
Thread(s) per core: 2
Core(s) per socket: 24
Socket(s): 2
Stepping: 1
Frequency boost: enabled
CPU(s) scaling MHz: 63%
CPU max MHz: 4151.7568
CPU min MHz: 1500.0000
BogoMIPS: 5791.70
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd amd_ppin cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif x2avic v_spec_ctrl vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq la57 rdpid overflow_recov succor smca fsrm flush_l1d debug_swap
Virtualization features:
Virtualization: AMD-V
Caches (sum of all):
L1d: 1.5 MiB (48 instances)
L1i: 1.5 MiB (48 instances)
L2: 48 MiB (48 instances)
L3: 256 MiB (8 instances)
NUMA:
NUMA node(s): 2
NUMA node0 CPU(s): 0-23,48-71
NUMA node1 CPU(s): 24-47,72-95
Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Reg file data sampling: Not affected
Retbleed: Not affected
Spec rstack overflow: Mitigation; Safe RET
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Enhanced / Automatic IBRS, IBPB conditional, STIBP always-on, RSB filling, PBRSB-eIBRS Not affected
Srbds: Not affected
Tsx async abort: Not affected
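Since the host has two NUMA nodes, I can also check which node the passed-through GPU sits on, if that is relevant (the PCI address is the one from the VM config above):
Code:
# NUMA node of the passed-through GPU (-1 means no locality information)
cat /sys/bus/pci/devices/0000:81:00.0/numa_node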
GRUB config:
Code:
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet iommu=pt nomodeset"
GRUB_CMDLINE_LINUX=""
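After editing this file I applied the change and rebooted the host; as far as I understand the standard way is:
Code:
# regenerate the GRUB config so the new kernel parameters take effect, then reboot
update-grub
reboot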
dmesg | grep -e IOMMU
Code:
dmesg | grep -e IOMMU
[ 1.332256] pci 0000:60:00.2: AMD-Vi: IOMMU performance counters supported
[ 1.340979] pci 0000:40:00.2: AMD-Vi: IOMMU performance counters supported
[ 1.348393] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[ 1.359385] pci 0000:20:00.2: AMD-Vi: IOMMU performance counters supported
[ 1.369606] pci 0000:e0:00.2: AMD-Vi: IOMMU performance counters supported
[ 1.377982] pci 0000:c0:00.2: AMD-Vi: IOMMU performance counters supported
[ 1.387868] pci 0000:80:00.2: AMD-Vi: IOMMU performance counters supported
[ 1.395506] pci 0000:a0:00.2: AMD-Vi: IOMMU performance counters supported
[ 1.406557] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[ 1.406572] perf/amd_iommu: Detected AMD IOMMU #1 (2 banks, 4 counters/bank).
[ 1.406586] perf/amd_iommu: Detected AMD IOMMU #2 (2 banks, 4 counters/bank).
[ 1.406600] perf/amd_iommu: Detected AMD IOMMU #3 (2 banks, 4 counters/bank).
[ 1.406614] perf/amd_iommu: Detected AMD IOMMU #4 (2 banks, 4 counters/bank).
[ 1.406628] perf/amd_iommu: Detected AMD IOMMU #5 (2 banks, 4 counters/bank).
[ 1.406647] perf/amd_iommu: Detected AMD IOMMU #6 (2 banks, 4 counters/bank).
[ 1.406661] perf/amd_iommu: Detected AMD IOMMU #7 (2 banks, 4 counters/bank).
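If it is useful, I can also post the IOMMU groups to show how isolated the GPU is; as far as I understand they can be listed like this:
Code:
# list every device symlink together with the IOMMU group it belongs to
find /sys/kernel/iommu_groups/ -type l | sort -V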
/etc/modules
Code:
vfio
vfio_iommu_type1
vfio_pci
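If needed, I can also verify on the host that the vfio modules are actually loaded, and re-run the initramfs update after editing the file:
Code:
# check that the vfio modules are loaded on the host
lsmod | grep vfio
# rebuild the initramfs in case the module list changed
update-initramfs -u -k all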
I also found that the VM worker's GPU read/write throughput looks abnormal compared to the render PC, and the VM worker's Novabench score is lower than the render PC's overall. Is the VM worker's storage read/write speed normal? It looks too low to me.
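If it helps to diagnose the storage side, I could run a rough sequential read test from the Proxmox host against the storage that backs the VM disks; something like this (fio would need to be installed, and the mount path is only a guess based on my NAS-CD02 storage name):
Code:
# 30-second sequential read test on the storage backing the VM disks
fio --name=seqread --filename=/mnt/pve/NAS-CD02/fio-testfile --size=4G \
    --rw=read --bs=1M --direct=1 --runtime=30 --time_based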
I'm new to Proxmox and just followed the tutorials, so I have no idea why this issue is happening. Thanks for your attention.