Hello!
We encountered a similar NCCL issue with CUDA error: operation not supported.
Our H200s were managed by vGPU. We uninstalled the vGPU drivers (on both the Proxmox host and the VM) and installed the latest NVIDIA Linux driver on the VM. After...
I did exactly what is described above, with 8 hugepages of 1 GB each:
cat /proc/cmdline
initrd=\EFI\proxmox\6.17.4-2-pve\initrd.img-6.17.4-2-pve root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on iommu=pt transparent_hugepage=never default_hugepagesz=1G...
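For anyone comparing their own boot parameters against the line above, here is a minimal sketch that pulls the hugepage-related options out of a kernel command line. The helper names (parse_cmdline, hugepage_settings) are made up for illustration, and since the command line above is cut off, the hugepagesz=1G hugepages=8 values in the sample are an assumption based on the "8 hugepages of 1GB" description, not a copy of the poster's settings.

```python
# Hypothetical helpers for inspecting hugepage options on a kernel command line.
HUGEPAGE_KEYS = ("transparent_hugepage", "default_hugepagesz", "hugepagesz", "hugepages")


def parse_cmdline(cmdline):
    """Split a kernel command line into {key: value}; bare flags map to None.

    Only the first '=' separates key from value, so root=ZFS=rpool/ROOT/pve-1
    keeps 'ZFS=rpool/ROOT/pve-1' as its value. If a key is repeated, the last
    occurrence wins (a simplification of this sketch).
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def hugepage_settings(cmdline):
    """Return only the hugepage-related parameters that are present."""
    params = parse_cmdline(cmdline)
    return {key: params[key] for key in HUGEPAGE_KEYS if key in params}


# Sample string: the trailing hugepagesz/hugepages values are assumed, see above.
sample = (
    "root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on iommu=pt "
    "transparent_hugepage=never default_hugepagesz=1G hugepagesz=1G hugepages=8"
)
print(hugepage_settings(sample))
```

On a live system the same function can be fed the contents of /proc/cmdline. Note that the command line only shows what was requested; the number of 1 GB pages the kernel actually allocated is reported in /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages.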
@fweber @fiona
Is there any progress on NVMe disk emulation?
FYI:
https://techcommunity.microsoft.com/blog/windowsservernewsandbestpractices/announcing-native-nvme-in-windows-server-2025-ushering-in-a-new-era-of-storage-p/4477353
This is what I've already suggested in my original post:
Yesterday evening I made some waves to finally get the attention of the maintainers (I've tagged them at least a few times in the manner of "maintainers may decide" - the last attempt was in...
Perhaps the Proxmox team could consider building its own version of virtio drivers with its own installer, signed by Proxmox. This would allow for faster patch deployment, as is done for other subsystems (ZFS, etc.).
@fweber
Check this Post in thread 'Redhat VirtIO developers would like to coordinate with Proxmox devs re: "[vioscsi] Reset to device ... system unresponsive"'...
And @fweber, of course I understand the omnipresent DEV buzz, so all is OK, just please, there should be some regular "bumps" in the rising threads from the Proxmox staff. We, as a community, are quite mighty and capable, but definitely not...
@JonKohler
That was a very informative KVM Forum presentation. Thanks for all the RTFM you did on that one, especially the fine print.
Can you comment on where the resolutions are at today? Still making their way into upstream KVM / QEMU I...
Just an intermediate update: while I'm quite sure that at least my latest Patch 3 is a more than competent, production-ready candidate, in the hunt for other WS2025-related problems on PVE I found a whole bunch of other threads about the...
I've already completed the full driver downgrade, so let's see what happens. I hope the yellow triangles and the damn IO error stop. I’ll keep you updated.
If you use virtio drivers 1.285, check this thread: https://forum.proxmox.com/threads/redhat-virtio-developers-would-like-to-coordinate-with-proxmox-devs-re-vioscsi-reset-to-device-system-unresponsive.139160/post-814067
As far as I know, profile type C is best suited for AI (requires a vCS license, but works with a vDWS license as well).
Unfortunately, I have no other ideas.
Subscribed to the post.