Worked perfectly for me, on both 7.4 and 8.0 - thanks.
EDIT: EXCEPT that it disables any GVT features on Intel iGPUs. You have to remove the nomodeset option to get them back.
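For anyone else backing that out, a minimal sketch assuming a GRUB-booted node (systemd-boot installs would use /etc/kernel/cmdline and proxmox-boot-tool refresh instead):

nano /etc/default/grub    # remove "nomodeset" from GRUB_CMDLINE_LINUX_DEFAULT
update-grub
reboot
cat /proc/cmdline         # confirm nomodeset is gone after the reboot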
Another vote for the E5-2650L V2 -- 10C/20T, low power, perfect fit for a "lots of VMs, virtualization" environment. Just bought a couple last week - got the pair for $19 shipped.
GOT IT!
Looking at the kernel messages for the wx-4100/rx550/rx560 in the Ubuntu guest, I only see one notable difference.
550:
[ 10.409863] amdgpu 0000:06:10.0: amdgpu: Using BACO for runtime pm
Maybe onto something?
*...
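To poke at that difference from inside the guest, a rough check (the PCI address is the one from the log above):

dmesg | grep -i 'runtime pm'                                 # compare what each card reports for runtime pm
cat /sys/bus/pci/devices/0000:06:10.0/power/control          # "auto" or "on"
cat /sys/bus/pci/devices/0000:06:10.0/power/runtime_status   # "active" / "suspended"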
Status Update:
It appears to be some sort of interaction between the kernel, KVM, Ubuntu, and the AMD drivers.
Pulled spare hardware
* installed fresh 7.4 pmx
* tried both 5.15 and 5.19 kernel
* installed both the rx560 & rx550 in same server, vendor-reset, etc
* Ubuntu 22 guest, AMD 5.5...
Have done that as well, hoping it would make a difference. No change in behavior. (Tried several of the hookscripts - no change.)
REFERENCE: https://www.nicksherlock.com/2020/11/working-around-the-amd-gpu-reset-bug-on-proxmox/
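For reference, that article boils down to a pre-start hookscript along these lines - a sketch only, with a placeholder PCI address for the RX550 on the host:

#!/bin/bash
# Proxmox passes $1 = VMID and $2 = phase; only act before the VM starts.
if [ "$2" = "pre-start" ]; then
    # ask the kernel to use vendor-reset's device-specific reset for this GPU
    echo device_specific > /sys/bus/pci/devices/0000:06:00.0/reset_method
fi
exit 0

Dropped into a snippets-enabled storage and attached with something like qm set <vmid> --hookscript local:snippets/amd-reset.sh.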
@aaron : Thank you for providing a specific workaround. However, this feels like one of those "old" ideas in desperate need of an update.
I challenge you: when is rebooting the entire cluster on the loss of a network element the preferred behavior? You're taking a communication issue and...
Here's the complaint INSIDE the VM when I run ffmpeg, for example (the build that ships with Jellyfin, which has AMD support compiled in). The VM then has to be powered off - it hangs on the PCI device when trying to shut it down.
[   74.958805] BUG: kernel NULL pointer dereference, address: 00000000000000d8
[...
Almost identical config to the OP's, except "AMD-Vi: Interrupt remapping enabled". Same blacklists, same VFIO, same kernel switches. (proxmox-ve: 7.4-1 (running kernel: 5.15.108-1-pve))
Same card(s), same issue. RX560 (1002:67ef) works but RX550 (1002:699f) does not. Same physical machine(s) -...
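For comparison, the blacklist/VFIO side is roughly the usual recipe - a sketch using the device IDs above, not a copy of my exact files:

# /etc/modprobe.d/vfio.conf  (RX560 = 1002:67ef, RX550 = 1002:699f)
options vfio-pci ids=1002:67ef,1002:699f disable_vga=1

# /etc/modprobe.d/blacklist-gpu.conf
blacklist amdgpu
blacklist radeon

# apply and verify the binding after a reboot:
update-initramfs -u -k all
lspci -nnk | grep -A3 '1002:699f'    # should show "Kernel driver in use: vfio-pci"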
I did another lab cluster - 5 nodes, still on 7.4, but upgrading Ceph to Quincy as prep for the upgrade to 8.0. Ran into a serious blocker following the directions above. (https://pve.proxmox.com/wiki/Ceph_Pacific_to_Quincy)
The manager daemons wouldn't restart; I got the "masked" message. Turns out...
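For anyone else who hits the same "masked" message, this is roughly how to inspect and clear it (the instance name is a placeholder for your node name):

systemctl status ceph-mgr@<nodename>.service    # confirms whether the unit is masked
systemctl unmask ceph-mgr@<nodename>.service
systemctl restart ceph-mgr@<nodename>.service
ceph -s                                         # a mgr should show as active again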
Update: Moved to ZFS storage - VM starts fine. Moved back to local-LVM - fails to start, same error message.
Setting "ACL = disabled" in the UI (acl=0 in the conf file) prevents the container from starting. It worked fine under PVE 7, and it works fine on ZFS.
DOES NOT WORK ON LVM:
rootfs...
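To illustrate the toggle from the CLI - hypothetical CT 101 and volume name, not my actual config:

pct set 101 --rootfs local-lvm:vm-101-disk-0,size=8G,acl=0   # container then fails to start on LVM-thin here
pct set 101 --rootfs local-lvm:vm-101-disk-0,size=8G         # default ACL handling starts fine
pct start 101 --debug                                        # --debug prints the underlying mount error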
All of my production nodes are running open-vswitch in a similar fashion to my lab nodes. This means I **must** have KVM/console access for every node, which is a change from previous upgrade processes.
Node #1 back online - going to work on node #2 next.
Steps:
* attach the drive to another Linux machine (or boot off a recovery CD, etc.)
* mkdir /tmp/1
* mount /dev/mapper/pve-root /tmp/1
* cd /tmp/1
* mount -t proc /proc proc/
* mount --rbind /sys sys/
* mount --rbind /dev dev/
* chroot /tmp/1
* dpkg...
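And when finished inside the chroot, roughly:

exit                                          # leave the chroot
umount -R /tmp/1/dev /tmp/1/sys /tmp/1/proc   # the rbind mounts need a recursive unmount
umount /tmp/1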
>https://pve.proxmox.com/wiki/Upgrade_from_7_to_8 said:
>recommended to have access over a host independent channel like iKVM/IPMI or physical access.
>If only SSH is available we recommend testing the upgrade on an identical, but non-production machine first.
Far too mild. Should say: "If only...
/etc/network/interfaces

auto lo
iface lo inet loopback

auto ens1
iface ens1 inet manual
    mtu 1500
    ovs_mtu 1500

auto enp3s0
iface enp3s0 inet manual
    mtu 1500
    ovs_mtu 1500

auto bond0
iface bond0 inet manual
    ovs_bridge vmbr0
    ovs_type OVSBond...
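After edits like the above, a quick way to reapply and sanity-check it (assuming ifupdown2 + OVS):

ifreload -a        # reapply /etc/network/interfaces without a full reboot
ovs-vsctl show     # the bond should appear as a port under vmbr0
ip -br addr        # quick check that the management address came back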
Upgraded my tiny lab cluster today - 3 headless mini-PC nodes, (2) with C2930 and (1) with N3160. Identical drives and RAM, dual NICs, LACP, managed by open-vswitch, with multiple VLANs (including management) across the bundle.
All had green pve7to8 reports.
All dumped networking at the same...
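From the console, roughly where I'd start looking when that happens - a sketch, not the exact fix:

systemctl status openvswitch-switch networking   # did the OVS and ifupdown2 services come up?
journalctl -b -u networking                      # why the bond/bridge failed to configure
systemctl restart networking                     # retry bringing the interfaces up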