I can duplicate this pretty easily. Sorry for the lack of detail; I'm a home lab user and my troubleshooting skills are a little limited.
Fresh install of Proxmox, single node.
Server Specs:
SuperMicro Server
Dual E5-2630 (6 core)
32GB RAM
Seagate Constellation ST91000640NS 1TB 7200 RPM 64MB Cache SATA drives (guest storage 6 drives in ZFS RAID 10)
Kensington 120GB SATA SSD drives (Proxmox system, ZFS RAID 1)
I set up two Ubuntu 20.04 VMs, each with 8GB RAM and a 200GB drive. On both VMs I run the stress command:
Code:
stress -d 2 --hdd-bytes 512M
Eventually I start to see "timeout waiting on systemd" errors on the guest console. Sometimes it takes as little as 10 minutes; other times it can be an hour before it happens.
I couldn't reboot or shut down the guests that had the "timeout waiting on systemd" error. Rebooting the Proxmox server was the only way I could get them to power up again.
My system is Intel, so I added "intel_idle.max_cstate=1" to my grub config, and it did help. It took longer for a guest to show "waiting on systemd" on the console, but the guest still froze eventually and I had to reboot the Proxmox server before I could do anything with the guest again.
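For anyone who wants to try the same kernel parameter, here is a minimal sketch of one way to add it. It assumes the host boots through GRUB and that the kernel command line lives in GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub. The helper name add_kernel_param is made up for this example, and it relies on GNU sed's -i option.

```shell
#!/bin/sh
# Hypothetical helper: append PARAM to GRUB_CMDLINE_LINUX_DEFAULT in FILE.
# Usage: add_kernel_param FILE PARAM
# Note: PARAM must not contain "/" or "&", since it is pasted into a sed expression.
add_kernel_param() {
    file=$1
    param=$2
    # Skip the edit if the parameter already appears on that line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing double quote.
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $param\"/" "$file"
}

# Example (as root on the Proxmox host, after backing up the file):
#   add_kernel_param /etc/default/grub intel_idle.max_cstate=1
#   update-grub
#   reboot
# After the reboot, "cat /proc/cmdline" should list the parameter.
```

If the host boots through systemd-boot instead (as a ZFS root install on UEFI may), the command line is kept in /etc/kernel/cmdline and applied with "proxmox-boot-tool refresh", so the sketch above would not apply as written.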
I then installed qemu-guest-agent on the Ubuntu guests and enabled the QEMU Guest Agent option in the Proxmox guest config. I also disabled Memory Ballooning. With these two changes I still get the "waiting on systemd" error in the guest console, but I can cancel the stress command, reboot, etc. from inside the guests. The system stays operational with those guests.
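The same two changes can also be made from the command line. This is a rough sketch, not necessarily how it was done here (the web UI works just as well). The function names install_agent and configure_vm are made up for this example, and the VM ID 100 is a placeholder.

```shell
#!/bin/sh
# Run inside each Ubuntu guest: install and start the guest agent.
install_agent() {
    sudo apt install -y qemu-guest-agent
    sudo systemctl start qemu-guest-agent
}

# Run on the Proxmox host: enable the agent option and disable ballooning.
# Usage: configure_vm VMID
configure_vm() {
    vmid=$1
    # Refuse anything that is not a plain number so a typo never reaches qm.
    case $vmid in
        ''|*[!0-9]*)
            echo "usage: configure_vm VMID" >&2
            return 1
            ;;
    esac
    # Enable the QEMU Guest Agent option for this VM.
    qm set "$vmid" --agent enabled=1 || return 1
    # A balloon value of 0 disables the memory balloon device.
    qm set "$vmid" --balloon 0
}

# Example on the host (100 is a placeholder VM ID):
#   configure_vm 100
```

As far as I know, the guest has to be fully stopped and started again from Proxmox (not just rebooted from inside) before the new settings take effect.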
Is anyone from Proxmox interested in getting access to the system to test with? I read somewhere that Proxmox can't duplicate this, so maybe there's something specific to my system that can help.