Hi!
To create HA resources (i.e., add VMs/containers to the HA stack), a user needs the Sys.Console permission on path /; see the POST request description in [0]. The permissions for the HA stack are very coarse-grained at the moment, but there...
I just rebooted the node containing this particular VM with "Start at boot" set to No, and indeed it was not started automatically.
FYI, I have set the HA state to ignored.
As far as I can see, it works as it should now.
Thank you for your...
Hi!
One more thing: is it possible that the ignored HA resource's underlying guest has "Start at boot" set? Otherwise I could not reproduce the issue at hand, and more information about a minimal and working reproducer would be needed to...
Hi!
As far as I can remember, there was a patch applied for that specific SATA controller. Could you post the syslog from before, during, and after the VM hang, and the output of lspci -s 0000:01:00 -vvnnk?
Hi!
Which package versions are the source and target nodes of the VM running? Is there more information in the syslog of the source or target node in that time frame? pveversion -v returns a full list of the package versions of a node.
Thanks for the reports! The errors described by @zhouu, @capnspacehook and @agross should be fixed with pve-container version 6.1.1 available on the pve-test repository.
Yes, the problem from your log is different from the one that is fixed by the linked patches, but there is already a Bugzilla report for this [0] and this will be fixed by a separate patch set.
[0] https://bugzilla.proxmox.com/show_bug.cgi?id=7271
Hi!
Thanks for the report! I sent a patch for review to fix the issue [0].
[0] https://lore.proxmox.com/pve-devel/20260204091740.102914-2-d.kral@proxmox.com/
Hi!
The first warning indicates that the iothread option is set on a disk that is not a virtio disk or a virtio-scsi-single controller.
The error which causes the VMs to fail to start is the second error about the bridge not being set up...
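As a rough illustration of the first warning above, the rule can be sketched as a small check. This is a hypothetical helper written for this post, not the code Proxmox VE actually uses; the function name and the bus strings are assumptions made for the example.

```python
from typing import Optional


def iothread_is_effective(bus: str, scsihw: Optional[str] = None) -> bool:
    """Hypothetical check: would the iothread option apply to this disk?

    bus    -- assumed disk bus name, e.g. "virtio", "scsi", "sata", "ide"
    scsihw -- assumed SCSI controller type of the VM, if any
    """
    # A virtio disk can use iothread directly.
    if bus == "virtio":
        return True
    # A SCSI disk only qualifies with the virtio-scsi-single controller.
    if bus == "scsi" and scsihw == "virtio-scsi-single":
        return True
    # Every other combination would trigger the warning.
    return False


if __name__ == "__main__":
    for bus, hw in [("virtio", None), ("scsi", "virtio-scsi-single"), ("sata", None)]:
        print(bus, hw, iothread_is_effective(bus, hw))
```

Any disk for which this returns False while iothread is set would match the case described by the warning.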
I'm not sure when the reboot happened here: did it happen before the syslog excerpt (before 2026-02-03T14:52:36.278369+01:00) or in the middle at 2026-02-03T14:56:46.362744+01:00?
If it's the latter, I don't see vm:118 being started here after...
Thanks! Can you also post the syslog for the time period in which the HA resource was unexpectedly started when the node was rebooted? The syslog should contain at least the start and end of the task that starts vm:118 and the pve-ha-lrm's output, ideally...
Hi!
How does the crontab on the PVE node schedule the starting and stopping of the HA resource?
If an HA resource is in the stopped state before the node is rebooted/shut down and started, the HA resource should not start on its own again until...
Hm, yes, these paths/files should exist, because they are the RRD databases, which store the data for the statistics shown in e.g. the summary graph, and they should be created when these resources (node, storage or guests) are created.
Is...
Hi!
Could you also send the output for journalctl -u pve-ha-crm -u pve-ha-lrm --since '2026-01-31 22:25:00' --until '2026-01-31 22:30:00' on the involved nodes (especially the HA Manager)? This shows explicitly when the node maintenance mode is...
Hi!
Which guide did you follow? Does the file /var/lib/rrdcached/db/pve-node-9.0/Lab exist? Is the name of the node itself Lab? Was the hostname changed in between? Is it possible that there is some IP address collision?
Hi!
The HA groups are deprecated and will be removed eventually by Proxmox VE 10. The reason that the group API endpoints and parameters still exist is the transitional state in which not all nodes have been upgraded yet and therefore only a part of...
Thank you for the feedback above; it gave me the kick I needed to shut down the server and check the hardware cases you mentioned.
It turned out to be a motherboard issue (probably what I get for trying to run 128 GB of RAM on a consumer...
Hi!
A bad page with a non-zero reference counter usually means bad memory, a bad disk (if swap - e.g. hibernating - is used), or a kernel bug. Does the system use swap?
Does this happen with other processes? In which kernel version did this...