Hardware wears out just like any other component, and sometimes this is even exacerbated by implementation faults, e.g. the 13th and 14th generation Intel Core 700 and 900 series CPUs had overvoltage problems, which could leave cores permanently degraded or even...
A determined
journalctl --list-boots
then indeed showed 150 reboots in three days. That of course explains the entries in the task log.
The machine is an HP Elitedesk G800 G4 running headless. The fix for the spontaneous reboots...
Which CPU tests have you run? A good test suite that usually reveals signs of hardware trouble is stress-ng, as these kinds of errors typically surface when the CPU is under heavy load. Otherwise, random segfaults of widespread executables...
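For reference, a typical stress-ng run for this kind of testing might look like the sketch below (the timeout is an arbitrary example; --cpu 0 means "one worker per online CPU", and the guard just makes the snippet degrade gracefully when stress-ng isn't installed):

```shell
#!/bin/sh
# Stress all CPU cores, cycling through stress-ng's CPU methods and
# verifying computation results where possible; print a brief summary.
if command -v stress-ng >/dev/null 2>&1; then
    stress-ng --cpu 0 --cpu-method all --verify --timeout 60s --metrics-brief
else
    echo "stress-ng is not installed (try: apt install stress-ng)"
fi
```

The --verify flag is the interesting part here: it makes stress-ng check the results of its computations, so silently corrupting cores show up as verification failures rather than just load.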
Hello!
Have you seen the documentation section about configuring a hardware watchdog? You probably still have the default value (softdog) set there, which is why it resets on reboot.
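If you do want to switch to a hardware watchdog, the module is selected in /etc/default/pve-ha-manager. A sketch, assuming an Intel chipset whose board exposes iTCO_wdt (check which watchdog module your hardware actually supports before setting this):

```
# /etc/default/pve-ha-manager
# Select a hardware watchdog kernel module instead of the default softdog.
# iTCO_wdt is a common choice on Intel chipsets; the right module depends
# on your board.
WATCHDOG_MODULE=iTCO_wdt
```

A reboot is needed afterwards so the HA stack picks up the new module.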
Is there a reason you want to switch to a hardware watchdog?
Hi!
I couldn't fully reproduce the issue yet, but from what I can see there is a small difference in migration times between VM 910 (25 s) and VM 911 (45 s), which might cause the HA Manager to detect a separation between those guests...
After months of hard work and collaboration with our community, we are thrilled to release the beta version of Proxmox Datacenter Manager. This version is based on the great Debian 13 "Trixie" and comes with kernel 6.14.11 as the stable default and...
Hi!
Thanks for the report! I'll look into it; this should be rather straightforward to fix. The package should be able to run independently on a machine without any Proxmox VE packages installed, but it seems that there are still some...
Hi!
Thanks for the feedback! Correct, HA resource affinity rules are currently strict only, as one of the initial main use cases was to provide strict separation in case of a resource limitation, e.g. when a specific PCIe device is only available once...
Thanks for the report! The other bug is already fixed in a newer version and for this bug there's now a proposed fix [0].
[0] https://lore.proxmox.com/pve-devel/20250905101648.79655-1-d.kral@proxmox.com/
Hi!
Have you moved all guests off the host and powered it off before issuing the pvecm delnode $nodename command [0]? The output of pvecm status and an output/excerpt of journalctl would be helpful to diagnose the issue further...
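For reference, the usual inspection steps look roughly like this sketch ($nodename stands for the node being removed; the snippet is guarded so it only does anything on an actual Proxmox VE node):

```shell
#!/bin/sh
# Guarded sketch: inspect cluster state before/after removing a node.
# Run on a REMAINING cluster node, after migrating all guests away from
# the node to be removed and powering it off.
if command -v pvecm >/dev/null 2>&1; then
    pvecm status                                             # confirm quorum; removed node should be offline
    journalctl -u pve-cluster -u corosync -n 100 --no-pager  # recent cluster/corosync log entries
else
    echo "pvecm not found: this must run on a Proxmox VE node"
fi
```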
Hi!
For larger clusters, we usually recommend a proper Ceph setup, but for smaller setups an NFS share can work fine as shared storage.
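For illustration, an NFS storage entry in /etc/pve/storage.cfg looks roughly like this (the storage ID, server address, and export path are placeholders):

```
nfs: shared-nfs
	server 192.168.1.10
	export /export/pve-storage
	path /mnt/pve/shared-nfs
	content images,rootdir
```

The same entry can also be created via the web UI under Datacenter > Storage > Add > NFS.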
For the rest of the post, it is hard to tell without the configurations and without knowing when exactly the error happens. How is...
Hi!
AFAICT this is something that could be implemented in a rather straightforward fashion, so if you want the feature, you can create a feature request at our Bugzilla [0]. Unless I've overlooked it, there is no existing entry with that feature request...
Hi!
A shared storage is the best way to guarantee HA, but it is also possible to configure HA with replication jobs, provided the ZFS pool has the same name on each node and the guests are replicated as frequently as possible...
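As a sketch of the replication-job variant, a frequent schedule could be set up like this (the job ID 100-0 and target node pve2 are placeholders, and the snippet is guarded so it only does anything on an actual Proxmox VE node):

```shell
#!/bin/sh
# Guarded sketch: create a storage replication job for guest 100 that
# replicates to node pve2 every 5 minutes (requires ZFS pools with the
# same name on both nodes), then show the replication status.
if command -v pvesr >/dev/null 2>&1; then
    pvesr create-local-job 100-0 pve2 --schedule "*/5"
    pvesr status
else
    echo "pvesr not found: this must run on a Proxmox VE node"
fi
```

The shorter the schedule interval, the less data can be lost when the HA Manager recovers the guest on the replication target.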
Thanks for the report! This is a known issue and will be fixed with pve-manager >= 9.0.7 [0], which should be packaged soon for pve-no-subscription.
For now, HA rules can also be enabled again by using ha-manager rules set <type> <ruleid>...
Thanks for the report! This is fixed with pve-manager >= 9.0.7 [0], which should be packaged soon for pve-no-subscription.
[0] https://git.proxmox.com/?p=pve-manager.git;a=commit;h=4008a6472ada2bbd0f21c15fd7f5b047d71fcbd3
Just FYI, there is another user reporting problems with an ASM1166 between these kernel versions in [0], but AFAICT the errors are not the same.
[0] https://forum.proxmox.com/threads/asm1166-issues-with-pve-9-kernel-6-14-11-1-pve.170905/
The log just states that it cannot connect to the PBS server, is it offline?
pve01-c pvestatd[3595]: VM 200 qmp command failed - VM 200 qmp command 'query-proxmox-support' failed - unable to connect to VM 200 qmp socket - timeout after 50...
Thanks! It would be great if you added the "[ SOLVED ]" prefix to your post, so that other users with the same problem can find the solution quicker.
On another note, do the ASM1166/the disks support LPM? The lpm-pol 3 indicates that it's...
It's also worth checking the dmesg/syslog around the time the segfaults happen, and whether there are any errors during boot. Were any BIOS settings changed? What about resetting the BIOS settings to defaults?
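As a sketch, the kernel logs of the current and previous boot can be searched like this (the grep pattern is just an example; the fallback covers systems without a persistent journal):

```shell
#!/bin/sh
# Guarded sketch: look for kernel-level errors (MCE, segfaults) in the
# current boot, then dump the tail of the previous boot's kernel log.
if command -v journalctl >/dev/null 2>&1; then
    journalctl -k -b 0 --no-pager | grep -iE 'mce|segfault|hardware error' || true
    journalctl -k -b -1 --no-pager 2>/dev/null | tail -n 50
else
    dmesg | tail -n 50   # fallback on systems without systemd-journald
fi
```

Hardware-induced segfaults often leave MCE (machine check exception) entries in the kernel log shortly before or after the crash, which is why filtering for them is a good first step.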