Recent content by Michiel_1afa

  1.

    Watchdog Reboots

    On my side, we had another reboot this Friday, on 2 PVE hosts. We traced the issue to our Ceph HDD pool being slow at that moment. We're spreading out some copy/sync jobs that go to the HDD pool so that not all hosts hit it at once, but it still feels silly that a node has to...
  2.

    Watchdog Reboots

    The current version of pve-ha-manager is 5.1.0, which does not contain any of these patches. There is also no 'testing' version available yet, and the patches seem a bit much to apply all manually. Do we have any timeline for when a 5.1.1 would come to testing?
  3.

    Watchdog Reboots

    It looks that way, but it absolutely is not; our normal CPU load during the day is kept very low, same with memory. The 'overloaded' state is caused purely by IO delay on the mounted backup volume, which understandably is slower during backup windows. That should, however, not cause a complete PVE node to reboot...
  4.

    Watchdog Reboots

    `journalctl -b -1` (previous boot log), cleaned up and anonymized, from ~15 min before the restart. Jan 24 01:49:28 pve25 vzdump[2026007]: <root@pam> starting task UPID:pve25:001EEA18:06F4E8D3:69741718:vzdump::root@pam: Jan 24 01:49:29 pve25 vzdump[2026008]: INFO: starting new backup job: vzdump...
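
    A minimal sketch of how such an excerpt can be pulled, assuming persistent journald storage; the time window and the identifier filter are only examples:

    ```
    # previous boot's journal
    journalctl -b -1
    # narrow to the window around the restart (times are illustrative)
    journalctl -b -1 --since "01:35" --until "01:55"
    # optionally only the vzdump messages
    journalctl -b -1 -t vzdump
    ```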
  5.

    Watchdog Reboots

    Good to know this affects everyone equally :-) We have had discussions on this topic in the past on this forum. It would be nice to have a way to see the softdog status and to get logging of when the watchers decide NOT to ping the watchdog, for whatever reason. So far this whole thing is a big...
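
    As far as I know there is no dedicated status view today; the closest substitute, assuming the stock softdog + watchdog-mux setup, is checking that the pieces are loaded and pulling their logs around an incident:

    ```
    # confirm the softdog module is loaded (the default Proxmox watchdog backend)
    lsmod | grep softdog
    # logs from the watchdog multiplexer and the HA services for the last hour
    journalctl -u watchdog-mux -u pve-ha-crm -u pve-ha-lrm --since "-1h"
    ```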
  6.

    Watchdog Reboots

    We have had this same issue since replacing our Intel-based nodes with AMD ones. Lately we have had unexpected reboots at least weekly on one or more nodes. For us this always happens during the backup window (lucky?), and we see high IO delay right before the PVE host decides to shit itself. Still...
  7.

    create a VLAN without having a physical switch or changing anything in the router

    That is unfortunate. But yes, you might want to create a second bridge device, plus a VM or container to act as a router between the two bridges.
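
    A minimal sketch of that second bridge in /etc/network/interfaces, assuming vmbr1 is still unused; the subnets in the comment are examples only:

    ```
    auto vmbr1
    iface vmbr1 inet manual
        bridge-ports none
        bridge-stp off
        bridge-fd 0
    # Give the router VM/CT one NIC on vmbr0 and one on vmbr1 and let it
    # route (or NAT) between the two networks, e.g. 192.168.1.0/24 and 10.10.10.0/24.
    ```

    After an `ifreload -a` the new bridge can be selected for guest network devices like any other.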
  8.

    create a VLAN without having a physical switch or changing anything in the router

    You often do not have to, but I would consider it good practice to do so anyway, so you have a clear indication of which LAN these VMs are on.
  9.

    create a VLAN without having a physical switch or changing anything in the router

    Hi moshe, you have 2 options here. You can enable VLAN support on the first (default) bridge, vmbr0. If you put VMs on VLAN 1, they will be able to communicate with the router; VLAN 1 is the default VLAN in most (all) network equipment. If you put them on any VLAN other than 1, they can only communicate...
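
    A sketch of option 1, assuming the usual single-uplink setup (interface name and addresses are examples); this only switches vmbr0 to VLAN-aware mode:

    ```
    auto vmbr0
    iface vmbr0 inet static
        address 192.168.1.10/24
        gateway 192.168.1.1
        bridge-ports ens18
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
    ```

    The VLAN a guest belongs to is then set with the VLAN tag field on its virtual network device.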
  10.

    Uploading ISO's to a different server in the cluster fails.

    This is a bit of a meh issue, easy to work around, but it would be nice to find out what's going on. Affects: at least PVE 8 & 9 - the exact patch version does not matter. - File size in this case does not matter; it happens with any size image. Uploading an ISO to a server in the cluster that is not...
  11.

    Opt-in Linux 6.17 Kernel for Proxmox VE 9 available on test & no-subscription

    Crosspost reply here. I also noticed problems with 6.17.2-2 which are not an issue on 6.17.2-1: https://forum.proxmox.com/threads/super-slow-timeout-and-vm-stuck-while-backing-up-after-updated-to-pve-9-1-1-and-pbs-4-0-20.176444/post-822997 On top of that, in that thread it does not look like...
  12.

    [SOLVED] Super slow, timeout, and VM stuck while backing up, after updated to PVE 9.1.1 and PBS 4.0.20

    Yes, we have determined as a group that the problem is on the PBS (kernel) side, affecting all versions of PVE. I would also like to add that 6.17.2-2 has problems on the PVE side: we have noticed VM disks halting randomly, with 'watchers' being stuck on the Ceph side. This happens with live migrations...
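
    For anyone debugging the same symptom, a hedged way to look at it from the Ceph side (pool and image names are examples) is listing the watchers on the affected disk image:

    ```
    # shows the current watchers on the RBD image backing the VM disk
    rbd status ceph-vm/vm-100-disk-0
    ```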
  13.

    Homelab/Home Office 2-Node Proxmox Setup: When to Use PDM vs PVE-CM (Clustering)?

    No! While this technically might work, it invalidates the cluster; you shouldn't have 2 votes on a single node... You might as well install the qdevice on 1 of the hosts directly (is that even possible?). What exactly do you disagree with? @SInisterPisces can build a valid 2 node cluster, and therefore...
  14.

    Homelab/Home Office 2-Node Proxmox Setup: When to Use PDM vs PVE-CM (Clustering)?

    In our internal documentation, a 'just 2 node' setup is not recommended. The recommendation we make is 2 PVE servers + 1 PBS server (hardware), or something else that can act as a voting node: https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_qdevice_technical_overview In all cases we end up...
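
    A rough sketch of the qdevice setup from the linked docs, with 192.168.1.50 standing in for the PBS (or other external) host:

    ```
    # on the external vote holder, e.g. the PBS machine
    apt install corosync-qnetd

    # on every PVE node
    apt install corosync-qdevice

    # then, from one cluster node, register the external vote
    pvecm qdevice setup 192.168.1.50
    ```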
  15.

    [SOLVED] Super slow, timeout, and VM stuck while backing up, after updated to PVE 9.1.1 and PBS 4.0.20

    Yes, I agree with fabian here: if you look at your 'read' speed, that is stable, and it is scanning the full disk to find the blocks to back up, which can result in parts of the backup actually writing 0 bytes; in nearly all cases it'll write less than it reads. This looks normal.