Dear Proxmox Community,
we are new users of Proxmox VE and quite happy with everything at our beginner level of usage. Now we want to experiment with and test the HA features. During these first steps we have one server on which the services pve-ha-crm and pve-ha-lrm failed after a couple of minutes because the watchdog-mux service was not running. The reason for its failure to start is that systemd hogs the device /dev/watchdog.
Code:
Nov 21 16:37:25 pve-cit1-hv-2-test systemd[1]: Started watchdog-mux.service - Proxmox VE watchdog multiplexer.
Nov 21 16:37:25 pve-cit1-hv-2-test systemd[1]: Using hardware watchdog 'Software Watchdog', version 0, device /dev/watchdog
Nov 21 16:37:25 pve-cit1-hv-2-test systemd[1]: Watchdog running with a hardware timeout of 30s.
Nov 21 16:37:25 pve-cit1-hv-2-test watchdog-mux[2029]: watchdog open: Device or resource busy
Could you help me identify the cause of this behaviour?
In /etc/systemd/system.conf we have only this line active (not commented out):
Code:
/etc/systemd/system.conf
[Manager]
RuntimeWatchdogSec=30
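If this RuntimeWatchdogSec=30 line is what makes PID 1 grab /dev/watchdog, this is roughly what I would try next (a sketch, not yet tested on our systems; the drop-in file name is my own choice):

```shell
# Sketch, assuming RuntimeWatchdogSec=30 is why PID 1 holds /dev/watchdog:
# override it in a drop-in instead of editing system.conf directly.
mkdir -p /etc/systemd/system.conf.d
cat > /etc/systemd/system.conf.d/disable-runtime-watchdog.conf <<'EOF'
[Manager]
RuntimeWatchdogSec=0
EOF
systemctl daemon-reexec                  # PID 1 re-executes and should close the watchdog device
systemctl restart watchdog-mux.service   # should now be able to open /dev/watchdog
```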
I thought it must be deviating BIOS settings, but I found the same configuration on both systems when checking with:
Code:
ipmitool mc watchdog get
Watchdog Timer Use: Reserved (0x00)
Watchdog Timer Is: Stopped
Watchdog Timer Logging: On
Watchdog Timer Action: No action (0x00)
Pre-timeout interrupt: None
Pre-timeout interval: 0 seconds
Timer Expiration Flags: None (0x00)
Initial Countdown: 0.0 sec
Present Countdown: 0.0 sec
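As far as I understand, the BMC watchdog queried by ipmitool is separate from whatever kernel driver backs /dev/watchdog, so I also tried to identify that driver (read-only checks; the module names are just the usual suspects):

```shell
# Which kernel watchdog driver provides /dev/watchdog?
# (softdog vs. hardware drivers such as iTCO_wdt or ipmi_watchdog)
lsmod | grep -Ei 'softdog|wdt|ipmi_watchdog'
cat /sys/class/watchdog/watchdog0/identity 2>/dev/null
```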
On the problematic system, lsof /dev/watchdog shows:
Code:
sudo lsof /dev/watchdog
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
systemd 1 root 59w CHR 10,130 0t0 892 /dev/watchdog
On the working systems:
Code:
sudo lsof /dev/watchdog
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
watchdog- 2096 root 3w CHR 10,130 0t0 864 /dev/watchdog
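To compare the two systems beyond lsof, these read-only checks show what PID 1 itself reports (a sketch; RuntimeWatchdogUSec should reflect the runtime value of the system.conf setting):

```shell
# What does PID 1 report as its runtime watchdog setting?
systemctl show --property=RuntimeWatchdogUSec
# Does PID 1 hold an open file descriptor on the watchdog device?
ls -l /proc/1/fd 2>/dev/null | grep -i watchdog
```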
Where can I find further configuration options for watchdog-mux / systemd regarding /dev/watchdog?
Also: I found the information under https://pve.proxmox.com/wiki/High_Availability_Cluster_4.x quite interesting. Is this still relevant for watchdog configurations? The newer docs are not that verbose regarding watchdog (https://pve.proxmox.com/wiki/High_Availability).
Anyway: thanks for your attention and have a nice day, y'all!
Regards,
Martin
P.S. I will continue searching for the cause and am happy to provide further information. The output of pveversion -v:
Code:
proxmox-ve: 9.0.0 (running kernel: 6.14.11-4-pve)
pve-manager: 9.0.10 (running version: 9.0.10/deb1ca707ec72a89)
proxmox-kernel-helper: 9.0.4
proxmox-kernel-6.14.11-4-pve-signed: 6.14.11-4
proxmox-kernel-6.14: 6.14.11-4
proxmox-kernel-6.14.8-2-pve-signed: 6.14.8-2
proxmox-kernel-6.8.12-13-pve-signed: 6.8.12-13
proxmox-kernel-6.8: 6.8.12-13
proxmox-kernel-6.8.12-4-pve-signed: 6.8.12-4
ceph: 19.2.3-pve2
ceph-fuse: 19.2.3-pve2
corosync: 3.1.9-pve2
criu: 4.1.1-1
ifupdown2: 3.3.0-1+pmx10
intel-microcode: 3.20250812.1~deb13u1
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libproxmox-acme-perl: 1.7.0
libproxmox-backup-qemu0: 2.0.1
libproxmox-rs-perl: 0.4.1
libpve-access-control: 9.0.3
libpve-apiclient-perl: 3.4.0
libpve-cluster-api-perl: 9.0.6
libpve-cluster-perl: 9.0.6
libpve-common-perl: 9.0.11
libpve-guest-common-perl: 6.0.2
libpve-http-server-perl: 6.0.5
libpve-network-perl: 1.1.8
libpve-rs-perl: 0.10.10
libpve-storage-perl: 9.0.13
libspice-server1: 0.15.2-1+b1
lvm2: 2.03.31-2+pmx1
lxc-pve: 6.0.5-1
lxcfs: 6.0.4-pve1
novnc-pve: 1.6.0-3
proxmox-backup-client: 4.0.16-1
proxmox-backup-file-restore: 4.0.16-1
proxmox-backup-restore-image: 1.0.0
proxmox-firewall: 1.2.0
proxmox-kernel-helper: 9.0.4
proxmox-mail-forward: 1.0.2
proxmox-mini-journalreader: 1.6
proxmox-offline-mirror-helper: 0.7.2
proxmox-widget-toolkit: 5.0.5
pve-cluster: 9.0.6
pve-container: 6.0.12
pve-docs: 9.0.8
pve-edk2-firmware: 4.2025.02-4
pve-esxi-import-tools: 1.0.1
pve-firewall: 6.0.3
pve-firmware: 3.17-2
pve-ha-manager: 5.0.4
pve-i18n: 3.6.0
pve-qemu-kvm: 10.0.2-4
pve-xtermjs: 5.5.0-2
qemu-server: 9.0.22
smartmontools: 7.4-pve1
spiceterm: 3.4.1
swtpm: 0.8.0+pve2
vncterm: 1.9.1
zfsutils-linux: 2.3.4-pve1