[SOLVED] Proxmox cluster slow to shutdown.

Inglebard · Feb 4, 2020

Hi,

After an upgrade, a cluster node takes a very long time to shut down (40 minutes, unless I kill it myself). It seems to hang while unmounting. The issue appears on 3 of the 4 nodes in the same cluster.

Edit: Same issue on node 4.

Failed deactivating swap
A job is running for /dev/.... (x5)

Before the upgrade:

Code:

proxmox-ve: 6.1-2 (running kernel: 5.3.13-1-pve)
pve-manager: 6.1-5 (running version: 6.1-5/9bf06119)
pve-kernel-5.3: 6.1-1
pve-kernel-helper: 6.1-1
pve-kernel-4.15: 5.4-12
pve-kernel-5.3.13-1-pve: 5.3.13-1
pve-kernel-4.15.18-24-pve: 4.15.18-52
pve-kernel-4.15.18-21-pve: 4.15.18-48
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.2-pve4
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.13-pve1
libpve-access-control: 6.0-5
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-9
libpve-guest-common-perl: 3.0-3
libpve-http-server-perl: 3.0-3
libpve-storage-perl: 6.1-3
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve3
lxc-pve: 3.2.1-1
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-1
pve-cluster: 6.1-2
pve-container: 3.0-15
pve-docs: 6.1-3
pve-edk2-firmware: 2.20191127-1
pve-firewall: 4.0-9
pve-firmware: 3.0-4
pve-ha-manager: 3.0-8
pve-i18n: 2.0-3
pve-qemu-kvm: 4.1.1-2
pve-xtermjs: 3.13.2-1
qemu-server: 6.1-4
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.2-pve2

After the upgrade:

Code:

proxmox-ve: 6.1-2 (running kernel: 5.3.13-2-pve)
pve-manager: 6.1-5 (running version: 6.1-5/9bf06119)
pve-kernel-5.3: 6.1-2
pve-kernel-helper: 6.1-2
pve-kernel-4.15: 5.4-12
pve-kernel-5.3.13-2-pve: 5.3.13-2
pve-kernel-5.3.13-1-pve: 5.3.13-1
pve-kernel-4.15.18-24-pve: 4.15.18-52
pve-kernel-4.15.18-21-pve: 4.15.18-48
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.2-pve4
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.13-pve1
libpve-access-control: 6.0-5
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-10
libpve-guest-common-perl: 3.0-3
libpve-http-server-perl: 3.0-3
libpve-storage-perl: 6.1-3
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve3
lxc-pve: 3.2.1-1
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-2
pve-cluster: 6.1-3
pve-container: 3.0-18
pve-docs: 6.1-3
pve-edk2-firmware: 2.20191127-1
pve-firewall: 4.0-9
pve-firmware: 3.0-4
pve-ha-manager: 3.0-8
pve-i18n: 2.0-3
pve-qemu-kvm: 4.1.1-2
pve-xtermjs: 4.3.0-1
qemu-server: 6.1-4
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.3-pve1
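
To see at a glance which packages changed between two such listings, a small comparison helper can be used. This is only a sketch: the function name `pve_version_changes` is made up for illustration, and it assumes each listing was saved to a file with one "name: version" pair per line (the format shown above, as printed by `pveversion -v`).

```shell
#!/bin/sh
# Hypothetical helper: list packages whose version differs between two
# saved version listings (one "name: version" pair per line).
pve_version_changes() {
    before=$1
    after=$2
    awk '
        {
            # Split on the first ": " only, so entries such as
            # "6.1-2 (running kernel: ...)" stay in one piece.
            i = index($0, ": ")
            if (i == 0) next
            name = substr($0, 1, i - 1)
            ver  = substr($0, i + 2)
        }
        # First file: remember the version recorded for each package.
        NR == FNR { old[name] = ver; next }
        # Second file: report new packages and changed versions.
        !(name in old)   { printf "%s: (new) -> %s\n", name, ver; next }
        old[name] != ver { printf "%s: %s -> %s\n", name, old[name], ver }
    ' "$before" "$after"
}
```

Usage would be along the lines of `pve_version_changes before.txt after.txt`, where both file names are placeholders. For the two listings above, the changed entries include the running kernel (5.3.13-1-pve to 5.3.13-2-pve), smartmontools (7.0-pve2 to 7.1-pve2) and zfsutils-linux (0.8.2-pve2 to 0.8.3-pve1).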

t.lamprecht · Feb 5, 2020

Hi,

Inglebard said:
After an upgrade, a cluster node takes a very long time to shut down (40 minutes, unless I kill it myself). It seems to hang while unmounting. The issue appears on 3 of the 4 nodes in the same cluster.

Edit: Same issue on node 4.

Failed deactivating swap
A job is running for /dev/.... (x5)

Can you tell us a bit more about the setup? What storage types are used (Ceph, iSCSI, NFS, ...)? What storage does the host itself use for its root partition?

Can you either check the syslog files under /var/log/... or enable the persistent systemd journal (mkdir -p /var/log/journal && systemctl restart systemd-journald.service) and, after the next slow shutdown, check the journal of the previous boot with journalctl -b -1? Look especially at the start of the shutdown process for any errors; the /dev messages could just be follow-up symptoms.
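
The journal steps above can be sketched as a small script. It assumes journald runs with its default Storage=auto setting, under which logs survive a reboot only if /var/log/journal exists; the function names are made up for illustration.

```shell
#!/bin/sh
# Assumption: journald uses the default Storage=auto, so the journal is
# persistent exactly when the journal directory exists.
journal_is_persistent() {
    # $1: journal directory (defaults to the standard location)
    [ -d "${1:-/var/log/journal}" ]
}

# The commands quoted in the post above, wrapped in a function.
# Not called here; it needs root and a running systemd.
enable_persistent_journal() {
    mkdir -p /var/log/journal && systemctl restart systemd-journald.service
}

# After the next slow shutdown, the tail of the previous boot's journal
# covers the shutdown sequence:
#   journalctl -b -1 -n 200 --no-pager
```

A possible flow is `journal_is_persistent || enable_persistent_journal` on each node before the next upgrade, so the log is there when the problem shows up again.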

Inglebard · Feb 5, 2020

Hi,

The storage used is NFS for images and CIFS for backups. The host has the default ext4 root partition.

Unfortunately, when I reboot now I can't reproduce the issue, and the persistent systemd journal was not enabled.

Here is the syslog file. I rebooted at approximately 11:30. The last entry in the log is at 11:38:48; however, the server only shut down at 12:09 before restarting.

ozdjh · Feb 15, 2020

Hi

I've just experienced this as well. The node is a new node we are using just for Ceph exports (i.e. it isn't running any VMs). It was running the "public / free" code from a few weeks ago under kernel 5.0.15-1-pve. I upgraded it to current from the enterprise repo.

From the console (and in syslog) the shutdown looked normal. However, it got stuck for 30 minutes disabling swap. I grabbed screenshots of the console during this part of the reboot and have retyped the details (without some of the IDs) below. The root filesystem is the normal ext4 setup on a pair of mirrored SSDs. We're using Ceph but this node doesn't have any OSDs. It does, however, have a large hardware RAID volume mounted as local storage. That appeared to unmount without any issues though. There's nothing interesting in syslog, just a 35 minute gap after 'systemd[1]: Stopped PVE Cluster Resource Manager Daemon'.

Code:

[FAILED] Failed deactivating swap /dev/pve/swap
[      ] (1 of 5) A stop job is running for /dev/dm-0
[      ] (2 of 5) A stop job is running for /dev/disk/by-id/dm-uuid-xxxxxxx
[      ] (3 of 5) A stop job is running for /dev/disk/by-uuid/xxxxx-xxxx-xxxx-xxxx
[      ] (4 of 5) A stop job is running for /dev/mapper/pve-swap
[      ] (5 of 5) A stop job is running for /dev/disk/by-id/dm-name-pve-swap

** wait 30 minutes **

[ TIME ] Timed out starting Reboot.
[  !!  ] Forcibly rebooting: job timed out
[2425209.626606] watchdog: watchdog0: watchdog did not stop!

ozdjh · Feb 18, 2020

Hi,

Any thoughts on this? We've held off upgrading our cluster as a result of this issue.

Thanks

David

Inglebard · Mar 2, 2020

The issue is still present. Here is the end of the journalctl log:

Code:

Mar 02 09:34:47 node systemd[1]: networking.service: Succeeded.
Mar 02 09:34:47 node systemd[1]: Stopped Raise network interfaces.
Mar 02 09:34:47 node systemd[1]: systemd-sysctl.service: Succeeded.
Mar 02 09:34:47 node systemd[1]: Stopped Apply Kernel Variables.
Mar 02 09:34:47 node systemd[1]: systemd-modules-load.service: Succeeded.
Mar 02 09:34:47 node systemd[1]: Stopped Load Kernel Modules.
Mar 02 09:34:47 node systemd[1]: Stopped target Local File Systems.
Mar 02 09:34:47 node systemd[1]: Stopped target Local File Systems (Pre).
Mar 02 09:34:47 node systemd[1]: Stopping Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling...
Mar 02 09:34:47 node systemd[1]: systemd-tmpfiles-setup-dev.service: Succeeded.
Mar 02 09:34:47 node systemd[1]: Stopped Create Static Device Nodes in /dev.
Mar 02 09:34:47 node systemd[1]: systemd-sysusers.service: Succeeded.
Mar 02 09:34:47 node systemd[1]: Stopped Create System Users.
Mar 02 09:34:47 node lvm[24670]:   3 logical volume(s) in volume group "pve" unmonitored
Mar 02 09:34:47 node systemd[1]: systemd-remount-fs.service: Succeeded.
Mar 02 09:34:47 node systemd[1]: Stopped Remount Root and Kernel File Systems.
Mar 02 09:34:47 node systemd[1]: Reached target Shutdown.
Mar 02 09:34:47 node systemd[1]: lvm2-monitor.service: Succeeded.
Mar 02 09:34:47 node systemd[1]: Stopped Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
Mar 02 09:36:17 node systemd[1]: dev-pve-swap.swap: Deactivation timed out. Stopping.
Mar 02 09:36:45 node systemd[1]: dev-pve-swap.swap: Swap process exited, code=killed, status=15/TERM
Mar 02 09:36:45 node systemd[1]: Failed deactivating swap /dev/pve/swap.
Mar 02 10:03:47 node systemd[1]: reboot.target: Job reboot.target/start timed out.
Mar 02 10:03:47 node systemd[1]: Timed out starting Reboot.
Mar 02 10:03:47 node systemd[1]: reboot.target: Job reboot.target/start failed with result 'timeout'.
Mar 02 10:03:47 node systemd[1]: Forcibly rebooting: job timed out
Mar 02 10:03:47 node systemd[1]: Shutting down.
Mar 02 10:03:47 node systemd[1]: Hardware watchdog 'Software Watchdog', version 0
Mar 02 10:03:47 node systemd[1]: Set hardware watchdog to 10min.
Mar 02 10:03:47 node kernel: watchdog: watchdog0: watchdog did not stop!
Mar 02 10:03:48 node systemd-shutdown[1]: Syncing filesystems and block devices.
Mar 02 10:03:48 node systemd-shutdown[1]: Sending SIGTERM to remaining processes...
Mar 02 10:03:48 node systemd-journald[5373]: Journal stopped
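
To find where the time goes in a log like this, one option is to print every long pause between consecutive lines. The helper below is a sketch only: `log_gaps` is a made-up name, it expects the "Mon DD HH:MM:SS host ..." layout shown above on standard input, and it assumes all lines are from the same day.

```shell
#!/bin/sh
# Hypothetical helper: print every pause longer than N seconds between
# consecutive syslog-style lines ("Mon DD HH:MM:SS host ...").
# Assumption: all lines are from the same day, so only HH:MM:SS is compared.
log_gaps() {
    min_gap=${1:-60}
    awk -v min="$min_gap" '
        {
            # Field 3 is the HH:MM:SS timestamp; convert it to seconds.
            split($3, t, ":")
            now = t[1] * 3600 + t[2] * 60 + t[3]
            if (NR > 1 && now - prev > min)
                printf "%ds gap before: %s\n", now - prev, $0
            prev = now
        }
    '
}
```

Fed with the log above and the default 60 second threshold, this reports two pauses: 90 seconds before "dev-pve-swap.swap: Deactivation timed out" and 1622 seconds (about 27 minutes) before "reboot.target: Job reboot.target/start timed out". The 90 seconds matches systemd's default stop timeout, so the swap unit appears to be the first thing that stalls.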

ozdjh · Mar 2, 2020

Yes. We saw the problem on 2 of our 5 nodes when upgrading. We opened a support case for it but forgot to enable journal persistence on one of the nodes and that was the second one to have the issue. As we couldn't provide any further details we closed the case. It's great that you could provide the details from your journal.

David

Inglebard · Apr 1, 2020

Is there any news on this issue?

Inglebard · May 4, 2020

Is there any news on this issue? I still have it.
@ozdjh, what about you? And what about the support case?

ozdjh · May 5, 2020

As I mentioned in my last post:

As we couldn't provide any further details we closed the case

If you want to open a ticket and reference ours, let them know it was ticket 9647352.

t.lamprecht · May 5, 2020

Inglebard said:
Mar 02 09:34:47 node systemd[1]: Stopped Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
Mar 02 09:36:17 node systemd[1]: dev-pve-swap.swap: Deactivation timed out. Stopping.
Mar 02 09:36:45 node systemd[1]: dev-pve-swap.swap: Swap process exited, code=killed, status=15/TERM

Hmm, sometimes systemd service dependency loops are at fault, e.g. ones involving autogenerated mount point units; those are often logged quite early in the shutdown process. systemd resolves them by deleting a random dependency to break the loop, and depending on which one it picks that can be OK or not. Just an educated guess: we solved one of those with CephFS mount points a while ago, which showed somewhat similar behavior.
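
One way to check for such loops is to search the journal of the affected boot for cycle messages. The filter below is a sketch under an assumption: that systemd's messages contain the phrase "ordering cycle". The exact wording may differ between systemd versions, and the function name is made up.

```shell
#!/bin/sh
# Hypothetical filter: keep only journal lines that mention a systemd
# ordering cycle. Assumption: the messages contain the phrase
# "ordering cycle"; exact wording can differ between systemd versions.
ordering_cycles() {
    grep -i 'ordering cycle'
}

# Example use on the previous boot (needs a persistent journal):
#   journalctl -b -1 --no-pager | ordering_cycles
```

No output means no line matched, which does not rule out other causes.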

The only thing I can see from that log is swap failing to deactivate, so you could also try deactivating swap manually with swapoff -a before starting the reboot.
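
The manual workaround could be wrapped roughly as follows. This is a sketch, not a tested procedure: both function names are made up, the swap usage is read from the "Used" column (KiB) of /proc/swaps, and the reboot is only started if swapoff succeeds.

```shell
#!/bin/sh
# Sum the "Used" column (KiB) of a /proc/swaps-style file.
# $1: file to read (defaults to /proc/swaps)
swap_used_kib() {
    awk 'NR > 1 { used += $4 } END { print used + 0 }' "${1:-/proc/swaps}"
}

# Hypothetical wrapper (not called here; run as root): turn swap off
# first, and only reboot if that succeeded.
reboot_without_swap() {
    echo "swap in use: $(swap_used_kib) KiB"
    swapoff -a && systemctl reboot
}
```

Note that swapoff has to move every swapped-out page back into RAM, so it can take a while or fail if there is not enough free memory.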

Inglebard · May 5, 2020

Hi,
@ozdjh, sorry, I didn't notice.

@t.lamprecht, I will try swapoff -a next time (next month). What can I do to properly identify the issue for a potential future fix?

t.lamprecht · May 5, 2020

Inglebard said:
@t.lamprecht, I will try swapoff -a next time (next month). What can I do to properly identify the issue for a potential future fix?

A full shutdown journal log could help; you can send it to me by email, just add @proxmox.com to my username.
But the swap is currently your smoking gun, so that is definitely worth a try.

Is the ext4 root LV where the swap is located on some HW RAID or the like?

Inglebard · May 6, 2020

@t.lamprecht I sent you a full log.

Is the ext4 root LV where the swap is located on some HW RAID or the like?

Yes, on a PERC 6/i MegaRAID SAS 1078.

Inglebard · Jun 2, 2020

Hi,
Running swapoff -a before the reboot avoids the issue.

swapoff -a takes 3 minutes to complete. I don't know whether that is a lot or not.
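
How long swapoff takes depends largely on how much data sits in swap, since every swapped-out page has to be read back into RAM. To see which processes hold that data, the VmSwap field of /proc/<pid>/status can be listed. The helper below is a sketch with a made-up name; it prints the swapped amount in kB followed by the process name, largest first.

```shell
#!/bin/sh
# Hypothetical helper: list processes by swapped-out memory, largest first,
# using the VmSwap field of /proc/<pid>/status.
# $1: proc directory to scan (defaults to /proc)
swap_by_process() {
    proc=${1:-/proc}
    for f in "$proc"/[0-9]*/status; do
        [ -r "$f" ] || continue
        # Kernel threads have no VmSwap line and are skipped implicitly;
        # only the first word of a process name is printed.
        awk '/^Name:/ { name = $2 }
             /^VmSwap:/ && $2 > 0 { print $2, name }' "$f" 2>/dev/null
    done | sort -rn
}
```

Running `swap_by_process | head` shortly before a reboot gives a rough idea of what swapoff will have to move.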

@t.lamprecht Did you check the log I sent by email?

Inglebard · Aug 4, 2020

Hi,
This seems to be fixed since 6.2.6 or 6.2.10. I no longer need to run swapoff and everything shuts down normally.

Matthew Daniel · Sep 1, 2020

Inglebard said:
Hi,
This seems to be fixed since 6.2.6 or 6.2.10. I no longer need to run swapoff and everything shuts down normally.

I have seen this many times on our clusters and have to use "swapoff -a" to avoid having to hard reset nodes when they get stuck at "Failed deactivating swap /dev/mapper/pve-swap" or waiting for the 30 minute timeout, which forcibly reboots.

This still happened when upgrading the cluster from 6.2.10 to 6.2.11 last week, and I saw it again when rebooting a node today on 6.2.11.

I'm running a mix of HP Proliant G6, G7 and G8 servers and Fujitsu blades.

stra4d · Nov 24, 2020

FYI, we are still experiencing this on update reboots.

We hit it during the latest updates yesterday, for version 6.2-15 (currently pve-manager/6.2-15/48bd51b6, running kernel: 5.4.73-1-pve).

According to the logs (dpkg.log), I think we went from 6.2-12 to 6.2-15.

Code:

2020-11-23 15:44:24 upgrade qemu-server:amd64 6.2-15 6.2-20
...
2020-11-23 15:44:25 upgrade pve-manager:amd64 6.2-12 6.2-15
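
Entries like these can be pulled out of dpkg.log per package. The helper below is a sketch: `pkg_upgrades` is a made-up name, and it assumes upgrade lines have the layout shown above (date, time, "upgrade", name:arch, old version, new version).

```shell
#!/bin/sh
# Hypothetical helper: show upgrade entries for one package from a dpkg log.
# Upgrade lines look like: DATE TIME upgrade NAME:ARCH OLD NEW
# $1: package name, $2: log file (defaults to /var/log/dpkg.log)
pkg_upgrades() {
    pkg=$1
    log=${2:-/var/log/dpkg.log}
    awk -v pkg="$pkg" '
        # Match the exact package name in front of the ":ARCH" suffix.
        $3 == "upgrade" && index($4, pkg ":") == 1 {
            print $1, $2, $5, "->", $6
        }
    ' "$log"
}
```

For example, `pkg_upgrades pve-manager` lists each recorded pve-manager upgrade with its old and new version.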

As with @Matthew Daniel we are running HP Proliant G7 servers.

Martin Jonáš · Dec 31, 2020

Same problem here on latest PVE 6.3-3

deepcloud · May 23, 2021

Martin Jonáš said:
Same problem here on latest PVE 6.3-3

Same issue here...
