VMs hung after backup

Pavletto

Hello.
After the latest pre-8.1 update, some of our VMs (mostly Linux) have started to hang right after backup.
Backups are made with the latest Proxmox Backup Server (3.0-4) to an NFS datastore.
We have 6 backup jobs that don't overlap in time. Last night a Linux VM hung even though it was the only VM in its backup job.
Code:
Details
VMID    Name    Status    Time    Size    Filename
166    bitrix    ok    2min 14s    600.007 GiB    vm/166/2023-11-29T01:00:04Z
Total running time: 2min 14s
Logs
vzdump 166 --storage PBS_Backup_QNAP2 --mailto _some_mail_addresses_ --notes-template '{{node}}, {{guestname}}' --mailnotification failure --quiet 1 --mode snapshot --prune-backups 'keep-daily=14,keep-last=1,keep-monthly=3'


166: 2023-11-29 04:00:04 INFO: Starting Backup of VM 166 (qemu)
166: 2023-11-29 04:00:04 INFO: status = running
166: 2023-11-29 04:00:04 INFO: VM Name: bitrix
166: 2023-11-29 04:00:04 INFO: include disk 'scsi0' 'CEPH-NVME-POOL:vm-166-disk-0' 102407M
166: 2023-11-29 04:00:04 INFO: include disk 'scsi1' 'CEPH-NVME-POOL:vm-166-disk-1' 500G
166: 2023-11-29 04:00:04 INFO: backup mode: snapshot
166: 2023-11-29 04:00:04 INFO: ionice priority: 7
166: 2023-11-29 04:00:04 INFO: snapshots found (not included into backup)
166: 2023-11-29 04:00:04 INFO: creating Proxmox Backup Server archive 'vm/166/2023-11-29T01:00:04Z'
166: 2023-11-29 04:00:04 INFO: issuing guest-agent 'fs-freeze' command
166: 2023-11-29 04:00:05 INFO: issuing guest-agent 'fs-thaw' command
166: 2023-11-29 04:00:05 INFO: started backup task '43b5d346-f0fb-450e-814d-cf2cab8ced23'
166: 2023-11-29 04:00:05 INFO: resuming VM again
166: 2023-11-29 04:00:05 INFO: scsi0: dirty-bitmap status: OK (24.9 GiB of 100.0 GiB dirty)
166: 2023-11-29 04:00:05 INFO: scsi1: dirty-bitmap status: OK (13.1 GiB of 500.0 GiB dirty)
166: 2023-11-29 04:00:05 INFO: using fast incremental mode (dirty-bitmap), 38.0 GiB dirty of 600.0 GiB total
166: 2023-11-29 04:00:08 INFO:   1% (756.0 MiB of 38.0 GiB) in 3s, read: 252.0 MiB/s, write: 252.0 MiB/s
166: 2023-11-29 04:00:11 INFO:   3% (1.5 GiB of 38.0 GiB) in 6s, read: 249.3 MiB/s, write: 249.3 MiB/s
166: 2023-11-29 04:00:14 INFO:   5% (2.2 GiB of 38.0 GiB) in 9s, read: 246.7 MiB/s, write: 246.7 MiB/s
166: 2023-11-29 04:00:17 INFO:   8% (3.2 GiB of 38.0 GiB) in 12s, read: 344.0 MiB/s, write: 342.7 MiB/s
166: 2023-11-29 04:00:20 INFO:  10% (3.9 GiB of 38.0 GiB) in 15s, read: 256.0 MiB/s, write: 256.0 MiB/s
166: 2023-11-29 04:00:23 INFO:  12% (4.7 GiB of 38.0 GiB) in 18s, read: 245.3 MiB/s, write: 245.3 MiB/s
166: 2023-11-29 04:00:26 INFO:  14% (5.4 GiB of 38.0 GiB) in 21s, read: 249.3 MiB/s, write: 249.3 MiB/s
166: 2023-11-29 04:00:29 INFO:  16% (6.3 GiB of 38.0 GiB) in 24s, read: 312.0 MiB/s, write: 310.7 MiB/s
166: 2023-11-29 04:00:32 INFO:  18% (7.2 GiB of 38.0 GiB) in 27s, read: 286.7 MiB/s, write: 286.7 MiB/s
166: 2023-11-29 04:00:35 INFO:  20% (7.9 GiB of 38.0 GiB) in 30s, read: 241.3 MiB/s, write: 241.3 MiB/s
166: 2023-11-29 04:00:38 INFO:  23% (8.8 GiB of 38.0 GiB) in 33s, read: 324.0 MiB/s, write: 324.0 MiB/s
166: 2023-11-29 04:00:41 INFO:  25% (9.7 GiB of 38.0 GiB) in 36s, read: 293.3 MiB/s, write: 293.3 MiB/s
166: 2023-11-29 04:00:44 INFO:  27% (10.4 GiB of 38.0 GiB) in 39s, read: 249.3 MiB/s, write: 249.3 MiB/s
166: 2023-11-29 04:00:47 INFO:  30% (11.4 GiB of 38.0 GiB) in 42s, read: 357.3 MiB/s, write: 357.3 MiB/s
166: 2023-11-29 04:00:50 INFO:  31% (12.1 GiB of 38.0 GiB) in 45s, read: 237.3 MiB/s, write: 237.3 MiB/s
166: 2023-11-29 04:00:53 INFO:  33% (12.9 GiB of 38.0 GiB) in 48s, read: 260.0 MiB/s, write: 260.0 MiB/s
166: 2023-11-29 04:00:56 INFO:  35% (13.6 GiB of 38.0 GiB) in 51s, read: 250.7 MiB/s, write: 248.0 MiB/s
166: 2023-11-29 04:00:59 INFO:  38% (14.7 GiB of 38.0 GiB) in 54s, read: 358.7 MiB/s, write: 358.7 MiB/s
166: 2023-11-29 04:01:02 INFO:  40% (15.6 GiB of 38.0 GiB) in 57s, read: 302.7 MiB/s, write: 302.7 MiB/s
166: 2023-11-29 04:01:05 INFO:  43% (16.6 GiB of 38.0 GiB) in 1m, read: 353.3 MiB/s, write: 353.3 MiB/s
166: 2023-11-29 04:01:08 INFO:  46% (17.7 GiB of 38.0 GiB) in 1m 3s, read: 366.7 MiB/s, write: 366.7 MiB/s
166: 2023-11-29 04:01:11 INFO:  49% (18.7 GiB of 38.0 GiB) in 1m 6s, read: 362.7 MiB/s, write: 362.7 MiB/s
166: 2023-11-29 04:01:14 INFO:  51% (19.7 GiB of 38.0 GiB) in 1m 9s, read: 316.0 MiB/s, write: 316.0 MiB/s
166: 2023-11-29 04:01:17 INFO:  54% (20.5 GiB of 38.0 GiB) in 1m 12s, read: 298.7 MiB/s, write: 298.7 MiB/s
166: 2023-11-29 04:01:20 INFO:  56% (21.5 GiB of 38.0 GiB) in 1m 15s, read: 317.3 MiB/s, write: 317.3 MiB/s
166: 2023-11-29 04:01:23 INFO:  58% (22.3 GiB of 38.0 GiB) in 1m 18s, read: 278.7 MiB/s, write: 278.7 MiB/s
166: 2023-11-29 04:01:26 INFO:  60% (23.1 GiB of 38.0 GiB) in 1m 21s, read: 265.3 MiB/s, write: 265.3 MiB/s
166: 2023-11-29 04:01:29 INFO:  62% (23.8 GiB of 38.0 GiB) in 1m 24s, read: 264.0 MiB/s, write: 264.0 MiB/s
166: 2023-11-29 04:01:32 INFO:  64% (24.6 GiB of 38.0 GiB) in 1m 27s, read: 258.7 MiB/s, write: 258.7 MiB/s
166: 2023-11-29 04:01:35 INFO:  66% (25.4 GiB of 38.0 GiB) in 1m 30s, read: 266.7 MiB/s, write: 266.7 MiB/s
166: 2023-11-29 04:01:38 INFO:  68% (26.2 GiB of 38.0 GiB) in 1m 33s, read: 262.7 MiB/s, write: 262.7 MiB/s
166: 2023-11-29 04:01:41 INFO:  70% (27.0 GiB of 38.0 GiB) in 1m 36s, read: 276.0 MiB/s, write: 276.0 MiB/s
166: 2023-11-29 04:01:44 INFO:  73% (27.8 GiB of 38.0 GiB) in 1m 39s, read: 302.7 MiB/s, write: 302.7 MiB/s
166: 2023-11-29 04:01:47 INFO:  75% (28.7 GiB of 38.0 GiB) in 1m 42s, read: 301.3 MiB/s, write: 301.3 MiB/s
166: 2023-11-29 04:01:50 INFO:  77% (29.6 GiB of 38.0 GiB) in 1m 45s, read: 286.7 MiB/s, write: 286.7 MiB/s
166: 2023-11-29 04:01:53 INFO:  79% (30.4 GiB of 38.0 GiB) in 1m 48s, read: 268.0 MiB/s, write: 268.0 MiB/s
166: 2023-11-29 04:01:56 INFO:  82% (31.3 GiB of 38.0 GiB) in 1m 51s, read: 313.3 MiB/s, write: 313.3 MiB/s
166: 2023-11-29 04:01:59 INFO:  84% (32.3 GiB of 38.0 GiB) in 1m 54s, read: 346.7 MiB/s, write: 345.3 MiB/s
166: 2023-11-29 04:02:02 INFO:  87% (33.2 GiB of 38.0 GiB) in 1m 57s, read: 325.3 MiB/s, write: 321.3 MiB/s
166: 2023-11-29 04:02:05 INFO:  90% (34.2 GiB of 38.0 GiB) in 2m, read: 341.3 MiB/s, write: 322.7 MiB/s
166: 2023-11-29 04:02:08 INFO:  92% (35.1 GiB of 38.0 GiB) in 2m 3s, read: 304.0 MiB/s, write: 302.7 MiB/s
166: 2023-11-29 04:02:11 INFO:  94% (36.0 GiB of 38.0 GiB) in 2m 6s, read: 308.0 MiB/s, write: 306.7 MiB/s
166: 2023-11-29 04:02:14 INFO:  97% (37.1 GiB of 38.0 GiB) in 2m 9s, read: 358.7 MiB/s, write: 356.0 MiB/s
166: 2023-11-29 04:02:17 INFO:  99% (38.0 GiB of 38.0 GiB) in 2m 12s, read: 313.3 MiB/s, write: 313.3 MiB/s
166: 2023-11-29 04:02:18 INFO: 100% (38.0 GiB of 38.0 GiB) in 2m 13s, read: 20.0 MiB/s, write: 20.0 MiB/s
166: 2023-11-29 04:02:18 INFO: backup is sparse: 76.00 MiB (0%) total zero data
166: 2023-11-29 04:02:18 INFO: backup was done incrementally, reused 562.08 GiB (93%)
166: 2023-11-29 04:02:18 INFO: transferred 38.02 GiB in 133 seconds (292.8 MiB/s)
166: 2023-11-29 04:02:18 INFO: adding notes to backup
166: 2023-11-29 04:02:18 INFO: prune older backups with retention: keep-daily=14, keep-last=1, keep-monthly=3
166: 2023-11-29 04:02:18 INFO: running 'proxmox-backup-client prune' for 'vm/166'
166: 2023-11-29 04:02:18 INFO: pruned 1 backup(s) not covered by keep-retention policy
166: 2023-11-29 04:02:18 INFO: Finished Backup of VM 166 (00:02:14)

[Screenshot attached: bitrix.JPG]

We have a 3-node cluster with all-NVMe Ceph storage. The Proxmox Backup Server also runs as a VM on this cluster, with its datastore connected via NFS to a QNAP.
We had no problems at all before this update. Please advise what we can do to solve this problem.

root@pve-down-1:~# pveversion -v
proxmox-ve: 8.0.2 (running kernel: 6.2.16-19-pve)
pve-manager: 8.0.9 (running version: 8.0.9/fd1a0ae1b385cdcd)
pve-kernel-6.2: 8.0.5
proxmox-kernel-helper: 8.0.5
proxmox-kernel-6.2.16-19-pve: 6.2.16-19
proxmox-kernel-6.2: 6.2.16-19
proxmox-kernel-6.2.16-12-pve: 6.2.16-12
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph: 17.2.7-pve1
ceph-fuse: 17.2.7-pve1
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx7
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.7
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.10
libpve-guest-common-perl: 5.0.5
libpve-http-server-perl: 5.0.5
libpve-rs-perl: 0.8.7
libpve-storage-perl: 8.0.4
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.0.4-1
proxmox-backup-file-restore: 3.0.4-1
proxmox-kernel-helper: 8.0.5
proxmox-mail-forward: 0.2.1
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.1.1
pve-cluster: 8.0.5
pve-container: 5.0.6
pve-docs: 8.0.5
pve-edk2-firmware: 4.2023.08-1
pve-firewall: 5.0.3
pve-firmware: 3.9-1
pve-ha-manager: 4.0.3
pve-i18n: 3.0.7
pve-qemu-kvm: 8.1.2-3
pve-xtermjs: 5.3.0-2
qemu-server: 8.0.8
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.0-pve3
 
Same issue for my CentOS 7 guest VM. It's OK for AlmaLinux 8 and Debian 12.

The CentOS VM has 64G RAM and 16 vCPUs (2 sockets).
 
I can also confirm this for CentOS 7 with CloudLinux and XFS. It doesn't always happen, but it does happen very, very often.
 
Hi,
please share the VM configuration (qm config <ID>) and the output of pveversion -v.

Does the VM eventually recover or is all IO stuck from that point forward? In the latter case, could you check whether running qm suspend <ID> && qm resume <ID> makes it work again? (Note that suspend without --todisk just pauses the VM, it doesn't actually suspend it to disk.)
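
For reference, a minimal sketch of that recovery attempt from the node's shell (VMID 166 from this thread used as an example; the guest-agent ping at the end is optional and assumes the QEMU guest agent is installed and running in the VM):
Code:
# Pause and immediately unpause the VM (no --todisk, so guest RAM stays in place)
qm suspend 166 && qm resume 166
# Optional: check whether the guest agent responds again afterwards
qm agent 166 ping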

Is the issue still there after downgrading QEMU with apt install pve-qemu-kvm=8.0.2-7? You'll need to reboot the VM via the UI (not just within the guest!) or migrate it to a node with the downgrade already installed to actually use the older binary.
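
If you try the downgrade route, a rough sketch of the steps on one node (the version string is the one mentioned above; checking running-qemu in qm status --verbose is just one way to confirm which binary a freshly started VM uses):
Code:
apt install pve-qemu-kvm=8.0.2-7
apt-mark hold pve-qemu-kvm                    # optional: keep apt from upgrading it again
# A running VM keeps its old process; stop and start it (or reboot via the web UI) to pick up the binary
qm shutdown 166 && qm start 166
qm status 166 --verbose | grep running-qemu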
 
root@pve-down-2:~# qm config 166
agent: 1
balloon: 0
boot: order=scsi0
cores: 6
cpu: x86-64-v4
ide2: ISO_Storage_QNAP_1:iso/AlmaLinux-9.2-x86_64-dvd.iso,media=cdrom,size=8853M
machine: q35
memory: 16384
meta: creation-qemu=8.0.2,ctime=1695978206
name: bitrix
net0: virtio=00:15:5D:0B:16:95,bridge=vmbr10
numa: 1
onboot: 1
ostype: l26
parent: Vanyaupdate
scsi0: CEPH-NVME-POOL:vm-166-disk-0,cache=writeback,discard=on,iothread=1,size=102407M
scsi1: CEPH-NVME-POOL:vm-166-disk-1,cache=writeback,discard=on,iothread=1,size=500G
scsihw: virtio-scsi-single
smbios1: uuid=81b168e2-c8a3-474d-b69d-d8fbcb181b34
sockets: 2
tags: prod;var
vga: qxl
vmgenid: 98fec7d0-fe06-4aa8-a7dc-31486df96526
root@pve-down-2:~# pveversion -v
proxmox-ve: 8.0.2 (running kernel: 6.2.16-19-pve)
pve-manager: 8.0.9 (running version: 8.0.9/fd1a0ae1b385cdcd)
pve-kernel-6.2: 8.0.5
proxmox-kernel-helper: 8.0.5
proxmox-kernel-6.2.16-19-pve: 6.2.16-19
proxmox-kernel-6.2: 6.2.16-19
proxmox-kernel-6.2.16-12-pve: 6.2.16-12
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph: 17.2.7-pve1
ceph-fuse: 17.2.7-pve1
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx7
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.7
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.10
libpve-guest-common-perl: 5.0.5
libpve-http-server-perl: 5.0.5
libpve-rs-perl: 0.8.7
libpve-storage-perl: 8.0.4
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.0.4-1
proxmox-backup-file-restore: 3.0.4-1
proxmox-kernel-helper: 8.0.5
proxmox-mail-forward: 0.2.1
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.1.1
pve-cluster: 8.0.5
pve-container: 5.0.6
pve-docs: 8.0.5
pve-edk2-firmware: 4.2023.08-1
pve-firewall: 5.0.3
pve-firmware: 3.9-1
pve-ha-manager: 4.0.3
pve-i18n: 3.0.7
pve-qemu-kvm: 8.1.2-3
pve-xtermjs: 5.3.0-2
qemu-server: 8.0.8
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.0-pve3
Sorry, I didn't understand:
should I try qm suspend <ID> && qm resume <ID> on a working VM, or next time when a VM hangs?
 
Sorry, I didn't understand:
should I try qm suspend <ID> && qm resume <ID> on a working VM, or next time when a VM hangs?
When it hangs, to see if you can recover with this.
 
The VM hung again.
I tried qm resume <ID> - no result. Then I tried qm suspend <ID> && qm resume <ID> - the errors in the console were cleared and the Linux console login screen appeared. But the service (DB) still didn't seem to work (maybe I should have waited a little longer).
After that I did a live migration of this VM to another node, and then the service on this VM started working again.
 
Thank you for reporting back!

For all: If you hit this bug and running qm suspend <ID> && qm resume <ID> does allow the VM to recover, it might be the same bug as described here. If it is, turning off iothread on the VM drives should prevent it.
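
For anyone who wants to apply that workaround from the CLI, a sketch using the drive line from the config posted earlier in this thread (copy your own current drive options and only change iothread; the change only becomes active after a fresh start):
Code:
# Re-set the drive with iothread disabled, keeping the other options unchanged
qm set 166 --scsi0 CEPH-NVME-POOL:vm-166-disk-0,cache=writeback,discard=on,iothread=0,size=102407M
# Apply with a full stop/start (or Reboot in the web UI, not inside the guest)
qm shutdown 166 && qm start 166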
 
Thank you for reporting back!

For all: If you hit this bug and running qm suspend <ID> && qm resume <ID> does allow the VM to recover, it might be the same bug as described here. If it is, turning off iothread on the VM drives should prevent it.
Thanks, I disabled iothread and it's OK now.

When will the patch be merged?
 
Thanks, I disabled iothread and it's OK now.

When will the patch be merged?
There is no patch yet unfortunately. The issue is still being investigated.
 
Is the issue still there after downgrading QEMU with apt install pve-qemu-kvm=8.0.2-7? You'll need to reboot the VM via the UI (not just within the guest!) or migrate it to a node with the downgrade already installed to actually use the older binary.
I downgraded pve-qemu-kvm on 1 node as you suggested (apt install pve-qemu-kvm=8.0.2-7). Unfortunately, after this downgrade I receive an error message when I try to migrate a VM to this node: TASK ERROR: Installed QEMU version '8.0.2' is too old to run machine type 'pc-q35-8.1+pve0', please upgrade node 'pve-down-1'
On the other 2 nodes of the cluster I have pve-qemu-kvm version 8.1.2-3.
Did you mean I should do the downgrade on all 3 nodes? And how should I do this? I only do updates or other unpredictable work on a node after migrating all running VMs to another node, and I move them back only after a successful reboot of the updated node. In this situation it looks like I could end up trapped with all VMs on a single node, or (and I really don't think it's safe) I would have to downgrade this package on a node with running VMs on it.
Please advise what I should do in this situation.
 
I downgraded pve-qemu-kvm on 1 node as you suggested (apt install pve-qemu-kvm=8.0.2-7). Unfortunately, after this downgrade I receive an error message when I try to migrate a VM to this node: TASK ERROR: Installed QEMU version '8.0.2' is too old to run machine type 'pc-q35-8.1+pve0', please upgrade node 'pve-down-1'
Right, you can't live-migrate a VM started with a newer version to an older version when using the latest machine type.
On the other 2 nodes of the cluster I have pve-qemu-kvm version 8.1.2-3.
Did you mean I should do the downgrade on all 3 nodes? And how should I do this? I only do updates or other unpredictable work on a node after migrating all running VMs to another node, and I move them back only after a successful reboot of the updated node. In this situation it looks like I could end up trapped with all VMs on a single node, or (and I really don't think it's safe) I would have to downgrade this package on a node with running VMs on it.
Please advise what I should do in this situation.
If you do want to downgrade, then yes, it should be all three nodes. You would need to shut down the VMs and start them fresh to pick up the newly installed binary after the downgrade. As an alternative, you can disable the iothread setting on the VM's disks. But that also requires a shutdown+start or a reboot via the UI (not within the guest!) to apply the change.
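
Since such changes only apply after a fresh start, it can help to compare what the running VM is actually using with what is still pending (sketch, VMID 166 as an example; --current shows the active values instead of the pending ones):
Code:
qm pending 166           # lists config values that are still waiting for a restart
qm config 166 --current  # shows the values the running VM is actually using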
 
Our VM survived one backup and then failed a few hours before the next one. I then removed iothread and am monitoring the situation.

Suspend and resume had no effect for me.
 
Suspend and resume had no effect for me.
Then it most likely is not the same issue. Please share the full backup task log, the VM configuration (qm config <ID>), the output of pveversion -v and, for a stuck VM, the output of qm status <ID> --verbose.
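
For gathering those details, a quick sketch (VMID 114 as an example; run it on the node where the stuck VM lives, and copy the backup task log from the job's output in the UI):
Code:
qm config 114             # VM configuration
pveversion -v             # package versions on the node
qm status 114 --verbose   # detailed runtime status of the stuck VM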
 
Please share the full backup task log
Code:
INFO: Starting Backup of VM 114 (qemu)
INFO: Backup started at 2023-11-30 22:36:26
INFO: status = running
INFO: VM Name: webserver
INFO: include disk 'scsi0' 'ceph:vm-114-disk-0' 350G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/114/2023-11-30T21:36:26Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '7a252b8c-55bd-4331-84fd-8be1c3196e5f'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: created new
INFO:   0% (764.0 MiB of 350.0 GiB) in 3s, read: 254.7 MiB/s, write: 78.7 MiB/s
INFO:   1% (3.6 GiB of 350.0 GiB) in 23s, read: 148.4 MiB/s, write: 102.4 MiB/s
INFO:   2% (7.0 GiB of 350.0 GiB) in 39s, read: 214.8 MiB/s, write: 50.8 MiB/s
INFO:   3% (11.2 GiB of 350.0 GiB) in 50s, read: 388.0 MiB/s, write: 10.2 MiB/s
INFO:   4% (15.1 GiB of 350.0 GiB) in 55s, read: 807.2 MiB/s, write: 19.2 MiB/s
INFO:   5% (17.9 GiB of 350.0 GiB) in 1m, read: 576.8 MiB/s, write: 73.6 MiB/s
INFO:   6% (21.2 GiB of 350.0 GiB) in 1m 7s, read: 484.0 MiB/s, write: 157.1 MiB/s
INFO:   7% (27.0 GiB of 350.0 GiB) in 1m 10s, read: 1.9 GiB/s, write: 78.7 MiB/s
INFO:   8% (29.0 GiB of 350.0 GiB) in 1m 13s, read: 670.7 MiB/s, write: 78.7 MiB/s
INFO:   9% (33.2 GiB of 350.0 GiB) in 1m 20s, read: 626.3 MiB/s, write: 29.7 MiB/s
INFO:  12% (45.2 GiB of 350.0 GiB) in 1m 23s, read: 4.0 GiB/s, write: 17.3 MiB/s
INFO:  18% (63.1 GiB of 350.0 GiB) in 1m 26s, read: 6.0 GiB/s, write: 5.3 MiB/s
INFO:  24% (85.3 GiB of 350.0 GiB) in 1m 29s, read: 7.4 GiB/s, write: 2.7 MiB/s
INFO:  25% (87.5 GiB of 350.0 GiB) in 1m 38s, read: 251.6 MiB/s, write: 110.2 MiB/s
INFO:  26% (91.1 GiB of 350.0 GiB) in 1m 59s, read: 173.9 MiB/s, write: 91.2 MiB/s
INFO:  27% (94.6 GiB of 350.0 GiB) in 2m 14s, read: 243.2 MiB/s, write: 37.9 MiB/s
INFO:  28% (98.1 GiB of 350.0 GiB) in 2m 24s, read: 359.6 MiB/s, write: 38.4 MiB/s
INFO:  29% (101.6 GiB of 350.0 GiB) in 2m 34s, read: 348.4 MiB/s, write: 50.0 MiB/s
INFO:  30% (105.4 GiB of 350.0 GiB) in 2m 41s, read: 559.4 MiB/s, write: 15.4 MiB/s
INFO:  31% (110.9 GiB of 350.0 GiB) in 2m 44s, read: 1.9 GiB/s, write: 12.0 MiB/s
INFO:  34% (119.9 GiB of 350.0 GiB) in 2m 47s, read: 3.0 GiB/s, write: 21.3 MiB/s
INFO:  38% (133.3 GiB of 350.0 GiB) in 2m 50s, read: 4.5 GiB/s, write: 5.3 MiB/s
INFO:  44% (155.1 GiB of 350.0 GiB) in 2m 53s, read: 7.3 GiB/s, write: 2.7 MiB/s
INFO:  49% (171.7 GiB of 350.0 GiB) in 2m 56s, read: 5.5 GiB/s, write: 42.7 MiB/s
INFO:  50% (175.0 GiB of 350.0 GiB) in 3m 18s, read: 155.6 MiB/s, write: 101.1 MiB/s
INFO:  51% (178.7 GiB of 350.0 GiB) in 3m 32s, read: 270.0 MiB/s, write: 28.0 MiB/s
INFO:  52% (182.0 GiB of 350.0 GiB) in 3m 41s, read: 375.6 MiB/s, write: 28.0 MiB/s
INFO:  53% (185.8 GiB of 350.0 GiB) in 3m 51s, read: 388.4 MiB/s, write: 31.2 MiB/s
INFO:  54% (189.1 GiB of 350.0 GiB) in 3m 57s, read: 564.7 MiB/s, write: 46.0 MiB/s
INFO:  55% (194.5 GiB of 350.0 GiB) in 4m 3s, read: 913.3 MiB/s, write: 70.7 MiB/s
INFO:  58% (204.3 GiB of 350.0 GiB) in 4m 6s, read: 3.3 GiB/s, write: 9.3 MiB/s
INFO:  62% (217.2 GiB of 350.0 GiB) in 4m 9s, read: 4.3 GiB/s, write: 12.0 MiB/s
INFO:  67% (236.5 GiB of 350.0 GiB) in 4m 12s, read: 6.4 GiB/s, write: 6.7 MiB/s
INFO:  73% (256.9 GiB of 350.0 GiB) in 4m 15s, read: 6.8 GiB/s, write: 8.0 MiB/s
INFO:  74% (259.1 GiB of 350.0 GiB) in 4m 31s, read: 138.8 MiB/s, write: 117.8 MiB/s
INFO:  75% (262.7 GiB of 350.0 GiB) in 4m 50s, read: 197.5 MiB/s, write: 70.7 MiB/s
INFO:  76% (266.0 GiB of 350.0 GiB) in 5m, read: 336.0 MiB/s, write: 15.6 MiB/s
INFO:  77% (269.7 GiB of 350.0 GiB) in 5m 11s, read: 340.7 MiB/s, write: 22.2 MiB/s
INFO:  78% (273.5 GiB of 350.0 GiB) in 5m 20s, read: 429.8 MiB/s, write: 4.0 MiB/s
INFO:  79% (276.7 GiB of 350.0 GiB) in 5m 25s, read: 656.0 MiB/s, write: 44.0 MiB/s
INFO:  80% (280.2 GiB of 350.0 GiB) in 5m 37s, read: 300.7 MiB/s, write: 43.0 MiB/s
INFO:  81% (284.3 GiB of 350.0 GiB) in 5m 41s, read: 1.0 GiB/s, write: 32.0 MiB/s
INFO:  82% (287.2 GiB of 350.0 GiB) in 5m 52s, read: 276.0 MiB/s, write: 0 B/s
INFO:  84% (294.7 GiB of 350.0 GiB) in 5m 57s, read: 1.5 GiB/s, write: 14.4 MiB/s
INFO:  89% (312.9 GiB of 350.0 GiB) in 6m, read: 6.1 GiB/s, write: 8.0 MiB/s
INFO:  95% (335.0 GiB of 350.0 GiB) in 6m 3s, read: 7.4 GiB/s, write: 8.0 MiB/s
INFO:  97% (342.7 GiB of 350.0 GiB) in 6m 6s, read: 2.6 GiB/s, write: 0 B/s
INFO:  98% (343.5 GiB of 350.0 GiB) in 6m 9s, read: 273.3 MiB/s, write: 0 B/s
INFO:  99% (346.7 GiB of 350.0 GiB) in 6m 21s, read: 274.7 MiB/s, write: 0 B/s
INFO: 100% (350.0 GiB of 350.0 GiB) in 6m 34s, read: 257.8 MiB/s, write: 0 B/s
INFO: backup is sparse: 272.36 GiB (77%) total zero data
INFO: backup was done incrementally, reused 331.56 GiB (94%)
INFO: transferred 350.00 GiB in 394 seconds (909.6 MiB/s)
INFO: adding notes to backup
INFO: Finished Backup of VM 114 (00:06:34)
INFO: Backup finished at 2023-11-30 22:43:00

VM configuration qm config <ID>
Code:
agent: 1
balloon: 6144
boot: cdn
bootdisk: scsi0
cores: 3
cpu: Broadwell
memory: 12288
name: webserver
net0: virtio=E1:83:E3:0E:8B:08,bridge=vmbr0,tag=666
numa: 1
onboot: 1
ostype: l26
sata0: none,media=cdrom
scsi0: ceph:vm-114-disk-0,cache=writeback,discard=on,size=350G
scsihw: virtio-scsi-single
smbios1: uuid=4b74e509-ff59-470e-bcc9-a2ef289fa7e4
sockets: 2

output of pveversion -v
Code:
proxmox-ve: 8.0.2 (running kernel: 6.2.16-19-pve)
pve-manager: 8.0.4 (running version: 8.0.4/d258a813cfa6b390)
proxmox-kernel-helper: 8.0.3
pve-kernel-5.15: 7.4-8
pve-kernel-5.13: 7.1-9
proxmox-kernel-6.2.16-19-pve: 6.2.16-19
proxmox-kernel-6.2: 6.2.16-19
pve-kernel-5.15.131-1-pve: 5.15.131-2
pve-kernel-5.15.126-1-pve: 5.15.126-1
pve-kernel-5.15.107-2-pve: 5.15.107-2
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-4.15: 5.4-14
pve-kernel-4.15.18-26-pve: 4.15.18-54
pve-kernel-4.15.18-9-pve: 4.15.18-30
ceph: 17.2.7-pve1
ceph-fuse: 17.2.7-pve1
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown: 0.8.41
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.5
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.10
libpve-guest-common-perl: 5.0.5
libpve-http-server-perl: 5.0.5
libpve-rs-perl: 0.8.6
libpve-storage-perl: 8.0.3
libqb0: 1.0.5-1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.4-1
proxmox-backup-file-restore: 3.0.4-1
proxmox-kernel-helper: 8.0.3
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.2
proxmox-widget-toolkit: 4.0.9
pve-cluster: 8.0.4
pve-container: 5.0.5
pve-docs: 8.0.5
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.8-3
pve-ha-manager: 4.0.2
pve-i18n: 3.0.7
pve-qemu-kvm: 8.1.2-1
pve-xtermjs: 5.3.0-2
qemu-server: 8.0.8
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.13-pve1
 
There have been no further disruptions, so the workaround has worked for us.
 
I am late to this party, but I'm having the same issue. qm suspend / qm resume did fix it. Out of 100+ VMs, only three have been doing this - two CentOS 7 VMs and one Ubuntu 22.



Code:
root@pvea2:~# qm config 142
agent: 1
boot: 
cores: 12
cpu: x86-64-v2-AES
memory: 16384
meta: creation-qemu=8.0.2,ctime=1699758968
name: PNET-xx.xx.xx.net
net0: virtio=C6:52:5D:A4:D3:BC,bridge=vmbr1,tag=101
net1: virtio=CE:12:C0:DB:25:55,bridge=vmbr1,tag=305
numa: 0
ostype: l26
parent: upgrade
scsi0: ceph_pool_1:vm-142-disk-0,cache=writeback,discard=on,iothread=1,size=300G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=2fc60165-ba16-47cc-a389-ad188f6177c9
sockets: 1
vmgenid: 7dbb45f7-cda9-4742-813f-b2ede2b15457




root@pvea2:~# pveversion -v
proxmox-ve: 8.1.0 (running kernel: 6.5.11-7-pve)
pve-manager: 8.1.3 (running version: 8.1.3/b46aac3b42da5d15)
proxmox-kernel-helper: 8.1.0
pve-kernel-6.2: 8.0.5
proxmox-kernel-6.5: 6.5.11-7
proxmox-kernel-6.5.11-7-pve-signed: 6.5.11-7
proxmox-kernel-6.2.16-20-pve: 6.2.16-20
proxmox-kernel-6.2: 6.2.16-20
proxmox-kernel-6.2.16-18-pve: 6.2.16-18
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph: 17.2.7-pve1
ceph-fuse: 17.2.7-pve1
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx7
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.7
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.1.0
libpve-guest-common-perl: 5.0.6
libpve-http-server-perl: 5.0.5
libpve-network-perl: 0.9.5
libpve-rs-perl: 0.8.7
libpve-storage-perl: 8.0.5
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve4
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.1.2-1
proxmox-backup-file-restore: 3.1.2-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.2
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.1.3
pve-cluster: 8.0.5
pve-container: 5.0.8
pve-docs: 8.1.3
pve-edk2-firmware: 4.2023.08-2
pve-firewall: 5.0.3
pve-firmware: 3.9-1
pve-ha-manager: 4.0.3
pve-i18n: 3.1.4
pve-qemu-kvm: 8.1.2-4
pve-xtermjs: 5.3.0-2
qemu-server: 8.0.10
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.2-pve1
 
