Very high iowait since proxmox-backup-client 2.1.6-1

cipwurzel

Hi community,

we have a problem and will gladly provide any data you need to debug the situation we are facing:

Since the update of proxmox-backup-client:amd64 from 2.1.5-1 to 2.1.6-1 on 2022-04-25, the backup of a Debian 10 VM with a huge ext4 filesystem and an IMAP server (Dovecot, Maildir) is very slow. The VM is on Ceph.

Before the update the backup took around 15 minutes every 4 hours. Now each backup takes around 1 hour. Something is broken. We are not sure whether we see this behaviour on all VMs that are on Ceph, but this is the only VM where we can really measure it, because of its size, and see it clearly in Munin's 5-minute resolution.

What we have tried
  • install qemu-guest-agent from buster-backports
  • downgrade proxmox-backup-client on the pve node to 2.1.5-1
Any ideas? What data should we supply to help with debugging? What can we try next? We have thought about a buster-backports kernel (5.10 instead of 4.19).

Check out some graphs:
[graphs attached]
 
  • downgrade proxmox-backup-client on the pve node to 2.1.5-1
Did that help? If not, I guess the fault lies elsewhere. You say the VM is on Ceph; maybe the load on the storage was increasing independently? Or did the amount of data grow significantly?
 
Also, the task logs of the backups before/after might give a bit more insight.
 
  • Downgrade to 2.1.5-1 did not help
  • There was no increase in load on the Grafana dashboards for Ceph; of course the throughput increases because of the increased backup traffic
Logfiles will follow later today.

Some graphs about the Dovecot load:
[graphs attached]
 
Now about the log files:
I took one day before the slowdown and one day after, and chose 4 different backup slots per day:

Code:
backup_einhorn_Apr20_0000:INFO: transferred 122.87 GiB in 680 seconds (185.0 MiB/s)
backup_einhorn_Apr20_1200:INFO: transferred 173.87 GiB in 922 seconds (193.1 MiB/s)
backup_einhorn_Apr20_1600:INFO: transferred 287.37 GiB in 1374 seconds (214.2 MiB/s)
backup_einhorn_Apr20_2000:INFO: transferred 205.50 GiB in 1022 seconds (205.9 MiB/s)

backup_einhorn_Apr27_0000:INFO: transferred 112.77 GiB in 1746 seconds (66.1 MiB/s)
backup_einhorn_Apr27_1200:INFO: transferred 180.52 GiB in 2864 seconds (64.5 MiB/s)
backup_einhorn_Apr27_1600:INFO: transferred 210.71 GiB in 3631 seconds (59.4 MiB/s)
backup_einhorn_Apr27_2000:INFO: transferred 152.00 GiB in 2804 seconds (55.5 MiB/s)

The amount of data is similar between the two days. The huge difference is the read speed from Ceph. Is this the backup client or our Ceph? There was no change in VMs or usage.
 


Could you check what else got upgraded on/around that day (/var/log/apt/history.log), either on the PVE or the PBS side? The proxmox-backup-client package isn't actually involved in a VM backup (except for uploading the log at the end); all the heavy lifting is done by pve-qemu-kvm and libproxmox-backup-qemu0.
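Something like this should list the transactions around that date (adjust the date pattern as needed):
Code:
# show apt transactions from late April, including rotated logs
zgrep -h -A4 'Start-Date: 2022-04-2' /var/log/apt/history.log*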
 
Updates on the IMAP server itself - nothing around this date.

Updates on the PBS:
Code:
Start-Date: 2022-04-23  06:56:22
Commandline: apt-get install qemu-guest-agent
Install: qemu-guest-agent:amd64 (1:5.2+dfsg-11+deb11u1), liburing1:amd64 (0.7-3, automatic), libglib2.0-0:amd64 (2.66.8-1, automatic)
End-Date: 2022-04-23  06:56:23

Start-Date: 2022-04-26  12:54:19
Commandline: apt-get dist-upgrade -V
Upgrade: pve-firmware:amd64 (3.3-6, 3.4-1), pve-kernel-helper:amd64 (7.1-14, 7.2-1)
End-Date: 2022-04-26  12:54:43

Start-Date: 2022-04-29  07:16:04
Commandline: apt-get dist-upgrade -V
Upgrade: smartmontools:amd64 (7.2-1, 7.2-pve3)
End-Date: 2022-04-29  07:16:05

Updates on the PVE node:
Code:
Start-Date: 2022-04-25  07:22:34
Commandline: apt-get dist-upgrade
Upgrade: proxmox-widget-toolkit:amd64 (3.4-7, 3.4-9), libpve-rs-perl:amd64 (0.6.0, 0.6.1), pve-qemu-kvm:amd64 (6.2.0-2, 6.2.0-3), proxmox-backup-file-restore:amd64 (2.1.5-1, 2.1.6-1), proxmox-backup-client:amd64 (2.1.5-1, 2.1.6-1)
End-Date: 2022-04-25  07:23:03

The first backup that was slower was the 08:00 run on Monday, April 25, right after the upgrade:
Code:
00:00 INFO: transferred 120.37 GiB in 641 seconds (192.3 MiB/s)
04:00 INFO: transferred 63.20 GiB in 316 seconds (204.8 MiB/s)
08:00 INFO: transferred 93.15 GiB in 1295 seconds (73.7 MiB/s)
12:00 INFO: transferred 168.60 GiB in 2638 seconds (65.4 MiB/s)
16:00 INFO: transferred 177.81 GiB in 2715 seconds (67.1 MiB/s)
 
Three more questions:
- Could you post the full VM config in question?
- When was the host upgraded to 6.2.0-2, and from which version (again, apt history log)?
- Is it possible that you restarted/migrated the VM on the 25th, but missed doing so on the previous QEMU update to 6.2.0-2? In other words, was there a migration/restart of the VM after the upgrade to 6.2.0-2 but before the 25th? (The migration/restart status should be extractable from the PVE task logs for that VMID.)
 
VM config:
Code:
#einhorn%3A IMAP/Mail Server
agent: enabled=1
boot: order=scsi0;ide2;net0
cores: 2
cpu: host
ide2: none,media=cdrom
keyboard: de
memory: 16384
name: einhorn
net0: virtio=00:16:3e:00:18:00,bridge=vmbr0,tag=905
numa: 1
onboot: 0
ostype: l26
rng0: source=/dev/urandom
scsi0: cephrbd:vm-1050-disk-0,cache=writeback,discard=on,size=30G,ssd=1
scsi1: cephrbd:vm-1050-disk-1,cache=writeback,discard=on,size=2500G,ssd=1
scsihw: virtio-scsi-pci
serial0: socket
smbios1: uuid=134210b4-7859-4a3e-81b0-869416a43f9c
sockets: 2
tablet: 0
vmgenid: 36f57825-30d9-4abf-8739-359ab9790daf

- When was the host upgraded to 6.2.0-2, and from which version (again, apt history log)?

Code:
Start-Date: 2022-04-16  13:08:51
Commandline: apt-get dist-upgrade -V
Install: libgbm1:amd64 (20.3.5-1, automatic), libwayland-server0:amd64 (1.18.0-2~exp1.1, automatic), libdrm-common:amd64 (2.4.104-1, automatic), libdrm2:amd64 (2.4.104-1, automatic), libvirglrenderer1:amd64 (0.8.2-5, automatic), libepoxy0:amd64 (1.5.5-1, automatic)
Upgrade: pve-qemu-kvm:amd64 (6.1.1-2, 6.2.0-2), lxcfs:amd64 (4.0.11-pve1, 4.0.12-pve1), libpve-storage-perl:amd64 (7.1-1, 7.1-2), lxc-pve:amd64 (4.0.11-1, 4.0.12-1)
End-Date: 2022-04-16  13:09:26

- Is it possible that you restarted/migrated the VM on the 25th, but missed doing so on the previous QEMU update to 6.2.0-2? In other words, was there a migration/restart of the VM after the upgrade to 6.2.0-2 but before the 25th? (The migration/restart status should be extractable from the PVE task logs for that VMID.)

My process is as follows:
On the morning of April 25th there was a migration. I migrate all VMs away from the host by using HA groups, lowering the priority of the host I plan to update. I update the host and reboot, then we move the VMs back.

It's possible that I do not restart/migrate the VMs on every update. Maybe it was missed with the update to 6.2.0-2.
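(Roughly, the priority change looks like this; the HA group name and priorities are just an example:)
Code:
# lower the priority of the node to be updated so the HA manager migrates its
# VMs to the remaining (higher-priority) nodes before the update
ha-manager groupset prod --nodes "pve01:1,pve02:10,pve03:10,pve04:10,pve05:10"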

[screenshot attached]


The task history does not go back far enough:
Code:
root@pve01.in-berlin.de /var/log/pve/tasks # find .|grep migrate |grep 1050 | xargs ls -ltr
-rw-r----- 1 www-data root 4305 Apr 25 06:09 ./D/UPID:pve01:00265193:0C04AEFF:62661EDD:qmigrate:1050:root@pam:
-rw-r----- 1 www-data root 3289 Apr 25 07:21 ./6/UPID:pve01:00292395:0C0B3D65:62662FA6:qmigrate:1050:root@pam:
-rw-r----- 1 www-data root 3749 Apr 26 16:03 ./D/UPID:pve01:00086F98:00B2D4E2:6267FB6D:qmigrate:1050:root@pam:
-rw-r----- 1 www-data root 3193 May  7 06:41 ./C/UPID:pve01:0011D039:0575F8AA:6275F86C:qmigrate:1050:root@pam:
 
A short period of iostat output on this VM while no backup is running:
Code:
Device            r/s     w/s     rkB/s     wkB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
========================================================================================================================================
sdb              6.67   60.00     33.33    178.00    10.00    28.67  60.00  32.33   14.10    0.41   0.12     5.00     2.97   0.80   5.33
sdb              1.67    0.33     44.67      0.67    20.67     0.00  92.54   0.00    8.40    0.00   0.01    26.80     2.00   6.00   1.20
sdb             24.33    1.00     50.00     70.00     0.33    34.33   1.35  97.17   15.33   12.00   0.40     2.05    70.00   4.42  11.20
sdb              2.67    1.33     45.33     48.00    19.67    23.00  88.06  94.52    3.62    4.25   0.02    17.00    36.00   4.67   1.87
sdb              2.33    3.67      4.67     32.00     0.00    12.67   0.00  77.55    9.71    8.00   0.08     2.00     8.73   9.33   5.60
sdb             24.00    0.33    145.33      4.67    41.33     0.00  63.27   0.00    9.69    0.00   0.23     6.06    14.00   6.03  14.67
sdb              8.33    1.00     58.00     34.67    20.00    16.67  70.59  94.34   14.68    6.33   0.13     6.96    34.67   6.86   6.40
sdb              0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
sdb              0.00    4.00      0.00    381.33     0.00    14.33   0.00  78.18    0.00    3.42   0.03     0.00    95.33   4.33   1.73
sdb              0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00

And with a backup running:
Code:
Device            r/s     w/s     rkB/s     wkB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
========================================================================================================================================
sdb              8.67    8.67     60.00    985.33    20.33    42.00  70.11  82.89  201.88  556.62   6.99     6.92   113.69  56.46  97.87
sdb             11.81    3.85     39.56    119.23     7.97    12.09  40.28  75.86  152.88  364.36   3.43     3.35    31.00  53.89  84.40
sdb             10.00    0.33     84.67      0.67    31.33     0.00  75.81   0.00  160.63 1013.00   1.94     8.47     2.00 102.84 106.27
sdb              9.00    1.33     18.00     37.33     0.00    17.67   0.00  92.98  218.81  515.25   2.80     2.00    28.00  95.10  98.27
sdb              9.33    0.33     58.67      0.67    20.00     0.00  68.18   0.00  181.32 1024.00   2.03     6.29     2.00  86.21  83.33
sdb              7.67    2.33     42.00    108.00    10.33    23.00  57.41  90.79  163.61  172.14   1.78     5.48    46.29  68.93  68.93
sdb              4.67    1.67      9.33     17.33     0.00     7.33   0.00  81.48  245.07  674.60   2.39     2.00    10.40 174.95 110.80
sdb              5.00    9.00     29.33     64.67     9.67    23.00  65.91  71.88  278.53  148.74   2.79     5.87     7.19  66.38  92.93
sdb              9.33    5.33     18.67     50.67     0.00    12.67   0.00  70.37  114.75  289.69   2.75     2.00     9.50  60.09  88.13
sdb             16.33    2.67     72.67     63.33    20.00    14.00  55.05  84.00   76.76  928.12   3.91     4.45    23.75  53.61 101.87
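(Output in this shape comes from extended iostat inside the guest, e.g. sampling the data disk at a few-second interval:)
Code:
# extended per-device statistics for the data disk, 3-second interval
iostat -dx sdb 3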
 
OK, I checked the load on the Ceph cluster while the backup is running (see attached file) and it is not critical. We have 5 nodes. The IMAP server is on pve04 and we use
Code:
rbd_read_from_replica_policy = localize
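(For context, rbd_read_from_replica_policy is a client-side librbd option; on our nodes it sits in ceph.conf roughly like this:)
Code:
# /etc/ceph/ceph.conf on the PVE nodes
[client]
    rbd_read_from_replica_policy = localize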
 


Thanks for the info (and a pity that the task logs have already been rotated out...). Could you try enabling iothread on the disks (and switching the SCSI hardware to VirtIO SCSI single)? Unfortunately these changes require a stop/start cycle of the VM, which means the dirty bitmap gets invalidated, but they should improve performance in general and might avoid an issue with newer QEMU versions.
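For this VM that would roughly be the following (options copied from the config you posted; the GUI works just as well):
Code:
qm set 1050 --scsihw virtio-scsi-single
qm set 1050 --scsi0 cephrbd:vm-1050-disk-0,cache=writeback,discard=on,iothread=1,size=30G,ssd=1
qm set 1050 --scsi1 cephrbd:vm-1050-disk-1,cache=writeback,discard=on,iothread=1,size=2500G,ssd=1
qm stop 1050 && qm start 1050   # full stop/start, not a reboot from inside the guest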
 
Hi Fabian,
I waited around 1.5 h to let the IMAP caches and processes settle a bit before starting a manual backup after your suggested changes to the VM. It got worse, and the IMAP server did not look healthy :-(
Code:
top - 10:07:52 up  1:38,  1 user,  load average: 75.47, 50.88, 24.63
Tasks: 1651 total,   1 running, 1650 sleeping,   0 stopped,   0 zombie
%Cpu0  :  0.9 us,  0.6 sy,  0.0 ni,  0.0 id, 98.4 wa,  0.0 hi,  0.1 si,  0.0 st
%Cpu1  :  0.3 us,  0.4 sy,  0.0 ni,  0.0 id, 99.2 wa,  0.0 hi,  0.1 si,  0.0 st
%Cpu2  :  1.6 us,  1.5 sy,  0.0 ni,  0.0 id, 96.5 wa,  0.0 hi,  0.4 si,  0.0 st
%Cpu3  :  1.0 us,  0.6 sy,  0.0 ni,  0.0 id, 98.3 wa,  0.0 hi,  0.0 si,  0.1 st
MiB Mem :  16008.8 total,   6338.0 free,   2254.9 used,   7415.9 buff/cache
MiB Swap:   8192.0 total,   8192.0 free,      0.0 used.  13422.8 avail Mem


[screenshot attached]
I canceled the backup after some minutes:
Code:
INFO: starting new backup job: vzdump 1050 --mode snapshot --mailnotification always --prune-backups 'keep-all=1' --node pve04 --mailto automat@in-berlin.de --storage pbs-n15 --notes-template '{{guestname}}' --all 0
INFO: Starting Backup of VM 1050 (qemu)
INFO: Backup started at 2022-05-16 09:58:49
INFO: status = running
INFO: VM Name: einhorn
INFO: include disk 'scsi0' 'cephrbd:vm-1050-disk-0' 30G
INFO: include disk 'scsi1' 'cephrbd:vm-1050-disk-1' 2500G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: skip unused drive 'cephrbd:vm-1050-disk-2' (not included into backup)
INFO: creating Proxmox Backup Server archive 'vm/1050/2022-05-16T07:58:49Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: enabling encryption
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '7e804287-56c8-4984-85a0-9276a6a447d4'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: created new
INFO: scsi1: dirty-bitmap status: created new
INFO:   0% (424.0 MiB of 2.5 TiB) in 3s, read: 141.3 MiB/s, write: 34.7 MiB/s
INFO:   1% (25.4 GiB of 2.5 TiB) in 5m 11s, read: 83.0 MiB/s, write: 21.4 MiB/s
ERROR: interrupted by signal
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 1050 failed - interrupted by signal
INFO: Failed at 2022-05-16 10:08:12
ERROR: Backup job failed - interrupted by signal
TASK ERROR: interrupted by signal
 
Thanks for the feedback!

@Fabian_E managed to narrow down a possible culprit in Qemu that seems to make backing up empty / all-zero blocks a lot slower in Qemu >= 6.2:

https://gitlab.com/qemu-project/qemu/-/commit/0347a8fd4c
https://gitlab.com/qemu-project/qemu/-/issues/1026

Since enabling iothreads doesn't seem to have helped, switching to KRBD (which means QEMU only accesses the RBD image as a regular block device; all the heavy Ceph <-> blockdev lifting is done by the 'rbd' module inside the Linux kernel) might be a viable workaround. We are investigating options on how to proceed (fix, revert as a stop-gap measure, ...).
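Assuming the storage ID 'cephrbd' from your VM config, that is a single storage option; running VMs only pick it up after a stop/start or migration:
Code:
# switch the existing Ceph RBD storage to kernel RBD (krbd)
pvesm set cephrbd --krbd 1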
 
Thanks for your details and fast support on this topic. If I can help with any debugging information or tests, please let me know.
 
I did some tests with krbd and the difference is amazing:
  • krbd: INFO: transferred 32.00 GiB in 62 seconds (528.5 MiB/s)
  • rbd: INFO: transferred 32.00 GiB in 249 seconds (131.6 MiB/s)
The backup was deleted before each run to force a full backup. While the backup with rbd was running I also saw timeouts in the log of the PVE node:
Code:
May 17 08:18:52 pve05 pve-ha-lrm[2440992]: VM 3025 qmp command failed - VM 3025 qmp command 'query-status' failed - got timeout
May 17 08:18:52 pve05 pve-ha-lrm[2440992]: VM 3025 qmp command 'query-status' failed - got timeout
(more in attached files)
 


Hi,

I raised this topic as I was having issues backing up a large 3.6 TB guest - https://forum.proxmox.com/threads/c...t-iowait-through-the-roof.109232/#post-473772

Since switching the drives over to run via KRBD, I've been able to run unthrottled backups without hitting iowait issues. This is interesting because I only started experiencing this issue after an update to PBS.

Yesterday I ran a backup limited to 80 MiB/s via the bwlimit setting in /etc/vzdump.conf; it took 17 hours and the machine was very unstable for the duration.
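(For reference, that limit looks roughly like this; vzdump's bwlimit is given in KiB/s, so 80 MiB/s is 81920:)
Code:
# /etc/vzdump.conf
bwlimit: 81920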

This morning, with KRBD and no throttling, I'm 24% through the same backup (a fresh backup, because the dirty bitmaps were lost) in 19 minutes... The guest is perfectly stable.
 
Note that you can likely switch back to librbd after upgrading to the fixed pve-qemu-kvm version (only newly started KVM processes have the fix, so either restart or migrate VMs for it to take effect!):

pve-qemu-kvm (6.2.0-8) bullseye; urgency=medium

* revert "block/rbd: implement bdrv_co_block_status" to work around
performance regression when backing up large RBD disk
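i.e., roughly the following on each node (the VMID and target node are just examples):
Code:
apt update && apt full-upgrade           # pulls in pve-qemu-kvm >= 6.2.0-8
pveversion -v | grep pve-qemu-kvm        # confirm the installed version
qm migrate 1050 pve01 --online           # or stop/start, so the VM runs on the new binary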
 
Hi Fabian,

I switched back on Saturday because I had seen the change, and it is working fast again.

Thanks a lot for your work on Proxmox - it's a great environment for virtualization!
 
