[cant reproduce] After PBS upgrade, backups freezing my VMs

mailinglists

Hi guys,

I have been running a Proxmox VE cluster on version 6, together with PBS, for quite some time. We use ZFS for everything.
I recently added new nodes and upgraded everything: PM to the latest 7, as well as PBS.

When the Proxmox nodes were upgraded, there were no issues.
As soon as I upgraded the first PBS server, virtual machines started freezing during backups.

I guess I am not the only one with this issue. Rather than reinvent the wheel, I would like to ask whether there is a workaround that lets our cluster run backups against the latest PBS without the VMs freezing. I guess I could try to downgrade PBS, but that does not seem like a proper solution.
 
Hi,
do you experience high IO wait during the backups? Then you might want to limit the number of workers used for the backup, see here for how.
 
Thank you for your answer, Fiona.

High IO wait on backup server or on client (PM) server?
On the PM / client / hypervisor there is zero IO wait.
On storage there is.
I will follow the link you gave me.

I also found out that now that dirty bitmaps are in use and the data to transfer is below 30-60 GB, everything works just fine.
I will live-migrate a 1 TB VM to lose its dirty bitmaps, then start the backup and check whether it stalls again.
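
To compare IO wait on the hypervisor and on the backup server with the same yardstick, the figure can be sampled straight from /proc/stat. The following is only a minimal sketch, assuming a Linux host; the function names are made up for illustration and are not part of any Proxmox tooling.

```python
from __future__ import annotations

import time


def parse_cpu_line(line: str) -> list[int]:
    """Return the counters of the aggregate 'cpu' line from /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def iowait_percent(before: list[int], after: list[int]) -> float:
    """Share of CPU time spent in iowait between two samples, in percent."""
    # Use the first 8 counters (user, nice, system, idle, iowait, irq,
    # softirq, steal); guest time is already included in user/nice.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    # iowait is the fifth counter, i.e. index 4.
    return 100.0 * deltas[4] / total if total else 0.0


def sample_iowait(interval_s: float = 5.0, path: str = "/proc/stat") -> float:
    """Read /proc/stat twice, interval_s apart, and return the iowait share."""
    with open(path) as f:
        first = parse_cpu_line(f.readline())
    time.sleep(interval_s)
    with open(path) as f:
        second = parse_cpu_line(f.readline())
    return iowait_percent(first, second)
```

Calling `sample_iowait()` on each machine while a backup is running gives a number that can be compared directly between the PM node and the PBS host.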
 
mailinglists said:
High IO wait on backup server or on client (PM) server?

The other reports mentioned that the client storage was overloaded, which could explain the VM hangs.

mailinglists said:
On the PM / client / hypervisor there is zero IO wait.

During the VM hangs? What load do you observe then? If reducing workers doesn't help, you might want to try configuring a bwlimit instead.

mailinglists said:
I will do a live migration of 1 TB big VM, to lose dirty maps, start the backup and check if it stalls again.

Live-migration preserves dirty bitmaps ;)
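
For a one-off test of the two suggestions above (fewer workers, or a bandwidth limit), the vzdump call can be assembled as below. This is a rough sketch, assuming the installed vzdump accepts `--performance max-workers=N` (available in newer PVE 7 releases) and `--bwlimit` in KiB/s; the helper name and the example values are purely illustrative.

```python
from __future__ import annotations


def vzdump_command(
    vmid: int,
    storage: str,
    max_workers: int | None = None,
    bwlimit_kib: int | None = None,
) -> list[str]:
    """Build a vzdump argument list with optional throttling settings."""
    cmd = ["vzdump", str(vmid), "--mode", "snapshot", "--storage", storage]
    if max_workers is not None:
        if max_workers < 1:
            raise ValueError("max_workers must be at least 1")
        # Fewer workers means fewer parallel reads against the source storage.
        cmd += ["--performance", f"max-workers={max_workers}"]
    if bwlimit_kib is not None:
        # vzdump interprets the bandwidth limit in KiB/s.
        cmd += ["--bwlimit", str(bwlimit_kib)]
    return cmd
```

The returned list can be passed to `subprocess.run` on the node, or joined with `shlex.join` to get a command line to paste into a shell. For example, `vzdump_command(142, "p36backup", max_workers=4)` targets the VM and storage from this thread; the value 4 is just a starting point to experiment with.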
 
Thank you for your reply Fiona.

We have 0.1 IO wait on the PM hypervisor where the VMs run.
I shut down a VM to destroy its dirty bitmap and ran the backup job.
It completed perfectly. We cannot reproduce the issue anymore and all jobs are enabled again.
It seems the issue only occurred on the first backup job run after the PBS server upgrade.

I have one more PBS server to update. I will update it within 14 days or so. If the issue pops up again, I will let you know.
For now I do not want to put more time into this, as it seems to be working as expected.

Code:
INFO: starting new backup job: vzdump 142 --mode snapshot --storage p36backup --notes-template '{{guestname}}' --remove 0 --node p42
INFO: Starting Backup of VM 142 (qemu)
INFO: Backup started at 2023-02-03 10:53:47
INFO: status = running
INFO: VM Name: whm.somedomain.proxmox
INFO: include disk 'scsi0' 'local-zfs:vm-142-disk-0' 62G
INFO: include disk 'scsi1' 'local-zfs:vm-142-disk-1' 800G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/142/2023-02-03T09:53:47Z'
INFO: enabling encryption
INFO: started backup task '82ca2cbe-792b-443d-a359-dd8d87004b9f'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: created new
INFO: scsi1: dirty-bitmap status: created new
INFO:   0% (1.3 GiB of 862.0 GiB) in 3s, read: 432.0 MiB/s, write: 28.0 MiB/s
INFO:   1% (8.6 GiB of 862.0 GiB) in 21s, read: 419.8 MiB/s, write: 33.3 MiB/s
INFO:   2% (17.4 GiB of 862.0 GiB) in 43s, read: 405.8 MiB/s, write: 66.2 MiB/s
INFO:   3% (26.3 GiB of 862.0 GiB) in 1m 4s, read: 434.7 MiB/s, write: 22.7 MiB/s
INFO:   4% (34.8 GiB of 862.0 GiB) in 1m 24s, read: 435.0 MiB/s, write: 17.4 MiB/s
INFO:   5% (43.4 GiB of 862.0 GiB) in 1m 46s, read: 400.2 MiB/s, write: 36.0 MiB/s
INFO:   6% (52.0 GiB of 862.0 GiB) in 2m 6s, read: 443.6 MiB/s, write: 9.4 MiB/s
INFO:   7% (60.5 GiB of 862.0 GiB) in 2m 26s, read: 434.4 MiB/s, write: 19.8 MiB/s
INFO:   8% (69.1 GiB of 862.0 GiB) in 2m 46s, read: 437.8 MiB/s, write: 20.0 MiB/s
INFO:   9% (77.7 GiB of 862.0 GiB) in 3m 6s, read: 444.2 MiB/s, write: 8.6 MiB/s
INFO:  10% (86.5 GiB of 862.0 GiB) in 3m 27s, read: 428.0 MiB/s, write: 28.0 MiB/s
INFO:  11% (94.8 GiB of 862.0 GiB) in 3m 47s, read: 426.2 MiB/s, write: 29.2 MiB/s
INFO:  12% (103.7 GiB of 862.0 GiB) in 4m 8s, read: 433.0 MiB/s, write: 23.4 MiB/s
INFO:  13% (112.3 GiB of 862.0 GiB) in 4m 28s, read: 439.4 MiB/s, write: 16.2 MiB/s
INFO:  14% (120.9 GiB of 862.0 GiB) in 4m 50s, read: 398.5 MiB/s, write: 59.1 MiB/s
INFO:  15% (129.5 GiB of 862.0 GiB) in 5m 10s, read: 443.2 MiB/s, write: 12.2 MiB/s
INFO:  16% (138.2 GiB of 862.0 GiB) in 5m 30s, read: 442.4 MiB/s, write: 14.0 MiB/s
INFO:  17% (146.9 GiB of 862.0 GiB) in 5m 50s, read: 446.2 MiB/s, write: 6.4 MiB/s
INFO:  18% (155.2 GiB of 862.0 GiB) in 6m 9s, read: 448.0 MiB/s, write: 6.9 MiB/s
INFO:  19% (163.8 GiB of 862.0 GiB) in 6m 29s, read: 440.8 MiB/s, write: 11.2 MiB/s
INFO:  20% (172.8 GiB of 862.0 GiB) in 6m 50s, read: 436.8 MiB/s, write: 13.1 MiB/s
INFO:  21% (181.2 GiB of 862.0 GiB) in 7m 10s, read: 432.6 MiB/s, write: 13.2 MiB/s
INFO:  22% (189.8 GiB of 862.0 GiB) in 7m 30s, read: 439.0 MiB/s, write: 8.2 MiB/s
INFO:  23% (198.4 GiB of 862.0 GiB) in 7m 51s, read: 419.0 MiB/s, write: 9.3 MiB/s
INFO:  24% (207.3 GiB of 862.0 GiB) in 8m 12s, read: 433.5 MiB/s, write: 14.3 MiB/s
INFO:  25% (215.7 GiB of 862.0 GiB) in 8m 32s, read: 430.6 MiB/s, write: 20.2 MiB/s
INFO:  26% (224.3 GiB of 862.0 GiB) in 8m 52s, read: 439.0 MiB/s, write: 14.4 MiB/s
INFO:  27% (233.0 GiB of 862.0 GiB) in 9m 12s, read: 447.6 MiB/s, write: 9.2 MiB/s
INFO:  28% (241.7 GiB of 862.0 GiB) in 9m 32s, read: 444.8 MiB/s, write: 13.6 MiB/s
INFO:  29% (250.2 GiB of 862.0 GiB) in 9m 52s, read: 437.0 MiB/s, write: 18.0 MiB/s
INFO:  30% (258.7 GiB of 862.0 GiB) in 10m 12s, read: 431.8 MiB/s, write: 22.4 MiB/s
INFO:  31% (267.3 GiB of 862.0 GiB) in 10m 32s, read: 442.0 MiB/s, write: 11.8 MiB/s
INFO:  32% (275.9 GiB of 862.0 GiB) in 10m 52s, read: 439.8 MiB/s, write: 10.2 MiB/s
INFO:  33% (284.6 GiB of 862.0 GiB) in 11m 12s, read: 448.8 MiB/s, write: 6.0 MiB/s
INFO:  34% (293.3 GiB of 862.0 GiB) in 11m 32s, read: 443.0 MiB/s, write: 10.6 MiB/s
INFO:  35% (302.0 GiB of 862.0 GiB) in 11m 52s, read: 443.4 MiB/s, write: 10.6 MiB/s
INFO:  36% (310.4 GiB of 862.0 GiB) in 12m 12s, read: 430.4 MiB/s, write: 14.8 MiB/s
INFO:  37% (319.2 GiB of 862.0 GiB) in 12m 34s, read: 411.3 MiB/s, write: 43.3 MiB/s
INFO:  38% (327.7 GiB of 862.0 GiB) in 12m 54s, read: 435.4 MiB/s, write: 17.4 MiB/s
INFO:  39% (336.4 GiB of 862.0 GiB) in 13m 14s, read: 446.4 MiB/s, write: 5.2 MiB/s
INFO:  40% (345.1 GiB of 862.0 GiB) in 13m 34s, read: 444.6 MiB/s, write: 5.0 MiB/s
INFO:  41% (353.4 GiB of 862.0 GiB) in 13m 54s, read: 426.0 MiB/s, write: 5.4 MiB/s
INFO:  42% (362.1 GiB of 862.0 GiB) in 14m 14s, read: 444.6 MiB/s, write: 12.4 MiB/s
INFO:  43% (370.8 GiB of 862.0 GiB) in 14m 34s, read: 447.2 MiB/s, write: 4.8 MiB/s
INFO:  44% (379.5 GiB of 862.0 GiB) in 14m 54s, read: 445.8 MiB/s, write: 7.2 MiB/s
INFO:  45% (388.3 GiB of 862.0 GiB) in 15m 14s, read: 447.4 MiB/s, write: 4.0 MiB/s
INFO:  46% (396.8 GiB of 862.0 GiB) in 15m 34s, read: 437.6 MiB/s, write: 15.6 MiB/s
INFO:  47% (405.2 GiB of 862.0 GiB) in 15m 55s, read: 410.3 MiB/s, write: 39.8 MiB/s
INFO:  48% (414.1 GiB of 862.0 GiB) in 16m 16s, read: 432.2 MiB/s, write: 12.6 MiB/s
INFO:  49% (422.8 GiB of 862.0 GiB) in 16m 36s, read: 444.2 MiB/s, write: 7.6 MiB/s
INFO:  50% (431.4 GiB of 862.0 GiB) in 16m 56s, read: 441.4 MiB/s, write: 6.6 MiB/s
INFO:  51% (440.0 GiB of 862.0 GiB) in 17m 16s, read: 440.8 MiB/s, write: 4.0 MiB/s
INFO:  52% (448.3 GiB of 862.0 GiB) in 17m 35s, read: 444.8 MiB/s, write: 7.6 MiB/s
INFO:  53% (456.9 GiB of 862.0 GiB) in 17m 55s, read: 443.8 MiB/s, write: 8.2 MiB/s
INFO:  54% (465.6 GiB of 862.0 GiB) in 18m 15s, read: 443.6 MiB/s, write: 10.4 MiB/s
INFO:  55% (474.3 GiB of 862.0 GiB) in 18m 35s, read: 444.6 MiB/s, write: 5.6 MiB/s
INFO:  56% (482.9 GiB of 862.0 GiB) in 18m 55s, read: 443.0 MiB/s, write: 8.8 MiB/s
INFO:  57% (491.6 GiB of 862.0 GiB) in 19m 15s, read: 443.2 MiB/s, write: 7.6 MiB/s
INFO:  58% (500.1 GiB of 862.0 GiB) in 19m 35s, read: 435.8 MiB/s, write: 17.0 MiB/s
INFO:  59% (508.6 GiB of 862.0 GiB) in 19m 56s, read: 414.5 MiB/s, write: 9.0 MiB/s
INFO:  60% (517.2 GiB of 862.0 GiB) in 20m 16s, read: 441.0 MiB/s, write: 9.2 MiB/s
INFO:  61% (526.2 GiB of 862.0 GiB) in 20m 37s, read: 438.3 MiB/s, write: 5.7 MiB/s
INFO:  62% (534.8 GiB of 862.0 GiB) in 20m 57s, read: 439.4 MiB/s, write: 7.6 MiB/s
INFO:  63% (543.3 GiB of 862.0 GiB) in 21m 17s, read: 435.6 MiB/s, write: 10.2 MiB/s
INFO:  64% (552.1 GiB of 862.0 GiB) in 21m 38s, read: 427.8 MiB/s, write: 17.1 MiB/s
INFO:  65% (560.4 GiB of 862.0 GiB) in 21m 57s, read: 446.5 MiB/s, write: 5.9 MiB/s
INFO:  66% (569.3 GiB of 862.0 GiB) in 22m 18s, read: 436.8 MiB/s, write: 12.4 MiB/s
INFO:  67% (577.9 GiB of 862.0 GiB) in 22m 38s, read: 439.0 MiB/s, write: 7.0 MiB/s
INFO:  68% (586.6 GiB of 862.0 GiB) in 22m 58s, read: 445.8 MiB/s, write: 4.0 MiB/s
INFO:  69% (595.1 GiB of 862.0 GiB) in 23m 18s, read: 434.6 MiB/s, write: 10.8 MiB/s
INFO:  70% (603.7 GiB of 862.0 GiB) in 23m 39s, read: 421.7 MiB/s, write: 19.8 MiB/s
INFO:  71% (612.3 GiB of 862.0 GiB) in 23m 59s, read: 439.6 MiB/s, write: 9.2 MiB/s
INFO:  72% (621.0 GiB of 862.0 GiB) in 24m 19s, read: 442.8 MiB/s, write: 7.0 MiB/s
INFO:  73% (629.6 GiB of 862.0 GiB) in 24m 39s, read: 443.8 MiB/s, write: 5.0 MiB/s
INFO:  74% (637.9 GiB of 862.0 GiB) in 24m 58s, read: 445.1 MiB/s, write: 5.7 MiB/s
INFO:  75% (646.9 GiB of 862.0 GiB) in 25m 19s, read: 439.6 MiB/s, write: 5.1 MiB/s
INFO:  76% (655.5 GiB of 862.0 GiB) in 25m 40s, read: 416.8 MiB/s, write: 3.6 MiB/s
INFO:  77% (664.1 GiB of 862.0 GiB) in 26m, read: 441.8 MiB/s, write: 3.6 MiB/s
INFO:  78% (672.7 GiB of 862.0 GiB) in 26m 20s, read: 439.0 MiB/s, write: 2.4 MiB/s
INFO:  79% (681.2 GiB of 862.0 GiB) in 26m 40s, read: 439.0 MiB/s, write: 12.2 MiB/s
INFO:  80% (689.9 GiB of 862.0 GiB) in 27m, read: 444.6 MiB/s, write: 5.6 MiB/s
INFO:  81% (698.5 GiB of 862.0 GiB) in 27m 20s, read: 437.2 MiB/s, write: 14.6 MiB/s
INFO:  82% (707.2 GiB of 862.0 GiB) in 27m 40s, read: 448.8 MiB/s, write: 3.4 MiB/s
INFO:  83% (715.9 GiB of 862.0 GiB) in 28m, read: 442.4 MiB/s, write: 4.8 MiB/s
INFO:  84% (724.5 GiB of 862.0 GiB) in 28m 20s, read: 442.4 MiB/s, write: 8.8 MiB/s
INFO:  85% (733.1 GiB of 862.0 GiB) in 28m 40s, read: 438.6 MiB/s, write: 8.8 MiB/s
INFO:  86% (741.7 GiB of 862.0 GiB) in 29m, read: 442.8 MiB/s, write: 9.2 MiB/s
INFO:  87% (750.1 GiB of 862.0 GiB) in 29m 19s, read: 449.9 MiB/s, write: 2.3 MiB/s
INFO:  88% (758.8 GiB of 862.0 GiB) in 29m 39s, read: 449.6 MiB/s, write: 2.6 MiB/s
INFO:  89% (767.5 GiB of 862.0 GiB) in 29m 59s, read: 442.8 MiB/s, write: 6.8 MiB/s
INFO:  90% (775.8 GiB of 862.0 GiB) in 30m 18s, read: 449.1 MiB/s, write: 4.4 MiB/s
INFO:  91% (784.6 GiB of 862.0 GiB) in 30m 38s, read: 448.0 MiB/s, write: 6.2 MiB/s
INFO:  92% (793.1 GiB of 862.0 GiB) in 30m 58s, read: 438.8 MiB/s, write: 11.0 MiB/s
INFO:  93% (801.9 GiB of 862.0 GiB) in 31m 19s, read: 425.1 MiB/s, write: 3.2 MiB/s
INFO:  94% (810.3 GiB of 862.0 GiB) in 31m 39s, read: 433.8 MiB/s, write: 14.4 MiB/s
INFO:  95% (819.2 GiB of 862.0 GiB) in 32m 1s, read: 411.6 MiB/s, write: 49.3 MiB/s
INFO:  96% (827.8 GiB of 862.0 GiB) in 32m 22s, read: 419.6 MiB/s, write: 41.9 MiB/s
INFO:  97% (836.5 GiB of 862.0 GiB) in 32m 43s, read: 424.2 MiB/s, write: 37.3 MiB/s
INFO:  98% (845.1 GiB of 862.0 GiB) in 33m 4s, read: 418.1 MiB/s, write: 46.3 MiB/s
INFO:  99% (853.8 GiB of 862.0 GiB) in 33m 26s, read: 405.5 MiB/s, write: 64.7 MiB/s
INFO: 100% (862.0 GiB of 862.0 GiB) in 33m 48s, read: 383.1 MiB/s, write: 70.0 MiB/s
INFO: backup is sparse: 10.09 GiB (1%) total zero data
INFO: backup was done incrementally, reused 831.52 GiB (96%)
INFO: transferred 862.00 GiB in 2028 seconds (435.3 MiB/s)
INFO: adding notes to backup
storing login ticket failed: $XDG_RUNTIME_DIR must be set
INFO: Finished Backup of VM 142 (00:33:48)
INFO: Backup finished at 2023-02-03 11:27:35
storing login ticket failed: $XDG_RUNTIME_DIR must be set
INFO: Backup job finished successfully
TASK OK
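
To check a task log like the one above for stalls without reading every line, the progress lines can be parsed and the rate between consecutive lines computed. This is a small sketch that assumes the progress line format shown in this log; the function names and the threshold are illustrative only.

```python
from __future__ import annotations

import re
from typing import Iterator

# Matches lines such as:
# INFO:  76% (655.5 GiB of 862.0 GiB) in 25m 40s, read: 416.8 MiB/s, ...
PROGRESS = re.compile(
    r"INFO:\s+(\d+)% \(([\d.]+) GiB of ([\d.]+) GiB\) in ([^,]+), read:"
)
UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def elapsed_seconds(text: str) -> int:
    """Convert an elapsed time such as '25m 40s' or '26m' into seconds."""
    return sum(int(tok[:-1]) * UNIT_SECONDS[tok[-1]] for tok in text.split())


def parse_progress(log: str) -> list[tuple[int, float]]:
    """Return (elapsed seconds, GiB processed) for every progress line."""
    samples = []
    for line in log.splitlines():
        match = PROGRESS.search(line)
        if match:
            samples.append((elapsed_seconds(match.group(4)), float(match.group(2))))
    return samples


def slow_intervals(
    samples: list[tuple[int, float]], min_mib_s: float
) -> Iterator[tuple[int, int, float]]:
    """Yield (start s, end s, MiB/s) for intervals slower than min_mib_s."""
    for (t0, gib0), (t1, gib1) in zip(samples, samples[1:]):
        if t1 > t0:
            rate = (gib1 - gib0) * 1024 / (t1 - t0)
            if rate < min_mib_s:
                yield t0, t1, rate
```

As a sanity check, `elapsed_seconds("33m 48s")` gives 2028, which matches the "2028 seconds" reported in the summary line of the log. Running `slow_intervals(parse_progress(log_text), 100.0)` on a log from a stalling backup should list the intervals where the read rate dropped below 100 MiB/s.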
 