Backup (Vzdump) fails and hangs forever

RudyBzh

Member
Jul 9, 2020
Hi,
I have an issue with a VM backup task.
It often hangs, and the only way to recover is to reboot the Proxmox host.
I found several similar threads on the forum, but nothing I tried solved the problem.

root@pve:~# pveversion
pve-manager/6.2-11/22fb4983 (running kernel: 5.4.60-1-pve)

Task log:
Code:
INFO: starting new backup job: vzdump 102 --remove 0 --mode snapshot --storage i7-2700k --compress zstd --node pve
INFO: Starting Backup of VM 102 (qemu)
INFO: Backup started at 2020-09-04 11:23:29
INFO: status = running
INFO: VM Name: OLD-Ubuntu
INFO: include disk 'scsi0' 'local-lvm:vm-102-disk-1' 50G
INFO: include disk 'efidisk0' 'local-lvm:vm-102-disk-0' 4M
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating vzdump archive '/mnt/pve/i7-2700k/dump/vzdump-qemu-102-2020_09_04-11_23_29.vma.zst'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task 'b1061af6-74c5-41ac-bbfc-953a64de07a5'
INFO: resuming VM again
INFO:   1% (1007.8 MiB of 50.0 GiB) in  3s, read: 335.9 MiB/s, write: 327.6 MiB/s
INFO:   3% (1.9 GiB of 50.0 GiB) in  6s, read: 306.9 MiB/s, write: 303.7 MiB/s
INFO:   5% (2.8 GiB of 50.0 GiB) in  9s, read: 298.5 MiB/s, write: 296.2 MiB/s
INFO:   7% (3.6 GiB of 50.0 GiB) in 12s, read: 281.5 MiB/s, write: 272.5 MiB/s
INFO:   9% (4.5 GiB of 50.0 GiB) in 15s, read: 321.4 MiB/s, write: 317.9 MiB/s
INFO:  10% (5.4 GiB of 50.0 GiB) in 18s, read: 311.8 MiB/s, write: 311.5 MiB/s
INFO:  12% (6.3 GiB of 50.0 GiB) in 21s, read: 286.4 MiB/s, write: 278.2 MiB/s
INFO:  14% (7.2 GiB of 50.0 GiB) in 24s, read: 300.9 MiB/s, write: 299.9 MiB/s
INFO:  15% (7.9 GiB of 50.0 GiB) in 27s, read: 264.6 MiB/s, write: 263.8 MiB/s
INFO:  17% (8.7 GiB of 50.0 GiB) in 30s, read: 276.7 MiB/s, write: 274.4 MiB/s
INFO:  19% (9.5 GiB of 50.0 GiB) in 33s, read: 274.4 MiB/s, write: 273.7 MiB/s
INFO:  20% (10.4 GiB of 50.0 GiB) in 36s, read: 289.9 MiB/s, write: 283.1 MiB/s
INFO:  22% (11.2 GiB of 50.0 GiB) in 39s, read: 261.0 MiB/s, write: 261.0 MiB/s
INFO:  23% (12.0 GiB of 50.0 GiB) in 42s, read: 271.9 MiB/s, write: 271.7 MiB/s
INFO:  25% (12.8 GiB of 50.0 GiB) in 45s, read: 280.3 MiB/s, write: 271.1 MiB/s
INFO:  26% (13.4 GiB of 50.0 GiB) in 48s, read: 210.2 MiB/s, write: 209.3 MiB/s
INFO:  28% (14.1 GiB of 50.0 GiB) in 51s, read: 233.2 MiB/s, write: 232.5 MiB/s
INFO:  29% (14.6 GiB of 50.0 GiB) in 54s, read: 177.6 MiB/s, write: 168.7 MiB/s
INFO:  30% (15.2 GiB of 50.0 GiB) in 57s, read: 197.0 MiB/s, write: 194.6 MiB/s
INFO:  31% (15.8 GiB of 50.0 GiB) in  1m  0s, read: 200.6 MiB/s, write: 199.1 MiB/s
INFO:  32% (16.3 GiB of 50.0 GiB) in  1m  3s, read: 183.7 MiB/s, write: 175.7 MiB/s
INFO:  33% (16.7 GiB of 50.0 GiB) in  1m  6s, read: 152.2 MiB/s, write: 151.4 MiB/s
INFO:  34% (17.1 GiB of 50.0 GiB) in  1m  9s, read: 114.8 MiB/s, write: 114.6 MiB/s
INFO:  35% (17.6 GiB of 50.0 GiB) in  1m 14s, read: 105.1 MiB/s, write: 105.1 MiB/s
INFO:  36% (18.1 GiB of 50.0 GiB) in  1m 18s, read: 119.6 MiB/s, write: 118.9 MiB/s
INFO:  37% (18.6 GiB of 50.0 GiB) in  1m 21s, read: 198.2 MiB/s, write: 175.9 MiB/s
INFO:  38% (19.1 GiB of 50.0 GiB) in  1m 24s, read: 149.8 MiB/s, write: 130.5 MiB/s
INFO:  39% (19.6 GiB of 50.0 GiB) in  1m 28s, read: 132.3 MiB/s, write: 128.1 MiB/s
INFO:  40% (20.0 GiB of 50.0 GiB) in  1m 33s, read: 84.8 MiB/s, write: 82.0 MiB/s
INFO:  41% (20.5 GiB of 50.0 GiB) in  1m 42s, read: 55.2 MiB/s, write: 50.4 MiB/s
INFO:  42% (21.0 GiB of 50.0 GiB) in  1m 53s, read: 47.1 MiB/s, write: 30.4 MiB/s
INFO:  43% (21.5 GiB of 50.0 GiB) in  2m  1s, read: 64.3 MiB/s, write: 28.0 MiB/s
INFO:  44% (22.0 GiB of 50.0 GiB) in  2m  9s, read: 64.8 MiB/s, write: 31.1 MiB/s
INFO:  45% (22.5 GiB of 50.0 GiB) in  2m 16s, read: 74.6 MiB/s, write: 70.5 MiB/s
INFO:  46% (23.1 GiB of 50.0 GiB) in  2m 20s, read: 154.1 MiB/s, write: 153.3 MiB/s
INFO:  47% (23.6 GiB of 50.0 GiB) in  2m 23s, read: 157.1 MiB/s, write: 156.3 MiB/s
INFO:  48% (24.3 GiB of 50.0 GiB) in  2m 26s, read: 244.1 MiB/s, write: 231.4 MiB/s
INFO:  49% (24.9 GiB of 50.0 GiB) in  2m 29s, read: 197.1 MiB/s, write: 185.1 MiB/s
INFO:  51% (25.6 GiB of 50.0 GiB) in  2m 32s, read: 239.1 MiB/s, write: 233.7 MiB/s
INFO:  52% (26.2 GiB of 50.0 GiB) in  2m 35s, read: 224.8 MiB/s, write: 215.4 MiB/s
INFO:  53% (26.7 GiB of 50.0 GiB) in  2m 38s, read: 143.9 MiB/s, write: 128.0 MiB/s
INFO:  54% (27.2 GiB of 50.0 GiB) in  2m 41s, read: 171.5 MiB/s, write: 163.1 MiB/s
INFO:  55% (27.7 GiB of 50.0 GiB) in  2m 44s, read: 169.2 MiB/s, write: 159.1 MiB/s
INFO:  56% (28.0 GiB of 50.0 GiB) in  2m 48s, read: 89.7 MiB/s, write: 87.3 MiB/s
INFO:  57% (28.5 GiB of 50.0 GiB) in  3m  8s, read: 25.6 MiB/s, write: 24.7 MiB/s
INFO:  58% (29.0 GiB of 50.0 GiB) in  3m 28s, read: 26.1 MiB/s, write: 21.9 MiB/s
INFO:  59% (29.6 GiB of 50.0 GiB) in  3m 31s, read: 186.1 MiB/s, write: 151.0 MiB/s
INFO:  60% (30.5 GiB of 50.0 GiB) in  3m 34s, read: 317.0 MiB/s, write: 275.3 MiB/s
INFO:  62% (31.3 GiB of 50.0 GiB) in  3m 37s, read: 264.2 MiB/s, write: 195.5 MiB/s
INFO:  64% (32.4 GiB of 50.0 GiB) in  3m 40s, read: 377.0 MiB/s, write: 270.1 MiB/s
INFO:  65% (32.7 GiB of 50.0 GiB) in  3m 43s, read: 110.0 MiB/s, write: 102.6 MiB/s
INFO:  66% (33.5 GiB of 50.0 GiB) in  3m 46s, read: 274.3 MiB/s, write: 160.7 MiB/s
INFO:  68% (34.5 GiB of 50.0 GiB) in  3m 49s, read: 326.2 MiB/s, write: 219.6 MiB/s
INFO:  70% (35.2 GiB of 50.0 GiB) in  3m 52s, read: 240.5 MiB/s, write: 157.1 MiB/s
INFO:  72% (36.1 GiB of 50.0 GiB) in  3m 55s, read: 320.2 MiB/s, write: 124.2 MiB/s
INFO:  73% (36.5 GiB of 50.0 GiB) in  4m  2s, read: 65.2 MiB/s, write: 51.6 MiB/s
INFO:  74% (37.0 GiB of 50.0 GiB) in  4m 19s, read: 28.1 MiB/s, write: 19.1 MiB/s
INFO:  75% (37.5 GiB of 50.0 GiB) in  4m 36s, read: 32.0 MiB/s, write: 19.0 MiB/s
INFO:  76% (38.2 GiB of 50.0 GiB) in  4m 39s, read: 217.5 MiB/s, write: 100.0 MiB/s
INFO:  77% (38.6 GiB of 50.0 GiB) in  4m 42s, read: 152.2 MiB/s, write: 121.2 MiB/s
INFO:  79% (39.9 GiB of 50.0 GiB) in  4m 45s, read: 435.1 MiB/s, write: 214.5 MiB/s
INFO:  82% (41.2 GiB of 50.0 GiB) in  4m 48s, read: 457.9 MiB/s, write: 231.1 MiB/s
INFO:  85% (42.6 GiB of 50.0 GiB) in  4m 51s, read: 474.9 MiB/s, write: 236.9 MiB/s
INFO:  88% (44.1 GiB of 50.0 GiB) in  4m 54s, read: 488.2 MiB/s, write: 236.7 MiB/s
INFO:  91% (46.0 GiB of 50.0 GiB) in  4m 57s, read: 652.0 MiB/s, write: 249.0 MiB/s
INFO:  99% (49.5 GiB of 50.0 GiB) in  5m  0s, read: 1.2 GiB/s, write: 72.0 MiB/s
INFO: 100% (50.0 GiB of 50.0 GiB) in  5m  1s, read: 483.1 MiB/s, write: 8.0 KiB/s
INFO: backup is sparse: 12.38 GiB (24%) total zero data
INFO: transferred 50.00 GiB in 301 seconds (170.1 MiB/s)

The task then stays at this point for hours, keeping a lock on the VM. The related processes cannot be killed.
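
Processes that ignore even SIGKILL are usually in uninterruptible sleep (state `D`), typically waiting on I/O. A minimal sketch for spotting them, assuming the procps `ps` that ships with Debian; `list_d_state` is a made-up helper name, not a Proxmox tool:

```shell
#!/bin/sh
# list_d_state is a hypothetical helper, named here for illustration only.
# It reads `ps -eo pid,stat,wchan:32,comm` style output on stdin and keeps
# the header line plus every row whose STAT column starts with "D"
# (uninterruptible sleep).
list_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}
```

On the host it could be fed with `ps -eo pid,stat,wchan:32,comm | list_d_state`. The WCHAN column hints at which kernel function each stuck process is waiting in.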

The backup storage is a CIFS share:
Code:
root@pve:~# more /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content iso,vztmpl
        maxfiles 2
        shared 0

lvmthin: local-lvm
        thinpool data
        vgname pve
        content images,rootdir

cifs: i7-2700k
        path /mnt/pve/i7-2700k
        server 192.168.1.2
        share SauvegardesPVE
        content backup
        maxfiles 5
        username XXXX

root@pve:~# mount -t cifs
//192.168.1.2/SauvegardesPVE on /mnt/pve/i7-2700k type cifs (rw,relatime,vers=3.0,cache=strict,username=XXXX,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.2,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1)

I already tried uncommenting tmpdir in /etc/vzdump.conf so that temporary files go to a local filesystem:
Code:
root@pve:~# more /etc/vzdump.conf
# vzdump default settings

tmpdir: /var/lib/vz/vztmp
#dumpdir: DIR
#storage: STORAGE_ID
#mode: snapshot|suspend|stop
#bwlimit: KBPS
#ionice: PRI
#lockwait: MINUTES
#stopwait: MINUTES
#size: MB
#stdexcludes: BOOLEAN
#mailto: ADDRESSLIST
#maxfiles: N
#script: FILENAME
#exclude-path: PATHLIST
#pigz: N

The related syslog is attached.

Note that I can successfully back up other, smaller VMs and LXC containers to the same storage, so I don't think this is a problem with access to the CIFS share.

Any help, please?

Thanks.
Rudy.
 

Attachments

  • syslog.txt (24.3 KB)

Same here.
A vzdump backup to a CIFS share on Windows 10 always hangs the entire PVE server; I have to reboot PVE.
The last line is always
INFO: transferred xx.xx GiB in xx seconds (xx.x MiB/s)
so the transfer has finished, but "TASK OK" never appears.
 
Some more information from the console:

[60460.969050] CIFS VFS: \\192.168.179.136 has not responded in 180 seconds. Reconnecting...
[62229.695203] INFO: task zstd:36670 blocked for more than 120 seconds.
[62229.695798] Tainted: P O 5.4.78-2-pve #1
[62229.696084] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[62229.696572] INFO: task kworker/13:32:38180 blocked for more than 120 seconds.
[62229.696844] Tainted: P O 5.4.78-2-pve #1
[62229.697108] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
... this goes on endlessly unless I reboot PVE.

I can connect to the CIFS server from other computers and it responds,
so the server itself does not appear to be the problem.
 
With LZO compression, same problem, different messages:

[ 2331.460588] CIFS VFS: \\192.168.179.136 Cancelling wait for mid 2943494 cmd: 5
[ 2331.460643] CIFS VFS: \\192.168.179.136 Cancelling wait for mid 2943495 cmd: 16
[ 2331.460666] CIFS VFS: \\192.168.179.136 Cancelling wait for mid 2943496 cmd: 6
[ 2331.562936] CIFS VFS: Close unmatched open
[ 2418.240281] INFO: task lzop:8669 blocked for more than 120 seconds.
[ 2418.240338] Tainted: P O 5.4.78-2-pve #1
[ 2418.240379] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2418.240563] INFO: task kworker/27:5:10560 blocked for more than 120 seconds.
[ 2418.240586] Tainted: P O 5.4.78-2-pve #1
[ 2418.240603] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
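
The `blocked for more than 120 seconds` lines name the tasks that are stuck in the kernel (here the compressors `zstd` and `lzop`, plus `kworker` threads). As a rough sketch, the distinct task names can be pulled out of kernel log text like this; `hung_tasks` is a made-up helper name, and the pattern assumes the exact message format shown above:

```shell
#!/bin/sh
# hung_tasks is a hypothetical helper, named here for illustration only.
# It reads kernel log text on stdin and prints each blocked task name once.
# The task name is everything between "task " and the final ":<pid>",
# so names that contain colons (e.g. kworker/13:32) are kept whole.
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):[0-9][0-9]* blocked for more than [0-9][0-9]* seconds\..*/\1/p' | sort -u
}
```

On the host this could be run as `dmesg | hung_tasks`. If only the compressor and kworker threads show up, that is at least consistent with the hang happening while the last data is written out to the share, though it does not prove the cause.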
 
Hi all,
Same here on Proxmox Virtual Environment 6.3-3.

Backup with ZSTD to a local folder mounted via ftpfs.
 
Hi,

I am seeing the exact same issue with PVE 6.4.8.

It always happens after one of the VMs in the job finishes (not always the same one):
INFO: transferred xxx GiB in xxx seconds (xxx MiB/s)

Did anyone find a solution for this issue?

Cheers,
Michael
 
Unfortunately, this still happens, which makes automated backups useless. Disabling compression alleviates the issue, but that is not my preferred solution.
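
For anyone who wants to try the no-compression workaround from the command line, here is a minimal sketch. It only prints the command so it can be reviewed before running; `uncompressed_backup_cmd` is a made-up helper name, and VM ID 102 and storage `i7-2700k` are simply the values from the first post. `--compress 0` is the vzdump setting for no compression.

```shell
#!/bin/sh
# uncompressed_backup_cmd is a hypothetical helper, named here for illustration only.
# It prints (does not run) a vzdump command line with compression turned off.
#   $1 = VM ID, $2 = storage ID
uncompressed_backup_cmd() {
    printf 'vzdump %s --mode snapshot --storage %s --compress 0\n' "$1" "$2"
}

# Values taken from the first post; adjust for your own setup.
uncompressed_backup_cmd 102 i7-2700k
```

Whether this avoids the hang on a given setup is not established by this thread beyond the report above, and the archive will be larger without compression.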
 