After upgrade to 4.4, backup to NFS times out

May 7, 2016
We recently upgraded from 4.2 to 4.4. Since the upgrade, the backup task for one fairly large VM times out.

Before the upgrade everything worked fine. The backup took about 2-3 hours, but that was OK. Now the backup starts, looks fine for a moment (usually for the first 5-15% of the backup), and then just freezes. No further progress is shown in the log, and after some time we get a timeout.

Code:
INFO: starting new backup job: vzdump 5001 --storage backup-office --mode snapshot --compress lzo --node carol --remove 0
INFO: Starting Backup of VM 5001 (qemu)
INFO: status = running
INFO: update VM 5001: -lock backup
INFO: VM Name: moneys4
INFO: include disk 'ide0' 'vm-hdd:vm-5001-disk-2' 60G
INFO: include disk 'ide1' 'vm-ssd:vm-5001-disk-1' 30G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating archive '/mnt/pve/backup-office/dump/vzdump-qemu-5001-2017_08_01-09_18_24.vma.lzo'
INFO: started backup task '090ffddf-3313-49ec-a038-63c8a2eb07ad'
INFO: status: 0% (384434176/96636764160), sparse 0% (62197760), duration 3, 128/107 MB/s
INFO: status: 1% (1071054848/96636764160), sparse 0% (93851648), duration 9, 114/109 MB/s
INFO: status: 2% (2035941376/96636764160), sparse 0% (119918592), duration 18, 107/104 MB/s
INFO: status: 3% (2903965696/96636764160), sparse 0% (174608384), duration 27, 96/90 MB/s
INFO: status: 4% (3879665664/96636764160), sparse 0% (187219968), duration 37, 97/96 MB/s
INFO: status: 5% (4838785024/96636764160), sparse 0% (255647744), duration 48, 87/80 MB/s
long freeze here
Code:
ERROR: VM 5001 qmp command 'query-backup' failed - got timeout
INFO: aborting backup job
ERROR: Backup of VM 5001 failed - VM 5001 qmp command 'query-backup' failed - got timeout
INFO: Backup job finished with errors
TASK ERROR: job errors

Syslog and dmesg show nothing suspicious.

Any idea what could be the cause, or what we should check for more info?
 
Update: after some experiments I found some additional info:

- there is one log message which may be related:
Code:
pvedaemon[19116]: VM 5001 qmp command failed - interrupted by signal
- the backup starts to create content on the target NFS backup storage (.tmp and .vma.dat files), but these files are no longer updated from the moment the backup freezes
- there is an lzop process which consumes lots of I/O even during the freeze and cannot be killed, not even with kill -9
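
A process that ignores kill -9 is usually stuck in uninterruptible sleep (state D), typically waiting on I/O such as a stalled NFS write. A minimal sketch for checking this on the node, assuming the procps version of ps; the helper name filter_dstate is made up for illustration:

```shell
# List processes in uninterruptible sleep (state D). Such processes do not
# react to SIGKILL until the blocking I/O call returns.
filter_dstate() {
    # Expects "PID STAT WCHAN COMMAND" rows on stdin.
    # Keeps the header line and every row whose STAT starts with D.
    awk 'NR == 1 || $2 ~ /^D/'
}

# wchan shows the kernel function the process is sleeping in, which can
# hint at whether it is blocked in the NFS client code.
ps -eo pid,stat,wchan:32,comm | filter_dstate
```

If the stuck process shows up here with an NFS-related wait channel, that would point at the NFS mount rather than at the backup itself.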