nfs kernel: 4.15.18-5-pve vzdump problem

Mario Hosse

Well-Known Member
Oct 25, 2017
Hello,

After upgrading to kernel "4.15.18-5-pve", I can no longer back up VMs with vzdump. The NFS share hangs.
Setup:
4-node cluster
Ceph
NFS on an XFS filesystem

#pveversion -v
proxmox-ve: 5.2-2 (running kernel: 4.15.18-5-pve)
pve-manager: 5.2-9 (running version: 5.2-9/4b30e8f9)
pve-kernel-4.15: 5.2-8
pve-kernel-4.15.18-5-pve: 4.15.18-24
pve-kernel-4.15.18-4-pve: 4.15.18-23
ceph: 12.2.8-pve1
corosync: 2.4.2-pve5
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-38
libpve-guest-common-perl: 2.0-18
libpve-http-server-perl: 2.0-11
libpve-storage-perl: 5.0-29
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 3.0.2+pve1-2
lxcfs: 3.0.2-2
novnc-pve: 1.0.0-2
openvswitch-switch: 2.7.0-3
proxmox-widget-toolkit: 1.0-20
pve-cluster: 5.0-30
pve-container: 2.0-27
pve-docs: 5.2-8
pve-firewall: 3.0-14
pve-firmware: 2.0-5
pve-ha-manager: 2.0-5
pve-i18n: 1.0-6
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.11.2-1
pve-xtermjs: 1.0-5
qemu-server: 5.0-35
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.11-pve1~bpo1

#Problem:
Oct 3 10:09:36 prox1a kernel: [89295.705053] INFO: task lzop:842067 blocked for more than 120 seconds.
Oct 3 10:09:36 prox1a kernel: [89295.705080] Tainted: P O 4.15.18-5-pve #1
Oct 3 10:09:36 prox1a kernel: [89295.705100] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 3 10:09:36 prox1a kernel: [89295.705125] lzop D 0 842067 842038 0x00000000
Oct 3 10:09:36 prox1a kernel: [89295.705129] Call Trace:
Oct 3 10:09:36 prox1a kernel: [89295.705135] __schedule+0x3e0/0x870
Oct 3 10:09:36 prox1a kernel: [89295.705137] ? bit_wait+0x60/0x60
Oct 3 10:09:36 prox1a kernel: [89295.705138] schedule+0x36/0x80
Oct 3 10:09:36 prox1a kernel: [89295.705141] io_schedule+0x16/0x40
Oct 3 10:09:36 prox1a kernel: [89295.705142] bit_wait_io+0x11/0x60
Oct 3 10:09:36 prox1a kernel: [89295.705144] __wait_on_bit+0x5a/0x90
Oct 3 10:09:36 prox1a kernel: [89295.705145] out_of_line_wait_on_bit+0x8e/0xb0
Oct 3 10:09:36 prox1a kernel: [89295.705149] ? bit_waitqueue+0x40/0x40
Oct 3 10:09:36 prox1a kernel: [89295.705172] nfs_wait_on_request+0x46/0x50 [nfs]
Oct 3 10:09:36 prox1a kernel: [89295.705179] nfs_lock_and_join_requests+0x121/0x510 [nfs]
Oct 3 10:09:36 prox1a kernel: [89295.705182] ? radix_tree_lookup_slot+0x22/0x50
Oct 3 10:09:36 prox1a kernel: [89295.705190] nfs_updatepage+0x151/0x910 [nfs]
Oct 3 10:09:36 prox1a kernel: [89295.705196] nfs_write_end+0x129/0x4e0 [nfs]
Oct 3 10:09:36 prox1a kernel: [89295.705200] generic_perform_write+0xff/0x1b0
Oct 3 10:09:36 prox1a kernel: [89295.705207] nfs_file_write+0xd7/0x250 [nfs]
Oct 3 10:09:36 prox1a kernel: [89295.705212] new_sync_write+0xe7/0x140
Oct 3 10:09:36 prox1a kernel: [89295.705214] __vfs_write+0x29/0x40
Oct 3 10:09:36 prox1a kernel: [89295.705216] vfs_write+0xb5/0x1a0
Oct 3 10:09:36 prox1a kernel: [89295.705218] SyS_write+0x55/0xc0
Oct 3 10:09:36 prox1a kernel: [89295.705221] do_syscall_64+0x73/0x130
Oct 3 10:09:36 prox1a kernel: [89295.705225] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Oct 3 10:09:36 prox1a kernel: [89295.705227] RIP: 0033:0x7f9099538730
Oct 3 10:09:36 prox1a kernel: [89295.705228] RSP: 002b:00007ffed41cf198 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
Oct 3 10:09:36 prox1a kernel: [89295.705230] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9099538730
Oct 3 10:09:36 prox1a kernel: [89295.705231] RDX: 0000000000000004 RSI: 00007ffed41cf200 RDI: 0000000000000001
Oct 3 10:09:36 prox1a kernel: [89295.705231] RBP: 0000000000000004 R08: 0000000000040000 R09: 00007f9099b87000
Oct 3 10:09:36 prox1a kernel: [89295.705232] R10: 00000000000001f9 R11: 0000000000000246 R12: 00007f9099c30698
Oct 3 10:09:36 prox1a kernel: [89295.705233] R13: 0000000000000001 R14: 0000000000000019 R15: 00007ffed41cf200
Oct 3 10:09:37 prox1a pvedaemon[3455]: got timeout
Oct 3 10:09:37 prox1a pvedaemon[3455]: unable to activate storage 'nfs_prox2b' - directory '/mnt/pve/nfs_prox2b' does not exist or is unreachable
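
The trace above shows lzop in state D (uninterruptible sleep) inside nfs_wait_on_request, so it is waiting for an NFS write to complete. As a minimal sketch for pulling such reports out of a longer kernel log, the filter below reads log lines on stdin and prints task name, PID, and the reported timeout. The name hung_tasks is hypothetical (it is not a Proxmox tool), and the sketch assumes the message format shown above.

```shell
# hung_tasks: hypothetical helper, reads kernel log lines on stdin and
# prints "<task> <pid> <seconds>" for every hung-task report it finds.
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds.*/\1 \2 \3/p'
}

# Example on a live host (assumes a systemd journal is available):
#   journalctl -k | hung_tasks
```

Lines that are not hung-task reports, such as the pvedaemon messages, are dropped.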

#pvesm status
Name        Type  Status      Total       Used  Available       %
nfs_prox1a  nfs   active  999714816  298649600  701065216  29.87%
nfs_prox2b  nfs   active  999714816  351735808  647979008  35.18%
nfs_prox3c  nfs   active  999714816  318177280  681537536  31.83%
nfs_prox4d  nfs   active  999714816  650489856  349224960  65.07%

#time pvesm nfsscan 192.9.198.2
/mnt/backup2 192.9.198.4,192.9.198.3,192.9.198.2,192.9.198.1,192.9.198.6

real 0m0.482s
user 0m0.378s
sys 0m0.101s

# storage.cfg
nfs: nfs_prox3c
        export /mnt/backup3
        path /mnt/pve/nfs_prox3c
        server 192.9.198.3
        content backup
        maxfiles 1
        options vers=3

nfs: nfs_prox4d
        export /mnt/backup4
        path /mnt/pve/nfs_prox4d
        server 192.9.198.4
        content backup
        maxfiles 1
        options vers=3

nfs: nfs_prox2b
        export /mnt/backup2/
        path /mnt/pve/nfs_prox2b
        server 192.9.198.2
        content backup
        maxfiles 1
        options vers=3

nfs: nfs_prox1a
        export /mnt/backup
        path /mnt/pve/nfs_prox1a
        server 192.9.198.1
        content backup
        maxfiles 1
        options vers=3
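
pvedaemon reported '/mnt/pve/nfs_prox2b' as unreachable, while pvesm status lists all four storages as active. A rough way to probe each mount point without letting the shell hang on a stalled mount is sketched below. The name check_mount is hypothetical, and the sketch assumes GNU coreutils (timeout, stat). A success is only a hint, since cached attributes may let stat return even when writes stall.

```shell
# check_mount: hypothetical helper. Succeeds if stat on the given path
# returns within the time limit (default 5 seconds), fails otherwise.
check_mount() {
    timeout -s KILL "${2:-5}" stat -t -- "$1" >/dev/null 2>&1
}

# Example with the mount points from storage.cfg:
#   for s in nfs_prox1a nfs_prox2b nfs_prox3c nfs_prox4d; do
#       if check_mount "/mnt/pve/$s"; then echo "$s ok"; else echo "$s unreachable"; fi
#   done
```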

The problem occurs on all hosts.
What can I do?
 
Solved:
The problem was a wrong export option on the NFS server:
(sync,acl,no_subtree_check,no_root_squash,rw)
I replaced it with:
(async,acl,no_subtree_check,no_root_squash,rw)
Now the backup works.
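
For illustration, the option change can be sketched as a small filter over exports(5)-style lines. The name to_async is hypothetical, and the export path and client address in the example are assumptions, since the thread does not show the actual exports file. On a typical Linux NFS server that file is /etc/exports, and exportfs -ra re-reads it. With async the server may acknowledge writes before they reach stable storage, so a server crash can lose data that a client believes is already written.

```shell
# to_async: hypothetical helper. Reads exports(5)-style lines on stdin and
# replaces the standalone option "sync" with "async" inside option lists.
to_async() {
    sed -E 's/([(,])sync([,)])/\1async\2/g'
}

# Example (path and client are illustrative, not taken from the real file):
#   echo '/mnt/backup2 192.9.198.1(sync,acl,no_subtree_check,no_root_squash,rw)' | to_async
# After editing /etc/exports on the server, re-export with:
#   exportfs -ra
```

Lines that already use async pass through unchanged.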
 
