Upgrade to 1.7 -> problem with nfs server

obelix79

Dec 22, 2010
Hi

We upgraded our cluster of 2 Proxmox servers from version 1.4 to 1.7. Everything is running fine, but we have problems backing up our VMs to an NFS share (with 1.4 there were no problems). We use the built-in NFS server (version 1:1.1.2-6lenny2, I think) and kernel 2.6.32.
The VMs of both servers are backed up to the NFS share on one server. The syslog shows the following messages:

Code:
Dec 22 02:08:55 vampus kernel: INFO: task nfsd:2503 blocked for more than 120 seconds.
Dec 22 02:08:55 vampus kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 22 02:08:55 vampus kernel: nfsd          D ffff88021c1cf800     0  2503      2 0x00000000
Dec 22 02:08:55 vampus kernel: ffff88021e48c000 0000000000000046 0000000000000000 ffffffffa04494e0
Dec 22 02:08:55 vampus kernel: 0000000000000001 ffff880219099d20 000000000000fa40 ffff880219099fd8
Dec 22 02:08:55 vampus kernel: 0000000000016940 0000000000016940 ffff88021c1cf800 ffff88021c1cfaf8
Dec 22 02:08:55 vampus kernel: Call Trace:
Dec 22 02:08:55 vampus kernel: [<ffffffffa02b41ce>] ? encode_fattr3+0x125/0x135 [nfsd]
Dec 22 02:08:55 vampus kernel: [<ffffffff81313fc3>] ? __mutex_lock_common+0x122/0x192
Dec 22 02:08:55 vampus kernel: [<ffffffff813140eb>] ? mutex_lock+0x1a/0x31
Dec 22 02:08:55 vampus kernel: [<ffffffffa01e9deb>] ? svc_send+0x4e/0x9b [sunrpc]
Dec 22 02:08:55 vampus kernel: [<ffffffffa01dee28>] ? svc_process+0x618/0x627 [sunrpc]
Dec 22 02:08:55 vampus kernel: [<ffffffffa02a8772>] ? nfsd+0x0/0x12e [nfsd]
Dec 22 02:08:55 vampus kernel: [<ffffffffa02a8857>] ? nfsd+0xe5/0x12e [nfsd]
Dec 22 02:08:55 vampus kernel: [<ffffffff8106656a>] ? kthread+0xc0/0xca
Dec 22 02:08:55 vampus kernel: [<ffffffff81011c6a>] ? child_rip+0xa/0x20
Dec 22 02:08:55 vampus kernel: [<ffffffff810664aa>] ? kthread+0x0/0xca
Dec 22 02:08:55 vampus kernel: [<ffffffff81011c60>] ? child_rip+0x0/0x20
Dec 22 02:08:55 vampus kernel: INFO: task nfsd:2504 blocked for more than 120 seconds.
Dec 22 02:08:55 vampus kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 22 02:08:55 vampus kernel: nfsd          D ffff88021c1c8000     0  2504      2 0x00000000
Dec 22 02:08:55 vampus kernel: ffff88021e4d0000 0000000000000046 0000000000000001 ffffffffa04494e0
Dec 22 02:08:55 vampus kernel: 0000000000000003 ffff880219193d20 000000000000fa40 ffff880219193fd8
Dec 22 02:08:55 vampus kernel: 0000000000016940 0000000000016940 ffff88021c1c8000 ffff88021c1c82f8
Dec 22 02:08:55 vampus kernel: Call Trace:
Dec 22 02:08:55 vampus kernel: [<ffffffff810f499e>] ? vfs_getattr+0x35/0x77
Dec 22 02:08:55 vampus kernel: [<ffffffffa02b41ce>] ? encode_fattr3+0x125/0x135 [nfsd]
Dec 22 02:08:55 vampus kernel: [<ffffffff81313fc3>] ? __mutex_lock_common+0x122/0x192
Dec 22 02:08:55 vampus kernel: [<ffffffff813140eb>] ? mutex_lock+0x1a/0x31
Dec 22 02:08:55 vampus kernel: [<ffffffffa01e9deb>] ? svc_send+0x4e/0x9b [sunrpc]
Dec 22 02:08:55 vampus kernel: [<ffffffffa01dee28>] ? svc_process+0x618/0x627 [sunrpc]
Dec 22 02:08:55 vampus kernel: [<ffffffffa02a8772>] ? nfsd+0x0/0x12e [nfsd]
Dec 22 02:08:55 vampus kernel: [<ffffffffa02a8857>] ? nfsd+0xe5/0x12e [nfsd]
Dec 22 02:08:55 vampus kernel: [<ffffffff8106656a>] ? kthread+0xc0/0xca
Dec 22 02:08:55 vampus kernel: [<ffffffff81011c6a>] ? child_rip+0xa/0x20
Dec 22 02:08:55 vampus kernel: [<ffffffff810664aa>] ? kthread+0x0/0xca
Dec 22 02:08:55 vampus kernel: [<ffffffff81011c60>] ? child_rip+0x0/0x20
Dec 22 02:08:55 vampus kernel: INFO: task nfsd:2506 blocked for more than 120 seconds.
Dec 22 02:08:55 vampus kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 22 02:08:55 vampus kernel: nfsd          D ffff88021c1cb800     0  2506      2 0x00000000
Dec 22 02:08:55 vampus kernel: ffffffff81491c30 0000000000000046 0000000000000000 ffffffffa04494e0
Dec 22 02:08:55 vampus kernel: 0000000000000000 ffff880219387d20 000000000000fa40 ffff880219387fd8
Dec 22 02:08:55 vampus kernel: 0000000000016940 0000000000016940 ffff88021c1cb800 ffff88021c1cbaf8
Dec 22 02:08:55 vampus kernel: Call Trace:
Dec 22 02:08:55 vampus kernel: [<ffffffffa02b41ce>] ? encode_fattr3+0x125/0x135 [nfsd]
Dec 22 02:08:55 vampus kernel: [<ffffffff81313fc3>] ? __mutex_lock_common+0x122/0x192
Dec 22 02:08:55 vampus kernel: [<ffffffff813140eb>] ? mutex_lock+0x1a/0x31
Dec 22 02:08:55 vampus kernel: [<ffffffffa01e9deb>] ? svc_send+0x4e/0x9b [sunrpc]
Dec 22 02:08:55 vampus kernel: [<ffffffffa01dee28>] ? svc_process+0x618/0x627 [sunrpc]
Dec 22 02:08:55 vampus kernel: [<ffffffffa02a8772>] ? nfsd+0x0/0x12e [nfsd]
Dec 22 02:08:55 vampus kernel: [<ffffffffa02a8857>] ? nfsd+0xe5/0x12e [nfsd]
Dec 22 02:08:55 vampus kernel: [<ffffffff8106656a>] ? kthread+0xc0/0xca
Dec 22 02:08:55 vampus kernel: [<ffffffff81011c6a>] ? child_rip+0xa/0x20
Dec 22 02:08:55 vampus kernel: [<ffffffff810664aa>] ? kthread+0x0/0xca
Dec 22 02:08:55 vampus kernel: [<ffffffff81011c60>] ? child_rip+0x0/0x20
Dec 22 02:08:55 vampus kernel: INFO: task nfsd:2507 blocked for more than 120 seconds.
Dec 22 02:08:55 vampus kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 22 02:08:55 vampus kernel: nfsd          D ffff88021c1cb000     0  2507      2 0x00000000
Dec 22 02:08:55 vampus kernel: ffff88021e48c000 0000000000000046 0000000000000000 ffffffffa04494e0
Dec 22 02:08:55 vampus kernel: 0000000000000001 ffff880218cb1d20 000000000000fa40 ffff880218cb1fd8
Dec 22 02:08:55 vampus kernel: 0000000000016940 0000000000016940 ffff88021c1cb000 ffff88021c1cb2f8
Dec 22 02:08:55 vampus kernel: Call Trace:
Dec 22 02:08:55 vampus kernel: [<ffffffffa02b41ce>] ? encode_fattr3+0x125/0x135 [nfsd]
Dec 22 02:08:55 vampus kernel: [<ffffffff81313fc3>] ? __mutex_lock_common+0x122/0x192
Dec 22 02:08:55 vampus kernel: [<ffffffff813140eb>] ? mutex_lock+0x1a/0x31
Dec 22 02:08:55 vampus kernel: [<ffffffffa01e9deb>] ? svc_send+0x4e/0x9b [sunrpc]
Dec 22 02:08:55 vampus kernel: [<ffffffffa01dee28>] ? svc_process+0x618/0x627 [sunrpc]
Dec 22 02:08:55 vampus kernel: [<ffffffffa02a8772>] ? nfsd+0x0/0x12e [nfsd]
Dec 22 02:08:55 vampus kernel: [<ffffffffa02a8857>] ? nfsd+0xe5/0x12e [nfsd]
Dec 22 02:08:55 vampus kernel: [<ffffffff8106656a>] ? kthread+0xc0/0xca
Dec 22 02:08:55 vampus kernel: [<ffffffff81011c6a>] ? child_rip+0xa/0x20
Dec 22 02:08:55 vampus kernel: [<ffffffff810664aa>] ? kthread+0x0/0xca
Dec 22 02:08:55 vampus kernel: [<ffffffff81011c60>] ? child_rip+0x0/0x20
Dec 22 02:08:55 vampus kernel: INFO: task nfsd:2508 blocked for more than 120 seconds.
Dec 22 02:08:55 vampus kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 22 02:08:55 vampus kernel: nfsd          D ffff88021c1cf000     0  2508      2 0x00000000
Dec 22 02:08:55 vampus kernel: ffff88021e4d0000 0000000000000046 0000000000000000 ffffffffa04494e0
Dec 22 02:08:55 vampus kernel: 0000000000000003 ffff880218dabd20 000000000000fa40 ffff880218dabfd8
Dec 22 02:08:55 vampus kernel: 0000000000016940 0000000000016940 ffff88021c1cf000 ffff88021c1cf2f8
Dec 22 02:08:55 vampus kernel: Call Trace:
Dec 22 02:08:55 vampus kernel: [<ffffffffa02b41ce>] ? encode_fattr3+0x125/0x135 [nfsd]
Dec 22 02:08:55 vampus kernel: [<ffffffff81313fc3>] ? __mutex_lock_common+0x122/0x192
Dec 22 02:08:55 vampus kernel: [<ffffffff813140eb>] ? mutex_lock+0x1a/0x31
Dec 22 02:08:55 vampus kernel: [<ffffffffa01e9deb>] ? svc_send+0x4e/0x9b [sunrpc]
Dec 22 02:08:55 vampus kernel: [<ffffffffa01dee28>] ? svc_process+0x618/0x627 [sunrpc]
Dec 22 02:08:55 vampus kernel: [<ffffffffa02a8772>] ? nfsd+0x0/0x12e [nfsd]
Dec 22 02:08:55 vampus kernel: [<ffffffffa02a8857>] ? nfsd+0xe5/0x12e [nfsd]
Dec 22 02:08:55 vampus kernel: [<ffffffff8106656a>] ? kthread+0xc0/0xca
Dec 22 02:08:55 vampus kernel: [<ffffffff81011c6a>] ? child_rip+0xa/0x20
Dec 22 02:08:55 vampus kernel: [<ffffffff810664aa>] ? kthread+0x0/0xca
Dec 22 02:08:55 vampus kernel: [<ffffffff81011c60>] ? child_rip+0x0/0x20
Dec 22 02:08:55 vampus kernel: INFO: task nfsd:2509 blocked for more than 120 seconds.
Dec 22 02:08:55 vampus kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 22 02:08:55 vampus kernel: nfsd          D ffff88021c1cc000     0  2509      2 0x00000000
Dec 22 02:08:55 vampus kernel: ffff88021e48c000 0000000000000046 0000000000000000 ffffffffa04494e0
Dec 22 02:08:55 vampus kernel: 0000000000000001 ffff880218ea5d20 000000000000fa40 ffff880218ea5fd8
Dec 22 02:08:55 vampus kernel: 0000000000016940 0000000000016940 ffff88021c1cc000 ffff88021c1cc2f8
Dec 22 02:08:55 vampus kernel: Call Trace:
Dec 22 02:08:55 vampus kernel: [<ffffffffa02b127e>] ? nfsd_cache_update+0x9a/0x13a [nfsd]
Dec 22 02:08:55 vampus kernel: [<ffffffff81313fc3>] ? __mutex_lock_common+0x122/0x192
Dec 22 02:08:55 vampus kernel: [<ffffffff813140eb>] ? mutex_lock+0x1a/0x31
Dec 22 02:08:55 vampus kernel: [<ffffffffa01e9deb>] ? svc_send+0x4e/0x9b [sunrpc]
Dec 22 02:08:55 vampus kernel: [<ffffffffa01dee28>] ? svc_process+0x618/0x627 [sunrpc]
Dec 22 02:08:55 vampus kernel: [<ffffffffa02a8772>] ? nfsd+0x0/0x12e [nfsd]
Dec 22 02:08:55 vampus kernel: [<ffffffffa02a8857>] ? nfsd+0xe5/0x12e [nfsd]
Dec 22 02:08:55 vampus kernel: [<ffffffff8106656a>] ? kthread+0xc0/0xca
Dec 22 02:08:55 vampus kernel: [<ffffffff81011c6a>] ? child_rip+0xa/0x20
Dec 22 02:08:55 vampus kernel: [<ffffffff810664aa>] ? kthread+0x0/0xca
Dec 22 02:08:55 vampus kernel: [<ffffffff81011c60>] ? child_rip+0x0/0x20
Dec 22 02:08:55 vampus kernel: INFO: task nfsd:2510 blocked for more than 120 seconds.
...
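
To get a quick overview of which tasks are affected in a trace like this, the "blocked for more than" lines can be counted per task. This is only a minimal sketch in Python, assuming the syslog line format shown above; the function name `blocked_tasks` is hypothetical and not part of any Proxmox tool.

```python
import re
from collections import Counter

# Matches kernel hung-task reports such as:
# "... kernel: INFO: task nfsd:2503 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds"
)


def blocked_tasks(lines):
    """Count hung-task reports per (task name, pid) in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            name, pid, _seconds = match.groups()
            counts[(name, int(pid))] += 1
    return counts
```

For example, passing an open syslog file (or `sys.stdin`) to `blocked_tasks` returns a counter keyed by task name and PID, which shows at a glance whether only the nfsd threads are stuck.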

The backup never finishes, and we can no longer log in to the Proxmox web interface.
Can anyone please help us?

Thanks a lot
 
So you back up to an NFS share where the NFS server and the client are both located on the same physical host?
 
The NFS server and the backup store are located on server 1. Servers 1 and 2 both back up via the NFS share to server 1.
 
Is it possible to use Proxmox 1.7 with kernel 2.6.18? Would kernel 2.6.24 also be old enough?
Thanks
 
Of course it's possible, and preferable, if you ask me. I have had nothing but problems with the latest kernel, at least on my systems. The only question is whether your hardware is supported, and it is if it worked with 1.4. If so, just go for it and don't think about it any longer.
 
We use DRBD. Is it a problem if we switch back to an older kernel? I think we used the default 1.4 kernel, 2.6.24, before.
 
Well, I guess it will work with the 2.6.24 kernel, but you won't get any more security updates. It's a really awkward situation. Currently the best solution is to get AMD-based servers, but in most cases that is neither much comfort nor a real possibility. :)
 
Configure the backup for server 1 NOT to use the NFS share. Instead, add a second backup destination pointing directly to the directory (you also need to create a 'dummy' directory on the second node).
We have seen this issue on several installations. It is generally problematic to use NFS like this; at least there are some posts around about it.
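
To check whether a node mounts an NFS export from itself, the mount table can be compared against the node's own addresses. The following is only a rough sketch in Python, assuming the usual Linux `/proc/mounts` layout (source, mount point, filesystem type, ...). The function name `self_nfs_mounts` and the addresses in the usage note are hypothetical.

```python
def self_nfs_mounts(mounts_text, local_addresses):
    """Return (source, mountpoint) pairs for NFS mounts whose server is this host.

    mounts_text     -- contents of /proc/mounts as a string
    local_addresses -- set of host names / IP addresses belonging to this node
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        source, mountpoint, fstype = fields[:3]
        if fstype not in ("nfs", "nfs4"):
            continue  # ignore local filesystems and the nfsd pseudo-filesystem
        # NFS sources look like "server:/export"; IPv6 servers are bracketed.
        server, sep, _path = source.partition(":/")
        if not sep:
            continue
        if server.strip("[]") in local_addresses:
            hits.append((source, mountpoint))
    return hits
```

On a node one would read `/proc/mounts` and pass in that node's own addresses, for example `self_nfs_mounts(open("/proc/mounts").read(), {"127.0.0.1", "192.168.1.10"})`. Any result means the node is an NFS client of itself, which is the setup described above as problematic.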
 
Thanks a lot. First we will try backing up directly to the directory. If that does not help, we will try an older kernel.