Hi there,
I've been running a new node with 4.3 for a couple of weeks now and everything has been pretty great.
Just now, the server became unresponsive (even to pings) for several minutes. Looking in syslog I found a few kernel errors:
Code:
Oct 08 12:12:16 s1 kernel: INFO: task pvestatd:5982 blocked for more than 120 seconds.
Oct 08 12:12:16 s1 kernel: Tainted: P O 4.4.16-1-pve #1
Oct 08 12:12:16 s1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 08 12:12:16 s1 kernel: pvestatd D ffff8809422dfc18 0 5982 4453 0x00000004
Oct 08 12:12:16 s1 kernel: ffff8809422dfc18 00000000c1965920 ffff881ff27d6e00 ffff881fd9888000
Oct 08 12:12:16 s1 kernel: ffff8809422e0000 ffff8809422dfc58 ffff881faf31ebc0 ffff881ccdf4e050
Oct 08 12:12:16 s1 kernel: fffffffffffffe00 ffff8809422dfc30 ffffffff8184d835 ffff881faf31eaf0
Oct 08 12:12:16 s1 kernel: Call Trace:
Oct 08 12:12:16 s1 kernel: [<ffffffff8184d835>] schedule+0x35/0x80
Oct 08 12:12:16 s1 kernel: [<ffffffff8131871f>] request_wait_answer+0x12f/0x280
Oct 08 12:12:16 s1 kernel: [<ffffffff810c3fe0>] ? wait_woken+0x90/0x90
Oct 08 12:12:16 s1 kernel: [<ffffffff813188d9>] __fuse_request_send+0x69/0x90
Oct 08 12:12:16 s1 kernel: [<ffffffff81318927>] fuse_request_send+0x27/0x30
Oct 08 12:12:16 s1 kernel: [<ffffffff8131b40b>] fuse_simple_request+0xcb/0x1a0
Oct 08 12:12:16 s1 kernel: [<ffffffff813252b2>] fuse_statfs+0xe2/0x160
Oct 08 12:12:16 s1 kernel: [<ffffffff81242a7f>] statfs_by_dentry+0x6f/0x90
Oct 08 12:12:16 s1 kernel: [<ffffffff81242abb>] vfs_statfs+0x1b/0xb0
Oct 08 12:12:16 s1 kernel: [<ffffffff81242ba8>] user_statfs+0x58/0xa0
Oct 08 12:12:16 s1 kernel: [<ffffffff81242c17>] SYSC_statfs+0x27/0x60
Oct 08 12:12:16 s1 kernel: [<ffffffff81242dfe>] SyS_statfs+0xe/0x10
Oct 08 12:12:16 s1 kernel: [<ffffffff81851936>] entry_SYSCALL_64_fastpath+0x16/0x75
The log contains three of these traces in total.
At the end, we have:
Code:
Oct 08 12:17:07 s1 pvestatd[4453]: status update time (506.590 seconds)
However, not long before that, pvestatd finished its status update in a reasonable amount of time:
Code:
Oct 08 12:02:31 s1 pvestatd[4453]: status update time (18.277 seconds)
Looking at /etc/pve/.rrd, the results from the status query seem fine. I do have one remote machine mounted via sshfs, but the earlier status update (only 15 minutes before) reported fine, and that remote share has been mounted for a week or so now.
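Since the trace shows pvestatd stuck in fuse_statfs, one way to check whether the sshfs mount is what stalls would be to time a statfs call against it from a throwaway process. The following is only a rough sketch: the helper name probe_statfs and the 5 second timeout are made up for illustration and are not anything pvestatd itself provides.

```python
import subprocess
import sys
import time


def probe_statfs(path, timeout=5.0):
    """Run os.statvfs(path) in a child process and time it.

    Returns (ok, seconds). ok is False if the call failed or did not
    finish within `timeout` seconds. A child process is used so that a
    hung FUSE mount blocks the child rather than this script.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            [sys.executable, "-c",
             "import os, sys; os.statvfs(sys.argv[1])", path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,  # on expiry the child is killed
        )
        ok = result.returncode == 0
    except subprocess.TimeoutExpired:
        # Assumption: the blocked child can still be killed. If it is
        # stuck in an unkillable wait, this call may itself hang.
        ok = False
    return ok, time.monotonic() - start
```

Calling probe_statfs("/mnt/remote") (the path is a placeholder for wherever the share is actually mounted) every minute or so and logging the result should show whether statfs on that mount occasionally stops answering.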
Any suggestions as to why this happened, so I can make sure it doesn't happen again?
Thank you in advance,
Jarrod.