Hello,
I would like to ask for help with a bug I have run into.
I have a Proxmox cluster built from two nodes. Sometimes the pvestatd service hangs and that node is marked red (unavailable) in the web GUI. When I look into its log, I see that a process it launched, zpool status -o name -H rpool, hangs, uses 100% of one CPU core, and cannot be killed (even with SIGKILL).
After that, the following appears in my dmesg every minute:
Code:
[519015.269863] spl_kmem_alloc_impl: 114631497 callbacks suppressed
[519015.269864] Possible memory allocation deadlock: size=32776 lflags=0x1404200
[519015.269866] CPU: 1 PID: 16295 Comm: zpool Tainted: P O 4.13.4-1-pve #1
[519015.269867] Hardware name: Supermicro X10SLM-F/X10SLM-F, BIOS 3.0a 12/21/2015
[519015.269868] Call Trace:
[519015.269873] dump_stack+0x63/0x8b
[519015.269880] spl_kmem_alloc_impl+0x173/0x180 [spl]
[519015.269882] spl_vmem_alloc+0x19/0x20 [spl]
[519015.269887] nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
[519015.269889] nv_mem_zalloc.isra.0+0x15/0x40 [znvpair]
[519015.269891] nvlist_xpack+0xb4/0x110 [znvpair]
[519015.269894] ? nvlist_common.part.89+0x118/0x200 [znvpair]
[519015.269896] nvlist_pack+0x34/0x40 [znvpair]
[519015.269899] fnvlist_pack+0x3e/0xa0 [znvpair]
[519015.269931] put_nvlist+0x95/0x100 [zfs]
[519015.269953] zfs_ioc_pool_stats+0x50/0x90 [zfs]
[519015.269974] zfsdev_ioctl+0x5d4/0x660 [zfs]
[519015.269976] do_vfs_ioctl+0xa3/0x610
[519015.269979] ? handle_mm_fault+0xce/0x1c0
[519015.269980] ? __do_page_fault+0x266/0x4e0
[519015.269981] SyS_ioctl+0x79/0x90
[519015.269982] entry_SYSCALL_64_fastpath+0x1e/0xa9
[519015.269983] RIP: 0033:0x7f6cb065ae07
[519015.269984] RSP: 002b:00007fff87ef6a68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[519015.269985] RAX: ffffffffffffffda RBX: 00007f6cb0913b00 RCX: 00007f6cb065ae07
[519015.269986] RDX: 00007fff87ef6a90 RSI: 0000000000005a05 RDI: 0000000000000003
[519015.269986] RBP: 0000558c42291fb0 R08: 0000000000000003 R09: 0000000000010010
[519015.269987] R10: 00007f6cb069bb20 R11: 0000000000000246 R12: 0000000000010000
[519015.269987] R13: 0000000000020060 R14: 0000558c42291fa0 R15: 00007fff87efa0e0
[519015.269989] Possible memory allocation deadlock: size=32776 lflags=0x1404200
[519015.269990] CPU: 1 PID: 16295 Comm: zpool Tainted: P O 4.13.4-1-pve #1
[519015.269991] Hardware name: Supermicro X10SLM-F/X10SLM-F, BIOS 3.0a 12/21/2015
[519015.269991] Call Trace:
[519015.269992] dump_stack+0x63/0x8b
[519015.269995] spl_kmem_alloc_impl+0x173/0x180 [spl]
[519015.269997] spl_vmem_alloc+0x19/0x20 [spl]
[519015.270000] nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
[519015.270002] nv_mem_zalloc.isra.0+0x15/0x40 [znvpair]
[519015.270004] nvlist_xpack+0xb4/0x110 [znvpair]
[519015.270007] ? nvlist_common.part.89+0x118/0x200 [znvpair]
[519015.270009] nvlist_pack+0x34/0x40 [znvpair]
[519015.270012] fnvlist_pack+0x3e/0xa0 [znvpair]
[519015.270032] put_nvlist+0x95/0x100 [zfs]
[519015.270052] zfs_ioc_pool_stats+0x50/0x90 [zfs]
[519015.270092] zfsdev_ioctl+0x5d4/0x660 [zfs]
[519015.270094] do_vfs_ioctl+0xa3/0x610
[519015.270095] ? handle_mm_fault+0xce/0x1c0
[519015.270096] ? __do_page_fault+0x266/0x4e0
[519015.270097] SyS_ioctl+0x79/0x90
[519015.270098] entry_SYSCALL_64_fastpath+0x1e/0xa9
[519015.270098] RIP: 0033:0x7f6cb065ae07
[519015.270099] RSP: 002b:00007fff87ef6a68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[519015.270099] RAX: ffffffffffffffda RBX: 00007f6cb0913b00 RCX: 00007f6cb065ae07
[519015.270100] RDX: 00007fff87ef6a90 RSI: 0000000000005a05 RDI: 0000000000000003
[519015.270100] RBP: 0000558c42291fb0 R08: 0000000000000003 R09: 0000000000010010
[519015.270101] R10: 00007f6cb069bb20 R11: 0000000000000246 R12: 0000000000010000
[519015.270101] R13: 0000000000020060 R14: 0000558c42291fa0 R15: 00007fff87efa0e0
[519015.270103] Possible memory allocation deadlock: size=32776 lflags=0x1404200
[519015.270104] CPU: 1 PID: 16295 Comm: zpool Tainted: P O 4.13.4-1-pve #1
[519015.270104] Hardware name: Supermicro X10SLM-F/X10SLM-F, BIOS 3.0a 12/21/2015
[519015.270104] Call Trace:
[519015.270105] dump_stack+0x63/0x8b
[519015.270108] spl_kmem_alloc_impl+0x173/0x180 [spl]
[519015.270110] spl_vmem_alloc+0x19/0x20 [spl]
[519015.270122] nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
[519015.270124] nv_mem_zalloc.isra.0+0x15/0x40 [znvpair]
[519015.270127] nvlist_xpack+0xb4/0x110 [znvpair]
[519015.270130] ? nvlist_common.part.89+0x118/0x200 [znvpair]
[519015.270132] nvlist_pack+0x34/0x40 [znvpair]
[519015.270135] fnvlist_pack+0x3e/0xa0 [znvpair]
[519015.270164] put_nvlist+0x95/0x100 [zfs]
[519015.270184] zfs_ioc_pool_stats+0x50/0x90 [zfs]
[519015.270204] zfsdev_ioctl+0x5d4/0x660 [zfs]
[519015.270205] do_vfs_ioctl+0xa3/0x610
[519015.270206] ? handle_mm_fault+0xce/0x1c0
[519015.270207] ? __do_page_fault+0x266/0x4e0
[519015.270208] SyS_ioctl+0x79/0x90
[519015.270209] entry_SYSCALL_64_fastpath+0x1e/0xa9
[519015.270210] RIP: 0033:0x7f6cb065ae07
[519015.270210] RSP: 002b:00007fff87ef6a68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[519015.270211] RAX: ffffffffffffffda RBX: 00007f6cb0913b00 RCX: 00007f6cb065ae07
[519015.270211] RDX: 00007fff87ef6a90 RSI: 0000000000005a05 RDI: 0000000000000003
[519015.270212] RBP: 0000558c42291fb0 R08: 0000000000000003 R09: 0000000000010010
[519015.270212] R10: 00007f6cb069bb20 R11: 0000000000000246 R12: 0000000000010000
[519015.270213] R13: 0000000000020060 R14: 0000558c42291fa0 R15: 00007fff87efa0e0
[519015.270214] Possible memory allocation deadlock: size=32776 lflags=0x1404200
[519015.270215] CPU: 1 PID: 16295 Comm: zpool Tainted: P O 4.13.4-1-pve #1
[519015.270216] Hardware name: Supermicro X10SLM-F/X10SLM-F, BIOS 3.0a 12/21/2015
[519015.270216] Call Trace:
[519015.270217] dump_stack+0x63/0x8b
[519015.270219] spl_kmem_alloc_impl+0x173/0x180 [spl]
[519015.270222] spl_vmem_alloc+0x19/0x20 [spl]
[519015.270224] nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
[519015.270226] nv_mem_zalloc.isra.0+0x15/0x40 [znvpair]
[519015.270229] nvlist_xpack+0xb4/0x110 [znvpair]
[519015.270231] ? nvlist_common.part.89+0x118/0x200 [znvpair]
[519015.270234] nvlist_pack+0x34/0x40 [znvpair]
[519015.270236] fnvlist_pack+0x3e/0xa0 [znvpair]
[519015.270256] put_nvlist+0x95/0x100 [zfs]
[519015.270275] zfs_ioc_pool_stats+0x50/0x90 [zfs]
[519015.270295] zfsdev_ioctl+0x5d4/0x660 [zfs]
[519015.270296] do_vfs_ioctl+0xa3/0x610
[519015.270297] ? handle_mm_fault+0xce/0x1c0
[519015.270298] ? __do_page_fault+0x266/0x4e0
[519015.270299] SyS_ioctl+0x79/0x90
[519015.270300] entry_SYSCALL_64_fastpath+0x1e/0xa9
[519015.270300] RIP: 0033:0x7f6cb065ae07
[519015.270301] RSP: 002b:00007fff87ef6a68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[519015.270302] RAX: ffffffffffffffda RBX: 00007f6cb0913b00 RCX: 00007f6cb065ae07
[519015.270302] RDX: 00007fff87ef6a90 RSI: 0000000000005a05 RDI: 0000000000000003
[519015.270302] RBP: 0000558c42291fb0 R08: 0000000000000003 R09: 0000000000010010
[519015.270303] R10: 00007f6cb069bb20 R11: 0000000000000246 R12: 0000000000010000
[519015.270303] R13: 0000000000020060 R14: 0000558c42291fa0 R15: 00007fff87efa0e0
[519015.270305] Possible memory allocation deadlock: size=32776 lflags=0x1404200
[519015.270306] CPU: 1 PID: 16295 Comm: zpool Tainted: P O 4.13.4-1-pve #1
[519015.270306] Hardware name: Supermicro X10SLM-F/X10SLM-F, BIOS 3.0a 12/21/2015
[519015.270306] Call Trace:
[519015.270307] dump_stack+0x63/0x8b
[519015.270310] spl_kmem_alloc_impl+0x173/0x180 [spl]
[519015.270312] spl_vmem_alloc+0x19/0x20 [spl]
[519015.270315] nv_alloc_sleep_spl+0x1f/0x30 [znvpair]
[519015.270317] nv_mem_zalloc.isra.0+0x15/0x40 [znvpair]
[519015.270319] nvlist_xpack+0xb4/0x110 [znvpair]
[519015.270322] ? nvlist_common.part.89+0x118/0x200 [znvpair]
[519015.270324] nvlist_pack+0x34/0x40 [znvpair]
[519015.270326] fnvlist_pack+0x3e/0xa0 [znvpair]
[519015.270346] put_nvlist+0x95/0x100 [zfs]
[519015.270366] zfs_ioc_pool_stats+0x50/0x90 [zfs]
[519015.270385] zfsdev_ioctl+0x5d4/0x660 [zfs]
[519015.270387] do_vfs_ioctl+0xa3/0x610
[519015.270388] ? handle_mm_fault+0xce/0x1c0
[519015.270388] ? __do_page_fault+0x266/0x4e0
[519015.270389] SyS_ioctl+0x79/0x90
[519015.270390] entry_SYSCALL_64_fastpath+0x1e/0xa9
[519015.270391] RIP: 0033:0x7f6cb065ae07
[519015.270391] RSP: 002b:00007fff87ef6a68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[519015.270392] RAX: ffffffffffffffda RBX: 00007f6cb0913b00 RCX: 00007f6cb065ae07
[519015.270392] RDX: 00007fff87ef6a90 RSI: 0000000000005a05 RDI: 0000000000000003
[519015.270393] RBP: 0000558c42291fb0 R08: 0000000000000003 R09: 0000000000010010
[519015.270393] R10: 00007f6cb069bb20 R11: 0000000000000246 R12: 0000000000010000
[519015.270394] R13: 0000000000020060 R14: 0000558c42291fa0 R15: 00007fff87efa0e0
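Since the same trace repeats many times, a condensed summary may be easier to read than the raw dump. Below is a minimal sketch, assuming the dmesg output has been saved as text and that each warning line is followed by a "CPU: ... PID: ... Comm: ..." line, as in the excerpt above. The function name summarize_deadlock_warnings and the regular expressions are my own illustration, not part of any ZFS or Proxmox tooling.

```python
import re
from collections import Counter

# Matches the SPL warning line, e.g.
# "Possible memory allocation deadlock: size=32776 lflags=0x1404200"
DEADLOCK_RE = re.compile(
    r"Possible memory allocation deadlock: size=(\d+) lflags=(0x[0-9a-fA-F]+)"
)
# Matches the task line that follows it, e.g.
# "CPU: 1 PID: 16295 Comm: zpool Tainted: ..."
TASK_RE = re.compile(r"CPU: \d+ PID: (\d+) Comm: (\S+)")


def summarize_deadlock_warnings(dmesg_text):
    """Count warnings per (comm, pid, size, lflags) tuple.

    Hypothetical helper: it pairs each warning line with the next
    task line it sees and ignores everything else.
    """
    counts = Counter()
    pending = None  # (size, lflags) from the most recent warning line
    for line in dmesg_text.splitlines():
        warning = DEADLOCK_RE.search(line)
        if warning:
            pending = (int(warning.group(1)), warning.group(2))
            continue
        task = TASK_RE.search(line)
        if task and pending is not None:
            key = (task.group(2), int(task.group(1))) + pending
            counts[key] += 1
            pending = None
    return counts
```

Running it over the saved dmesg text gives one counter entry per distinct process and allocation size, which shows at a glance whether all warnings come from the same stuck zpool process.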
When I restart pvestatd and wait for some timeouts to expire, the node turns green again in the GUI, but the zpool process still hangs and can only be removed by rebooting the system.
Could you please give me some tips on how to avoid this problem? I can of course supply additional logs if needed.
Thanks, have a nice day!
David