Question marks on all VMs / pvestatd blocked

Hello,

occasionally we are experiencing an issue on various hosts where all VMs and storages are displayed with question marks. From our analysis it appears that the pvestatd service is blocked, with the kernel messages below being logged. Unfortunately we were unable to kill the process or to recover in any way other than a complete host restart (a rough sketch of what we tried is shown after the trace).

Code:
2023-08-28T23:30:55.391180+02:00 vmh08 kernel: [1072167.577334] INFO: task pvestatd:2603603 blocked for more than 1208 seconds.
2023-08-28T23:30:55.391191+02:00 vmh08 kernel: [1072167.577652]       Tainted: P           O       6.2.16-6-pve #1
2023-08-28T23:30:55.391192+02:00 vmh08 kernel: [1072167.577861] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2023-08-28T23:30:55.391192+02:00 vmh08 kernel: [1072167.578065] task:pvestatd        state:D stack:0     pid:2603603 ppid:3244   flags:0x00000006
2023-08-28T23:30:55.391193+02:00 vmh08 kernel: [1072167.578240] Call Trace:
2023-08-28T23:30:55.391193+02:00 vmh08 kernel: [1072167.578411]  <TASK>
2023-08-28T23:30:55.391193+02:00 vmh08 kernel: [1072167.578582]  __schedule+0x402/0x1510
2023-08-28T23:30:55.391194+02:00 vmh08 kernel: [1072167.578752]  schedule+0x63/0x110
2023-08-28T23:30:55.391194+02:00 vmh08 kernel: [1072167.578915]  schedule_preempt_disabled+0x15/0x30
2023-08-28T23:30:55.391195+02:00 vmh08 kernel: [1072167.579077]  __mutex_lock.constprop.0+0x3f8/0x7a0
2023-08-28T23:30:55.391195+02:00 vmh08 kernel: [1072167.579235]  ? __kmem_cache_alloc_node+0x19d/0x340
2023-08-28T23:30:55.391195+02:00 vmh08 kernel: [1072167.579396]  __mutex_lock_slowpath+0x13/0x20
2023-08-28T23:30:55.391196+02:00 vmh08 kernel: [1072167.579550]  mutex_lock+0x3c/0x50
2023-08-28T23:30:55.391196+02:00 vmh08 kernel: [1072167.579705]  smb2_reconnect+0x18f/0x4a0 [cifs]
2023-08-28T23:30:55.391196+02:00 vmh08 kernel: [1072167.579894]  SMB2_open_init+0x72/0xb60 [cifs]
2023-08-28T23:30:55.391197+02:00 vmh08 kernel: [1072167.580074]  smb2_compound_op+0x5ce/0x1fe0 [cifs]
2023-08-28T23:30:55.391197+02:00 vmh08 kernel: [1072167.580259]  smb2_query_path_info+0x1ce/0x400 [cifs]
2023-08-28T23:30:55.391198+02:00 vmh08 kernel: [1072167.580439]  cifs_get_inode_info+0x434/0xc20 [cifs]
2023-08-28T23:30:55.391198+02:00 vmh08 kernel: [1072167.580619]  ? __pfx_smb2_query_path_info+0x10/0x10 [cifs]
2023-08-28T23:30:55.391198+02:00 vmh08 kernel: [1072167.580800]  ? cifs_get_inode_info+0x434/0xc20 [cifs]
2023-08-28T23:30:55.391199+02:00 vmh08 kernel: [1072167.580981]  cifs_revalidate_dentry_attr+0x1a9/0x3e0 [cifs]
2023-08-28T23:30:55.391199+02:00 vmh08 kernel: [1072167.581163]  cifs_getattr+0xbd/0x260 [cifs]
2023-08-28T23:30:55.391200+02:00 vmh08 kernel: [1072167.581353]  vfs_getattr_nosec+0xc8/0x120
2023-08-28T23:30:55.395012+02:00 vmh08 kernel: [1072167.581515]  vfs_statx+0xc4/0x180
2023-08-28T23:30:55.395013+02:00 vmh08 kernel: [1072167.581672]  vfs_fstatat+0x58/0x80
2023-08-28T23:30:55.395014+02:00 vmh08 kernel: [1072167.581828]  __do_sys_newfstatat+0x44/0x90
2023-08-28T23:30:55.395014+02:00 vmh08 kernel: [1072167.581991]  __x64_sys_newfstatat+0x1c/0x30
2023-08-28T23:30:55.395014+02:00 vmh08 kernel: [1072167.582146]  do_syscall_64+0x5b/0x90
2023-08-28T23:30:55.395014+02:00 vmh08 kernel: [1072167.582301]  ? irqentry_exit_to_user_mode+0x9/0x20
2023-08-28T23:30:55.395015+02:00 vmh08 kernel: [1072167.582455]  ? irqentry_exit+0x43/0x50
2023-08-28T23:30:55.395015+02:00 vmh08 kernel: [1072167.582607]  ? exc_page_fault+0x91/0x1b0
2023-08-28T23:30:55.395015+02:00 vmh08 kernel: [1072167.582760]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
2023-08-28T23:30:55.395015+02:00 vmh08 kernel: [1072167.582917] RIP: 0033:0x7f3ef32e263a
2023-08-28T23:30:55.395016+02:00 vmh08 kernel: [1072167.583073] RSP: 002b:00007ffce238a688 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
2023-08-28T23:30:55.395016+02:00 vmh08 kernel: [1072167.583232] RAX: ffffffffffffffda RBX: 00005625cdcc52a0 RCX: 00007f3ef32e263a
2023-08-28T23:30:55.395017+02:00 vmh08 kernel: [1072167.583393] RDX: 00005625cdcc54a8 RSI: 00005625ce160480 RDI: 00000000ffffff9c
2023-08-28T23:30:55.395017+02:00 vmh08 kernel: [1072167.583555] RBP: 00005625ce38e730 R08: 00005625d2c33640 R09: 00005625d2a56ef0
2023-08-28T23:30:55.395018+02:00 vmh08 kernel: [1072167.583718] R10: 0000000000000000 R11: 0000000000000246 R12: 00005625ce160480
2023-08-28T23:30:55.395018+02:00 vmh08 kernel: [1072167.583882] R13: 00005625ccf841c2 R14: 0000000000000000 R15: 00007f3ef34e7020
2023-08-28T23:30:55.395018+02:00 vmh08 kernel: [1072167.584048]  </TASK>
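
For reference, a rough sketch of what we tried before rebooting (generic commands, not host-specific output; the PID is just a placeholder):

Code:
# service state and the blocked worker
systemctl status pvestatd
ps -o pid,stat,wchan:32,cmd -C pvestatd

# neither a restart nor kill -9 has any effect while the task is in state D
systemctl restart pvestatd
kill -9 <pid>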
 

Attachment: Clipboard - 29. August 2023 10_38.png
Hi,
it looks like pvestatd is stuck in an uninterruptible sleep while performing an I/O request on the CIFS mount: the connection to the SMB share was lost, so the request never gets a response and the task hangs until the host is rebooted.
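
To see which tasks are affected, you can list processes in uninterruptible sleep (state D) and what they are waiting on; the commands below are just an illustration:

Code:
# tasks in uninterruptible sleep (state D) and the kernel function they wait in
ps -eo pid,ppid,stat,wchan:32,comm | awk 'NR==1 || $3 ~ /^D/'

# re-read the hung task warnings from the kernel log
dmesg -T | grep 'blocked for more than'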

From the logs it looks like you are using SMB version 2. Please check whether the problem persists after setting a higher smbversion in the storage configuration or using the default version 3, see also https://pve.proxmox.com/pve-docs/pve-admin-guide.html#storage_cifs
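
For example, the version can be set with pvesm or directly in the storage configuration (storage name, server, share and credentials below are placeholders):

Code:
# set a newer SMB dialect on an existing CIFS storage
pvesm set my-cifs-storage --smbversion 3

# corresponding entry in /etc/pve/storage.cfg
cifs: my-cifs-storage
        path /mnt/pve/my-cifs-storage
        server 192.0.2.10
        share backup
        username backupuser
        smbversion 3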
 
Dear Chris,

thanks for your help. We will adjust the SMB version and monitor the hosts. Since the problem only occurs occasionally, it will take some time to see whether the change has any positive impact.
 
