VMs hung after backup

Unfortunately my celebration was premature. It died in the same way again after working fine for four days. This is pretty strange, since triggering the failure was very repeatable before the update (even on a freshly booted system) and consistently not reproducible immediately after it. This time, I managed to capture the following backtraces of the KVM processes using sysrq. Again, other processes subsequently go into D state after the KVM ones lock up; I can provide those too if useful, but I assume the KVM ones are the most important.

Code:
[386896.658774] sysrq: Show Blocked State
[386896.658831] task:kvm             state:D stack:0     pid:1579  ppid:1      flags:0x00004002
[386896.658835] Call Trace:
[386896.658837]  <TASK>
[386896.658840]  __schedule+0x3fc/0x1440
[386896.658845]  ? blk_mq_flush_plug_list+0x19/0x30
[386896.658849]  schedule+0x63/0x110
[386896.658852]  wait_on_commit+0x95/0xd0 [nfs]
[386896.658887]  ? __pfx_var_wake_function+0x10/0x10
[386896.658891]  __nfs_commit_inode+0x13b/0x190 [nfs]
[386896.658917]  nfs_wb_folio+0xee/0x240 [nfs]
[386896.658942]  nfs_release_folio+0x95/0x150 [nfs]
[386896.658963]  filemap_release_folio+0x6e/0xb0
[386896.658966]  shrink_folio_list+0x59c/0xc30
[386896.658970]  evict_folios+0x2bd/0x660
[386896.658973]  try_to_shrink_lruvec+0x1c8/0x2d0
[386896.658976]  shrink_one+0x123/0x1e0
[386896.658979]  shrink_node+0x9bd/0xc10
[386896.658981]  do_try_to_free_pages+0xe1/0x5f0
[386896.658983]  ? sched_clock_noinstr+0x9/0x10
[386896.658986]  try_to_free_pages+0xe6/0x210
[386896.658988]  __alloc_pages+0x6d9/0x12f0
[386896.658992]  ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
[386896.658994]  alloc_pages+0x90/0x160
[386896.658997]  __get_free_pages+0x11/0x50
[386896.658999]  __pollwait+0x9e/0xe0
[386896.659002]  eventfd_poll+0x2f/0x70
[386896.659005]  do_sys_poll+0x2f7/0x610
[386896.659009]  ? __pfx___pollwait+0x10/0x10
[386896.659012]  ? __pfx_pollwake+0x10/0x10
[386896.659014]  ? __pfx_pollwake+0x10/0x10
[386896.659016]  ? __pfx_pollwake+0x10/0x10
[386896.659018]  ? __pfx_pollwake+0x10/0x10
[386896.659020]  ? __pfx_pollwake+0x10/0x10
[386896.659023]  ? __pfx_pollwake+0x10/0x10
[386896.659025]  ? __pfx_pollwake+0x10/0x10
[386896.659027]  ? __pfx_pollwake+0x10/0x10
[386896.659029]  ? __pfx_pollwake+0x10/0x10
[386896.659031]  ? _copy_from_user+0x2f/0x80
[386896.659033]  ? __pfx_read_tsc+0x10/0x10
[386896.659036]  __x64_sys_ppoll+0xde/0x170
[386896.659038]  do_syscall_64+0x5b/0x90
[386896.659040]  ? syscall_exit_to_user_mode+0x37/0x60
[386896.659042]  ? do_syscall_64+0x67/0x90
[386896.659044]  ? do_syscall_64+0x67/0x90
[386896.659045]  ? do_syscall_64+0x67/0x90
[386896.659046]  ? do_syscall_64+0x67/0x90
[386896.659047]  ? sysvec_call_function_single+0x4b/0xd0
[386896.659049]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[386896.659051] RIP: 0033:0x7f612318b256
[386896.659070] RSP: 002b:00007ffce8298640 EFLAGS: 00000293 ORIG_RAX: 000000000000010f
[386896.659072] RAX: ffffffffffffffda RBX: 000056c7d90e36a0 RCX: 00007f612318b256
[386896.659073] RDX: 00007ffce8298660 RSI: 000000000000000c RDI: 000056c7da56ef70
[386896.659074] RBP: 00007ffce82986cc R08: 0000000000000008 R09: 0000000000000000
[386896.659075] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffce8298660
[386896.659076] R13: 000056c7d90e36a0 R14: 000056c7d6fd7c48 R15: 00007ffce82986d0
[386896.659078]  </TASK>
[386896.659082] task:kvm             state:D stack:0     pid:1820  ppid:1      flags:0x00024002
[386896.659084] Call Trace:
[386896.659085]  <TASK>
[386896.659086]  __schedule+0x3fc/0x1440
[386896.659088]  ? blk_mq_flush_plug_list+0x19/0x30
[386896.659090]  schedule+0x63/0x110
[386896.659091]  wait_on_commit+0x95/0xd0 [nfs]
[386896.659115]  ? __pfx_var_wake_function+0x10/0x10
[386896.659118]  __nfs_commit_inode+0x13b/0x190 [nfs]
[386896.659142]  nfs_wb_folio+0xee/0x240 [nfs]
[386896.659167]  nfs_release_folio+0x95/0x150 [nfs]
[386896.659187]  filemap_release_folio+0x6e/0xb0
[386896.659189]  shrink_folio_list+0x59c/0xc30
[386896.659193]  evict_folios+0x2bd/0x660
[386896.659196]  try_to_shrink_lruvec+0x1c8/0x2d0
[386896.659199]  shrink_one+0x123/0x1e0
[386896.659202]  shrink_node+0x9bd/0xc10
[386896.659204]  do_try_to_free_pages+0xe1/0x5f0
[386896.659206]  try_to_free_pages+0xe6/0x210
[386896.659209]  __alloc_pages+0x6d9/0x12f0
[386896.659211]  ? ktime_get+0x48/0xc0
[386896.659214]  alloc_pages+0x90/0x160
[386896.659216]  __get_free_pages+0x11/0x50
[386896.659219]  __pollwait+0x9e/0xe0
[386896.659221]  eventfd_poll+0x2f/0x70
[386896.659223]  do_sys_poll+0x2f7/0x610
[386896.659228]  ? __pfx___pollwait+0x10/0x10
[386896.659230]  ? __pfx_pollwake+0x10/0x10
[386896.659232]  ? __pfx_pollwake+0x10/0x10
[386896.659235]  ? __pfx_pollwake+0x10/0x10
[386896.659237]  ? __pfx_pollwake+0x10/0x10
[386896.659239]  ? __pfx_pollwake+0x10/0x10
[386896.659241]  ? __pfx_pollwake+0x10/0x10
[386896.659243]  ? __pfx_pollwake+0x10/0x10
[386896.659245]  ? __pfx_pollwake+0x10/0x10
[386896.659247]  ? __pfx_pollwake+0x10/0x10
[386896.659250]  __x64_sys_ppoll+0xde/0x170
[386896.659253]  do_syscall_64+0x5b/0x90
[386896.659254]  ? exit_to_user_mode_prepare+0x39/0x190
[386896.659257]  ? syscall_exit_to_user_mode+0x37/0x60
[386896.659258]  ? do_syscall_64+0x67/0x90
[386896.659260]  ? do_syscall_64+0x67/0x90
[386896.659261]  ? sysvec_reschedule_ipi+0x7a/0x120
[386896.659263]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[386896.659264] RIP: 0033:0x7f95cb98b256
[386896.659268] RSP: 002b:00007ffe6583cd40 EFLAGS: 00000293 ORIG_RAX: 000000000000010f
[386896.659270] RAX: ffffffffffffffda RBX: 000057aa15ebc420 RCX: 00007f95cb98b256
[386896.659271] RDX: 00007ffe6583cd60 RSI: 0000000000000044 RDI: 000057aa16ee9c00
[386896.659271] RBP: 00007ffe6583cdcc R08: 0000000000000008 R09: 0000000000000000
[386896.659272] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffe6583cd60
[386896.659273] R13: 000057aa15ebc420 R14: 000057aa14514c48 R15: 00007ffe6583cdd0
[386896.659275]  </TASK>
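For reference, these were captured by asking the kernel to dump blocked (D state) tasks via sysrq; roughly the following, from memory (the first step is only needed if sysrq isn't already enabled on your kernel):

Code:
# enable sysrq if necessary
echo 1 > /proc/sys/kernel/sysrq
# dump all tasks stuck in uninterruptible (D) state to the kernel log
echo w > /proc/sysrq-trigger
# read the resulting backtraces
dmesg | less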

@fiona I know you have been asking for some backtraces, so perhaps these might mean something to you.

I may have more time to mess around with it in a while. I have a few random ideas to check, mainly the NFS volume's free space (maybe four days of backups brought it down to some critical level?) and a memory test on the server (it passed fine about a year or two ago).
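The free-space idea at least is quick to check from the PVE host (the path below is my NFS mount point; adjust to yours). The memory test will have to wait for a maintenance window, since it means rebooting into memtest86+.

Code:
# how full is the NFS datastore, as seen from the PVE host?
df -h /mnt/pve/TrueNAS
# inode usage too, while we're at it
df -i /mnt/pve/TrueNAS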
The trace does indicate it's waiting in functions related to NFS, so yes, please check that the connection and the NFS server itself are working fine. If it's not a space or connection issue, there is also an opt-in 6.8 kernel you could try: https://forum.proxmox.com/threads/o...e-8-available-on-test-no-subscription.144557/
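Something along these lines should be enough to rule out the obvious connectivity problems (just a quick sketch, not an exhaustive check; replace the placeholder with your NFS server's address):

Code:
# does Proxmox consider the NFS storage active?
pvesm status
# is the server reachable and still exporting the share?
showmount -e <nfs-server-ip>
# per-mount NFS statistics, retransmissions etc.
nfsstat -m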
 
One thing I find strange is that these hanging KVM processes are those of the TrueNAS VM, which has nothing stored on NFS (since the NFS share is hosted by this VM and doesn't even exist yet when the VM boots). If something is causing the TrueNAS VM to block waiting on its own NFS server, that would explain the deadlock. The question then becomes: why are the TrueNAS KVM processes touching NFS at all if they have nothing stored on it?
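(For what it's worth, I matched the PIDs from the sysrq output against the VM roughly like this; 102 is my TrueNAS VMID, so adjust accordingly:)

Code:
# which VM owns the blocked pid 1579 from the trace?
grep -lx 1579 /var/run/qemu-server/*.pid
# or the other way round: main qemu pid of VM 102
cat /var/run/qemu-server/102.pid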

I suppose the next step would be trying to figure out what files those processes are accessing over NFS.

EDIT: Just triggered the lockup again (it took copying 150 GB of random junk this time), but I wasn't able to see any new file opens or open files on /mnt/pve/TrueNAS (the NFS mount).

I was watching
Code:
strace -t -e trace=open,openat,close,stat,fstat,lstat,connect,accept -p $(cat /var/run/qemu-server/102.pid) -ff -y

and running
Code:
lsof -a -p $(cat /var/run/qemu-server/102.pid)
every 10 seconds.
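(The "every 10 seconds" part was nothing fancy, just watch, roughly:)

Code:
# re-run lsof against the VM 102 qemu process every 10 seconds
watch -n 10 'lsof -a -p $(cat /var/run/qemu-server/102.pid)'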
 