Strange I/O lockup

tizbac

New Member
Dec 4, 2023
Hello, we just bought a DL580 Gen9 with 4x 18-core Xeons (the server is turned off right now; I can turn it on later for the exact model).
The RAID controller is a P830i. I know you strongly advise against HW RAID controllers, but I bet at least a third of the people here are actually using one :) , and we have always used them without problems.
The problem is very "simple":
For example, I start migrating a virtual machine from another host, which causes a lot of write I/O while allocating the thin LVM volume for it (this is not very smart, by the way, because you end up writing everything twice; maybe you assume the storage may be non-zeroed, but a thinly provisioned LVM volume returns zeros for unallocated blocks anyway).
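If you want to double-check that zeroing behaviour on your own pool, lvs can report it (pve/data below is just the Proxmox default pool name, adjust as needed, and verify the field name on your LVM version):

Code:
# check whether the thin pool zeroes newly provisioned blocks ("Zero" column)
lvs -o lv_name,zero pve/data
# zeroing can be toggled per pool with lvchange
lvchange --zero y pve/data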
While that intensive write I/O is taking place (in our case around 500 MB/s), you run hdparm -tT on a logical drive of the same controller, even the same one the writes are going to, and the result is a kernel deadlock.
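Roughly, the reproduction looks like this (device paths are examples from our setup; any sustained buffered write seems to do):

Code:
# terminal 1: sustained sequential writes to a thin LVM volume on the P830i
# (the live migration with local LVM-thin storage produces the same write pattern)
dd if=/dev/zero of=/dev/pve/vm-100-disk-0 bs=1M count=100000 status=progress

# terminal 2: while the writes are running, benchmark any logical drive on the same controller
hdparm -tT /dev/sda   # never prints anything, Ctrl+C has no effect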

hdparm never starts, Ctrl+C doesn't work, after a while the kernel starts spamming hung-task timeouts, and within a few minutes the machine becomes completely unusable, requiring a reboot via SysRq.
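(For reference, this is the SysRq reboot we end up doing; an emergency sync would just hang in this state, so we go straight to reboot:)

Code:
# from a console that still accepts input, or Alt+SysRq+b on the iLO/physical keyboard
echo 1 > /proc/sys/kernel/sysrq   # make sure SysRq is fully enabled
echo b > /proc/sysrq-trigger      # immediate reboot: no sync, no clean unmount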

This happens with the 6.5.13 kernel shipped with Proxmox 8.1, and also with 6.7 mainline.
It is 99% a kernel bug, but I want to find out whether anyone here has seen similar cases, even without an HP Smart Array; I suspect the 4-socket layout and NUMA are the trigger.
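If anyone wants to compare topologies, this is how I dump ours (numactl and numastat come from the numactl package):

Code:
# sockets/NUMA nodes and how memory is split across them
numactl --hardware
lscpu | grep -i numa
# per-node memory usage, to spot one node soaking up all the writeback pages
numastat -m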

Edit: we also tested a similar scenario with FreeBSD 14 and it did not freeze at all; Clonezilla between logical drives didn't freeze either, so it may be related to LVM too.
 
Update:
The problem also reproduces with LVM thin with the whole Proxmox install on a USB 2.0-to-SATA adapter, so this looks like serious I/O starvation when doing a migration with local storage on LVM thin.
The server also showed around 90 GB of buffer memory usage, and the launched hdparm is unkillable.
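While it's wedged, the state can still be inspected from another SSH session that responds (a sketch; the exact fields depend on your procps version):

Code:
# dirty/writeback backlog; this is where the ~90 GB shows up
grep -E '^(Buffers|Dirty|Writeback):' /proc/meminfo

# tasks stuck in uninterruptible sleep (D state) and the kernel function they wait in
ps -eo pid,stat,wchan:30,cmd | awk '$2 ~ /^D/'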

System: 72x Intel(R) Xeon(R) CPU E7-8895 v3 @ 2.60GHz
RAM: 512 GB


Dmesg:

Code:
[ 1940.801745] INFO: task hdparm:37833 blocked for more than 1087 seconds.
[ 1940.802900]       Tainted: P           O       6.5.11-8-pve #1
[ 1940.803910] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1940.804917] task:hdparm          state:D stack:0     pid:37833 ppid:37161  flags:0x00004006
[ 1940.804929] Call Trace:
[ 1940.804935]  <TASK>
[ 1940.804944]  __schedule+0x3fd/0x1450
[ 1940.804967]  ? blk_mq_run_hw_queue+0x154/0x210
[ 1940.804987]  schedule+0x63/0x110
[ 1940.804993]  io_schedule+0x46/0x80
[ 1940.804999]  folio_wait_bit_common+0x136/0x330
[ 1940.805016]  ? __pfx_wake_page_function+0x10/0x10
[ 1940.805034]  folio_wait_bit+0x18/0x30
[ 1940.805039]  folio_wait_writeback+0x2c/0xa0
[ 1940.805051]  __filemap_fdatawait_range+0x90/0x100
[ 1940.805059]  filemap_fdatawait_keep_errors+0x1e/0x50
[ 1940.805065]  sync_bdevs+0xaf/0x160
[ 1940.805086]  ksys_sync+0x73/0xb0
[ 1940.805107]  __do_sys_sync+0xe/0x20
[ 1940.805114]  do_syscall_64+0x5b/0x90
[ 1940.805129]  ? find_vma_intersection+0x31/0x60
[ 1940.805144]  ? __mm_populate+0xe4/0x190
[ 1940.805165]  ? exit_to_user_mode_prepare+0x39/0x190
[ 1940.805180]  ? syscall_exit_to_user_mode+0x37/0x60
[ 1940.805195]  ? do_syscall_64+0x67/0x90
[ 1940.805201]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8

The trace shows hdparm blocked in the sync() it issues before benchmarking, waiting on writeback of dirty pages that never completes. It could be related to the mq-deadline scheduler too; we are now proceeding to assess stability using ZFS instead of LVM thin.
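To rule the scheduler out, it can be switched at runtime per block device (sda is an example; the change does not persist across reboots):

Code:
# the active scheduler is shown in brackets
cat /sys/block/sda/queue/scheduler
# switch to 'none' to take mq-deadline out of the picture
echo none > /sys/block/sda/queue/scheduler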
 
