Strange I/O lockup

tizbac

New Member
Dec 4, 2023
Hello, we just bought a DL580 Gen9 with 4x 18-core Xeons (the server is turned off right now; I can turn it on later for the exact model).
The RAID controller is a P830i. I know you strongly advise against HW RAID controllers, but I bet at least a third of the people here are actually using one :) , and we have always used them without problems.
The problem is very "simple":
For example, I start migrating a virtual machine from another host, which causes a lot of write I/O while allocating the thin LVM volume for it (this is not very smart, by the way, because you end up writing everything twice; maybe you assume the storage may be non-zeroed, but a thinly provisioned LVM volume returns zeros for unallocated blocks anyway).
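If you want to double-check that zeroing behaviour on your own pool, lvs can report it (pve/data below is just the Proxmox default pool name, adjust as needed, and verify the field name on your LVM version):

Code:
# check whether the thin pool zeroes newly provisioned blocks ("Zero" column)
lvs -o lv_name,zero pve/data
# zeroing can be toggled per pool with lvchange
lvchange --zero y pve/data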
While that intensive write I/O is taking place (in our case around 500 MB/s), you run hdparm -tT on a logical drive of the same controller, even the same one the writes are going to, and the result is a kernel deadlock.
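Roughly, the reproduction looks like this (device paths are examples from our setup; any sustained buffered write seems to do):

Code:
# terminal 1: sustained sequential writes to a thin LVM volume on the P830i
# (the live migration with local LVM-thin storage produces the same write pattern)
dd if=/dev/zero of=/dev/pve/vm-100-disk-0 bs=1M count=100000 status=progress

# terminal 2: while the writes are running, benchmark any logical drive on the same controller
hdparm -tT /dev/sda   # never prints anything, Ctrl+C has no effect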

hdparm never starts, Ctrl+C doesn't work, after a while the kernel starts spamming hung-task timeouts, and within a few minutes the machine becomes completely unusable, requiring a reboot via SysRq.
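(For reference, this is the SysRq reboot we end up doing; an emergency sync would just hang in this state, so we go straight to reboot:)

Code:
# from a console that still accepts input, or Alt+SysRq+b on the iLO/physical keyboard
echo 1 > /proc/sys/kernel/sysrq   # make sure SysRq is fully enabled
echo b > /proc/sysrq-trigger      # immediate reboot: no sync, no clean unmount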

This happens with the 6.5.13 kernel shipped with Proxmox 8.1, and also with 6.7 mainline.
It is 99% a kernel bug, but I want to find out whether anyone here has seen similar cases, even without an HP Smart Array; I suspect the 4-socket layout and NUMA are the trigger.
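If anyone wants to compare topologies, this is how I dump ours (numactl and numastat come from the numactl package):

Code:
# sockets/NUMA nodes and how memory is split across them
numactl --hardware
lscpu | grep -i numa
# per-node memory usage, to spot one node soaking up all the writeback pages
numastat -m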

Edit: we also tested a similar scenario with FreeBSD 14 and it did not freeze at all; Clonezilla between logical drives didn't freeze either, so it may be related to LVM too.
 
Update:
The problem also reproduces with LVM thin with the whole Proxmox install on a USB 2.0-to-SATA adapter, so this looks like serious I/O starvation when doing a migration with local storage on LVM thin.
The server also showed around 90 GB of buffer memory usage, and the launched hdparm is unkillable.
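While it's wedged, the state can still be inspected from another SSH session that responds (a sketch; the exact fields depend on your procps version):

Code:
# dirty/writeback backlog; this is where the ~90 GB shows up
grep -E '^(Buffers|Dirty|Writeback):' /proc/meminfo

# tasks stuck in uninterruptible sleep (D state) and the kernel function they wait in
ps -eo pid,stat,wchan:30,cmd | awk '$2 ~ /^D/'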

System: 72x Intel(R) Xeon(R) CPU E7-8895 v3 @ 2.60GHz
RAM: 512 GB


Dmesg:

Code:
[ 1940.801745] INFO: task hdparm:37833 blocked for more than 1087 seconds.
[ 1940.802900]       Tainted: P           O       6.5.11-8-pve #1
[ 1940.803910] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1940.804917] task:hdparm          state:D stack:0     pid:37833 ppid:37161  flags:0x00004006
[ 1940.804929] Call Trace:
[ 1940.804935]  <TASK>
[ 1940.804944]  __schedule+0x3fd/0x1450
[ 1940.804967]  ? blk_mq_run_hw_queue+0x154/0x210
[ 1940.804987]  schedule+0x63/0x110
[ 1940.804993]  io_schedule+0x46/0x80
[ 1940.804999]  folio_wait_bit_common+0x136/0x330
[ 1940.805016]  ? __pfx_wake_page_function+0x10/0x10
[ 1940.805034]  folio_wait_bit+0x18/0x30
[ 1940.805039]  folio_wait_writeback+0x2c/0xa0
[ 1940.805051]  __filemap_fdatawait_range+0x90/0x100
[ 1940.805059]  filemap_fdatawait_keep_errors+0x1e/0x50
[ 1940.805065]  sync_bdevs+0xaf/0x160
[ 1940.805086]  ksys_sync+0x73/0xb0
[ 1940.805107]  __do_sys_sync+0xe/0x20
[ 1940.805114]  do_syscall_64+0x5b/0x90
[ 1940.805129]  ? find_vma_intersection+0x31/0x60
[ 1940.805144]  ? __mm_populate+0xe4/0x190
[ 1940.805165]  ? exit_to_user_mode_prepare+0x39/0x190
[ 1940.805180]  ? syscall_exit_to_user_mode+0x37/0x60
[ 1940.805195]  ? do_syscall_64+0x67/0x90
[ 1940.805201]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8

The trace shows hdparm blocked in the sync() it issues before benchmarking, waiting on writeback of dirty pages that never completes. It could be related to the mq-deadline scheduler too; we are now proceeding to assess stability using ZFS instead of LVM thin.
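To rule the scheduler out, it can be switched at runtime per block device (sda is an example; the change does not persist across reboots):

Code:
# the active scheduler is shown in brackets
cat /sys/block/sda/queue/scheduler
# switch to 'none' to take mq-deadline out of the picture
echo none > /sys/block/sda/queue/scheduler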
 
