Hello, we just bought a DL580 Gen9 with 4x 18-core Xeon CPUs (the server is turned off right now; I can power it on later to get the exact model).
The RAID controller is a P830i. I know you strongly advise against using hardware RAID controllers, but I bet at least one third of people here actually use one, and we have always used them without problems.
The problem is very "simple"
For example, I start migrating a virtual machine from another host, which causes a lot of write I/O to allocate the thin LVM volume for it (this is not very smart, by the way, because you end up writing twice; maybe you assume that the storage may not be zeroed, but in the case of a thinly provisioned LVM volume it is).
While that intensive write I/O is taking place (around 500 MB/s in our case), run hdparm -tT on a logical drive of the same controller, even the same one being written to, and the result is a kernel deadlock.
hdparm never starts and Ctrl+C doesn't work. After some time the kernel starts spamming hung task timeout messages, and within a few minutes the machine becomes completely unusable and has to be rebooted via SysRq.
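For anyone trying to reproduce this, here is a minimal sketch for listing which processes are stuck in uninterruptible sleep (state D) before the box dies completely. That is the state the hung task messages refer to, and it would explain why Ctrl+C has no effect. It only reads /proc/<pid>/stat; the function names parse_stat_line and blocked_tasks are made up for illustration, and it looks at processes only, not at individual threads.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the text of one /proc/<pid>/stat file."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = line.index("(")
    right = line.rindex(")")
    pid = int(line[:left].strip())
    comm = line[left + 1:right]
    state = line[right + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc_root="/proc"):
    """List (pid, comm) for processes currently in state D."""
    found = []
    if not os.path.isdir(proc_root):
        return found  # not a Linux-style /proc, nothing to report
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile or the line was unreadable
        if state == "D":
            found.append((pid, comm))
    return found
```

Calling blocked_tasks() from a second shell while the migration is running and hdparm has just been started should show hdparm (and probably some kernel threads) if they are really blocked on I/O, assuming the machine is still responsive enough to run Python at that point.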
This happens with kernel 6.5.13 as shipped with Proxmox 8.1, and also with mainline 6.7.
I'm 99% sure it is a kernel bug, but I'd like to find out whether anyone here has seen similar cases, even without an HP Smart Array controller. I suspect the 4-socket configuration and NUMA are the trigger.
Edit: we also tested a similar scenario on FreeBSD 14 and it did not freeze at all. Cloning between logical drives with Clonezilla didn't freeze either, so it may be related to LVM too.