Hi,
While running PVE 6.4 (kernel 5.4.140-1-pve) on a Threadripper system based on the ASRock Rack TRX40D8-2N2T motherboard, we recently lost two SATA SSDs to kernel write errors. After testing various things (swapping SSDs and cables), we found that the issue occurs (after a while) only on the two ASMedia ASM1062 SATA ports, not on the other 4 SATA ports of the motherboard, which use different controllers.
Looking for similar issues, we found a description that matches our symptoms, and fortunately someone has developed a kernel patch for it:
https://patchwork.ozlabs.org/project/linux-pci/patch/20210317115924.31885-1-kabel@kernel.org/
A backport of this patch landed in 5.4.148:
https://elixir.bootlin.com/linux/v5.4.148/source/drivers/pci/quirks.c#L3255
References:
https://bugzilla.kernel.org/show_bug.cgi?id=212695
https://lwn.net/Articles/870006/
The announcement says "I'm announcing the release of the 5.4.148 kernel." and its shortlog lists:
Marek Behún (1): PCI: Restrict ASMedia ASM1062 SATA Max Payload Size Supported
The ASMedia ASM1062 is often mentioned on this forum in connection with various issues; here is an example similar to ours:
https://forum.proxmox.com/threads/sata-devices-missing-after-update.76204/
I don't know whether an updated kernel is planned for the PVE 6.4 no-subscription repo, but I'm ready to test it.
On our production system, which uses the pve-enterprise repo, the kernel is pve-kernel-5.4.128-1-pve.
On another of our test systems we run PVE 7.0 from the no-subscription repo with kernel 5.11.22, which doesn't have the ASMedia quirk backported in the official kernel source tree (as 5.11 is not a "long term" kernel). I didn't check whether the Proxmox kernel source has the quirk.
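For anyone who wants a quick first check on their own hosts, here is a minimal sketch that compares a kernel release string against 5.4.148, the first 5.4 release with the backport as noted above. The names `FIXED_IN`, `parse_release` and `quirk_status` are my own, not part of any tool. It only looks at the version number, so it is a hint rather than proof: a vendor kernel may carry (or lack) the patch independently of its version string, and for other series such as 5.11 it simply reports "unknown".

```python
import platform
import re

# First 5.4.y stable release carrying the ASM1062 fix, per the post above.
# Assumption: the version number alone is used as a hint; vendor kernels may differ.
FIXED_IN = (5, 4, 148)


def parse_release(release):
    """Return (major, minor, patch) from a string such as '5.4.140-1-pve'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def quirk_status(release):
    """Return 'expected', 'missing' or 'unknown' for the given kernel release."""
    version = parse_release(release)
    # Only the 5.4 series is covered by the information in the post.
    if version[:2] != FIXED_IN[:2]:
        return "unknown"
    return "expected" if version >= FIXED_IN else "missing"


if __name__ == "__main__":
    release = platform.release()  # same string that `uname -r` prints
    print(release, quirk_status(release))
```

Run on the host itself, it prints the running kernel release followed by the status. Both 5.4.128-1-pve and 5.4.140-1-pve come out as "missing" by this check.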