Recent content by YAGA

  1. Y

    [SOLVED] Are SSD NVMe issues resurfacing with the latest PVE kernel upgrade?

    Hi Team, after conducting intensive tests with several 4TB 990 Pro SSDs with heatsinks, the issue now seems to be resolved. Here is my working configuration: Kernel Linux 6.14.5-1-bpo12-pve GRUB_CMDLINE_LINUX_DEFAULT="quiet nvme_core.default_ps_max_latency_us=0 pcie_aspm=off...
  2. Y

    [SOLVED] Are SSD NVMe issues resurfacing with the latest PVE kernel upgrade?

    Hi @SoVeryRural, could you please give us more details on your setup: CPU, motherboard chipset, AGESA version, PCIe speed, M.2 slot or PCIe adapter, and whether C-states are disabled in the BIOS? Regards,
  3. Y

    [SOLVED] Are SSD NVMe issues resurfacing with the latest PVE kernel upgrade?

    My servers also have AMD CPUs, Ryzen with the X470 chipset in my case. Same here: everything worked fine for months, but the issues reappeared in February. Same here: it might be triggered by activity, usually during backups.
  4. Y

    [SOLVED] Are SSD NVMe issues resurfacing with the latest PVE kernel upgrade?

    @SoVeryRural Interesting. I'm not convinced that this failure is only SSD-related, because I've noticed that it occurs at the same time on different nodes, and usually 2 or 3 SSDs (OSDs in Ceph) are affected. It might be Ceph, the communication layer between nodes, hardware issues, kernel...
  5. Y

    [SOLVED] Are SSD NVMe issues resurfacing with the latest PVE kernel upgrade?

    Unfortunately, I still see other crashes on the OSD even with the Linux 6.14.0-2-pve kernel. :confused:
  6. Y

    [SOLVED] Are SSD NVMe issues resurfacing with the latest PVE kernel upgrade?

    Linux 6.14.0-2-pve fixed this issue. So far so good. But a new one appeared with Ceph version 19.2.1: https://forum.proxmox.com/threads/ceph-19-2-1-2-osd-s-experiencing-slow-operations-in-bluestore.164856/
  7. Y

    Ceph 19.2.1 2 OSD(s) experiencing slow operations in BlueStore

    Ceph 19.2.2 might fix this issue; see the changelog: https://docs.ceph.com/en/latest/releases/squid/#v19-2-2-squid Regards,
  8. Y

    Ceph 19.2.1 2 OSD(s) experiencing slow operations in BlueStore

    Hi Francis, could you restart the OSD affected by the slow-operations warning, using either the GUI or the CLI? Then please check again and keep us informed. Regards,
  9. Y

    [SOLVED] Are SSD NVMe issues resurfacing with the latest PVE kernel upgrade?

    We have many points in common: the hardware worked well in a previous software configuration; Ceph is used for the SSDs (is it a Ceph RBD volume?); several SSDs and several nodes are affected; the failure is random, but the more the system is loaded, the sooner it occurs. I noticed that the...
  10. Y

    [SOLVED] Are SSD NVMe issues resurfacing with the latest PVE kernel upgrade?

    Hello, could you please give us more details about your setup: number of nodes, number of SSDs per node, whether HA and Ceph are in use, and when the failure occurs? Regards,
  11. Y

    [SOLVED] Are SSD NVMe issues resurfacing with the latest PVE kernel upgrade?

    Hi Team, before connecting the SSDs through an M.2 adapter in a PCIe slot, I noticed that the SSDs disconnect only under very specific conditions. My configuration is based on 4 nodes and 1 qdevice with the latest PVE community updates, including Ceph Squid 19.2: 1 SSD on each...
  12. Y

    [SOLVED] Are SSD NVMe issues resurfacing with the latest PVE kernel upgrade?

    They are inserted into M.2 slots on the motherboard. I'll try to test with a PCIe <-> M.2 adapter in a PCIe slot.
  13. Y

    [SOLVED] Are SSD NVMe issues resurfacing with the latest PVE kernel upgrade?

    @marcio79 I still have the issue of NVMe SSDs disappearing after a few days of operation. This happens during intensive use such as backups. I also updated the BIOS to the latest version, including AGESA 1.2.0.C, and I get the same errors. I'm stuck and don't know what else to try. Any advice is...
  14. Y

    [SOLVED] Are SSD NVMe issues resurfacing with the latest PVE kernel upgrade?

    As discussed, I have updated the kernel to version 6.11.11-1-pve on each node. So far, everything is working. I will keep you informed after a few days if this kernel version no longer causes bugs with the NVMe SSDs.
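
For anyone who wants to verify that the kernel parameters quoted in the first post are actually active after a reboot, here is a minimal Python sketch. It only checks the two options visible in that post (`nvme_core.default_ps_max_latency_us=0` and `pcie_aspm=off`); the quoted line is truncated, so any further options are not guessed. The function names `parse_cmdline`, `missing_params` and `check_running_kernel` are hypothetical helpers, not part of any PVE tooling.

```python
import shlex

# The two parameters visible in the quoted GRUB line. The original post is
# truncated after these, so nothing beyond them is assumed here.
EXPECTED = {
    "nvme_core.default_ps_max_latency_us": "0",
    "pcie_aspm": "off",
}


def parse_cmdline(cmdline):
    """Split a kernel command line into {key: value}; bare flags map to None."""
    params = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def missing_params(cmdline, expected=EXPECTED):
    """Return the expected keys that are absent or set to a different value."""
    params = parse_cmdline(cmdline)
    return [key for key, value in expected.items() if params.get(key) != value]


def check_running_kernel(path="/proc/cmdline"):
    """Check the command line of the currently running kernel (Linux only)."""
    with open(path) as handle:
        return missing_params(handle.read())
```

Calling `check_running_kernel()` on a node returns an empty list when both options are in effect; otherwise it lists the ones that are missing or set differently, which usually means GRUB was not regenerated or the node boots through a different loader.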