Hi,
Has anything changed in the latest Proxmox build / kernel that would have caused high disk latency with AMD SATA controllers?
We have a 4-node Ceph cluster with 2 x Samsung 870 EVO 4TB SSDs in each node. It had been running great, with fast reads/writes, for over 2 months. However, we were made aware of a problem 2 weeks ago where the VMs slowed to a crawl.
When we looked at the OSDs, they were all showing Apply/Commit latency of around 400/400 ms, but not all at the same time. If it were a dodgy disk, I would expect only one OSD to show high latency.
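In case it helps anyone compare numbers, here is a minimal sketch of how the per-OSD latencies could be sampled and filtered. It assumes the JSON from `ceph osd perf -f json` carries an `osd_perf_infos` list (either at the top level or under `osdstats`, depending on the Ceph release) with `commit_latency_ms` and `apply_latency_ms` fields; please check that against your own output. The function names and the 100 ms threshold are just my own illustrative choices.

```python
import json
import subprocess


def slow_osds(perf, threshold_ms=100):
    """Return (osd_id, commit_ms, apply_ms) for OSDs at or above threshold_ms."""
    # Assumption: entries live in "osd_perf_infos", either at the top level
    # or nested under "osdstats" (layout differs between Ceph releases).
    infos = perf.get("osdstats", perf).get("osd_perf_infos", [])
    slow = []
    for info in infos:
        stats = info["perf_stats"]
        commit = stats["commit_latency_ms"]
        apply_ms = stats["apply_latency_ms"]
        if max(commit, apply_ms) >= threshold_ms:
            slow.append((info["id"], commit, apply_ms))
    return slow


def sample_once(threshold_ms=100):
    """Run `ceph osd perf` once on a cluster node and return the slow OSDs."""
    out = subprocess.run(
        ["ceph", "osd", "perf", "-f", "json"],
        capture_output=True, text=True, check=True,
    ).stdout
    return slow_osds(json.loads(out), threshold_ms)
```

Calling `sample_once()` in a loop while creating a VM disk should show whether the spike hits every OSD or moves between them.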
I'm fully aware of the issues with the SLC cache on these drives, and my initial thought was that we were seeing this problem.
However, I have managed to move all VMs from the SSD pool onto NVMe and local disks so I could troubleshoot it further.
So far I have wiped each OSD, deleted the pool, and recreated everything.
As soon as I do anything, such as creating a VM disk, the latency spikes and then drops once the operation has finished.
Another test I did was to remove an OSD, format it as LVM, and create a VM on it. This works fine, so I don't know what is going on.
All I know is that it was running fine; we recently applied some updates to Proxmox 7.4, and now we are seeing these problems.
Can someone advise whether this may be a kernel issue?
Thanks,
Ian