I've had a slower NVMe SSD in this machine, on which the most IO-demanding containers would choke during backups. I had to set ionice to 8 and bwlimit to 51200 to get those containers through backups without errors.
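For reference, those limits can be set per backup job or globally; the global form in /etc/vzdump.conf looks roughly like this (a sketch of just the relevant options, not my full config):

# /etc/vzdump.conf (excerpt)
# bwlimit is in KB/s; ionice 8 should put the backup at the lowest IO priority (idle class with CFQ)
bwlimit: 51200
ionice: 8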
fstrim (via the script below) ran weekly on this slower SSD, and it was never a problem for these 3 IO-demanding containers:
/usr/sbin/pct list | awk '/^[0-9]/ {print $1}' | while read ct; do /usr/sbin/pct fstrim ${ct} && NOW=$(date +"%Y-%m-%d-%R") && echo -e "$NOW\tTrimming ${ct}"; done
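The cronjob itself just calls that one-liner from a root crontab entry; schematically something like this (script path, schedule and log file are only placeholders, not necessarily what I use):

# root crontab entry: run the trim script weekly, e.g. Sundays at 03:00
0 3 * * 0 /usr/local/sbin/pct-fstrim-all.sh >> /var/log/pct-fstrim.log 2>&1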
In order to run the backups faster, I added another disk, a much faster one (one of the fastest). I moved the 3 IO-demanding containers to it. I've been able to increase bwlimit to 179200 (and I'm still raising it daily; I might not need a limit at all), and there have been no problems during backups.
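(The move itself was just relocating each container's root disk to the new LVM-thin storage; schematically something like the line below, where the CT ID and storage name are placeholders — if I remember right the PVE 6 subcommand is move_volume:)

# example only: move a container's rootfs to the new thin pool
pct move_volume 101 rootfs nvme1-thin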
But last night, the first time the weekly fstrim cronjob ran against the new disk, the 3 containers failed to perform: they didn't do their job for a good part of the 20 or so minutes it takes fstrim to run over every container.
There were no configuration changes on the machine or the containers. I am running kernel 5.3.13-2, and everything is current except for a pending reboot to switch to kernel 5.3.13-3.
I am wondering if this has something to do with how I set up LVM-thin on the new SSD. As far as I remember, part of the setup for the old SSD was done via the CLI, while on the new SSD I used the GUI for everything, so the two SSDs have some differences (a quick way to compare the two thin pools is sketched further below). nvme0n1 is the older/slower disk:
Here's CPU & IO wait during fstrim and during backups (backups, as always, go to spinning disks in RAID 10):
lsblk
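To check whether the LVM-thin setup actually differs between the two disks, this is how I'd compare the pools side by side (a sketch; I haven't confirmed which attribute, if any, matters here):

# compare chunk size, discard passdown and zeroing of both thin pools
lvs -a -o lv_name,vg_name,lv_size,chunk_size,discards,zero,data_percent,metadata_percent
# and check whether LVM is set to issue discards when LVs are removed
grep issue_discards /etc/lvm/lvm.conf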
Why would a much faster disk fail to keep up during fstrim when the slower disk had no problems? Do I need to change something related to the setup of the 2nd disk?