IO delay very high for the past few days

demon_mono

Member
Nov 24, 2020
24
1
8
37
Hi all,

I have had a Hetzner server running Proxmox for 3-4 years.
Everything has been working very well, with roughly 40 LXC containers running.

pveversion output:

Code:
root@px ~ # pveversion -v
proxmox-ve: 8.2.0 (running kernel: 6.8.12-1-pve)
pve-manager: 8.2.4 (running version: 8.2.4/faa83925c9641325)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.12-1
proxmox-kernel-6.8.12-1-pve-signed: 6.8.12-1
proxmox-kernel-6.8.8-4-pve-signed: 6.8.8-4
proxmox-kernel-6.8.8-2-pve-signed: 6.8.8-2
proxmox-kernel-6.5.13-6-pve-signed: 6.5.13-6
proxmox-kernel-6.5: 6.5.13-6
amd64-microcode: 3.20240820.1~deb12u1
ceph-fuse: 16.2.11+ds-2
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown: residual config
ifupdown2: 3.2.0-1+pmx9
intel-microcode: 3.20240813.1~deb12u1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.7
libpve-cluster-perl: 8.0.7
libpve-common-perl: 8.2.2
libpve-guest-common-perl: 5.1.4
libpve-http-server-perl: 5.1.0
libpve-rs-perl: 0.8.9
libpve-storage-perl: 8.2.3
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.2.7-1
proxmox-backup-file-restore: 3.2.7-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.2.3
pve-cluster: 8.0.7
pve-container: 5.1.12
pve-docs: 8.2.3
pve-edk2-firmware: 4.2023.08-4
pve-firewall: 5.0.7
pve-firmware: 3.13-1
pve-ha-manager: 4.0.5
pve-i18n: 3.2.2
pve-qemu-kvm: 9.0.2-2
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.4
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0



But for the past few days, I have been facing an IO delay issue.

I don't know what happened.
[Screenshot: IO delay graph]
I asked for a reboot with a hard drive check. Everything seems OK.

But the SMART values do not look OK:
[Screenshot: SMART attribute table]
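For reference, the same attributes can also be read on the CLI with smartmontools (already installed according to the pveversion output above); the device name below is only an assumption and needs to be adapted:

Code:
# Overall health self-assessment (adapt /dev/sda to the actual device)
smartctl -H /dev/sda
# Full attribute table: Raw_Read_Error_Rate, Reallocated_Sector_Ct, Power_On_Hours, ...
smartctl -a /dev/sda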

What do you think?
 
Hi,
maybe you have a process producing a lot of disk IO? Did you check the output of e.g. iotop?
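For reference, a minimal way to do that check (iotop is not installed by default; the flags are the standard Debian ones):

Code:
apt install iotop
# -o: only show processes actually doing IO
# -P: aggregate per process instead of per thread
# -a: show accumulated IO since iotop was started
iotop -oPa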
 
The current check of the SMART values still seems "OK". The drive hasn't been powered off often and has been running for about 5 years. Raw Read Error Rate, yes... that's nothing unusual.

Tell me... how many hard disks are actually installed? What RAID level?
 
Oh OK, thanks for your answer.
My configuration is:

Proxmox server installed on 2 SSDs in RAID 1:
2x SATA SSD 480 GB Datacenter

And 1 SATA HDD for the backups and some LXC containers:
6 TB SATA Enterprise Hard Drive
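For reference, the layout can be confirmed from the host; this assumes the usual Hetzner setup with mdadm software RAID for the two SSDs, which is not confirmed here:

Code:
# Block devices, sizes and mount points
lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT
# Software RAID status (only meaningful if mdadm is used)
cat /proc/mdstat
# LVM layout: physical volumes, volume groups, logical volumes
pvs; vgs; lvs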


 
I'm back :)

In fact it seems that, for some specific LXCs, the dump process (actually the tar command inside the dump process) is the issue.
Here is the log of one of the problematic LXC dumps:

Code:
2024-09-05 02:56:41 INFO: Starting Backup of VM 402 (lxc)
2024-09-05 02:56:41 INFO: status = running
2024-09-05 02:56:41 INFO: backup mode: stop
2024-09-05 02:56:41 INFO: ionice priority: 7
...
2024-09-05 02:56:45 INFO: creating vzdump archive '/VG_6TO_LV_BACKUP/dump/vzdump-lxc-402-2024_09_05-02_56_41.tar.zst'
2024-09-05 04:12:29 INFO: Total bytes written: 27483607040 (26GiB, 5.8MiB/s)
2024-09-05 04:12:29 INFO: archive file size: 20.60GB
...
2024-09-05 04:12:36 INFO: Finished Backup of VM 402 (01:15:55)

I'm not sure why creating this archive takes so long.

Container 401, backed up just before it, is much faster:
Code:
2024-09-05 02:55:27 INFO: creating vzdump archive '/VG_6TO_LV_BACKUP/dump/vzdump-lxc-401-2024_09_05-02_55_25.tar.zst'
2024-09-05 02:56:35 INFO: Total bytes written: 13511464960 (13GiB, 190MiB/s)
2024-09-05 02:56:36 INFO: archive file size: 9.93GB
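For what it's worth, the numbers are consistent with each other: 27483607040 bytes written between 02:56:45 and 04:12:29 (about 4540 s) is roughly 5.8 MiB/s, so the whole backup window is spent streaming the archive. A rough way to check whether the backup target itself can sustain more than that (the path is taken from the log above; the test file should be deleted afterwards):

Code:
# Sequential write test on the backup volume, bypassing the page cache
dd if=/dev/zero of=/VG_6TO_LV_BACKUP/dump/ddtest.bin bs=1M count=4096 oflag=direct status=progress
rm /VG_6TO_LV_BACKUP/dump/ddtest.bin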
 
It seems that we have a lot of activity caused by kworker threads:
[Screenshot: process list showing kworker threads related to loop devices]
Well, the kworker are kernel worker tasks (in your case related to loop devices), so these are most likely not the root cause of your issue.
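For reference, you can see which backing files those loop devices map to; container volumes stored as raw image files on a directory storage are attached this way:

Code:
# List active loop devices and the files backing them
losetup -a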

Is this reproducible when backing up this particular Container? Please share the storage config cat /etc/pve/storage.cfg as well as the container config pct config 402. Is this the container running the postgres database?


Are you sure that the single disk IOPS is not your bottleneck?
And 1 SATA HDD for the saves and some LXC
6 TB SATA Enterprise Hard Drive
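A simple way to check that while a backup is running is iostat from the sysstat package (not installed by default); a device sitting near 100 %util with high await values is saturated:

Code:
apt install sysstat
# Extended per-device statistics every 5 seconds;
# watch %util and r_await/w_await of the 6 TB HDD while the backup runs
iostat -x 5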
 
Are you sure that the single disk IOPS is not your bottleneck?
Yes, I think it is. It's the first time I'm facing this kind of issue... I need to see how to add a new disk to my server.
And yes, 402 has a postgres database... Why?
 
Hello,
I'm back.
I added a new 1 TB disk and copied the LXC images to it.
Now the server and services are OK... but all backups from the old 6 TB VG are still very slow...
For now, I can't explain what happened to my PV/VG/LV.
If someone has any idea...
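For reference, some generic checks on the old 6 TB disk that could narrow this down (the device name is only a placeholder and needs to be adapted):

Code:
# LVM layout and which physical devices each LV sits on
pvs -o +pv_used
lvs -a -o +devices
# Kernel messages hinting at a failing disk or SATA link resets
dmesg -T | grep -iE 'ata|i/o error|reset'
# Long SMART self-test on the HDD (runs in the background, results via smartctl -a later)
smartctl -t long /dev/sdX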
 
