IO delay 18-27%

bravo0916 · 2025-08-26T07:18:03+0200

Hello Team,

The system IO delay is around 18-27% and CPU usage is around 4-10%. There are 5 VMs on the host, but all of them are offline now, yet I still see IO delay and CPU usage.
Also, I don't see any specific process responsible in the top output below. None of the SSDs are configured as RAID; they are all single disks.
However, many "cp" processes are running, even though I am not copying anything. I tried to kill all the "cp" processes, but a couple of them still remain.

Tasks: 510 total, 1 running, 509 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 0.9 sy, 0.0 ni, 81.5 id, 15.8 wa, 0.0 hi, 1.7 si, 0.0 st
MiB Mem : 128472.0 total, 3245.4 free, 2196.7 used, 124278.9 buff/cache
MiB Swap: 8192.0 total, 7304.3 free, 887.7 used. 126275.3 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
237067 root 20 0 0 0 0 S 2.0 0.0 7:49.84 kworker/u64:1+flush-cifs-1
141 root 20 0 0 0 0 S 0.7 0.0 1:22.95 kswapd0
1311 root 20 0 183880 50660 7224 S 0.7 0.0 9:33.75 pvestatd
327438 root 20 0 84948 20692 17492 S 0.7 0.0 0:00.02 smbclient
1356 root 20 0 201896 9236 4076 S 0.3 0.0 0:07.52 pve-ha-lrm
1365 root 20 0 198576 5524 3248 S 0.3 0.0 0:05.98 pvescheduler
242254 root 20 0 7700 2960 1936 D 0.3 0.0 1:15.17 cp
242475 root 20 0 7700 3124 2100 D 0.3 0.0 1:03.02 cp
242681 root 20 0 7700 3112 2088 D 0.3 0.0 1:00.00 cp
242885 root 20 0 7700 3120 2096 D 0.3 0.0 0:56.00 cp
243291 root 20 0 7700 3064 2040 D 0.3 0.0 0:53.86 cp
243494 root 20 0 7700 3064 2040 D 0.3 0.0 1:00.34 cp
244108 root 20 0 7700 3000 1976 D 0.3 0.0 0:49.65 cp
244315 root 20 0 7700 3248 2096 D 0.3 0.0 0:51.08 cp
244720 root 20 0 7700 3116 2092 D 0.3 0.0 0:47.63 cp
244923 root 20 0 7700 2960 1936 D 0.3 0.0 0:50.77 cp
245123 root 20 0 7700 3100 2076 D 0.3 0.0 0:45.60 cp
245528 root 20 0 7700 3024 2000 D 0.3 0.0 0:43.91 cp
245923 root 20 0 7700 2996 1972 D 0.3 0.0 0:46.32 cp
246550 root 20 0 7700 3016 1992 D 0.3 0.0 0:44.39 cp
246958 root 20 0 7700 2988 1964 D 0.3 0.0 0:43.67 cp
247162 root 20 0 7700 3040 2016 D 0.3 0.0 0:41.82 cp
247353 root 20 0 7700 3016 1992 D 0.3 0.0 0:41.65 cp
247551 root 20 0 7700 3132 2108 D 0.3 0.0 0:41.26 cp
247772 root 20 0 7700 3124 2100 D 0.3 0.0 0:43.73 cp
248157 root 20 0 7700 3140 2116 D 0.3 0.0 0:40.30 cp
249215 root 20 0 7700 3124 2100 D 0.3 0.0 1:13.87 cp
250700 root 20 0 7700 3108 2084 D 0.3 0.0 0:38.17 cp

ness1602 · 2025-08-26T07:28:55+0200

IO delay usually correlates with slow disks. You didn't give the full details of your platform (SSD type, etc.).

bravo0916 · 2025-08-26T07:47:50+0200

Hi ness1602. Thanks for your reply. I finally managed to kill all the "cp" processes, and now IO delay and CPU usage look good. But I am wondering why so many "cp" processes run on Proxmox even though I did not start any copy tasks. The PVE version is 9.03.

cwt · 2025-08-26T08:53:33+0200

I don't think the "cp" processes relate to your local drives. Do you have CIFS shares mounted with sync, or even VMs running on them?

UdoB · 2025-08-26T08:54:41+0200

bravo0916 said:
But I am wondering why so many "cp" processes run

ps x would have shown you the complete command line...
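
As a rough sketch of that check: the pipeline below lists every process with its full command line and keeps only those in uninterruptible sleep (state D), which is the state the cp processes show in the top output above. The helper name filter_dstate is made up for illustration; pid, ppid, stat, etime and args are standard procps format specifiers.

```shell
# Keep the header line plus every process whose STAT column starts with D
# (uninterruptible sleep, usually waiting on disk or network I/O).
# filter_dstate is a hypothetical helper name, not part of any tool.
filter_dstate() {
    awk 'NR == 1 || $3 ~ /^D/'
}

# args prints the complete command line, so the source and target of each
# stuck cp become visible.
ps -eo pid,ppid,stat,etime,args | filter_dstate
```

The PPID column points at the parent, which helps to trace the processes back to whatever started them (a cron job, for instance).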

bravo0916 · 2025-08-26T09:20:45+0200

Hi cwt and UdoB,

Thanks for your comments. I realized that I had added a backup script to the crontab. The backup files are supposed to be copied to vmdata, which is a CIFS mount. The backup failed because the backup folder was renamed and I forgot to update the name in the script. The script and cron job worked before. I did get an error mail from the Proxmox system. But the cp process should exit after the backup job fails, right? I didn't have this issue on the previous PVE version (8.4.3): even when the backup job failed (the copy failed), I didn't see so many cp processes still running.

UdoB · 2025-08-26T09:25:44+0200

bravo0916 said:
But the cp process should exit after the backup job fails, right?

Unfortunately that depends on the exact error conditions. Some processes hang forever. And then there are unkillable zombies, for example: https://en.wikipedia.org/wiki/Zombie_process

cwt · 2025-08-26T10:37:29+0200

If you didn't define a timeout in your script, the cp process can get stuck in D state (uninterruptible sleep, unkillable). For backups I would prefer rsync or rclone, which have timeout mechanisms.
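
A minimal sketch of what such a guard could look like in a backup script, assuming GNU coreutils timeout is available. The function name copy_backup and the one-hour limit are illustrative assumptions, not taken from the original script. It also checks that the target folder exists before copying, since a renamed folder was the trigger here.

```shell
# copy_backup SRC DEST: copy the contents of SRC into DEST with a time limit.
# Hypothetical helper; adjust the limits to the size of your backups.
copy_backup() {
    src="$1"
    dest="$2"
    # Fail early if the target folder is missing (e.g. renamed) instead of
    # starting a copy that cannot succeed.
    if [ ! -d "$dest" ]; then
        echo "destination $dest does not exist" >&2
        return 1
    fi
    # Send SIGTERM after 3600 seconds, then SIGKILL 30 seconds later if cp
    # is still alive.
    timeout -k 30 3600 cp -r "$src"/. "$dest"/
}
```

From the cron script this would be called as copy_backup followed by the source directory and the CIFS-mounted target directory. Note that a process blocked in uninterruptible sleep only reacts to the signal once the pending I/O call returns, so the timeout limits hangs rather than ruling them out. With rsync, the --timeout=SECONDS option aborts the transfer when no data moves for that long.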

bravo0916 · 2025-08-26T11:28:05+0200

Wow, I understand now! I will take another look at my script and add a timeout. I will also consider rsync or rclone, as you recommended.

Thank you very much!!
