IO delay 18-27%

bravo0916

Member
Jan 29, 2024
Hello Team,

The system's IO delay is around 18-27% and CPU usage is around 4-10%. There are 5 VMs on this host, but all of them are offline now, yet I still see IO delay and CPU usage.
I also don't see any specific process standing out in the top output below. None of the SSDs are configured as RAID; each one is a single disk.
However, many "cp" processes are running, although I am not copying anything. I tried to kill all of the "cp" processes, but a couple of them still remain.

Tasks: 510 total, 1 running, 509 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 0.9 sy, 0.0 ni, 81.5 id, 15.8 wa, 0.0 hi, 1.7 si, 0.0 st
MiB Mem : 128472.0 total, 3245.4 free, 2196.7 used, 124278.9 buff/cache
MiB Swap: 8192.0 total, 7304.3 free, 887.7 used. 126275.3 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
237067 root 20 0 0 0 0 S 2.0 0.0 7:49.84 kworker/u64:1+flush-cifs-1
141 root 20 0 0 0 0 S 0.7 0.0 1:22.95 kswapd0
1311 root 20 0 183880 50660 7224 S 0.7 0.0 9:33.75 pvestatd
327438 root 20 0 84948 20692 17492 S 0.7 0.0 0:00.02 smbclient
1356 root 20 0 201896 9236 4076 S 0.3 0.0 0:07.52 pve-ha-lrm
1365 root 20 0 198576 5524 3248 S 0.3 0.0 0:05.98 pvescheduler
242254 root 20 0 7700 2960 1936 D 0.3 0.0 1:15.17 cp
242475 root 20 0 7700 3124 2100 D 0.3 0.0 1:03.02 cp
242681 root 20 0 7700 3112 2088 D 0.3 0.0 1:00.00 cp
242885 root 20 0 7700 3120 2096 D 0.3 0.0 0:56.00 cp
243291 root 20 0 7700 3064 2040 D 0.3 0.0 0:53.86 cp
243494 root 20 0 7700 3064 2040 D 0.3 0.0 1:00.34 cp
244108 root 20 0 7700 3000 1976 D 0.3 0.0 0:49.65 cp
244315 root 20 0 7700 3248 2096 D 0.3 0.0 0:51.08 cp
244720 root 20 0 7700 3116 2092 D 0.3 0.0 0:47.63 cp
244923 root 20 0 7700 2960 1936 D 0.3 0.0 0:50.77 cp
245123 root 20 0 7700 3100 2076 D 0.3 0.0 0:45.60 cp
245528 root 20 0 7700 3024 2000 D 0.3 0.0 0:43.91 cp
245923 root 20 0 7700 2996 1972 D 0.3 0.0 0:46.32 cp
246550 root 20 0 7700 3016 1992 D 0.3 0.0 0:44.39 cp
246958 root 20 0 7700 2988 1964 D 0.3 0.0 0:43.67 cp
247162 root 20 0 7700 3040 2016 D 0.3 0.0 0:41.82 cp
247353 root 20 0 7700 3016 1992 D 0.3 0.0 0:41.65 cp
247551 root 20 0 7700 3132 2108 D 0.3 0.0 0:41.26 cp
247772 root 20 0 7700 3124 2100 D 0.3 0.0 0:43.73 cp
248157 root 20 0 7700 3140 2116 D 0.3 0.0 0:40.30 cp
249215 root 20 0 7700 3124 2100 D 0.3 0.0 1:13.87 cp
250700 root 20 0 7700 3108 2084 D 0.3 0.0 0:38.17 cp
 
Hi ness1602. Thanks for your reply. I finally managed to kill all of the "cp" processes, and now IO delay and CPU usage look good. But I am wondering why so many "cp" processes were running on Proxmox even though I did not start any copy tasks. PVE version is 9.03.

 
I don't think the "cp" processes are related to your local drives. Do you have CIFS shares mounted with sync, or even VMs running on them?
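
Both points can be checked from the shell. Below is a minimal sketch, assuming a standard Proxmox/Debian host where `findmnt`, `ps` and `awk` are available; the helper name `filter_dstate` is made up for illustration and is not part of any tool.

```shell
# Keep only lines whose third column (the process state) starts with "D",
# i.e. uninterruptible sleep. filter_dstate is a hypothetical helper name.
filter_dstate() {
    awk '$3 ~ /^D/'
}

# Example usage on the host (not executed here):
#   findmnt -t cifs                              # mounted CIFS shares and their options
#   ps -eo pid,ppid,stat,args | filter_dstate    # D-state processes with their parent PID
```

The PPID column shows which parent process started each cp, which helps to tell whether they come from a scheduled script rather than from a manual copy.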
 
Hi cwt and UdoB,

Thanks for your comments. I realized that I had added a backup script to the crontab. The backup files are supposed to be copied to vmdata, which is a CIFS mount. The backup has been failing because the backup folder name was changed and I forgot to update the name in the script. The script and the cron job worked before. I did get an error mail from the Proxmox system. But shouldn't the cp process exit after the backup job fails? I didn't have this issue on the previous PVE version (8.4.3): even when the backup job failed (the copy failed), I didn't see so many cp processes still running.
 
If you didn't define a timeout in your script, the cp process can get stuck in D state (uninterruptible sleep), where it cannot be killed. For backups I would prefer rsync or rclone, which have timeout mechanisms (for example rsync's --timeout option).
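
As an illustration of the timeout idea, here is a minimal sketch of a copy helper for a cron script. It assumes GNU coreutils `timeout` is installed (it is on a default Debian/Proxmox system); the function name `safe_copy`, the default limit and the paths in the usage note are hypothetical and not taken from the original script.

```shell
# Hypothetical helper: copy one file with a pre-check and a time limit.
# Usage: safe_copy SRC DEST_DIR [LIMIT_SECONDS]
safe_copy() {
    src=$1
    dest_dir=$2
    limit=${3:-3600}   # assumed default of one hour; adjust to the backup size

    # Fail fast when the target folder is missing (e.g. it was renamed).
    if [ ! -d "$dest_dir" ]; then
        echo "safe_copy: destination $dest_dir not found, skipping" >&2
        return 1
    fi

    # Send SIGTERM after $limit seconds, then SIGKILL 30 seconds later.
    # Caveat: a process in D state may still not exit until the blocked
    # I/O on the share returns, so this limits hangs but cannot rule them out.
    timeout -k 30 "$limit" cp -- "$src" "$dest_dir"/
}
```

In a script this could be called as `safe_copy /path/to/backup.file /mnt/vmdata/backups 1800 || exit 1` (paths are placeholders). GNU `timeout` exits with status 124 when the limit is reached, so the script can report that case separately from an ordinary copy error.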
 
Wow, I understand now! I will take another look at my script and add a timeout. I will also consider rsync or rclone, as you recommended.

Thank you very much!!