Fresh install of PVE 5.1, IO delay of death

Cecil

Sep 22, 2017
I completely wiped my system and started from scratch, assuming I had made a mistake.
I thought mixing SSD and SAS drives in LVM had caused my IO issues.

This time I left the SSD drives completely untouched and installed Proxmox on the large SAS RAID 5 array on the hardware controller.

I made a backup of my VM on the NAS and set up a dedicated, separate network between the NAS and the server.

I tried to restore the VM and it starts off really well at 100 MB/s, but after 1 minute or less the IO delay spikes to 8-9% and the transfer rate drops to 10 MB/s or less.

I tried a dd test of the drive just to see what that does (while the VM backup was still restoring super slowly):
root@pve:~# dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 7.5917 s, 141 MB/s
root@pve:~# dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=dsync
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 7.65142 s, 140 MB/s
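
Both dd runs above write a single 1 GiB block, which finishes in under 8 seconds, while the restore only slows down after about a minute. A minimal sketch of a longer test follows: it writes several chunks back to back and prints the rate for each, so a drop-off after the first few chunks would show up. The helper name `write_chunks` and its arguments are made up for illustration, and it assumes GNU dd and GNU date (for `status=none`, `conv=fdatasync` and `%N`).

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Proxmox: write <chunks> chunks of
# <chunk_mb> MiB each to <target> and print the write rate per chunk.
write_chunks() {
    local target=$1 chunks=$2 chunk_mb=$3
    local i start end
    for ((i = 0; i < chunks; i++)); do
        start=$(date +%s.%N)
        # seek= places each chunk after the previous one, conv=notrunc keeps
        # the earlier chunks, conv=fdatasync flushes to disk before dd exits.
        dd if=/dev/zero of="$target" bs=1M count="$chunk_mb" \
            seek=$((i * chunk_mb)) conv=notrunc,fdatasync status=none
        end=$(date +%s.%N)
        awk -v n="$((i + 1))" -v mb="$chunk_mb" -v s="$start" -v e="$end" \
            'BEGIN { d = e - s; if (d <= 0) d = 0.000001
                     printf "chunk %d: %.1f MiB/s\n", n, mb / d }'
    done
}
```

For example, `write_chunks /root/testfile 20 512` would write about 10 GiB in 512 MiB steps. The chunk count and size are arbitrary; pick values that keep the test running longer than the point where the restore slows down.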

It doesn't seem to be the RAID controller or the drives that are the problem. Could it be the network drivers/setup?

Any ideas on how I can figure out why the IO delay jumps up and the system gets really slow?
*edit* I ran iotop and got some 99.99% hits on ks... something processes, any ideas?
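
One way to watch the delay from the shell rather than the web UI is to sample the kernel's iowait counter. The sketch below assumes the IO delay figure tracks iowait from the aggregate `cpu` line in /proc/stat; the helper name `iowait_pct` is made up for illustration. It takes two samples of that line and prints the share of CPU time spent waiting on IO between them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given two "cpu ..." lines from /proc/stat, print the
# percentage of CPU time spent in iowait between the two samples.
# Columns after the "cpu" label: user nice system idle iowait irq softirq steal ...
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Sum user..steal only; the guest columns are already counted in user/nice.
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            t[NR] = total
            w[NR] = $6          # iowait is the 5th counter, i.e. field 6
        }
        END {
            dt = t[2] - t[1]
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (w[2] - w[1]) / dt
        }'
}
```

Usage during a restore could look like `a=$(head -n1 /proc/stat); sleep 5; b=$(head -n1 /proc/stat); iowait_pct "$a" "$b"`, repeated in a loop to see when the value starts climbing.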
 

Attachment: io top.JPG
I did a ZFS install of 5.1 a couple of weeks ago and the I/O has been abysmal. I assume it is something related to ZFS, because I have other systems with similar hardware running 4.4 with ext4 and they perform fine. I plan to redo this 5.1 system with ext4 and see how it goes.
 
Thing is, my FS is RAW, because the disks are managed by an LSI RAID controller.
The actual throughput of the drives doesn't seem to be the issue; with dd I'm getting good enough speeds that the disks would outperform the LAN by miles.

What's happening, by the looks of things, is that when the LAN port and the disks are both used heavily, the entire VE slows to a crawl :(
 
So, I've been reading a bunch of stuff on the forum and went to double-check the settings on the HW RAID controller... apparently all the cache settings were turned off!

Enabled the cache and now it's doing much better.
 

Attachments: io delay.JPG, lan traffic.JPG
Hi,

please also check the controller firmware.
There may be a newer, better-working firmware for it.
 
There is an update, but IBM is such a bunch of ***** that they only have packages for Red Hat and Windows. And for some reason the updater that should work through the XClarity BIOS thing isn't accepting the file either :(

I'll have to struggle with that today.
 
There seem to be a whole bunch of threads describing this problem with PVE 5.X in the forum... :(
 