Hello everyone,
I've been using Proxmox for the last two years, but I'm facing a problem I can't figure out how to solve, so I need some help or advice on how to handle this situation.
I'm running Proxmox with a VM that has 16 vCPUs (2 sockets, 8 cores each), 90 GB of RAM, 4 x 4 TB WD Gold datacenter HDDs in RAID 10, and 2 x 512 GB NVMe drives for caching. The four HDDs are in a ZFS RAID 10 pool, and the VM disk settings are: vm-100-disk-1, cache=writeback, iops_rd=30000, iops_rd_max=32000, iops_wr=30000, iops_wr_max=32000, mbps_rd=30, mbps_rd_max=32, mbps_wr=30, mbps_wr_max=32, size=8T. Those four hard drives are encrypted, by the way.
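For reference, the disk line in the VM config looks roughly like this (just a sketch of what `qm config 100` would show; the storage name "local-zfs" and the scsi0 bus are placeholders, only the options are the ones listed above):
Code:
# sketch only: storage name "local-zfs" and bus "scsi0" are placeholders
scsi0: local-zfs:vm-100-disk-1,cache=writeback,iops_rd=30000,iops_rd_max=32000,iops_wr=30000,iops_wr_max=32000,mbps_rd=30,mbps_rd_max=32,mbps_wr=30,mbps_wr_max=32,size=8T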
The VM runs CentOS 7.6, and I host several websites and projects on my control panel. The problem is that when I try to upload or restore a big file (2 GB, 4 GB, or more), I/O spikes and the server load climbs, with no pressure on CPU or RAM, but the services stop responding for several minutes until the upload or restore finishes. That's a huge problem for me, because I don't want my projects to stop working.
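While one of these uploads is running, I can watch the spike in real time with iostat from the sysstat package, for example:
Code:
# extended per-device statistics every 2 seconds, skipping idle devices
iostat -xz 2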
To dig a bit further into the issue, I ran some tests on the disks.
I ran the following test on the encrypted LUKS partition:
Code:
hdparm -Tt /dev/sda3
/dev/sda3:
Timing cached reads: 12832 MB in 1.99 seconds = 6442.37 MB/sec
Timing buffered disk reads: 38 MB in 3.09 seconds = 12.31 MB/sec
This shows that buffered disk reads are coming in at only 12.31 MB/sec.
I had a feeling that the LUKS encryption might not be particularly performant, so I compared this with a test on the unencrypted xfs partition:
Code:
hdparm -Tt /dev/sda2
/dev/sda2:
Timing cached reads: 11598 MB in 1.99 seconds = 5824.03 MB/sec
Timing buffered disk reads: 112 MB in 3.00 seconds = 37.32 MB/sec
As you can see, the performance is roughly three times better, but still quite low at 37.32 MB/sec.
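To check whether the encryption itself might be the bottleneck, I could also compare this against the raw cipher throughput in memory (note that this benchmarks only the CPU-side crypto, not the disks):
Code:
# in-memory cipher benchmark only; the disks are not touched
cryptsetup benchmark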
Let's also take a look at writes with the following test:
Code:
dd if=/dev/zero of=tempfile bs=1M count=1024 conv=fdatasync,notrunc
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 55.9095 s, 19.2 MB/s
That's only 19 MB/s for writes to the disk, which is quite slow.
As a result, disk utilization quickly spikes and gets saturated during any operation that involves heavy disk I/O.
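Since dd from /dev/zero is only a rough indicator, I could also run a more controlled write test with fio (just a sketch; fio isn't installed by default, and /root/fio-test is a placeholder path):
Code:
# sequential 1M writes with O_DIRECT, bypassing the page cache (file path is a placeholder)
fio --name=seqwrite --filename=/root/fio-test --rw=write --bs=1M --size=1G --ioengine=libaio --direct=1 --iodepth=16 --numjobs=1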
You can also see in this sar report that my server regularly hits high utilization:
Code:
sar -dp | column -t
Linux 3.10.0-962.3.2.lve1.5.24.8.el7.x86_64 - 01/23/2019 _x86_64_ (16 CPU)
12:00:02 AM DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
12:10:01 AM sdb 35.84 12.99 1065.57 30.09 0.32 8.98 6.19 22.18
12:10:01 AM sda 687.99 17214.39 9997.64 39.55 8.32 12.06 1.14 78.49
12:10:01 AM sdc 100.17 5.00 21273.32 212.42 74.61 744.62 5.80 58.14
12:10:01 AM luks-d113de61-2823-4b76-becf-b67bb8c4a986 764.43 17214.21 9997.64 35.60 85.24 111.47 1.04 79.54
12:10:01 AM centos-root 299.17 16900.24 1211.67 60.54 4.20 14.00 2.62 78.50
12:10:01 AM centos-swap 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
12:10:01 AM centos-home 448.83 313.97 8786.72 20.28 81.04 180.52 0.58 25.99
12:10:01 AM backup 326.92 5.00 21273.32 65.09 1132.16 3463.07 1.78 58.14
12:20:01 AM sdb 66.84 27.86 1265.58 19.35 1.69 25.24 14.46 96.62
12:20:01 AM sda 497.77 725.88 30242.73 62.21 62.13 124.59 2.01 99.94
12:20:01 AM sdc 8.83 6.73 576.89 66.11 1.68 187.72 3.46 3.05
12:20:01 AM luks-d113de61-2823-4b76-becf-b67bb8c4a986 1159.81 726.24 30242.73 26.70 3717.22 3204.86 0.86 99.96
12:20:01 AM centos-root 39.85 529.83 738.42 31.82 53.19 1333.06 25.03 99.75
12:20:01 AM centos-swap 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
12:20:01 AM centos-home 1084.74 196.41 29504.80 27.38 3663.99 3377.69 0.90 98.08
12:20:01 AM backup 35.05 6.73 576.89 16.65 8.81 251.51 0.87 3.05
12:30:26 AM sdb 332.55 80.13 6446.01 19.62 1.28 3.83 2.16 71.84
12:30:26 AM sda 253.49 5061.01 28166.41 131.08 60.90 239.89 2.40 60.80
12:30:26 AM sdc 45.60 987.64 1361.17 51.51 5.55 121.66 12.48 56.91
The column to look at is the last one, %util.
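To pull out only the intervals where a device is close to saturation, something like this works (the 90% threshold is just an arbitrary cut-off):
Code:
# print sar lines whose last column (%util) is above 90
sar -dp | awk '$NF+0 > 90'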
As far as I can tell, everything looks fine inside the CentOS system itself, but I think I'm doing something wrong on the Proxmox side, because disk performance is very poor, as the tests show.
Any help or advice on the Proxmox settings is welcome. Thanks for your time.