Slow Performance and low FSYNCS Rate

FlorianB

New Member
Aug 6, 2016
Hello,

I have a fresh setup of Proxmox in my homelab, which consists of 10x PowerEdge T20, each with a 1 TB Seagate HDD, 32 GB RAM and an Intel Xeon E3.

I've recently moved from XenServer to Proxmox and I'm now seeing low performance. With XenServer, packages installed within a few seconds, both inside VMs and on the host itself. With Proxmox, installing e.g. apache2 takes around 1-2 minutes (excluding download time), which is quite long; under XenServer it took maybe 5-15 seconds.

I'm using Proxmox 4.2-17/e1400248 without any RAID configured. A benchmark on one of the servers looks like this:
Code:
root@proxmox1:~# pveperf
CPU BOGOMIPS:      25542.72
REGEX/SECOND:      2737006
HD SIZE:           879.67 GB (/dev/sda1)
BUFFERED READS:    166.27 MB/sec
AVERAGE SEEK TIME: 19.12 ms
FSYNCS/SECOND:     41.58
DNS EXT:           87.87 ms
DNS INT:           66.89 ms (florianb.lab)

I've already tried switching to VirtIO network interfaces as well as VirtIO hard drives, different cache types for the VMs, and similar. Still, the speed both within the VMs and on the host itself is very slow.

Additionally, I've also seen a lot of IO Wait on my server:
[screenshot: CPU usage graph showing high IO wait]
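If you want to put numbers on that IO wait, iostat from the sysstat package shows per-device utilization and latency. This is only a generic sketch, not something from the original post:
Code:
apt-get install sysstat
iostat -x 2    # watch %util and await for /dev/sda while installing a package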


BTW: I am running Debian Jessie on these servers and installed Proxmox through your Jessie guide.

Please help me! :-(

Kindest,
Florian
 
Well, an FSYNC rate of 41 is very low, but expected if you use a single, slow disk. To get reasonable speed, you should have an fsync rate > 1000 ...
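As a rough cross-check of what pveperf's FSYNCS/SECOND measures, a synced-write loop with dd gives a ballpark figure. This is only a sketch (path and count are arbitrary, and oflag=dsync syncs every 4 KB write, which is close to, but not identical to, an fsync per write):
Code:
dd if=/dev/zero of=/var/lib/vz/dsync-test bs=4k count=1000 oflag=dsync   # ~1000 synchronous writes
rm /var/lib/vz/dsync-test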
 
Well, an FSYNC rate of 41 is very low, but expected if you use a single, slow disk. To get reasonable speed, you should have an fsync rate > 1000 ...
The disk itself isn't slow. I had XenServer running on exactly the same server and got much better speed both on the host itself and within the VMs. I've now switched to AHCI mode in the BIOS and re-enabled the CPU's Turbo Boost. The IO wait got a bit better: at 60% CPU load there was 8% IO wait; I don't know whether that is acceptable or not.

Additionally, the disk is intended for servers and 24/7 operation; it was sold as a "Server HDD".

Do you have any idea how I could increase the speed within the VMs, on the host itself, or both?

PS: I got write/read speeds of 150-200 MB/s under XenServer with this HDD, so this is _not_ a slow HDD.
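For a quick sequential-read sanity check on the host (a sketch, assuming /dev/sda is the disk in question; this measures roughly what pveperf reports as BUFFERED READS):
Code:
hdparm -t /dev/sda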
 
fsync has nothing to do with bandwidth.
You are testing a sequential write or read, which is not the normal workload of an OS.
And with VMs you have more than one OS doing the same thing at once.
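To illustrate the difference, something like fio can compare the two workloads directly. A sketch only (file path, size and runtime are arbitrary examples): the first job measures sequential bandwidth, the second measures small random writes with a sync after every write, which is much closer to what several VMs sharing one disk generate.
Code:
fio --name=seq --filename=/var/lib/vz/fio-test --size=1G --bs=1M --rw=write --direct=1
fio --name=randsync --filename=/var/lib/vz/fio-test --size=1G --bs=4k --rw=randwrite --fsync=1 --runtime=30 --time_based
rm /var/lib/vz/fio-test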
 
FlorianB,

First, to get high fsync rates and high performance, get a SAS RAID controller with on-board cache memory (battery backed, of course), such as LSI Logic or Adaptec.
Use SAS HDDs, which provide high performance and reliability.
In fact, using cache-enabled controllers (or enabling the cache on existing controllers, if available) can increase disk performance several times over in some cases.
I once tested a SATA SSD on Ubuntu with KVM and compared it with my old server, which runs Proxmox + LSI RAID (cache enabled) with SAS HDDs. On many small writes, the SATA SSD's performance was significantly lower than SAS with cache.

Second, please do not compare linear read/write "lab test" speed with normal OS/application conditions, where services and applications perform many read/write operations in random places on the HDD. That forces the magnetic head to move constantly between the beginning, middle and end of the physical disk, and all those head movements hurt I/O performance and increase the I/O wait value.
Now imagine how many reads/writes a Proxmox host performs with a number of VMs. A cache helps VERY MUCH to reduce physical read/write operations on the disk.

Of course, the Linux kernel caches disk operations itself, but a controller cache is much more efficient.

P.S. Here are two fresh tests:

1. Old server in our company (LSI Logic, cache enabled, RAID1 with 2x SAS disks):
Code:
venus:~# pveperf /var/lib/vz
CPU BOGOMIPS: 76797.92
REGEX/SECOND: 776615
HD SIZE: 166.93 GB (/dev/mapper/pve-data)
BUFFERED READS: 102.39 MB/sec
AVERAGE SEEK TIME: 5.98 ms
FSYNCS/SECOND: 1933.95
DNS EXT: 118.40 ms
DNS INT: 0.84 ms (********.ru)

2. New server at a local educational institution (SAS "Server" disk):
Code:
root@elefant:~# pveperf /var/lib/vz/
CPU BOGOMIPS: 100818.72
REGEX/SECOND: 1117021
HD SIZE: 2516.60 GB (/dev/mapper/pve-data)
BUFFERED READS: 147.15 MB/sec
AVERAGE SEEK TIME: 13.43 ms
FSYNCS/SECOND: 42.78
DNS EXT: 28.00 ms
DNS INT: 1.00 ms (********.ru)

Old SAS wins.

WBR,
Dmitry, Russia.
 
fsync has nothing to do with bandwidth.
You are testing a sequential write or read, which is not the normal workload of an OS.
And with VMs you have more than one OS doing the same thing at once.
Well, I still don't really get what the issue could be then. Under XenServer everything in the VM (from OS installation to package installation) was much faster in terms of time to complete: unpacking, extracting, installing, all those apt-get steps. Currently I can live with it, because I'm just cloning templates now, and when running a webserver the performance is even faster than with XenServer.

FlorianB,

First, to get high fsync rates and high performance, get a SAS RAID controller with on-board cache memory (battery backed, of course), such as LSI Logic or Adaptec. [...]

WBR,
Dmitry, Russia.

Thanks a lot for your testing and recommendations! Currently this is only my homelab and nothing production, so I can't really invest in SAS devices, a controller, or similar right now. Your SAS disk also shows a very low FSYNC rate of 42. How fast is the installation process of virtual machines on that server?

-----

PS: I think the issue could also be caused by the kernel. Is there anything known about the PowerEdge T20 in the Xeon edition? Have there been any issues before?
 
What filesystem do you use on your Proxmox servers?
Please show the output of: cat /proc/mounts
Code:
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=10240k,nr_inodes=4103449,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,relatime,size=6570772k,mode=755 0 0
/dev/sda1 / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
tmpfs /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
tmpfs /sys/fs/cgroup tmpfs rw,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset,clone_children 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_cls,net_prio 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event,release_agent=/run/cgmanager/agents/cgm-release-agent.perf_event 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb,release_agent=/run/cgmanager/agents/cgm-release-agent.hugetlb 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids,release_agent=/run/cgmanager/agents/cgm-release-agent.pids 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=22,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0
mqueue /dev/mqueue mqueue rw,relatime 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime 0 0
fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0
rpc_pipefs /run/rpc_pipefs rpc_pipefs rw,relatime 0 0
tmpfs /run/lxcfs/controllers tmpfs rw,relatime,size=100k,mode=700 0 0
pids /run/lxcfs/controllers/pids cgroup rw,relatime,pids,release_agent=/run/cgmanager/agents/cgm-release-agent.pids 0 0
hugetlb /run/lxcfs/controllers/hugetlb cgroup rw,relatime,hugetlb,release_agent=/run/cgmanager/agents/cgm-release-agent.hugetlb 0 0
perf_event /run/lxcfs/controllers/perf_event cgroup rw,relatime,perf_event,release_agent=/run/cgmanager/agents/cgm-release-agent.perf_event 0 0
net_cls,net_prio /run/lxcfs/controllers/net_cls,net_prio cgroup rw,relatime,net_cls,net_prio 0 0
freezer /run/lxcfs/controllers/freezer cgroup rw,relatime,freezer 0 0
devices /run/lxcfs/controllers/devices cgroup rw,relatime,devices 0 0
memory /run/lxcfs/controllers/memory cgroup rw,relatime,memory 0 0
blkio /run/lxcfs/controllers/blkio cgroup rw,relatime,blkio 0 0
cpu,cpuacct /run/lxcfs/controllers/cpu,cpuacct cgroup rw,relatime,cpu,cpuacct 0 0
cpuset /run/lxcfs/controllers/cpuset cgroup rw,relatime,cpuset,clone_children 0 0
name=systemd /run/lxcfs/controllers/name=systemd cgroup rw,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
cgmfs /run/cgmanager/fs tmpfs rw,relatime,size=100k,mode=755 0 0
lxcfs /var/lib/lxcfs fuse.lxcfs rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other 0 0
/dev/fuse /etc/pve fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other 0 0
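The relevant line here is the root filesystem: /dev/sda1 is mounted as ext4 with stock options, so write barriers are on (the ext4 default). A shorter way to pull out just that line, as a sketch:
Code:
findmnt -no SOURCE,FSTYPE,OPTIONS /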
 
Your SAS disk also shows a very low FSYNC rate of 42. How fast is the installation process of virtual machines on that server? [...]

PS: I think the issue could also be caused by the kernel. Is there anything known about the PowerEdge T20 in the Xeon edition? Have there been any issues before?
I think there are no issues with the kernel. I have a number of Proxmox installations on different hardware (Intel, AMD) with the same kernel, and there are no performance problems.

Those are SAS "Server" disks, but they are attached to a simple on-board controller without cache, known as FakeRAID.

Installation is really fast, despite the low FSYNC rate.
So another issue may be in the VM disk settings. What cache type is set? It should be "Default (no cache)".
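For reference, the cache mode can also be inspected and changed from the CLI. This is only a sketch with made-up values (VM ID 100, storage 'local', disk volume name); check qm config first and adjust to the real disk entry:
Code:
qm config 100 | grep -i virtio        # shows the current disk line, including any cache= setting
qm set 100 --virtio0 local:vm-100-disk-1,cache=none   # 'none' corresponds to "Default (no cache)" in the GUI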

WBR,
Dmitry, Russia.
 
Code:
root@proxmox3:~# pveperf
CPU BOGOMIPS: 25541.20
REGEX/SECOND: 2693822
HD SIZE: 879.67 GB (/dev/sda1)
BUFFERED READS: 209.80 MB/sec
AVERAGE SEEK TIME: 18.08 ms
FSYNCS/SECOND: 1964.76
DNS EXT: 78.25 ms
DNS INT: 121.25 ms (florianb.lan)

THANKS A LOT! I'm very grateful. The speed of the system has increased drastically and the buffered read speed is now over 200 MB/s on all hosts!
THANK YOU! This thread can be closed now! :)
 
Please read up on what a write barrier actually is, and please get another disk for RAID1; you will not have a lot of fun with this setup if something breaks.
 
setting barrier=0 is perfectly safe on current ext4.
only if the underlying storage can guarantee that it writes its cache to disk on power failure
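For anyone following along, this is the mount option being discussed. A sketch only; the syntax is for ext4, and, per the warning above, disabling barriers is only reasonable when the disk or controller cache is power-loss safe (e.g. battery backed):
Code:
findmnt -no OPTIONS /                 # barrier does not appear unless it was set explicitly
mount -o remount,barrier=0 /          # disable write barriers on the ext4 root (persist via fstab if desired)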
 
