Slow VMs, hard disk lag, high load (CPU/mem inside the VMs look fine)

Mn1sh

New Member
Mar 6, 2016
Please check this and advise.

The Ceph cluster runs on a private network with 10 Gbit NICs, but there seems to be a bottleneck in FSYNCS/SECOND; from what I have read in other forum posts, I suspect this is the issue.
Please advise any tips to fine-tune this. Each server runs 4x 4 TB disks, which are added to the Ceph cluster.

root@srv01:~# pveperf /var/lib/ceph/osd/ceph-0
CPU BOGOMIPS: 115206.36
REGEX/SECOND: 2104252
HD SIZE: 3724.20 GB (/dev/sdc1)
BUFFERED READS: 98.32 MB/sec
AVERAGE SEEK TIME: 32.69 ms
FSYNCS/SECOND: 30.02
DNS EXT: 97.55 ms
DNS INT: 115.20 ms (some_domain.tld)
root@srv01:~# pveperf /var/lib/ceph/osd/ceph-1
CPU BOGOMIPS: 115206.36
REGEX/SECOND: 2104382
HD SIZE: 3724.20 GB (/dev/sdd1)
BUFFERED READS: 4.55 MB/sec
AVERAGE SEEK TIME: 47.58 ms
FSYNCS/SECOND: 10.77
DNS EXT: 91.08 ms
DNS INT: 107.45 ms (some_domain.tld)
root@srv01:~#


root@srv01:~# pveperf /var/lib/ceph/osd/ceph-2
CPU BOGOMIPS: 115206.36
REGEX/SECOND: 2369496
HD SIZE: 3724.20 GB (/dev/sde1)
BUFFERED READS: 98.46 MB/sec
AVERAGE SEEK TIME: 29.45 ms
FSYNCS/SECOND: 16.64
DNS EXT: 105.52 ms
DNS INT: 112.23 ms (some_domain.tld)

root@srv01:~# pveperf /var/lib/ceph/osd/ceph-3
CPU BOGOMIPS: 115206.36
REGEX/SECOND: 2137373
HD SIZE: 3724.20 GB (/dev/sdf1)
BUFFERED READS: 54.55 MB/sec
AVERAGE SEEK TIME: 19.26 ms
FSYNCS/SECOND: 23.28
DNS EXT: 118.42 ms
DNS INT: 109.81 ms (some_domain.tld)
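
For re-checking all four OSDs after a change, a small loop like the sketch below saves typing; it only assumes the OSDs stay mounted under /var/lib/ceph/osd/ as shown in the mount output further down:

# Print the seek time and fsync rate for every OSD mount point:
for osd in /var/lib/ceph/osd/ceph-*; do
    echo "== $osd =="
    pveperf "$osd" | grep -E 'FSYNCS|SEEK'
done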

root@srv01:~# mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,relatime)
udev on /dev type devtmpfs (rw,relatime,size=10240k,nr_inodes=8229893,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,relatime,size=13173540k,mode=755)
/dev/mapper/pve-root on / type ext4 (rw,relatime,errors=remount-ro,data=ordered)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset,clone_children)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event,release_agent=/run/cgmanager/agents/cgm-release-agent.perf_event)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb,release_agent=/run/cgmanager/agents/cgm-release-agent.hugetlb)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=23,pgrp=1,timeout=300,minproto=5,maxproto=5,direct)
mqueue on /dev/mqueue type mqueue (rw,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw,relatime)
/dev/mapper/pve-data on /var/lib/vz type ext4 (rw,relatime,data=ordered)
rpc_pipefs on /run/rpc_pipefs type rpc_pipefs (rw,relatime)
tmpfs on /run/lxcfs/controllers type tmpfs (rw,relatime,size=100k,mode=700)
name=systemd on /run/lxcfs/controllers/name=systemd type cgroup (rw,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd)
cpuset on /run/lxcfs/controllers/cpuset type cgroup (rw,relatime,cpuset,clone_children)
cpu,cpuacct on /run/lxcfs/controllers/cpu,cpuacct type cgroup (rw,relatime,cpu,cpuacct)
blkio on /run/lxcfs/controllers/blkio type cgroup (rw,relatime,blkio)
memory on /run/lxcfs/controllers/memory type cgroup (rw,relatime,memory)
devices on /run/lxcfs/controllers/devices type cgroup (rw,relatime,devices)
freezer on /run/lxcfs/controllers/freezer type cgroup (rw,relatime,freezer)
net_cls,net_prio on /run/lxcfs/controllers/net_cls,net_prio type cgroup (rw,relatime,net_cls,net_prio)
perf_event on /run/lxcfs/controllers/perf_event type cgroup (rw,relatime,perf_event,release_agent=/run/cgmanager/agents/cgm-release-agent.perf_event)
hugetlb on /run/lxcfs/controllers/hugetlb type cgroup (rw,relatime,hugetlb,release_agent=/run/cgmanager/agents/cgm-release-agent.hugetlb)
lxcfs on /var/lib/lxcfs type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
cgmfs on /run/cgmanager/fs type tmpfs (rw,relatime,size=100k,mode=755)
/dev/fuse on /etc/pve type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)
/dev/sdc1 on /var/lib/ceph/osd/ceph-0 type xfs (rw,noatime,attr2,inode64,noquota)
/dev/sdd1 on /var/lib/ceph/osd/ceph-1 type xfs (rw,noatime,attr2,inode64,noquota)
/dev/sde1 on /var/lib/ceph/osd/ceph-2 type xfs (rw,noatime,attr2,inode64,noquota)
/dev/sdf1 on /var/lib/ceph/osd/ceph-3 type xfs (rw,noatime,attr2,inode64,noquota)
root@srv01:~#
 
I have no experience with Ceph, but I had a similar problem after upgrading from 3.4 to 4.x: low fsyncs/second.

The culprit was the barrier option in fstab.

Untouched fstab options:
root@proxmox02:~# pveperf /
CPU BOGOMIPS: 127705.12
REGEX/SECOND: 1088824
HD SIZE: 94.37 GB (/dev/dm-0)
BUFFERED READS: 468.59 MB/sec
AVERAGE SEEK TIME: 7.78 ms
FSYNCS/SECOND: 20.10
DNS EXT: 178.42 ms
DNS INT: 0.98 ms

Changed fstab options:
root@proxmox02:~# pveperf /var/lib/vz
CPU BOGOMIPS: 127705.12
REGEX/SECOND: 1097056
HD SIZE: 3494.91 GB (/dev/mapper/pve-data)
BUFFERED READS: 516.86 MB/sec
AVERAGE SEEK TIME: 11.96 ms
FSYNCS/SECOND: 6125.42
DNS EXT: 179.27 ms
DNS INT: 1.04 ms

# <file system> <mount point> <type> <options> <dump> <pass>
/dev/pve/root / ext3 errors=remount-ro 0 1
/dev/pve/data /var/lib/vz ext3 defaults,barrier=0 0 1
UUID=564edcfa-8baa-445d-8b9c-ad8d8f0721c8 /boot ext3 defaults 0 1
/dev/pve/swap none swap sw 0 0
proc /proc proc defaults 0 0
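
To test the effect before (or without) editing fstab, the option can be applied with a live remount; a minimal sketch, assuming ext3/ext4 on a kernel that still accepts barrier=0. Note that disabling barriers is only safe with a battery- or flash-backed write cache, since it risks corruption on power loss.

# Remount with write barriers disabled, then re-measure the fsync rate.
# WARNING: only safe if the controller has a protected write cache.
mount -o remount,barrier=0 /var/lib/vz
pveperf /var/lib/vz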

Take a look at the post from Dietmar:
https://forum.proxmox.com/threads/v...irect-single-hdd-ext3-or-4.25733/#post-128902

I hope it helps.
 
All the KVM guest/container images are in qcow2 format. Is that causing any delay? Any thoughts?
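
In case it helps to check: qemu-img can report the image format and convert a copy to raw for an A/B comparison. The VMID and paths below are hypothetical; adjust them to your storage layout.

# Hypothetical paths; adjust the VMID and storage path to your setup.
qemu-img info /var/lib/vz/images/100/vm-100-disk-1.qcow2
# Convert a copy to raw for comparison (run with the VM stopped):
qemu-img convert -p -O raw /var/lib/vz/images/100/vm-100-disk-1.qcow2 \
    /var/lib/vz/images/100/vm-100-disk-1.raw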
 
Changing the disk cache mode from none to cache=writeback on the QEMU guests/VMs (Hardware -> Disk) improved the lag issue to a very good extent.
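
For the record, the same change can be made from the CLI; a sketch assuming VMID 100 with a virtio disk on storage "local" (the VMID, disk key, and volume name are assumptions, check them with qm config first):

# Hypothetical VMID and disk; show the current disk line first:
qm config 100 | grep virtio0
# Re-set the disk with cache=writeback appended to its options:
qm set 100 --virtio0 local:100/vm-100-disk-1.qcow2,cache=writeback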
 
FSYNCS/SECOND did not change much; strangely though, disk lag and load inside the guest VMs improved by at least 5x.

It was painfully slow earlier, with waits of 20 to 60 seconds after entering a command, but after making the changes we don't have to wait more than 5 seconds for simple commands like w or df. Even the load, which was constantly hitting 12 to 13, came down to below 3 and mostly below 2 (note that our VMs are shared hosting servers with a minimum of 500 sites each).
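
To put a rough number on it inside a guest, a simple sync-write check with dd is sketched below; the target path is arbitrary, and since every 4 KiB write is flushed to disk (oflag=dsync), the reported throughput is a proxy for per-fsync latency:

# Write 1000 x 4 KiB blocks, syncing after each write, then clean up:
dd if=/dev/zero of=/tmp/fsync-test bs=4k count=1000 oflag=dsync
rm /tmp/fsync-test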