Performance issue with Proxmox 4 - high IO delay

Discussion in 'Proxmox VE: Installation and configuration' started by alex3137, Feb 12, 2016.

  1. alex3137

    alex3137 Member

    Joined:
    Jan 19, 2015
    Messages:
    43
    Likes Received:
    3
    Hello.

    I ran Proxmox 3.x on my server for a year, but performance has been slow since I upgraded to Proxmox 4 (I did a clean install from the OVH Proxmox 4 template and restored the containers manually).

    I am hosting basic apps for my own usage (owncloud, seafile, openvpn, emby, sonarr, couchpotato, deluge...). It used to work like a charm in the past and now everything is slow to respond.

    For example, connecting via SSH to the host or a container is unusually slow (it takes a couple of seconds just to get a reply from the server), connecting to my VPN container can take up to 30 seconds when it usually takes 5, installing a simple package like "tree" can take a minute or two in a container, and listing files or auto-completing in bash takes a couple of seconds.

    I noticed an unusually high IO delay in the Proxmox UI. With all the containers stopped, just running aptitude update on the host raises IO delay to between 5 and 10%. With all the containers running idle, there is a permanent IO delay of 1 to 5% (it used to be 0 all the time with Proxmox 3). Downloading one or two torrents with Deluge causes IO delay to rise to 70%!

    So I don't understand what is happening. IO delay used to be low all the time with Proxmox 3, except when I was doing backups or copying files (which is expected behaviour).

    My server is a Kimsufi with an Intel i5 - 4 cores, 16 GB of memory (around 5 GB used at the moment) and 2 TB hard drive.

    Here is the result of pveperf when the server is idle with no containers running:

    Code:
    CPU BOGOMIPS:      21331.24
    REGEX/SECOND:      1355841
    HD SIZE:           19.10 GB (/dev/sda2)
    BUFFERED READS:    155.27 MB/sec
    AVERAGE SEEK TIME: 7.15 ms
    FSYNCS/SECOND:     34.23
    DNS EXT:           46.06 ms
    DNS INT:           1005.14 ms (xxxx.me)
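
As a rough reading aid for the pveperf figures above, here is a minimal Python sketch that parses the output and flags values that look suspicious. The thresholds (200 fsyncs/second, 100 ms for DNS) and the function names are illustrative assumptions, not official Proxmox limits.

```python
import re

# Sample taken from the pveperf output above.
PVEPERF_OUTPUT = """\
CPU BOGOMIPS:      21331.24
REGEX/SECOND:      1355841
HD SIZE:           19.10 GB (/dev/sda2)
BUFFERED READS:    155.27 MB/sec
AVERAGE SEEK TIME: 7.15 ms
FSYNCS/SECOND:     34.23
DNS EXT:           46.06 ms
DNS INT:           1005.14 ms (xxxx.me)
"""

# Illustrative thresholds (assumptions, not official Proxmox limits).
MIN_FSYNCS_PER_SECOND = 200.0
MAX_DNS_MS = 100.0


def parse_pveperf(text):
    """Return {metric name: first numeric value on that line}."""
    metrics = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        match = re.search(r"[-+]?\d+(?:\.\d+)?", rest)
        if sep and match:
            metrics[name.strip()] = float(match.group())
    return metrics


def suspicious(metrics):
    """List the metrics that fall outside the illustrative thresholds."""
    flagged = []
    if metrics.get("FSYNCS/SECOND", float("inf")) < MIN_FSYNCS_PER_SECOND:
        flagged.append("FSYNCS/SECOND")
    for key in ("DNS EXT", "DNS INT"):
        if metrics.get(key, 0.0) > MAX_DNS_MS:
            flagged.append(key)
    return flagged


metrics = parse_pveperf(PVEPERF_OUTPUT)
print(suspicious(metrics))
```

With these assumed thresholds, both the fsync rate and the internal DNS lookup time stand out, which suggests checking sync-write performance and name resolution separately.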

    Running "iotop" on the host while torrents are downloading shows processes consuming all the IOs:

    Code:
    Total DISK READ :  5.16 M/s | Total DISK WRITE :  4.09 M/s
    Actual DISK READ:  0.00 B/s | Actual DISK WRITE:  2.42 M/s
      TID  PRIO  USER  DISK READ  DISK WRITE  SWAPIN  IO>  COMMAND
    21322 be/4 messageb  5.16 M/s  2.29 M/s  0.00 % 96.37 % python /usr/bin/deluged --port=58846 --config=/var/lib/deluge/.config/deluge
      574 be/3 root  0.00 B/s  0.00 B/s  0.00 % 92.26 % [jbd2/dm-0-8]
    11111 be/3 root  0.00 B/s  0.00 B/s  0.00 % 88.65 % [jbd2/loop10-8]
    12310 be/4 root  0.00 B/s  61.28 K/s  0.00 % 18.55 % [nfsd]
    12306 be/4 root  0.00 B/s  91.92 K/s  0.00 % 18.52 % [nfsd]
    12307 be/4 root  0.00 B/s  153.20 K/s  0.00 % 18.52 % [nfsd]
    12308 be/4 root  0.00 B/s  76.60 K/s  0.00 % 11.59 % [nfsd]
    12309 be/4 root  0.00 B/s  61.28 K/s  0.00 % 11.58 % [nfsd]
    12313 be/4 root  0.00 B/s  107.24 K/s  0.00 % 10.30 % [nfsd]
    12311 be/4 root  0.00 B/s  107.24 K/s  0.00 %  9.56 % [nfsd]
    12312 be/4 root  0.00 B/s  107.24 K/s  0.00 %  7.10 % [nfsd]
     5043 be/0 root  0.00 B/s 1053.25 K/s  0.00 %  0.75 % [kworker/u17:53]
      208 be/3 root  0.00 B/s  0.00 B/s  0.00 %  0.62 % [jbd2/sda2-8]
      364 be/4 root  0.00 B/s  0.00 B/s  0.00 %  0.02 % [kmmpd-loop5]
    26999 be/0 root  0.00 B/s  3.83 K/s  0.00 %  0.00 % [kworker/u17:6]
    15367 be/4 root  0.00 B/s  3.83 K/s  0.00 %  0.00 % pmxcfs
    31593 be/4 root  0.00 B/s  7.66 K/s  0.00 %  0.00 % rsyslogd -c5 [rs:main Q:Reg]
     1432 be/4 root  0.00 B/s  11.49 K/s  0.00 %  0.00 % pmxcfs
      1 be/4 root  0.00 B/s  0.00 B/s  0.00 %  0.00 % init
      2 be/4 root  0.00 B/s  0.00 B/s  0.00 %  0.00 % [kthreadd]
      3 be/4 root  0.00 B/s  0.00 B/s  0.00 %  0.00 % [ksoftirqd/0]
      5 be/0 root  0.00 B/s  0.00 B/s  0.00 %  0.00 % [kworker/0:0H]
      7 be/4 root  0.00 B/s  0.00 B/s  0.00 %  0.00 % [rcu_sched]
      8 be/4 root  0.00 B/s  0.00 B/s  0.00 %  0.00 % [rcu_bh]
    
    I don't know how to identify the root cause. Any suggestions?
     
    Clement87 likes this.
  2. evg32

    evg32 New Member
    Proxmox VE Subscriber

    Joined:
    Jan 4, 2016
    Messages:
    19
    Likes Received:
    0
    When I had an unusually high IO delay (even when all VMs were turned off), I just rebooted my server. That helped me.
     
  3. alex3137

    alex3137 Member

    Joined:
    Jan 19, 2015
    Messages:
    43
    Likes Received:
    3
    #3 alex3137, Feb 12, 2016
    Last edited: Feb 12, 2016
  4. alex3137

    alex3137 Member

    Joined:
    Jan 19, 2015
    Messages:
    43
    Likes Received:
    3
    So I converted all my containers from RAW disks to chroot. IO wait is better and performance has improved, but it is still not as good as when I was on OpenVZ. Just have a look at my example below.

    I looked at the backup logs from when I was running PVE 3.4 and compared the duration with a backup log from last night, with my server on PVE 4.1. The container is 185 GB:
    • Proxmox 3.4 / OpenVZ: 03:27:15
    • Proxmox 4.1 / LXC: 09:24:58
    There is definitely something wrong with storage performance, and I see other users reporting the same issue on this forum. Can somebody from the Proxmox staff give feedback or a recommendation, please? Thanks.
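
To put the two backup durations in perspective, the short Python sketch below converts them to seconds and derives an average throughput and a slowdown factor. It assumes the whole 185 GB was read in both runs and that 1 GB = 1000 MB; compression or differences in container contents are not accounted for.

```python
def to_seconds(hms):
    """Convert an HH:MM:SS duration string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


SIZE_MB = 185 * 1000  # 185 GB container, assuming 1 GB = 1000 MB

durations = {
    "PVE 3.4 / OpenVZ": "03:27:15",
    "PVE 4.1 / LXC": "09:24:58",
}

for label, hms in durations.items():
    secs = to_seconds(hms)
    print(f"{label}: {secs} s, about {SIZE_MB / secs:.1f} MB/s")

slowdown = to_seconds("09:24:58") / to_seconds("03:27:15")
print(f"slowdown factor: {slowdown:.2f}")
```

Under those assumptions the backup went from roughly 14.9 MB/s to roughly 5.5 MB/s, about 2.7 times slower.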
     
  5. nixmomo

    nixmomo Member
    Proxmox VE Subscriber

    Joined:
    Sep 13, 2014
    Messages:
    65
    Likes Received:
    2
    Same Problem here...
     
  6. alex3137

    alex3137 Member

    Joined:
    Jan 19, 2015
    Messages:
    43
    Likes Received:
    3
    I regret upgrading to Proxmox 4. If I had known I would have such performance issues, I would have stayed on version 3. Many users are reporting the same issue and I haven't seen any official reply from the Proxmox staff. I don't know if they are just ignoring the problem or are too busy to investigate.
     
  7. vigilian

    vigilian Member

    Joined:
    Oct 9, 2015
    Messages:
    82
    Likes Received:
    1
    Hi mate,
    I have the same problem as you. But I also have a Proxmox 4 server at home, so I can compare. Let's be clear about one thing: if it is a driver problem, then it is a Debian problem, and if it is a system process problem, it is again a Debian problem. Proxmox is just a software layer on top of a Debian system.
    As it turned out, my problem appeared after a data consistency error. I guess your system is not in RAID mode? If it is not, then we can't be sure your problem is the same, but I suspect a problem in your system files. In my humble opinion, the file system check is not 100% accurate about the contents of the data. So the only thing I can suggest is to uninstall the Proxmox package and see whether you still have IO wait with nothing running. If you don't, then it was a problem in your files and you have to reinstall the whole thing.
    As for me, I'm backing up all the data. They will exchange all the parts of my server except the disks, and I will see if the IO wait still occurs. If it does, I will have to reinstall the whole thing too.
     
    #7 vigilian, Mar 10, 2016
    Last edited: Mar 10, 2016
  8. windinternet

    windinternet Member

    Joined:
    Oct 8, 2015
    Messages:
    159
    Likes Received:
    7
    Is it really the disk IO that is causing a long backup duration? There can be problems with the suspend mode of backup on LXC containers where the suspension itself can hang for a long time.

    Long wait times on connection are often a sign of DNS problems.
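
The pveperf output in the first post reports DNS INT at 1005.14 ms, so name resolution is worth timing on its own. The following is a minimal Python sketch; the function name `time_lookup` is made up for illustration, and "localhost" is only a placeholder for the host names actually in use.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.getaddrinfo):
    """Return how long one name lookup takes, in milliseconds."""
    start = time.perf_counter()
    try:
        resolver(hostname, None)
    except socket.gaierror:
        pass  # a failed lookup still shows how long the resolver waited
    return (time.perf_counter() - start) * 1000.0


if __name__ == "__main__":
    # "localhost" is only a placeholder; substitute the names you actually use.
    for name in ("localhost",):
        print(f"{name}: {time_lookup(name):.1f} ms")
```

Lookups that consistently take around a second would point at the resolver configuration rather than at disk IO.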
     
  9. vigilian

    vigilian Member

    Joined:
    Oct 9, 2015
    Messages:
    82
    Likes Received:
    1
    He should give us the number of seconds between each percentage step to be sure. But there are a lot of causes other than the disks that could be behind the problem.
     
  10. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,086
    Likes Received:
    470
  11. vigilian

    vigilian Member

    Joined:
    Oct 9, 2015
    Messages:
    82
    Likes Received:
    1
    @fabian I'm not sure that this was really a good answer to alex's problem, since you only posted a link about the devs' decision to change the default filesystem, not about the performance and I/O delays of this actual problem.
    Anyway @alex3137, OVH has replaced my server and apparently it was not a hardware problem. Since both your problem and mine appeared after some time of use, to me it is clearly a problem of file system corruption. So back up your VMs and reinstall your server with the latest OVH Proxmox 4.0 template, and I'm sure that just after the installation you won't have any I/O delays. I'm reinstalling now.
     
  12. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,086
    Likes Received:
    470
    The switch from ext3 to ext4 changed a default mount option which does affect I/O performance, which is why I posted it (so that users who are experiencing I/O performance issues can check whether they are using ext4 with barriers and whether they want to turn them off or not, the linked thread provides more details about this topic). If you are using ext4, it is something worth investigating.
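
One way to check which ext3/ext4 filesystems have barriers explicitly turned off is to look at their mount options. The Python sketch below parses text in /proc/mounts format; the sample data is made up, and the sketch only detects an explicit `nobarrier` or `barrier=0` option, leaving everything else to the kernel default for that filesystem.

```python
def barrier_status(mounts_text):
    """Map each ext3/ext4 mount point to 'disabled' or 'not disabled'.

    A filesystem counts as 'disabled' only if its option list explicitly
    contains nobarrier or barrier=0; anything else is left to the kernel
    default for that filesystem.
    """
    status = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("ext3", "ext4"):
            continue
        options = fields[3].split(",")
        disabled = "nobarrier" in options or "barrier=0" in options
        status[fields[1]] = "disabled" if disabled else "not disabled"
    return status


# Made-up sample in /proc/mounts format (device, mount point, type, options, ...).
SAMPLE = """\
/dev/sda2 / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0
/dev/mapper/pve-data /var/lib/vz ext4 rw,relatime,nobarrier,data=ordered 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
"""

print(barrier_status(SAMPLE))
```

On a real host the same function could be fed the contents of /proc/mounts, for example `barrier_status(open("/proc/mounts").read())`.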
     
  13. Ovidiu

    Ovidiu Member
    Proxmox VE Subscriber

    Joined:
    Apr 27, 2014
    Messages:
    247
    Likes Received:
    2
    @fabian so if we're still on ext3 that is definitely not the problem?
     
  14. vigilian

    vigilian Member

    Joined:
    Oct 9, 2015
    Messages:
    82
    Likes Received:
    1
    @fabian OK, but apparently his problem appeared after some time, not directly after installation or upgrade. I am also experiencing the same thing, and it appeared after some time on my OVH server.
    And it does not appear on my home Proxmox server. So it is more likely a file corruption problem. The choice of ext4 and barriers doesn't seem to matter much from the benchmarks I've seen. Of course it's always a good idea to check, but what alex is experiencing is more like my problem:
    https://forum.proxmox.com/threads/problem-io-delay-after-reconstruction-of-raid.26432/
     
  15. vigilian

    vigilian Member

    Joined:
    Oct 9, 2015
    Messages:
    82
    Likes Received:
    1
    So, after some extensive tests of the disks which revealed no errors, a replacement of all the electronic parts except the disks (motherboard, RAM, controller, CPU), and a reinstallation of the system, there are no more IO delay problems; it is now in the normal range. So it was certainly a corrupted system file.
    Dev team, could you please find a better way to check the contents of the file system? Something like a hash of all critical system files?
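
The idea of hashing critical system files can be sketched in a few lines of Python: record SHA-256 digests while the system is known to be good, then compare later. The function names and the temporary demo file are illustrative assumptions; this is not an existing Proxmox feature. Debian's debsums tool offers a comparable check for files shipped by packages.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Record a digest for every path while the system is known to be good."""
    return {path: sha256_of(path) for path in paths}


def changed_files(manifest):
    """Return the paths whose current digest differs from the manifest."""
    return [p for p, digest in manifest.items() if sha256_of(p) != digest]


# Demonstration on a temporary file instead of real system files.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.conf")
    with open(target, "w") as handle:
        handle.write("setting=1\n")
    manifest = build_manifest([target])
    unchanged = changed_files(manifest)
    with open(target, "w") as handle:
        handle.write("setting=2\n")  # simulate corruption
    corrupted = changed_files(manifest)

print(unchanged, len(corrupted))
```

Such a manifest only helps if it was taken before the suspected corruption and stored somewhere other than the affected disks.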
     
  16. alex3137

    alex3137 Member

    Joined:
    Jan 19, 2015
    Messages:
    43
    Likes Received:
    3
    How do you conclude it was a corrupted file system? Because you are running out of ideas?

    My server is a fresh Proxmox 4 install. I don't want to do that again (I have a lot of data to back up and restore) if there is no way to confirm this is actually the issue.

    I am using ext3 by the way.
     
  17. alex3137

    alex3137 Member

    Joined:
    Jan 19, 2015
    Messages:
    43
    Likes Received:
    3
    I encountered this problem on the first day I used Proxmox 4.

    @fabian could you kindly advise how to troubleshoot this?
     
    Ovidiu likes this.
  18. vigilian

    vigilian Member

    Joined:
    Oct 9, 2015
    Messages:
    82
    Likes Received:
    1
    @alex3137 Yes, of course by elimination, but not only that. The hardware was replaced, and after a fresh installation everything was restored with good performance. So clearly it was file corruption. Do I have to remind you that fsck doesn't check the contents of files? If corruption occurs on several disks of a RAID, it will be copied over and over, so you should clearly do the reinstall. And since you normally haven't configured many files in the Proxmox system itself (because normally we don't have to, especially with the OVH installation process), restoring each VM and its configuration file will be fine. But you may have the same problem as me: you may have to restore the same VM twice.

    There is no way to troubleshoot corruption unless you have a hash of all the system files from the same version as yours, with all the modifications you made, or unless you check the contents of every system file.
     
  19. vigilian

    vigilian Member

    Joined:
    Oct 9, 2015
    Messages:
    82
    Likes Received:
    1
    @alex3137 you could also PM me so I can answer you more quickly, here or by private message.
     
  20. Philipp Page

    Philipp Page New Member

    Joined:
    Apr 26, 2016
    Messages:
    28
    Likes Received:
    0
    I have the same problem with IO delay; even a fresh reinstall with OVH's Proxmox 4.2 template does not fix it. My LXC VMs are in raw format. I read that changing this to chroot helps. How do I do that? If I back up and restore the VM in the Proxmox web interface, I cannot choose the format.
     