[SOLVED] VM machine pauses and hangs

YellowShed

Member
Aug 7, 2014
Hello,

I'm having a weird issue that I can't find an answer for, and I'm not sure what search terms to use to find a solution.

I have two HP ProLiant Gen8 machines with dual CPUs and 64GB of RAM each. I have set the onboard controller to configure the disks as RAID 1+0. The SATA controller has one SSD for the Proxmox system and a spinning 2TB disk split into two for additional, slower local storage. I have set the Proxmox servers up as dual-primary, running DRBD over the SAS drives and also over the two halves of the 2TB HDD (3 DRBD resources in total).

After installing a mix of Linux and Windows machines, I am seeing pauses/hangs in the virtual machines, especially when user activity increases. Specifically, I have three Windows 7 VMs, one Windows Server 2012, a Debian-based OMV server and a Zentyal machine. The pauses/hangs even affect rlogin sessions to the Proxmox boxes themselves. I do not see high CPU usage on either box just after they un-freeze, nor excessive IO wait. There is no indication of anything wrong.

I have read all I can about best practices for the virtual machines, and am running DirectSync on the vm disks. All machines run virtio for the disks.

Today, I stopped all virtual machines, stopped the DRBD service and re-enabled barriers, md flushes and disk write flushes, then restarted the DRBD service and the VMs. There was no improvement; it still seems as bad as ever.
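
One way to confirm what such a restart actually changed is the `wo:` (write ordering) field that DRBD 8.x prints per resource in `/proc/drbd`: `b` (barrier), `f` (flush), `d` (drain) or `n` (none). The following is only a minimal Python sketch, assuming the DRBD 8.x `/proc/drbd` layout; the function name `parse_proc_drbd` is made up for illustration.

```python
import re
from pathlib import Path

# Meaning of the single-letter "wo:" field in DRBD 8.x /proc/drbd output.
WRITE_ORDERING = {"b": "barrier", "f": "flush", "d": "drain", "n": "none"}


def parse_proc_drbd(text):
    """Return {minor: {"cs", "ro", "ds", "wo"}} parsed from /proc/drbd text."""
    resources = {}
    current = None
    for line in text.splitlines():
        # Resource header line, e.g. " 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----"
        head = re.match(r"\s*(\d+):\s+cs:(\S+)\s+ro:(\S+)\s+ds:(\S+)", line)
        if head:
            current = int(head.group(1))
            resources[current] = {
                "cs": head.group(2),
                "ro": head.group(3),
                "ds": head.group(4),
                "wo": None,
            }
            continue
        # The counters line that follows carries the write-ordering method.
        wo = re.search(r"\bwo:(\w)", line)
        if wo and current is not None:
            code = wo.group(1)
            resources[current]["wo"] = WRITE_ORDERING.get(code, code)
    return resources


if __name__ == "__main__":
    proc_drbd = Path("/proc/drbd")
    if proc_drbd.exists():  # only present when the drbd module is loaded
        for minor, state in sorted(parse_proc_drbd(proc_drbd.read_text()).items()):
            print(minor, state)
```

Run on each node after restarting DRBD, this should show whether every resource is back to `Connected` and `UpToDate/UpToDate`, and which write-ordering method is in effect.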

I have two dedicated 10Gb NICs for the DRBD replication. They are directly cabled and configured as bond0 in "balance-rr" (round-robin) mode. Separate NICs carry the network traffic for the guest machines.

Any help would be gratefully received and thanks for reading,

YellowShed
 
Hello,

please try switching the bond from balance-rr mode back to active-backup.
Also disable the power management features on every Windows machine.
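
To confirm which mode the bond is really running in after the change, the Linux bonding driver exposes its state in `/proc/net/bonding/bond0`. Below is a minimal Python sketch that pulls out the mode and the slave interfaces; the function name `parse_bonding_status` is hypothetical, and the bond name `bond0` is taken from the first post.

```python
from pathlib import Path


def parse_bonding_status(text):
    """Extract mode, active slave and slave list from /proc/net/bonding/<bond> text."""
    info = {"mode": None, "active_slave": None, "slaves": []}
    for line in text.splitlines():
        # Lines look like "Key: value"; split on the first colon only.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            info["mode"] = value
        elif key == "Currently Active Slave":
            info["active_slave"] = value
        elif key == "Slave Interface":
            info["slaves"].append(value)
    return info


if __name__ == "__main__":
    bond_file = Path("/proc/net/bonding/bond0")  # bond0 as named in the thread
    if bond_file.exists():
        print(parse_bonding_status(bond_file.read_text()))
```

With balance-rr the mode line reads "load balancing (round-robin)"; after switching it should read "fault-tolerance (active-backup)" and a currently active slave should be listed.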

I have 3 customers with a 10Gb DRBD setup.

good luck
mac
 
Hi, macday. Thanks for the quick reply.

I will change the config as you suggested, and report back.

YellowShed
 
Hi there. Seems much better. Response seems good, as far as I have been able to assess it this afternoon.

I wonder if I should have configured the sysctl net re-ordering parameters with balance-rr to tolerate a higher number of re-ordered TCP packets. It seems a shame not to be able to use the two 10Gb cards for more than one active link with the other as a spare in case there's a problem. Would any of the higher bonding modes be useful? Is there a way to configure the NICs to send on one and receive on the other? That kind of defeats the duplex nature, I agree. Oh well, active-backup it is for the moment.
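
For reference, the re-ordering setting usually meant in this context is the `net.ipv4.tcp_reordering` sysctl (kernel default 3), which the Linux bonding documentation mentions as a tunable for balance-rr. A minimal Python sketch for checking the current value on a node follows; the helper name `parse_tcp_reordering` is made up for illustration, and the sketch only reads the value, it does not change it.

```python
from pathlib import Path

# procfs view of the net.ipv4.tcp_reordering sysctl
TCP_REORDERING = Path("/proc/sys/net/ipv4/tcp_reordering")


def parse_tcp_reordering(text):
    """Convert the raw file content (e.g. "3\\n") into an integer."""
    return int(text.strip())


if __name__ == "__main__":
    if TCP_REORDERING.exists():
        value = parse_tcp_reordering(TCP_REORDERING.read_text())
        print("net.ipv4.tcp_reordering =", value)
```

Whether raising it would have cured the hangs seen with balance-rr here is untested in this thread.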

Thanks for the pointer.

Do I now go back to no-barriers, no-md-flushes, no-disk-flushes?

Thanks again for the help,

YellowShed
 
Hi again. (Been away on holiday, so no chance to test and report back until now).

Thanks to macday for the pointer on the Ethernet issue. I wouldn't have thought to look there. Anyway, guest response seems pretty good and there are no more hangs/freezes. I have re-implemented the no-barriers, no-md-flushes and no-disk-flushes settings as well.

Now I'm just looking into Transparent Huge Pages for a performance improvement. I have googled and there is some information about this, but not much directly for Proxmox. Any ideas on what settings would be good for a couple of Windows Server 2012 machines, some Windows 7 guests and a few Ubuntu-based VMs? I have 64GB RAM and two 4-core processors on each Proxmox server.
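
As a starting point, the host's current Transparent Huge Pages settings can be read from sysfs, where the kernel marks the active choice with square brackets (for example "[always] madvise never"). The following minimal Python sketch assumes the standard mainline sysfs paths; the function name `active_thp_mode` is hypothetical, and the sketch makes no claim about which setting suits these guests best.

```python
import re
from pathlib import Path

# Standard mainline kernel location; some distribution kernels may differ.
THP_DIR = Path("/sys/kernel/mm/transparent_hugepage")


def active_thp_mode(text):
    """Return the bracketed (active) option from a THP sysfs file, or None."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


if __name__ == "__main__":
    for name in ("enabled", "defrag"):
        path = THP_DIR / name
        if path.exists():
            print(name, "=", active_thp_mode(path.read_text()))
```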

Maybe I'll start a new thread...

YellowShed
 
