Linux guest problems on new Haswell-EP processors

e100 · Nov 19, 2014

We recently upgraded two servers with dual socket Xeon boards.
One server has two E5-2687W v3
The other has two E5-2620 v3

I have three debian wheezy guests, one on the 2687W and two on the 2620 that have had issues.
These guests are currently out of production, so they just sit idle all day.
The only real load is a cron job that runs every few minutes; it makes some HTTP requests and reads/writes some tiny files.

The only clue I have is some kernel message in the guest about jbd2/dm-0-8 being blocked for more than 120 seconds.
I don't have the exact error, but it was something like "INFO: task jbd2/dm-0-8 blocked for more than 120 seconds."
IO becomes stalled and load keeps rising.
Only way to recover is to stop/start the VM.
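To capture the exact messages next time, the guest's kernel log can be filtered for hung-task reports. This is only a rough sketch: the function name `hung_tasks` is made up for illustration, and it assumes the usual "INFO: task NAME:PID blocked for more than N seconds." line format.

```shell
# List the distinct task names that the kernel reported as hung.
# Reads kernel log text on stdin and prints one task name per line.
hung_tasks() {
    # Keep only the task name between "task " and the trailing ":PID".
    sed -n 's/.*INFO: task \(.*\):[0-9][0-9]* blocked for more than.*/\1/p' | sort -u
}
```

Inside the guest this could be run as `dmesg | hung_tasks`. Writing `w` to /proc/sysrq-trigger as root should also dump all currently blocked tasks to the kernel log, provided sysrq is enabled in the guest.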

Guests worked fine before the upgrade.
The only components changed were the CPU, RAM and motherboard.
Still using same RAID card and disks.

Storage is LVM over DRBD.

Oddly no issues with Windows guests, so far.

Any suggestions?

VM config file:

Code:

# cat /etc/pve/qemu-server/107.conf 
bootdisk: virtio0
cores: 1
ide2: none,media=cdrom
memory: 1280
name: XXXXXXXXXXX
net0: virtio=XX:XX:XX:XX:XX:XX,bridge=vmbr10
onboot: 1
ostype: l26
sockets: 1
virtio0: vm9-vm10:vm-107-disk-1,cache=directsync,size=3G

Code:

# pveversion -v
proxmox-ve-2.6.32: 3.3-139 (running kernel: 2.6.32-34-pve)
pve-manager: 3.3-5 (running version: 3.3-5/bfebec03)
pve-kernel-2.6.32-20-pve: 2.6.32-100
pve-kernel-2.6.32-12-pve: 2.6.32-68
pve-kernel-2.6.32-19-pve: 2.6.32-96
pve-kernel-2.6.32-16-pve: 2.6.32-82
pve-kernel-2.6.32-13-pve: 2.6.32-72
pve-kernel-2.6.32-29-pve: 2.6.32-126
pve-kernel-2.6.32-34-pve: 2.6.32-139
pve-kernel-2.6.32-14-pve: 2.6.32-74
pve-kernel-2.6.32-26-pve: 2.6.32-114
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-18-pve: 2.6.32-88
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-1
pve-cluster: 3.0-15
qemu-server: 3.3-3
pve-firmware: 1.1-3
libpve-common-perl: 3.0-19
libpve-access-control: 3.0-15
libpve-storage-perl: 3.0-25
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.1-10
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1

e100 · Nov 19, 2014

Right after posting my message a colleague sent me this screenshot.
As you can see, many tasks are stalled. There are no other messages before these; it goes from working fine to spitting out these errors with stalled IO.

term · Nov 19, 2014

What do you have the processor type set to? Does changing it help?

e100 · Nov 19, 2014

term said:
What do you have the processor type set to? Does changing it help?

They were all set to Default.
I had the same thought myself and have already changed one of the VMs to Haswell, then shut it down and restarted it.

The VMs that have had this problem have run for up to two weeks without any problem so it will take some time to see if that helped.
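For reference, a small sketch of how the processor type can be checked from the VM config text. The helper name `cpu_type` is hypothetical, and it assumes that a VM left on "Default" simply has no `cpu:` line in its config, as in the 107.conf shown earlier.

```shell
# Print the CPU type from a Proxmox VM config read on stdin,
# or "default" when the config has no "cpu:" line.
cpu_type() {
    awk -F': *' '$1 == "cpu" { print $2; found = 1 }
                 END { if (!found) print "default" }'
}
```

On the host this would be `cpu_type < /etc/pve/qemu-server/107.conf`. The type itself can be changed with `qm set 107 -cpu Haswell`, and the VM needs a full stop/start before the new type is used.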

e100 · Nov 20, 2014

Another user is having similar problems, only on his new hardware that also has Haswell-EP CPUs.

http://tracker.ceph.com/issues/10116

e100 · Nov 23, 2014

More information...

When the IO stalls it does not seem to be a problem with the guest OS, it seems to be an issue with KVM itself.

I tried to reset the VM by pressing reset in the GUI.
VM did not reset.

Then I entered the monitor tab and entered 'help', the response is:

Code:

Type 'help' for help.
# help
ERROR: VM 107 qmp command 'human-monitor-command' failed - unable to connect to VM 107 socket - timeout after 31 retries

Only way to recover is to stop/start the VM.

spirit · Nov 23, 2014

Have you tried kernel 3.10?

e100 · Nov 24, 2014

I have not yet tried the 3.10 kernel. If you think it might help I can surely give it a try.

spirit · Nov 24, 2014

Yes, I think it could help. I know that the kvm module in 3.10 has some CPU filtering bugs fixed.
(I have seen that mainly with live migration between old and new Xeons.)

So, try it to compare, maybe it'll work. (Don't have haswell-ep yet to test on my side)

e100 · Dec 3, 2014

So far using the 3.10 kernel seems to have resolved the problem.

e100 · Dec 11, 2014

Spoke too soon.

This problem still occurs on 3.10 but with much less frequency.

KVM itself is hanging, not the guest.
Monitor does not work, cannot perform a reset.
To recover I have to stop, then start the VM.

Is there anything I can do to help track down the source of this problem?
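One thing that might help narrow this down is recording the state of the hung KVM process on the host before stopping it. A minimal sketch, assuming the pid file lives at /var/run/qemu-server/107.pid (an assumption about this install, not something shown in the thread); the helper name `proc_state` is made up for illustration.

```shell
# Print the one-letter process state (R, S, D, ...) from
# /proc/<pid>/status text read on stdin.
proc_state() {
    awk '$1 == "State:" { print $2; exit }'
}

# Example use on the host, as root (not run here):
#   pid=$(cat /var/run/qemu-server/107.pid)
#   proc_state < /proc/$pid/status
#   cat /proc/$pid/stack    # kernel stack, if the kernel exposes it
```

A state of D (uninterruptible sleep) would point at the process being stuck inside the kernel, and the stack output may show where. QEMU runs several threads, so the same files under /proc/$pid/task/ are worth checking too.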

RONIS · Dec 11, 2014

With Intel Xeon E5 2620 v2 we got the same issues with Linux guests.

e100 · Dec 12, 2014

Humm, that is interesting.
We have at least four servers running Xeon E5 v2 CPUs and have not seen this problem with those.

I believe that whatever the problem is, it's in KVM itself. A race condition of some sort, so I am not shocked to see it happen on other CPUs.

spirit · Dec 12, 2014

Hi,

Can you try disabling APICv?

I have seen bug reports about it recently (including on the RHEL 7 3.10 kernel) with the latest Xeon processors.

# modprobe kvm_intel enable_apicv=N
Then run cat /sys/module/kvm_intel/parameters/enable_apicv to verify.
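A note on applying this: module parameters given to modprobe only take effect when the module is actually loaded, so kvm_intel has to be unloaded first, which means no VMs can be running on the node. The sketch below is just an illustration; the helper name `apicv_enabled` and the file name kvm-intel.conf are my own choices.

```shell
# Succeed if the given kvm_intel parameter file reports APICv as enabled.
# Boolean module parameters read back as "Y" or "N".
apicv_enabled() {
    [ "$(cat "$1" 2>/dev/null)" = "Y" ]
}

# Example use on the host, as root, with all VMs stopped (not run here):
#   modprobe -r kvm_intel
#   modprobe kvm_intel enable_apicv=N
#   apicv_enabled /sys/module/kvm_intel/parameters/enable_apicv \
#       && echo "APICv still on" || echo "APICv off"
# To keep the setting across reboots:
#   echo "options kvm_intel enable_apicv=N" > /etc/modprobe.d/kvm-intel.conf
```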

e100 · Dec 13, 2014

Sure, I will try turning off apicv.

I've been playing with various IO options with my new SSDs. When I was testing iothreads, if I set cache=directsync I experienced IO stalls. Nearly all of my VMs use directsync. Most likely not related to the issue here, but I have set some of my VMs to writethrough to see if it makes a difference.

I've also been having issues with DRBD on 3.10. It seems the IO scheduler is working very differently, resulting in timeouts that cause DRBD to disconnect.
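For anyone wanting to try the same cache change, here is a rough sketch of rewriting the cache option in a drive string like the virtio0 line from the 107.conf posted at the start of the thread. The helper name `set_cache` is hypothetical, and a drive string without a cache= option is returned unchanged.

```shell
# Replace the cache=... option in a Proxmox drive string.
#   $1 = drive string, $2 = new cache mode
set_cache() {
    printf '%s\n' "$1" | sed "s/cache=[a-z]*/cache=$2/"
}

# Example use on the host (not run here):
#   qm set 107 -virtio0 "$(set_cache 'vm9-vm10:vm-107-disk-1,cache=directsync,size=3G' writethrough)"
```

The VM would need a stop/start afterwards for the new cache mode to apply.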

e100 · Dec 25, 2014

spirit,

Turning off APICv does not resolve the problem.
Any other suggestions? I am completely out of ideas on what might resolve this.

spirit · Dec 25, 2014

e100 said:
spirit,

Turning off APICv does not resolve the problem.
Any other suggestions? I am completely out of ideas on what might resolve this.

I have built a new kernel based on the upcoming RHEL 7.1 beta kernel.

The .deb packages are here:

http://odisoweb1.odiso.net/kernel/

Maybe it'll help you?

Merry Xmas

e100 · Jan 7, 2015

Hi spirit,

I installed the kernel you provided yesterday.
So far no VM lock-ups, but it's not been long enough to conclude the issue is resolved.

But I am concerned that the kernel is spitting out lots of warnings like this:

Code:

[   42.594700] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[   42.597066] ib0: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[   42.597528] ib1: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[   42.599691] ib1: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[   42.603214] ib1: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL
[   42.604830] ib1: can't use GFP_NOIO for QPs on device mthca0, using GFP_KERNEL

Those repeat every minute or so.

Seems related to this patch:
http://permalink.gmane.org/gmane.linux.drivers.rdma/20239

I'm no kernel hacker so this is a bit above me, but it appears that the hardware drivers also need to be patched to work with the above ipoib patch:

> > mthca are similar to mlx4 and qib does vmalloc() in qib_create_qp()).
> > So this patch needs to be extended to the other 4 IB device drivers in
> > the tree.

http://lkml.org/lkml/2014/4/24/543

My IB cards use the mthca driver

Code:

[    8.190786] ib_mthca: Mellanox InfiniBand HCA driver v1.0 (April 4, 2008)

The changes are meant to prevent a deadlock. I have been having some issues with DRBD timing out under load on machines running the 3.10 kernel, and I wonder if this is related.

e100 · Jan 7, 2015

Shortly after posting this a VM locked up, so this new kernel does not resolve the problem.

wahmed · Jan 7, 2015

e100 said:
Right after posting my message a colleague sent me this screenshot.
As you can see, many tasks are stalled. There are no other messages before these; it goes from working fine to spitting out these errors with stalled IO.

View attachment 2394

I frequently had hung_task errors on one of my E5-2620 v2 servers. My problem came from tinkering with Infiniband. In my case the entire node would lock up and the only way to clear it was a hard reboot. I removed the additional IB drivers I had installed and have not seen this error for about 2 months now. I am not on the new Kernel 3.
