> 3000 mSec Ping and packet drops with VirtIO under load

As I mentioned in my previous post, the IDE bus for the hard disk works for me, although I had VirtIO SCSI as the controller.

I'm currently running with the same workaround: IDE bus for the hard disk and VirtIO SCSI as the controller.
 
Hi,
could you try disabling transparent hugepages on the host?
It was disabled in Proxmox 4 before January 2017 (on a 4.4.x kernel, I don't remember exactly which version), and it is now set to madvise by default.

Code:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

(you need to stop/start the vm after this change)
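
If it helps and you want the setting to survive a host reboot, one way is to put it on the kernel command line (a sketch only; keep whatever options you already have in /etc/default/grub):

Code:
# /etc/default/grub -- append transparent_hugepage=never to the existing options
GRUB_CMDLINE_LINUX_DEFAULT="quiet transparent_hugepage=never"
# regenerate the boot config, then reboot the host
update-grub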
 
Hi aderumier,

both options were set to "madvise" on my system as well (whatever that means):

Code:
#cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never
#cat /sys/kernel/mm/transparent_hugepage/defrag
always defer [madvise] never

I stopped my machine, set both options to never and changed my disks back to SCSI (they were on IDE before). But the Windows guest fails with "no boot device" after already showing the spinning wheel on the blue background and booting for a while.
I even restored the VM to an earlier state, but I can't get it to boot anymore. It fails with a blue screen, collects a memory dump, and reboots again.
Luckily I did this on my test system, so the production system is still on IDE and has been running fine since the switch.
I can't tell if switching my disks from SCSI to IDE and back again did any harm. But for me it is not working right now.
 
You can easily go back to the old values by:
Code:
echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
echo madvise > /sys/kernel/mm/transparent_hugepage/defrag
Or simply reboot your host.

My problem is solved with a patch from @wbumiller. I asked for an updated pve-qemu-kvm package for you to test. The fix applies to virtio, so maybe it also solves your problem.
 
Both options were set to "madvise" on my system as well ... I stopped my machine, set both options to never and changed my disks back to SCSI (they were on IDE before). But the Windows guest fails with "no boot device" ... I can't tell if switching my disks from SCSI to IDE and back again did any harm. But for me it is not working right now.

It's 100% unrelated. Note that if you change a disk from ide -> scsi or scsi -> ide, you also need to change the boot drive in the VM options.
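
For example, roughly from the CLI (assuming VM ID 100 and that the disk is now scsi0; adjust to your setup):

Code:
# check which devices the disk and boot drive currently point at
qm config 100 | grep -E '^(ide|scsi|bootdisk)'
# point the boot drive at the new device
qm set 100 --bootdisk scsi0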
 
It's 100% unrelated. Note that if you change a disk from ide -> scsi or scsi -> ide, you also need to change the boot drive in the VM options.
Do you mean the transparent_hugepage setting is unrelated to the issue in general, or to the problem with my VM that doesn't boot anymore?
I can't rule out the second possibility; I haven't had the time yet to investigate any further.
 
Do you mean the transparent_hugepage setting is unrelated to the issue in general, or to the problem with my VM that doesn't boot anymore?
I can't rule out the second possibility; I haven't had the time yet to investigate any further.
I mean that transparent hugepages can't impact boot (maybe Windows doesn't like the switch between IDE and SCSI, I really don't know).
Transparent hugepages can only impact performance.

BTW, I have built the latest pve-qemu-kvm with the patch for @hansm's bug (which is virtio related, so maybe it could improve performance too).

http://odisoweb1.odiso.net/pve-qemu-kvm_2.9.1-1_amd64.deb
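
If anyone wants to try it, the package can be installed manually; a VM only picks up the new binary after a full stop/start of its QEMU process. Roughly, assuming VM ID 100:

Code:
wget http://odisoweb1.odiso.net/pve-qemu-kvm_2.9.1-1_amd64.deb
dpkg -i pve-qemu-kvm_2.9.1-1_amd64.deb
# stop/start so the VM runs on the newly installed binary
qm stop 100 && qm start 100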
 
I have a question: is anyone seeing I/O issues when using LXC?

If not, we'll migrate some systems. We are noticing disk slowdown as time goes on, testing with the xfce disk utility as per the forum suggestion.
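
For a more repeatable number than the GUI tool, something like an fio run inside the container could be used to compare over time (a sketch only; fio has to be installed, and the size/runtime values here are arbitrary):

Code:
# run inside the container; writes a 1G test file in the current directory
fio --name=ct-test --rw=randwrite --bs=4k --size=1G \
    --ioengine=libaio --direct=1 --runtime=60 --time_based
# clean up the test file afterwards
rm -f ct-test*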
 
I mean that transparent hugepages can't impact boot (maybe Windows doesn't like the switch between IDE and SCSI, I really don't know).
Transparent hugepages can only impact performance.

BTW, I have built the latest pve-qemu-kvm with the patch for @hansm's bug (which is virtio related, so maybe it could improve performance too).

http://odisoweb1.odiso.net/pve-qemu-kvm_2.9.1-1_amd64.deb

I just upgraded the PVE node and this was installed, then I restarted all KVMs:
Code:
pve-qemu-kvm (2.9.1-1)

Have you tested pve-qemu-kvm (2.9.1-1)?
 
I just upgraded the PVE node and this was installed, then I restarted all KVMs:
Code:
pve-qemu-kvm (2.9.1-1)

Have you tested pve-qemu-kvm (2.9.1-1)?

I can't reproduce the problem from this thread, so I'm currently flying blind.

Note that my pve-qemu-kvm package 2.9.1-1 is not the same as the 2.9.1-1 from the Proxmox repo (I added the patch but did not change the version number).
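
Since the version string is identical, a rough way to check that a VM really runs the patched build is to compare the package install time with the start time of the VM's kvm process (the process must have been started after the install). A sketch, assuming VM ID 100:

Code:
# when was pve-qemu-kvm last unpacked/installed?
grep pve-qemu-kvm /var/log/dpkg.log | tail -n 3
# when was the VM's kvm process started? (Proxmox keeps the pid file here)
ps -o lstart= -p "$(cat /var/run/qemu-server/100.pid)"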
 
This thread is very interesting to me.

I have a moderately used 5 node cluster (PVE version 5.0-31 on all nodes) using OVS with 10Gig ethernet to a couple of NFS shared storage hosts.

All my VMs (~25) are configured with virtio disks using the 'virtio scsi single' controller type, and all the vNICs are virtio with multiqueue set equal to the number of vCPUs.

I see none of the symptoms mentioned in this thread.

Attached is a snapshot of my busiest host. It shows no IO delay, even though there is a lot of IO on that host.

What makes my setup any different from the people that are having issues?
 

Attachments

  • Screenshot from 2017-09-20 15-13-17.png
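
For anyone who wants to compare, the relevant parts of such a setup in /etc/pve/qemu-server/<vmid>.conf look roughly like this (storage name, MAC address and sizes are placeholders):

Code:
cores: 4
# one virtio-scsi controller per disk
scsihw: virtio-scsi-single
scsi0: nfs-storage:vm-100-disk-1,size=32G
# virtio nic with multiqueue set to the number of vCPUs
net0: virtio=DE:AD:BE:EF:00:01,bridge=vmbr0,queues=4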
What makes my setup any different from the people that are having issues?
Interesting, in fact. One difference I see now is that you are using "VirtIO SCSI single", which I haven't even tried; I was simply using "VirtIO SCSI".
Can you tell me the difference? Is it one single port then?
 
Interesting, in fact. One difference I see now is that you are using "VirtIO SCSI single", which I haven't even tried; I was simply using "VirtIO SCSI".
Can you tell me the difference? Is it one single port then?

virtio-scsi single is one virtio-scsi PCI controller per SCSI disk;
a normal virtio-scsi controller is one virtio-scsi controller for 7 disks.

The main use of virtio-scsi single is to enable iothread, as iothread is enabled on the controller (so you get one iothread per disk/controller).
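
As a concrete example (VM ID and volume name are placeholders; keep your existing disk options), switching an existing SCSI disk to the single controller with an iothread could look like:

Code:
# one virtio-scsi controller per disk
qm set 100 --scsihw virtio-scsi-single
# re-declare the disk with iothread enabled
qm set 100 --scsi0 local-lvm:vm-100-disk-1,iothread=1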
 
