High Hard Drive Usage/Low RW Performance in VM Guest Win10

Discussion in 'Proxmox VE: Installation and configuration' started by TheFunk, Jun 7, 2017.

  1. TheFunk

    TheFunk Member

    Joined:
    Oct 25, 2016
    Messages:
    35
    Likes Received:
    3
    Hi all!

    I recently got GPU passthrough working on one of my VMs and I'm delighted to say that everything appears to be working as it should, with one minor exception. I'm noticing some serious slowness in the IO/hard disk department. Sometimes the guest machine appears to be fine. Other times, task manager reports that the disk is at 100% utilization and the IO plummets.

    I'm using FreeNAS as my storage solution, essentially a stripe of mirrors across 12 matching disks. Even though it's just software RAID, I'd expect some very decent numbers with that kind of working room. I should note that over an SMB share I can transfer into my FreeNAS box from my laptop at a rock-solid ~100MBps.

    The NAS is available to the Proxmox node via NFS.

    I haven't enabled jumbo frames on my network. Should I?

    I've heard of "hugepages" mentioned before on other forums when this topic has arisen. Can someone give me the 411 on what those are and whether or not I should enable them?

    The host is running Proxmox 5.0 Beta 2 with the latest updates.

    The Win 10 VM specs are as follows with a little commentary:

    8 CPU cores, 1 socket
    8GB RAM (I've heard some people say to try using less RAM, but that kinda defeats the purpose of the machine in my case)
    virtio balloon driver in use (I've heard this sometimes causes problems too.)
    virtio driver for disk
    raw disk image w/ write through cache
    1 passed through GPU
    1 passed through USB port

    Help y'all! I want this baby to purr!
     
  2. t.lamprecht

    t.lamprecht Proxmox Staff Member

    Joined:
    Jul 28, 2015
    Messages:
    1,138
    Likes Received:
    148
    If your switch supports it, it shouldn't be too hard to set up and could be worth a try.
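    For reference, a sketch of what enabling jumbo frames on the Proxmox side could look like. The interface name and address below are hypothetical; adjust them to your node, and note that every hop (node NIC, switch ports, FreeNAS interface) must use the same MTU or it won't help:

```shell
# /etc/network/interfaces fragment -- 'eno1' and the address are
# placeholders, adjust to your setup. Both the physical NIC and the
# bridge need the larger MTU.
auto eno1
iface eno1 inet manual
        mtu 9000

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.10
        netmask 255.255.255.0
        bridge_ports eno1
        bridge_stp off
        bridge_fd 0
        mtu 9000
```

    You can then verify the whole path with `ping -M do -s 8972 <freenas-ip>` (9000 minus 28 bytes of IP/ICMP headers); if that fails to go through unfragmented, something on the path is still at 1500.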

    Memory is handed out to programs in chunks (pages) for efficiency. Those chunks are normally 4K big. As modern systems have far more RAM available, and programs use far more of it, 4K has become a bit too small. Modern hardware and kernels support something called Huge Pages (or Large Pages), which use bigger chunks; this makes looking up an exact memory address faster and saves some overhead, as there are fewer page table entries. x86_64 can use 2MB pages (checkable with: `lscpu | grep -ow pse`) or 1GB pages (`lscpu | grep -ow pdpe1gb`).
    This should not directly impact IO, but rather memory usage and a bit of memory speed (maybe @spirit can chime in here, he uses it quite a lot AFAIK). It's mostly a concern if your VM has quite a bit of RAM assigned.
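    A quick sketch of how to check what the host supports and how many 2MB pages a guest of a given size would need (nothing Proxmox-specific here):

```shell
# Check CPU support for larger page sizes (the lscpu flags mentioned above)
lscpu | grep -qw pse     && echo "2MB pages supported"
lscpu | grep -qw pdpe1gb && echo "1GB pages supported"

# Current hugepage state on this host
grep Huge /proc/meminfo

# How many 2MB pages a 12GB guest would need
VM_RAM_MB=12288
PAGES=$((VM_RAM_MB / 2))
echo "would need $PAGES x 2MB hugepages"   # 6144 pages = 12GB

# To actually reserve them (root only; best done early after boot,
# before host memory fragments):
#   sysctl vm.nr_hugepages=$PAGES
```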

    Can you also post your VM config:
    Code:
    qm config VMID
    Also, doing an IO test from the PVE node to the FreeNAS would help to see what performance we can expect inside the VM.
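    Something like the following, run on the PVE node, would give a rough sequential baseline. This is a sketch: `TARGET_DIR` defaults to `/tmp` here, so point it at the directory where your NFS storage is mounted (typically somewhere under `/mnt/pve/`, an assumption based on PVE's default NFS mount path), and bump the size to 1GiB+ for a real test so caching effects wash out:

```shell
# Directory to test -- override with the NFS mount, e.g.
#   TARGET_DIR=/mnt/pve/<storage-name>
TARGET_DIR="${TARGET_DIR:-/tmp}"
TESTFILE="$TARGET_DIR/pve-iotest.bin"

# Sequential write: 64 MiB, fdatasync so the reported rate includes the
# flush to the backing storage, not just the host page cache
dd if=/dev/zero of="$TESTFILE" bs=1M count=64 conv=fdatasync

# Record the size, then read it back sequentially
WRITTEN=$(stat -c %s "$TESTFILE")
dd if="$TESTFILE" of=/dev/null bs=1M

rm -f "$TESTFILE"
```

    For random IO, which is closer to what a VM disk actually does, `fio` with a small-block randwrite job is more telling than `dd`, if you have it installed.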
     
  3. fortechitsolutions

    Joined:
    Jun 4, 2008
    Messages:
    326
    Likes Received:
    12
    Hi, small fun question: are you open to the idea of spinning up another VM, possibly Linux of some kind, which is otherwise similar to the Windows host (i.e. same resource allocation, same underlying VM storage, etc.)? Reason I ask is that in my experience, Win10 as an OS is sometimes inconsistent as hell in terms of what it is doing. Speaking of client laptops/workstations, for example: when it decides it's time to serve patches to a nearby client via peer-to-peer Windows Update delivery, or time to run a full disk index search or similar 'background' tasks, I can randomly see the NIC getting saturated, or disk IO getting pinned at 100%, or both, more or less without any regard for what the human user may (or may not) be doing on the system. So from my perspective, Win10 is a fairly non-ideal platform as a VM guest compared to, say, Win7Pro or even Win8Pro, which had less of this kind of random bad-house-guest behaviour.

    i.e., possibly your issue is to be debugged in the guest acting weird, rather than in the Proxmox layer. Hence having a separate VM you can spin up and then stress for CPU/NIC/disk in a controlled manner might get you a sensible reference baseline. Or maybe throw on a temporary Win7Pro VM and see how it behaves, etc.

    Just 2 cents worth ..

    Tim
     
  4. TheFunk

    TheFunk Member

    Thanks all!

    Here's my config file

    Code:
    agent: 1
    bios: ovmf
    bootdisk: scsi0
    cores: 4
    cpu: host
    efidisk0: local-lvm:vm-102-disk-2,size=128K
    hostpci0: 81:00.0,romfile=sapphire.rom,x-vga=on
    keyboard: en-us
    machine: q35
    memory: 12288
    name: Aries
    net0: virtio=32:03:97:11:69:4A,bridge=vmbr0
    numa: 0
    ostype: win10
    scsi0: local-lvm:vm-102-disk-1,backup=0,cache=writethrough,discard=on,iothread=1,size=65G
    scsi1: VMDS:102/vm-102-disk-1.raw,backup=0,cache=writethrough,discard=on,iothread=1,size=500G
    scsihw: virtio-scsi-pci
    smbios1: uuid=a2bee0e0-3f1b-4266-a968-21bd01ac0840
    sockets: 2
    usb0: host=3-1
    
    I spun up a second Win10 VM, this time using local storage on the node (Samsung 840 Evo), and I still noticed the atrocious read/write speeds. Interestingly, I didn't see this until after I passed through the GPU. Maybe I didn't look for long enough beforehand, but what I noticed was that before passing it through, my read/write seemed close to what I'd normally expect from this drive, particularly during the OS install.

    Anyway, since this test was fully local (I only added the second 500GB NFS-backed disk today), we can eliminate the NAS as a source of slowness. The issue is with either the host or the guest.

    I will check the phantom update service @fortechitsolutions Thanks for the reminder! I don't know if I disabled that on install or not. I also have an 8.1 guest on this server that I'll test out shortly to see if I notice any improvement there.

    The 12GB of RAM leads me to believe that I should be using hugepages for sure! @spirit, any advice?
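    For what it's worth, newer qemu-server versions have a per-VM hugepages option. A sketch of what trying it might look like, assuming your PVE version already ships the option and that the host pages get reserved first:

```shell
# Reserve 6144 x 2MB pages (= 12GB, matching memory: 12288) on the host.
# Root only; may fail if host RAM is already fragmented.
sysctl vm.nr_hugepages=6144

# Ask PVE to back VM 102 with 2MB hugepages
# ('hugepages' takes the page size in MB: 2 or 1024)
qm set 102 -hugepages 2
```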

    The oddest part is that the disk usage is showing at 100% while simultaneously showing extremely low r/w. So it'll say 100% but be sitting at something like 50KBps write speed.

    Thank you @t.lamprecht !
     
  5. TheFunk

    TheFunk Member

    Just an update,

    I can confirm that I only have this issue on VMs with a passed-through GPU. I can also confirm that the brand of GPU (AMD or NVIDIA) is a non-factor. I was also having issues with VMs not shutting down when passing through a GPU. I've since discovered a post over at the Unraid forums that suggested enabling message-signaled interrupts (MSI) for the GPU in Device Manager on my guest. Doing so fixed my issue with being unable to reboot/shut down the VM from within the guest OS. Could something like this also be causing the disk usage issue I'm seeing? It looks like the virtio disks use this form of interrupt by default.

    In other forums I'd read that on Windows 10 you should use regular (line-based) IRQs for disk drives to prevent this issue, yet the virtio disks use message-signaled interrupts by default. Could I change this setting to see if it helps? Would it break the virtio disk implementation?
     