Windows guest: high context switch rate when idle

Discussion in 'Proxmox VE 1.x: Installation and configuration' started by escoreal, Feb 17, 2011.

  1. escoreal

    escoreal Member
    Proxmox Subscriber

    Joined:
    Dec 22, 2010
    Messages:
    78
    Likes Received:
    0
    Hello,

    I noticed that running a Windows guest causes a high context switch rate on the host.

    For example, running one idle Windows guest (2003 R2 x86) with one vCPU gives this dstat output on the host:
    Code:
    ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
    usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
      0   0  99   0   0   0|   0     0 | 434B  704B|   0     0 |2251  6949
    
    Running one idle Linux guest (Ubuntu 10.04):
    Code:
    ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
    usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
      0   0 100   0   0   0|   0     0 |3196B 1925B|   0     0 | 119   213
    
    Without any VMs:
    Code:
    ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
    usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
      0   0 100   0   0   0|   0     0 |2185B  344B|   0     0 |  40    20
    
    The USB tablet device is switched off (tablet: no); enabling it would cause even more context switches.
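
    For anyone who wants to reproduce the csw column without dstat, here is a minimal sketch in Python. It assumes a Linux host where /proc/stat exposes a cumulative "ctxt" counter; the function names are my own and not part of any Proxmox tool.

```python
import time


def parse_ctxt(stat_text: str) -> int:
    """Return the cumulative context switch count from /proc/stat content."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def csw_rate(before: int, after: int, seconds: float) -> float:
    """Context switches per second between two counter readings."""
    return (after - before) / seconds


def sample_host(interval: float = 1.0) -> float:
    """Read /proc/stat twice, `interval` seconds apart (Linux only)."""
    with open("/proc/stat") as f:
        first = parse_ctxt(f.read())
    time.sleep(interval)
    with open("/proc/stat") as f:
        second = parse_ctxt(f.read())
    return csw_rate(first, second, interval)
```

    Calling sample_host() on the host with and without the Windows guest running should give numbers comparable to the csw column above.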

    PVE Version:
    Code:
    pve-manager: 1.7-11 (pve-manager/1.7/5470)
    running kernel: 2.6.32-4-pve
    proxmox-ve-2.6.32: 1.7-30
    pve-kernel-2.6.32-4-pve: 2.6.32-30
    qemu-server: 1.1-28
    pve-firmware: 1.0-10
    libpve-storage-perl: 1.0-16
    vncterm: 0.9-2
    vzctl: 3.0.24-1pve4
    vzdump: 1.2-10
    vzprocps: 2.0.11-1dso2
    vzquota: 3.0.11-1
    pve-qemu-kvm: 0.13.0-3
    ksm-control-daemon: 1.0-4
    
    With kernel 2.6.35 it gets worse (>10k csw).

    The server is a Dell R510 with 2 x Intel Xeon E5620.
    pveperf:
    Code:
    CPU BOGOMIPS:      76804.28
    REGEX/SECOND:      982826
    HD SIZE:           11.00 GB (/dev/sda3)
    BUFFERED READS:    138.16 MB/sec
    AVERAGE SEEK TIME: 6.84 ms
    FSYNCS/SECOND:     2873.95
    DNS EXT:           1047.66 ms
    DNS INT:           0.98 ms (local)
    
    Example VM config:
    Code:
    name: test
    ide2: local:iso/virtio-win-1.1.16.iso,media=cdrom
    vlan0: virtio=4E:42:CF:B7:A3:73
    bootdisk: virtio0
    virtio0: pve_sdc:vm-10106-disk-1
    ostype: w2k3
    memory: 1000
    sockets: 1
    onboot: 0
    cores: 1
    description: test win2003<br>
    boot: c
    freeze: 0
    cpuunits: 1000
    acpi: 1
    kvm: 1
    
    Does anyone have an idea how this could be resolved, or why it gets even worse on 2.6.35?

    esco

    *edit*

    Here are the different strace results for the guests.
    Windows guest (4 minutes):
    Code:
    #strace -c -p 4013 -p 4020 -p 4022 -p 4024 -p 4025 -p 4026 -p 4027 -p 5307
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     98.43  251.413048         212   1185750    133008 futex
      1.33    3.385304           3   1316741           ioctl
      0.11    0.292721           0    743048           select
      0.09    0.228015       38003         6         5 restart_syscall
      0.03    0.080217         170       473           pwrite
      0.01    0.015842           0   1116955    372610 read
      0.00    0.003526           0    371139           write
      0.00    0.002749           0    371114           rt_sigaction
      0.00    0.002271           0    562989           timer_gettime
      0.00    0.002120           0    372044           timer_settime
      0.00    0.000013           0       578           kill
      0.00    0.000000           0        59           pread
      0.00    0.000000           0       149           writev
      0.00    0.000000           0        10           madvise
      0.00    0.000000           0        46           fdatasync
      0.00    0.000000           0         1           rt_sigpending
      0.00    0.000000           0         1         1 rt_sigtimedwait
    ------ ----------- ----------- --------- --------- ----------------
    100.00  255.425826               6041103    505624 total
    
    Linux guest (4 minutes):
    Code:
    #strace -c -p 3738 -p 3745
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     61.28    0.388065         210      1845           ioctl
     38.54    0.244037           7     35146           select
      0.08    0.000524           0     53749     18439 read
      0.04    0.000250           0     16719           write
      0.03    0.000187           0     16719           rt_sigaction
      0.01    0.000090           0     25118           timer_gettime
      0.01    0.000080           0     16720           timer_settime
      0.00    0.000000           0         4           rt_sigprocmask
      0.00    0.000000           0         2           clone
      0.00    0.000000           0         1           rt_sigpending
      0.00    0.000000           0         1         1 rt_sigtimedwait
      0.00    0.000000           0        16           futex
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.633233                166040     18440 total
    
    The Windows guest makes a lot of "futex" calls and has more threads (8 for Windows, 2 for Linux). If anyone needs more information about this problem, don't hesitate to ask!
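
    To compare the two summaries as rates, a rough sketch like the following could be used. It assumes the `strace -c` table layout shown above and the 4 minute (240 s) sampling window; the function names are only illustrative.

```python
from __future__ import annotations


def parse_strace_summary(text: str) -> dict[str, int]:
    """Map syscall name -> call count from the rows of an `strace -c` summary."""
    calls = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows look like: %time seconds usecs/call calls [errors] syscall
        if len(fields) not in (5, 6) or fields[-1] == "total":
            continue
        try:
            float(fields[0])          # skips the dashed separator rows
            count = int(fields[3])
        except ValueError:
            continue
        calls[fields[-1]] = count
    return calls


def calls_per_second(count: int, seconds: float) -> float:
    """Average number of calls per second over the sampling window."""
    return count / seconds
```

    With the Windows summary above this gives about 4940 futex calls per second (1185750 / 240), versus 16 futex calls in total for the Linux guest.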

    esco
     
    #1 escoreal, Feb 17, 2011
    Last edited: Feb 18, 2011
  2. udo

    udo Well-Known Member
    Proxmox Subscriber

    Joined:
    Apr 22, 2009
    Messages:
    5,807
    Likes Received:
    158
    Hi,
    I just looked at one host (2.6.35): with four not entirely idle Windows clients (and many more Linux ones) I have a csw value of around 70k (50-90k).
    But the responsiveness of the VMs is not bad. Do I need to worry about this?

    Udo
     
  3. escoreal

    Hi Udo,

    If you have only a few Windows guests on an oversized host, this is not a problem.

    For example, I have a less powerful server with Windows guests that has an idle CPU usage of about 30%, which affects the overall performance and response time of the system.

    And it doesn't help that a kernel update makes this behavior worse.

    esco
     
  4. escoreal

    Hi,

    I ran some further tests.

    Monitoring "Context Switches" inside the Windows guest with the "System Monitor" while the system is idle shows an average of about 240 (minimum 191, maximum 429).

    But I found out that after stopping the services "SQL Server (ESSENTIALS)" and "SQL Server Reporting Services (ESSENTIALS)" on the guest, the context switch rate on the host drops to <1k. Inside the guest this makes no difference to the rate.

    esco

    *edit*
    On another Windows guest the same high context switch rate is caused by a Java process running WebSphere. Inside the guest the rate is low, but on the host it causes high idle CPU usage. The strace output shows a lot of "futex" calls, too.
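
    To see which threads of a kvm process account for the switches on the host, something like this sketch might help. It assumes a Linux host where /proc/<pid>/task/<tid>/status contains the voluntary_ctxt_switches and nonvoluntary_ctxt_switches fields; the helper names are made up for illustration.

```python
from __future__ import annotations

from pathlib import Path


def parse_ctxt_switches(status_text: str) -> tuple[int, int]:
    """Return (voluntary, nonvoluntary) switch counts from a status file."""
    voluntary = nonvoluntary = 0
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key == "voluntary_ctxt_switches":
            voluntary = int(value)
        elif key == "nonvoluntary_ctxt_switches":
            nonvoluntary = int(value)
    return voluntary, nonvoluntary


def thread_switches(pid: int) -> dict[int, tuple[int, int]]:
    """Map thread id -> (voluntary, nonvoluntary) for all threads of `pid`."""
    result = {}
    for task in Path(f"/proc/{pid}/task").iterdir():
        result[int(task.name)] = parse_ctxt_switches((task / "status").read_text())
    return result
```

    Calling thread_switches() twice a few seconds apart with the PID of the VM's kvm process and comparing the counters would give a per thread rate.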

    esco
     
    #4 escoreal, Feb 18, 2011
    Last edited: Feb 25, 2011
  5. spirit

    spirit Well-Known Member

    Joined:
    Apr 2, 2010
    Messages:
    3,243
    Likes Received:
    121
    Hi, I don't know if it's related, but with all my Windows VMs (2003 or 2008), when they are idle, I see 1% CPU usage per core in the Proxmox VM list.

    So a VM with 24 cores shows 24% CPU usage in the Proxmox stats when idle; with 4 cores, 4%.

    I'm using a 48-core AMD host. I ran top on the host, and for the 24-core VM the kvm process used around 800% CPU.
     
  6. spirit

    I rechecked my Windows CPU stats.

    In fact, the only VMs with the idle CPU problem are all SQL Server machines (SQL 2005 or SQL 2008).
     
  7. spirit

    I think this is related:


    http://www.virtualbox.org/ticket/3613

    "
    - MS SQL server 2008 (express or not),
    - SQL server 2005 SP3
    - TwinCat
    - Google Chrome
    - Or possibly another application which uses 70h interrupts for timing

    virtual box uses ~60-70% of single CPU, while guest OS is almost idle.

    What was expected to happen: Host should be almost idle too.

    Upon further investigation, it appears that these programs use "Real-Time Clock Interrupt", which fires 1024 times per second. "
     
  8. escoreal

    Hi spirit,

    thanks for the further tests and information.

    As a first workaround I reduced the number of cores of the affected Windows guests.

    From the guest it looks like these interrupts (1024 per second) are fired per core and cause the context switches on the host.
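
    As a back of the envelope check of that guess, assuming (unconfirmed) that each vCPU really receives its own 1024 Hz RTC interrupt stream, the expected interrupt load would scale like this. The function name is only for illustration.

```python
RTC_HZ = 1024  # RTC periodic interrupt rate quoted in the VirtualBox ticket


def rtc_interrupts_per_second(vcpus: int, rtc_hz: int = RTC_HZ) -> int:
    """Estimated timer interrupts per second if every vCPU gets its own ticks."""
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    return vcpus * rtc_hz


# Core counts mentioned in this thread: 1, 4 and 24.
for vcpus in (1, 4, 24):
    print(vcpus, rtc_interrupts_per_second(vcpus))
```

    Under that assumption, going from 4 cores to 1 core would cut the estimate from 4096 to 1024 interrupts per second, which would be consistent with reducing cores as a workaround.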

    esco
     
    #8 escoreal, Feb 25, 2011
    Last edited: Feb 25, 2011
  9. spirit

    More info in French:

    http://rene.margar.fr/2010/02/reduire-la-consommation-cpu-dune-machine-virtuelle-sous-vmware/

    It seems to involve a cumulative update pack for SQL 2005 SP3 (CU7) and SQL 2008 R2 SP1 (CU ?):

    http://support.microsoft.com/kb/972767

    "The SQL Server database engine and SQL Server Reporting Services both use a shared component called SQLOS. SQLOS exposes an internal timer. When the internal timer is set to a 1ms granularity, more power consumption than desired may occur on Windows client computers."



    I'll try installing the latest CU for SQL 2008 R2 to see if the CPU usage changes.
     
  10. spirit

  11. spirit

  12. escoreal

    Thanks a lot for that, spirit!

    The first tests look very promising. The response time in particular is better now.

    And for the moment I will ignore the three guests with WebSphere, which have the same issue.

    esco
     
  13. cesarpk

    cesarpk Member

    Joined:
    Mar 31, 2012
    Messages:
    770
    Likes Received:
    2
    Hi spirit,

    I have Windows Server 2008 R2 and SQL Server 2008 (not R2) on PVE 2.3. In my case, would it be a good idea to add trace flag -T8038 to the SQL Server service? I ask because the link talks about Windows 2003 and not Windows 2008 R2 (please see the "scope" section of the link).

    Best regards
    Cesar
     
    #13 cesarpk, Jun 22, 2013
    Last edited: Jun 22, 2013
  14. spirit

    Yes,
    http://pve.proxmox.com/wiki/Performance_Tweaks#Trace_Flag_T8038

    Since SQL 2005 SP3, it uses a very fast timer, which sends a lot of clock interrupts (so it's not related to the Windows version).
     