Extremely high IOWAIT times... tracking it down to a VM/process?

oeginc

So I have a machine (quad core w/ 8 GB RAM, VMs on a RAID-10 of four 7200 RPM drives formatted as ext4) running Proxmox v1.5 and about 30 VMs.

Top shows:

Code:

top - 10:19:35 up 2 days, 1:31, 4 users, load average: 18.07, 18.91, 18.25
Tasks: 792 total, 4 running, 788 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.8%us, 2.6%sy, 0.1%ni, 42.3%id, 48.5%wa, 0.2%hi, 0.5%si, 0.0%st
Mem: 7634544k total, 6231280k used, 1403264k free, 341432k buffers
Swap: 7340024k total, 2804k used, 7337220k free, 3037444k cached

iostat -d -x -m shows:

Code:

Linux 2.6.18-2-pve (server01) 03/11/10 _x86_64_

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 42.70 549.77 181.54 133.10 6.55 2.68 60.04 12.94 41.13 2.76 86.82
sda1 42.70 549.77 181.54 133.10 6.55 2.68 60.04 12.94 41.13 2.76 86.82
sdb 0.12 3.03 0.33 0.73 0.01 0.01 40.65 0.00 4.10 1.52 0.16
sdb1 0.00 0.00 0.00 0.00 0.00 0.00 15.73 0.00 3.26 3.19 0.00
sdb2 0.11 3.03 0.33 0.73 0.01 0.01 40.67 0.00 4.10 1.52 0.16
sdc 15.70 1334.08 11.80 149.48 0.11 5.88 76.06 10.98 68.07 0.73 11.74
sdc1 15.70 1334.08 11.80 149.48 0.11 5.88 76.06 10.98 68.07 0.73 11.74
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 5.83 1.69 0.00
dm-1 0.00 0.00 0.43 3.76 0.01 0.01 10.28 0.10 22.71 0.38 0.16
dm-2 0.00 0.00 0.00 0.00 0.00 0.00 7.99 0.00 5.35 1.85 0.00
dm-3 0.00 0.00 27.94 1505.52 0.11 5.88 8.00 21.73 14.15 0.08 11.74
dm-4 0.00 0.00 224.70 685.65 6.55 2.68 20.75 0.95 27.49 0.95 86.82

ps ax | awk '$3 ~ /^D/ { print $0 }' shows:

Code:

4134 ? D 0:04 imap
6527 ? D< 10:02 [kjournald2]
9000 ? D 0:00 local -t unix
13131 ? D 0:00 smtp -t unix -u -c
14843 ? D 11:08 qmgr -l -t fifo -u
25269 ? D 0:00 cleanup -z -t unix -u -c
26776 ? D 0:00 cleanup -z -t unix -u -c
27364 ? D 0:00 cleanup -z -t unix -u -c
27458 ? D 0:19 /usr/bin/updatedb.mlocate
28977 ? D 0:00 smtp -t unix -u -c
29467 ? D 0:00 /usr/sbin/apache2 -k start
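
(D state is transient, so a one-shot ps can miss the worst offenders; a minimal sketch that samples once per second for a minute and tallies which commands block most often - the interval and count are arbitrary:)

Code:
# Sample D-state (uninterruptible sleep) processes once per second
# for 60 seconds, then count how often each command showed up.
for i in $(seq 1 60); do
    ps -eo state,comm | awk '$1 ~ /^D/ { print $2 }'
    sleep 1
done | sort | uniq -c | sort -rn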

So what this shows is that ~50% of my time is spent in IOWAIT, and my /dev/sda "drive" (which is the RAID-10 VM drive) is 86-100% utilized. The two processes that stand out as using the most time are kjournald2 and qmgr.
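
(Side note: the busy dm-N devices in the iostat output can be matched to LVM volumes via the device-mapper minor number, which is the N; either of the commands below shows the mapping.)

Code:
# dm-N corresponds to device-mapper minor number N; these list
# each mapping's name together with its (major, minor) numbers.
ls -l /dev/mapper/
dmsetup ls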

The question is, is there some way to track down WHICH VM these processes belong to? I am running several mail server VMs, and it'd be nice to know which qmgr is freaking out...

The other question is, is there a finer-grained way to track down which PROCESSES are using the most disk I/O per second?
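
(One finer-grained option, assuming the kernel has per-task I/O accounting - mainline 2.6.20+, so possibly not available on the 2.6.18-2-pve kernel above - is to read /proc/<pid>/io directly; a rough sketch:)

Code:
# Rank processes by write_bytes from /proc/<pid>/io.
# Requires CONFIG_TASK_IO_ACCOUNTING; PIDs without the file are skipped.
for p in /proc/[0-9]*; do
    [ -r "$p/io" ] || continue
    awk -v pid="${p#/proc/}" '
        /^read_bytes/  { r = $2 }
        /^write_bytes/ { w = $2 }
        END { print pid, r, w }' "$p/io"
done | sort -k3,3 -rn | head -20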
 
Hi,
this looks like a slow RAID controller or driver, or your hard disks are too slow.
What is the output of
Code:
pveperf /var/lib/vz
Udo

Well, that output would be invalid right now because the system is overloaded... ;)

Before we went live, I did do some testing and this is what I had.

Bonnie++ was reporting 130MB/sec sequential writes and 265MB/sec sequential reads
and
Dbench with 10 threads reported 207MB/sec throughput

Also, I know I need a faster RAID controller and more spindles in the array; what I was looking for is a way of tracking down which VM is responsible. That information could go a long way towards balancing out our clusters... CPU is hardly being utilized, but I/O is killing us.
 
Hi,
perhaps you can find the process with atop (apt-get install atop). It shows the disk reads and writes of each process.

Udo

# apt-get update
# apt-get install atop
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Couldn't find package atop

(default v1.5 installation)

# cat /etc/apt/sources.list

deb http://ftp.us.debian.org/debian lenny main contrib
deb http://download.proxmox.com/debian lenny pve
deb http://security.debian.org/ lenny/updates main contrib
 
I found it here:
http://packages.debian.org/lenny/amd64/atop/download

downloaded and installed it. I don't quite understand it yet, but this is what I get.

# atop -d

Code:
ATOP - server01           2010/03/11  18:35:39           208034 seconds elapsed
PRC | sys  92m14s | user  81m21s | #proc    776 | #zombie    0 | #exit      4 |
CPU | sys     11% | user     26% | irq       3% | idle    174% | wait    186% |
cpu | sys      3% | user      7% | irq       1% | idle     40% | cpu000 w 50% |
cpu | sys      2% | user      6% | irq       2% | idle     53% | cpu003 w 37% |
cpu | sys      3% | user      7% | irq       0% | idle     40% | cpu002 w 50% |
cpu | sys      3% | user      7% | irq       0% | idle     41% | cpu001 w 50% |
CPL | avg1   1.56 | avg5    1.60 | avg15   1.99 | csw 917573e3 | intr 45173e4 |
MEM | tot    7.3G | free  495.6M | cache   3.7G | buff  472.6M | slab  608.7M |
SWP | tot    7.0G | free    7.0G |              | vmcom   6.0G | vmlim  10.6G |
PAG | scan 5643e5 | stall      0 |              | swin     157 | swout    807 |
DSK |         sda | busy     86% | read 32993e3 | write 3069e4 | avio    2 ms |
DSK |         sdc | busy     10% | read 2105151 | write 2666e4 | avio    0 ms |
DSK |         sdb | busy      0% | read   66385 | write 442134 | avio    0 ms |
NET | transport   | tcpi 1614768 | tcpo 2374925 | udpi   26697 | udpo   26722 |
NET | network     | ipi 207029e3 | ipo 207749e3 | ipfrw 2053e5 | deliv 1643e3 |
NET | eth0     0% | pcki 38505e3 | pcko 49094e3 | si  226 Kbps | so 1366 Kbps |
NET | venet0 ---- | pcki 16877e4 | pcko 15901e4 | si 4069 Kbps | so 2974 Kbps |
NET | vmbr0  ---- | pcki 38419e3 | pcko 49103e3 | si  204 Kbps | so 1366 Kbps |
NET | lo     ---- | pcki   68012 | pcko   68012 | si    2 Kbps | so    2 Kbps |
                *** system and process activity since boot ***
  PID     RDDSK    WRDSK  WRDSK_CANCEL                       DSK CMD     1/36
    1    745.8G   824.1G        198.5M                       64% init
28713    200.9G   157.9G            0K                       15% bash
16982     13.3G   100.1G            0K                        5% dovecot
 8440     34.4G    46.9G        345.6M                        3% init
 6935     76.2G     392K         98.4M                        3% sshd
10373     28.5G    10.4G         4628K                        2% cron
14788      3.2G    32.1G        976.2M                        1% master
31713      7.2G    17.1G        785.0M                        1% mysqld
14843     16.3G     128K          2.1G                        1% qmgr
14528    10896K    15.8G            0K                        1% mailgraph
 6527        0K    15.4G            0K                        1% kjournald2
16812      1.7G    11.0G        18052K                        1% master
17292      1.3G     9.6G        719.6M                        0% master
 8901      1.7G     6.3G          4.3G                        0% cron
 6529        0K     7.7G            0K                        0% kjournald2
16931     6144K     6.3G            0K                        0% rsyslogd
13308    10460K     6.2G            0K                        0% rsyslogd
 8394      2.7G     2.7G          716K                        0% cron
 8748      1.3G     4.1G        25720K                        0% master
 4831      1.2G     3.3G        458.8M                        0% amavisd-new
28721      2.9G   242.0M          144K                        0% apache2
 8291      1.7G   692.8M         3840K                        0% init
So what does it mean? lol... It shows init as the process with the highest I/O...

But it still doesn't tell me WHICH VM those processes belong to...
 
I found it here:
http://packages.debian.org/lenny/amd64/atop/download

downloaded and installed it. I don't quite understand it yet, but this is what I get.
Hi,
perhaps you need an apt-get update?
I have installed the program on two Proxmox nodes via apt-get with no problem.
Use a big terminal window and run atop -l.
It looks like:
Code:
ATOP - proxmox1           2010/03/12  08:31:08               10 seconds elapsed
PRC | sys   2.86s | user   4.89s | #proc    222 | #zombie    0 | #exit     10 |
CPU | sys     24% | user     46% | irq       4% | idle    325% | wait      2% |
CPL | avg1   0.01 | avg5    0.08 | avg15   0.09 | csw   524897 | intr  279511 |
MEM | tot   15.4G | free   85.7M | cache   6.1G | buff    3.7G | slab  700.7M |
SWP | tot    4.0G | free    2.2G |              | vmcom  11.2G | vmlim  11.7G |
PAG | scan  41593 | stall      0 |              | swin       2 | swout      0 |
DSK |         sdb | busy      2% | read    1408 | write     30 | avio    0 ms |
DSK |         sda | busy      0% | read       4 | write      7 | avio    2 ms |
DSK |         sdf | busy      0% | read       1 | write      6 | avio    1 ms |
DSK |         sde | busy      0% | read       0 | write      3 | avio    3 ms |
DSK |         sdc | busy      0% | read       0 | write      2 | avio    0 ms |
NET | transport   | tcpi      45 | tcpo      38 | udpi       0 | udpo       0 |
NET | network     | ipi       89 | ipo       43 | ipfrw      0 | deliv     53 |
NET | vmtab1 1427 | pcki  117986 | pcko   10422 | si  142 Mbps | so  550 Kbps |
NET | eth0    14% | pcki   11339 | pcko  118799 | si  684 Kbps | so  142 Mbps |
NET | eth0.2  14% | pcki   11159 | pcko  118729 | si  528 Kbps | so  142 Mbps |

  PID  SYSCPU  USRCPU  VGROW  RGROW  RDDSK  WRDSK  ST EXC S  CPU CMD     1/1   
24894   1.03s   3.05s     0K    12K 168.2M    12K  --   - S  38% kvm
 4031   0.33s   0.79s     0K     0K     4K    48K  --   - S  10% kvm
 4380   0.19s   0.18s     0K     0K     0K    16K  --   - S   3% kvm
 4090   0.13s   0.23s     0K     0K     0K   256K  --   - S   3% kvm
 4459   0.23s   0.05s     0K     0K     0K     0K  --   - S   3% kvm
24175   0.20s   0.07s     0K     0K     0K    12K  --   - S   3% kvm
 4407   0.13s   0.13s     0K     0K     0K     0K  --   - S   2% kvm
24584   0.08s   0.16s     0K     0K     0K     0K  --   - S   2% kvm
30084   0.14s   0.09s     0K     0K     0K     0K  --   - S   2% kvm
   54   0.23s   0.00s     0K     0K     0K     0K  --   - S   2% ksmd
28931   0.12s   0.10s     0K     0K     0K     0K  --   - S   2% kvm
28530   0.02s   0.02s     0K     0K      -      -  NE   0 E   0% <apache2>
28553   0.01s   0.01s   396K   396K     0K     0K  --   - R   0% atop
 3945   0.00s   0.01s     0K     0K     0K     0K  --   - S   0% ksmtuned
28563   0.01s   0.00s     0K     0K      -      -  NE   1 E   0% <pgrep>
28560   0.01s   0.00s     0K     0K      -      -  NE   0 E   0% <ps>
 4180   0.00s   0.00s     0K     0K     0K    16K  --   - S   0% apache2
28555   0.00s   0.00s 248.4M 25256K     0K     0K  N-   - S   0% apache2
25448   0.00s   0.00s     0K     0K     0K     0K  --   - S   0% iscsid
 4152   0.00s   0.00s     0K     0K     0K     0K  --   - S   0% cron
28565   0.00s   0.00s  8040K   584K    12K     0K  N-   - S   0% sleep
   53   0.00s   0.00s     0K     0K     0K     0K  --   - S   0% kswapd0
 1265   0.00s   0.00s     0K     0K     0K     4K  --   - S   0% kjournald
 2658   0.00s   0.00s     0K     0K     0K    16K  --   - S   0% kjournald
 2687   0.00s   0.00s     0K     0K     0K     0K  --   - S   0% xfsbufd
28529   0.00s   0.00s     0K     0K      -      -  NE   0 E   0% <sleep>
28558   0.00s   0.00s     0K     0K      -      -  NE   0 E   0% <awk>
28557   0.00s   0.00s     0K     0K      -      -  NE   0 E   0% <ksmtuned>
28562   0.00s   0.00s     0K     0K      -      -  NE   1 E   0% <ksmtuned>
28561   0.00s   0.00s     0K     0K      -      -  NE   0 E   0% <awk>
28559   0.00s   0.00s     0K     0K      -      -  NE   0 E   0% <ksmtuned>
28564   0.00s   0.00s     0K     0K      -      -  NE   0 E   0% <ksmtuned>
At this time there is no load on the Proxmox server (load average 0.19).

Udo
 
Hi,
perhaps you need an apt-get update?
I have installed the program on two Proxmox nodes via apt-get with no problem.
Udo

Umm... I'm not sure what this message is in response to... But as you can see in post #5 above, I did do an apt-get update before trying the apt-get install, and it's a no-go. In post #6 I just downloaded the package manually from Debian and did a dpkg -i, which got it working. But it still doesn't tell me which VM the process belongs to, which was my original question: my original post was able to identify which process was using too much disk I/O - I just need to track that process down to a VM...

I'm running this on the Proxmox host node, not on the individual VMs, since trying to run anything on 50+ VMs simultaneously and correlating the data in real time would be a nightmare.
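
For the record, the manual install was basically this (the .deb filename here is just an example - use whatever version the Debian download page lists):

Code:
wget http://ftp.us.debian.org/debian/pool/main/a/atop/atop_1.23-1_amd64.deb   # example version
dpkg -i atop_1.23-1_amd64.deb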
 
... but it still doesn't tell me which VM the process belongs to, which was my original question: my original post was able to identify which process was using too much disk I/O - I just need to track that process down to a VM...

I'm running this on the Proxmox host node, not on the individual VMs, since trying to run anything on 50+ VMs simultaneously and correlating the data in real time would be a nightmare.
Hi,
perhaps I'm not understanding you right, but if you run atop -l (shortened output here):
Code:
 proxmox1           2010/03/12  12:51:51          3774214 seconds elapsed
PRC | sys 239h40m | user 204h58m | #proc    241 | #zombie    0 | #exit      0 |
CPU | sys     25% | user     24% | irq       1% | idle    349% | wait      2% |
NET | vmtab1   3% | pcki 10318e4 | pcko 48724e3 | si  302 Kbps | so    6 Kbps |
                *** system and process activity since boot ***
  PID  SYSCPU  USRCPU  VGROW  RGROW  RDDSK  WRDSK  ST EXC S  CPU CMD     1/13  
 4031  47h06m 108h35m   1.2G 907.8M 118.6G 585.2G  N-   - S  14% kvm
   54  97h52m   0.00s     0K     0K     0K     0K  N-   - S   9% ksmd
 4407  25h54m  36h16m   1.2G 927.6M  15.0G 822.6G  N-   - S   6% kvm
 4380  18h09m 843m32s   1.2G 820.1M   3.9G  24.4G  N-   - S   3% kvm
You can see that process 4031 had the most disk traffic.
With
Code:
ps aux | grep 4031
root      4031 14.8  5.7 1242788 929604 ?      Sl   Jan27 9340:14 /usr/bin/kvm -monitor unix:/var/run/qemu-server/107.mon,server,nowait -vnc unix:/var/run/qemu-server/107.vnc,password
I see that this is VM 107... where is the problem?
Granted, the list may not be so clear with many VMs...
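A rough way to script that lookup for every running guest (assuming the qemu-server monitor-socket naming shown above):
Code:
# For each kvm process, pull the VMID out of the qemu-server
# monitor-socket path on its command line.
for pid in $(pgrep -f '/usr/bin/kvm'); do
    vmid=$(tr '\0' ' ' < "/proc/$pid/cmdline" \
        | grep -o 'qemu-server/[0-9]*' | head -n1 | cut -d/ -f2)
    echo "pid $pid -> VM ${vmid:-unknown}"
done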

Udo

PS: of course you can install atop manually, but I'm surprised that your repository doesn't hold the package. My sources.list:
Code:
deb http://ftp.de.debian.org/debian lenny main contrib
deb http://download.proxmox.com/debian lenny pve
deb http://security.debian.org/ lenny/updates main contrib
 
You can see that process 4031 had the most disk traffic.
With
Code:
ps aux | grep 4031
root      4031 14.8  5.7 1242788 929604 ?      Sl   Jan27 9340:14 /usr/bin/kvm -monitor unix:/var/run/qemu-server/107.mon,server,nowait -vnc unix:/var/run/qemu-server/107.vnc,password
I see that this is VM 107... where is the problem?
Granted, the list may not be so clear with many VMs...

Udo

I suppose if I was running 50 KVM guests that might work, but I'm not... I'm just running regular (OpenVZ) VMs, and grepping a process out just shows the process, not which VM it belongs to...

Code:
# ps aux | grep 8440
root      8440  0.0  0.0   1980   696 ?        Ss   Mar09   0:01 init [2]

*** DING *** (Light bulb going off... ;)

I found what I was looking for... There is a command called 'vzpid' that will tell you which VM a process ID belongs to (imagine that!).

Code:
# vzpid 8440

Pid     VEID    Name
8440    208243  init

Exactly what I was looking for... Thanks for the help!
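
(Combined with the D-state one-liner from the first post, a quick sketch to map every blocked process to its container:)

Code:
# Map each process currently in D state to its OpenVZ container (VEID).
# vzpid prints a short Pid/VEID/Name header on each call.
for pid in $(ps ax | awk '$3 ~ /^D/ { print $1 }'); do
    vzpid "$pid"
done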
 