High Load Average on host while guest is copying files

melanch0lia

New Member
Jul 31, 2014
Using ZFS 0.6.4.1-1 and Proxmox VE with ~32 KVM machines, all running with cache=writeback (cache=none won't start).


Deduplication is disabled, primarycache/secondarycache=all, checksum/compression are ON, compressratio is 1.71x, sync=standard.
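
(For reference, these dataset properties can be confirmed with zfs get; a quick check against the pool shown below, assuming they are set on the pool's root dataset:)
Code:
zfs get dedup,primarycache,secondarycache,checksum,compression,compressratio,sync POOL1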


When I start copying large files (~1 GB) inside a guest, the load average on the host increases dramatically.


VM - CentOS6 x64, RAW image, cache=writeback, VIRTIO
For example,
Before copy - load average: 3.40, 4.14, 4.25
During copy - load average: 15.10, 7.73, 6.09


I also see that other VMs start lagging.

Essentially the same behaviour occurs even if the VM is not using VIRTIO.
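
(Worth noting: on Linux the load average counts tasks blocked in uninterruptible I/O (the D state) as well as runnable ones, so a spike like this during a copy usually points at an I/O backlog rather than CPU. A quick sketch for catching the blocked tasks while a copy runs:)
Code:
# tasks in uninterruptible sleep (D state) inflate the Linux load average
ps -eo state,pid,cmd | awk '$1 == "D"'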


Some info + screenshots:

[attachments: 1.png, 2.png]


zpool status -v:
Code:
  pool: POOL1
 state: ONLINE
  scan: scrub repaired 0 in 42h55m with 0 errors on Sun Aug 31 15:28:56 2014
config:


    NAME                          STATE     READ WRITE CKSUM
    POOL1                         ONLINE       0     0     0
      mirror-0                    ONLINE       0     0     0
        scsi-35000c500563c3ecb    ONLINE       0     0     0
        scsi-35000c500565053a7    ONLINE       0     0     0
      mirror-1                    ONLINE       0     0     0
        scsi-35000c500565057d3    ONLINE       0     0     0
        scsi-35000c50056505827    ONLINE       0     0     0
    cache
      scsi-SAdaptec_SSD_9067252C  ONLINE       0     0     0


errors: No known data errors
# cat /proc/scsi/scsi
Code:
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: Adaptec  Model: SSD              Rev: V1.0
  Type:   Direct-Access                    ANSI  SCSI revision: 02
Host: scsi0 Channel: 01 Id: 00 Lun: 00
  Vendor: Samsung  Model: SSD 840 PRO Seri Rev: DXM0
  Type:   Direct-Access                    ANSI  SCSI revision: 05
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA      Model: ST3250318AS      Rev: CC46
  Type:   Direct-Access                    ANSI  SCSI revision: 05
Host: scsi1 Channel: 00 Id: 01 Lun: 00
  Vendor: SEAGATE  Model: ST2000NM0023     Rev: A001
  Type:   Direct-Access                    ANSI  SCSI revision: 06
Host: scsi1 Channel: 00 Id: 02 Lun: 00
  Vendor: SEAGATE  Model: ST2000NM0023     Rev: A001
  Type:   Direct-Access                    ANSI  SCSI revision: 06
Host: scsi1 Channel: 00 Id: 03 Lun: 00
  Vendor: SEAGATE  Model: ST2000NM0023     Rev: A001
  Type:   Direct-Access                    ANSI  SCSI revision: 06
Host: scsi1 Channel: 00 Id: 04 Lun: 00
  Vendor: SEAGATE  Model: ST2000NM0023     Rev: A001
  Type:   Direct-Access                    ANSI  SCSI revision: 06
Host: scsi1 Channel: 00 Id: 05 Lun: 00
  Vendor: SEAGATE  Model: ST32000645SS     Rev: 0004
  Type:   Direct-Access                    ANSI  SCSI revision: 06
Host: scsi1 Channel: 00 Id: 06 Lun: 00
  Vendor: SEAGATE  Model: ST3300657SS      Rev: 0006
  Type:   Direct-Access                    ANSI  SCSI revision: 05
Host: scsi1 Channel: 00 Id: 07 Lun: 00
  Vendor: SEAGATE  Model: ST3300657SS      Rev: 0006
  Type:   Direct-Access                    ANSI  SCSI revision: 05
Host: scsi1 Channel: 00 Id: 08 Lun: 00
  Vendor: SEAGATE  Model: ST3300657SS      Rev: 0006
  Type:   Direct-Access                    ANSI  SCSI revision: 05
Host: scsi1 Channel: 00 Id: 09 Lun: 00
  Vendor: SEAGATE  Model: ST3300657SS      Rev: 0006
  Type:   Direct-Access                    ANSI  SCSI revision: 05
Host: scsi1 Channel: 00 Id: 10 Lun: 00
  Vendor: ATA      Model: WDC WD20EARX-00P Rev: AB51
  Type:   Direct-Access                    ANSI  SCSI revision: 05
Host: scsi1 Channel: 00 Id: 11 Lun: 00
  Vendor: INTEL    Model: SR2612UR         Rev: I106
  Type:   Enclosure                        ANSI  SCSI revision: 05

----


Code:
02:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1064ET PCI-Express Fusion-MPT SAS (rev 08)

----
Code:
# free -m
             total       used       free     shared    buffers     cached
Mem:         88556      83518       5038          0        119        240
-/+ buffers/cache:      83157       5399
Swap:        29695       3004      26691


ZFS module options (modprobe):
Code:
options zfs zfs_arc_min=4294967296
options zfs zfs_arc_max=34359738368
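
(So the ARC is pinned between 4 GiB and 32 GiB; a quick sanity check of those byte values in the shell:)
Code:
# 4294967296 B = 4 GiB, 34359738368 B = 32 GiB
echo $(( 4294967296 >> 30 )) $(( 34359738368 >> 30 ))   # prints: 4 32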


PS. Yes, I see that I don't have much free memory, but I can't do anything about that because of the ARC usage... My VMs together don't really eat that much memory (they're 1-2 GB VMs).
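
(The actual ARC footprint versus its cap can be read from the kstats; a minimal check, assuming the ZFS-on-Linux /proc/spl interface is present:)
Code:
# current ARC size and configured maximum, in bytes
awk '/^size|^c_max/ {print $1, $3}' /proc/spl/kstat/zfs/arcstats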
 
Your VMs eat 52 GB of resident (i.e. used) RAM; those are not really 1-2 GB VMs. Try to always use virtio for drives, and also try lowering the ARC size to see what happens. It looks like the hypervisor has paged out some memory, but we don't know the paging rate; run vmstat 1 to get an idea.
 

If I set drop_caches to 2 (to free reclaimable slab objects, including dentries and inodes), my RAM usage drops to ~52 GB. After a while it climbs back to ~82-84 GB.
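
(For reference, the standard kernel interface for this; the sync first is conventional so dirty pages get written out before dropping:)
Code:
sync
echo 2 > /proc/sys/vm/drop_caches   # 2 = reclaim slab objects (dentries and inodes)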

Here is my vmstat 1 for load average: 1.82, 1.90, 1.98
Code:
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 2  2 2987640 6136348 136356 250728    0    0    37   288    0    0  9  6 81  4
 6  1 2987640 6148208 136356 250728    0    0     0  2392 34816 88146  6  6 84  5
 2  1 2987640 6160608 136356 250728    0    0     0  1687 36947 110607  5  6 85  4
 0  1 2987640 6179524 136356 250728    0    0     0  1784 33840 84283  6  6 84  3
 0  1 2987640 6201104 136356 250728    0    0    45  3008 33380 85465  5  6 85  4
 1  2 2987640 6211604 136356 250728    0    0     0 15518 33240 88235  5  6 83  6
 2  1 2987640 6216368 136356 250728    0    0     0  1712 34062 91970  6  6 84  4
 4  1 2987640 6222768 136356 250736    0    0     0  1692 33228 83298  7  6 84  4

Code:
 5  0 2987640 6238540 136356 250736    0    0    13  4861 36624 97484  9  6 
 5  1 2987640 6237080 136356 250740    0    0    14  1712 34891 90073  8  6 82  3
 2  1 2987640 6236912 136356 250740    0    0    26  1648 33452 95423  6  6 85  3

Code:
# vmstat -s
     90682192 K total memory
     84457600 K used memory
     37079456 K active memory
      6355976 K inactive memory
      6224592 K free memory
       136728 K buffer memory
       252568 K swap cache
     30408700 K total swap
      2987640 K used swap
     27421060 K free swap
   1755849859 non-nice user cpu ticks
            0 nice user cpu ticks
   1105707115 system cpu ticks
  16191538840 idle cpu ticks
    847155389 IO-wait cpu ticks
       247067 IRQ cpu ticks
      8591511 softirq cpu ticks
            0 stolen cpu ticks
   7281325573 pages paged in
  57385683604 pages paged out
       879184 pages swapped in
      1916730 pages swapped out
   1259872757 interrupts
    420114288 CPU context switches
   1434611298 boot time
     17451447 forks
How do I interpret these results?
 
Your RAM starts to be used again because of the ZFS ARC.

The vmstat output would be most telling while you're doing a file copy and your load skyrockets.

Problem reproduced during a file copy.

Here is my "vmstat 1" when I started copying.
Before starting - load average ~2.
After start - load average ~8-9


As I can see, there are some peaks:
Code:
# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 3  2      0 12241000 137136 222364    0    0    37   288    0    0  9  6 81  4
 1  1      0 12240068 137136 222364    0    0     0  2641 27920 59131  5  5 85  5
 2  1      0 12240324 137136 222364    0    0     7  6353 28531 63462  5  6 86  4
 3  2      0 12239228 137136 222368    0    0  4973  3411 29975 59680  5  5 85  5
 4  3      0 12068596 137136 222368    0    0 16256   752 40026 60617  7  6 81  5
 2  4      0 12060584 137136 222368    0    0 23402 22033 28295 55092  5  6 76 13
 3  2      0 11982252 137136 222368    0    0 21925  5729 29752 58623  5  6 80  9
 3  2      0 11869588 137136 222368    0    0  9260  3474 30503 61908  6  6 81  7
 5  1      0 11785004 137136 222368    0    0 26921  3429 37348 93308  7  7 79  7
 4  3      0 11706656 137136 222372    0    0 16167  3217 31332 59246  6  6 81  7
 1  4      0 11606108 137136 222372    0    0 19568 14053 31942 61976  6  6 78 10
 4  6      0 11518964 137136 222372    0    0 17838  3586 33360 64190  7  6 77 10
 2  3      0 11454988 137136 222372    0    0 25775 33797 31798 61600  5  7 75 13
 3  3      0 11433016 137136 222372    0    0 12219 52123 31768 61125  5  6 77 11
 2  4      0 11377820 137136 222376    0    0 16620 11254 30211 61391  5  6 77 11
 1  3      0 11322672 137136 222376    0    0 44419  3550 30007 57617  5  5 80 10
 2 29      0 11221596 137136 222376    0    0 57057 25386 33714 67020  7  7 74 12
 2  2      0 10998664 137136 222376    0    0 18507 76895 35992 73596  6  9 54 31
 0  4      0 11021396 137136 222376    0    0 23794 38526 44375 86625  6  7 74 12
 2  3      0 11043872 137136 222384    0    0 25994 12275 33349 66470  7  6 77 10
 2  4      0 11014072 137136 222384    0    0 29578 11688 32100 72589  6  7 77 11
 1  2      0 10936436 137136 222384    0    0 36871  3775 34411 60318  6  6 81  7
 3  3      0 10889180 137136 222384    0    0 18564  4571 40511 62564  6  6 80  8
 7  3      0 10851116 137136 222384    0    0 20069  6690 28923 57335  5  5 81  8
 3  3      0 10779248 137136 222384    0    0 19195  3544 28870 56192  6  5 81  8
 4  3      0 10699652 137136 222384    0    0 22873 26243 31413 62121  6  6 78 10
 1  3      0 10586728 137136 222384    0    0 20192  3496 44500 58071  7  5 79  9
10 39      0 10581832 137136 222384    0    0 26740  3963 33556 62207  6  6 77 11
 4  3      0 10544236 137136 222384    0    0  5522 47591 41269 113561  7  8 72 14
 9  3      0 10536276 137136 222388    0    0 14063  9090 42137 63923  6  6 77 11
 7  1      0 10513804 137136 222388    0    0 18524  3060 38107 58082  7  6 80  7
 2  2      0 10505972 137136 222388    0    0 17500 16306 37970 65526  7  7 72 15
 0  3      0 10480300 137136 222392    0    0 10780 16864 34176 58241  6  6 76 12
 3  2      0 10484032 137136 222392    0    0 18212 17562 30792 63564  6  6 78 11
 0  3      0 10477064 137136 222392    0    0 20695  2966 31128 58309  7  5 80  8
 3  4      0 10274444 137136 222392    0    0 27801 35118 35619 67635  7  9 56 28
 3  3      0 10371256 137136 222392    0    0 10420 49312 33357 66046  5  6 79 11
 1  4      0 10406056 137136 222392    0    0 23417 28304 33719 85712  6  7 77 10
 6  3      0 10411740 137136 222392    0    0 30782  7921 31758 75021  6  6 78 10
 0  3      0 10412116 137136 222396    0    0 16676  2288 31721 60363  7  6 80  8
 2  3      0 10414468 137136 222396    0    0 16731  3110 30414 64022  6  6 79  9
 4  1      0 10418564 137136 222396    0    0 25774  2605 31107 64558  7  6 79  7
 1  3      0 10427024 137136 222396    0    0 16995 14634 30125 62250  6  6 77 11
 0  3      0 10433668 137136 222400    0    0 18099  3540 28799 59063  5  6 80  9
 4  3      0 10438572 137136 222400    0    0 20998  3389 29435 60996  6  6 80  9
 2 43      0 10415312 137136 222400    0    0 23059  6243 30069 63317  6  7 75 13
 2  5      0 10434324 137136 222400    0    0  6831 26464 30063 61653  6  5 69 19
 5  4      0 10419212 137136 222400    0    0 14970  7686 29610 59667  6  7 77 10
 4  4      0 10428212 137136 222400    0    0  9568 39874 32065 68050  5  6 77 11
 1  4      0 10439796 137136 222408    0    0 15376 18236 30814 63448  8  6 68 18
 4  4      0 10398392 137136 222408    0    0 12036 25759 36635 96647  7  7 75 12
 5  2      0 10422628 137140 222408    0    0 21092 15171 30863 70986  6  6 77 11
 1 55      0 10413984 137140 222408    0    0 36679  3750 29889 60484  6  5 76 13
 2  4      0 10391444 137140 222408    0    0 16081 58377 34882 67586  6  8 39 47

But I don't understand them.

Code:
# free -m
             total       used       free     shared    buffers     cached
Mem:         88556      78478      10078          0        133        214
-/+ buffers/cache:      78130      10426
Swap:        29695          0      29695

(Swap was freed up beforehand using swapoff.)
 
Except for the last line in vmstat, everything looks OK. Your system is ~80% idle. There is no active swapping.

Is there a chance that the copying itself started at the last line? The I/O wait jumps to 47% there, which is pretty high. That means the drives are slow to write.

Another tool to see this is iostat (apt-get install sysstat): iostat -kxz 1
 

No, copying really started at the 1st-2nd line.

You can see a jump to 31% I/O wait earlier:
Code:
 2  2      0 10998664 137136 222376    0    0 18507 76895 35992 73596  6  9 54 31


I have lowered the ARC cache on the fly to 24 GB and will check iostat.
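
(Done via the ZFS-on-Linux module parameter, something like the following; the value is in bytes:)
Code:
echo 25769803776 > /sys/module/zfs/parameters/zfs_arc_max   # 24 GiB = 24 * 2^30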

What could be making the drives slow to write?

Code:
scsi 1:0:1:0: Direct-Access     SEAGATE  ST2000NM0023     A001 PQ: 0 ANSI: 6
sd 1:0:1:0: Attached scsi generic sg3 type 0
scsi 1:0:2:0: Direct-Access     SEAGATE  ST2000NM0023     A001 PQ: 0 ANSI: 6
sd 1:0:2:0: Attached scsi generic sg4 type 0
scsi 1:0:3:0: Direct-Access     SEAGATE  ST2000NM0023     A001 PQ: 0 ANSI: 6
sd 1:0:3:0: Attached scsi generic sg5 type 0
scsi 1:0:4:0: Direct-Access     SEAGATE  ST2000NM0023     A001 PQ: 0 ANSI: 6
sd 1:0:4:0: Attached scsi generic sg6 type 0

Maybe because they are behind this controller?

Code:
02:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1064ET PCI-Express Fusion-MPT SAS (rev 08)
 
Here's iostat -kxz 1 during a file copy.

It seems to be a problem with disk load, but it shouldn't be that...

Also, should the write cache on the controller (LSI Logic / Symbios Logic SAS1064ET PCI-Express Fusion-MPT SAS) be turned ON or OFF? And what about NCQ? (See the check sketched after the iostat output below.)
Code:
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00   34.00     0.00  3409.50   200.56     0.11    3.15    0.00    3.15   3.15  10.70
sdc               0.00     0.00   56.00  224.00  1192.50 18203.00   138.54     3.18   11.15   32.98    5.69   3.52  98.50
sdd               0.00     0.00   41.00  212.00  1129.00 19630.00   164.10     3.46   12.70   47.17    6.03   3.92  99.30
sde               2.00     0.00   33.00  233.00   855.00 21433.00   167.58     2.91   11.41   53.94    5.39   3.75  99.70
sdf               0.00     0.00   32.00  221.00   790.00 19883.50   163.43     2.74   10.30   43.12    5.55   3.85  97.30
sdb               0.00    14.00    0.00    9.00     0.00    92.00    20.44     0.01    1.44    0.00    1.44   0.44   0.40
dm-0              0.00     0.00    0.00   23.00     0.00    92.00     8.00     0.01    0.57    0.00    0.57   0.17   0.40
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           9.95    0.00    5.90    9.62    0.00   74.53
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00   34.00     0.00  3525.00   207.35     0.10    3.03    0.00    3.03   3.03  10.30
sdc               0.00     0.00  112.00  154.00  3472.50 14141.00   132.43     4.57   16.49   26.15    9.47   3.76 100.10
sdd               0.00     2.00  108.00  219.00  2367.50 18759.50   129.22     4.40   14.01   28.80    6.72   3.06 100.10
sde               0.00     0.00   20.00  238.00   660.00 17022.00   137.07     2.10    8.49   45.25    5.40   3.86  99.50
sdf               0.00     2.00   20.00  248.00   956.50 21175.50   165.16     1.88    7.54   35.70    5.27   3.72  99.60
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          11.56    0.00    7.30   10.25    0.00   70.90
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    1.00   30.00    31.00  3111.00   202.71     0.10    3.06    3.00    3.07   3.06   9.50
sdc               0.00     0.00   73.00  195.00  2485.00 12450.50   111.46     4.13   16.17   37.53    8.17   3.73 100.00
sdd               0.00     0.00  102.00  198.00  3313.00 16508.00   132.14     4.48   14.68   28.75    7.44   3.33 100.00
sde               0.00     0.00   36.00  214.00  1229.00 12140.50   106.96     3.26   12.47   47.28    6.61   4.00 100.00
sdf               0.00     0.00   46.00  226.00  1610.00 19602.00   155.97     3.27   11.70   40.72    5.80   3.67  99.70
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          17.44    0.00    8.40    9.63    0.00   64.53
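
(Regarding the write-cache question above: the current drive-level setting can be inspected before changing anything. A sketch, assuming sdparm is installed for the SAS drives and hdparm for SATA, with device names taken from the iostat output as examples:)
Code:
sdparm --get=WCE /dev/sdc   # SAS: Write Cache Enable bit from the Caching mode page
hdparm -W /dev/sdb          # SATA: show current write-caching state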
 
Code:
# zpool list
NAME     SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
POOL  3.62T  3.03T   607G         -    57%    83%  1.00x  ONLINE  -
 
ZFS is not happy with an 83% full pool, that's for sure. When copying data, hunting for free blocks is very expensive on an almost-full pool. You can search on Google for this kind of issue.
 