ZFS Performance - What do you think of these values?

tony blue

Well-Known Member
Dec 26, 2017
Hello,

I'm running Proxmox on ZFS here, with two disks in a mirror.

Code:
zpool status
  pool: rpool
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sdb2    ONLINE       0     0     0
            sda2    ONLINE       0     0     0

errors: No known data errors
root@virtualhost:/mnt/test2# zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
rpool  7.25T   628G  6.64T         -     1%     8%  1.00x  ONLINE  -

zfs list
NAME                           USED  AVAIL  REFER  MOUNTPOINT
rpool                          635G  6.40T   104K  /rpool
rpool/ROOT                    11.8G  6.40T    96K  /rpool/ROOT
rpool/ROOT/pve-1              11.8G  6.40T  11.8G  /
rpool/data                     614G  6.40T    96K  /rpool/data
rpool/data/subvol-800-disk-1  6.08G  43.9G  6.08G  /rpool/data/subvol-800-disk-1
rpool/data/vm-100-disk-1      72.9G  6.40T  72.9G  -
rpool/data/vm-101-disk-1       121G  6.40T   121G  -
rpool/data/vm-102-disk-1      58.8G  6.40T  58.8G  -
rpool/data/vm-300-disk-1      47.8G  6.40T  47.8G  -
rpool/data/vm-301-disk-1      41.8G  6.40T  41.8G  -
rpool/data/vm-302-disk-1      36.9G  6.40T  36.9G  -
rpool/data/vm-500-disk-1      50.8G  6.40T  50.8G  -
rpool/data/vm-501-disk-1      55.0G  6.40T  55.0G  -
rpool/data/vm-502-disk-1      31.2G  6.40T  31.2G  -
rpool/data/vm-600-disk-1      7.87G  6.40T  7.87G  -
rpool/data/vm-601-disk-1      19.4G  6.40T  19.4G  -
rpool/data/vm-702-disk-1      64.6G  6.40T  64.6G  -
rpool/swap                    8.50G  6.41T  1.04G  -
rpool/test                     260M  6.40T   147M  -
rpool/test2                    268M  6.40T   237M  -

When I run fio --name rw --rw rw --size 10G inside the VMs, I get values of around 60000 KB/s back. That should really be higher, shouldn't it?

Code:
fio --name rw --rw rw --size 10G
rw: (g=0): rw=rw, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.2.10
Starting 1 process
Jobs: 1 (f=1): [M(1)] [100.0% done] [55440KB/55424KB/0KB /s] [13.9K/13.9K/0 iops] [eta 00m:00s]
rw: (groupid=0, jobs=1): err= 0: pid=22475: Sat Jun 16 14:18:14 2018
  read : io=5121.4MB, bw=61822KB/s, iops=15455, runt= 84828msec
    clat (usec): min=0, max=13554K, avg=17.81, stdev=12197.24
     lat (usec): min=0, max=13554K, avg=17.91, stdev=12197.24
    clat percentiles (usec):
     |  1.00th=[    0],  5.00th=[    0], 10.00th=[    0], 20.00th=[    0],
     | 30.00th=[    1], 40.00th=[    1], 50.00th=[    1], 60.00th=[    1],
     | 70.00th=[    1], 80.00th=[    2], 90.00th=[    2], 95.00th=[    3],
     | 99.00th=[   23], 99.50th=[   28], 99.90th=[   42], 99.95th=[   48],
     | 99.99th=[  135]
    bw (KB  /s): min=    4, max=982112, per=100.00%, avg=92316.87, stdev=143344.20
  write: io=5118.7MB, bw=61790KB/s, iops=15447, runt= 84828msec
    clat (usec): min=0, max=1832.4K, avg=45.62, stdev=3586.12
     lat (usec): min=0, max=1832.4K, avg=45.75, stdev=3586.12
    clat percentiles (usec):
     |  1.00th=[    0],  5.00th=[    1], 10.00th=[    1], 20.00th=[    1],
     | 30.00th=[    1], 40.00th=[    1], 50.00th=[    1], 60.00th=[    1],
     | 70.00th=[    1], 80.00th=[    2], 90.00th=[    2], 95.00th=[    3],
     | 99.00th=[    7], 99.50th=[   15], 99.90th=[14912], 99.95th=[15168],
     | 99.99th=[17792]
    bw (KB  /s): min=    2, max=976504, per=100.00%, avg=91432.05, stdev=142758.12
    lat (usec) : 2=75.97%, 4=21.38%, 10=1.37%, 20=0.41%, 50=0.71%
    lat (usec) : 100=0.02%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.12%, 50=0.01%
    lat (msec) : 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%, 2000=0.01%
    lat (msec) : >=2000=0.01%
  cpu          : usr=1.45%, sys=5.02%, ctx=4008, majf=0, minf=12
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=1311060/w=1310380/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=5121.4MB, aggrb=61822KB/s, minb=61822KB/s, maxb=61822KB/s, mint=84828msec, maxt=84828msec
  WRITE: io=5118.7MB, aggrb=61789KB/s, minb=61789KB/s, maxb=61789KB/s, mint=84828msec, maxt=84828msec

Disk stats (read/write):
  vda: ios=17717/4759, merge=1/451, ticks=1844/7200644, in_queue=7315992, util=99.32%
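
(For reference: since this run uses neither --direct=1 nor a deeper queue, a large part of it presumably goes through the guest's page cache. A direct-I/O variant I still want to try, using standard fio options, would look like this:)

Code:
# bypass the page cache and keep a deeper queue busy (not yet run on this setup)
fio --name rw --rw rw --size 10G --bs=4k --direct=1 --ioengine=libaio --iodepth=16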

While googling this problem I came across the hint that it might be related to the sector size. Proxmox created the zvols for the virtual machines with an 8K block size each, while the installed disks have a 4K sector size.

Code:
lsblk -o Name,Mountpoint,PHY-SEC
NAME     MOUNTPOINT PHY-SEC
sda                    4096
├─sda1                 4096
├─sda2                 4096
└─sda9                 4096
sdb                    4096
├─sdb1                 4096
├─sdb2                 4096
└─sdb9                 4096
sdc                    4096
└─sdc1                 4096
sdd                    4096
└─sdd1                 4096
zd0      [SWAP]        4096
zd16                   8192
zd32                   8192
zd48                   8192
└─zd48p1               8192
zd64                   8192
zd80                   8192
zd96                   8192
zd112                  8192
zd128                  8192
zd144                  8192
zd160                  8192
zd176                  8192
zd192                  8192
zd208    /mnt/test     8192
zd224    /mnt/test2    4096
zd240    /mnt/test3    8192
zd256                  4096
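
The block size of the zvols can also be checked directly on the host; for one of the VM disks from the zfs list above, for example:

Code:
# show the volblocksize ZFS uses for this zvol
zfs get volblocksize rpool/data/vm-100-disk-1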

When I manually create a zvol via

Code:
zfs create -V 25g -o compression=on -b 4k rpool/test4
mkfs.ext4 /dev/zvol/rpool/test4
mount /dev/zvol/rpool/test4 /mnt/test4

I get much higher values of around 500000 KB/s in the same test.

Does that value look OK to you?

Code:
fio --name rw --rw rw --size 10G
rw: (g=0): rw=rw, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
fio-2.16
Starting 1 process
Jobs: 1 (f=1): [M(1)] [100.0% done] [492.2MB/488.1MB/0KB /s] [126K/125K/0 iops] [eta 00m:00s]
rw: (groupid=0, jobs=1): err= 0: pid=8284: Sat Jun 16 14:15:54 2018
  read : io=5119.1MB, bw=498132KB/s, iops=124533, runt= 10525msec
    clat (usec): min=1, max=7728, avg= 1.86, stdev= 9.94
     lat (usec): min=1, max=7728, avg= 1.89, stdev= 9.94
    clat percentiles (usec):
     |  1.00th=[    1],  5.00th=[    1], 10.00th=[    1], 20.00th=[    1],
     | 30.00th=[    1], 40.00th=[    2], 50.00th=[    2], 60.00th=[    2],
     | 70.00th=[    2], 80.00th=[    2], 90.00th=[    2], 95.00th=[    3],
     | 99.00th=[    5], 99.50th=[   16], 99.90th=[   29], 99.95th=[   38],
     | 99.99th=[   63]
  write: io=5120.4MB, bw=498139KB/s, iops=124534, runt= 10525msec
    clat (usec): min=3, max=10804, avg= 5.54, stdev=21.71
     lat (usec): min=3, max=10804, avg= 5.59, stdev=21.73
    clat percentiles (usec):
     |  1.00th=[    3],  5.00th=[    4], 10.00th=[    4], 20.00th=[    4],
     | 30.00th=[    4], 40.00th=[    4], 50.00th=[    4], 60.00th=[    4],
     | 70.00th=[    4], 80.00th=[    5], 90.00th=[    6], 95.00th=[    7],
     | 99.00th=[   37], 99.50th=[   43], 99.90th=[   71], 99.95th=[   94],
     | 99.99th=[  179]
    lat (usec) : 2=19.77%, 4=30.72%, 10=47.10%, 20=0.67%, 50=1.57%
    lat (usec) : 100=0.15%, 250=0.02%, 500=0.01%, 750=0.01%, 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%
  cpu          : usr=11.51%, sys=85.71%, ctx=4760, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=1310711/w=1310729/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=5119.1MB, aggrb=498132KB/s, minb=498132KB/s, maxb=498132KB/s, mint=10525msec, maxt=10525msec
  WRITE: io=5120.4MB, aggrb=498139KB/s, minb=498139KB/s, maxb=498139KB/s, mint=10525msec, maxt=10525msec


How can I get Proxmox to create the VMs (restored from backup) with a 4K block size?
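
From what I've read so far, the ZFS storage definition in Proxmox apparently has a blocksize option, so something along these lines (the storage ID local-zfs is only an example from my setup) might be the right approach before restoring, but I'm not sure:

Code:
# set the default zvol block size for the ZFS storage (storage ID is an example)
pvesm set local-zfs --blocksize 4k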

Many thanks!

tony