Proxmox ZFS & LUKS High I/O Wait

foobar73

Hi,

I have a Proxmox 5 server with one KVM guest (primarily a database server). Storage is ZFS RAIDZ2 on top of 4 LUKS-encrypted SATA 7200 RPM 1 TB disks.

The host has 32 GB RAM with an 8 GB cap on the ARC, but the ARC never seems to use more than 1 GB.
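For reference, the ARC size and the configured cap can be read directly from the kernel; the paths below are standard for ZFS on Linux, and the 8 GB figure is just the cap mentioned above:

# current ARC size vs. target/min/max, in bytes
grep -E '^(size|c_max|c_min) ' /proc/spl/kstat/zfs/arcstats

# module parameter backing the cap (0 means the built-in default of half of RAM)
cat /sys/module/zfs/parameters/zfs_arc_max

# persistent setting, e.g. in /etc/modprobe.d/zfs.conf:
# options zfs zfs_arc_max=8589934592   (8 GiB)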

I am seeing constant iowait of 1-10 in top on the host node. The KVM guest shows no iowait, oddly enough.
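If it helps to narrow down where the host iowait is coming from, watching per-vdev and per-device statistics side by side is usually the quickest check (iostat is in the sysstat package; zfspool is the pool name from the zdb output further down):

# per-vdev ops and bandwidth every 5 s (-l adds latency columns on ZFS 0.7+)
zpool iostat -v zfspool 5

# await and %util for the dm-crypt mappings and the raw disks underneath
iostat -x 5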

What kind of troubleshooting/tuning might I do to make this perform better?

I have disabled atime and sync at the ZFS level and set 8k blocks. I think the ashift may be wrong, but I'm not sure whether that really helps or hurts.
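For reference, those settings can be double-checked like this (vm-100-disk-1 is only an example zvol name, use whatever zfs list shows):

# pool-wide properties
zfs get atime,sync,compression zfspool

# the guest's zvols and their block size / caching mode
zfs list -t volume
zfs get volblocksize,primarycache zfspool/vm-100-disk-1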

Thanks.
 
Hi,
The first question is what kind of DB you use. If you use PostgreSQL, then 8k is OK. But if you use MySQL/MariaDB/Percona, then 8k is not OK.
Another question is what the read/write ratio is for your DB.
If you use atime=off on ZFS, you must do the same inside the KVM guest.
Some other suggestions :)
- deactivate ZFS compression (compression on top of LUKS makes no sense)
- at the ZFS vdisk level, cache only metadata (because you run a DB), and allocate more RAM to the guest
- use 2 different vdisks for your guest (one for the OS and another for the DB, each with the proper volblocksize, with a bigger value for the OS) - see the sketch after this list
- find out what the block size is at the LUKS level, and use at least the same value for the zpool, in any case not smaller
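A rough sketch of the two vdisk-related suggestions above, with made-up dataset names and sizes; note that volblocksize can only be chosen when a zvol is created, so the DB disk would have to be a new volume that the data is migrated onto:

# separate zvols: 8k blocks for the DB disk, larger blocks for the OS disk
zfs create -V 200G -o volblocksize=8k  zfspool/vm-100-db-disk
zfs create -V 32G  -o volblocksize=64k zfspool/vm-100-os-disk

# cache only metadata for the DB disk, leave the OS disk at the default
zfs set primarycache=metadata zfspool/vm-100-db-disk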
 
My ashift looks to be 0; is this a problem?
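If that 0 comes from zpool get ashift, it normally just means the property was never set explicitly, so the sector size was auto-detected when the vdevs were created; the value actually recorded in the vdevs is the one zdb reports:

# pool property: 0 = not set explicitly (auto-detect at vdev creation)
zpool get ashift zfspool

# the ashift baked into each top-level vdev
zdb -C zfspool | grep ashift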


Hi,
The first question is what kind of DB you use. If you use PostgreSQL, then 8k is OK. But if you use MySQL/MariaDB/Percona, then 8k is not OK.
We use postgres

Another question is what the read/write ratio is for your DB.

60/40

If you use atime=off on ZFS, you must do the same inside the KVM guest.

What do you mean? Like the mount options inside the guest? The guest filesystem is XFS.
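If it is about atime inside the guest, a minimal sketch of what that would look like for an XFS root filesystem (the device name is a placeholder, check the guest's /etc/fstab):

# /etc/fstab inside the guest: add noatime (or at least relatime) to the XFS mounts
/dev/vda1  /  xfs  defaults,noatime  0  1

# apply without a reboot
mount -o remount,noatime /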

Some other suggestions :)
- deactivate ZFS compression (compression on top of LUKS makes no sense)


- at the ZFS vdisk level, cache only metadata (because you run a DB), and allocate more RAM to the guest

This is already done

- use 2 different vdisks for your guest (one for the OS and another for the DB, each with the proper volblocksize, with a bigger value for the OS)
- find out what the block size is at the LUKS level, and use at least the same value for the zpool, in any case not smaller

This is the LUKS dump output:

Cipher name: aes
Cipher mode: xts-plain64
Hash spec: sha512
Payload offset: 4096
MK bits: 512
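For what it's worth, LUKS1 (which this header is) always encrypts in 512-byte sectors, and the payload offset is counted in 512-byte sectors, so the data starts 4096 x 512 B = 2 MiB into the device. Newer cryptsetup versions also print the sector size directly (mapping name taken from the zdb output below):

# shows a "sector size: 512" line on cryptsetup 2.x; LUKS1 cannot use larger sectors
cryptsetup status luks-6ac56b6b-056a-495e-878c-a13472102746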
 
root@pve01:~# zdb
zfspool:
    version: 5000
    name: 'zfspool'
    state: 0
    txg: 5529158
    pool_guid: 9319991841293427670
    errata: 0
    hostid: 2831164162
    hostname: 'pve01'
    com.delphix:has_per_vdev_zaps
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 9319991841293427670
        children[0]:
            type: 'raidz'
            id: 0
            guid: 15326961453304666842
            nparity: 2
            metaslab_array: 34
            metaslab_shift: 35
            ashift: 9
            asize: 4000791396352
            is_log: 0
            create_txg: 4
            com.delphix:vdev_zap_top: 46
            children[0]:
                type: 'disk'
                id: 0
                guid: 148287400119563818
                path: '/dev/disk/by-id/dm-name-luks-6ac56b6b-056a-495e-878c-a13472102746'
                devid: 'dm-uuid-CRYPT-LUKS1-6ac56b6b056a495e878ca13472102746-luks-6ac56b6b-056a-495e-878c-a13472102746'
                whole_disk: 0
                DTL: 49
                create_txg: 4
                com.delphix:vdev_zap_leaf: 47
            children[1]:
                type: 'disk'
                id: 1
                guid: 2489270880081726796
                path: '/dev/disk/by-id/dm-name-luks-94e5985b-a8f4-408c-9070-ffc238728479'
                devid: 'dm-uuid-CRYPT-LUKS1-94e5985ba8f4408c9070ffc238728479-luks-94e5985b-a8f4-408c-9070-ffc238728479'
                whole_disk: 0
                DTL: 48
                create_txg: 4
                com.delphix:vdev_zap_leaf: 50
            children[2]:
                type: 'disk'
                id: 2
                guid: 134640188827098230
                path: '/dev/disk/by-id/dm-name-luks-d2990c81-74ab-4d30-b68b-4d4b64ee4bf1'
                devid: 'dm-uuid-CRYPT-LUKS1-d2990c8174ab4d30b68b4d4b64ee4bf1-luks-d2990c81-74ab-4d30-b68b-4d4b64ee4bf1'
                whole_disk: 0
                DTL: 38
                create_txg: 4
                com.delphix:vdev_zap_leaf: 51
            children[3]:
                type: 'disk'
                id: 3
                guid: 16398432213168163165
                path: '/dev/disk/by-id/dm-name-luks-49d33527-4940-4ce4-af95-083136bf2b5e'
                devid: 'dm-uuid-CRYPT-LUKS1-49d3352749404ce4af95083136bf2b5e-luks-49d33527-4940-4ce4-af95-083136bf2b5e'
                whole_disk: 0
                DTL: 39
                create_txg: 4
                com.delphix:vdev_zap_leaf: 52
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
 
Hi,
In your case the pool ashift is 9 (the recommended value for a 512 B sector size). What is the block size of your raw disks (gdisk -l)?
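Two quick ways to check that (sdX is a placeholder for the raw disks underneath the LUKS mappings); if the physical sector size turns out to be 4096, the usual recommendation is ashift=12, which can only be set when the pool or vdev is created:

# look for the "Sector size (logical/physical)" line in the gdisk header
gdisk -l /dev/sdX

# or for all disks at once
lsblk -o NAME,PHY-SEC,LOG-SEC

# if the pool is ever recreated, force 4k alignment explicitly
zpool create -o ashift=12 zfspool raidz2 <luks devices>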
 
