Hi,
I'm a new Proxmox user.
I've built a small lab to evaluate PVE and PBS, with the intention of replacing our Hyper-V infrastructure (90 VMs across 3 sites, many of them replicated).
We have two Dell R640 servers for PVE and a third one for PBS.
For this testing we are using 4x WD Blue 1TB SSDs (model WDS100T2B0A) that we had lying around.
Two per server (plus separate OS disks) in a ZFS mirror configuration, with ashift=12.
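For reference, the data-pool setup is equivalent to roughly this (the device paths here are placeholders, the real by-id names are in the zpool status output further down, and the storage name matches my VM config):
Code:
# one data pool per PVE node: 2x WD Blue in a mirror, 4K sectors
zpool create -o ashift=12 ZFS-Lab2 mirror \
    /dev/disk/by-id/ata-WDC_WDS100T2B0A-XXXX \
    /dev/disk/by-id/ata-WDC_WDS100T2B0A-YYYY

# registered in PVE as a zfspool storage ("Test2" in the VM config below)
pvesm add zfspool Test2 --pool ZFS-Lab2 --content images,rootdir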
HW config (one server)
Dell R640
2x Intel Xeon Gold 5120
64GB RAM (planned to be expanded to 256GB)
Dell PERC controller in passthrough mode
The idea is to run two PVE nodes in an HA cluster with ZFS replication between the nodes, a remote replica for disaster recovery (for critical VMs), and a third local server running PBS for backups (also used as a QDevice for HA quorum).
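The QDevice part is just the standard external vote daemon setup, something like this (the IP is a placeholder for the PBS host):
Code:
# on the PBS server (it only runs the qnetd daemon, it is not a cluster member)
apt install corosync-qnetd

# on both PVE nodes
apt install corosync-qdevice

# from one PVE node, register the PBS host as QDevice
pvecm qdevice setup 192.0.2.30
pvecm status   # should now list the Qdevice vote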
I configured a network ring (with one dual-port 10G Ethernet card per server), with RSTP over Open vSwitch between the 2 PVE nodes and the PBS server.
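The relevant part of /etc/network/interfaces is essentially the RSTP example from the Proxmox Open vSwitch wiki, simplified here from memory (interface names, the bridge address and the path costs are placeholders):
Code:
auto enp1s0f0
iface enp1s0f0 inet manual
    ovs_type OVSPort
    ovs_bridge vmbr0
    ovs_options other_config:rstp-enable=true other_config:rstp-path-cost=150 other_config:rstp-port-admin-edge=false other_config:rstp-port-auto-edge=false other_config:rstp-port-mcheck=true

auto enp1s0f1
iface enp1s0f1 inet manual
    ovs_type OVSPort
    ovs_bridge vmbr0
    ovs_options other_config:rstp-enable=true other_config:rstp-path-cost=150 other_config:rstp-port-admin-edge=false other_config:rstp-port-auto-edge=false other_config:rstp-port-mcheck=true

auto vmbr0
iface vmbr0 inet static
    address 10.10.10.11/24
    ovs_type OVSBridge
    ovs_ports enp1s0f0 enp1s0f1
    up ovs-vsctl set Bridge ${IFACE} rstp_enable=true other_config:rstp-priority=32768 other_config:rstp-forward-delay=4 other_config:rstp-max-age=6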
Everything works as expected and I'm very satisfied so far.
But...
While testing with some VMs (most of them will be Windows servers) I ran into a major stability issue under high I/O.
Write I/O performance is very poor and the VM becomes very (very) slow to respond to user interaction and other actions during a default CrystalDiskMark run.
The benchmark results also drop to 0.00 in one or more of the write tests (not always reproducible) towards the end.
I ran the CrystalDiskMark benchmark after noticing anomalous behaviour while duplicating a simple 1GB file inside the VM: the guest operating system froze a few seconds after the copy started and remained unresponsive until it finished.
This behaviour only occurs when I use ZFS as the storage backend, with any combination of storage parameters, except with the "Writeback (unsafe)" cache mode.
And only with the Windows write cache active and write-cache buffer flushing enabled (the "turn off buffer flushing" flag unchecked), which is the standard Windows configuration.
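For context, this is how I switch the disk cache mode between runs (VM ID and volume as in the VM config at the bottom of this post; adjust as needed):
Code:
# "Default (No cache)" - as in my current config
qm set 104 --scsi0 Test2:vm-104-disk-0,discard=on,iothread=1,size=100G

# "Writeback"
qm set 104 --scsi0 Test2:vm-104-disk-0,cache=writeback,discard=on,iothread=1,size=100G

# "Writeback (unsafe)" - the only mode without the problem, but it ignores guest flush requests
qm set 104 --scsi0 Test2:vm-104-disk-0,cache=unsafe,discard=on,iothread=1,size=100G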
Tests I've done to figure out the problem:
- Every combination of Windows write-caching settings inside the VM (problems with the write cache active, as described)
- Every combination of VM cache settings for the virtual disks (impact on results, but same behaviour, EXCEPT for "Writeback (unsafe)")
- Separating the test disk from the OS disk inside the VM (no difference)
- Creating a separate disk for the paging file inside the VM (no difference)
- Playing with ZFS ashift, volblocksize and the VM's NTFS allocation unit size (very slight impact on results, same behaviour; see the sketch after this list for how I changed volblocksize)
- Enabling/disabling the ZFS cache on the zvol during the test (huge impact on read results, but same behaviour)
- Enabling/disabling ZFS compression (impact on results, but same behaviour)
- Changing the zpool from mirror to a single disk (nearly the same behaviour)
- Changing the storage backend from ZFS to ext4 (PROBLEM SOLVED using ext4 instead of ZFS)
(Of course, I reinstalled the VM after every ZFS-layer modification such as compression or a change of ashift/volblocksize.)
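For the volblocksize tests mentioned above, I changed the block size on the storage and then recreated the VM disk, roughly like this (the 16k value is just an example, and the dataset path is a guess based on my pool and VM names):
Code:
# set the zvol block size used for newly created disks on this zfspool storage
pvesm set Test2 --blocksize 16k

# verify on a freshly created disk
zfs get volblocksize,compression ZFS-Lab2/vm-104-disk-0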
It looks like a problem specific to ZFS in my setup.
I've searched around for days and found posts like this one (https://forum.proxmox.com/threads/p...ndows-server-2022-et-write-back-disks.127580/) with nearly the same issue, but no practical information except suggestions to use enterprise SSDs.
I know I'm using consumer-grade drives for this test, but since the issue is huge and only appears with a specific combination of settings, I'm looking for help to figure out the real source of the problem.
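If it helps to narrow things down, I can run something like this on the host during the next benchmark; my suspicion (given the enterprise-SSD suggestions) is that sync write latency on these consumer drives is the bottleneck, and the sync=disabled step is meant purely as a diagnostic, not as a fix:
Code:
# watch per-vdev latency and queue depths while CrystalDiskMark runs
zpool iostat -v -l -q ZFS-Lab2 1

# temporarily ignore sync requests on the test pool (DIAGNOSTIC ONLY, unsafe for real data)
zfs set sync=disabled ZFS-Lab2
# ... repeat the benchmark ...
zfs set sync=standard ZFS-Lab2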
Some results from the last tests I ran:
Result for ZFS on single disk, with Win cache ON and buffer flush ON
View attachment 58006
Result for ZFS on single disk, with Win cache ON and buffer flush OFF (unsafe)
View attachment 58007
Result for ZFS on single disk, with Win cache OFF
View attachment 58008
Similar behaviour with the ZFS mirror on two disks.
Result using ext4 instead of ZFS on the same hardware (single disk): no problem at all in this case
View attachment 58009
I hope someone can help me understand where the problem is and how to solve it.
Thanks in advance!
Edoardo
pveversion -v
Code:
proxmox-ve: 8.0.1 (running kernel: 6.2.16-3-pve)
pve-manager: 8.0.3 (running version: 8.0.3/bbf3993334bfa916)
pve-kernel-6.2: 8.0.2
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph-fuse: 17.2.6-pve1+3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx2
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-3
libknet1: 1.25-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.0
libpve-access-control: 8.0.3
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.5
libpve-guest-common-perl: 5.0.3
libpve-http-server-perl: 5.0.3
libpve-rs-perl: 0.8.3
libpve-storage-perl: 8.0.1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
openvswitch-switch: 3.1.0-2
proxmox-backup-client: 2.99.0-1
proxmox-backup-file-restore: 2.99.0-1
proxmox-kernel-helper: 8.0.2
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.0.5
pve-cluster: 8.0.1
pve-container: 5.0.3
pve-docs: 8.0.3
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.2
pve-firmware: 3.7-1
pve-ha-manager: 4.0.2
pve-i18n: 3.0.4
pve-qemu-kvm: 8.0.2-3
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.6
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.12-pve1
zpool status
Code:
  pool: ZFS-Lab2
 state: ONLINE
  scan: scrub repaired 0B in 00:03:57 with 0 errors on Sun Nov 12 00:27:59 2023
config:

        NAME                                         STATE     READ WRITE CKSUM
        ZFS-Lab2                                     ONLINE       0     0     0
          mirror-0                                   ONLINE       0     0     0
            ata-WDC_WDS100T2B0A-00SM50_183602A01791  ONLINE       0     0     0
            ata-WDC_WDS100T2B0A_1849AC802510         ONLINE       0     0     0

errors: No known data errors

  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 00:03:17 with 0 errors on Sun Nov 12 00:27:20 2023
config:

        NAME                                        STATE     READ WRITE CKSUM
        rpool                                       ONLINE       0     0     0
          mirror-0                                  ONLINE       0     0     0
            ata-TOSHIBA_MQ01ABF050_863LT034T-part3  ONLINE       0     0     0
            ata-TOSHIBA_MQ01ABF050_27MDSVHVS-part3  ONLINE       0     0     0

errors: No known data errors
VM config
Code:
agent: 1
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 8
cpu: x86-64-v4
efidisk0: Test:104/vm-104-disk-0.qcow2,efitype=4m,pre-enrolled-keys=1,size=528K
ide2: none,media=cdrom
machine: pc-i440fx-8.0
memory: 8192
meta: creation-qemu=8.0.2,ctime=1699873712
name: Testzzz
net0: virtio=8E:46:68:39:1E:CA,bridge=vmbr0,firewall=1
numa: 0
ostype: win11
scsi0: Test2:vm-104-disk-0,discard=on,iothread=1,size=100G
scsihw: virtio-scsi-single
smbios1: uuid=b65697c3-3c86-4c72-86a4-92b05fc8f241
sockets: 1
tpmstate0: Test:104/vm-104-disk-2.raw,size=4M,version=v2.0
unused0: Test:104/vm-104-disk-1.raw
vmgenid: 0186fd26-8c5d-4ef0-a8bc-d0ff738bef43