Hello,
Since upgrading from 6.4 to 7.1 we have experienced many issues with FreeBSD VMs. Please keep in mind that we had no issues on 6.4.
For example, our Dovecot server (FreeBSD 13, latest patch release) starts de-initializing Dovecot processes during busier times, and the load average jumps from below 1 to over 200. Nothing is actually using any CPU, so this very much looks like a disk I/O issue. Dovecot processes keep starting and never get killed, and we end up with thousands of processes stuck in deinit. The only solution is to reboot the VM.
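For anyone hitting the same symptom, this is roughly how we look at the guest when the load spikes (a sketch of standard FreeBSD tooling, not an exact transcript of our session):

```shell
# On the FreeBSD guest during a load spike:
top -S -m io                 # per-process I/O mode; -S includes system processes
gstat -p                     # per-disk busy% and latency (physical providers only)
ps ax | grep -c "dovecot"    # rough count of the dovecot processes piling up
```

With nothing burning CPU in `top` but `gstat` showing the virtual disk pegged, the load average of 200+ is processes stuck in disk wait.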
There are no errors logged anywhere, neither on the Proxmox host nor in the FreeBSD VM.
The Proxmox summary shows excessively high CPU usage and disk I/O, while network traffic stays low:
This happens with both raw and qcow2 disks. I have tried switching from the default io_uring to native and threads, as well as combinations of no cache and writeback, all with VirtIO SCSI single.
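For reference, the permutations were applied with `qm set` on the Proxmox host (a sketch assuming VM ID 188 and the scsi0 disk from the config; each change needs a VM restart to take effect):

```shell
# Cycle through aio/cache combinations on scsi0 of VM 188:
qm set 188 --scsi0 local-zfs:vm-188-disk-0,aio=native,cache=none,discard=on,ssd=1
qm set 188 --scsi0 local-zfs:vm-188-disk-0,aio=threads,cache=writeback,discard=on,ssd=1
qm set 188 --scsi0 local-zfs:vm-188-disk-0,aio=io_uring,cache=none,discard=on,ssd=1
```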
On the FreeBSD VM, I have tried different timecounters, from HPET and TSC-low to kvmclock.
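The timecounter can be inspected and switched at runtime on the FreeBSD guest (which timecounters are offered depends on what the VM exposes):

```shell
# List the timecounters the kernel detected:
sysctl kern.timecounter.choice
# Show the one currently in use:
sysctl kern.timecounter.hardware
# Switch at runtime, e.g. to HPET (persist via /etc/sysctl.conf):
sysctl kern.timecounter.hardware=HPET
```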
I've also disabled the memory balloon, just in case.
And I have tried different CPU types: host, the actual host processor model, and kvm64.
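The balloon and CPU-type changes again go through `qm set` (VM ID 188 assumed; the named model is a placeholder, pick whichever matches your host):

```shell
qm set 188 --balloon 0            # disable memory ballooning
qm set 188 --cpu host             # pass the host CPU through
qm set 188 --cpu kvm64            # generic fallback CPU type
# or a named model matching the host hardware, e.g. (hypothetical):
qm set 188 --cpu Skylake-Server
```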
This happens randomly, usually during busier times. Sometimes it occurs within a few hours, sometimes it takes days.
I have also tried pve-kernel-5.13.19-1-pve and pve-kernel-5.15.7-1-pve.
I believe this is related to the issue with I/O errors that was occurring on Linux VMs. pve-qemu-kvm 6.1.0-3 does not fix this issue on FreeBSD VMs.
VM config:
Code:
qm config 188
agent: 1
balloon: 0
boot: cdn
bootdisk: scsi0
cores: 24
cpu: host,flags=+aes
machine: q35
memory: 49152
name: garibaldi
net0: virtio=0A:06:9A:F4:7A:01,bridge=vmbr0,firewall=1,queues=8
numa: 0
onboot: 1
ostype: l26
scsi0: local-zfs:vm-188-disk-0,aio=threads,cache=writeback,discard=on,format=raw,size=256G,ssd=1
scsi1: storage:vm-188-disk-0,aio=threads,backup=0,cache=writeback,discard=on,format=raw,size=2T,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=4479ea4e-6825-42fd-bee4-1a194dacf635
sockets: 1
vmgenid: 9ead784d-701c-4817-8614-9cc019ebe2f6
pveversion:
Code:
pveversion -v
proxmox-ve: 7.1-1 (running kernel: 5.13.19-2-pve)
pve-manager: 7.1-8 (running version: 7.1-8/5b267f33)
pve-kernel-5.15: 7.1-7
pve-kernel-helper: 7.1-6
pve-kernel-5.13: 7.1-5
pve-kernel-5.4: 6.4-11
pve-kernel-5.15.7-1-pve: 5.15.7-1
pve-kernel-5.15.5-1-pve: 5.15.5-1
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.4.157-1-pve: 5.4.157-1
ceph-fuse: 14.2.21-1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-5
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-14
libpve-guest-common-perl: 4.0-3
libpve-http-server-perl: 4.0-4
libpve-storage-perl: 7.0-15
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-1
proxmox-backup-client: 2.1.2-1
proxmox-backup-file-restore: 2.1.2-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-4
pve-cluster: 7.1-3
pve-container: 4.1-3
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-4
pve-ha-manager: 3.3-1
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.0-3
pve-xtermjs: 4.12.0-1
qemu-server: 7.1-4
smartmontools: 7.2-pve2
spiceterm: 3.2-2
swtpm: 0.7.0~rc1+2
vncterm: 1.7-1
zfsutils-linux: 2.1.1-pve3
Also notice that disk writes are at almost 3.4 TB while inbound network traffic is only 6.7 GB, so the VM cannot actually have written that much data:
Code:
qm status 188 --verbose
blockstat:
scsi0:
account_failed: 1
account_invalid: 1
failed_flush_operations: 0
failed_rd_operations: 0
failed_unmap_operations: 0
failed_wr_operations: 0
flush_operations: 1
flush_total_time_ns: 269077
idle_time_ns: 1478509445
invalid_flush_operations: 0
invalid_rd_operations: 0
invalid_unmap_operations: 0
invalid_wr_operations: 0
rd_bytes: 12456033280
rd_merged: 0
rd_operations: 422602
rd_total_time_ns: 145656193055
timed_stats:
unmap_bytes: 0
unmap_merged: 0
unmap_operations: 0
unmap_total_time_ns: 0
wr_bytes: 80045617152
wr_highest_offset: 268511662080
wr_merged: 0
wr_operations: 1967992
wr_total_time_ns: 226722352806
scsi1:
account_failed: 1
account_invalid: 1
failed_flush_operations: 0
failed_rd_operations: 0
failed_unmap_operations: 0
failed_wr_operations: 0
flush_operations: 1
flush_total_time_ns: 382650
idle_time_ns: 909307514
invalid_flush_operations: 0
invalid_rd_operations: 0
invalid_unmap_operations: 0
invalid_wr_operations: 0
rd_bytes: 58119712768
rd_merged: 0
rd_operations: 1252234
rd_total_time_ns: 383142803225
timed_stats:
unmap_bytes: 0
unmap_merged: 0
unmap_operations: 0
unmap_total_time_ns: 0
wr_bytes: 3310958258176
wr_highest_offset: 2188121796608
wr_merged: 0
wr_operations: 100482026
wr_total_time_ns: 11465880717734
cpus: 24
disk: 0
diskread: 70575746048
diskwrite: 3391003875328
maxdisk: 274877906944
maxmem: 51539607552
mem: 45357627372
name: garibaldi
netin: 6770949174
netout: 42921508171
nics:
tap188i0:
netin: 6770949174
netout: 42921508171
pid: 3885990
proxmox-support:
pbs-dirty-bitmap: 1
pbs-dirty-bitmap-migration: 1
pbs-dirty-bitmap-savevm: 1
pbs-library-version: 1.2.0 (6e555bc73a7dcfb4d0b47355b958afd101ad27b5)
pbs-masterkey: 1
query-bitmap-info: 1
qmpstatus: running
running-machine: pc-q35-6.1+pve0
running-qemu: 6.1.0
status: running
uptime: 94978
vmid: 188
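To make the mismatch concrete, here is a quick sanity check on the counters above (pure arithmetic on the reported numbers, nothing measured live):

```python
# Figures copied from `qm status 188 --verbose` above.
diskwrite = 3_391_003_875_328   # total bytes written (scsi0 + scsi1)
netin     = 6_770_949_174       # total bytes received on tap188i0

# Even if every inbound byte were written straight to disk, the write
# volume is roughly 500x larger than the network input:
print(f"write/netin ratio: {diskwrite / netin:.0f}x")

# Average write size on scsi1 (the 2T data disk), ~33 KB per operation:
wr_bytes      = 3_310_958_258_176
wr_operations = 100_482_026
print(f"avg write on scsi1: {wr_bytes / wr_operations:.0f} bytes")
```

For a mail server whose writes should track inbound mail volume, a ~500x gap between bytes received and bytes written points at either runaway rewriting inside the guest or inflated accounting in the virtio-scsi layer.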