Proxmox Backup Server has high IOwait and becomes partly unavailable when performing sync jobs

jmaitra

Member
Oct 4, 2020
I have set up two PBS machines (dedicated hardware with LEXAR NM610 PRO NVMe drives) at different locations. Both are installed on ZFS and have a similar setup:

proxmox-backup: 3.2.0 (running kernel: 6.8.12-2-pve)
proxmox-backup-server: 3.2.7-1 (running version: 3.2.7)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.12-2
proxmox-kernel-6.8.12-2-pve-signed: 6.8.12-2
proxmox-kernel-6.8.12-1-pve-signed: 6.8.12-1
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
ifupdown2: 3.2.0-1+pmx9
libjs-extjs: 7.0.0-4
proxmox-backup-docs: 3.2.7-1
proxmox-backup-client: 3.2.7-1
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.7
proxmox-widget-toolkit: 4.2.3
pve-xtermjs: 5.3.0-3
smartmontools: 7.3-pve1
zfsutils-linux: 2.2.6-pve1


Sync source
1728108231451.png


Sync target
1728108283187.png

During sync jobs, the target machine (which pulls) regularly becomes unavailable for up to 5 minutes and then becomes available again.

Any hints?

Best regards
Jens
 
Please show your ZFS setup. Which ZFS parameters did you change?
You can list them with zpool get all and zfs get all.
 
I see a problem with your choice of the LEXAR NM610 PRO: these are QLC flash drives and not suitable for ZFS.
You need enterprise flash with DRAM cache and PLP (power-loss protection). You need high IOPS for random 4k reads and writes,
not just approx. 40 MB/s read and 150 MB/s write.
Use a ZFS RAID 10 setup with SSDs on a SATA3 interface.
 
I would take a look at the other types of jobs you have scheduled and see if any of them coincide with your lag spikes.

So, if you are running a sync, and Pruning or Garbage Collection kick off at the same time ... what would happen?
I don't know. But it would be interesting to find out.
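One way to find out is to compare the start and end times of the jobs in the PBS task log. A minimal sketch, assuming the times have already been converted to seconds (epoch seconds or seconds since midnight); `tasks_overlap` is a hypothetical helper for illustration, not a PBS command, and the example times are made up:

```shell
# Hypothetical helper: succeed (exit 0) if two time windows overlap.
# Arguments: start1 end1 start2 end2, all in seconds.
tasks_overlap() {
    [ "$1" -lt "$4" ] && [ "$3" -lt "$2" ]
}

# Assumed example: a sync job from 02:00 to 03:30 (7200..12600)
# and a garbage collection from 03:00 to 04:00 (10800..14400).
if tasks_overlap 7200 12600 10800 14400; then
    echo "overlap"
else
    echo "no overlap"
fi
```

The real start and end times should be visible in the task list of the PBS web interface or, as far as I know, via `proxmox-backup-manager task list` on the command line.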
 
Thanks for the quick reply. Enclosed you will find the requested parameters. To be honest, I only changed the parameter for limiting memory usage:
https://pve.proxmox.com/wiki/ZFS_on_Linux#sysadmin_zfs_limit_memory_usage
You must set both: zfs_arc_min and zfs_arc_max.

Example from one Proxmox VE Server:
Code:
# File: /etc/modprobe.d/zfs.conf
# Check the currently active values with:
#   cat /sys/module/zfs/parameters/zfs_arc_min
#   cat /sys/module/zfs/parameters/zfs_arc_max
#
# https://www.reddit.com/r/zfs/comments/8102nf/any_experience_with_the_unsupported_openzfs/
# Optional debug flag taken from the thread above; not needed for the ARC limits
options zfs zfs_flags=0x10
#
# Set min ARC size: 512 MiB
options zfs zfs_arc_min=536870912

# Set max ARC size: 4 GiB
options zfs zfs_arc_max=4294967296

# then run: update-initramfs -u -k all
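The byte values in the example above can be derived instead of typed by hand. A minimal sketch in plain sh; the helper names `mib_to_bytes`, `gib_to_bytes` and `arc_options` are hypothetical and not part of any ZFS tooling:

```shell
# Hypothetical helpers: convert MiB / GiB to bytes for zfs_arc_min / zfs_arc_max.
mib_to_bytes() { echo $(( $1 * 1024 * 1024 )); }
gib_to_bytes() { echo $(( $1 * 1024 * 1024 * 1024 )); }

# Print the two modprobe option lines.
# $1 = minimum ARC size in MiB, $2 = maximum ARC size in GiB
arc_options() {
    echo "options zfs zfs_arc_min=$(mib_to_bytes "$1")"
    echo "options zfs zfs_arc_max=$(gib_to_bytes "$2")"
}

# Reproduces the values from the example: 512 MiB min, 4 GiB max
arc_options 512 4
```

The same numbers can also be written to /sys/module/zfs/parameters/zfs_arc_min and zfs_arc_max as root to change the limits on a running system; the modprobe file is what makes them persistent.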

Please also show me the output of zpool status and zpool list -v.

You must set a ZFS quota on rpool.
The size should be approx. 80% of your pool.
Check:
Code:
zpool list rpool
zfs get quota rpool
zfs set quota=xx.yyT rpool
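The 80% figure can be computed from the pool size in bytes rather than estimated. A sketch under stated assumptions: `quota_bytes` is a hypothetical helper, the 2 TB pool size is made up, and on a real system the size would come from `zpool list -Hp -o size rpool`, which prints the exact byte count without a header:

```shell
# Hypothetical helper: return a percentage of a pool size in bytes.
# $1 = pool size in bytes, $2 = percentage (defaults to 80)
quota_bytes() {
    echo $(( $1 * ${2:-80} / 100 ))
}

# Assumed example: a pool of 2000000000000 bytes (2 TB)
quota_bytes 2000000000000
```

The result can be passed to `zfs set quota=<bytes> rpool`, since the quota property accepts a plain byte count. Note that on raidz pools the size reported by zpool list includes parity, so the usable space shown by zfs list may be a better basis there.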

You can tune the zfs pool rpool:
Code:
zfs get atime rpool
zfs set atime=off rpool

# https://docs.oracle.com/cd/E19253-01/820-2313/gayns/index.html

For the dataset that holds the Proxmox Backup Server datastore, you must set:
Code:
zfs set atime=on rpool/<datastore-backup-server>
 
ZFS is the same regardless of PVE or PBS, although PBS has different GUI tools to manage it .... but I'm not sure where we see evidence that you have a ZFS issue. If you see something that looks like it needs changing, come back and ask about that specific setting. We have lots of ZFS people to help, but there are so many voices that one point of reference helps. I'd evaluate any forum advice you receive against the vendor's page.

Other than the forum feedback you've received, what makes you think disk is the problem?
Are these lag spikes always happening at the same time, on a schedule? Could your Garbage Collection or Pruning jobs be interfering with the sync job?
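To put a number on the IOwait during a spike, two snapshots of the first line of /proc/stat can be compared. A minimal sketch, assuming the standard Linux field order (user, nice, system, idle, iowait, irq, softirq, steal); `iowait_percent` is a hypothetical helper and the sample lines are invented:

```shell
# Hypothetical helper: iowait share (integer percent) between two "cpu" lines
# from /proc/stat. Fields after "cpu": user nice system idle iowait irq softirq steal
iowait_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        { total = 0; for (i = 2; i <= 9; i++) total += $i }
        NR == 1 { t1 = total; w1 = $6 }
        NR == 2 {
            if (total > t1) printf "%d\n", ($6 - w1) * 100 / (total - t1)
            else print 0
        }
    '
}

# On a live system (takes 5 seconds):
#   a=$(head -n 1 /proc/stat); sleep 5; b=$(head -n 1 /proc/stat)
#   iowait_percent "$a" "$b"

# Invented sample lines for illustration:
iowait_percent 'cpu 100 0 50 800 50 0 0 0 0 0' 'cpu 150 0 80 1500 250 10 10 0 0 0'
```

Running this in a loop while a sync job is active, and again when only Garbage Collection or Pruning is running, would show which job the spikes belong to.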
 
Thank you for your tips. I decided to only switch the atime parameter to "off" and observe the system for a few days. I will get back to you next weekend.
 
I was just wondering about the high IOwait and why the system sometimes becomes unavailable. I checked the tasks and found some overlap with a slow sync job. I have changed this and will keep observing the situation. Thank you very much for your advice.
 