Hi community and PBS support.
I am experiencing a strange issue with PBS. I am obviously doing something wrong, but I can't figure out what.
PVE Details:
Code:
Cluster: No, single host
Version: pve-manager/8.2.2/9355359cd7afbae4 (running kernel: 6.8.4-2-pve)
Storage: ZFS RAID10 - 4 HDDs + 2x mirrored Intel Optane for SLOG + 1x Samsung NVMe SSD for L2ARC
  pool: data
 state: ONLINE
  scan: scrub repaired 0B in 13:54:03 with 0 errors on Sun Apr 14 14:18:06 2024
config:

        NAME                                    STATE    READ WRITE CKSUM
        data                                    ONLINE      0     0     0
          mirror-0                              ONLINE      0     0     0
            ata-TOSHIBA_MG08ACA16TE_**          ONLINE      0     0     0
            ata-TOSHIBA_MG08ACA16TE_**          ONLINE      0     0     0
          mirror-1                              ONLINE      0     0     0
            ata-TOSHIBA_MG08ACA16TE_**          ONLINE      0     0     0
            ata-TOSHIBA_MG08ACA16TE_**          ONLINE      0     0     0
        logs
          mirror-3                              ONLINE      0     0     0
            nvme-INTEL_SSDPEK1A118GA_**-part1   ONLINE      0     0     0
            nvme-INTEL_SSDPEK1A118GA_**-part1   ONLINE      0     0     0
        cache
          nvme-Samsung_SSD_970_PRO_512GB_**     ONLINE      0     0     0

errors: No known data errors
Code:
root@atlas:~# zfs get all data/data-encrypted
NAME PROPERTY VALUE SOURCE
data/data-encrypted type filesystem -
data/data-encrypted creation Fri Feb 10 7:27 2023 -
data/data-encrypted used 2.51T -
data/data-encrypted available 23.4T -
data/data-encrypted referenced 200K -
data/data-encrypted compressratio 1.00x -
data/data-encrypted mounted yes -
data/data-encrypted quota none default
data/data-encrypted reservation none default
data/data-encrypted recordsize 128K default
data/data-encrypted mountpoint /data/data-encrypted default
data/data-encrypted sharenfs off default
data/data-encrypted checksum on default
data/data-encrypted compression on inherited from data
data/data-encrypted atime off inherited from data
data/data-encrypted devices on default
data/data-encrypted exec on default
data/data-encrypted setuid on default
data/data-encrypted readonly off default
data/data-encrypted zoned off default
data/data-encrypted snapdir hidden default
data/data-encrypted aclmode discard default
data/data-encrypted aclinherit restricted default
data/data-encrypted createtxg 40383 -
data/data-encrypted canmount on default
data/data-encrypted xattr on default
data/data-encrypted copies 1 default
data/data-encrypted version 5 -
data/data-encrypted utf8only off -
data/data-encrypted normalization none -
data/data-encrypted casesensitivity sensitive -
data/data-encrypted vscan off default
data/data-encrypted nbmand off default
data/data-encrypted sharesmb off default
data/data-encrypted refquota none default
data/data-encrypted refreservation none default
data/data-encrypted guid 3980846415803464505 -
data/data-encrypted primarycache all default
data/data-encrypted secondarycache all default
data/data-encrypted usedbysnapshots 0B -
data/data-encrypted usedbydataset 200K -
data/data-encrypted usedbychildren 2.51T -
data/data-encrypted usedbyrefreservation 0B -
data/data-encrypted logbias latency default
data/data-encrypted objsetid 1174 -
data/data-encrypted dedup off default
data/data-encrypted mlslabel none default
data/data-encrypted sync standard default
data/data-encrypted dnodesize legacy default
data/data-encrypted refcompressratio 1.00x -
data/data-encrypted written 200K -
data/data-encrypted logicalused 2.52T -
data/data-encrypted logicalreferenced 69.5K -
data/data-encrypted volmode default default
data/data-encrypted filesystem_limit none default
data/data-encrypted snapshot_limit none default
data/data-encrypted filesystem_count none default
data/data-encrypted snapshot_count none default
data/data-encrypted snapdev hidden default
data/data-encrypted acltype off default
data/data-encrypted context none default
data/data-encrypted fscontext none default
data/data-encrypted defcontext none default
data/data-encrypted rootcontext none default
data/data-encrypted relatime on default
data/data-encrypted redundant_metadata all default
data/data-encrypted overlay on default
data/data-encrypted encryption aes-256-gcm -
data/data-encrypted keylocation prompt local
data/data-encrypted keyformat passphrase -
data/data-encrypted pbkdf2iters 350000 -
data/data-encrypted encryptionroot data/data-encrypted -
data/data-encrypted keystatus available -
data/data-encrypted special_small_blocks 0 default
VM config:
Code:
agent: 0
balloon: 0
boot: order=ide2;scsi0
cores: 8
cpu: host
description: Services status%3A **OPERATIONAL**
ide2: none,media=cdrom
memory: 12288
name: eos*****
net0: virtio=F2:EE:60:FB:**:**,bridge=vmbr0,firewall=1,tag=8
net2: virtio=5A:32:7D:5F:**:**,bridge=vmbr6,firewall=1
net3: virtio=CA:33:E6:7D:**:**,bridge=vmbr1,firewall=1,mtu=1
numa: 0
ostype: l26
protection: 1
rng0: source=/dev/urandom
scsi0: local-zfs-encrypted:vm-103-disk-1,discard=on,iothread=1,size=30G
scsi1: local-zfs-encrypted:vm-103-disk-2,discard=on,iothread=1,size=8G
scsi2: local-zfs-encrypted:vm-103-disk-0,discard=on,iothread=1,size=10T
scsihw: virtio-scsi-single
serial0: socket
smbios1: uuid=6e574e5e-27c4-4ede-889d-*******
sockets: 1
tags: autoupdate;encrypted
vmgenid: 45a052b2-eb22-4629-9974-***********
watchdog: model=i6300esb,action=reset
PBS Details:
Version: proxmox-backup-server 3.2.2-1 running version: 3.2.2
Storage: Single 18TB Toshiba HDD. I tried XFS and ext4, and it is now on ZFS with copies=2. ZFS reports no errors, and SMART reports no errors.
Code:
  pool: backups
 state: ONLINE
config:

        NAME                          STATE    READ WRITE CKSUM
        backups                       ONLINE      0     0     0
          ata-TOSHIBA_MG09ACA18TE_**  ONLINE      0     0     0

errors: No known data errors
Code:
root@atlas:~# zfs get all backups/backups
NAME PROPERTY VALUE SOURCE
backups/backups type filesystem -
backups/backups creation Thu May 9 10:07 2024 -
backups/backups used 2.17T -
backups/backups available 14.1T -
backups/backups referenced 2.17T -
backups/backups compressratio 1.01x -
backups/backups mounted yes -
backups/backups quota none default
backups/backups reservation none default
backups/backups recordsize 128K default
backups/backups mountpoint /srv/backups/images local
backups/backups sharenfs off default
backups/backups checksum on default
backups/backups compression on default
backups/backups atime on default
backups/backups devices on default
backups/backups exec on default
backups/backups setuid on default
backups/backups readonly off default
backups/backups zoned off default
backups/backups snapdir hidden default
backups/backups aclmode discard default
backups/backups aclinherit restricted default
backups/backups createtxg 9 -
backups/backups canmount on default
backups/backups xattr on default
backups/backups copies 2 local
backups/backups version 5 -
backups/backups utf8only off -
backups/backups normalization none -
backups/backups casesensitivity sensitive -
backups/backups vscan off default
backups/backups nbmand off default
backups/backups sharesmb off default
backups/backups refquota none default
backups/backups refreservation none default
backups/backups guid 5060076912392400749 -
backups/backups primarycache all default
backups/backups secondarycache all default
backups/backups usedbysnapshots 0B -
backups/backups usedbydataset 2.17T -
backups/backups usedbychildren 0B -
backups/backups usedbyrefreservation 0B -
backups/backups logbias latency default
backups/backups objsetid 388 -
backups/backups dedup off default
backups/backups mlslabel none default
backups/backups sync standard default
backups/backups dnodesize legacy default
backups/backups refcompressratio 1.01x -
backups/backups written 2.17T -
backups/backups logicalused 2.21T -
backups/backups logicalreferenced 2.21T -
backups/backups volmode default default
backups/backups filesystem_limit none default
backups/backups snapshot_limit none default
backups/backups filesystem_count none default
backups/backups snapshot_count none default
backups/backups snapdev hidden default
backups/backups acltype off default
backups/backups context none default
backups/backups fscontext none default
backups/backups defcontext none default
backups/backups rootcontext none default
backups/backups relatime on default
backups/backups redundant_metadata all default
backups/backups overlay on default
backups/backups encryption off default
backups/backups keylocation none default
backups/backups keyformat none default
backups/backups pbkdf2iters 0 default
backups/backups special_small_blocks 0 default
There are 7 VMs in this backup schedule, but only 1 of them is failing verification. Its disks sit on ZFS-encrypted zvols, and the guest is FreeBSD 14.0, which itself runs ZFS. It has about 2TB of data on a 10TB disk allocated to it.
Errors:
Code:
2024-05-11T05:30:59+03:00: verify backups:vm/103/2024-05-10T20:01:27Z
2024-05-11T05:30:59+03:00: check qemu-server.conf.blob
2024-05-11T05:30:59+03:00: check fw.conf.blob
2024-05-11T05:30:59+03:00: check drive-scsi2.img.fidx
2024-05-11T05:37:04+03:00: can't verify chunk, load failed - store 'backups', unable to load chunk '517aa60e4480771bd0560626b2834e36459c473b9d0ddfde9df3ed45de7a5eac' - Data blob has wrong CRC checksum.
2024-05-11T05:37:04+03:00: corrupted chunk renamed to "/srv/backups/images/backups/.chunks/517a/517aa60e4480771bd0560626b2834e36459c473b9d0ddfde9df3ed45de7a5eac.0.bad"
2024-05-11T09:53:19+03:00: verified 1967791.48/2134788.00 MiB in 15739.33 seconds, speed 125.02/135.63 MiB/s (1 errors)
2024-05-11T09:53:19+03:00: verify backups:vm/103/2024-05-10T20:01:27Z/drive-scsi2.img.fidx failed: chunks could not be verified
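For anyone triaging a log like the one above, here is a minimal Python sketch that pulls the failed chunk digests out of a verify task log and maps each one to its on-disk location. The function names `failed_chunks` and `chunk_path` are hypothetical helpers, not part of PBS. The path layout (`.chunks/<first 4 hex characters>/<digest>`) is inferred only from the rename message in this log.

```python
import re

# Matches the "unable to load chunk '<digest>' - <reason>" message from the verify log.
CHUNK_RE = re.compile(r"unable to load chunk '([0-9a-f]{64})' - (.+)$")


def failed_chunks(log_text):
    """Return (digest, reason) pairs for every chunk the verify job could not load."""
    found = []
    for line in log_text.splitlines():
        m = CHUNK_RE.search(line)
        if m:
            found.append((m.group(1), m.group(2).strip()))
    return found


def chunk_path(datastore_root, digest):
    """Build the chunk path using the layout seen in the log (an inference, not a spec)."""
    return f"{datastore_root}/.chunks/{digest[:4]}/{digest}"


# Sample input: the relevant line copied from the verify log in this post.
LOG = (
    "2024-05-11T05:37:04+03:00: can't verify chunk, load failed - store 'backups', "
    "unable to load chunk "
    "'517aa60e4480771bd0560626b2834e36459c473b9d0ddfde9df3ed45de7a5eac' "
    "- Data blob has wrong CRC checksum.\n"
)

for digest, reason in failed_chunks(LOG):
    print(chunk_path("/srv/backups/images/backups", digest), "->", reason)
```

Running the same function over the logs of several verify jobs makes it easy to see whether the same digest fails every time or a different chunk fails on each run.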
Code:
root@atlas:~# du -sh /srv/backups/images/backups/.chunks/517a/517aa60e4480771bd0560626b2834e36459c473b9d0ddfde9df3ed45de7a5eac.0.bad
4.1M /srv/backups/images/backups/.chunks/517a/517aa60e4480771bd0560626b2834e36459c473b9d0ddfde9df3ed45de7a5eac.0.bad
root@atlas:~# ls -la /srv/backups/images/backups/.chunks/517a/517aa60e4480771bd0560626b2834e36459c473b9d0ddfde9df3ed45de7a5eac.0.bad
-rw-r--r-- 1 backup backup 4179962 May 9 10:50 /srv/backups/images/backups/.chunks/517a/517aa60e4480771bd0560626b2834e36459c473b9d0ddfde9df3ed45de7a5eac.0.bad
root@atlas:~#
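To re-check the renamed `.bad` file by hand, a rough Python sketch could recompute the checksum independently of PBS. It rests on assumptions that should be verified against the PBS file format documentation before trusting the result: that a data blob starts with an 8-byte magic followed by a 4-byte little-endian CRC32, that the CRC covers only the bytes after the header, and that the blob is unencrypted (12-byte header). The names `blob_crc_matches` and `check_file` are hypothetical.

```python
import struct
import zlib


def blob_crc_matches(raw, header_len=12):
    """Compare the CRC32 stored in the blob header with one recomputed over the payload.

    ASSUMED layout: 8-byte magic, 4-byte little-endian CRC32, then the payload.
    header_len=12 assumes an unencrypted blob; an encrypted blob would carry
    additional header bytes, so header_len would have to be larger.
    """
    if len(raw) < header_len:
        raise ValueError("file is shorter than the assumed header")
    (stored,) = struct.unpack_from("<I", raw, 8)  # CRC field right after the magic
    computed = zlib.crc32(raw[header_len:]) & 0xFFFFFFFF
    return stored == computed, stored, computed


def check_file(path, header_len=12):
    """Read a chunk file from disk and run the CRC comparison on it."""
    with open(path, "rb") as fh:
        return blob_crc_matches(fh.read(), header_len)


# Self-check on a synthetic blob built with the same assumed layout.
payload = b"example payload"
good = b"\0" * 8 + struct.pack("<I", zlib.crc32(payload)) + payload
flipped = good[:-1] + bytes([good[-1] ^ 0x01])  # single bit flip in the payload
print(blob_crc_matches(good)[0], blob_crc_matches(flipped)[0])
```

The self-check prints `True False`. For the real file, `check_file()` would be called with the `.bad` path from the log. If the layout assumptions hold, a mismatch that repeats on every read would suggest the bytes stored on disk differ from what was checksummed when the chunk was written, rather than a transient read error.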
I tried destroying and recreating the backups zpool, setting copies=2, using XFS and ext4 instead, giving PBS more threads to work with (more workers), and limiting the bandwidth so as not to overwhelm the disk. Nothing helped.
Please help me find what I am missing here.