Hello,
We regularly experience corruption on our PBS backup storage when rebooting the PBS server. We were using PBS version 7 and have since updated to version 8, with the same problem.
Our datastore volume is hosted on an HP disk array that exports an iSCSI volume via multipath.
Our iSCSI volume is partitioned with LVM and formatted as ext4.
Our fstab mount options are as follows:
Code:
/dev/mapper/saveVM /mnt/backup ext4 defaults,relatime,_netdev,x-systemd.requires=iscsid.service 0 0
After a problematic restart, we see the following errors:
Code:
...
error during snapshot file listing: 'unable to load blob '"/mnt/backup/vm/116/2023-07-05T20:00:01Z/index.json.blob"' - unable to parse raw blob - wrong magic'
kernel: EXT4-fs warning (device dm-4): ext4_dirblock_csum_verify:404: inode #701072025: comm UPID:restaurix:: No space for directory leaf checksum. Please run e2fsck -D.
kernel: EXT4-fs error (device dm-4): __ext4_find_entry:1673: inode #701072025: comm UPID:xx:: checksumming directory block 0
....
The "index.json.blob" file is corrupted and contains only null bytes.
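To find out how many snapshots are affected, a small scan for blobs that consist entirely of null bytes could look like the following. This is only a minimal sketch: the function names `is_all_null` and `find_null_blobs` are hypothetical, and it assumes the datastore is mounted at /mnt/backup and that only files named index.json.blob are of interest. It reads files without modifying anything.

```python
import os
import sys


def is_all_null(path, chunk_size=1024 * 1024):
    """Return True if the file is non-empty and every byte is 0x00."""
    seen_data = False
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            seen_data = True
            # Any byte other than 0x00 means the file is not blanked out.
            if chunk.count(b"\x00") != len(chunk):
                return False
    return seen_data


def find_null_blobs(root, name="index.json.blob"):
    """Walk root and list every file called `name` that is all null bytes."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if name in filenames:
            candidate = os.path.join(dirpath, name)
            if is_all_null(candidate):
                hits.append(candidate)
    return sorted(hits)


if __name__ == "__main__":
    # Assumed default mount point, taken from the fstab entry in this post.
    root = sys.argv[1] if len(sys.argv) > 1 else "/mnt/backup"
    for path in find_null_blobs(root):
        print(path)
```

Run with the datastore path as the only argument, it prints one affected file per line. Empty files are deliberately not reported, since they are a different symptom from a file filled with zeros.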
Our "/etc/multipath.conf" file has the following options:
Code:
blacklist {
    wwid .*
}
blacklist_exceptions {
    wwid "3600c0ff000669d2185a3a66401000000"
    wwid "3600c0ff000669fbf86a3a66401000000"
}
multipaths {
    multipath {
        wwid "3600c0ff000669d2185a3a66401000000"
        alias mpath-saveVM-A
    }
    multipath {
        wwid "3600c0ff000669fbf86a3a66401000000"
        alias mpath-saveVM-B
    }
}
defaults {
    polling_interval 2
    path_selector "round-robin 0"
    path_grouping_policy multibus
    uid_attribute ID_SERIAL
    rr_min_io 100
    failback immediate
    no_path_retry queue
    user_friendly_names yes
}
On the HP array side, we have tested both the "write-back" and "write-through" cache modes.
The problem seems to recur whenever a reboot is triggered while a PBS backup is running.
Sorry for the lengthy message. Do you have any ideas on how to solve this problem, or any leads for investigating its source?
Thank you in advance!
Vincent