Thanks, the garbage collection should however be unrelated.
Let me give some more insights on the current hypothesis: The suspicion is that there is a cephfs bug related to caching, making this unfortunately also more difficult to reproduce and debug. This is also why the backup is successful, as checksums and sizes written to disk as seen by the proxmox backup client do match the expected ones, while the underlying filesystem seems to not persist the initial writing of the header, only writing the final bytes when updating the index checksum. This is why the full 4k page is zeroed, except for the checksum written when finishing and closing the fixed index file. Also, so far this seems to only happen with VMs having more than 1 disk attached, which indicates that only a certain sequence of operations and timings leads to this.
So comparing the dynamic debug output of the boguos run with a good run should help to see in the source code where things go wrong.
After doing some testing I was able to get the verification errors again with my homeassistant VM backups using kernal ceph. I ran the bash script and grabbed log dumps, however when the debugging is enabled I just get a checksum error. I also spun up a 2gb small alpine linux vm and grabbed logs form it as well. I got one log with a checksum error and one without. I'll send them to you in a PM.
ERROR: Backup of VM 200 failed - backup close image failed: command error: fixed writer 'drive-scsi0.img.fidx' close failed - got unexpected checksum
Bash:
INFO: starting new backup job: vzdump 200 --mode snapshot --storage proxmox-backup-server --notes-template '{{guestname}}' --remove 0 --notification-mode auto --node proxmox01
INFO: Starting Backup of VM 200 (qemu)
INFO: Backup started at 2024-08-06 16:04:35
INFO: status = running
INFO: VM Name: HomeAssistant
INFO: include disk 'scsi0' 'ceph-replicated:vm-200-disk-2' 40G
INFO: include disk 'efidisk0' 'ceph-replicated:vm-200-disk-1' 1M
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/200/2024-08-06T23:04:35Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '991ac964-b21b-4cbe-adff-5d591cbf4c1f'
INFO: resuming VM again
INFO: efidisk0: dirty-bitmap status: existing bitmap was invalid and has been cleared
INFO: scsi0: dirty-bitmap status: existing bitmap was invalid and has been cleared
INFO: 2% (1.0 GiB of 40.0 GiB) in 3s, read: 350.7 MiB/s, write: 60.0 MiB/s
INFO: 3% (1.2 GiB of 40.0 GiB) in 26s, read: 8.3 MiB/s, write: 3.5 MiB/s
INFO: 4% (1.6 GiB of 40.0 GiB) in 51s, read: 16.6 MiB/s, write: 3.4 MiB/s
INFO: 5% (2.0 GiB of 40.0 GiB) in 1m 20s, read: 13.7 MiB/s, write: 3.9 MiB/s
INFO: 6% (2.4 GiB of 40.0 GiB) in 1m 40s, read: 20.2 MiB/s, write: 7.2 MiB/s
INFO: 7% (2.8 GiB of 40.0 GiB) in 2m 24s, read: 10.4 MiB/s, write: 3.4 MiB/s
INFO: 8% (3.2 GiB of 40.0 GiB) in 2m 50s, read: 14.0 MiB/s, write: 945.2 KiB/s
INFO: 9% (3.6 GiB of 40.0 GiB) in 2m 54s, read: 105.0 MiB/s, write: 41.0 MiB/s
INFO: 10% (4.0 GiB of 40.0 GiB) in 3m 18s, read: 16.7 MiB/s, write: 3.0 MiB/s
INFO: 11% (4.4 GiB of 40.0 GiB) in 3m 34s, read: 25.5 MiB/s, write: 7.8 MiB/s
INFO: 12% (4.9 GiB of 40.0 GiB) in 3m 49s, read: 31.7 MiB/s, write: 4.0 MiB/s
INFO: 13% (5.2 GiB of 40.0 GiB) in 3m 56s, read: 50.9 MiB/s, write: 4.0 MiB/s
INFO: 14% (5.6 GiB of 40.0 GiB) in 4m 2s, read: 73.3 MiB/s, write: 4.0 MiB/s
INFO: 15% (6.0 GiB of 40.0 GiB) in 4m 6s, read: 93.0 MiB/s, write: 6.0 MiB/s
INFO: 16% (6.4 GiB of 40.0 GiB) in 4m 16s, read: 43.6 MiB/s, write: 2.8 MiB/s
INFO: 17% (6.8 GiB of 40.0 GiB) in 4m 20s, read: 98.0 MiB/s, write: 5.0 MiB/s
INFO: 18% (7.2 GiB of 40.0 GiB) in 4m 26s, read: 66.0 MiB/s, write: 6.7 MiB/s
INFO: 19% (7.7 GiB of 40.0 GiB) in 4m 32s, read: 76.7 MiB/s, write: 4.0 MiB/s
INFO: 20% (8.0 GiB of 40.0 GiB) in 4m 38s, read: 60.0 MiB/s, write: 9.3 MiB/s
INFO: 21% (8.4 GiB of 40.0 GiB) in 4m 43s, read: 83.2 MiB/s, write: 6.4 MiB/s
INFO: 22% (8.9 GiB of 40.0 GiB) in 4m 49s, read: 80.7 MiB/s, write: 10.7 MiB/s
INFO: 23% (9.3 GiB of 40.0 GiB) in 5m 6s, read: 23.1 MiB/s, write: 2.1 MiB/s
INFO: 24% (9.6 GiB of 40.0 GiB) in 5m 11s, read: 78.4 MiB/s, write: 2.4 MiB/s
INFO: 25% (10.0 GiB of 40.0 GiB) in 5m 18s, read: 54.9 MiB/s, write: 3.4 MiB/s
INFO: 26% (10.4 GiB of 40.0 GiB) in 5m 24s, read: 64.7 MiB/s, write: 1.3 MiB/s
INFO: 27% (10.8 GiB of 40.0 GiB) in 5m 30s, read: 72.0 MiB/s, write: 5.3 MiB/s
INFO: 28% (11.2 GiB of 40.0 GiB) in 5m 37s, read: 58.3 MiB/s, write: 2.3 MiB/s
INFO: 29% (11.7 GiB of 40.0 GiB) in 5m 42s, read: 89.6 MiB/s, write: 10.4 MiB/s
INFO: 30% (12.0 GiB of 40.0 GiB) in 5m 49s, read: 50.9 MiB/s, write: 3.4 MiB/s
INFO: 31% (12.4 GiB of 40.0 GiB) in 5m 55s, read: 75.3 MiB/s, write: 12.7 MiB/s
INFO: 32% (12.8 GiB of 40.0 GiB) in 6m 43s, read: 7.7 MiB/s, write: 1.2 MiB/s
INFO: 33% (13.2 GiB of 40.0 GiB) in 7m 18s, read: 12.1 MiB/s, write: 702.2 KiB/s
INFO: 34% (13.6 GiB of 40.0 GiB) in 7m 23s, read: 86.4 MiB/s, write: 5.6 MiB/s
INFO: 35% (14.0 GiB of 40.0 GiB) in 7m 31s, read: 47.5 MiB/s, write: 5.0 MiB/s
INFO: 36% (14.5 GiB of 40.0 GiB) in 7m 39s, read: 56.0 MiB/s, write: 9.0 MiB/s
INFO: 37% (14.8 GiB of 40.0 GiB) in 7m 44s, read: 79.2 MiB/s, write: 7.2 MiB/s
INFO: 38% (15.2 GiB of 40.0 GiB) in 7m 49s, read: 75.2 MiB/s, write: 4.8 MiB/s
INFO: 39% (15.6 GiB of 40.0 GiB) in 7m 56s, read: 61.1 MiB/s, write: 1.7 MiB/s
INFO: 40% (16.0 GiB of 40.0 GiB) in 8m 1s, read: 80.0 MiB/s, write: 19.2 MiB/s
INFO: 41% (16.4 GiB of 40.0 GiB) in 8m 10s, read: 48.9 MiB/s, write: 13.3 MiB/s
INFO: 42% (16.8 GiB of 40.0 GiB) in 8m 23s, read: 28.3 MiB/s, write: 8.9 MiB/s
INFO: 43% (17.3 GiB of 40.0 GiB) in 8m 39s, read: 34.8 MiB/s, write: 1.5 MiB/s
INFO: 44% (17.7 GiB of 40.0 GiB) in 8m 45s, read: 58.0 MiB/s, write: 2.0 MiB/s
INFO: 45% (18.0 GiB of 40.0 GiB) in 8m 50s, read: 72.8 MiB/s, write: 3.2 MiB/s
INFO: 46% (18.4 GiB of 40.0 GiB) in 8m 56s, read: 61.3 MiB/s, write: 8.0 MiB/s
INFO: 47% (18.8 GiB of 40.0 GiB) in 9m 1s, read: 88.0 MiB/s, write: 12.0 MiB/s
INFO: 48% (19.3 GiB of 40.0 GiB) in 9m 9s, read: 54.0 MiB/s, write: 4.0 MiB/s
INFO: 49% (19.6 GiB of 40.0 GiB) in 9m 15s, read: 64.7 MiB/s, write: 0 B/s
INFO: 50% (20.0 GiB of 40.0 GiB) in 9m 20s, read: 78.4 MiB/s, write: 3.2 MiB/s
INFO: 51% (20.4 GiB of 40.0 GiB) in 9m 24s, read: 110.0 MiB/s, write: 6.0 MiB/s
INFO: 52% (21.1 GiB of 40.0 GiB) in 9m 27s, read: 234.7 MiB/s, write: 9.3 MiB/s
INFO: 53% (21.5 GiB of 40.0 GiB) in 9m 30s, read: 141.3 MiB/s, write: 2.7 MiB/s
INFO: 56% (22.4 GiB of 40.0 GiB) in 9m 33s, read: 304.0 MiB/s, write: 49.3 MiB/s
INFO: 57% (22.8 GiB of 40.0 GiB) in 9m 52s, read: 19.8 MiB/s, write: 7.8 MiB/s
INFO: 61% (24.8 GiB of 40.0 GiB) in 10m 12s, read: 101.8 MiB/s, write: 614.4 KiB/s
INFO: 66% (26.5 GiB of 40.0 GiB) in 10m 15s, read: 574.7 MiB/s, write: 6.7 MiB/s
INFO: 67% (26.8 GiB of 40.0 GiB) in 10m 20s, read: 70.4 MiB/s, write: 6.4 MiB/s
INFO: 68% (27.2 GiB of 40.0 GiB) in 10m 27s, read: 57.1 MiB/s, write: 0 B/s
INFO: 69% (27.6 GiB of 40.0 GiB) in 10m 33s, read: 72.7 MiB/s, write: 11.3 MiB/s
INFO: 70% (28.1 GiB of 40.0 GiB) in 10m 40s, read: 61.1 MiB/s, write: 7.4 MiB/s
INFO: 71% (28.4 GiB of 40.0 GiB) in 10m 44s, read: 89.0 MiB/s, write: 4.0 MiB/s
INFO: 72% (28.8 GiB of 40.0 GiB) in 10m 51s, read: 61.1 MiB/s, write: 5.7 MiB/s
INFO: 73% (29.3 GiB of 40.0 GiB) in 10m 57s, read: 80.7 MiB/s, write: 14.0 MiB/s
INFO: 74% (29.7 GiB of 40.0 GiB) in 11m 7s, read: 37.2 MiB/s, write: 9.6 MiB/s
INFO: 75% (30.0 GiB of 40.0 GiB) in 11m 18s, read: 32.7 MiB/s, write: 10.9 MiB/s
INFO: 76% (30.4 GiB of 40.0 GiB) in 11m 33s, read: 26.9 MiB/s, write: 4.5 MiB/s
INFO: 77% (30.9 GiB of 40.0 GiB) in 11m 42s, read: 51.1 MiB/s, write: 4.0 MiB/s
INFO: 78% (31.3 GiB of 40.0 GiB) in 11m 48s, read: 74.0 MiB/s, write: 14.7 MiB/s
INFO: 79% (31.6 GiB of 40.0 GiB) in 11m 56s, read: 43.0 MiB/s, write: 4.0 MiB/s
INFO: 80% (32.1 GiB of 40.0 GiB) in 12m 2s, read: 73.3 MiB/s, write: 2.7 MiB/s
INFO: 81% (32.4 GiB of 40.0 GiB) in 12m 8s, read: 63.3 MiB/s, write: 1.3 MiB/s
INFO: 82% (33.0 GiB of 40.0 GiB) in 12m 11s, read: 186.7 MiB/s, write: 9.3 MiB/s
INFO: 94% (37.8 GiB of 40.0 GiB) in 12m 14s, read: 1.6 GiB/s, write: 5.3 MiB/s
INFO: 98% (39.2 GiB of 40.0 GiB) in 12m 17s, read: 490.7 MiB/s, write: 0 B/s
INFO: 99% (39.6 GiB of 40.0 GiB) in 12m 20s, read: 138.7 MiB/s, write: 0 B/s
INFO: 100% (40.0 GiB of 40.0 GiB) in 12m 29s, read: 42.7 MiB/s, write: 58.7 KiB/s
ERROR: backup close image failed: command error: fixed writer 'drive-scsi0.img.fidx' close failed - got unexpected checksum
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 200 failed - backup close image failed: command error: fixed writer 'drive-scsi0.img.fidx' close failed - got unexpected checksum
INFO: Failed at 2024-08-06 16:17:36
INFO: Backup job finished with errors
INFO: notified via target `mail-to-root`
TASK ERROR: job errors