Chunk with wrong digest when verifying a new backup over new disks

CieNTi

Member
Sep 8, 2022
Hello,

I just installed a new PVE+PBS machine with the following data disk layout:

Code:
sdc                                8:32   0   3.6T  0 disk
├─sdc1                             8:33   0   1.8T  0 part
│ └─md2                            9:2    0   1.8T  0 raid1
│   └─pbs--hdd0-backup--hdd0     252:4    0   1.8T  0 lvm   /mnt/pbs/pbs-dtm1-hdd0
└─sdc2                             8:34   0   1.8T  0 part
  └─md3                            9:3    0   1.8T  0 raid1
    ├─pve--hdd0-data--hdd0_tmeta 252:5    0   1.8G  0 lvm 
    │ └─pve--hdd0-data--hdd0     252:7    0   1.8T  0 lvm 
    └─pve--hdd0-data--hdd0_tdata 252:6    0   1.8T  0 lvm 
      └─pve--hdd0-data--hdd0     252:7    0   1.8T  0 lvm 
sdd                                8:48   0   3.6T  0 disk
├─sdd1                             8:49   0   1.8T  0 part
│ └─md2                            9:2    0   1.8T  0 raid1
│   └─pbs--hdd0-backup--hdd0     252:4    0   1.8T  0 lvm   /mnt/pbs/pbs-dtm1-hdd0
└─sdd2                             8:50   0   1.8T  0 part
  └─md3                            9:3    0   1.8T  0 raid1
    ├─pve--hdd0-data--hdd0_tmeta 252:5    0   1.8G  0 lvm 
    │ └─pve--hdd0-data--hdd0     252:7    0   1.8T  0 lvm 
    └─pve--hdd0-data--hdd0_tdata 252:6    0   1.8T  0 lvm 
      └─pve--hdd0-data--hdd0     252:7    0   1.8T  0 lvm

For the record, I waited for mdadm to finish its sync before performing a single write. /dev/md3 is only configured and not used yet; there is no data on it and no operations run against it.
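For anyone who wants to confirm the "sync finished" condition rather than eyeball it, here is a minimal sketch that scans the text of /proc/mdstat for arrays still reporting progress. The function name and the sample text (including its numbers) are made up for illustration; only the general layout of /proc/mdstat is assumed.

```python
import re

# Made-up sample in the usual /proc/mdstat layout; the numbers are illustrative only.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md3 : active raid1 sdd2[1] sdc2[0]
      1953382400 blocks super 1.2 [2/2] [UU]
      [==>..................]  resync = 12.6% (246126976/1953382400) finish=142.3min speed=199876K/sec
      bitmap: 14/15 pages [56KB], 65536KB chunk

md2 : active raid1 sdd1[1] sdc1[0]
      1953382400 blocks super 1.2 [2/2] [UU]
      bitmap: 0/15 pages [0KB], 65536KB chunk

unused devices: <none>
"""


def arrays_still_syncing(mdstat_text):
    """Return names of md arrays that report a resync/recovery/reshape/check."""
    syncing = []
    current = None
    for line in mdstat_text.splitlines():
        # An array section starts with a line like "md2 : active raid1 ...".
        header = re.match(r"^(md\S+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Progress lines look like "[==>....]  resync = 12.6% ..." and
        # queued ones like "resync=DELAYED"; both contain "<action> =".
        if current and re.search(r"\b(resync|recovery|reshape|check)\s*=", line):
            syncing.append(current)
    return syncing


if __name__ == "__main__":
    print(arrays_still_syncing(SAMPLE_MDSTAT))
```

On a real machine the same function could be fed `open("/proc/mdstat").read()`; an empty list would suggest no array is currently syncing.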

The disks are brand new, and I ran badblocks -vs /dev/sdX on both of them before partitioning, without a single error.

SMART also looks fine:

1723924435398.png

Once all the partitions were set up as shown above, I configured PBS to use /mnt/pbs/pbs-dtm1-hdd0 as storage and one of my PVE hosts to use it as a backup target, and then moved on to testing.

I configured a backup schedule, ran it, and waited for it to finish, which it did without problems.

Then I went to PBS and ran a verification of the just-finished backup, which reported invalid chunks.

1723924891820.png

I repeated the same procedure, but I always get some invalid chunks.
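To make clear what a "wrong digest" means here, the following is a simplified illustration of digest verification in a content-addressed store, where each chunk file is named after the SHA-256 of its contents. It is only a sketch of the idea: PBS's real on-disk chunk format is not modeled, and all function names are hypothetical.

```python
import hashlib
import os
import tempfile


def store_chunk(store_dir, data):
    """Write `data` to a file named after its SHA-256 digest; return the digest."""
    digest = hashlib.sha256(data).hexdigest()
    with open(os.path.join(store_dir, digest), "wb") as f:
        f.write(data)
    return digest


def find_bad_chunks(store_dir):
    """Re-hash every chunk file and return the names whose content no longer matches."""
    bad = []
    for name in sorted(os.listdir(store_dir)):
        with open(os.path.join(store_dir, name), "rb") as f:
            actual = hashlib.sha256(f.read()).hexdigest()
        if actual != name:
            bad.append(name)
    return bad


def demo():
    """Store two chunks, corrupt one bit in the second, and run the verification."""
    with tempfile.TemporaryDirectory() as store:
        good = store_chunk(store, b"chunk A" * 1000)
        damaged = store_chunk(store, b"chunk B" * 1000)
        # Simulate silent corruption below the filesystem: flip one bit in place.
        with open(os.path.join(store, damaged), "r+b") as f:
            first = f.read(1)
            f.seek(0)
            f.write(bytes([first[0] ^ 0x01]))
        return good, damaged, find_bad_chunks(store)


if __name__ == "__main__":
    good, damaged, bad = demo()
    print("bad chunks:", bad)
```

The point of the sketch is that a single flipped bit anywhere between the write and the later read is enough for a chunk to fail verification, regardless of which layer caused it.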

Is there something obvious that I'm not doing right? How can I trace what is happening?

It makes no sense to me to get even a single bad chunk from such a simple procedure on a new system with new disks, and I'm out of ideas.
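One way to check whether the storage stack itself returns different data than was written, independent of PBS, is a plain write-and-read-back test against the mount point. The sketch below is an assumption about how such a test could look, not an established procedure; the function names and the test file name are hypothetical. Note that an immediate re-read may be served from the page cache, so a meaningful run would need a file larger than RAM or dropped caches between write and read.

```python
import hashlib
import os
import tempfile


def write_and_reread(directory, size_mib=4, rounds=3):
    """Write random data into `directory`, re-read it `rounds` times,
    and return the round numbers whose SHA-256 did not match."""
    data = os.urandom(size_mib * 1024 * 1024)
    expected = hashlib.sha256(data).hexdigest()
    path = os.path.join(directory, "readback-test.bin")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the data down to the block device
    mismatches = []
    try:
        for round_no in range(1, rounds + 1):
            with open(path, "rb") as f:
                actual = hashlib.sha256(f.read()).hexdigest()
            if actual != expected:
                mismatches.append(round_no)
    finally:
        os.remove(path)
    return mismatches


def self_test():
    """Run the check against a throwaway temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        return write_and_reread(tmp, size_mib=1, rounds=2)


if __name__ == "__main__":
    print("mismatching rounds:", self_test())
```

Pointing `write_and_reread` at the datastore mount (here /mnt/pbs/pbs-dtm1-hdd0) with a much larger `size_mib` would be the real-world variant; a non-empty result would point at the layers below PBS.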

Thanks in advance,
CieNTi
 
Things got even worse: later, even the backup itself was unable to finish due to invalid chunks.

The scenario in my previous post was 'partition A + partition B = mdadm RAID1, then a VG on it, then an LV on it (no thin)'. I kept testing different scenarios and finally found a working one: 'partition A + partition B = mdadm RAID1, then ext4 directly on it, no volumes at all':

Code:
sdc                                8:32   0   3.6T  0 disk  
├─sdc1                             8:33   0   1.8T  0 part  
│ └─md2                            9:2    0   1.8T  0 raid1 /mnt/pbs/pbs-dtm1-hdd0
└─sdc2                             8:34   0   1.8T  0 part  
  └─md3                            9:3    0   1.8T  0 raid1 
    ├─pve--hdd0-data--hdd0_tmeta 252:5    0   1.8G  0 lvm   
    │ └─pve--hdd0-data--hdd0     252:7    0   1.8T  0 lvm   
    └─pve--hdd0-data--hdd0_tdata 252:6    0   1.8T  0 lvm   
      └─pve--hdd0-data--hdd0     252:7    0   1.8T  0 lvm   
sdd                                8:48   0   3.6T  0 disk  
├─sdd1                             8:49   0   1.8T  0 part  
│ └─md2                            9:2    0   1.8T  0 raid1 /mnt/pbs/pbs-dtm1-hdd0
└─sdd2                             8:50   0   1.8T  0 part  
  └─md3                            9:3    0   1.8T  0 raid1 
    ├─pve--hdd0-data--hdd0_tmeta 252:5    0   1.8G  0 lvm   
    │ └─pve--hdd0-data--hdd0     252:7    0   1.8T  0 lvm   
    └─pve--hdd0-data--hdd0_tdata 252:6    0   1.8T  0 lvm   
      └─pve--hdd0-data--hdd0     252:7    0   1.8T  0 lvm

This way the backup from PVE to PBS runs to completion, and PBS successfully verifies the content without a single bad chunk.

Is there any problem with my first scenario?

As far as I have read, putting an LVM volume on top of an mdadm RAID should not be a problem ... but reality was different for me :S

Thanks in advance,
CieNTi
 
