Problems replacing rpool disk

shtorm93

Feb 4, 2020
I'm failing to attach or replace the ZFS partition on my mirrored Proxmox installation (rpool). The error is "cannot attach OLDDISK-part3 to NEWDISK-part3: device is too small".
I tried following several sets of instructions: copying the partitions from a healthy working disk, and even creating all partitions manually with the same start and end blocks. Following the "Changing a failed bootable device" section of https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#_zfs_administration
I get as far as the zpool replace step and hit the same error. I tried different drives (larger, smaller, and the same capacity), and all end up with the same error. Changing the partition size also had no effect.

Running proxmox-boot-tool format <new disk's ESP> also fails, with the error E: '/dev/NEWDISK2' is not a block device!
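
For reference, the sequence from that docs section can be sketched as a dry run that only prints the commands so they can be reviewed before anything touches a disk. This is a rough sketch, not a tested procedure: it assumes the default Proxmox installer layout on /dev/disk/by-id paths (partition 2 is the ESP, partition 3 is the ZFS partition), and HEALTHY, NEWDISK and OLDDISK are placeholders. The helper names print_replace_steps and is_block_device are made up for illustration.

```shell
#!/bin/sh
# Dry run: only prints the commands for review, nothing is executed.
# Assumes the Proxmox installer layout on /dev/disk/by-id paths:
# partition 2 = ESP, partition 3 = ZFS. All device names are placeholders.

print_replace_steps() {
    healthy=$1   # healthy bootable disk still in the mirror
    new=$2       # replacement disk
    old_zfs=$3   # failed ZFS partition, named as in `zpool status rpool`

    echo "sgdisk $healthy -R $new"                        # copy the partition table
    echo "sgdisk -G $new"                                 # give the copy new GUIDs
    echo "zpool replace -f rpool $old_zfs ${new}-part3"   # start the resilver
    echo "proxmox-boot-tool format ${new}-part2"          # format the new ESP
    echo "proxmox-boot-tool init ${new}-part2"            # set up the boot loader on it
}

# True only if the path exists and is a block device node.
is_block_device() {
    [ -b "$1" ]
}

print_replace_steps /dev/disk/by-id/HEALTHY /dev/disk/by-id/NEWDISK /dev/disk/by-id/OLDDISK-part3
```

If is_block_device fails for the new disk's -part2 path, the kernel has no device node for that partition yet, which would be consistent with the "is not a block device" error.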

I'm not that proficient in ZFS and I'm probably (hopefully) missing something very simple.
 
If the partitions are the same size, then that error is really surprising. Sometimes a partprobe or a reboot is required to make the system refresh the partition information. Can you, after a reboot, show the output of lsblk, zpool status rpool, gdisk -l /dev/YOUR-OLD-DRIVE, and gdisk -l /dev/YOUR-NEW-DRIVE to give us an overview of the situation?
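
One way to check whether the running kernel still has stale partition information is to compare the sizes it reports in sysfs, which can differ from what gdisk shows on disk until the table is re-read. A minimal sketch, assuming kernel names like sdX3 and sdY3 (placeholders) and two hypothetical helpers, kernel_sectors and is_large_enough:

```shell
#!/bin/sh
# Sketch: compare partition sizes as the running kernel sees them.
# Device names below are hypothetical placeholders.

# Size in 512-byte sectors from sysfs; takes a kernel name such as sda3.
kernel_sectors() {
    cat "/sys/class/block/$1/size"
}

# Succeeds if the new partition has at least as many sectors as the old one.
is_large_enough() {
    old_sectors=$1
    new_sectors=$2
    [ "$new_sectors" -ge "$old_sectors" ]
}

# Example (run as root on the affected host):
#   partprobe /dev/sdY            # ask the kernel to re-read the partition table
#   lsblk /dev/sdX /dev/sdY
#   if is_large_enough "$(kernel_sectors sdX3)" "$(kernel_sectors sdY3)"; then
#       echo "new partition is large enough"
#   else
#       echo "kernel still sees a smaller partition; try partprobe or a reboot"
#   fi
```

This is only a first sanity check on raw sizes; it does not replace the lsblk, zpool status and gdisk output asked for above.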
 
I was very afraid to reboot the machine, as I was not sure the second disk had the EFI boot files installed.

Still, I did as you recommended and was able to add the new disk without any problems. Never thought I would get caught by the "Have you tried turning it off and on again?", but here we are.

The weird thing is that after resilvering the array, the disk that showed errors is now healthy (there were 150+ read errors and around 10 write errors). What could have caused the errors? Is the drive failing but not fully dead yet? A loose connection?
 
Happens to us all sometimes. Thank you for reporting back on this. Are there I/O errors in journalctl that might shed some light? If there were no checksum errors, then it could be a cable or other connection problem. Which brings us back to: did you try unplugging it and plugging it back in ;-) a few times (to scrape possible corrosion off the connectors)?
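
A quick way to look for such entries is to filter the kernel log for messages that commonly accompany disk or link trouble. This is a minimal sketch: the grep pattern is an illustrative assumption rather than an exhaustive list, and filter_disk_errors is a made-up helper name.

```shell
#!/bin/sh
# Sketch: filter kernel log lines that commonly accompany disk or link trouble.
# The pattern is an illustrative assumption, not an exhaustive list.

filter_disk_errors() {
    grep -iE 'I/O error|failed command|hard resetting link'
}

# Example (run as root):
#   journalctl -k -b | filter_disk_errors        # current boot
#   journalctl -k -b -1 | filter_disk_errors     # previous boot, needs a persistent journal
#   zpool status -v rpool                        # READ / WRITE / CKSUM counters per device
```

Since the errors appeared before the reboot, the previous boot's log is probably the more interesting one, if the journal is persistent.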
 
I see no I/O errors and no checksum errors. I will leave the new disk in the pool in case the old one errors out again.
Thank you for your help.
 
