[SOLVED] Proxmox unexpected behavior after installation

rugul

New Member
Dec 10, 2019
Hello,
I reinstalled Proxmox recently (because I wanted to change from raidz-2 to raid10). Previously I was using all 5 disks, but with raid10 I obviously had to use only 4, so I did (the 5th drive was still connected the whole time).
During installation I selected "--do not use--" for the 5th drive, but after the installation finished Proxmox couldn't boot (error like: "failed to import pool 'rpool', try manual import").
I realized that I had to disconnect the 5th drive, and then it worked fine.
Posting just in case someone else encounters this problem - I couldn't find a fix by googling it.
 
This is something we want to avoid in the future. It's always a problem if disks are left over from other installations.
We already "fixed" this for LVM: if a VG called "pve" already exists, the installer asks whether the user wants to rename it.
Something similar, at least detecting and warning, is planned for disks that were used for a ZFS pool named "rpool".
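
In the meantime, a rough way to check a spare disk for leftover ZFS labels by hand could look like the following (just a sketch; /dev/sdX is a hypothetical device name, and depending on the old layout the labels may live on a partition such as /dev/sdX3 rather than on the whole disk):

Bash:
zdb -l /dev/sdX     # dumps any ZFS vdev labels still present on the whole device
zdb -l /dev/sdX3    # ...and on a partition a previous install may have used

If that prints label data mentioning a pool named "rpool", the disk will confuse the next installation unless it is wiped or disconnected.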

FYI: if you cannot disconnect the disks, for example if this is a remote setup you do not have (quick) access to, you can do the following from the initramfs:
First list the pools: zpool import

You should see two or more; only one should be in a good state, the other one being the broken pool from the previous installation.
Each pool has a big integer ID; we need to use those IDs to uniquely identify which pool we operate on, as both are named "rpool". Use the ID of the "zombie" pool and do: zpool destroy -f 12345...

Now you should be able to import the real rpool: zpool import -R / rpool
If that succeeds you can simply type exit or hit CTRL+D and the initramfs should try to continue with the boot.

Note: this was recalled from memory. While I did do some salvaging of such installations, I might have remembered something not 100% correctly - so holler if that's the case and I'll recheck :)
 
Hi

I just ran into quite a similar issue today. I did a dd over the first 1 GB of the disks, which made them look like they were clean, but ZFS seems to store some information at the end of the disk too. So they were not…

The destroy command with the ID as reference did not work – it looks like it expects the name of the pool, not the ID. As they are both called rpool and I did not know which one the command would destroy, I created a new pool on the disks that caused the issues and THEN destroyed it.

So if anybody else runs into this error:

Code:
zpool create -f temppool sde sdg   # your devices may be different ones
zpool destroy -f temppool



Next boot the system was fine again :-)



cheers

Michael

 
I did a dd over the first 1 GB of the disks, which made them look like they were clean, but ZFS seems to store some information at the end of the disk too. So they were not…

Yeah, that won't do it; use zpool labelclear -f /dev/... the next time to destroy the ZFS traces on a disk.
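
For reference, a disk from an old Proxmox VE ZFS installation usually carries the pool label on a partition rather than on the raw disk, and ZFS keeps label copies at both the start and the end of the device (which is why overwriting only the first 1 GB is not enough). A sketch, with hypothetical device names:

Bash:
zpool labelclear -f /dev/sdX3   # the partition the old rpool lived on (partition number is an assumption)
zpool labelclear -f /dev/sdX    # and the whole disk, in case it was ever used directly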

Destroying by ID really does not seem to work anymore; I have not used it with the most recent ZFS versions.

Actually, what one can also do is import the correct pool by ID, continue booting, and destroy the other one afterwards from within the fully booted Proxmox VE, for example:
Bash:
zpool import 123456789
exit

This works with current Proxmox VE 6.2, just confirmed it.
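
And a sketch of how the leftover pool could then be cleaned up from the fully booted system, assuming 9876543210 stands in for the zombie pool's numeric ID as shown by zpool import and "oldrpool" is just a temporary name (this only works if the old pool is still importable; if it is not, e.g. because most of its disks were reused, zpool labelclear on its remaining devices is the fallback):

Bash:
zpool import                      # the zombie pool shows up here with its numeric ID
zpool import 9876543210 oldrpool  # import it under a temporary name to avoid the rpool name clash
zpool destroy -f oldrpool         # now it can be destroyed by name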

Your workaround is also valid, though; you just have to take care not to destroy a disk that is in use.
 
