Boot problem on ZFS 3 way mirror

Recovery
If you have major problems with your Proxmox VE host, e.g. hardware issues, it could be helpful to just copy the pmxcfs database file /var/lib/pve-cluster/config.db and move it to a new Proxmox VE host. On the new host (with nothing running), you need to stop the pve-cluster service and replace the config.db file (needed permissions 0600). Second, adapt /etc/hostname and /etc/hosts according to the lost Proxmox VE host, then reboot and check. (And don’t forget your VM/CT data)

I have just done this, but what is the VM/CT data? Will it come from the ZFS mount? I'm afraid to reboot at this point.
 
Code:
Filesystem                     Size  Used Avail Use% Mounted on
udev                           189G     0  189G   0% /dev
tmpfs                           38G   19M   38G   1% /run
/dev/mapper/pve-root           7.8G  2.5G  5.3G  33% /
tmpfs                          189G   22M  189G   1% /dev/shm
tmpfs                          5.0M     0  5.0M   0% /run/lock
tmpfs                          189G     0  189G   0% /sys/fs/cgroup
/dev/nvme2n1p2                 511M  312K  511M   1% /boot/efi
rpool/move                     1.3T  128K  1.3T   1% /mnt/move/rpool/move
rpool/move/ROOT                1.3T  128K  1.3T   1% /mnt/move/rpool/move/ROOT
rpool                          1.3T  128K  1.3T   1% /mnt/rpool
rpool/ROOT                     1.3T  128K  1.3T   1% /mnt/rpool/ROOT
rpool/ROOT/pve-1               1.7T  333G  1.3T  21% /mnt/rpool/ROOT/pve-1
rpool/data                     1.3T  256K  1.3T   1% /mnt/rpool/data
rpool/data/basevol-100-disk-0  8.0G  579M  7.5G   8% /mnt/rpool/data/basevol-100-disk-0
rpool/data/basevol-120-disk-1  8.0G  572M  7.5G   7% /mnt/rpool/data/basevol-120-disk-1
rpool/data/subvol-102-disk-1   8.0G  907M  7.2G  12% /mnt/rpool/data/subvol-102-disk-1
rpool/data/subvol-107-disk-0   8.0G  903M  7.2G  12% /mnt/rpool/data/subvol-107-disk-0
rpool/data/subvol-110-disk-1   8.0G  572M  7.5G   7% /mnt/rpool/data/subvol-110-disk-1
rpool/data/subvol-111-disk-0   8.0G  957M  7.1G  12% /mnt/rpool/data/subvol-111-disk-0
rpool/data/subvol-112-disk-2   8.0G  2.2G  5.9G  27% /mnt/rpool/data/subvol-112-disk-2
rpool/data/subvol-112-disk-3   200G   56G  145G  28% /mnt/rpool/data/subvol-112-disk-3
rpool/data/subvol-113-disk-2   8.0G  2.1G  6.0G  27% /mnt/rpool/data/subvol-113-disk-2
rpool/data/subvol-113-disk-3   200G   55G  146G  28% /mnt/rpool/data/subvol-113-disk-3
rpool/data/subvol-114-disk-0    38G  1.2G   37G   4% /mnt/rpool/data/subvol-114-disk-0
rpool/data/subvol-115-disk-0   8.0G  1.2G  6.9G  15% /mnt/rpool/data/subvol-115-disk-0
rpool/data/subvol-116-disk-0   8.0G  905M  7.2G  12% /mnt/rpool/data/subvol-116-disk-0
rpool/data/subvol-117-disk-0    11G  2.3G  8.8G  21% /mnt/rpool/data/subvol-117-disk-0
rpool/move/ROOT/pve-1          1.7T  333G  1.3T  21% /mnt/move/rpool/move/ROOT/pve-1
 
# default image store on ZFS based installation
zfspool: rpool
        pool rpool/data
        sparse
        content images,rootdir


I suppose adding this to storage.cfg, copying the SQLite db files over, and rebooting should do it?

But what bothers me is that I cannot access the old rpool/ROOT/pve-1/etc/pve directory. Is this a special mount for this FS? It uses FUSE, but I have no clue how to mount it after the pool is imported and the ZFS datasets are mounted.

If I could see this, it would make things easier :)
 
In the vgs output you have 2 volume groups which are called pve - this is rather odd (and leads to problems)
* do you have a PVE installed in a guest? If yes, make sure to add its disk images to the 'global_filter' in '/etc/lvm/lvm.conf'!
(see e.g. https://forum.proxmox.com/threads/lvm-vgs-takes-5-minutes-on-one-cluster-node.55453/#post-255304 , https://forum.proxmox.com/threads/vg-pv-from-vm-visible-on-pve-volume-communication-failure.50421/ )
(and quite a few other threads in this forum for more detailed explanation).
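
A quick way to confirm the duplicate name is to list the volume group names and print any that occur more than once. This is only a minimal sketch: `duplicate_vgs` is a hypothetical helper name, not part of LVM.

```shell
#!/bin/sh
# duplicate_vgs (hypothetical helper): reads one volume group name per line
# on stdin and prints each name that appears more than once.
duplicate_vgs() {
    # NF skips blank lines; $1 drops the indentation that vgs adds.
    awk 'NF { print $1 }' | sort | uniq -d
}
```

Fed with `vgs --noheadings -o vg_name | duplicate_vgs`, it should print the name `pve` if two groups really share it, and nothing otherwise.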

The part about '/etc/pve/' is quite well explained in https://pve.proxmox.com/pve-docs/chapter-pmxcfs.html:
* /etc/pve is a fuse-filesystem, which gets mounted by the pmxcfs daemon
* the contents of /etc/pve are stored in the sqlite-db
* if you copy the old sqlite-db from your zfspool/ROOT/pve-1/var/lib/pve-cluster, to the new installation '/var/lib/pve-cluster' and start pmxcfs (maybe in foreground and local-mode to see any potential problems) - the old content should appear in '/etc/pve'
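
The copy step in the last bullet could look roughly like the sketch below. `restore_config_db` is a hypothetical helper name; the default destination is the path named above, and the 0600 mode comes from the recovery note quoted at the top of the thread. It keeps a copy of the database created by the fresh install so the two can still be compared later.

```shell
#!/bin/sh
# restore_config_db (hypothetical helper): put a saved pmxcfs database in place.
#   $1  path to the old config.db
#   $2  destination directory (defaults to /var/lib/pve-cluster)
restore_config_db() {
    src=$1
    dest_dir=${2:-/var/lib/pve-cluster}

    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }

    # Keep the database of the fresh install for a later comparison or merge.
    if [ -f "$dest_dir/config.db" ]; then
        cp -p "$dest_dir/config.db" "$dest_dir/config.db.fresh" || return 1
    fi

    cp "$src" "$dest_dir/config.db" || return 1
    # pmxcfs expects the database to be readable by its owner only.
    chmod 0600 "$dest_dir/config.db"
}
```

With the service stopped (`systemctl stop pve-cluster`), a call such as `restore_config_db /mnt/rpool/ROOT/pve-1/var/lib/pve-cluster/config.db` would put the old database in place, assuming the old root is still mounted under /mnt as in the df output above. Running `pmxcfs -f -l` afterwards starts the daemon in the foreground in local mode, so problems show up directly on the terminal.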

Regarding the addition of the storage.cfg - this is one thing which you need to merge between the old installation and the new current installation - so copy all files from the current installation to a safe backup space and compare them after starting pmxcfs with the old database.

The storage definition looks OK, but you need to make sure that rpool gets mounted on /rpool (no altroot setting). For this you probably need to remove the mountpoint property of rpool/ROOT/pve-1 (the dataset that used to be your root).
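
One possible sequence for this, as a sketch under assumptions: the pool is currently imported with an altroot (the /mnt prefix in the df output above), and `zfs inherit` is used to drop a locally set mountpoint on the old root dataset. `run` and `remount_rpool` are hypothetical helper names. By default the commands are only printed, so nothing is touched until DRY_RUN=0 is set.

```shell
#!/bin/sh
# run (hypothetical helper): print the command, or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# remount_rpool (hypothetical helper): re-import a pool without altroot.
#   $1  pool name (defaults to rpool)
#   $2  old root dataset (defaults to <pool>/ROOT/pve-1)
remount_rpool() {
    pool=${1:-rpool}
    old_root=${2:-$pool/ROOT/pve-1}

    # Release the pool from its temporary altroot.
    run zpool export "$pool" || return 1
    # -N imports without mounting, so the old root cannot mount over / by accident.
    run zpool import -N "$pool" || return 1
    # Drop the local mountpoint so the dataset inherits one below /rpool.
    run zfs inherit mountpoint "$old_root" || return 1
    # Mount everything at its regular location.
    run zfs mount -a
}
```

Calling `remount_rpool` first shows the four commands; `DRY_RUN=0 remount_rpool` would execute them. The export fails while any dataset of the pool is in use, and the import may need `-f` if the pool was last used by another installation.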

I hope this explains it
 
I copied over /var/lib/vz from ROOT/pve-1, put zfs-local in storage.cfg, stopped the pve-cluster process, and copied over the db files.

Reboot? :)

edit,

I didn't see your post as I was writing this! :) THANKS!
 
I need to rsync the packages that I installed, especially OVS, as I am VLAN-ing a lot of the machines...

Otherwise, using a local repo to install them is a little more tricky. I can't get internet access without OVS in place.

I have it on a private net at the moment, so just copying the interfaces file over won't work after a reboot due to OVS :)

What can I rsync over the top to get the packages back?
 
UPDATE!

I'm back in BIZ! I would like to thank you tremendously for your help Stoiko Ivanov, a true legend.

I think I will be buying the support from now on :)

Cheers,

John
 
Glad you managed to get your systems back up (and glad about the decision to support Proxmox :)!
What was the last step that fixed the question marks? (out of curiosity and because other users reading this thread would profit from an answer)

Please mark the thread as solved
Thanks!
 
This was due to the ZFS import not working when booting. Once I imported the rpool correctly on /rpool, it all started to work. But I still have to fix this somehow in the startup.

Once the pool is imported and mounted correctly (i.e. /rpool/data etc.), I run zfs mount -a to get the VM disks mounted too.
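
To check that every dataset really ended up below /rpool and not below a leftover altroot, the name and mountpoint columns can be compared. This is a minimal sketch; `check_mountpoints` is a hypothetical helper name, and it assumes tab-separated "dataset, mountpoint" lines as input.

```shell
#!/bin/sh
# check_mountpoints (hypothetical helper): reads "dataset<TAB>mountpoint" lines
# and reports every dataset whose mountpoint is not "/" followed by its name.
# Returns non-zero if at least one mismatch was found.
check_mountpoints() {
    awk -F '\t' '
        # Volumes and unmounted datasets have no path to compare.
        $2 == "-" || $2 == "none" || $2 == "legacy" { next }
        $2 != ("/" $1) { print $1 " is mounted at " $2; bad = 1 }
        END { exit bad + 0 }
    '
}
```

A call such as `zfs list -H -o name,mountpoint -r rpool | check_mountpoints` prints nothing when the layout matches the listing below. For the startup part, one common approach (an assumption, not confirmed in this thread) is to record the pool in the cache file with `zpool set cachefile=/etc/zfs/zpool.cache rpool` and to make sure the zfs-import-cache and zfs-mount services are enabled.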


Code:
rpool                          1.3T  128K  1.3T   1% /rpool
rpool/data                     1.3T  256K  1.3T   1% /rpool/data
rpool/ROOT                     1.3T  128K  1.3T   1% /rpool/ROOT
rpool/ROOT/pve-1               1.6T  333G  1.3T  21% /rpool/ROOT/pve-1
rpool/data/basevol-100-disk-0  8.0G  579M  7.5G   8% /rpool/data/basevol-100-disk-0
rpool/data/basevol-120-disk-1  8.0G  572M  7.5G   7% /rpool/data/basevol-120-disk-1
rpool/data/subvol-102-disk-1   8.0G  906M  7.2G  12% /rpool/data/subvol-102-disk-1
rpool/data/subvol-107-disk-0   8.0G  903M  7.2G  12% /rpool/data/subvol-107-disk-0
rpool/data/subvol-110-disk-1   8.0G  572M  7.5G   7% /rpool/data/subvol-110-disk-1
rpool/data/subvol-111-disk-0   8.0G  958M  7.1G  12% /rpool/data/subvol-111-disk-0
rpool/data/subvol-112-disk-2   8.0G  2.2G  5.9G  27% /rpool/data/subvol-112-disk-2
rpool/data/subvol-112-disk-3   200G   59G  142G  30% /rpool/data/subvol-112-disk-3
rpool/data/subvol-113-disk-2   8.0G  2.1G  6.0G  27% /rpool/data/subvol-113-disk-2
rpool/data/subvol-113-disk-3   200G   59G  142G  30% /rpool/data/subvol-113-disk-3
rpool/data/subvol-114-disk-0    38G  1.2G   37G   4% /rpool/data/subvol-114-disk-0
rpool/data/subvol-115-disk-0   8.0G  1.2G  6.9G  15% /rpool/data/subvol-115-disk-0
rpool/data/subvol-116-disk-0   8.0G  906M  7.2G  12% /rpool/data/subvol-116-disk-0
rpool/data/subvol-117-disk-0    11G  2.3G  8.8G  21% /rpool/data/subvol-117-disk-0
 
To get the packages back, I installed the OVS packages from .deb files before rebooting:

libatomic1_6.3.0-18+deb9u1_amd64.deb
net-tools_1.60+git20161116.90da8a0-1_amd64.deb
openvswitch-common_2.6.2~pre+git20161223-3_amd64.deb
openvswitch-switch_2.6.2~pre+git20161223-3_amd64.deb
uuid-runtime_2.29.2-1+deb9u1_amd64.deb

That way I can get my VLANs in Proxmox.

I copied my interfaces config plus other important configs and rebooted. :)

The rest I can then just apt install!
 
