single mistake ruined proxmox

VinnyG

May 7, 2024
Help, I ran sensors-detect while doing some benchmarks and now the PC can't find a boot device.
It created a file called /etc/modules, which I think is the culprit.
I can't import the ZFS pool or see the disk contents on other systems.
I'm no expert, but I have been working on Proxmox the whole month. I was going to set up PBS next, and then this happened.

The error I get when trying to import the pool/disk:
ZFS-8000-EY
 
I have tried many options, such as -f, -d, -o, -F, and -N (with readonly). None of them worked or even recognized the pool.

I ran this, which gave me an ID:
root@pve:~# zpool import -f -o altroot=/mnt -d /dev/nvme0n1
pool: rpool
id: 18028294166640905067
state: UNAVAIL
status: The pool was last accessed by another system.
action: The pool cannot be imported due to damaged devices or data.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-EY
config:

rpool UNAVAIL insufficient replicas
nvme0n1 UNAVAIL invalid label

Then I tried importing by ID:
root@pve:~# zpool import -f -o altroot=/oldrpool -d 18028294166640905067
no pools available to import


Please help me understand: if I can't recover it, how can I prevent this from happening in the future?
If a misconfiguration can brick the whole node, then a PBS VM would not help at all and I would need a second machine/node running PBS, right?
Would installing root on ext4 and leaving a ZFS pool just for VMs be a good solution?


Greetings from Brazil, and thanks a lot.
 
> would installing root on ext4 and leave a zfs pool just for VMs be a good solution?

Yes, if you're not an expert with ZFS, then a standard LVM+ext4 rootfs is probably your best bet. I've been using ZFS since before 2014 and I still don't recommend it for boot/root on Linux. A standard ext4 or XFS rootfs is much easier to restore/recover unless you need mirroring. And if you do, read up beforehand on how to recover a ZFS boot mirror when a drive dies, or when you need to migrate it to smaller (or larger) disks.

> please help me understand if i can't recover it how can i prevent this from happening in the future

Backups. Nothing substitutes for or beats backups.

https://github.com/kneutron/ansitest/tree/master/proxmox

Set up BKPDEST.mrg in /root/bin/boojum and point it to a backup destination (a separate disk or NAS, not the same disk as root), then start using the bkpcrit script frequently. Schedule it nightly in cron. Especially use it before making ANY system changes.
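
A minimal sketch of a guard you could wrap around the nightly run, so the backup never silently lands on the root disk. The paths /mnt/backup and /root/bin/bkpcrit are placeholders I am assuming for illustration, not locations confirmed by the repo, and the device check only catches the simplest mistake (the destination being a plain directory on the root filesystem):

```shell
#!/bin/sh
# Hypothetical wrapper around the backup script; adjust both paths to your setup.
BKP_DEST="${BKP_DEST:-/mnt/backup}"            # assumed mount point of the backup disk/NAS
BKP_SCRIPT="${BKP_SCRIPT:-/root/bin/bkpcrit}"  # assumed install path of bkpcrit

# Succeeds (exit 0) when both paths report the same device number.
same_device() {
    [ "$(stat -c %d "$1")" = "$(stat -c %d "$2")" ]
}

# Runs the backup only if the destination exists and is not on the root filesystem.
run_backup() {
    if [ ! -d "$BKP_DEST" ]; then
        echo "backup destination $BKP_DEST is missing (not mounted?)" >&2
        return 1
    fi
    if same_device / "$BKP_DEST"; then
        echo "refusing: $BKP_DEST is on the root filesystem" >&2
        return 1
    fi
    "$BKP_SCRIPT"
}

# When saving this as a real script, call run_backup on the last line.
```

Saved under a name of your choice (say /root/bin/bkp-wrapper.sh, again hypothetical) with `run_backup` as its last line, a file in /etc/cron.d containing `0 2 * * * root /root/bin/bkp-wrapper.sh` would run it at 02:00 every night.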

Set up Veeam Agent for Linux and do bare-metal backups of your LVM+ext4 rootfs at least once a week.

https://www.youtube.com/watch?v=g9J-mmoCLTs

Test a restore of your primary environment into a VM. Document your DR process. Get familiar with it, so that you can get back to a running state with all of your customizations, hopefully within a couple of hours, just by replacing a failed disk and restoring from backup.
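
One note on the import attempts in the first post: `-d` tells zpool where to search (a directory or device), while the pool name or numeric ID goes last as a plain argument, so passing the ID to `-d` finds nothing. A Proxmox ZFS install also normally puts the pool on partition 3 of the boot disk rather than on the whole disk. Below is a rough sketch of what I would check from a live system, assuming the disk is still /dev/nvme0n1 and the partition table survived. Neither assumption is confirmed, and this may well not bring the pool back:

```shell
#!/bin/sh
# Assumptions: disk name and pool ID are taken from the output quoted in the first post.
DISK="${DISK:-/dev/nvme0n1}"
POOL_ID="${POOL_ID:-18028294166640905067}"

# Partition 3 is where the Proxmox installer normally places ZFS.
# Disk names ending in a digit (nvme0n1, mmcblk0) use a "p" before the partition number.
zfs_part() {
    case "$1" in
        *[0-9]) printf '%sp3\n' "$1" ;;
        *)      printf '%s3\n' "$1" ;;
    esac
}

inspect_and_import() {
    part="$(zfs_part "$DISK")"
    # Is the partition table still there?
    lsblk -o NAME,SIZE,FSTYPE,PARTLABEL "$DISK"
    # Are the ZFS labels on the partition readable?
    zdb -l "$part"
    # Read-only, no mounts, alternate root; the pool ID is the final positional argument.
    zpool import -d "$part" -f -N -o readonly=on -R /mnt "$POOL_ID"
}

# Run manually from a rescue/live environment: inspect_and_import
```

If lsblk shows no partitions at all, the problem is the partition table rather than ZFS itself, and the import line will not help until that is dealt with.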
 