Please Help, I have a real emergency

Elleni

Active Member
Jul 6, 2020
Hi all,

I set up Proxmox and liked it very much. As everything was working like a charm, I set up our servers on it. Today, as we are going live, the system has a serious problem. I really hope there is a way to recover, as my coworkers arrive in a few hours, and over the weekend we changed all our systems to depend on this Proxmox installation.

I have a ZFS RAID1 (mirror) Proxmox installation that was working fine until last night. I wanted to make a full backup, so I plugged in an external USB disk. As it was not recognized by the VM when I added it there, I tried adding it as a PCI device. Soon after, Proxmox was not reachable anymore, so I went back to my workplace. Now I see that Proxmox boots, but soon after the root login prompt the following messages appear: "failed to set APST feature (-19)", then USB devices are deregistered/disconnected, and there are "zio pool=rpool ... error=5" messages. Then there are warnings that pool rpool "has encountered an uncorrectable I/O failure and has been suspended", and soon after there is a kernel panic, if I interpret that correctly.

Now I am really panicking. Please tell me there is a way to recover, as I am in big trouble.

Is there a way to boot a live system to check and hopefully repair my rpool? I really hope my work of the last few months is not lost :(((

Please, please tell me what I can do to try to rescue our system.

ZFS is set up with two 2TB NVMe disks as a mirror. Could it work if I remove one of the two drives and try to boot that way? I don't want to make things worse, which is why I desperately hope for anybody's help :/

Or is there a way to boot the Proxmox install medium to see if the error is fixable? What would be the steps to find out whether it is possible to not lose all the VMs I set up and configured?

I have booted the Proxmox CD and have a root@proxmox prompt. zpool list says "no pools available". What can I do? I am not new to Linux, but absolutely new to ZFS, so I am quite lost and desperately hope there is a way to recover.
 
Hi,
does zpool import list the pool and does zpool import <POOLNAME> work? If there are errors that make importing impossible, you can still try zpool import -F <POOLNAME>.

From the man page:
Code:
             -F      Recovery mode for a non-importable pool.  Attempt to return the pool to an importable state by discarding the last few transactions.  Not all damaged pools can be recovered by using this option.  If successful, the
                     data from the discarded transactions is irretrievably lost.  This option is ignored if the pool is importable or already imported.
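For illustration, here is a rough sketch of how these steps could be strung together from a rescue shell. The pool name rpool is taken from the error messages above. The /mnt alternate root, the read-only first attempt, and the `run` dry-run helper are my own assumptions and not something Proxmox ships. By default it only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Sketch only: cautious import sequence from a live/rescue environment.
# Assumptions: pool is named "rpool", /mnt is free to use as alternate root.
POOL="${POOL:-rpool}"
ALTROOT="${ALTROOT:-/mnt}"
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print each command, run it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

rescue_import() {
    # 1. Without a pool name, this only lists pools that could be imported.
    run zpool import
    # 2. Try a read-only import under an alternate root; -f is needed when
    #    the pool was last in use by another system (here: the old install).
    run zpool import -f -o readonly=on -R "$ALTROOT" "$POOL"
    # 3. With -n, -F only reports whether discarding the last transactions
    #    would make the pool importable; nothing is rewound. It is ignored
    #    if step 2 already imported the pool.
    run zpool import -F -n "$POOL"
}
```

Calling `rescue_import` as-is prints the three commands so they can be reviewed first. If the read-only import succeeds, `zpool status -v rpool` shows the state of both mirror members before anything gets written to the pool.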
 
It worked, but after a fresh boot I reached the login prompt, and about 5 seconds later everything was detached and ZFS suspended the pool, followed by a kernel panic. With the help of a very kind guy with good ZFS know-how, we were able to reinstall Proxmox on a separate disk and mount the pool. We found out that one of the disks of the mirror apparently has a problem, as zpool reported checksum errors, while the other disk is OK. So we copied the needed VMs to USB disks. Now I am waiting for the replacement disk and have a separate boot pool and data pool. No data was lost, but it was a hell of a trip :)

I learned a lot about ZFS and am starting to love it :)
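In case it helps someone in the same situation: the checksum errors mentioned above show up in the CKSUM column of `zpool status`. A small sketch for picking out the affected devices could look like this. The function name `cksum_errors` is made up for this example, and it assumes the usual NAME / STATE / READ / WRITE / CKSUM table layout.

```shell
#!/bin/sh
# Sketch only: read `zpool status` output on stdin and print every
# device (or vdev) whose CKSUM counter is not zero, with the counter.
cksum_errors() {
    awk '
        # The config table starts at the header row.
        $1 == "NAME" && $5 == "CKSUM"    { in_table = 1; next }
        # A blank line ends the table.
        in_table && NF == 0              { in_table = 0 }
        # Rows are: NAME STATE READ WRITE CKSUM
        in_table && NF >= 5 && $5 != "0" { print $1, $5 }
    '
}
```

Usage would be `zpool status rpool | cksum_errors`. If it prints nothing, no device in the pool currently reports checksum errors.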
 