[solved] Help please

Elleni

Active Member
Jul 6, 2020
Hi all, sorry if this is a double post, but I am not seeing my post after refreshing, and as it is a real emergency I thought I would try again.

I set up Proxmox and liked it very much, and as everything was working like a charm, I set up our servers and everything else on it. Today, just as we are going live, my system has a serious problem, and I really hope there is a way to recover: in a few hours my coworkers will arrive, and during this weekend we changed all our systems to depend on this Proxmox installation.

I have a RAID1 ZFS Proxmox installation that was working fine until last night, when I wanted to make a full backup and plugged in an external USB disk. As the disk was not recognized by the VM when I added it to the VM, I tried adding it as a PCI device. Soon after, Proxmox was not reachable anymore, so I went back to my workplace, and now I see that Proxmox boots, but soon after the root login prompt the following messages appear: "Failed to set APST feature (-19)", then USB devices are deregistered/disconnected, and there are "zio pool=rpool ... error=5" messages. Then there are warnings that pool rpool has encountered an uncorrectable I/O failure and has been suspended, and soon after there is a kernel panic, if I interpret that correctly.

Now I am really panicking. Please tell me there is a way to recover, as I am in big trouble.
 
I have booted the Proxmox install medium and have a root@proxmox shell. Being new to ZFS, can somebody please assist me in examining whether my installation is fixable? I would be in serious trouble if all my work from the last few months were lost, so I desperately hope there is a way to fix it. How can I check whether the pool can be repaired? Thanks in advance for your much appreciated support.
 
I tried zpool clear and then rebooted. The boot stops at the (initramfs) prompt and says: cannot import 'rpool': pool was previously in use from another system. The pool can be imported, use 'zpool import -f' to import the pool. Error 1.

After running zpool import -f rpool and then exit, the system boots to the login prompt, but then the errors about the uncorrectable I/O failure occur again :(

What can I do?
 
Is there a way to save my data? All these VMs I set up, they must not be lost, please :/
 
The Proxmox rescue boot says: unable to find boot disk automatically, press any key to continue.
 
From root@proxmox (reached via the installer's debug mode and then abort), I tried zpool import, and this shows:

pool: rpool
state: online
status: the pool was last accessed by another system

rpool
mirror-0
nvme-eui...... ONLINE
nvme-eui...... ONLINE

So what would be the next step to try to fix this pool, if possible?
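
One cautious way to look at a pool from a rescue shell is to import it read-only under an alternate root, so that nothing is written to it while it is inspected. The following is only a sketch under assumptions: the pool name rpool comes from the output above, while the mount point /mnt/rescue and the DRY_RUN switch are made up for illustration. With DRY_RUN=1 (the default here) the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch: import a pool read-only from a rescue shell for inspection.
# POOL is taken from the thread; ALTROOT and DRY_RUN are illustrative assumptions.
POOL="${POOL:-rpool}"
ALTROOT="${ALTROOT:-/mnt/rescue}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rescue_import() {
    # -f              override the "last accessed by another system" check
    # -o readonly=on  import the pool without allowing writes to it
    # -R ALTROOT      mount datasets below ALTROOT instead of /
    run zpool import -f -o readonly=on -R "$ALTROOT" "$POOL"
    # show pool health, including files with known errors
    run zpool status -v "$POOL"
    # list filesystems and VM disk volumes found in the pool
    run zfs list -r -t filesystem,volume "$POOL"
}

rescue_import
```

Running it with DRY_RUN=0 would execute the three commands for real. A read-only import does not repair anything; it is only meant to show whether the datasets and VM disks are still visible.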
 
I hope scrub is able to repair this. I issued it and now I am waiting and praying that it is fixable. And if not, I would be very thankful if at least the VM disks could be saved, so that "only" a reinstall would be needed and the virtual clients could be transferred. Thanks in advance for any thoughts, hints and/or help.
 
Scrub finished, but apparently did not do anything?

zpool status reads:

pool: rpool
state: online
status: one or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors using 'zpool clear' or replace the device with 'zpool replace'
see: http://zfsonlinux.org/msg/zfs-8000-9p
scan: scrub repaired 0B in 0 days 00:30:45 with 0 errors on <date and time>

config:
    name            state   read  write  cksum
    rpool           online  0     0      0
      mirror-0      online  0     0      0
        nvme1....   online  0     0      0
        nvme2....   online  0     0      1

errors: no known data errors

I'll try to reboot, but I don't believe it will work, as the scrub apparently did not correct anything. Do I have to issue a zpool clear, as mentioned above? I am asking because I already tried that before the scrub and it did not solve the issue. And if it does not work, what would be the steps to try to save my VM disks and/or the backups of the VM disks? Saving the VMs and/or the backups would already be a lifesaver, as a reinstall of Proxmox can be done quickly, but setting up all the VMs again would be a nightmare I am not sure I would soon recover from. So thanks for any pointer.
 
As there is a checksum error on the second disk, what are my options? Would the system boot if I removed one of the disks? In three hours my colleagues arrive here, and if I cannot bring the system back online, I have to put the disassembled IT infrastructure back in place or they cannot work. So I am running out of time, and at the same time I don't want to make things worse, so a hint would be really appreciated. Thanks guys.
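
To see at a glance which device in a zpool status listing carries non-zero error counters, the device table can be filtered with a few lines of awk. This is a minimal sketch: the function name failing_vdevs is made up, and the sample input is the table quoted in this post. It only reads text on stdin and does not touch the pool.

```shell
#!/bin/sh
# Sketch: print devices from `zpool status` text whose read/write/cksum
# counters are not all zero. failing_vdevs is a hypothetical helper name.
failing_vdevs() {
    awk '
        # the device table starts after the "NAME STATE READ WRITE CKSUM" header
        toupper($1) == "NAME" && toupper($2) == "STATE" { in_table = 1; next }
        # an empty line or the "errors:" line ends the table
        in_table && (NF == 0 || $1 == "errors:") { in_table = 0 }
        # report rows where at least one counter differs from "0"
        in_table && NF >= 5 && ($3 != "0" || $4 != "0" || $5 != "0") {
            printf "%s read=%s write=%s cksum=%s\n", $1, $3, $4, $5
        }
    '
}

# Sample run on the table quoted in the thread
failing_vdevs <<'EOF'
config:
name state read write cksum
rpool online 0 0 0
mirror-0 online 0 0 0
nvme1.... online 0 0 0
nvme2.... online 0 0 1
EOF
# prints: nvme2.... read=0 write=0 cksum=1
```

On a running system the same filter could be fed with `zpool status rpool | failing_vdevs`. The counters stay at their values until they are reset, for example with zpool clear, so a non-zero count after a clean scrub records an error seen earlier and does not by itself mean the data is still damaged.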
 
Dear Elleni,

I do not have too much everyday experience with ZFS, but a mirrored setup should run with only one disk. I think that is the whole point of a mirror: being able to replace one disk if it fails.

The concern I have is if your data is already corrupted or not.

I hope all the best,
Rares
 
With the help of very kind forum members on another forum (NethServer), I was able to recover. No data was lost. We reinstalled Proxmox on a separate disk, could then mount the pool containing the VMs, and backed them up to another medium with zfs send / receive. It was a hell of a trip, but now we are up and running, and I learned a lot about ZFS :)

We now have a separate boot pool on a separate disk and a mirror (degraded for the moment, as one disk is defective). HP support was contacted and is expected to arrive next Tuesday with a replacement disk. A backup of the data has been made, so the problem is solved.
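
For readers in a similar situation, the zfs send / receive step can be sketched roughly as follows. This is not the exact procedure used in this thread. The names are assumptions: rpool/data is where a default Proxmox ZFS installation usually keeps VM disks, backup/rpool-data stands for a dataset on a second, already created pool on the backup medium, and the snapshot name rescue is arbitrary. With DRY_RUN=1 (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch: copy VM disk datasets to another pool with zfs send / receive.
# SRC, DST and SNAP are illustrative assumptions, not values from the thread.
SRC="${SRC:-rpool/data}"          # dataset tree holding the VM disks
DST="${DST:-backup/rpool-data}"   # target dataset on the backup pool
SNAP="${SNAP:-rescue}"            # snapshot name used for the transfer
DRY_RUN="${DRY_RUN:-1}"           # 1 = only print the commands

backup_vm_disks() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ zfs snapshot -r $SRC@$SNAP"
        echo "+ zfs send -R $SRC@$SNAP | zfs receive -u $DST"
        return 0
    fi
    # recursive snapshot of the source tree (needs a writable import)
    zfs snapshot -r "$SRC@$SNAP" || return 1
    # -R sends the whole tree including child datasets and properties,
    # -u receives without mounting the copies on the rescue system
    zfs send -R "$SRC@$SNAP" | zfs receive -u "$DST"
}

backup_vm_disks
```

Creating a snapshot requires the source pool to be imported writable, so this only fits once the pool is stable enough for that, as it was here after the reinstall on a separate disk.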
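
Once the replacement disk is installed, the degraded mirror would typically be repaired with zpool replace, after which ZFS resilvers the new disk. A small sketch, assuming the data pool is still called rpool and using placeholder device names (the real names are not given in this thread). With DRY_RUN=1 (the default here) nothing is executed.

```shell
#!/bin/sh
# Sketch: swap a defective mirror member for a new disk.
# POOL and both device names are assumptions for illustration only.
POOL="${POOL:-rpool}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands

replace_mirror_disk() {
    old="$1"
    new="$2"
    if [ -z "$old" ] || [ -z "$new" ]; then
        echo "usage: replace_mirror_disk OLD_DEVICE NEW_DEVICE" >&2
        return 2
    fi
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ zpool replace $POOL $old $new"
        echo "+ zpool status $POOL"
        return 0
    fi
    # start the replacement, then show the resilver progress
    zpool replace "$POOL" "$old" "$new" && zpool status "$POOL"
}

# placeholder device names, to be taken from `zpool status` and /dev/disk/by-id
replace_mirror_disk nvme-OLD-EXAMPLE nvme-NEW-EXAMPLE
```

The mirror stays degraded until the resilver has finished, which zpool status reports in its scan line.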
 
If you run a Proxmox VE cluster in production, I'd really recommend taking a look at the offered subscriptions.
Even though the team tries to answer as many forum threads as possible, we can only guarantee fast response times for owners of subscriptions via the Customer Portal.
 
I understand, and we will also consider that as soon as we get our second server and configure a cluster. At the moment this is a single server with mirrored disks, and I am wondering a bit why I got no answer at all in this thread after waiting a considerable amount of time. An answer would have been really appreciated, as I don't want to risk screwing up my system...
 
In order to get the best out of this forum, please follow some basic rules.
  • Before you post, search the forum for similar topics, especially this one is discussed several times already, you will find a LOT of howtos to this topic.
  • Read the documentation and make sure you run latest stable software packages.
If you cannot find a solution, please post your own thread:
  • Choose a suitable thread title (do not just write "help" or anything similar...)
  • Do not post monster threads, ask simple and clear questions, so others do not have to invest a day to understand what you want to know
If you run in production setups, make sure you can help yourself or make sure you have a suitable support contract with someone who can help in emergency.