Raid array won't boot

bj-4will (New Member), Mar 27, 2023
Hi,

My brother set up a very cool Proxmox server on Debian on a Dell server at his home. I only knew about half of what he was talking about when he told me about it, but he was running all kinds of fun stuff for the wife and kids on it. Well, he passed away (cancer) in January. About a week after that, his server fans quit and it died. The whole house was automated by his instance of Home Assistant, and all the family pics were on there as well. So, I called a local place to get it fixed (didn't trust my own skills). They fixed the fans and got it running again, but I think they must have taken the drives out and installed them back in the wrong order or something, because it won't load the raid array now. I know this is a long shot, because I'm pretty helpless here, but it would be so helpful if someone could give me an idea of where to start to get it back up and running--at least long enough to get the data off. His wife really doesn't need this--she's got enough to deal with. Here's the screen that comes up when we boot it:

[screenshot of the boot screen attached]

Thanks in advance! My brother always talked about what a great community this was--so I thought I'd give it a shot.
 
On the shell of the node/PVE host, what is the full output of each of the following?
  • zpool status
  • zpool import
  • lsblk
  • pvesm status
  • cat /etc/pve/storage.cfg
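
For comparison, on a healthy system zpool status reports the pool and every member device as ONLINE. A rough sketch of what that looks like (the pool layout and device names here are hypothetical):

  pool: my-zfs-pool
 state: ONLINE
config:
        NAME             STATE     READ WRITE CKSUM
        my-zfs-pool      ONLINE       0     0     0
          raidz1-0       ONLINE       0     0     0
            sda          ONLINE       0     0     0
            sdb          ONLINE       0     0     0

Anything other than ONLINE (DEGRADED, FAULTED, UNAVAIL) in that output points at the part that needs attention.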
 
Thank you for your response. I am still working on getting this information; thanks for your patience.
 
Please see the attached screenshot for the full output of each command you listed. I see that it says that one or more devices contain corrupted data. I'm interested to hear what anyone thinks I should do next.

[screenshot: command output on the Proxmox host attached]
 
Ok, thanks so much! Will do. Is it possible that 2 of the drives are just swapped? Since another company had the server for several weeks, it is likely that they took the drives out. When they put them back in, if they swapped 2 of the drives, could that have caused this issue? If so, is there any way for me to identify which 2 drives are swapped? Thanks again!
 
Ok, thanks so much! Will do. Is it possible that 2 of the drives are just swapped?
No, ZFS doesn't care if you swap disks, as long as all disks are there and working.

You are missing 3 of 5 disks of that pool. So they are either dead (in that case all your data is lost, because at least 4 working disks are needed), or they are just not connected (missing cabling/power), or they are connected to a disk controller that isn't working anymore. If it's the latter case and you connect the 3 disks again, all should be fine and the pool should be importable again.
Open that case and see if all cables are connected. Or take some photos so we can have a look at it.
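
Before opening the case, a quick way to see which disks the kernel currently detects (these are standard lsblk output columns):

lsblk -o NAME,SIZE,MODEL,SERIAL

Any pool member whose model/serial doesn't appear there is unpowered, uncabled, or sitting behind a dead controller.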
 
Well, figured out one drive issue--drive0 was plugged in backward. Lol. Should've checked that first thing. Now working on getting the box open - requires special tools. Stay tuned.
 
OMG! I've got 4 drives spinning! It appears that the array loaded! How do I import the pool? I'm so excited!!!
 
It looks like it did import the zpool. Can someone direct me to instructions on how to repair the raid array to use a new drive? I have a new drive hot-swapped into the same slot, and it is 1TB larger than the old drive (which is indeed broken). I tried the instructions at github listed in the message below, but I get the message 'cannot resolve path /dev/sdf2'. My command was 'zpool replace my-zfs-pool /dev/sdf1 /dev/sdf2'. It's probably the wrong command...?

[screenshot: zpool status output attached]
 
Find out the WWN of the new disk (post the output if you need help identifying it):
ls -la /dev/disk/by-id

Replace the faulty disk with the new one:
zpool replace my-zfs-pool 12763240508794722132 /dev/disk/by-id/wwn-WWNofYourNewDisk
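
Side note, since the new disk is 1TB larger: ZFS won't use the extra space automatically. The pool can only grow once every disk in the vdev is the larger size, and only if the autoexpand property is enabled (pool name as above):

zpool set autoexpand=on my-zfs-pool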
 
Ok, here is the output:
[screenshot: ls -la /dev/disk/by-id output attached]
I get what I'm supposed to do, but I have no idea how to find the new drive here...sorry.
 
Looks like there is no fifth disk. Are you sure it is connected and working?
It only lists 4x SCSI disks (those 4 that are already part of the ZFS pool), 1x NVMe SSD, and 1x SATA SSD.

In case the disk is connected to a HW raid controller (not recommended when using ZFS), you might need to boot into the raid controller's firmware first and create a new raid0/JBOD volume for the new disk.
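
To check whether there is a hardware raid controller in the box at all, something like this from the PVE shell should help (the grep patterns are just guesses at common vendor strings):

lspci | grep -i -e raid -e sas

If it only shows a plain SAS/SATA HBA, there is no raid firmware to configure.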
 
I'm sure it is plugged in, but the two open slots may both be broken. I tried both and get the same output, showing only the 4 available drives. Of course, I haven't formatted the new hard drive at all. Do I need to do that first? It is an old drive that was formatted as Btrfs. If not, then I may not be able to recover the array completely, which is actually okay, I think. I'd like to move Proxmox over to a different server anyway--one that will be easier for us to manage. It's very possible that my brother set up a HW raid controller, but I'll have to study up on how to boot into that. Do you think Proxmox will do backups with the raid array in a degraded state so that I can restore it on a different server?
 
Do you think Proxmox will do backups with the raid array in a degraded state so that I can restore it on a different server?
As long as 4 of the 5 disks are available, the pool will work, so doing backups shouldn't be a problem.
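
For the backup itself, vzdump is the standard PVE tool. A minimal sketch, assuming VMID 100 and a configured backup storage named "backup" (both hypothetical):

vzdump 100 --storage backup --mode snapshot

The resulting archive can then be copied to the new server and restored there with qmrestore (for VMs) or pct restore (for containers).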
Do I need to do that first?
No, as long as the disk is there, no matter what it is formatted with, it should show up in the ls -la /dev/disk/by-id command. The exception may be with HW raid.
 
If Proxmox is showing in the data storage area (templates, vm-disks, zfs-containers, etc.) that 5TB of the total 10TB is being used, doesn't that mean that it is recognizing all of the available storage on the device? I did verify that 10TB is the total available on the server. I'm thinking that there's no separate hardware array on this server.

Is there a step-by-step guide to moving that much data to another server on the network? I'm setting up a Proxmox virtual machine on a Synology server since that's what I'm familiar with. I'll be teaching their 13-year-old son how to use it, and I want it to be something he can be comfortable with (with a nice GUI). I have backups for all the containers and VMs, but I don't see how to download or sync them across the network. If it's CLI, then I may need outside help. Any suggestions? I feel bad about continuing to ask questions here.
 
If Proxmox is showing in the data storage area (templates, vm-disks, zfs-containers, etc.) that 5TB of the total 10TB is being used, doesn't that mean that it is recognizing all of the available storage on the device? I did verify that 10TB is the total available on the server. I'm thinking that there's no separate hardware array on this server.
If you want to know whether all defined storages are available, you can run pvesm status. All storages should be in the "active" state.
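
For reference, the output looks roughly like this (storage names and sizes here are invented):

Name             Type     Status           Total        Used       Available        %
local             dir     active        98559220    12345678        81234567   12.53%
local-zfs     zfspool     active      9663676416  5368709120      4294967296   55.56%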

I'm setting up a Proxmox virtual machine on a Synology server since that's what I'm familiar with.
I wouldn't do that. Then you get nested virtualization with bad performance, and you still have to administer the whole PVE host. It would be better to migrate all VMs from Proxmox to Synology: create new VMs on the Synology and export/import the virtual disks as vmdk, qcow2, or whatever image format Synology supports (have a look at the "qemu-img convert" command).
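
A minimal sketch of such an export, assuming the VM's disk is a zvol on the pool (pool name, VMID, and output path are hypothetical; pick whatever format Synology accepts):

qemu-img convert -O vmdk /dev/zvol/my-zfs-pool/vm-100-disk-0 /mnt/export/vm-100-disk-0.vmdk

Check the VM's config file first to see where its disk actually lives.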
But yeah, Proxmox might not be a great choice if a 13-year-old has to do all the Linux administration. I remember having a really hard time keeping everything working when I rented my first Linux root server at the age of 14 or 15. Looking back, I can only facepalm when thinking about security/backups. :oops:
Some hypervisor appliance that allows you to do everything in a GUI would be a better choice.

I feel bad about continuing to ask questions here.
No problem. Ask as much as you like.
 
Really great advice! I was already thinking about the fact that I'd be nesting virtualizations, so I won't do that. In fact, I know how to move one of the VMs (Home Assistant) and have already backed it up and restored it onto the new server. Two more to go. The next one is not so easy: it's a media center that includes Plex. The hard part isn't so much the VM itself as it is the data. Any advice on that? The qemu-img command looks like it brings over the VM, but does it also bring the data?
 
The qemu-img command looks like it brings over the VM, but does it also bring the data?
It will export the virtual disk. It would be good to see your VMs' config files to see how the storage is actually set up. You will find the VM config files in "/etc/pve/qemu-server/<VMID>.conf". The PVE host's storage config is in "/etc/pve/storage.cfg".
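
For orientation, each virtual disk shows up in the VM config as a line pointing at a storage ID from storage.cfg. A hypothetical example (VMID, storage name, and size invented):

cat /etc/pve/qemu-server/100.conf
scsi0: local-zfs:vm-100-disk-0,size=32G

The part before the colon ("local-zfs") is the storage ID defined in /etc/pve/storage.cfg, which tells you whether that disk is a zvol, a qcow2 file on a directory storage, etc.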
 
