Fails to boot after installing on ZFS RAIDZ2 with 10 drives

Rassillon

Member
Mar 20, 2019
I am installing Proxmox 6.3 onto a dual-Xeon Supermicro server with 10 SSDs.

This is a proof-of-concept build.

Basic details of the server:
-1U
-Dual 8-core Xeon
-192gb RAM
-2 SATA controllers
-- SATA controller A handles 6 drive bays
-- SATA controller B handles 4 drive bays
-1 PCIe dual-port 10gb card (Solarflare)
-2 onboard 1gb eth ports
-10 x 50gb SSD

This same exact configuration works perfectly fine installing Proxmox 6.3 onto ZFS utilizing 2 drives, 4 drives, 6 drives, and 8 drives.

As the number of drives changed, I switched from a ZFS mirror to RAIDZ1 and RAIDZ2 as appropriate.

I did not create multiple volumes or pools.

For all installs I allowed the installer to configure ZFS.

When attempting to install using all 10 drives in a RAIDZ2 configuration, everything during the install seems to be just fine.

On reboot, disks sda through sdg are detected properly, and loading stops at sdh with the following lines:

Code:
sdh: sdh1 sdh2 sdh3
sd 8:0:0:0: [sdh] Attached SCSI disk
ata9: SATA link down (SStatus 0 SControl 300)
ata10: SATA link down (SStatus 0 SControl 300)
qla2xxx [0000:20:00.0]-8083:4: cable is unplugged...
qla2xxx [0000:20:00.1]-8083:11: cable is unplugged...

It stops at that point and will not continue even after leaving it for 18+ hrs.

If I hit [Enter], it displays the (initramfs) prompt.

**NOTE: I have had to retype output from a picture so there may be typos**

I have tried moving disks around and essentially the same thing is happening except with a different letter disk.

The 10gb card is currently not connected to cables so those final lines make sense.
However, it only stops loading when it is configured with all 10 drives. Installs with 2, 4, 6, or 8 drives all work just fine.

Can someone explain what is going on?
Is there a fix?

If I can assist the project by providing logs, I will, provided I can get them without too much trouble.

I appreciate any input as long as it is relevant to solving the above issue.
 
hi,

please do not bump the thread without adding any more information. often when nobody is answering, the problem/topic is either too niche or nobody has an idea how to approach it.

from what you posted it seems there is some hardware initialization issue... did you already check here: https://pve.proxmox.com/wiki/ZFS:_Tips_and_Tricks#Boot_fails_and_goes_into_busybox
e.g. the part with rootdelay

does it work with fewer disks in the pool? (e.g. only from one controller?)

can you provide more logs?

also please use a current version of pve (current is 6.4)
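
For readers following along, the rootdelay suggestion above can be sketched roughly as below. This is only a sketch under assumptions: `add_rootdelay` is a hypothetical helper (not something Proxmox ships), the 10-second value is just an example, and it assumes the system boots via GRUB with the usual `/etc/default/grub` file. Installs that boot through a different bootloader keep the kernel command line elsewhere, so check how your system boots first.

```shell
#!/bin/sh
# Hypothetical helper (not part of Proxmox): prepend rootdelay=<seconds> to
# GRUB_CMDLINE_LINUX_DEFAULT in the given file, unless a rootdelay is already set.
add_rootdelay() {
    file="$1"
    seconds="$2"
    # Leave the file untouched if a rootdelay option is already present.
    if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=.*rootdelay=' "$file"; then
        return 0
    fi
    # Insert the option in front of the existing options, inside the quotes.
    sed -i "s/^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"/GRUB_CMDLINE_LINUX_DEFAULT=\"rootdelay=${seconds} \1\"/" "$file"
}

# Typical use on the installed system, as root (assumes GRUB is the bootloader):
#   add_rootdelay /etc/default/grub 10
#   update-grub
```

The linked wiki section also describes sleep settings in `/etc/default/zfs` (for example `ZFS_INITRD_PRE_MOUNTROOT_SLEEP`) followed by rebuilding the initramfs; which of the two helps, if either, likely depends on how slowly the controllers bring up their disks.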
 
Thanks for pointing out the rootdelay detail. I will look at that.

Yes, as I mentioned in my original post, ZFS with 2, 4, 6, or 8 drives works just fine. I suppose I wasn't as clear as I could have been.

Which specific logs would have the detail needed to identify the issue?

I will update to 6.4. It was not convenient for me to update the USB drive, as my workstation and the server are not in the same building.

I will update if anything changes.
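
On the question of which logs to collect, one possible starting point is the kernel log itself, checking how many disks were actually attached before boot stalled. The sketch below is an assumption about what might be useful, not a request from the Proxmox team: the function names are hypothetical, and it assumes `grep` and `sed` are available at whatever prompt you run it from.

```shell
#!/bin/sh
# Hypothetical helpers that read kernel log text on stdin.

# Count the disks the kernel reports as attached.
count_attached_disks() {
    grep -c 'Attached SCSI disk'
}

# Print the ATA ports that report no link, one per line.
list_link_down_ports() {
    sed -n 's/.*\(ata[0-9][0-9]*\): SATA link down.*/\1/p'
}

# Usage at the (initramfs) prompt or on a booted system:
#   dmesg | count_attached_disks
#   dmesg | list_link_down_ports
```

If fewer than 10 disks show up as attached at the point where boot stops, that would be consistent with the pool being imported before every disk is ready. Running `zpool import` with no arguments at the same prompt lists the pools ZFS can currently see and the state of their devices.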
 
Although I acknowledge that there must be something broken, what is your rationale behind installing PVE on a 10-disk raidz?
Wouldn't it make more sense to install on a raid1 or maybe raid10 and set up another pool for the images?
 
I am working on a method to deploy clusters on a variety of hardware configurations while keeping it as simple as possible. I will be providing a step-by-step instruction guide to moderately technical people who very easily could get confused. Sometimes it just is what it is and the "why don't you just" doesn't matter. I have been given requirements and I am working within those guidelines.

This should not be taken as anything other than an answer to your question.
 
I have tried moving disks around and essentially the same thing is happening except with a different letter disk.
Maybe I'm underthinking this, but that sounds like you have 2 bad drives. "SATA link down" sounds like a bad cable, controller, or disk. Moving the disks around and getting an error on a different letter points to the disk or the carrier it is in.

Have you tried doing a smaller array with just the two disks that it complains about?
 
I thought of that, but I have moved the drives around and installed them in multiple arrangements, and the behaviour is always the same.

Every time, the installer sees all drives fine and configures them, and then the system fails to load after reboot.

I think @dcsapak's recommendation about the delay setting is the most likely fix. I just have to find the time to get back over to where the server is so I can re-install.
 
