ZFS - Offline command does nothing

SHSA

New Member
Apr 18, 2023
Hi,

I have a ZFS pool in which a drive is failing.
I want to replace it in the same bay.
Typing:
zpool offline (array name) (id)
Does nothing. I get no feedback, and the disk status does not change; it still shows as faulted. It acts as if I never ran the command.

I have tried the GUID, the UUID, the shorthand name, every ID under the sun for this device, and it does nothing. Proxmox's GUI does nothing either.

Any ideas?
 

Attachment: rtV8nzxsta.png (26.1 KB)
So, just clarifying, you are typing:
zpool offline NXWitnessStorage ata-WDC_WUH721816ALE6L4_3HH568PN

Correct, I am typing that exact line.
zpool offline NXWitnessStorage ata-WDC_WUH721816ALE6L4_3HH568PN
and nothing happens. The command returns as if it worked, but with no feedback, and checking the full status again with -v shows the disk is still there and still faulted.

If I try replace instead, it basically says the disk is still in the array and that I need to force its removal. I didn't want to do that, out of caution. I'll be heading back there shortly and can get you a screenshot.

As I understand it, the replace command would normally be easy, but this is a 4-bay server, so I can't just plug in the new drive alongside the old one and type:
zpool replace (name) (old drive) (new drive)
because the old drive would be physically removed by then.
 
@Dunuin This looks particularly dangerous, as with a faulted disk he's running with no redundancy left on that Z1. My thought would be to back up the data, physically replace the faulted drive, and then run the ZFS replace command to bring everything back into a non-degraded state.

However, that's a pretty "Exciting" way of doing it, what are your thoughts? [Given you have a lot of ZFS experience.] That offline command not working is particularly interesting.
 
Actually, how creative are you feeling? If you secure the replacement drive [in a separate secure bay/caddy] outside the case, you could run its data and power cables from the server's internals out to it, allowing you to run the 4-bay array plus the extra drive. After the restoration, you could then shut the server down and sort out the bays accordingly. It's a bit ghetto, but it could potentially be an option.
 
Backing up the data isn't viable. They're 16TB drives each. That's a big array.
We don't keep an array that big on us. It's an NVR.

There are no other bays; the only way would be an external USB caddy... and I'm sure the client would like to keep his NVR running.

So it's likely just going to be a Hail Mary: just go for it.
I'm thinking of just force-detaching it, putting the new drive in, and then doing whatever the process is to bring the new drive into the pool.
 
zpool replace (name) (old drive) (new drive) because the old drive would be physically removed by then.
You can replace a physically removed drive. ZFS isn't using that drive's data anyway and will resilver from the data + parity on the remaining 3 disks. But if ZFS doesn't let you take the disk offline, I wouldn't be surprised if the replace command failed too.
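
A minimal sketch of that same-bay swap, assuming the pool and old-disk names quoted in this thread. NEW_DISK is a hypothetical placeholder for the replacement's /dev/disk/by-id name, and the run helper is only an illustration: it prints each command by default (DRY_RUN=1), so nothing touches the pool until you set DRY_RUN=0. If the old name no longer resolves once the drive is pulled, the vdev GUID listed by zpool status -g should be usable as the old device instead.

```shell
#!/bin/sh
# Sketch only: replace a raidz1 member that has been physically swapped
# in the same bay. Names below come from this thread unless noted.
POOL="NXWitnessStorage"
OLD_DISK="ata-WDC_WUH721816ALE6L4_3HH568PN"
# Hypothetical placeholder: the by-id name of the NEW disk once installed.
NEW_DISK="/dev/disk/by-id/ata-NEW_DISK_SERIAL"

# run: print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1, before pulling the old drive: record the pool state and the
# vdev GUIDs, in case the old device has to be named by GUID later.
before_swap() {
    run zpool status -v "$POOL"
    run zpool status -g "$POOL"
}

# Step 2, after the new drive is seated in the bay: start the replacement
# and look at the status output, which should show the resilver.
after_swap() {
    run zpool replace "$POOL" "$OLD_DISK" "$NEW_DISK"
    run zpool status -v "$POOL"
}
```

Calling before_swap, swapping the drive, then calling after_swap prints the commands in order; review them before switching to DRY_RUN=0.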

@Dunuin This looks particularly dangerous, as he's running with no redundancy on that Z1. However, that's a pretty "Exciting" way of doing it, what are your thoughts?
Yeah, that's why it's usually better to get twice as many drives at half the size and then do a raidz2. But it's not that bad with a recent backup. I personally replicate my raidz1 pools to other pools, so it's not that bad when a raidz1 pool is lost.

That offline command not working is particularly interesting.
Not sure about that. I haven't seen ZFS fail to respond like that yet.
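
Since zpool offline normally prints nothing when it succeeds, one way to tell a silent success from a silent no-op is to check the exit status and then read the device's state from the status output. The following is a rough sketch under the same assumptions as the names quoted in this thread; run and device_state are hypothetical helpers, and the script only prints commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: check whether `zpool offline` actually changed anything.
POOL="NXWitnessStorage"
DISK="ata-WDC_WUH721816ALE6L4_3HH568PN"

# run: print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# device_state NAME: read `zpool status` output on stdin and print the
# STATE column for the line whose first field is exactly NAME.
# Assumes the device is listed under the same name you pass in.
device_state() {
    awk -v dev="$1" '$1 == dev { print $2 }'
}

offline_and_verify() {
    run zpool offline "$POOL" "$DISK"
    # A non-zero exit status means the command itself reported failure.
    echo "exit status: $?"
    run zpool status -v "$POOL"
}
```

With DRY_RUN=0, piping the status output through the helper, as in zpool status "$POOL" | device_state "$DISK", prints just the state of that one disk, which makes a before/after comparison easier than reading the full listing.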
 
