Restoring partitions on a replaced ZFS-mirrored SSD

prioritize all of it. the most important would be replacing an rpool member, cloning the partition table from a good drive and handling the proxmox-boot-tool portion before executing the zpool attach.
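
a rough sketch of the manual sequence, assuming /dev/sdX is the surviving healthy member and /dev/sdY is the new blank disk (both placeholders - on a default UEFI install the ESP is partition 2 and the zfs partition is 3):

Code:
# clone the partition table from the healthy disk onto the new disk, then randomize its GUIDs
sgdisk /dev/sdX -R /dev/sdY
sgdisk -G /dev/sdY

# recreate and register the ESP so the node stays bootable from either disk
proxmox-boot-tool format /dev/sdY2
proxmox-boot-tool init /dev/sdY2

# only then attach the zfs partition to the mirror and let it resilver
zpool attach rpool /dev/sdX3 /dev/sdY3
zpool status rpool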

I think that's an operation that almost all noobs will struggle with because it's easy to take for granted that the redundant zfs data is just one partition. I scripted it years ago for a narrow use case and I still have to refer back to it for the steps because I don't do it enough to remember it.

zfs is quickly becoming the "premier" local storage for proxmox, if it isn't already. the PVE GUI should have the complete set of tools.

replace/detach/attach is not asking much and the boot tool is such a PVE-specific thing it absolutely must not be taken for granted.
 
Not that I disagree, but why not file Bugzilla reports for each as feature requests and track them through here? I noticed previously that even on actual bugs people rarely do more than a post in the forum. They do not even +1 themselves on existing reports. Some of what you asked for might even be filed already.

(BTW I was asking the other day about the boot-tool, whether it can e.g. smartly keep a hot spare and auto-create the ESP, etc. Similarly, if one replaces a failed drive, it would be nice if all it took was to hot-swap it and forget about it. But it's not implemented, and then I thought about it - I don't think it's even a priority, because the kind of users that need that use a RAID controller for the OS and are well-versed with zfs. Personally the last thing I would want is a ZFS GUI feature that called an API that went wrong ... if you zfs destroy your pool in your terminal, that's easier to brush off here than to handle when you clicked the same thing in the GUI. :))
 
I'm sure stuff like this is already filed. I have another feature request with tons of +1's going on 5 years old.

it took less than a day to implement in my feasibility testing. it's never getting done, I'm not wasting any more time on obvious feature enrichment. everyone knows the zfs gui is incomplete.
 
I'm sure stuff like this is already filed
Yes: https://bugzilla.proxmox.com/show_bug.cgi?id=3289

I also think this should be added to the webUI. I've stopped counting the cases exactly like this one where I needed to explain to people how to revert a wrong "zpool replace" and do it the proper way according to the wiki, cloning the partitions and the bootloader. And they often even fail to follow the wiki article, using the wrong disks or partitions for the placeholders.
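
For the record, a minimal sketch of undoing a whole-disk "zpool replace" that is still resilvering, with placeholder device names - always check "zpool status" first:

Code:
# the wrongly added disk shows up under a "replacing-N" vdev here
zpool status rpool

# detaching that new device cancels the in-progress replace
zpool detach rpool /dev/sdY

# then follow the wiki: clone the partition table, set up the bootloader
# with proxmox-boot-tool and attach only the ZFS partition (usually partition 3)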
 
everyone knows the zfs gui is incomplete.


And it's not an "important feature for PVE" - I mean, at least it's answered directly, so no need to wonder.

I also think this should be added to the webUI. I've stopped counting the cases exactly like this one where I needed to explain to people how to revert a wrong "zpool replace" and do it the proper way according to the wiki, cloning the partitions and the bootloader.

But are they PVE subscribers? There are outstanding bugs that would have higher priority than this.

And they often even fail to follow the wiki article, using the wrong disks or partitions for the placeholders.

That's what the support is for. :D
 
don't ask people to file feature requests then. they aren't getting done.

those who can, just make their own.
 
I'm sure stuff like this is already filed. I have another feature request with tons of +1's going on 5 years old.

Care to share?

it took less than a day to implement in my feasibility testing. it's never getting done, I'm not wasting any more time on obvious feature enrichment. everyone knows the zfs gui is incomplete.

It might be a strategy; I think I get told all the time that something is not worth doing, out of habit, to see how much it riles one up. :) Still, the +1s would have helped to get an idea; I rarely see them in PVE's BZ.
 
But are they PVE subscribers?
Sometimes yes... paying for a subscription doesn't guarantee that the person is experienced in working with partitions, ZFS or the Linux CLI in general.
There is a lot that could end very badly (a disk of a mirror fails, then someone puts a factory-preformatted new disk in and clones the new disk's GPT onto the remaining existing disk, wiping it, and so on), so I would really like to see such an important feature in the webUI to reduce human errors.
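
A small sanity check before cloning helps here (placeholder paths below); sgdisk replicates from the disk given first onto the disk given after -R, so swapping the two is exactly the scenario above:

Code:
# identify both disks unambiguously by serial and size before touching anything
lsblk -o NAME,SIZE,SERIAL,MODEL

# source (healthy disk) comes first, target (new disk) comes after -R
sgdisk /dev/disk/by-id/<healthy-disk> -R /dev/disk/by-id/<new-disk>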
 
don't ask people to file feature requests then. they aren't getting done.

those who can, just make their own.

I don't work for Proxmox; I just did the same to see how many would +1. Even when they came to the forum to ask for support, I would point them to the existing reports and ask them, since they had run into the same issue, whether they could add a +1 to the bug report. No, nobody. :)
 
Sometimes yes...
Then it's a cost/benefit question. I suspect the few times this reaches support, the load is so small (it's also easy to dispense a link to the wiki) that it's literally less costly than implementing a whole new interface for one particular storage type in what is not a storage appliance at all. I mean, last time I pointed out how the HA stack was lacking with ZFS replication, Dietmar told me I was using it wrong (as in, not shared storage), and Aaron said it's actually all good. Sometimes I wonder what the workflow is, i.e. who picks up the forum / BZ, decides priority, puts things onto the roadmap ... it feels like nothing like that is in place; I imagine it's more driven by support cases - the best clue is when one sees kernel modules that were added due to "user request", strange things like JFS support, etc. That's where the time goes then. Not homelabs.
 
You might want to have a look at:
Code:
journalctl -u zfs-zed

# it's not like they know what your /dev/... was though so maybe you want to show it with
lsblk -o+SERIAL

# better yet have a look at SMART output
smartctl -a /dev/...

# or for nvmes even better yet
apt install nvme-cli
nvme error-log -e 255 /dev/nvme...

It sounds mysterious, but the only proof of the incident remains the system email:

Code:
ZFS has detected that a device was removed.

 impact: Fault tolerance of the pool may be compromised.
    eid: 6
  class: statechange
  state: REMOVED
   host: bs
   time: 2024-02-17 07:51:44+0500
  vpath: /dev/disk/by-id/nvme-nvme.126f-4d4e32333039353132473030383333-51323030304d4e2035313247-00000001-part3
  vguid: 0x65FD66A420471F69
   pool: rpool (0x4010E6931FD3CBB7)

smartctl -a /dev/... shows Available Spare: 93%, so it was enough to return the bad SSD to the shop.

Thanks to all the good people for the help!
 
