ZFS Pool state "Degraded"

Dexter23

Member
Dec 23, 2021
Hi everyone,
I have a customer whose Proxmox server has these disks installed:
disk0: SSD Team 1TB
disk1: SSD WDC_WDS100T2G0A
Initially disk0 was the same model as disk1. Four months ago the ZFS pool went into a degraded state, so we replaced that disk with the Team 1TB SSD it has now.
Now the problem has repeated: Proxmox tells me the ZFS pool is degraded, as you can see:
[screenshot: 1676040455806.png]

But SMART tells me the disks are good:
[screenshot: 1676040488932.png]

So I wanted to understand whether the disk already has a physical problem or whether it is a software problem.
Thanks
 
I would replace the SATA cables and see if that helps.

By the way, it's highly recommended to use proper enterprise/datacenter SSDs with power-loss protection when using ZFS. Don't expect a reliable system from cheap consumer disks. Those might be fine for a homelab where you don't care about downtime or your data, but not for a production system.
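After swapping the cables, the pool can be checked and its error counters reset; a generic sketch, assuming the Proxmox default pool name "rpool" (yours may differ):

```shell
# Show pool health and per-device read/write/checksum error counters
zpool status -v rpool

# If the pool looks healthy again after the cable swap,
# reset the error counters so any new errors stand out
zpool clear rpool

# Then verify all data on the pool with a scrub
zpool scrub rpool
```

If the checksum errors come back after a cable swap and a clean scrub, the disk itself becomes the prime suspect.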
 
Hi
I ordered two new Kingston DC500M SSDs. I want to ask if it's possible to move the ZFS partition from the 1TB SSD to a smaller 960GB one.
Thanks
 
Shrinking a formatted partition is always risky and might corrupt your data. I would back everything up and then do a reinstall instead.
 
if it's possible to move the ZFS partition from the 1TB SSD to a smaller 960GB one.
As far as I know there are no tools that would make this possible.

Do you have free disk connectors in that host? The rough procedure:

  • Prepare the new SSDs with a partition table similar to the one on the larger disks. If you boot from that pool, you also need to make them boot-capable: search for "proxmox-boot-tool" here in the forum and in the PVE wiki.
  • Create a new pool using the partitions on the new disks.
  • Use "zfs send" to copy the data from the old pool to the new one.
  • Export the new pool. The new pool should end up with the name of the old one, but I am not sure how to rename it to "rpool" while the old rpool is still in use...
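The send/receive part could look roughly like this. A hedged sketch only: the pool name "newpool", the snapshot name "migrate", and the device paths are placeholders I am assuming for illustration; adjust them to your actual layout.

```shell
# Create the new mirrored pool on the prepared partitions (placeholder device names)
zpool create newpool mirror \
    /dev/disk/by-id/NEW-DISK-1-part3 \
    /dev/disk/by-id/NEW-DISK-2-part3

# Take a recursive snapshot of the old pool as a consistent copy source
zfs snapshot -r rpool@migrate

# Replicate all datasets and their properties into the new pool
zfs send -R rpool@migrate | zfs receive -F newpool

# Export the new pool so it can be imported later
zpool export newpool
```

A pool can be renamed at import time (`zpool import newpool rpool`), but only once the old rpool is no longer imported, which is exactly the chicken-and-egg problem when rpool is the running root.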

Only if everything went well can you disconnect the old pool and run from the new one. If the machine is mission-critical and downtime must be short, I would test this scenario beforehand; I am sure there are some additional pitfalls.

The positive aspect of this approach: it does not destroy or tamper with the original pool. But it is cumbersome and not really easy.

It really is much easier to "attach" a new, large-enough disk as a third member of the mirror, wait for the data to be copied, and then remove the old one. Though the boot-mechanism preparation is also required for this approach.
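That simpler route could be sketched like this; the device paths are placeholders, assumed here for illustration:

```shell
# Attach the new disk's ZFS partition as a third member of the existing mirror
zpool attach rpool \
    /dev/disk/by-id/OLD-GOOD-DISK-part3 \
    /dev/disk/by-id/NEW-DISK-part3

# Watch the resilver progress until it completes
zpool status rpool

# Once resilvered, remove the old, failing disk from the mirror
zpool detach rpool /dev/disk/by-id/OLD-FAILING-DISK-part3
```

Note that `zpool attach` only works here because the new disk is at least as large as the existing partition; it would not help with the 1TB-to-960GB shrink discussed above.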

Good luck
 
Hi
I moved the suspicious disk to another slot on the server, and I get the same kernel error on the monitor (see the screenshot attached), even though SMART tells me the disk is OK.
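SMART's overall health flag often still reports "PASSED" on a failing disk; a long self-test and the detailed attributes are more telling. A generic check, where /dev/sdX stands in for the suspicious disk:

```shell
# Start an extended offline self-test (can take an hour or more on a 1TB SSD)
smartctl -t long /dev/sdX

# After it finishes, review the self-test log, attributes, and error counters
smartctl -a /dev/sdX

# Also scan the kernel log for ATA/link errors pointing at this disk
dmesg | grep -iE 'ata|i/o error'
```

If the same errors follow the disk across slots and cables, that points at the disk rather than the backplane or controller.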
 

Attachments

  • kerneldiskerror.jpg (490.1 KB)
