Hello,
I have the following issue: I tried to replace an old 2TB NVMe disk which hosts all my VMs.
I had a 4TB SATA drive, which I added to my single ZFS pool. This worked fine, but my pool was still in DEGRADED status.
I then decided to get another 4TB drive and add it as a replacement for the faulty NVMe, so that I would end up with a mirror of 2 x 4TB disks.
I ran the following:
Step 1: Attach the new disk to the ZFS pool:
Bash:
zpool attach storage nvme1n1 ata-CT4000MX500SSD1_2245E6836E78
Step 2: Replace the faulty NVMe disk:
Bash:
zpool replace storage nvme1n1 ata-CT4000MX500SSD1_2247E689BA0E
After the resilvering, my Proxmox system wouldn't start.
I ended up unplugging the two new disks, and my system came back online again.
My pool now looks as follows:
Bash:
root@pve:~# zpool status -v storage
  pool: storage
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: resilvered 0B in 00:27:02 with 5 errors on Fri Sep 29 19:52:30 2023
config:

        NAME                        STATE     READ WRITE CKSUM
        storage                     DEGRADED     0     0     0
          mirror-0                  DEGRADED    17     0     0
            replacing-0             DEGRADED    17     0     0
              nvme1n1               DEGRADED    18     0     5  too many errors
              13263678074942122905  UNAVAIL      0     0     0  was /dev/disk/by-id/ata-CT4000MX500SSD1_2247E689BA0E-part1
            6174270007834866697     UNAVAIL      0     0     0  was /dev/disk/by-id/ata-CT4000MX500SSD1_2245E6836E78-part1

errors: Permanent errors have been detected in the following files:

        storage/vms/vm-102-disk-1@__replicate_102-0_1670976008__:<0x1>
        storage/vms/vm-705-disk-0:<0x1>
        storage/vms/vm-504-disk-0@StartDemo:<0x1>
        storage/vms/vm-701-disk-0:<0x1>
root@pve:~#
What should I do now?
- Try to add the disks to the pool again? How can I clean up the pool to start all over again?
- Create a new pool and copy the data over from the degraded pool? How would I do that? (I've sketched my rough guesses for both options below.)
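In case it helps, here is roughly what I had in mind for each option. These are only my guesses, not tested, and the pool name newpool is just a placeholder, so please correct me if any of this is wrong:

Bash:
# Option 1 (my guess): detach the two UNAVAIL disks by their GUIDs and
# clear the error counters, so I can retry the attach/replace from scratch
zpool detach storage 13263678074942122905
zpool detach storage 6174270007834866697
zpool clear storage

# Option 2 (my guess): create a new mirrored pool on the two 4TB disks,
# then copy each dataset over with send/receive, e.g. for one VM disk
zpool create newpool mirror ata-CT4000MX500SSD1_2245E6836E78 ata-CT4000MX500SSD1_2247E689BA0E
zfs snapshot storage/vms/vm-705-disk-0@migrate
zfs send storage/vms/vm-705-disk-0@migrate | zfs recv newpool/vms/vm-705-disk-0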
Thanks a lot for your help,
Eric