Restoring partitions on a replaced ZFS-mirrored SSD

Sasha

Hi, guys!
I just found that one of the two SSDs in my ZFS mirror (RAID1) pool had degraded.
After
- replacing the SSD
- running zpool replace -f rpool 7349143022040719209 /dev/disk/by-id/nvme-Q2000MN_512G_MN2312512G00269
- ZFS successfully finished the resilver

But as you can see, the partition layouts now differ, which matters for booting.
What is the correct way to make the new SSD bootable exactly as if it had been set up by the Proxmox installer?

And what should I do now?

zpool detach
delete the partitions from the new SSD
copy the BIOS boot and EFI partitions
proxmox-boot-tool format + init + refresh
zpool attach
?

[attachment: screenshot comparing the partition layouts of the two SSDs]
 
You forgot to manually partition the disk (clone the partition table) and write the bootloader. The correct procedure would have been the one described in the paragraph "Changing a failed bootable device" here: https://pve.proxmox.com/wiki/ZFS_on_Linux
So you could use "zpool remove" and "zpool attach" to remove your replaced disk and then start again.
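
For reference, a rough sketch of that wiki procedure (the device names here are placeholders, adapt them to your disks):
Code:
# clone the partition table from the healthy, bootable disk to the new one
sgdisk <healthy bootable device> -R <new device>
sgdisk -G <new device>   # randomize the GUIDs on the new disk
# replace only the ZFS partition (partition 3 on a default PVE install)
zpool replace -f <pool> <old zfs partition> <new zfs partition>
# make the new disk bootable again (UEFI / proxmox-boot-tool case)
proxmox-boot-tool format <new disk's ESP>
proxmox-boot-tool init <new disk's ESP>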
 
As I understand it, zpool remove|add would destroy the pool, while zpool detach|attach is meant exactly for working with a mirror's disks.
Would you clarify that point?

Code:
root@bs:~# zpool status -v
  pool: rpool
 state: ONLINE
  scan: resilvered 104G in 00:02:26 with 0 errors on Sat Feb 17 09:54:12 2024
config:


    NAME                                                                                       STATE     READ WRITE CKSUM
    rpool                                                                                      ONLINE       0     0     0
      mirror-0                                                                                 ONLINE       0     0     0
        nvme-Q2000MN_512G_MN2312512G00269                                                      ONLINE       0     0     0
        nvme-nvme.126f-4d4e32333039353132473030383738-51323030304d4e2035313247-00000001-part3  ONLINE       0     0     0


errors: No known data errors
 
Sorry, yes. You are right. Using "zpool detach" you could turn that mirror into a single disk pool and later use "zpool attach" to add the disk again to form a mirror (instead of the zpool replace of the wiki).
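
Roughly (a sketch with placeholder device paths, not your exact IDs):
Code:
# drop the half-prepared new disk from the mirror again
zpool detach rpool <new disk's zfs partition>
# clone the partition table and set up the bootloader as per the wiki, then:
zpool attach rpool <existing disk's zfs partition> <new disk's zfs partition>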
 
Look, I detached the new disk from the zpool, deleted its partitions and cloned the partition table with
Code:
sgdisk /dev/nvme1n1 -R /dev/nvme0n1
sgdisk -G /dev/nvme0n1

But now I'm stuck attaching it...

Code:
zpool attach -f rpool /dev/disk/by-id/nvme-nvme.126f-4d4e32333039353132473030383738-51323030304d4e2035313247-00000001-part3 /dev/disk/by-id/nvme.126f-4d4e32333132353132473030323639-51323030304d4e2035313247-00000001-part3

cannot resolve path '/dev/disk/by-id/nvme.126f-4d4e32333132353132473030323639-51323030304d4e2035313247-00000001-part3'

zpool status -v rpool

Code:
pool: rpool
 state: ONLINE
  scan: resilvered 104G in 00:02:26 with 0 errors on Sat Feb 17 09:54:12 2024
config:


    NAME                                                                                     STATE     READ WRITE CKSUM
    rpool                                                                                    ONLINE       0     0     0
      nvme-nvme.126f-4d4e32333039353132473030383738-51323030304d4e2035313247-00000001-part3  ONLINE       0     0     0


errors: No known data errors

ls -la /dev/disk/by-id/

Code:
lrwxrwxrwx 1 root root  13 Feb 18 10:31 nvme-nvme.126f-4d4e32333039353132473030383738-51323030304d4e2035313247-00000001 -> ../../nvme1n1
lrwxrwxrwx 1 root root  15 Feb 18 10:31 nvme-nvme.126f-4d4e32333039353132473030383738-51323030304d4e2035313247-00000001-part1 -> ../../nvme1n1p1
lrwxrwxrwx 1 root root  15 Feb 18 10:31 nvme-nvme.126f-4d4e32333039353132473030383738-51323030304d4e2035313247-00000001-part2 -> ../../nvme1n1p2
lrwxrwxrwx 1 root root  15 Feb 18 10:31 nvme-nvme.126f-4d4e32333039353132473030383738-51323030304d4e2035313247-00000001-part3 -> ../../nvme1n1p3
lrwxrwxrwx 1 root root  13 Feb 18 10:34 nvme-nvme.126f-4d4e32333132353132473030323639-51323030304d4e2035313247-00000001 -> ../../nvme0n1
lrwxrwxrwx 1 root root  15 Feb 18 10:34 nvme-nvme.126f-4d4e32333132353132473030323639-51323030304d4e2035313247-00000001-part1 -> ../../nvme0n1p1
lrwxrwxrwx 1 root root  15 Feb 18 10:34 nvme-nvme.126f-4d4e32333132353132473030323639-51323030304d4e2035313247-00000001-part2 -> ../../nvme0n1p2
lrwxrwxrwx 1 root root  15 Feb 18 10:34 nvme-nvme.126f-4d4e32333132353132473030323639-51323030304d4e2035313247-00000001-part3 -> ../../nvme0n1p3
lrwxrwxrwx 1 root root  13 Feb 18 10:31 nvme-Q2000MN_512G_MN2309512G00878 -> ../../nvme1n1
lrwxrwxrwx 1 root root  13 Feb 18 10:31 nvme-Q2000MN_512G_MN2309512G00878_1 -> ../../nvme1n1
lrwxrwxrwx 1 root root  15 Feb 18 10:31 nvme-Q2000MN_512G_MN2309512G00878_1-part1 -> ../../nvme1n1p1
lrwxrwxrwx 1 root root  15 Feb 18 10:31 nvme-Q2000MN_512G_MN2309512G00878_1-part2 -> ../../nvme1n1p2
lrwxrwxrwx 1 root root  15 Feb 18 10:31 nvme-Q2000MN_512G_MN2309512G00878_1-part3 -> ../../nvme1n1p3
lrwxrwxrwx 1 root root  15 Feb 18 10:31 nvme-Q2000MN_512G_MN2309512G00878-part1 -> ../../nvme1n1p1
lrwxrwxrwx 1 root root  15 Feb 18 10:31 nvme-Q2000MN_512G_MN2309512G00878-part2 -> ../../nvme1n1p2
lrwxrwxrwx 1 root root  15 Feb 18 10:31 nvme-Q2000MN_512G_MN2309512G00878-part3 -> ../../nvme1n1p3
lrwxrwxrwx 1 root root  13 Feb 18 10:34 nvme-Q2000MN_512G_MN2312512G00269 -> ../../nvme0n1
lrwxrwxrwx 1 root root  13 Feb 18 10:34 nvme-Q2000MN_512G_MN2312512G00269_1 -> ../../nvme0n1
lrwxrwxrwx 1 root root  15 Feb 18 10:34 nvme-Q2000MN_512G_MN2312512G00269_1-part1 -> ../../nvme0n1p1
lrwxrwxrwx 1 root root  15 Feb 18 10:34 nvme-Q2000MN_512G_MN2312512G00269_1-part2 -> ../../nvme0n1p2
lrwxrwxrwx 1 root root  15 Feb 18 10:34 nvme-Q2000MN_512G_MN2312512G00269_1-part3 -> ../../nvme0n1p3
lrwxrwxrwx 1 root root  15 Feb 18 10:34 nvme-Q2000MN_512G_MN2312512G00269-part1 -> ../../nvme0n1p1
lrwxrwxrwx 1 root root  15 Feb 18 10:34 nvme-Q2000MN_512G_MN2312512G00269-part2 -> ../../nvme0n1p2
lrwxrwxrwx 1 root root  15 Feb 18 10:34 nvme-Q2000MN_512G_MN2312512G00269-part3 -> ../../nvme0n1p3
 
Code:
zpool attach -f rpool\
 /dev/disk/by-id/nvme-nvme.126f-4d4e32333039353132473030383738-51323030304d4e2035313247-00000001-part3\
 /dev/disk/by-id/nvme.126f-4d4e32333132353132473030323639-51323030304d4e2035313247-00000001-part3

It's a duplicate. So that was a typo.

Code:
cannot resolve path '/dev/disk/by-id/nvme.126f-4d4e32333132353132473030323639-51323030304d4e2035313247-00000001-part3'
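
(For what it's worth, judging from the ls -la /dev/disk/by-id/ output above, the path that fails to resolve just seems to be missing the leading nvme- prefix; using the prefixed symlink from that listing, something along these lines should resolve:)
Code:
zpool attach -f rpool \
 /dev/disk/by-id/nvme-nvme.126f-4d4e32333039353132473030383738-51323030304d4e2035313247-00000001-part3 \
 /dev/disk/by-id/nvme-nvme.126f-4d4e32333132353132473030323639-51323030304d4e2035313247-00000001-part3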
 
Look, I detached the new disk from the zpool, deleted its partitions and cloned the partition table with
Code:
sgdisk /dev/nvme1n1 -R /dev/nvme0n1
sgdisk -G /dev/nvme0n1

Code:
lrwxrwxrwx 1 root root  13 Feb 18 10:31 nvme-nvme.126f-4d4e32333039353132473030383738-51323030304d4e2035313247-00000001 -> ../../nvme1n1
lrwxrwxrwx 1 root root  15 Feb 18 10:31 nvme-nvme.126f-4d4e32333039353132473030383738-51323030304d4e2035313247-00000001-part1 -> ../../nvme1n1p1
lrwxrwxrwx 1 root root  15 Feb 18 10:31 nvme-nvme.126f-4d4e32333039353132473030383738-51323030304d4e2035313247-00000001-part2 -> ../../nvme1n1p2
lrwxrwxrwx 1 root root  15 Feb 18 10:31 nvme-nvme.126f-4d4e32333039353132473030383738-51323030304d4e2035313247-00000001-part3 -> ../../nvme1n1p3
lrwxrwxrwx 1 root root  13 Feb 18 10:34 nvme-nvme.126f-4d4e32333132353132473030323639-51323030304d4e2035313247-00000001 -> ../../nvme0n1
lrwxrwxrwx 1 root root  15 Feb 18 10:34 nvme-nvme.126f-4d4e32333132353132473030323639-51323030304d4e2035313247-00000001-part1 -> ../../nvme0n1p1
lrwxrwxrwx 1 root root  15 Feb 18 10:34 nvme-nvme.126f-4d4e32333132353132473030323639-51323030304d4e2035313247-00000001-part2 -> ../../nvme0n1p2
lrwxrwxrwx 1 root root  15 Feb 18 10:34 nvme-nvme.126f-4d4e32333132353132473030323639-51323030304d4e2035313247-00000001-part3 -> ../../nvme0n1p3

You might want to run partprobe after that -G. Never mind, they are actually different! :D
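
Something like this (assuming nvme0n1 is the freshly cloned disk):
Code:
partprobe /dev/nvme0n1   # re-read the partition table
udevadm settle           # wait for the by-id/by-partlabel symlinks to show up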
 
Btw, I actually prefer to use partlabels for this reason. Of course, make them somewhat unique, but unique in a human-readable way.
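
For example (illustrative label name, set on the ZFS partition of the new disk):
Code:
sgdisk -c 3:rpool-mirror-b /dev/nvme0n1   # name GPT partition 3
# udev then creates /dev/disk/by-partlabel/rpool-mirror-b
zpool attach rpool <existing disk's zfs partition> /dev/disk/by-partlabel/rpool-mirror-b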
 
Guys, one more thing I'm stuck on 8)
proxmox-boot-tool status
Code:
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
System currently booted with uefi
0FF5-B5BA is configured with: uefi (versions: 6.5.11-7-pve, 6.5.11-8-pve)

How can I add the new SSD here?
The documentation is unclear (to me).

proxmox-boot-tool format <new disk's ESP>
 
ls -la /dev/disk/by-uuid/*

Code:
lrwxrwxrwx 1 root root 10 Feb 18 10:12 /dev/disk/by-uuid/06c2dfd4-5749-4e84-b8c6-9a7317428503 -> ../../sda1
lrwxrwxrwx 1 root root 15 Feb 18 10:31 /dev/disk/by-uuid/0FF5-B5BA -> ../../nvme1n1p2
lrwxrwxrwx 1 root root 15 Feb 18 11:15 /dev/disk/by-uuid/4616443137623313335 -> ../../nvme0n1p3
lrwxrwxrwx 1 root root  9 Feb 17 15:44 /dev/disk/by-uuid/850a1f49-5ce8-4874-a3f5-fa8f3132e3f4 -> ../../sdb
 
Guys, one more thing I'm stuck on 8)
proxmox-boot-tool status
Code:
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
System currently booted with uefi
0FF5-B5BA is configured with: uefi (versions: 6.5.11-7-pve, 6.5.11-8-pve)

How can I add the new SSD here?
The documentation is unclear (to me).

proxmox-boot-tool format <new disk's ESP>
proxmox-boot-tool init /dev/... (the ESP one)

Then check with efibootmgr -v
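
A minimal sketch, assuming nvme0n1 is the new disk and nvme0n1p2 is its (cloned, still unformatted) ESP:
Code:
proxmox-boot-tool format /dev/nvme0n1p2
proxmox-boot-tool init /dev/nvme0n1p2
proxmox-boot-tool status   # the new ESP's UUID should now be listed as well
efibootmgr -v              # and there should be a boot entry pointing at it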
 
Would you clarify which /dev exactly?

ls -la /dev/nvme*

Code:
crw------- 1 root root 241, 0 Feb 17 15:44 /dev/nvme0
brw-rw---- 1 root disk 259, 0 Feb 18 10:34 /dev/nvme0n1
brw-rw---- 1 root disk 259, 1 Feb 18 10:34 /dev/nvme0n1p1
brw-rw---- 1 root disk 259, 2 Feb 18 10:34 /dev/nvme0n1p2
brw-rw---- 1 root disk 259, 7 Feb 18 11:15 /dev/nvme0n1p3
crw------- 1 root root 241, 1 Feb 17 15:44 /dev/nvme1
brw-rw---- 1 root disk 259, 3 Feb 18 10:31 /dev/nvme1n1
brw-rw---- 1 root disk 259, 4 Feb 18 10:31 /dev/nvme1n1p1
brw-rw---- 1 root disk 259, 5 Feb 18 10:31 /dev/nvme1n1p2
brw-rw---- 1 root disk 259, 6 Feb 18 10:31 /dev/nvme1n1p3
 
ls -la /dev/disk/by-uuid/*

Code:
lrwxrwxrwx 1 root root 10 Feb 18 10:12 /dev/disk/by-uuid/06c2dfd4-5749-4e84-b8c6-9a7317428503 -> ../../sda1
lrwxrwxrwx 1 root root 15 Feb 18 10:31 /dev/disk/by-uuid/0FF5-B5BA -> ../../nvme1n1p2
lrwxrwxrwx 1 root root 15 Feb 18 11:15 /dev/disk/by-uuid/4616443137623313335 -> ../../nvme0n1p3
lrwxrwxrwx 1 root root  9 Feb 17 15:44 /dev/disk/by-uuid/850a1f49-5ce8-4874-a3f5-fa8f3132e3f4 -> ../../sdb

Another giveaway with a FAT partition is that it has a short UUID, but generally I prefer to know what I am doing, i.e. to know it is exactly the right device and the right partition that was meant to be the ESP, and not format a random FAT partition on a random device. ;)
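
To double-check which partition is the ESP before formatting anything, something like this helps (column names as in current util-linux lsblk):
Code:
lsblk -o NAME,SIZE,FSTYPE,PARTTYPENAME,UUID /dev/nvme0n1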
 
Guys, it's OK now and Proxmox works, thanks to all the good people who helped me.
I need one more piece of advice.

Look, the degraded SSD is only about a week old, from the shop. Not a good situation...
To return it to the shop I must prove that the problem was with that SSD...

Is it possible to find proof somewhere in the syslogs that the mirror was degraded?
 
Guys, it's OK now and Proxmox works, thanks to all the good people who helped me.
I need one more piece of advice.

Look, the degraded SSD is only about a week old, from the shop. Not a good situation...
To return it to the shop I must prove that the problem was with that SSD...

Is it possible to find proof somewhere in the syslogs that the mirror was degraded?

You might want to have a look at:
Code:
journalctl -u zfs-zed

# it's not like they know what your /dev/... was though so maybe you want to show it with
lsblk -o+SERIAL

# better yet have a look at SMART output
smartctl -a /dev/...

# or for nvmes even better yet
apt install nvme-cli
nvme error-log -e 255 /dev/nvme...
 
