Hi there,
I'm having a problem that I don't really understand.
First of all I need to say a few things. As of right now, I'm unable to reboot any of the infrastructure.
We're running PVE 5.2-1.
We're using two zpools built on partitions.
We have a hot-plug LSI SAS HBA.
Our storage is a RAID-Z2 based on 4 x 2 TB HDDs.
Now that we've discovered a huge performance impact, we wanted to upgrade to SSDs.
I thought it would be easy going because of ZFS, but I'm running into dead ends and am unable to replace the drives.
Here's some info before I continue:
Code:
  pool: storagepool
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub canceled on Mon Oct 12 19:12:12 2020
config:

        NAME                                      STATE     READ WRITE CKSUM
        storagepool                               DEGRADED     0     0     0
          raidz2-0                                DEGRADED     0     0     0
            4521177c-9202-4dac-9148-9cc506978733  OFFLINE      0     0     0
            f5c62fd6-81b8-4434-8c59-ded9afeac6d6  ONLINE       0     0     0
            37a6aa01-c827-475c-ada7-c5de36e2ab82  ONLINE       0     0     0
            719737a4-5aa7-4195-ab61-7f809a897fcd  ONLINE       0     0     0

errors: No known data errors

  pool: syspool
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub canceled on Wed Oct 14 14:15:04 2020
config:

        NAME                                      STATE     READ WRITE CKSUM
        syspool                                   DEGRADED     0     0     0
          raidz2-0                                DEGRADED     0     0     0
            fe544f72-0f75-4292-bbf9-cfbc4c92cdc0  OFFLINE      0     0     0
            e4528c9b-a6b0-4f0e-8bc5-ae89ecb89725  ONLINE       0     0     0
            6bb412e9-c5be-4e2b-98df-cf3d6047cc3a  ONLINE       0     0     0
            23f8eb12-5386-4a23-8e3c-2e783e566091  ONLINE       0     0     0

errors: No known data errors
4521177c-9202-4dac-9148-9cc506978733 and fe544f72-0f75-4292-bbf9-cfbc4c92cdc0 were partitions on one of the HDDs.
I manually set the devices to offline, copied the partition table with sgdisk --backup=table /dev/sda, took the HDD out, and applied it
to the new SSD with sgdisk --load-backup=table /dev/sda.
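For reference, the whole swap looked roughly like this (device names here are examples; note that --load-backup restores the disk and partition GUIDs verbatim, which is probably relevant later):

Code:
# old HDD assumed to be /dev/sda, new SSD assumed to be /dev/sde
sgdisk --backup=table /dev/sda        # dump the GPT (including all GUIDs) to a file
sgdisk --load-backup=table /dev/sde   # write the identical GPT onto the SSD
sgdisk --print /dev/sde               # verify the layout made it over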
After that I tried to replace the disk, which did not work.
I'm always getting the following error:
Code:
root@host:/tmp/oldroot# zpool replace storagepool 4521177c-9202-4dac-9148-9cc506978733 4521177c-9202-4dac-9148-9cc506978733
cannot open '/dev/disk/by-partuuid/4521177c-9202-4dac-9148-9cc506978733': Device or resource busy
cannot replace 4521177c-9202-4dac-9148-9cc506978733 with 4521177c-9202-4dac-9148-9cc506978733: 4521177c-9202-4dac-9148-9cc506978733 is busy
I thought it might be because of the identical UUIDs, so I changed them, physically detached the drive, and put it back in.
But that didn't help. Neither did wiping the partition, detaching the drive, and sticking it back in.
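In case it matters, the UUID change and the wipe were done roughly like this (from memory; partition 4 on /dev/sde as an example):

Code:
sgdisk --partition-guid=4:$(uuidgen) /dev/sde   # assign a fresh PARTUUID to partition 4
partprobe /dev/sde                              # ask the kernel to re-read the table
wipefs -a /dev/sde4                             # clear filesystem/ZFS signatures
zpool labelclear -f /dev/sde4                   # clear any stale ZFS label on the partition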
So far I've tried everything I can think of, and I can't get ZFS to accept the new drive and resilver.
I suspect that a reboot would fix this issue, but like I said, that's not an option right now.
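What I was hoping for is something that has the same effect as a reboot at the SCSI/udev level. Something like this is what I had in mind, assuming host0 is the right HBA port (untested on this box):

Code:
echo 1 > /sys/block/sde/device/delete           # detach the SSD from the SCSI layer
echo "- - -" > /sys/class/scsi_host/host0/scan  # rescan the HBA so the disk comes back
udevadm trigger && udevadm settle               # rebuild the /dev/disk/by-partuuid links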
Code:
Disk /dev/sdc: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 9B129317-6957-4181-B4D2-16E394897FE2

Device         Start        End    Sectors  Size Type
/dev/sdc1         34       2047       2014 1007K BIOS boot
/dev/sdc2       2048   25167871   25165824   12G Linux swap
/dev/sdc3   25167872  130025471  104857600   50G Solaris /usr & Apple ZFS
/dev/sdc4  130025472 3907029134 3777003663  1.8T Solaris /usr & Apple ZFS

Partition 1 does not start on physical sector boundary.

Disk /dev/sdd: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 53D398CF-DAC4-4831-953C-202E8A2A91A3

Device         Start        End    Sectors  Size Type
/dev/sdd1         34       2047       2014 1007K BIOS boot
/dev/sdd2       2048   25167871   25165824   12G Linux swap
/dev/sdd3   25167872  130025471  104857600   50G Solaris /usr & Apple ZFS
/dev/sdd4  130025472 3907029134 3777003663  1.8T Solaris /usr & Apple ZFS

Partition 1 does not start on physical sector boundary.

Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: F5271248-CB30-45E6-A7AD-3C87C5E66766

Device         Start        End    Sectors  Size Type
/dev/sdb1         34       2047       2014 1007K BIOS boot
/dev/sdb2       2048   25167871   25165824   12G Linux swap
/dev/sdb3   25167872  130025471  104857600   50G Solaris /usr & Apple ZFS
/dev/sdb4  130025472 3907029134 3777003663  1.8T Solaris /usr & Apple ZFS

Disk /dev/sde: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 7940BD9D-BA0E-479E-AB61-0CBE31BBA2D9

Device         Start        End    Sectors  Size Type
/dev/sde1         34       2047       2014 1007K BIOS boot
/dev/sde2       2048   25167871   25165824   12G Linux swap
/dev/sde3   25167872  130025471  104857600   50G Solaris /usr & Apple ZFS
/dev/sde4  130025472 3907029134 3777003663  1.8T Solaris /usr & Apple ZFS
/dev/sde is the new SSD with the partition table from /dev/sda.
What I haven't tried yet is skipping the table copy: wiping everything and creating the partitions manually.
But I don't know what that would change.
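If I did partition it by hand, I'd expect it to look something like this, with the sector numbers taken from the fdisk output above (untested; BF01 is sgdisk's type code for Solaris /usr & Apple ZFS, and -a 1 allows the unaligned start of partition 1):

Code:
sgdisk --zap-all /dev/sde                                  # start from a blank GPT
sgdisk -a 1 -n 1:34:2047              -t 1:EF02 /dev/sde   # BIOS boot
sgdisk      -n 2:2048:25167871        -t 2:8200 /dev/sde   # Linux swap, 12G
sgdisk      -n 3:25167872:130025471   -t 3:BF01 /dev/sde   # ZFS, 50G
sgdisk      -n 4:130025472:3907029134 -t 4:BF01 /dev/sde   # ZFS, 1.8T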
On top of that, those four drives need to be resilvered by Sunday, since we have a really heavy maintenance window then and will need the I/O power more than ever.
Does anyone know a trick to get the SSD into ZFS without rebooting?
Thanks to all
aka