[SOLVED] zpool delete and change disk after crash ...

martos

Hello,

I have a cluster on Proxmox VE 7.4 (upgraded 7.1 → 7.2 → 7.3 → 7.4) with three nodes:
PROXMXLPROD1 (production VMs: 10)
PROXMXLDBA1 (database VMs: 5)
PROXMXLDEV1 (dev VMs: 35)
These 3 hosts are ProLiant ML350p Gen8, 144 GB RAM, 2x Xeon E5-2640 (latest BIOS and firmware for all components).

Each of these 3 hosts has:

1 LVM + boot (Proxmox system, kernel 5.15.102-1-pve)
4 ZFS pools:
ZFS_DEV (2 TB)
ZFS_PROD (2 TB)
ZFS_DBA (2 TB)
ZFS_BACKUP (2 TB)

1 Proxmox Backup Server (PBS) VM
with the following storage:
/dev/sda 35 GB ext4 (on ZFS_DEV, ZFS_PROD or ZFS_DBA), PBS system, kernel 5.15.53-1
/dev/sdb 2 TB ext4 (on ZFS_BACKUP, named PBS_EXT4_BCK)

Some VMs...
The PROD VMs are replicated to DEV (every 7 min) and to DBA (every 15 min) over a dedicated 10 Gb network.
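The replication jobs can be checked on each node with something like this:

# show the storage replication jobs configured on this node and their last sync state
pvesr status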

The storage of the cluster (/etc/pve/storage.cfg):

zfspool: DC_ZFS_DBA
        pool ZFS_DBA
        content images,rootdir
        mountpoint /ZFS_DBA
        sparse 0

zfspool: DC_ZFS_PROD
        pool ZFS_PROD
        content rootdir,images
        mountpoint /ZFS_PROD
        sparse 0

zfspool: DC_ZFS_DEV
        pool ZFS_DEV
        content rootdir,images
        mountpoint /ZFS_DEV
        sparse 0

zfspool: DC_ZFS_BCK
        pool ZFS_BACKUP
        content rootdir,images
        mountpoint /ZFS_BACKUP
        sparse 0

pbs: PBS-DBA_BCK
        datastore PBS_EXT4_BCK
        server XX.XXX.XXX.120
        content backup
        fingerprint c5:......:b2
        nodes PROXMXLDBA1
        prune-backups keep-all=1
        username root@pam

pbs: PBS-DEV_BCK
        datastore PBS_EXT4_BCK
        server XX.XXX.XXX.130
        content backup
        fingerprint 42:......:07
        nodes PROXMXLDEV1
        prune-backups keep-all=1
        username root@pam

pbs: PBS-PROD_BCK
        datastore PBS_EXT4_BCK
        server XX.XXX.XXX.140
        content backup
        fingerprint 13:......:86
        nodes PROXMXLPROD1
        prune-backups keep-all=1
        username root@pam


Everything worked well, but with a lot of backups running at the same time, PROXMXLPBS-DEV1 (the Proxmox PBS VM on the DEV1 node) had a few system resets at night while it was making backups: 4 resets in 1 year.
After each reboot I lost the ZFS_BACKUP pool, so I ran:
zpool scrub ZFS_BACKUP
and it was OK again after some time.
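For reference, the recovery sequence is roughly this (the status calls are just to check, the scrub does the repair):

# check the pool state and any reported errors
zpool status -v ZFS_BACKUP
# start a scrub to verify and repair the data
zpool scrub ZFS_BACKUP
# run again later to follow the scrub progress
zpool status ZFS_BACKUP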

But last week we had a power problem and I lost ZFS_BACKUP completely on PROXMXLDEV1.
I ran:
zpool import -FX ZFS_BACKUP
but just after that I got a kernel panic: dmu.c:1123:dmu_write().
I could only reboot in rescue mode; if I ran systemctl status zfs.target, I got the kernel panic again.
Because the data on ZFS_BACKUP is not important, I removed the disk.
With that, I can restart the server with ZFS started and without a kernel panic.
I added a new disk; I see it as /dev/sdf with the correct size.
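To double-check the new disk before using it, something like this shows its size, serial and any leftover signatures (both commands are read-only as written):

# show the new disk with model and serial to be sure it is the right one
lsblk -o NAME,SIZE,MODEL,SERIAL /dev/sdf
# list (without erasing) any leftover filesystem or RAID signatures on it
wipefs /dev/sdf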

But,
in the CLI:
zpool destroy ZFS_BACKUP
cannot open 'ZFS_BACKUP': no such pool

But in the web GUI:
I still see the old disk:

Device     Type       Usage       Size     GPT   Model            Serial
/dev/sdb   Unknown    partitions  2.00 TB  yes   LOGICAL_VOLUME   60...13
/dev/sdb1  partition  ZFS         2.00 TB  yes
/dev/sdb9  partition  ZFS         2.00 TB  yes
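If I understand correctly, zpool destroy only works on an imported pool, so it cannot remove these leftovers. If those partitions were on a real, reachable disk, something like this should clear the old ZFS labels, but only if it is definitely the right device:

# remove the ZFS label from the old data partition (destructive, check the device twice!)
zpool labelclear -f /dev/sdb1
# or wipe all filesystem/partition signatures from the whole old disk
wipefs -a /dev/sdb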

So my questions:

How do I delete the information about this ghost disk shown in the GUI?
How do I move ZFS_BACKUP from the old /dev/sdb to the new disk /dev/sdf?
On PROXMXLDEV1, ZFS_BACKUP holds the backup storage PBS-DEV_BCK (datastore PBS_EXT4_BCK); once the other steps are done, how do I format it so I can start backing up VMs again?

Thanks for everything.
 
Thanks, I had a look at this link.
I found the first problem: the ghost drive (I checked with the serial number).
If I understand correctly, the web GUI for the host shows the partitions as fdisk sees them.
So I checked with gparted and saw the ghost drive... then I searched for this on Google and found an HP bug.
If you have a P822 with an external SAS connection to a disk enclosure, an old disk can still be seen by the OS...
So I compared the information from iLO, from ssacli and from gparted.
And with this information I found the solution to delete the ghost disk...
So the ghost drive is not a Proxmox or OS problem, but a bug in the firmware of the SAS controller...
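The comparison was along these lines (the controller slot number depends on the machine):

# physical drives as the Smart Array controller sees them
ssacli ctrl all show config detail
ssacli ctrl slot=0 pd all show status
# block devices as the OS sees them, with model and serial to match against iLO
lsblk -o NAME,SIZE,MODEL,SERIAL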

Now I am looking at how to delete the old zpool and recreate it; after that I must recreate the partition for the PBS.
 
OK, I found in the documentation how to destroy a zpool and create a new one with the same name...
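In my case the old pool cannot even be imported, so I think only the create part applies; something like this on the new disk (ashift=12 assumed for 4K sectors, keeping the same name so the existing DC_ZFS_BCK storage entry still matches):

# create the new pool on the new disk, with the same name and mountpoint as before
# (ideally use the /dev/disk/by-id/... path so the device name survives a reorder)
zpool create -o ashift=12 -m /ZFS_BACKUP ZFS_BACKUP /dev/sdf
# check that it is online and mounted where storage.cfg expects it
zpool status ZFS_BACKUP
zfs list ZFS_BACKUP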
Now I must find out how to:
start PBS without the VM disk that was on the old zpool
delete the PBS datastore in the config and create a new datastore on the new zpool... (rough plan sketched below)
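My rough plan, if I understand the PBS docs correctly (the VM id 200, the scsi1 slot and the mount path are examples, not the real values):

# on the PVE host: detach the dead 2 TB disk from the PBS VM so it can boot again
# (check the real disk slot first with: qm config <vmid>)
qm set 200 --delete scsi1
# then give it a new 2 TB disk on the recreated pool
qm set 200 --scsi1 DC_ZFS_BCK:2048

# inside the PBS VM: format and mount the new disk (assuming it shows up as /dev/sdb again)
mkfs.ext4 /dev/sdb
mkdir -p /mnt/datastore/PBS_EXT4_BCK
mount /dev/sdb /mnt/datastore/PBS_EXT4_BCK   # and add a matching /etc/fstab entry
# drop the old datastore definition and recreate it on the new mount
proxmox-backup-manager datastore remove PBS_EXT4_BCK
proxmox-backup-manager datastore create PBS_EXT4_BCK /mnt/datastore/PBS_EXT4_BCK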