Boot Drive Failing?

sackerman

New Member
Aug 19, 2020
I am having a difficult time sorting out the disk configuration on my Proxmox host. I believe I have two mirrored SSDs serving as boot disks, but I am not positive:

2021-09-21_8-38-38.png

So /dev/sdb is the culprit and this is the SMART report:
2021-09-21_8-40-02.png

Here is the applicable result from 'fdisk -l':
Disk /dev/sda: 447.1 GiB, 480103981056 bytes, 937703088 sectors
Disk model: INTEL SSDSC2BF48
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: E0FD8F26-3F3F-4E90-B529-7C34D21E4D34

Device Start End Sectors Size Type
/dev/sda1 34 2047 2014 1007K BIOS boot
/dev/sda2 2048 1050623 1048576 512M EFI System
/dev/sda3 1050624 937703054 936652431 446.6G Solaris /usr & Apple ZFS


Disk /dev/sdb: 447.1 GiB, 480103981056 bytes, 937703088 sectors
Disk model: INTEL SSDSC2BF48
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 45F342E2-F09C-438D-8FD5-E6242EC20AED

Device Start End Sectors Size Type
/dev/sdb1 34 2047 2014 1007K BIOS boot
/dev/sdb2 2048 1050623 1048576 512M EFI System
/dev/sdb3 1050624 937703054 936652431 446.6G Solaris /usr & Apple ZFS

Proxmox does not appear to be using mdadm. I'm guessing that the 'primary' disk is /dev/sdb, since Zabbix is warning that 'Disk read/write responses are too high', and that all data is then mirrored to /dev/sda.

In short I have a replacement drive, but since this is not a simple pool I am not clear on the procedure for replacing that drive.
 
Try this first
changing a failed bootable device

You have external backups, right?
Probably the easiest way is to reinstall a fresh Proxmox OS and then import your zpool into the new OS:
zpool import poolname

Keep in mind that Zabbix is write-intensive monitoring software and should run on a standalone machine. Use a separate node outside the main cluster, otherwise your SSDs will get pounded by reads and writes.

Hope this helped






 
You basically wrote that drive to death: 1328 TB written in 5 years, so roughly 718 GB of writes per day. You really should get enterprise SSDs built for write-heavy or at least mixed workloads. And you are using ZFS, not mdadm, so mikeinnyc's link is the way to replace that failed drive.
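
As a rough check of that figure, here is a sketch that assumes decimal units (1 TB = 1000 GB) and exactly five 365-day years. The function name is only illustrative. The 718 GB figure presumably comes from the drive's actual power-on time rather than a round five years, so a small difference is expected.

```shell
# Average daily writes in GB, given total TB written and days in service.
# Assumes 1 TB = 1000 GB; shell integer division rounds down.
daily_writes_gb() {
    total_tb=$1
    days=$2
    echo $(( total_tb * 1000 / days ))
}

daily_writes_gb 1328 $(( 5 * 365 ))   # prints 727
```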

And yes, Zabbix writes a lot. I moved my Zabbix server VM from ZFS to LVM-thin because ZFS's write amplification is really terrible. That saved me 200 GB of writes per day.

You really should learn a bit about managing and monitoring ZFS. ZFS is not something you can just use without understanding it. There are a lot of things you need to fine-tune to fit your workload.
 
I understand your replies, thank you. Zabbix is running in a VM on Proxmox; however, the Zabbix VM disk is a ZFS over iSCSI connection to a TrueNAS box, so I wouldn't think that Zabbix reads and writes would affect the Proxmox boot drive. In either case, what's the point of having mirrored boot drives if you can't swap one and instead have to reinstall the OS? I'm not disagreeing, just not understanding.
 
You don't need to reinstall PVE. But it's not just a matter of swapping the drive, because you boot from it. You need to manually partition the new drive first so its layout matches the healthy one, copy the bootloader from the healthy disk to the new disk, and so on. That is all explained here.
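
A rough sketch of that sequence, based on the commands in the Proxmox VE admin guide section on changing a failed bootable device. The function name and every argument are placeholders, and the function only prints the commands so they can be reviewed before anything is run. It assumes the partition layout shown in this thread (partition 2 is the ESP, partition 3 is the ZFS member) and a system that boots via `proxmox-boot-tool`; a GRUB-only install would use `grub-install` on the new disk instead.

```shell
# Print (not run) the commands for swapping one disk of a bootable ZFS mirror.
# $1 = pool, $2 = healthy disk, $3 = new disk,
# $4 = old (failed) ZFS member as named in `zpool status`.
# All values are placeholders; partition numbers 2 and 3 are assumptions.
print_replace_plan() {
    pool=$1; healthy=$2; new=$3; old_member=$4
    echo "sgdisk $healthy -R $new"                      # copy partition table to the new disk
    echo "sgdisk -G $new"                               # randomize GUIDs on the new disk
    echo "zpool replace -f $pool $old_member ${new}3"   # resilver onto the new ZFS partition
    echo "proxmox-boot-tool format ${new}2"             # format the new ESP
    echo "proxmox-boot-tool init ${new}2"               # set up the bootloader on it
}

print_replace_plan rpool /dev/sda /dev/sdb OLD_MEMBER_NAME
```

Note that the first argument to `sgdisk -R` in this form is the source (healthy) disk and the disk after `-R` is the target, so mixing them up would overwrite the good disk's partition table.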
 
Thanks, that makes sense.
 
Ran into a bit of a snag.
Powered down the server
Replaced the defective drive
Powered the server back up
Ran 'sudo sgdisk /dev/sda -R /dev/sdb'
Ran 'sudo sgdisk -G /dev/sdb'
Ran 'sudo zpool replace -f rpool /dev/sda3 /dev/sdb3' and received 'cannot replace /dev/sda3 with /dev/sdb3: no such device in pool'

fdisk -l
Disk /dev/sdb: 447.1 GiB, 480103981056 bytes, 937703088 sectors
Disk model: Samsung SSD 883
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 12F49409-4CB8-4985-A869-5A7F102AF916

Device Start End Sectors Size Type
/dev/sdb1 34 2047 2014 1007K BIOS boot
/dev/sdb2 2048 1050623 1048576 512M EFI System
/dev/sdb3 1050624 937703054 936652431 446.6G Solaris /usr & Apple ZFS

Partition 1 does not start on physical sector boundary.


Disk /dev/sda: 447.1 GiB, 480103981056 bytes, 937703088 sectors
Disk model: INTEL SSDSC2BF48
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: E0FD8F26-3F3F-4E90-B529-7C34D21E4D34

Device Start End Sectors Size Type
/dev/sda1 34 2047 2014 1007K BIOS boot
/dev/sda2 2048 1050623 1048576 512M EFI System
/dev/sda3 1050624 937703054 936652431 446.6G Solaris /usr & Apple ZFS
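
The alignment warning for the Samsung disk follows from its 4096-byte physical sectors: a partition is aligned only if its start sector times 512 is a multiple of 4096, that is, if the start sector is divisible by 8. A quick sketch of that check (the function name is illustrative):

```shell
# Succeeds if a start sector (counted in 512-byte logical sectors)
# falls on a 4096-byte physical sector boundary.
is_aligned_4k() {
    [ $(( $1 * 512 % 4096 )) -eq 0 ]
}

for start in 34 2048 1050624; do
    if is_aligned_4k "$start"; then
        echo "$start aligned"
    else
        echo "$start not aligned"
    fi
done
```

Only sector 34 fails the check, which matches fdisk flagging partition 1 alone. That is consistent with the table having been copied from a disk with 512-byte physical sectors.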



zpool list
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
rpool 444G 267G 177G - - 45% 60% 1.00x DEGRADED -



What am I missing here?
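
One thing worth checking, as a hedged sketch: `zpool replace` takes the old (missing) pool member as its first device argument, while /dev/sda3 is the healthy disk here, and the pool may also know it under a different name (for example a /dev/disk/by-id path). The helper below is hypothetical. It only filters `zpool status` output for members that are not online, so that the name or GUID it prints can be used as the old device.

```shell
# Print pool members whose state is UNAVAIL, FAULTED, REMOVED or OFFLINE,
# reading `zpool status` output on stdin.
# Assumes the usual column layout: NAME STATE READ WRITE CKSUM.
missing_members() {
    awk '$2 ~ /^(UNAVAIL|FAULTED|REMOVED|OFFLINE)$/ { print $1 }'
}

# Usage (pool name taken from the thread):
#   zpool status rpool | missing_members
#   zpool replace -f rpool <name printed above> /dev/sdb3
```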
 
