Grub error on reboot and pool status question

rufusz

Member
May 16, 2021
Hi,

Suddenly, an HP server running Proxmox fails to start: it drops to a grub rescue> prompt with errors about "no such device <id>" and "unknown filesystem". I booted it from a SystemRescue image with ZFS support, and zpool status shows this:

Code:
pool: rpool
state: ONLINE

config:
   NAME
       rpool  ONLINE
          mirror-0  ONLINE
             ata-WDC_.....607-part2 ONLINE
             spare-1
                 ata-WDC_......9C5-part2 ONLINE
                 ata-WDC_......53X-part2 ONLINE
         spares
              ata-WDC_.....53X-part2 ONLINE

errors: No known data errors

What are the entries under spare-1? Does this mean mirror-0 is broken? In an old screenshot, ata-WDC_....9C5-part2 was a direct member of mirror-0. Can this be a problem?

I've tried reinstalling grub and running update-grub2, but I get the error "failed to get canonical path of '/dev/spare-1'".

Checked these threads:
- https://forum.proxmox.com/threads/grub-error-on-reboot-device-not-found.38616
- https://forum.proxmox.com/threads/grub-rescue-error-checksum-verification-failed.52730


Thanks
 
It's a spare disk. I don't know why everything shows ONLINE, but I think what you are looking at is:
  • ata-WDC_.....53X-part2 was added to the pool as spare disk
  • ata-WDC_......9C5-part2 died on you
  • ata-WDC_.....53X-part2 took over as replacement
Did you get that output from a live stick? Did you chroot?
 
In the end, what worked was:
- detach ata-WDC_......9C5-part2 from rpool
- attach it back to rpool
- chroot
- reinstall grub on all sda/sdb/sdc drives
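
The steps listed above could be sketched roughly as the script below. It is only an illustration, not the exact commands that were run: the device names are placeholders, the pool's root filesystem is assumed to be mounted under /mnt in the live system, and the run helper is a hypothetical wrapper that only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. Commands are printed, not executed, unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

POOL="rpool"
MNT="/mnt"                                # assumption: altroot of the imported pool
KEEP_DEV="REPLACE_ME_healthy-part2"       # placeholder: a healthy mirror member
BAD_DEV="REPLACE_ME_detached-part2"       # placeholder: the device to detach and re-attach
BOOT_DISKS="/dev/sda /dev/sdb /dev/sdc"   # disks named in the post

recover() {
    # 1. Detach the device, then attach it back next to a healthy mirror member.
    run zpool detach "$POOL" "$BAD_DEV"
    run zpool attach "$POOL" "$KEEP_DEV" "$BAD_DEV"

    # 2. Make the live system's /dev, /proc and /sys visible inside the chroot.
    for fs in dev proc sys; do
        run mount --rbind "/$fs" "$MNT/$fs"
    done

    # 3. Reinstall GRUB on every disk, then regenerate its configuration.
    for disk in $BOOT_DISKS; do
        run chroot "$MNT" grub-install "$disk"
    done
    run chroot "$MNT" update-grub
}

recover
```

Running it as-is only prints the command list, which makes it easy to review the device names before anything touches the pool.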

In the end everything is healthy, so I do not know why 9C5 died or was moved to spare. All I know is that it happened after a brief power interruption, and the server doesn't have a UPS :(
 
I am not sure what you mean by the sdc drive. The spare goes back to the spares and waits there again (spares are usually shared across multiple pools) to replace a faulty device.

Unplugging ata-WDC_......9C5-part2, formatting it if necessary, plugging it back in, and then running zpool replace rpool ata-WDC_......9C5-part2 <New Disk> should do the trick.
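
As a rough sketch of that replace step: OLD_DEV and NEW_DEV below are placeholders for the real device names, and run is a hypothetical dry-run wrapper that only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. Commands are printed, not executed, unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

POOL="rpool"
OLD_DEV="REPLACE_ME_old-part2"   # placeholder: the device being replaced
NEW_DEV="REPLACE_ME_new-disk"    # placeholder: the wiped or new disk

replace_disk() {
    # Swap the old device for the new one; ZFS then resilvers onto it.
    run zpool replace "$POOL" "$OLD_DEV" "$NEW_DEV"
    # Show the pool layout and resilver progress.
    run zpool status "$POOL"
}

replace_disk
```

Re-running zpool status until the resilver has finished shows whether the spare has returned to the spares list.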

Also check that disk with smartctl to be sure it's not faulty.
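
A minimal sketch of such a check, assuming smartmontools is installed in the live system. /dev/sdX is a placeholder for the disk in question, and run is a hypothetical dry-run wrapper that only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. Commands are printed, not executed, unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

DISK="/dev/sdX"   # placeholder: the whole disk, not the -part2 partition

check_disk() {
    run smartctl -H "$DISK"            # overall health self-assessment
    run smartctl -A "$DISK"            # vendor attributes, e.g. reallocated sectors
    run smartctl -t short "$DISK"      # start a short self-test (runs in the background)
    run smartctl -l selftest "$DISK"   # self-test log; read it once the test has finished
}

check_disk
```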
 