[SOLVED] Cluster Recovery from Locked OS Drive

jebbam

Sep 8, 2019
Hi,

I have three similar Proxmox clusters, ~10 nodes running Ceph with encrypted root partitions. I enable remote SSH for unlocking the encrypted root drives when they boot. This has worked swell for years. This morning, one of the nodes had rebooted and was waiting for the password to decrypt. I'm not sure what caused the reboot, but now it won't accept the disk password. Note that I'm doing this via SSH, so there aren't the weird keyboard mappings you would get using Java IPMI. There aren't any disk errors and the partitions look OK. It just won't accept the password. Yikes.

The Ceph cluster recovered fine.

The problem is I can't start any of the virtual machines that were on the non-booting host. I did have *some* KVMs using HA, but not all of them. There are around 30 KVMs on the locked node. How do I get these KVMs booted on another node?

Thanks for any pointers,

-Jeff
 
For the KVMs on the down node, I was able to get them working. You can just go into "HA" (high availability) and add the KVMs from the down node, and it migrates them to another node. HA doesn't need to have been set up while the node was still functioning. So I'm able to recover the KVMs, thankfully!
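With ~30 KVMs, adding each one through the GUI gets tedious, so here is a rough sketch of doing the same thing from the command line. It is not something I ran in this exact form. It assumes you run it as root on a surviving node that still has quorum, that the VM disks live on shared storage (Ceph here), and that `ha-manager add vm:<id> --state started` behaves like adding the resource in the "HA" panel. The VM IDs and the helper names (`ha_add_commands`, `run`) are made up for illustration. It defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess


def ha_add_commands(vmids, state="started"):
    """Build one `ha-manager add` command per VM ID (hypothetical helper)."""
    return [
        ["ha-manager", "add", f"vm:{vmid}", "--state", state]
        for vmid in vmids
    ]


def run(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            # Stops at the first command that fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical VM IDs; replace with the IDs that were on the down node.
    run(ha_add_commands([101, 102, 103]))
```

Review the printed commands first, then call `run(..., dry_run=False)` once they look right.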

Later: I wasn't able to get the password working.
 