What if root filesystem became readonly

ZKallo · Jun 15, 2024

We have four nodes. On one of them the root filesystem became read-only, and several virtual machines are still running on it. We have enough resources to move them to another node, but what can we do that will actually work? Many things cannot be done (e.g. we cannot migrate or make a backup; we haven't tried to stop or shut down the VMs).

bbgeek17 · Jun 15, 2024

You can try to re-mount as rw, perhaps at least temporarily.
If you are using shared storage for the VMs, shut down the offending host (hard shutdown via IPMI or the physical button), and HA will transfer the VMs (if you had the VMs HA'ed).

Good luck

Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox

ZKallo · Jun 15, 2024

bbgeek17 said:
You can try to re-mount as rw, perhaps at least temporarily.
If you are using shared storage for the VMs, shut down the offending host (hard shutdown via IPMI or the physical button), and HA will transfer the VMs (if you had the VMs HA'ed).

Thx for your answer,
- I ran fsck in read-only mode on the root filesystem and it reported lots of errors.
- We haven't implemented HA.

bbgeek17 · Jun 15, 2024

ZKallo said:
- I ran fsck in read-only mode on the root filesystem and it reported lots of errors.

You don't need to fix the filesystem, yet. You just need it RW for the duration of the live migration attempt. You have not said what type of storage is backing the data of these VMs, so I don't know if it's even possible to move the VMs at all.

ZKallo said:
- We haven't implemented HA.

I guess, you could turn off the offending node and offline migrate the VMs. Make sure you get the config backups in advance:
https://forum.proxmox.com/threads/move-vm-from-a-dead-node-to-a-second-node.139095/

Please understand that any suggestion here is not a guaranteed way to recover. Only you have access to the full view and understanding of your environment. If these are important services, you should have subscription/support/backups in place.

Good luck

Kingneutron · Jun 15, 2024

You'll need to replace the disk; it's dying. The entry in /etc/fstab should have "errors=remount-ro".

Hope you have recent backups to restore from.

Go with an enterprise-class SSD to replace it if this is business/production.
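
For anyone who wants to check that fstab entry without reading the file by eye, here is a minimal Python sketch. It only parses text in the standard six-column fstab layout; the function name and the sample entry are made up for illustration and are not taken from the affected node.

```python
# Hypothetical sample in the usual fstab layout; a real /etc/fstab may differ.
SAMPLE_FSTAB = """\
# <file system> <mount point> <type> <options> <dump> <pass>
/dev/pve/root / ext4 errors=remount-ro 0 1
/dev/pve/swap none swap sw 0 0
"""


def root_error_policy(fstab_text):
    """Return the value of errors=... on the "/" entry, or None if it is not set."""
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # An fstab entry needs at least device, mount point, type, options.
        if len(fields) < 4 or fields[1] != "/":
            continue
        for option in fields[3].split(","):
            if option.startswith("errors="):
                return option.split("=", 1)[1]
        return None  # root entry found, but no errors= option on it
    return None  # no root entry at all


print(root_error_policy(SAMPLE_FSTAB))
```

To use it on a real system, pass it the contents of /etc/fstab instead of the sample string. Reading the file works fine on a read-only root.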

alexskysilk · Jun 15, 2024

Lots of advice, but no one asked the obvious.

What do you see in dmesg to explain the fault? Obviously it's only available before rebooting the node, since your logs aren't being written to your read-only filesystem.
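
Alongside dmesg, it can help to confirm which mounts the kernel currently reports as read-only. The following is a rough Python sketch that parses text in the /proc/mounts format; the function name and the sample lines are illustrative assumptions, not output from the node in this thread.

```python
# Hypothetical sample in /proc/mounts format (device, mount point, type, options, ...).
SAMPLE_MOUNTS = """\
/dev/mapper/pve-root / ext4 ro,relatime,errors=remount-ro 0 0
tmpfs /run tmpfs rw,nosuid,nodev,noexec,relatime 0 0
"""


def read_only_mounts(mounts_text):
    """Return the mount points whose option list contains the "ro" flag."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete mount entry
        # Options are comma-separated; match "ro" exactly, not as a substring.
        if "ro" in fields[3].split(","):
            result.append(fields[1])
    return result


print(read_only_mounts(SAMPLE_MOUNTS))
```

On a live system you would feed it the contents of /proc/mounts. Some mounts are read-only by design, so a hit is only interesting for filesystems that are normally writable, such as the root.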

ZKallo · Jun 15, 2024

bbgeek17 said:
You don't need to fix the filesystem, yet. You just need it RW for the duration of the live migration attempt. You have not said what type of storage is backing the data of these VMs, so I don't know if it's even possible to move the VMs at all.

I guess, you could turn off the offending node and offline migrate the VMs. Make sure you get the config backups in advance:
https://forum.proxmox.com/threads/move-vm-from-a-dead-node-to-a-second-node.139095/

Please understand that any suggestion here is not a guaranteed way to recover. Only you have access to the full view and understanding of your environment. If these are important services, you should have subscription/support/backups in place.

Thanks for your answer. It gave me hope. I'll try these suggestions soon.

ZKallo · Jun 17, 2024

Kingneutron said:
You'll need to replace the disk; it's dying. The entry in /etc/fstab should have "errors=remount-ro".

Hope you have recent backups to restore from.

Go with an enterprise-class SSD to replace it if this is business/production.

It is the 32 GB SSD the machine came with from the factory. There is no backup of it. We will look at what's wrong with it once we have moved all of the VMs off that node.

ZKallo · Jun 17, 2024

bbgeek17 said:
You don't need to fix the filesystem, yet. You just need it RW for the duration of the live migration attempt. You have not said what type of storage is backing the data of these VMs, so I don't know if it's even possible to move the VMs at all.

I guess, you could turn off the offending node and offline migrate the VMs. Make sure you get the config backups in advance:
https://forum.proxmox.com/threads/move-vm-from-a-dead-node-to-a-second-node.139095/

Please understand that any suggestion here is not a guaranteed way to recover. Only you have access to the full view and understanding of your environment. If these are important services, you should have subscription/support/backups in place.

What worked: stopping the VMs, renaming the conf files and copying them to a network share, copying them to the new node, and renaming them back.
What didn't work: remounting pve-root as rw (it said: "write protected").
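
For anyone who lands here later, the "copy the conf files to a share" step could be sketched roughly like this in Python. The function name is made up, and the paths in the usage note are assumptions: /etc/pve/qemu-server is where Proxmox normally keeps VM configs, and the destination is a hypothetical network share mount. This only copies the files; it does not do the renaming or the move to the new node.

```python
import shutil
from pathlib import Path


def backup_vm_configs(conf_dir, dest_dir):
    """Copy every *.conf file from conf_dir into dest_dir and return the copied names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for conf in sorted(Path(conf_dir).glob("*.conf")):
        # copy2 keeps timestamps, which helps when comparing copies later.
        shutil.copy2(conf, dest / conf.name)
        copied.append(conf.name)
    return copied
```

A call might look like backup_vm_configs("/etc/pve/qemu-server", "/mnt/share/pve-conf-backup"), with both paths adjusted to your own setup. The destination has to be on storage that is still writable, since the root filesystem is not.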

leesteken · Jun 17, 2024

ZKallo said:
It is a 32GB SSD as the machine came from the factory.

Sounds like SD-card- or eMMC-like storage that has been worn out (by Proxmox, which is known to do that) to the point where it became read-only.

bbgeek17 · Jun 17, 2024

ZKallo said:
What worked: stopping the VMs, renaming the conf files and copying them to a network share, copying them to the new node, and renaming them back.

Glad to hear you are up and running.

ZKallo said:
What didn't work: remounting pve-root as rw (it said: "write protected").

That would have been a very short-term workaround to get to the same state as you did via rename. As @leesteken said, you need a new disk of a proper kind.

Kingneutron · Jun 17, 2024

...and it goes without saying that you should be setting up regular backups.
