Overflowing LVM = data loss

Dragsting

New Member
Jun 21, 2023
My VM filled its disk during operation, and now it cannot start because of an I/O error.
I can't access the disk because it shows 150 GB occupied out of 140 GB available.
When I attach the disk to any other Proxmox VM, an I/O error occurs and the whole VM crashes.
I made a disk image using TestDisk and now have an image.dd file. When I try to open it under Windows with OSFMount, it shows an error.

So do I have to accept that this is "normal" and say goodbye to my data?
Why didn't Proxmox stop the VM before it overflowed? Why isn't there a simple rule that if it wants to download 10 GB and only has 5 GB of free space, it should NOT download that?
The only way you could have "overflowed" your LVM is if it was thinly provisioned and oversubscribed. From the VM's perspective there was still free space, so it kept writing and updating blocks and metadata until it could no longer handle the errors and crashed. There is about a 50/50 chance of full recovery.

This will happen with any system in such a situation. It's your job to monitor your space and, if you are overprovisioned, to back-fill capacity as needed.
LVM provides options to auto-extend thin pools at thresholds. It should have given you warnings when you overprovisioned your logical volumes.
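
To make "oversubscribed" concrete, here is a minimal Python sketch of the check. The function names are hypothetical, and the 140/150 figures are only borrowed from the original post on the assumption that they describe a 150 GB virtual disk sitting on a 140 GB thin pool, which the thread does not confirm. For the auto-extend part, the relevant lvm.conf settings are thin_pool_autoextend_threshold and thin_pool_autoextend_percent, and they only help while the volume group still has free extents.

```python
def overcommit_ratio(pool_size_gb, volume_sizes_gb):
    """Sum of the virtual sizes of all thin volumes divided by the pool's physical size."""
    if pool_size_gb <= 0:
        raise ValueError("pool size must be positive")
    return sum(volume_sizes_gb) / pool_size_gb


def is_oversubscribed(pool_size_gb, volume_sizes_gb):
    """True when the volumes could, if all filled up, need more space than the pool has."""
    return overcommit_ratio(pool_size_gb, volume_sizes_gb) > 1.0


if __name__ == "__main__":
    # Assumed numbers: one 150 GB thin volume on a 140 GB pool.
    print(is_oversubscribed(140, [150]))
    # Two volumes that fit exactly are not oversubscribed.
    print(is_oversubscribed(140, [100, 40]))
```

A ratio above 1.0 is not an error in itself, it just means the pool can run out of space while every guest still believes it has room.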

Why didn't Proxmox stop the VM before it was overflowing
Proxmox relies on your underlying storage to handle "storage things", in this case LVM. There are many options out there for monitoring and alerting for LVM, e.g. https://forum.proxmox.com/threads/n...monitor-lvm-thin-space-with-warn-mail.109757/
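
As a rough idea of what such monitoring can look like, here is a small Python sketch. It is not the script from the linked thread. It assumes you feed it the text output of `lvs --noheadings --separator , -o lv_name,data_percent,metadata_percent`; the sample text, the function names, and the 80% threshold are all made up for illustration.

```python
def parse_lvs(text):
    """Parse 'name,data_percent,metadata_percent' lines into (name, data, meta) tuples.

    Volumes that report no data_percent (plain, non-thin LVs) are skipped.
    A missing metadata_percent (thin volumes, as opposed to pools) is treated as 0.0.
    """
    rows = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3 or not parts[1]:
            continue
        name, data, meta = parts
        rows.append((name, float(data), float(meta) if meta else 0.0))
    return rows


def find_alerts(rows, threshold=80.0):
    """Return names of volumes whose data or metadata usage is at or above threshold."""
    return [name for name, data, meta in rows if max(data, meta) >= threshold]


# Made-up sample output: one thin pool, one plain LV, one thin volume.
SAMPLE = "  data,97.53,4.21\n  root,,\n  vm-100-disk-0,45.10,\n"

if __name__ == "__main__":
    for name in find_alerts(parse_lvs(SAMPLE)):
        print(f"WARNING: {name} is above the usage threshold")
```

In practice you would run this from cron or a systemd timer and send mail instead of printing. Watching metadata usage matters as much as data usage, since a thin pool can fail on either.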



 
Why didn't Proxmox stop the VM before it was overflowing, why isn't there a simple instruction that if it downloads 10GB and only has 5GB of free space, it should NOT download that?
That's the risk of overprovisioning in any system. I just witnessed this at a company with half a billion dollars in revenue ... it happens everywhere. The takeaway is: don't overprovision. Even good monitoring will not help if you can write at more than 1 GB/s...
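
A quick back-of-the-envelope version of that last point, as a Python sketch. The 5 GB of free space and the 60 second polling interval are assumed example values, and the function names are hypothetical.

```python
def seconds_until_full(free_gb, write_rate_gb_per_s):
    """Time until the remaining free space is consumed at a constant write rate."""
    if write_rate_gb_per_s <= 0:
        raise ValueError("write rate must be positive")
    return free_gb / write_rate_gb_per_s


def polling_can_react(free_gb, write_rate_gb_per_s, check_interval_s):
    """True only if a check is guaranteed to run before the space runs out."""
    return seconds_until_full(free_gb, write_rate_gb_per_s) > check_interval_s


if __name__ == "__main__":
    # Assumed example: 5 GB free, 1 GB/s sustained writes, one check per minute.
    print(seconds_until_full(5, 1.0))
    print(polling_can_react(5, 1.0, 60))
```

With those numbers the pool is full in 5 seconds, long before a once-a-minute check fires, which is why alert thresholds need to leave real headroom.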
 
More than 100 GB of data and about 130 Docker containers are dead.
From now on, I will treat Proxmox as an elevated-risk tool.
 
This is not Proxmox's fault, but yours. Any serious tool allows you to destroy your environment, if you intend to do that.
 