LVM-Thin broken metadata precautions

BelCloud

Dec 15, 2015
www.belcloud.net
Hello

Recently, the RAID card (PERC H730P) on one of our nodes failed. After replacing the card and importing the RAID array, we noticed that the LVM thin metadata had been corrupted.
Code:
Check of pool pve/data failed (status:1). Manual repair required!
We tried to repair it with thin_check/thin_repair, but they do not seem able to restore it. Our analysis shows that the mappings got scrambled somehow. While we are still working on it, time is passing, and we have already restored the affected VMs from backups. I am now looking for suggestions on how to prevent such issues from happening again.

One idea would be to make regular backups of the LVM thin metadata. I am not sure whether they would be of any use if new data gets written after the metadata backup was created.
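
A minimal sketch of what such a metadata backup could look like, assuming the thin-provisioning-tools (thin_dump) and dmsetup are installed and that the pool pve/data is exposed through the device-mapper names pve-data-tpool and pve-data_tmeta (worth confirming with `dmsetup ls` first). The function names and the output path are made up for illustration. The kernel's thin target allows a metadata snapshot to be reserved on a live pool, dumped, and then released. Such a dump only reflects the mappings at the moment of the snapshot, so blocks allocated afterwards would not appear in it, and an older dump would probably help only partially.

```python
import subprocess
from typing import Callable, List


def dm_name(vg: str, lv: str) -> str:
    """Build the device-mapper name for vg/lv (hyphens inside names are doubled)."""
    return f"{vg.replace('-', '--')}-{lv.replace('-', '--')}"


def build_dump_commands(vg: str, pool: str, out_file: str) -> List[List[str]]:
    """Return the reserve / dump / release commands for a live thin pool.

    Assumption: the pool is active and visible as <vg>-<pool>-tpool, with its
    metadata device at /dev/mapper/<vg>-<pool>_tmeta.
    """
    base = dm_name(vg, pool)
    tpool = f"{base}-tpool"
    tmeta = f"/dev/mapper/{base}_tmeta"
    return [
        # Ask the kernel to keep a consistent, read-only metadata snapshot.
        ["dmsetup", "message", tpool, "0", "reserve_metadata_snap"],
        # Dump the mappings from that snapshot as XML.
        ["thin_dump", "--metadata-snap", "-o", out_file, tmeta],
        # Release the snapshot so the pool can reuse the metadata blocks.
        ["dmsetup", "message", tpool, "0", "release_metadata_snap"],
    ]


def dump_thin_metadata(
    vg: str,
    pool: str,
    out_file: str,
    run: Callable[..., object] = subprocess.run,
) -> None:
    """Run the three commands; the snapshot is released even if the dump fails."""
    reserve, dump, release = build_dump_commands(vg, pool, out_file)
    run(reserve, check=True)
    try:
        run(dump, check=True)
    finally:
        run(release, check=True)
```

Calling `dump_thin_metadata("pve", "data", "/root/pve-data-tmeta.xml")` as root, for example from a cron job, would be one way to use it; the XML should be stored on a different disk than the pool itself.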

Another idea would be to disable the controller's write cache, if the issue can be traced to it. It was set to write-back, and the controller had a battery.

Does anyone have any thoughts on how to prevent the LVM thin metadata from getting corrupted in such a scenario?
 
