[SOLVED] VMs cloned or restored from backups have corrupted filesystems

Apr 22, 2024
Hi,

I am having a problem with a Proxmox 8.1 server that I'd really appreciate some help with.

Whenever I clone a VM or restore one from a backup, the resulting guest's filesystem is corrupted: boot errors, LVM errors, and other obvious signs of corruption.

Host storage:
- RAID: RAID 5 (hardware)
- Drives: 4x INTEL SSDSC2KB960G8 (960GB)

VMs are stored on an LVM thin pool.

I currently have only two VMs set up on the machine, and I have tried both cloning and restoring each of them:
1) Guest 1 / Debian 12 - LVM / ext4 partitioning and filesystem for guest OS
2) Guest 2 / CentOS 9 - XFS / no LVM

The issue is the same when restoring or cloning either of these VMs: either filesystem errors or boot errors related to the filesystem.

  • This is a newly deployed server
  • There is no indication at all of any hardware failure on the host
  • The VMs I am cloning/restoring work fine - only the copies have the issue
  • The server has been burnt in with memtest for days, and I'm now starting to run more tests
  • I have tested restoring backups and cloning about 10 times now with the same results
  • The issue happens with both "stop" and "snapshot" mode backups

qm Config
agent: 1
boot: order=scsi0;ide2;net0
cores: 4
cpu: x86-64-v2-AES
ide2: local-lvm:iso/CentOS-Stream-9-latest-x86_64-dvd1.iso,media=cdrom,size=10021824K
memory: 2048
meta: creation-qemu=8.1.5,ctime=1713732599
name: Centos9Base
net0: virtio=02:00:00:c8:1e:d6,bridge=vmbr0,firewall=1,link_down=1
numa: 0
ostype: l26
scsi0: lvm_thin_local:vm-1006-disk-0,cache=writethrough,discard=on,iothread=1,size=250G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=20aef623-18b6-48c7-abbe-9f2ec476913c
sockets: 1
vmgenid: 544953e7-66a6-47c4-98c1-dff113d92b55

Any ideas are greatly appreciated. Thanks.
 
Restore to another host to work out whether it's a backup problem or a restore problem.
Try with Proxmox Backup Server, which has a Verify backup function.
 
Hi _gabriel,

Thanks for the reply.

I've attempted the following:
- Restored a backup made on the problem server to a working server: NOT corrupted.
- Restored a known working backup from a working server to the problem server: IS corrupted.

So it appears the corruption happens on the problem server at the point where a backup is restored or a VM is cloned, not when the backup is made.
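
One way to narrow this down further would be to compare a stopped source disk with its full clone byte for byte. The following is a minimal sketch, not a tested procedure for this server: the helper name compare_images is hypothetical, and it assumes a full clone of a stopped VM should produce an identical image.

```shell
#!/bin/sh
# compare_images SRC DST   (hypothetical helper, not part of Proxmox)
# Prints "identical" and returns 0 when SRC and DST contain the same bytes.
# Otherwise prints cmp's first message (first differing byte, or early EOF)
# and returns 1.
compare_images() {
    src=$1
    dst=$2
    if cmp -s -- "$src" "$dst"; then
        echo "identical"
        return 0
    fi
    # cmp reports something like "SRC DST differ: byte 4097, line 3"
    cmp -- "$src" "$dst" 2>&1 | head -n 1
    return 1
}
```

With both guests stopped and both volumes active, it could be called as compare_images /dev/pve/vm-1006-disk-0 /dev/pve/vm-9999-disk-0. The first path is assumed from the VG name and the scsi0 line in the config above; the second is a placeholder for the clone's disk. A low first differing offset would point at the copy going wrong early in the write path rather than at isolated bad blocks.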

Unfortunately I do not have access to a Proxmox Backup Server.


Any ideas? The server has never been shut down unsafely.

The server uses a MegaRAID9560-8i RAID controller managing four 960GB Intel SSDs in a RAID5 configuration. The resulting device has one partition for BIOS boot, one for the EFI System, and a main 2.6TiB Linux LVM partition.

The LVM setup consists of a volume group named 'pve' with five logical volumes for Proxmox and VM storage, including a 60GB volume for temporary storage or backups, a 2.48TiB thin pool for sparse VM disk storage, a 20GB root partition for the OS, and 4GB of swap space.

Hardware:
- RAID: MegaRAID9560-8i 4GB
- Drives: 4x INTEL SSDSC2KB960G8 (960GB)

Configuration:
- RAID5 - 3 data, 1 parity

The RAID device is /dev/sda

Partitions on /dev/sda
Device        Start        End    Sectors  Size Type
/dev/sda1        34       2047       2014 1007K BIOS boot
/dev/sda2      2048    2099199    2097152    1G EFI System
/dev/sda3   2099200 5622988766 5620889567  2.6T Linux LVM

All storage except boot and EFI is in a single partition (sda3), using Linux LVM.

sda3 contains a single LVM volume group (VG):
# vgdisplay
  --- Volume group ---
  VG Name               pve
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  51
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                5
  Open LV               3
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               <2.62 TiB
  PE Size               4.00 MiB
  Total PE              686143
  Alloc PE / Size       672850 / <2.57 TiB
  Free  PE / Size       13293 / <51.93 GiB
  VG UUID               m9W22T-ULSY-GeMQ-msDY-pkCv-Nbxj-Nm27P2

# vgs
VG  #PV #LV #SN Attr   VSize  VFree
pve   1   5   0 wz--n- <2.62t <51.93g
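
As a sanity check that the partition table and the volume group agree, the sizes can be recomputed from the raw counts. This is only a sketch: the helper name to_tib is hypothetical, and the 512-byte sector size is inferred from sda2 (2097152 sectors shown as 1G) rather than stated by fdisk.

```shell
#!/bin/sh
# to_tib COUNT UNIT_BYTES   (hypothetical helper)
# Converts COUNT units of UNIT_BYTES bytes each to TiB, rounded to 2 decimals.
to_tib() {
    awk -v n="$1" -v u="$2" 'BEGIN { printf "%.2f\n", n * u / (1024 ^ 4) }'
}

to_tib 5620889567 512      # sda3: sector count x assumed 512-byte sectors
to_tib 686143 4194304      # VG: Total PE x 4 MiB extent size
```

Both lines print 2.62, which is consistent with the 2.6T partition and the "<2.62 TiB" VG size reported above, so the LVM layout at least matches the size of the RAID device it sits on.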


That volume group contains these logical volumes (LVs):
# lvs
LV             VG  Attr       LSize   Pool           Origin Data%  Meta%
data           pve -wi-ao----  59.99g
lvm_thin_local pve twi-aot---   2.48t                       0.34   10.55
root           pve -wi-ao----  20.00g
swap           pve -wi-ao----   4.00g
vm-1001-disk-0 pve Vwi-a-t--- 300.00g lvm_thin_local        2.84

Thanks again
 
