[SOLVED] Having trouble with local-lvm after a reboot

DrgnFyre

New Member
May 26, 2023
So, a couple of things happened. I had two Ubuntu VMs going, and they were both encoding some video for me. This was apparently too much for my ProLiant DL380 G7, and both VMs ended up throwing "I/O error" messages.

I rebooted them both, and again I/O errors. I turned one off and the other seemed OK, but then later more I/O errors.

Rebooted the host. Man, I forgot how long these things take to boot.

Restarted the VMs, two at a time: no good. The media server was working OK, but only for a bit. I noticed both the host and the VMs had logged errors about the hard drives and had to run fsck; at one point between the reboots I had to run fsck on the host before it would continue booting.

So, the first VM was still behaving strangely after a reboot/fsck. I have two snapshots of it from earlier in the week, so I tried rolling back to the 'running' one while the VM was up. I don't remember exactly what happened, but I got an error saying it couldn't do that, so I rolled back to the powered-off snapshot instead.

I remember getting that horribly scary warning about the sum of the thin volume sizes exceeding the size of the thin pool and the free space in the volume group, but the rollback seemed to complete OK.

I'm afraid to turn on the second VM. I'd like to save it if possible, but I can start over if I have to.
The problem is that "VM Disks" shows vm-100-disk-0 as being 142 TB (!). This is a 3 TB drive, and I only ever provisioned about 1 TB at a time: roughly 2 TB for the main VM and 1 TB for the second.

How do I fix this? I saw in another thread (not quite the same situation) that I should run fsck against the 'damaged' drive, but I'm afraid that will break the data even more. Can I just remove both of the snapshots I took and make another one?

Code:
lvs
  LV                                             VG  Attr       LSize    Pool Origin                                         Data%  Meta%  Move Log Cpy%Sync Convert
  data                                           pve twi-aotz--   <2.58t                                                     96.52  8.37                           
  root                                           pve -wi-ao----   96.00g                                                                                           
  snap_vm-100-disk-0_Pre_saltbox_051923_shutdown pve Vri---tz-k <129.20t data                                                                                       
  snap_vm-100-disk-0_pre_saltbox_051923          pve Vri---tz-k <129.20t data                                                                                       
  snap_vm-101-disk-0_initial_install             pve Vri---tz-k 1000.00g data vm-101-disk-0                                                                         
  swap                                           pve -wi-ao----    8.00g                                                                                           
  vm-100-disk-0                                  pve Vwi-aotz-- <129.20t data snap_vm-100-disk-0_Pre_saltbox_051923_shutdown 0.93                                   
  vm-100-state-pre_saltbox_051923                pve Vwi-a-tz-- <188.49g data                                                44.97                                 
  vm-101-disk-0                                  pve Vwi-a-tz-- 1000.00g data                                                98.56
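
For reference, if removing those two VM 100 snapshots is the route taken, the supported way is through qm rather than deleting the LVs by hand, so the Proxmox config stays in sync. This is only a sketch, and the snapshot names are inferred from the snap_vm-100-disk-0_* entries above:

Code:
qm listsnapshot 100                             # confirm the names Proxmox knows about
qm delsnapshot 100 pre_saltbox_051923           # each removal frees that snapshot's space in the thin pool
qm delsnapshot 100 Pre_saltbox_051923_shutdown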

Code:
pvs
  PV         VG  Fmt  Attr PSize  PFree 
  /dev/sda3  pve lvm2 a--  <2.73t <16.38g
root@hp-local:~# vgs
  VG  #PV #LV #SN Attr   VSize  VFree 
  pve   1   9   0 wz--n- <2.73t <16.38g
root@hp-local:~# vgdisplay
  --- Volume group ---
  VG Name               pve
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  86
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                9
  Open LV               3
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               <2.73 TiB
  PE Size               4.00 MiB
  Total PE              715233
  Alloc PE / Size       711041 / 2.71 TiB
  Free  PE / Size       4192 / <16.38 GiB
  VG UUID               uXm3Wn-hA5P-Jbmv-VXyG-DGjY-vhVF-JZ7Fr2

Code:
pvdisplay
  --- Physical volume ---
  PV Name               /dev/sda3
  VG Name               pve
  PV Size               <2.73 TiB / not usable <2.01 MiB
  Allocatable           yes
  PE Size               4.00 MiB
  Total PE              715233
  Free PE               4192
  Allocated PE          711041
  PV UUID               je1utU-rIUW-CJH8-41lO-V3Su-BYnf-0q1CHe


I also tried running fsck as described here:
https://forum.proxmox.com/threads/struggling-to-repair-bad-superblock-after-power-outage.115481
Code:
fsck /dev/mapper/pve-vm--100--disk--0
fsck from util-linux 2.33.1
e2fsck 1.44.5 (15-Dec-2018)
ext2fs_open2: Bad magic number in super-block
fsck.ext2: Superblock invalid, trying backup blocks...
fsck.ext2: Bad magic number in super-block while trying to open /dev/mapper/pve-vm--100--disk--0

The superblock could not be read or does not describe a valid ext2/ext3/ext4
filesystem.  If the device is valid and it really contains an ext2/ext3/ext4
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>
 or
    e2fsck -b 32768 <device>

Found a dos partition table in /dev/mapper/pve-vm--100--disk--0
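
That "Found a dos partition table" line seems to be the key: the LV holds the whole virtual disk, partition table included, rather than a bare ext4 filesystem, so fsck has nothing to check at offset 0. A sketch of how the partition inside the disk could be checked instead (VM stopped, read-only first; the p1 mapping name is an assumption, so check what kpartx actually creates):

Code:
kpartx -av /dev/mapper/pve-vm--100--disk--0     # map the partitions inside the VM disk
fsck -n /dev/mapper/pve-vm--100--disk--0p1      # dry-run check of the first partition
kpartx -dv /dev/mapper/pve-vm--100--disk--0     # remove the mappings when done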

Code:
fdisk -l
Disk /dev/sda: 2.7 TiB, 3000445722624 bytes, 5860245552 sectors
Disk model: LOGICAL VOLUME 
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 1E11ABC5-974F-4584-AEDD-ADC93710B7D4

Device       Start        End    Sectors  Size Type
/dev/sda1       34       2047       2014 1007K BIOS boot
/dev/sda2     2048    1050623    1048576  512M EFI System
/dev/sda3  1050624 5860245518 5859194895  2.7T Linux LVM


Disk /dev/mapper/pve-swap: 8 GiB, 8589934592 bytes, 16777216 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/mapper/pve-root: 96 GiB, 103079215104 bytes, 201326592 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/mapper/pve-vm--100--state--pre_saltbox_051923: 188.5 GiB, 202387750912 bytes, 395288576 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes


Disk /dev/mapper/pve-vm--101--disk--0: 1000 GiB, 1073741824000 bytes, 2097152000 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes
Disklabel type: gpt
Disk identifier: A0D76621-BF2F-4082-B304-7C79776FC81E

Device                                   Start        End    Sectors  Size Type
/dev/mapper/pve-vm--101--disk--0-part1    2048       4095       2048    1M BIOS boot
/dev/mapper/pve-vm--101--disk--0-part2    4096    4198399    4194304    2G Linux filesystem
/dev/mapper/pve-vm--101--disk--0-part3 4198400 2097149951 2092951552  998G Linux filesystem


Disk /dev/mapper/pve-vm--100--disk--0: 129.2 TiB, 142051748347904 bytes, 277444820992 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes
Disklabel type: dos
Disk identifier: 0xf0f2c43b

Device                                 Boot Start        End    Sectors  Size Id Type
/dev/mapper/pve-vm--100--disk--0-part1 *     2048 2566914047 2566912000  1.2T 83 Linux
 
Code:
pveversion -v
proxmox-ve: 6.4-1 (running kernel: 5.4.143-1-pve)
pve-manager: 6.4-13 (running version: 6.4-13/9f411e79)
pve-kernel-helper: 6.4-8
pve-kernel-5.4: 6.4-7
pve-kernel-5.3: 6.1-6
pve-kernel-5.4.143-1-pve: 5.4.143-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.13-1-pve: 5.3.13-1
pve-kernel-5.3.10-1-pve: 5.3.10-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.1.2-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve4~bpo10
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.22-pve1~bpo10+1
libproxmox-acme-perl: 1.1.0
libproxmox-backup-qemu0: 1.1.0-1
libpve-access-control: 6.4-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.4-4
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.2-3
libpve-storage-perl: 6.4-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.1.13-2
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.6-1
pve-cluster: 6.4-1
pve-container: 3.3-6
pve-docs: 6.4-2
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-4
pve-firmware: 3.3-2
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-6
pve-xtermjs: 4.7.0-3
qemu-server: 6.4-2
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.6-pve1~bpo10+1
 
Code:
qm list
      VMID NAME                 STATUS     MEM(MB)    BOOTDISK(GB) PID      
       100 drgn-plex            running    96256         132296.00 17039    
       101 drgn-salt            stopped    32768           1000.00 0
 
Fixed it, following this thread: https://forum.proxmox.com/threads/decrease-a-vm-disk-size.122430/

Code:
lvm lvreduce -L -32g pve/vm-103-disk-0
qm rescan


I only did this after confirming with GParted Live that the VM's partition didn't actually extend into the whopping size Proxmox was showing for the disk.

I shrank it a bit at a time and stopped when I got close to the original size.
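
Roughly what that looks like for this pool, adapted from the thread above (a sketch; the step size is arbitrary, and the fdisk output earlier shows part1 ending around 1.2 TiB, which is what the GParted check was confirming):

Code:
lvs pve/vm-100-disk-0                 # check the current, inflated virtual size
lvreduce -L -10t pve/vm-100-disk-0    # shrink in steps, re-checking lvs after each one
qm rescan --vmid 100                  # once near the real size, let Proxmox pick up the new value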

Working perfectly so far, and I have both my server VMs running now.
 
This problem has returned. It was working, and then both machines shut down again with the I/O error.

I saw something in journalctl about Proxmox running a backup that ran out of drive space, but I have no backups scheduled and can't find any in the Backup section of the GUI.
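
For what it's worth, a sketch of how to check whether something like vzdump (Proxmox's backup tool) actually ran and what filled the pool; the grep terms are just guesses:

Code:
journalctl -b | grep -iE 'vzdump|thin|i/o error'    # look for backup runs and thin-pool messages
cat /etc/pve/vzdump.cron                            # scheduled backup jobs are stored here on PVE 6.x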

Deleted my other snapshot (the one for VM 101) and it seems to be working again.

How do I "FIX" over-provisioning on the LVM-thin drive?

I thought I had not allocated more to the VMs than was available.

VM 100 was configured with a 1.2 TB disk, and VM 101 was configured with a 1.0 TB disk.
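
One way to keep an eye on what has been promised versus what the pool can actually hold (a sketch; these are standard lvs fields): the virtual sizes of the disks, the snapshots, and the ~188 GiB vmstate volume all count against the ~2.58 TiB pool, so the usual fixes are shrinking or removing volumes and snapshots, growing the pool, or simply keeping that total at or below the pool size.

Code:
lvs -o lv_name,lv_size,data_percent,pool_lv pve                  # virtual size of every thin LV and snapshot
lvs -o lv_name,lv_size,data_percent,metadata_percent pve/data    # how full the pool itself actually is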
 
