So, a couple of things happened. I had two Ubuntu VMs running, both encoding video for me. That was apparently too much for my ProLiant DL380 G7, and both VMs started throwing "i/o error" messages.
I rebooted them both, and again, i/o errors. I turned one off and it seemed OK, but later there were more i/o errors.
So I rebooted the host. Man, I forgot how long these things take to boot.
Then I restarted the VMs. Two at a time was no good. The media server worked OK, but only for a bit. I noticed both the host and the VMs had reported hard drive errors and needed fsck; at one point I had to run fsck on the host between reboots before it would finish booting.
So the first VM was still behaving strangely after a reboot and fsck. I have two snapshots of it from earlier in the week, so I tried to restore the 'running' one, taken with the VM up and running. I don't remember exactly what happened, but I got an error saying it couldn't do that, so I restored the powered-off snapshot instead.
I remember getting that horribly scary message "Volume Sizes exceeds the size of thin pool and free space in volume group", but it seemed to restore OK.
I'm afraid to turn on the 2nd VM. I'd like to save it if possible, but I could start over if I had to.
The problem is that "VM Disks" now shows vm-100-disk-0 as being 142 TB (!). This is a 3 TB drive, and I only ever provisioned 1 TB at a time, so I had 2 TB for the main VM and 1 TB for the 2nd.
How do I fix this? I saw in another thread that I should run fsck against the 'damaged' drive (the situation wasn't quite the same), but I'm afraid that will break the data even more. Can I just remove both of the snapshots I took and make a new one?
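If removing them is safe, I'm guessing (and it is just a guess, please correct me) it would be something like this, with the snapshot names taken from the lvs output below:
Code:
# list the snapshots, then drop the two I took earlier this week
qm listsnapshot 100
qm delsnapshot 100 pre_saltbox_051923
qm delsnapshot 100 Pre_saltbox_051923_shutdown
Here's what the storage looks like right now: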
Code:
lvs
  LV                                             VG  Attr       LSize    Pool Origin                                         Data%  Meta% Move Log Cpy%Sync Convert
  data                                           pve twi-aotz--   <2.58t                                                      96.52   8.37
  root                                           pve -wi-ao----   96.00g
  snap_vm-100-disk-0_Pre_saltbox_051923_shutdown pve Vri---tz-k <129.20t data
  snap_vm-100-disk-0_pre_saltbox_051923          pve Vri---tz-k <129.20t data
  snap_vm-101-disk-0_initial_install             pve Vri---tz-k 1000.00g data vm-101-disk-0
  swap                                           pve -wi-ao----    8.00g
  vm-100-disk-0                                  pve Vwi-aotz-- <129.20t data snap_vm-100-disk-0_Pre_saltbox_051923_shutdown   0.93
  vm-100-state-pre_saltbox_051923                pve Vwi-a-tz-- <188.49g data                                                 44.97
  vm-101-disk-0                                  pve Vwi-a-tz-- 1000.00g data                                                 98.56
Code:
pvs
PV VG Fmt Attr PSize PFree
/dev/sda3 pve lvm2 a-- <2.73t <16.38g
root@hp-local:~# vgs
VG #PV #LV #SN Attr VSize VFree
pve 1 9 0 wz--n- <2.73t <16.38g
root@hp-local:~# vgdisplay
--- Volume group ---
VG Name pve
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 86
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 9
Open LV 3
Max PV 0
Cur PV 1
Act PV 1
VG Size <2.73 TiB
PE Size 4.00 MiB
Total PE 715233
Alloc PE / Size 711041 / 2.71 TiB
Free PE / Size 4192 / <16.38 GiB
VG UUID uXm3Wn-hA5P-Jbmv-VXyG-DGjY-vhVF-JZ7Fr2
Code:
pvdisplay
--- Physical volume ---
PV Name /dev/sda3
VG Name pve
PV Size <2.73 TiB / not usable <2.01 MiB
Allocatable yes
PE Size 4.00 MiB
Total PE 715233
Free PE 4192
Allocated PE 711041
PV UUID je1utU-rIUW-CJH8-41lO-V3Su-BYnf-0q1CHe
I also tried running fsck as described in this thread:
https://forum.proxmox.com/threads/struggling-to-repair-bad-superblock-after-power-outage.115481
Code:
fsck /dev/mapper/pve-vm--100--disk--0
fsck from util-linux 2.33.1
e2fsck 1.44.5 (15-Dec-2018)
ext2fs_open2: Bad magic number in super-block
fsck.ext2: Superblock invalid, trying backup blocks...
fsck.ext2: Bad magic number in super-block while trying to open /dev/mapper/pve-vm--100--disk--0
The superblock could not be read or does not describe a valid ext2/ext3/ext4
filesystem. If the device is valid and it really contains an ext2/ext3/ext4
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>
or
e2fsck -b 32768 <device>
Found a dos partition table in /dev/mapper/pve-vm--100--disk--0
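If I'm reading that "Found a dos partition table" line right, fsck choked because the LV holds a whole-disk image with its own partition table rather than a bare filesystem, so the check would have to run against the partition inside it instead of the LV itself. This is what I had in mind, read-only for now (just my guess; the partition device name kpartx creates may be different on my box):
Code:
# map the partitions inside the LV, check the first one read-only, then unmap
kpartx -av /dev/mapper/pve-vm--100--disk--0
fsck -n /dev/mapper/pve-vm--100--disk--0p1   # -n = report only, change nothing
kpartx -dv /dev/mapper/pve-vm--100--disk--0
Is that the right approach before I try anything that writes?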
Code:
fdisk -l
Disk /dev/sda: 2.7 TiB, 3000445722624 bytes, 5860245552 sectors
Disk model: LOGICAL VOLUME
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 1E11ABC5-974F-4584-AEDD-ADC93710B7D4
Device Start End Sectors Size Type
/dev/sda1 34 2047 2014 1007K BIOS boot
/dev/sda2 2048 1050623 1048576 512M EFI System
/dev/sda3 1050624 5860245518 5859194895 2.7T Linux LVM
Disk /dev/mapper/pve-swap: 8 GiB, 8589934592 bytes, 16777216 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/mapper/pve-root: 96 GiB, 103079215104 bytes, 201326592 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/mapper/pve-vm--100--state--pre_saltbox_051923: 188.5 GiB, 202387750912 bytes, 395288576 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes
Disk /dev/mapper/pve-vm--101--disk--0: 1000 GiB, 1073741824000 bytes, 2097152000 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes
Disklabel type: gpt
Disk identifier: A0D76621-BF2F-4082-B304-7C79776FC81E
Device Start End Sectors Size Type
/dev/mapper/pve-vm--101--disk--0-part1 2048 4095 2048 1M BIOS boot
/dev/mapper/pve-vm--101--disk--0-part2 4096 4198399 4194304 2G Linux filesystem
/dev/mapper/pve-vm--101--disk--0-part3 4198400 2097149951 2092951552 998G Linux filesystem
Disk /dev/mapper/pve-vm--100--disk--0: 129.2 TiB, 142051748347904 bytes, 277444820992 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes
Disklabel type: dos
Disk identifier: 0xf0f2c43b
Device Boot Start End Sectors Size Id Type
/dev/mapper/pve-vm--100--disk--0-part1 * 2048 2566914047 2566912000 1.2T 83 Linux
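One thing that gives me a little hope: if I'm reading the lvs output right, 0.93% of 129.20 TiB is only about 1.2 TiB actually allocated in the thin pool, which lines up with the 1.2T partition fdisk sees inside that LV. So maybe the data is still intact and only the LV's virtual size got blown up. Before I touch anything I was planning to just gather info and back up the LVM metadata (read-only, and I'm guessing at which report fields are useful):
Code:
# show every LV, including hidden ones, with size and thin-pool usage
lvs -a -o lv_name,lv_size,data_percent,metadata_percent,origin pve
# save a copy of the current VG metadata before changing anything
vgcfgbackup -f /root/pve-vg-backup.txt pve
Is that a sane first step, or is there something else I should do before removing those snapshots?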