Hello,
Last weekend we had a power failure and our server shut down uncleanly. It has been a long week, but I feel I am close to solving the problem. I am not an LVM expert, and this has been a somewhat complex learning experience.
I have been through a lot, so here is a brief summary of the situation (someone else might find it useful):
1. Over-provisioned storage. The VG had no free space, so nothing could be extended in place and every attempt at an extend was refused. I was forced to use what I had on hand: I attached a USB drive, created a PV on it, and added it to the volume group.
Code:
Device Boot Start End Sectors Size Id Type
/dev/sdb1 2048 1953521663 1953519616 931.5G 7 HPFS/NTFS/exFAT
root@Desarrollo:~#
root@Desarrollo:~# vgextend pve /dev/sdb1
WARNING: ntfs signature detected on /dev/sdb1 at offset 3. Wipe it? [y/n]: y
Wiping ntfs signature on /dev/sdb1.
Physical volume "/dev/sdb1" successfully created.
Volume group "pve" successfully extended
root@Desarrollo:~#
root@Desarrollo:~#
root@Desarrollo:~# vgs
VG #PV #LV #SN Attr VSize VFree
pve 2 37 0 wz--n- 13.64t 932.07g
root@Desarrollo:~# lvchange -an /dev/pve/data
root@Desarrollo:~#
root@Desarrollo:~# lvconvert --repair /dev/pve/data
WARNING: Sum of all thin volume sizes (<30.20 TiB) exceeds the size of thin pools and the size of whole volume group (13.64 TiB).
WARNING: You have not turned on protection against thin pools running out of space.
WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
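For anyone following along, the general repair flow I pieced together from the output above looks roughly like this (a sketch only, assuming the pool is pve/data and the VG now has free space for the spare metadata LV):

```shell
#!/bin/sh
# Sketch of the thin-pool repair flow used above (assumption: pool is pve/data).
VG=pve
POOL=data
# Guard: only attempt anything if the pool device actually exists on this host.
if [ -e "/dev/$VG/$POOL" ]; then
    lvchange -an "$VG/$POOL"         # the pool must be inactive before repair
    lvconvert --repair "$VG/$POOL"   # writes repaired metadata; keeps the old copy as data_metaN
    lvs -a "$VG"                     # inspect the result, including the data_metaN backup LV
fi
```

Note that every run of lvconvert --repair leaves another data_metaN backup LV behind, which is why several of them pile up later in this post.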
2. I tried to repair the Thin Pool with this article (
https://www.unixrealm.com/repair-a-thin-pool/)
Code:
root@Desarrollo:/usr/sbin# ./thin_check /dev/pve/data_meta1
examining superblock
TRANSACTION_ID=159
METADATA_FREE_BLOCKS=4145151
examining devices tree
examining mapping tree
thin device 3954164 is missing mappings [0, -]
block out of bounds (2894699358584874 >= 4145152)
thin device 3954165 is missing mappings [0, -]
block out of bounds (2923571252822058 >= 4145152)
thin device 3954166 is missing mappings [0, -]
block out of bounds (2923571269599274 >= 4145152)
...and many more missing mappings like these.
3. I ran the lvconvert --repair process many times, which generated several "meta" LVs (in my desperate attempts to test things).
4. Then "Transaction id mismatch" errors appeared, at which point no further procedure should be applied to the pool... I was able to repair that with (
https://blog.monotok.org/lvm-transaction-id-mismatch-and-metadata-resize-error/)
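My reading of that article's fix, as a hedged sketch (the exact transaction_id values are from my own error message and will differ for anyone else):

```shell
#!/bin/sh
# Sketch of the transaction_id mismatch fix from the linked article:
# dump the VG metadata to a file, hand-edit transaction_id to the value LVM
# says it expects, then restore the edited copy. --force is needed for VGs
# containing thin pools.
VG=pve
BACKUP=/root/${VG}-fix.vg
if command -v vgcfgbackup >/dev/null 2>&1 && vgs "$VG" >/dev/null 2>&1; then
    vgcfgbackup -f "$BACKUP" "$VG"
    # ...edit transaction_id in $BACKUP here (in my case 159 -> 161)...
    vgcfgrestore -f "$BACKUP" "$VG" --force
fi
```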
After all that, local-lvm came back online and I was able to clean up some checkpoints from the Proxmox GUI, reducing the thin pool to half its size, but then I got this new error when trying to mount:
Code:
root@0:~# lvextend -L+1G pve/data_tmeta
Thin pool pve-data-tpool (254:10) transaction_id is 159, while expected 161.
Failed to activate pve/data.
root@Desarrollo:~# vgchange -ay
device-mapper: reload ioctl on (253:16) failed: No data available
device-mapper: reload ioctl on (253:16) failed: No data available
device-mapper: reload ioctl on (253:16) failed: No data available
device-mapper: reload ioctl on (253:16) failed: No data available
device-mapper: reload ioctl on (253:16) failed: No data available
device-mapper: reload ioctl on (253:16) failed: No data available
device-mapper: reload ioctl on (253:16) failed: No data available
device-mapper: reload ioctl on (253:16) failed: No data available
device-mapper: reload ioctl on (253:16) failed: No data available
device-mapper: reload ioctl on (253:16) failed: No data available
device-mapper: reload ioctl on (253:16) failed: No data available
device-mapper: reload ioctl on (253:16) failed: No data available
device-mapper: reload ioctl on (253:16) failed: No data available
device-mapper: reload ioctl on (253:16) failed: No data available
13 logical volume(s) in volume group "pve" now active
root@Desarrollo:~# lvscan
ACTIVE '/dev/pve/swap' [8.00 GiB] inherit
ACTIVE '/dev/pve/root' [96.00 GiB] inherit
ACTIVE '/dev/pve/data' [<12.59 TiB] inherit
inactive '/dev/pve/vm-101-disk-7' [250.00 GiB] inherit
inactive '/dev/pve/vm-102-disk-4' [250.00 GiB] inherit
inactive '/dev/pve/vm-102-disk-5' [250.00 GiB] inherit
inactive '/dev/pve/vm-102-disk-6' [200.00 GiB] inherit
inactive '/dev/pve/vm-101-disk-8' [1.95 TiB] inherit
inactive '/dev/pve/vm-101-disk-0' [500.00 GiB] inherit
inactive '/dev/pve/vm-101-disk-9' [1.95 TiB] inherit
inactive '/dev/pve/vm-101-disk-10' [1000.00 GiB] inherit
inactive '/dev/pve/vm-101-disk-12' [500.00 GiB] inherit
ACTIVE '/dev/pve/data_meta0' [15.81 GiB] inherit
inactive '/dev/pve/vm-119-disk-0' [100.00 GiB] inherit
inactive '/dev/pve/vm-101-disk-1' [1000.00 GiB] inherit
inactive '/dev/pve/vm-101-disk-3' [1.95 TiB] inherit
inactive '/dev/pve/vm-101-disk-2' [100.00 GiB] inherit
inactive '/dev/pve/vm-101-disk-4' [<3.91 TiB] inherit
ACTIVE '/dev/pve/data_meta1' [15.81 GiB] inherit
ACTIVE '/dev/pve/data_meta2' [15.81 GiB] inherit
ACTIVE '/dev/pve/data_meta3' [15.81 GiB] inherit
ACTIVE '/dev/pve/data_meta4' [15.81 GiB] inherit
ACTIVE '/dev/pve/data_meta5' [15.81 GiB] inherit
ACTIVE '/dev/pve/data_meta6' [15.81 GiB] inherit
ACTIVE '/dev/pve/data_meta7' [15.81 GiB] inherit
ACTIVE '/dev/pve/data_meta8' [15.81 GiB] inherit
ACTIVE '/dev/pve/repaired_01' [24.00 GiB] inherit
root@Desarrollo:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 12.7T 0 disk
├─sda1 8:1 0 512M 0 part
└─sda2 8:2 0 12.7T 0 part
├─pve-swap 253:0 0 8G 0 lvm [SWAP]
├─pve-root 253:1 0 96G 0 lvm /
├─pve-data_meta0 253:2 0 15.8G 0 lvm
├─pve-data_meta1 253:3 0 15.8G 0 lvm
├─pve-data_meta2 253:4 0 15.8G 0 lvm
└─pve-data_tdata 253:6 0 12.6T 0 lvm
└─pve-data-tpool 253:7 0 12.6T 0 lvm
├─pve-data 253:8 0 12.6T 1 lvm
└─pve-repaired_01 253:15 0 24G 0 lvm
sdb 8:16 0 931.5G 0 disk
└─sdb1 8:17 0 931.5G 0 part
├─pve-data_tmeta 253:5 0 15.8G 0 lvm
│ └─pve-data-tpool 253:7 0 12.6T 0 lvm
│ ├─pve-data 253:8 0 12.6T 1 lvm
│ └─pve-repaired_01 253:15 0 24G 0 lvm
├─pve-data_meta3 253:9 0 15.8G 0 lvm
├─pve-data_meta4 253:10 0 15.8G 0 lvm
├─pve-data_meta5 253:11 0 15.8G 0 lvm
├─pve-data_meta6 253:12 0 15.8G 0 lvm
├─pve-data_meta7 253:13 0 15.8G 0 lvm
└─pve-data_meta8 253:14 0 15.8G 0 lvm
sr0 11:0 1 1024M 0 rom
5. Metadata repair (
https://charlmert.github.io/blog/2017/06/15/lvm-metadata-repair/)
Code:
root@0:~# lvchange --yes -ay pve/data_tmeta
Allowing activation of component LV.
Activation of logical volume pve/data_tmeta is prohibited while logical volume pve/repaired_01 is active.
root@0:~# lvchange --yes -an pve/repaired_01
root@0:~# lvscan
ACTIVE '/dev/pve/swap' [8.00 GiB] inherit
ACTIVE '/dev/pve/root' [96.00 GiB] inherit
ACTIVE '/dev/pve/data' [<12.59 TiB] inherit
inactive '/dev/pve/vm-101-disk-7' [250.00 GiB] inherit
inactive '/dev/pve/vm-102-disk-4' [250.00 GiB] inherit
inactive '/dev/pve/vm-102-disk-5' [250.00 GiB] inherit
inactive '/dev/pve/vm-102-disk-6' [200.00 GiB] inherit
inactive '/dev/pve/vm-101-disk-8' [1.95 TiB] inherit
inactive '/dev/pve/vm-101-disk-0' [500.00 GiB] inherit
inactive '/dev/pve/vm-101-disk-9' [1.95 TiB] inherit
inactive '/dev/pve/vm-101-disk-10' [1000.00 GiB] inherit
inactive '/dev/pve/vm-101-disk-12' [500.00 GiB] inherit
ACTIVE '/dev/pve/data_meta0' [15.81 GiB] inherit
inactive '/dev/pve/vm-119-disk-0' [100.00 GiB] inherit
inactive '/dev/pve/vm-101-disk-1' [1000.00 GiB] inherit
inactive '/dev/pve/vm-101-disk-3' [1.95 TiB] inherit
inactive '/dev/pve/vm-101-disk-2' [100.00 GiB] inherit
inactive '/dev/pve/vm-101-disk-4' [<3.91 TiB] inherit
ACTIVE '/dev/pve/data_meta1' [15.81 GiB] inherit
ACTIVE '/dev/pve/data_meta2' [15.81 GiB] inherit
ACTIVE '/dev/pve/data_meta3' [15.81 GiB] inherit
ACTIVE '/dev/pve/data_meta4' [15.81 GiB] inherit
ACTIVE '/dev/pve/data_meta5' [15.81 GiB] inherit
ACTIVE '/dev/pve/data_meta6' [15.81 GiB] inherit
ACTIVE '/dev/pve/data_meta7' [15.81 GiB] inherit
ACTIVE '/dev/pve/data_meta8' [15.81 GiB] inherit
inactive '/dev/pve/repaired_01' [24.00 GiB] inherit
root@0:~# lvchange --yes -ay pve/data_tmeta
Allowing activation of component LV.
Activation of logical volume pve/data_tmeta is prohibited while logical volume pve/data is active.
root@0:~# lvchange --yes -an pve/data
root@0:~# lvchange --yes -ay pve/data_tmeta
Allowing activation of component LV.
root@0:~# thin_dump /dev/mapper/pve-data_tmeta -o thin_dump_pve-data_tmeta.xml
root@0:~# more thin_dump_pve-data_tmeta.xml
<superblock uuid="" time="42" transaction="178" flags="0" version="2" data_block_size="128" nr_data_blocks="211163456">
<device dev_id="87" mapped_blocks="46265" transaction="175" creation_time="42" snap_time="42">
<range_mapping origin_begin="0" data_begin="0" length="46265" time="42"/>
</device>
</superblock>
root@0:~# thin_restore -i /root/tmeta.xml -o /dev/mapper/pve-repaired_01
Output file does not exist.
The output file should either be a block device,
or an existing file. The file needs to be large
enough to hold the metadata.
root@0:~# thin_restore -i thin_dump_pve-data_tmeta.xml -o /dev/mapper/pve-repaired_01
Output file does not exist.
The output file should either be a block device,
or an existing file. The file needs to be large
enough to hold the metadata.
6. I redid the procedure with a new LV called repaired_02, and with that I was finally able to eliminate the "reload ioctl" errors:
Code:
root@0:~# lvcreate -an -Zn -L40G --name repaired_02 pve
WARNING: Sum of all thin volume sizes (13.84 TiB) exceeds the size of thin pools and the size of whole volume group (13.64 TiB).
WARNING: You have not turned on protection against thin pools running out of space.
WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
WARNING: Logical volume pve/repaired_02 not zeroed.
Logical volume "repaired_02" created.
root@0:~# lvchange --yes -ay pve/repaired_02
root@0:~# thin_restore -i thin_dump_pve-data_tmeta.xml -o /dev/mapper/pve-repaired_02
truncating metadata device to 4161600 4k blocks
Restoring: [==================================================] 100%
root@0:~# lvconvert --thinpool pve/data --poolmetadata /dev/mapper/pve-repaired_02
Do you want to swap metadata of pve/data pool with metadata volume pve/repaired_02? [y/n]: y
root@0:~#
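In hindsight, it may be worth verifying the restored metadata with thin_check before swapping it into the pool (a sketch; repaired_02 is the LV restored above):

```shell
#!/bin/sh
# Sanity-check the restored metadata LV before the lvconvert swap (sketch).
DEV=/dev/mapper/pve-repaired_02
if [ -e "$DEV" ] && command -v thin_check >/dev/null 2>&1; then
    thin_check "$DEV" && echo "restored metadata looks consistent"
fi
```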
7. I ran lvconvert --repair once more (for the last time), and after a reboot the ioctl errors were (finally) completely gone.
**** Now, the reason why I am reopening this thread ****
After rebooting, it errors that the partition cannot be mounted because "tdata" is active
(Dev pvestatd[1639]: activating LV 'pve/data' failed: Activation of logical volume pve/data is prohibited while logical volume pve/data_tdata is active.) .. I tried everything suggested in this thread:
Code:
root@Desarrollo:~# lvchange -an pve/data_tdata
root@Desarrollo:~# lvchange -an pve/data_tmeta
root@Desarrollo:~# lvchange -ay pve/data
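When lvchange itself refuses to deactivate the component LVs, one last-resort option I have seen mentioned is removing the stale device-mapper entries directly (a sketch only; the dm names are an assumption taken from the lsblk output above, so double-check with `dmsetup ls` before removing anything):

```shell
#!/bin/sh
# Last-resort sketch: clear stale dm mappings so the pool can activate.
# (Assumption: the dm names pve-data_tmeta/pve-data_tdata match `dmsetup ls`.)
TDATA=pve-data_tdata
TMETA=pve-data_tmeta
if command -v dmsetup >/dev/null 2>&1 && dmsetup info "$TDATA" >/dev/null 2>&1; then
    dmsetup remove "$TMETA"
    dmsetup remove "$TDATA"
    lvchange -ay pve/data    # then retry activating the pool
fi
```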
Then the update-initramfs command threw an error:
Code:
root@Desarrollo:/etc/lvm# update-initramfs -u
update-initramfs: Generating /boot/initrd.img-5.15.74-1-pve
Running hook script 'zz-proxmox-boot'..
Re-executing '/etc/kernel/postinst.d/zz-proxmox-boot' in new private mount namespace..
No /etc/kernel/proxmox-boot-uuids found, skipping ESP sync.
root@Desarrollo:/etc/lvm# more No /etc/kernel/proxmox-boot-uuids found, skipping ESP sync.^C
root@Desarrollo:/etc/lvm# more /etc/kernel/
install.d/ postinst.d/ postrm.d/
root@Desarrollo:/etc/lvm#
root@Desarrollo:/etc/lvm#
root@Desarrollo:/etc/lvm# proxmox-boot-tool
USAGE: /usr/sbin/proxmox-boot-tool <commands> [ARGS]
/usr/sbin/proxmox-boot-tool format <partition> [--force]
/usr/sbin/proxmox-boot-tool init <partition>
/usr/sbin/proxmox-boot-tool reinit
/usr/sbin/proxmox-boot-tool clean [--dry-run]
/usr/sbin/proxmox-boot-tool refresh [--hook <name>]
/usr/sbin/proxmox-boot-tool kernel <add|remove> <kernel-version>
/usr/sbin/proxmox-boot-tool kernel pin <kernel-version> [--next-boot]
/usr/sbin/proxmox-boot-tool kernel unpin [--next-boot]
/usr/sbin/proxmox-boot-tool kernel list
/usr/sbin/proxmox-boot-tool status [--quiet]
/usr/sbin/proxmox-boot-tool help
root@Desarrollo:/etc/lvm# proxmox-boot-tool status
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
E: /etc/kernel/proxmox-boot-uuids does not exist.
root@Desarrollo:/etc/lvm#
root@Desarrollo:/etc/lvm#
root@Desarrollo:/etc/lvm# efibootmgr -v
EFI variables are not supported on this system.
- I investigated a bit: there is no proxmox-boot-tool setup here because that mechanism is only used for UEFI. In our case the server boots in BIOS/Legacy mode, so I ran update-grub instead, but I don't know if that is correct or whether something more is needed on a legacy-mode server...
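For what it's worth, my understanding of the BIOS/legacy path is the following sketch (/dev/sda as the boot disk is an assumption; use whichever disk the machine actually boots from):

```shell
#!/bin/sh
# Sketch for a BIOS/legacy-mode Proxmox host (assumption: boot disk is /dev/sda).
BOOTDISK=/dev/sda
if [ -d /sys/firmware/efi ]; then
    echo "this host boots via UEFI; proxmox-boot-tool would apply instead"
elif [ -b "$BOOTDISK" ] && command -v update-grub >/dev/null 2>&1; then
    update-grub                  # regenerate grub.cfg
    grub-install "$BOOTDISK"     # reinstall GRUB to the MBR
    update-initramfs -u          # rebuild the initramfs, as attempted above
fi
```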
And here I am... the same problem, with tdata activating first, and I don't know what else to try or do... I'm running out of ideas and this error is an odd one.
If you have any suggestions or anything else I can try, I would greatly appreciate it. I feel like I'm close.
My goal is to be able to remove the disks from the virtual machines to do a clean install of Proxmox.
Kind Regards
Jhon