local-LVM not available after Kernel update on PVE 7

I updated our Proxmox PVE 7 server this morning, and upon reboot the local-lvm storage was not available and the VMs would not start. Below are the updates that were applied:
libpve-common-perl: 7.0-6 ==> 7.0-9
pve-container: 4.0-9 ==> 4.0-10
pve-kernel-helper: 7.0-7 ==> 7.1-2
qemu-server: 7.0-13 ==> 7.0-14

PVE indicated that a reboot was required.

Once rebooted, local-lvm showed 0 GB in the web GUI and I got a "start failed" error when I tried to start VMs (see attachments).

I ran pvdisplay, lvdisplay, and vgdisplay (see attachment); lvdisplay shows all of the logical volumes as "NOT available".

I ran lvchange -ay pve/data to activate pve/data; local-lvm now shows as active with a usage percentage, but the VMs still won't start.

I then ran lvchange -ay against the "LV Path" of each VM disk (e.g. lvchange -ay /dev/pve/vm-209002-disk-0); after this, the logical volumes showed as available and the VMs started. However, I have to repeat this for every VM disk any time the PVE server is rebooted. Why? How can I resolve this permanently?
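(As a stop-gap until the root cause is found, the whole volume group can usually be activated with a single command instead of one lvchange per disk; a minimal sketch, assuming the default volume group name pve:)

Code:
# Show the activation state of all LVs in the 'pve' VG;
# the 5th character of the Attr column is 'a' when an LV is active.
lvs -o lv_name,lv_attr pve

# Activate every LV in the 'pve' VG in one go.
vgchange -ay pve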
 

Attachments

  • PVEissue.PNG
  • PVEissue1.PNG
  • pveissue2.PNG
  • lvdisplayafter runninglvchange-ay.txt
  • pvdisplaylvdisplayvgdisplay.txt

Fabian_E (Proxmox Staff Member)
Hi,
is there anything interesting in /var/log/syslog about lvm/pvscan? Please also share your /etc/lvm/lvm.conf and the output of pveversion -v.
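(One way to pull the relevant lines out of the log, as a minimal sketch assuming the default Debian syslog location:)

Code:
# Collect LVM/pvscan related lines from the current syslog.
grep -iE 'lvm|pvscan' /var/log/syslog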
 
Original poster
Hi Fabian

I have attached the syslog and lvm.conf. The only thing that I noted in the syslog was the following line:
Oct 12 06:50:27 pve lvm[825]: pvscan[825] VG pve skip autoactivation.
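(To see which settings could be suppressing autoactivation, the effective LVM configuration can be dumped with lvmconfig; a minimal sketch using the standard lvm.conf option names:)

Code:
# Print the full effective LVM configuration (defaults included) and
# check the settings that control autoactivation.
lvmconfig --type full | grep -E 'event_activation|auto_activation_volume_list'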

Here is the output of pveversion -v:
pveversion -v
proxmox-ve: 7.0-2 (running kernel: 5.11.22-5-pve)
pve-manager: 7.0-11 (running version: 7.0-11/63d82f4e)
pve-kernel-helper: 7.1-2
pve-kernel-5.11: 7.0-8
pve-kernel-5.11.22-5-pve: 5.11.22-10
pve-kernel-5.11.22-4-pve: 5.11.22-9
pve-kernel-5.11.22-1-pve: 5.11.22-2
ceph-fuse: 15.2.13-pve1
corosync: 3.1.5-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve1
libproxmox-acme-perl: 1.3.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.0-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-9
libpve-guest-common-perl: 4.0-2
libpve-http-server-perl: 4.0-2
libpve-storage-perl: 7.0-11
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-4
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
proxmox-backup-client: 2.0.9-2
proxmox-backup-file-restore: 2.0.9-2
proxmox-mini-journalreader: 1.2-1
proxmox-widget-toolkit: 3.3-6
pve-cluster: 7.0-3
pve-container: 4.0-10
pve-docs: 7.0-5
pve-edk2-firmware: 3.20200531-1
pve-firewall: 4.2-4
pve-firmware: 3.3-2
pve-ha-manager: 3.3-1
pve-i18n: 2.5-1
pve-qemu-kvm: 6.0.0-4
pve-xtermjs: 4.12.0-1
qemu-server: 7.0-14
smartmontools: 7.2-1
spiceterm: 3.2-2
vncterm: 1.7-1
zfsutils-linux: 2.0.5-pve1
 

Attachments

  • lvm.txt
  • syslog.txt

Fabian_E (Proxmox Staff Member)
I also got "VG pve skip autoactivation." in my logs, and my /etc/lvm/lvm.conf is identical, so those should be fine.

I noticed two things. First, there is an error in the output of lvdisplay:
LV Name data
VG Name pve
LV UUID 56Hxkt-8ifA-xPKr-8a5t-qW25-Nung-HEb9Pa
LV Write Access read/write
LV Creation host, time proxmox, 2021-09-09 14:40:33 -0700
Expected thin-pool segment type but got NULL instead.
LV Pool metadata data_tmeta
LV Pool data data_tdata
LV Status NOT available
LV Size 1.67 TiB
Current LE 437892
Segments 1
Allocation inherit
Read ahead sectors auto

Second, and I'm not sure whether this is related, LVM tries to read from /dev/urandom before it was even initialized:
Oct 12 06:50:27 pve kernel: [ 3.432021] random: lvm: uninitialized urandom read (4 bytes read)

Please provide the output of lvs -a and lvdisplay pve/data -vvv. Maybe there's more information there.
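(To capture these for attaching, something along these lines should work; the output file names are arbitrary:)

Code:
# Save the requested outputs to files that can be attached here.
lvs -a > lvs-a.txt
lvdisplay pve/data -vvv > lvdisplay-pve-data-vvv.txt 2>&1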
 
Original poster
Do you want the output of lvs -a and lvdisplay pve/data -vvv while the error is occurring or after I activate the logical volumes manually? Activating the logical volumes manually has been the only way that I have been able to get the VMs on the PVE to start. This is a production server so I can't keep it offline.
 
Original poster
Could this be an issue between HP hardware and Proxmox 7? These servers worked fine with Proxmox 6. All of our test PVEs are on various Dell hardware using the no-subscription package repository, and we haven't had an issue with them; that is why I went ahead and upgraded two of our five production PVE servers to Proxmox 7. I have since done a clean rebuild of these same two servers and the problem persists.
 

Fabian_E (Proxmox Staff Member)
Quote (original poster):
Do you want the output of lvs -a and lvdisplay pve/data -vvv while the error is occurring or after I activate the logical volumes manually? [...]

It appears that the "Expected thin-pool segment type but got NULL instead." error is only present in the log before the activate commands, so it's probably better to capture the output right after the next reboot, before activating the volumes manually.

Quote (original poster):
Could this be an issue between HP hardware and Proxmox 7? [...]

So both servers show the same symptoms? And was the thin pool re-created from scratch too?

I also noticed the following in the syslog:
Code:
Oct 10 02:38:53 pve smartd[929]: Device: /dev/sda [SAT], SMART Usage Attribute: 7 Seek_Error_Rate changed from 100 to 200
which looks like the drive is not fully healthy anymore.
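(The drive's full SMART report can be checked with smartmontools, which is already part of the PVE installation; a minimal sketch, assuming the disk is /dev/sda as in the log line above:)

Code:
# Overall SMART health verdict, then the full attribute table.
smartctl -H /dev/sda
smartctl -a /dev/sda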
 
Original poster
Yes, both servers are showing the same symptoms. One is an HP ProLiant Gen9 Xeon server and the other is an HP Z620 workstation. They have both been nuked and paved, so yes, the thin pools were re-created too. I rebooted the HP Z620 workstation today so I could run the requested lvs -a and lvdisplay pve/data -vvv commands; the output is attached. As for the SMART attribute, I'm still digging into that, but the drive is fairly new and is enterprise grade.

It just seems funny that all of our test PVEs using old Dell PCs are running fine with PVE 7, but the HP production servers are having issues with it. We don't have an HP machine that isn't in production to test this. Next week I may try swapping the HDD into a Dell and see if the issue still happens. These machines worked fine with PVE 6, so we may need to go back to PVE 6. If so, is there an easy way to do it?
 

Attachments

  • bhcpvelvdisplaypvedata-vvv.txt
  • bhcpvelvs-a.PNG
