local-LVM not available after Kernel update on PVE 7

Thanks for this.

I wonder if there is a way to increase the udev timeout, so we can avoid having the pvscan process killed in the first place.
Code:
1. pvscan is started by the udev 69-lvm-metad.rules.
2. pvscan activates XYZ_tmeta and XYZ_tdata.
3. pvscan starts thin_check for the pool and waits for it to complete.
4. The timeout enforced by udev is hit and pvscan is killed.
5. Some time later, thin_check completes, but the activation of the
thin pool never completes.
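If you suspect this is what happens on your system, a quick way to check is to search the boot journal for udev killing its workers and for the corresponding pvscan/thin_check activity. A minimal sketch (the exact message wording may differ between systemd/lvm2 versions):
Code:
# look for udev workers that were killed because of the event timeout
journalctl -b | grep -iE 'systemd-udevd.*(timeout|kill)'
# and for pvscan / thin_check activity around the same time
journalctl -b | grep -iE 'pvscan|thin_check'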

EDIT: INCREASING THE UDEV TIMEOUT DOES WORK
The boot did take longer, but it did not fail.
I have set the timeout to 600s (10min). Default is 180s (3min).
Some people may need even more time, depending on how many disks and pools they have, but above 10min I would just use --skip-mappings :D

Edited
Code:
# nano /etc/udev/udev.conf
Added
Code:
event_timeout=600
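If you prefer not to edit the file by hand, the same change can be applied non-interactively; a sketch, assuming event_timeout is not already set somewhere else in the file:
Code:
# append the setting and confirm there is only one un-commented occurrence
echo 'event_timeout=600' >> /etc/udev/udev.conf
grep -n 'event_timeout' /etc/udev/udev.conf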
Then I disabled the --skip-mappings option in lvm.conf by commenting it out
(you may skip this step if you haven't changed your lvm.conf file)
Code:
# nano /etc/lvm/lvm.conf
Disabled
Code:
        # thin_check_options = [ "-q", "--skip-mappings" ]
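To double-check which thin_check options are actually in effect after the edit, something like the following should work (lvmconfig may print nothing or the built-in default if the option is not explicitly set in lvm.conf):
Code:
# the option as written in lvm.conf, if any
grep -n 'thin_check_options' /etc/lvm/lvm.conf
# the option as seen by LVM itself
lvmconfig global/thin_check_options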
Then updated initramfs again with both changes
Code:
# update-initramfs -u
And rebooted to test and it worked.
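To verify that the edited configuration actually ended up in the regenerated initramfs, you can list its contents; a sketch (whether udev.conf is copied in at all depends on your initramfs hooks):
Code:
# look for the udev and lvm config files inside the current initramfs
lsinitramfs /boot/initrd.img-$(uname -r) | grep -E 'udev\.conf|lvm\.conf'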

I think I prefer it this way. The server should not be rebooted frequently anyway. It is a longer boot with more thorough tests (not sure why it takes so long, though).

Testing here, it took about 2m20s for the first pool to appear on the screen as "found" and 3m17s for all the pools to load. Then the boot quickly finished and the system was online. In my case, I was just above the 3min limit.

EDIT2: I wonder if there is some optimization I can do to the metadata of the pool to make this better. Also, one of my pools has two 2TB disks, but the metadata is only in one of them (it was expanded to the second disk). Not sure if this matters, but this seems to be the slow pool to check/mount.
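A starting point for looking into this could be to compare the data and metadata usage of the pools; a sketch using standard lvs report fields:
Code:
# per-LV view of the thin pools: size, backing pool, data usage and metadata usage
lvs -a -o lv_name,lv_size,pool_lv,data_percent,metadata_percent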

Anyways, hope this helps someone.
Cheers

Regarding comment #1: this is what solved it for me. Permanent fix. It is not "fast", but if I plan a 10-minute downtime when I update the server, I'm OK with it. Thanks for that info, really helpful!
 
Hi,
updated to Proxmox 8 and the issue still exists.

---
Kernel Version: Linux 6.2.16-2-pve #1 SMP PREEMPT_DYNAMIC PVE 6.2.16-2 (2023-06-13T13:30Z)
PVE Manager Version: pve-manager/8.0.3/bbf3993334bfa916
unfortunately, the issue is likely here to stay for a while. Fixing it would require rather fundamental changes to Proxmox VE's/Debian's initialization or setting certain defaults that would not be ideal for most people. You can use one of the workarounds mentioned in this thread if you are affected by the issue.
 
Just came across this issue after rebooting. I did the lvchange commands and they succeeded, but the reboot still had the issue.

I tried event_timeout=600 but it didn't seem to work. Some of my VMs failed to start with the same error. I can start them manually from the web UI. I then tried adding a startup delay of 600 to a failing VM, but that didn't work either.

--skip-mappings is the only thing that worked.
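For reference, the workaround meant here is the lvm.conf change shown earlier in this thread, plus an initramfs rebuild; a sketch:
Code:
# in the global { } section of /etc/lvm/lvm.conf
thin_check_options = [ "-q", "--skip-mappings" ]
and then run update-initramfs -u and reboot so the change is picked up during early boot.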

FWIW, I also noticed this on the console during boot when using event_timeout=600. It definitely did not wait 10 minutes before showing that; it appeared in under 20 seconds. Shouldn't it wait 10 minutes, or is this some other timeout?

Code:
Timed out waiting for udev queue to be empty.
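That message most likely comes from a udevadm settle call in the initramfs, which has its own timeout that is separate from event_timeout, but that is an assumption on my part. One way to find the script that actually prints it is to unpack the initramfs and grep for the string; a sketch:
Code:
# unpack the current initramfs and search for the message
mkdir -p /tmp/initrd-unpacked
unmkinitramfs /boot/initrd.img-$(uname -r) /tmp/initrd-unpacked
grep -rF "udev queue" /tmp/initrd-unpacked /usr/share/initramfs-tools/ 2>/dev/null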
 
Hi,
hello, I have a similar problem: I cannot access the Proxmox panel. After restarting the server, it boots into a VM that I created (VM 104), which I do not understand. I have already run the commands from this forum, but it doesn't work.
in the screenshot, it looks like you installed cloud-init on the Proxmox host. In almost all cases, that package is intended to be used within a VM and not on the host. If it was not intentional, remove it and fix your network/hostname configuration.
 
Hi,

in the screenshot, it looks like you installed cloud-init on the Proxmox host. In almost all cases, that package is intended to be used within a VM and not on the host. If it was not intentional, remove it and fix your network/hostname configuration.
So no, that's not the problem. The issue is that the server boots into a VM instead of Proxmox, because of a problem with LVM.
 
Hi,

in the screenshot, it looks like you installed cloud-init on the Proxmox host. In almost all cases, that package is intended to be used within a VM and not on the host. If it was not intentional, remove it and fix your network/hostname configuration.
Proxmox works perfectly with several VMs, except that on reboot (Proxmox 8), instead of booting into Proxmox, the server boots into the VM pve-vm--104--cloudinit. It is basically a simple VM, but the server boots directly into it and not into sda3 -> pve-root/pve-data.
 

Attachments

  • cd0be5a1cfff8ac03783851824706037.png (549.6 KB)
  • 85d28196f43e631a3e9d0151d6ccaa4c.png (404.7 KB)
  • ecc8d9642604191062e5d3db7c505314.png (284.3 KB)
EDIT: INCREASING THE UDEV TIMEOUT DOES WORK
The boot did take longer, but it did not fail.
I have set the timeout to 600s (10min). Default is 180s (3min).
Well, actually, not here: yes, it works, the thin pool is activated in the end... but the Proxmox services started by systemd had already tried to start the VMs, and those starts had already failed, so in the end the thin pool is activated but the guest VMs aren't started.
Is it expected that systemd starts the Proxmox services before udev has finished its work?
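One thing that might help here (an assumption, not something confirmed earlier in this thread) is delaying the automatic guest startup so the pools have time to activate first. Recent PVE versions have a node-level startall-onboot-delay setting; check pvenode config set --help for the exact name and allowed range on your version:
Code:
# delay the automatic "start all on boot" by 300 seconds on this node
pvenode config set --startall-onboot-delay 300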
 
Hi there,

Got the same issue tonight after a reboot.
Is this not resolved yet?

I had to modify my lvm.conf to add the --skip-mappings option in order to fix the issue.

Code:
# pveversion -v
proxmox-ve: 7.2-1 (running kernel: 5.15.53-1-pve)
pve-manager: 7.2-11 (running version: 7.2-11/b76d3178)
pve-kernel-helper: 7.2-12
pve-kernel-5.15: 7.2-10
pve-kernel-5.4: 6.4-17
pve-kernel-5.15.53-1-pve: 5.15.53-1
pve-kernel-5.15.35-1-pve: 5.15.35-3
pve-kernel-5.4.189-1-pve: 5.4.189-1
pve-kernel-5.4.162-1-pve: 5.4.162-2
ceph-fuse: 14.2.21-1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksmtuned: 4.20150326
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-3
libpve-storage-perl: 7.2-8
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.6-1
proxmox-backup-file-restore: 2.2.6-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-2
pve-docs: 7.2-2
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-6
pve-firmware: 3.5-2
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 7.0.0-3
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.5-pve1
Hey, I'm back a year or so later and got the same problem today after upgrading to PVE 8. I've tried the udev "event_timeout" but it looks like it is ignored (the timeout comes way before the 600 or 900 seconds I put there).
Do you have any idea about that?

EDIT: I added back the --skip-mappings option to lvm.conf and that worked as expected
 
I upgraded from version 7 to version 8 but when I restarted my nodes my LVM volume stopped working.

I removed the volume to put it back but then I got this error message
create storage failed: command '/sbin/pvs --separator: --noheadings --units k --unbuffered --nosuffix --options pv_name,pv_size,vg_name,pv_uuid /dev/disk/by-id/scsi-36006016024984e00cf23665e71e47afa' failed: exit code 5 (500)
 
Hey, I'm back a year or so later and got the same problem today after upgrading to PVE 8. I've tried the udev "event_timeout" but it looks like it is ignored (the timeout comes way before the 600 or 900 seconds I put there).
Do you have any idea about that?

EDIT: I added back the --skip-mappings option to lvm.conf and that worked as expected


Hi, can you explain to us how you solved your problem? On which line did you add the text?
 
Hi,
I upgraded from version 7 to version 8 but when I restarted my nodes my LVM volume stopped working.

I removed the volume to put it back but then I got this error message
create storage failed: command '/sbin/pvs --separator: --noheadings --units k --unbuffered --nosuffix --options pv_name,pv_size,vg_name,pv_uuid /dev/disk/by-id/scsi-36006016024984e00cf23665e71e47afa' failed: exit code 5 (500)
Is the device listed when you run just pvs? What do you get when you run pvs /dev/disk/by-id/scsi-36006016024984e00cf23665e71e47afa?

Is there anything interesting in the system log/journal during boot?
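For reference, a minimal set of commands to collect that information; a sketch:
Code:
# is the PV visible at all?
pvs
pvs /dev/disk/by-id/scsi-36006016024984e00cf23665e71e47afa
# LVM- and device-related messages from the current boot
journalctl -b | grep -iE 'lvm|pvscan|device-mapper'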
 
