After Proxmox upgrade from 6 to 7, all VMs are down because their disks are not found

The VMs are not finding their disks. They are on an LVM volume group that combines 4x 2TB disks into one large volume. Proxmox can see all the disks, but it is probably not activating the volume.

I saw one change in lvm.conf made by the upgrade procedure:
Code:
root@pve:/etc/lvm# diff lvm.conf lvm.conf.bak
175c175
< #    scan_lvs = 1
---
>     scan_lvs = 1
2194,2197d2193
< devices {
<      # added by pve-manager to avoid scanning LVM volumes
<      scan_lvs=0
< }

I am not sure why we have this new config and whether it is right or not.
 
I discovered that only the first volume group was mapped in /dev (/dev/pve). The biggest one (/dev/pve3/) is not there, and all the VMs are on it. I am trying to identify which task caused that during the upgrade.
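A hedged sketch of how this could be checked and activated by hand on the host (the VG name pve3 is taken from the post above; these are plain LVM commands, nothing Proxmox-specific):

Code:
# list all volume groups the host can currently see
vgs
# if pve3 is listed but its LVs are inactive, activate them manually
vgchange -ay pve3
# the device nodes should then reappear under /dev/pve3/
lvscan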
 
Do you use nested LVM (e.g., a PV and VG on top of an LV)? That is disabled by default in LVM now. If you need that, you might want to check out the global_filter option and configure it accordingly (disallow scanning LVs except the ones that you require for your nested setup).

You definitely don't want the host LVM to scan all LVs, as that can lead to LVs being active on the host and inside a VM at the same time, which can cause corruption.
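A minimal sketch of what such a filter could look like, assuming the nested PV lives on an LV at /dev/pve3/nested (that path is a placeholder, not taken from this thread). Adjust the devices section that pve-manager already appended rather than adding a second one, and double-check the patterns against lvm.conf(5) before relying on them:

Code:
# /etc/lvm/lvm.conf -- sketch only, the LV path is a placeholder
devices {
     # re-enable LV scanning (PVE 7 ships with scan_lvs=0)
     scan_lvs = 1
     # accept only the LV that carries the nested PV, reject the other LV device nodes;
     # plain partition PVs (e.g. /dev/sda3) match no pattern and stay accepted
     global_filter = [ "a|^/dev/pve3/nested$|", "r|^/dev/pve.*|", "r|^/dev/mapper/pve.*|", "r|^/dev/dm-.*|" ]
}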
 
I am also having the exact same issue. I have an install on a low-end AMD system that was running Proxmox VE 6.4, with a PBS server being the only VM running on it. The server has a single SSD with the default partition layout. I upgraded to 7.0, following the specified upgrade path without errors. After the reboot, I am no longer able to boot my VM; it states that it cannot find a boot device. I see no errors logged on the Proxmox side. I created a new VM and it also cannot access its disk. lvdisplay shows the logical volumes. If I move the disk from the local storage to an NFS share hosted on a Synology, the VM is able to start up. Moving the disk back to the local storage, it fails to access the boot volume again.

I can post a new thread if preferred.
 
Could you attach your full /etc/lvm/lvm.conf and the log of the failed VM start?
 
I have attached my /etc/lvm/lvm.conf file. It appears to be a standard default file.

I am not sure which logs you are asking for. Please point me to them and I will attach them. In the console, when I start a VM, all I see is "TASK OK". The syslog shows the following when I start the VM:
Jul 16 21:04:54 lonestar kernel: [718543.812660] device tap200i0 entered promiscuous mode
Jul 16 21:04:54 lonestar kernel: [718544.047105] fwbr200i0: port 1(fwln200i0) entered blocking state
Jul 16 21:04:54 lonestar kernel: [718544.047113] fwbr200i0: port 1(fwln200i0) entered disabled state
Jul 16 21:04:54 lonestar kernel: [718544.047337] device fwln200i0 entered promiscuous mode
Jul 16 21:04:54 lonestar kernel: [718544.047439] fwbr200i0: port 1(fwln200i0) entered blocking state
Jul 16 21:04:54 lonestar kernel: [718544.047444] fwbr200i0: port 1(fwln200i0) entered forwarding state
Jul 16 21:04:54 lonestar kernel: [718544.068802] vmbr0: port 3(fwpr200p0) entered blocking state
Jul 16 21:04:54 lonestar kernel: [718544.068809] vmbr0: port 3(fwpr200p0) entered disabled state
Jul 16 21:04:54 lonestar kernel: [718544.069030] device fwpr200p0 entered promiscuous mode
Jul 16 21:04:54 lonestar kernel: [718544.069121] vmbr0: port 3(fwpr200p0) entered blocking state
Jul 16 21:04:54 lonestar kernel: [718544.069125] vmbr0: port 3(fwpr200p0) entered forwarding state
Jul 16 21:04:54 lonestar kernel: [718544.088969] fwbr200i0: port 2(tap200i0) entered blocking state
Jul 16 21:04:54 lonestar kernel: [718544.088977] fwbr200i0: port 2(tap200i0) entered disabled state
Jul 16 21:04:54 lonestar kernel: [718544.089288] fwbr200i0: port 2(tap200i0) entered blocking state
Jul 16 21:04:54 lonestar kernel: [718544.089293] fwbr200i0: port 2(tap200i0) entered forwarding state

Here is my disk layout:
[root@lonestar ~]$ lsblk
NAME                         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                            8:0    0 223.6G  0 disk
├─sda1                         8:1    0  1007K  0 part
├─sda2                         8:2    0   512M  0 part
└─sda3                         8:3    0 223.1G  0 part
  ├─pve-swap                 253:0    0     7G  0 lvm  [SWAP]
  ├─pve-root                 253:1    0  55.8G  0 lvm  /
  ├─pve-data_tmeta           253:2    0   1.4G  0 lvm
  │ └─pve-data-tpool         253:4    0 141.4G  0 lvm
  │   ├─pve-data             253:5    0 141.4G  1 lvm
  │   ├─pve-vm--105--disk--0 253:6    0    32G  0 lvm
  │   └─pve-vm--200--disk--1 253:8    0    32G  0 lvm
  └─pve-data_tdata           253:3    0 141.4G  0 lvm
    └─pve-data-tpool         253:4    0 141.4G  0 lvm
      ├─pve-data             253:5    0 141.4G  1 lvm
      ├─pve-vm--105--disk--0 253:6    0    32G  0 lvm
      └─pve-vm--200--disk--1 253:8    0    32G  0 lvm
sr0                           11:0    1  1024M  0 rom
[root@lonestar ~]$
 

Attachments

  • lvm.conf.txt
    101.1 KB
So the VM starts, but does not boot?

Please post the VM config and the output of lvs.
From which version did you upgrade?
 
Output of lvs:
[root@lonestar ~]$ lvs
  LV            VG  Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data          pve twi-aotz-- <141.43g             5.93   1.40
  root          pve -wi-ao----   55.75g
  swap          pve -wi-ao----    7.00g
  vm-105-disk-0 pve Vwi-a-tz--   32.00g data        26.20
  vm-200-disk-1 pve Vwi-a-tz--   32.00g data        0.00

VM Config:
agent: 1,fstrim_cloned_disks=1
boot: order=ide2;virtio0
cores: 2
ide2: none,media=cdrom
machine: pc-i440fx-5.2
memory: 6144
name: PBS2
net0: virtio=EE:69:FE:C9:A4:5C,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: l26
scsihw: virtio-scsi-pci
smbios1: uuid=60d93504-f029-4cd0-ad9e-f7e1b91a70d0
sockets: 1
startup: up=60
unused0: local-lvm:vm-105-disk-0
virtio0: HC3:105/vm-105-disk-0.qcow2,size=32G
vmgenid: 5b9ec4ce-4311-4115-b4df-4935f358a35b

I upgraded from the latest 6.4 release. It was fully updated and rebooted before the upgrade to 7.0.

You are correct, the VM boots, but it is unable to find the hard disk. I restored the VM from backup to a new VM ID (200) and that has the same problem. Moving the disk image to an NFS share via the "move disk" button in the GUI allows the VM to boot and access the hard drive. When the VM boots with the disk on the local-lvm storage, it says that the hard drive is inaccessible. Booting off a Parted Magic ISO, GParted says "Could not stat device /dev/mapper/no block devices found - no such file or directory.", followed by "Input/Output error during read on /dev/vda". Clicking Ignore dismisses the errors, and GParted shows /dev/vda with 32GB unallocated. Any operation on the disk, like creating a partition table, gives the "Input/Output error during read on /dev/vda" error message again.
 
But the block device on the host side is there - otherwise the lvs output would look different and the VM wouldn't even start. Is there anything else in the logs surrounding the VM start? There should be more than just the network device messages from the kernel.
 
I know it is odd, but that is what is happening. The VM sees the HDD, but it cannot read/write it, yet on NFS storage it can. I am unable to locate logs, other than syslog. I posted the syslog output from when the VM started.

Please be specific as to which logs you want to see, and I will post them.
 
Is anything visible if you start the (or a) VM in the foreground?
  • stop the VM
  • run qm showcmd <VMID> --pretty
  • remove the line with -daemonize
  • run the resulting command - the VM process should stay in the foreground and print errors to the terminal (see the sketch below)
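Put together, the procedure might look like this (VMID 200 is just the example from this thread; the temp file name is arbitrary):

Code:
qm stop 200
qm showcmd 200 --pretty > /tmp/vm200-start.sh
# edit /tmp/vm200-start.sh and delete the line containing '-daemonize \'
bash /tmp/vm200-start.sh    # the VM stays in the foreground and prints errors to this terminal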

Also, are all the affected VMs using virtio? Can you try with virtio-scsi?
 
This is the only VM on the Proxmox server. Its sole purpose is to run a PBS instance. I have created additional VMs to test, and none of them can access the block device.

I have taken the VM, detached the HDD, and re-added it as SCSI, IDE, SATA and VIRTIO. Each time I went into the Options tab to make sure that the block device is in the boot order. Each time, I get the same result: an inaccessible boot device.

The VM console shows this:
SeaBIOS (version rel-1.14.0.0-g155821a1990b-prebuilt.qemu.org)
Machine UUID 60d93504-f029-4cd0-ad9e-f7e1b91a790d0
Booting from DVD/CD...
Boot failed: Could not read from CDROM (code 0003)
Booting from Hard Disk...
Boot failed: not a bootable disk
No bootable device.  Retrying in 1 seconds.

Running the VM in the foreground makes no difference; there are no errors logged. I stopped the VM via the stop button in the GUI. This is the full console capture:

[root@lonestar ~]$ /usr/bin/kvm \
  -id 200 \
  -name PBS2 \
  -no-shutdown \
  -chardev 'socket,id=qmp,path=/var/run/qemu-server/200.qmp,server=on,wait=off' \
  -mon 'chardev=qmp,mode=control' \
  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
  -mon 'chardev=qmp-event,mode=control' \
  -pidfile /var/run/qemu-server/200.pid \
  -smbios 'type=1,uuid=60d93504-f029-4cd0-ad9e-f7e1b91a70d0' \
  -smp '2,sockets=1,cores=2,maxcpus=2' \
  -nodefaults \
  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
  -vnc 'unix:/var/run/qemu-server/200.vnc,password=on' \
  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
  -m 6144 \
  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
  -device 'vmgenid,guid=61f30508-c3e1-440e-899f-a0254755a8d2' \
  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
  -chardev 'socket,path=/var/run/qemu-server/200.qga,server=on,wait=off,id=qga0' \
  -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' \
  -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' \
  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:5fb123b7e68' \
  -drive 'if=none,id=drive-ide2,media=cdrom,aio=io_uring' \
  -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=100' \
  -device 'ahci,id=ahci0,multifunction=on,bus=pci.0,addr=0x7' \
  -drive 'file=/dev/pve/vm-200-disk-1,if=none,id=drive-sata0,format=raw,cache=none,aio=io_uring,detect-zeroes=on' \
  -device 'ide-hd,bus=ahci0.0,drive=drive-sata0,id=sata0,bootindex=101' \
  -netdev 'type=tap,id=net0,ifname=tap200i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
  -device 'virtio-net-pci,mac=EE:69:FE:C9:A4:5C,netdev=net0,bus=pci.0,addr=0x12,id=net0' \
  -machine 'type=pc+pve0'
[root@lonestar ~]$
 
Can you try modifying the "aio" setting of your disk to 'native' instead of 'io_uring'?
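One way to do that, as a sketch using the VMID 200 and sata0 drive slot from the posts above (qm set rewrites the whole drive entry, so re-add any other options that line had; the change takes effect after a full stop/start of the VM):

Code:
qm set 200 --sata0 local-lvm:vm-200-disk-1,aio=native
# alternatively, append ",aio=native" to the disk line in /etc/pve/qemu-server/200.conf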
 
The problem seems to have resolved itself. While the problem existed, I was updating packages daily, and if there was a kernel update, I rebooted. After several days of reboots, the VM started to boot off the local-lvm storage. I didn't make any changes; it just started working. For the existing VM 105, I moved the disk back from NFS to local-lvm and it booted.

I am at a loss as to what it could have been, unless it was some kernel or recent package bug.
 
