PVE 8: what's wrong with starting VMs on iSCSI?

czechsys

Renowned Member
Nov 18, 2015
Hi,

there was a problem with booting (some) VMs after a PVE server reboot; the storage is a shared, multipathed iSCSI SAN.

VM log (vg-dorado-prod, bulk start after PVE reboot):
Code:
2026-04-14T04:10:06.683761+02:00 pve-14 pvesh[3655]: Starting VM 576
2026-04-14T04:10:06.690330+02:00 pve-14 pve-guests[3721]: start VM 576: UPID:pve-14:00000E89:00000F9A:69DDA1FE:qmstart:576:root@pam:
2026-04-14T04:10:06.690592+02:00 pve-14 pve-guests[3660]: <root@pam> starting task UPID:pve-14:00000E89:00000F9A:69DDA1FE:qmstart:576:root@pam:
2026-04-14T04:10:06.931990+02:00 pve-14 pve-guests[3721]: can't activate LV '/dev/vg-dorado-prod/vm-576-disk-0':   Cannot process volume group vg-dorado-prod
2026-04-14T04:10:07.695595+02:00 pve-14 pvesh[3655]: Starting VM 576 failed: can't activate LV '/dev/vg-dorado-prod/vm-576-disk-0':   Cannot process volume group vg-dorado-prod

LVM initialization:
Code:
2026-04-14T04:09:34.823727+02:00 pve-14 lvm[1016]:   2 logical volume(s) in volume group "pve" monitored
2026-04-14T04:09:34.823825+02:00 pve-14 systemd[1]: Finished lvm2-monitor.service - Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
2026-04-14T04:09:34.823907+02:00 pve-14 lvm[1336]: PV /dev/sda3 online, VG pve is complete.
2026-04-14T04:09:34.823923+02:00 pve-14 lvm[1336]: VG pve finished
...
2026-04-14T04:10:07.017143+02:00 pve-14 kernel: [   40.271247] sd 2:0:0:3: Attached scsi generic sg7 type 0
2026-04-14T04:10:07.017143+02:00 pve-14 kernel: [   40.271323] sd 2:0:0:1: [sdc] Attached SCSI disk
2026-04-14T04:10:07.018140+02:00 pve-14 kernel: [   40.271709] sd 2:0:0:3: [sde] 17179869184 512-byte logical blocks: (8.80 TB/8.00 TiB)
2026-04-14T04:10:07.018143+02:00 pve-14 kernel: [   40.271836] sd 2:0:0:3: [sde] Write Protect is off
2026-04-14T04:10:07.018143+02:00 pve-14 kernel: [   40.271842] sd 2:0:0:3: [sde] Mode Sense: 97 00 00 08
2026-04-14T04:10:07.018144+02:00 pve-14 kernel: [   40.272036] sd 2:0:0:3: [sde] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
2026-04-14T04:10:07.019139+02:00 pve-14 kernel: [   40.273049]  sdd: sdd1
2026-04-14T04:10:07.019142+02:00 pve-14 kernel: [   40.273179] sd 2:0:0:2: [sdd] Attached SCSI disk
2026-04-14T04:10:07.019695+02:00 pve-14 multipathd[1047]: sdc: failed to get udev uid: No data available
2026-04-14T04:10:07.019739+02:00 pve-14 multipathd[1047]: sdc: failed to get path uid
2026-04-14T04:10:07.019768+02:00 pve-14 multipathd[1047]: uevent trigger error
2026-04-14T04:10:07.021354+02:00 pve-14 multipathd[1047]: sdd: failed to get udev uid: No data available
2026-04-14T04:10:07.021384+02:00 pve-14 multipathd[1047]: sdd: failed to get path uid
2026-04-14T04:10:07.021410+02:00 pve-14 multipathd[1047]: uevent trigger error
2026-04-14T04:10:07.022139+02:00 pve-14 kernel: [   40.276301]  sde: sde1
2026-04-14T04:10:07.022149+02:00 pve-14 kernel: [   40.276407] sd 2:0:0:3: [sde] Attached SCSI disk
2026-04-14T04:10:07.058890+02:00 pve-14 multipathd[1047]: BLKID1: addmap [0 4294967296 multipath 1 queue_if_no_path 1 alua 1 1 round-robin 0 1 1 8:48 1]
2026-04-14T04:10:07.063198+02:00 pve-14 kernel: [   40.316632] device-mapper: multipath round-robin: version 1.2.0 loaded
2026-04-14T04:10:07.066145+02:00 pve-14 kernel: [   40.320166] alua: device handler registered
2026-04-14T04:10:07.067649+02:00 pve-14 multipathd[1047]: sdd [8:48]: path added to devmap BLKID1
2026-04-14T04:10:07.070030+02:00 pve-14 multipathd[1047]: BLKID2: addmap [0 4294967296 multipath 1 queue_if_no_path 1 alua 1 1 round-robin 0 1 1 8:32 1]
2026-04-14T04:10:07.071483+02:00 pve-14 multipathd[1047]: sdc [8:32]: path added to devmap BLKID2
2026-04-14T04:10:07.073622+02:00 pve-14 multipathd[1047]: BLKID3: addmap [0 17179869184 multipath 1 queue_if_no_path 1 alua 1 1 round-robin 0 1 1 8:64 1]
2026-04-14T04:10:07.075101+02:00 pve-14 multipathd[1047]: sde [8:64]: path added to devmap BLKID3
2026-04-14T04:10:07.088138+02:00 pve-14 kernel: [   40.341588] sd 2:0:0:2: alua: transition timeout set to 60 seconds
2026-04-14T04:10:07.091149+02:00 pve-14 kernel: [   40.344581] sd 2:0:0:1: alua: transition timeout set to 60 seconds
2026-04-14T04:10:07.094147+02:00 pve-14 kernel: [   40.347591] sd 2:0:0:3: alua: transition timeout set to 60 seconds
2026-04-14T04:10:07.135781+02:00 pve-14 lvm[3922]: PV /dev/dm-5 online, VG vg-dorado-dev is complete.
2026-04-14T04:10:07.143419+02:00 pve-14 lvm[3928]: PV /dev/dm-6 online, VG vg-dorado-test is complete.
2026-04-14T04:10:07.149034+02:00 pve-14 lvm[3929]: PV /dev/dm-7 online, VG vg-dorado-prod is complete.
2026-04-14T04:10:07.201996+02:00 pve-14 systemd[1]: Started lvm-activate-vg-dorado-dev.service - /sbin/lvm vgchange -aay --autoactivation event vg-dorado-dev.
2026-04-14T04:10:07.211527+02:00 pve-14 systemd[1]: Started lvm-activate-vg-dorado-prod.service - /sbin/lvm vgchange -aay --autoactivation event vg-dorado-prod.
2026-04-14T04:10:07.213146+02:00 pve-14 systemd[1]: Started lvm-activate-vg-dorado-test.service - /sbin/lvm vgchange -aay --autoactivation event vg-dorado-test.
2026-04-14T04:10:07.302773+02:00 pve-14 lvm[4555]: /dev/dm-45 excluded: device is an LV.
2026-04-14T04:10:07.420454+02:00 pve-14 lvm[3935]:   45 logical volume(s) in volume group "vg-dorado-dev" now active
2026-04-14T04:10:07.449776+02:00 pve-14 systemd[1]: lvm-activate-vg-dorado-dev.service: Deactivated successfully.
2026-04-14T04:10:07.678288+02:00 pve-14 lvm[6280]: /dev/dm-237 excluded: device is an LV.
2026-04-14T04:10:07.764328+02:00 pve-14 lvm[3941]:   113 logical volume(s) in volume group "vg-dorado-test" now active
2026-04-14T04:10:07.800578+02:00 pve-14 systemd[1]: lvm-activate-vg-dorado-test.service: Deactivated successfully.
2026-04-14T04:10:08.071596+02:00 pve-14 lvm[3940]:   269 logical volume(s) in volume group "vg-dorado-prod" now active
2026-04-14T04:10:08.102914+02:00 pve-14 systemd[1]: lvm-activate-vg-dorado-prod.service: Deactivated successfully.
2026-04-14T04:10:49.774879+02:00 pve-14 lvm[11197]: PV /dev/dm-6 online, VG vg-dorado-test is complete.
2026-04-14T04:10:49.775044+02:00 pve-14 lvm[11197]: VG vg-dorado-test finished
...
2026-04-14T04:10:59.067246+02:00 pve-14 lvm[11937]: PV /dev/dm-7 online, VG vg-dorado-prod is complete.
2026-04-14T04:10:59.067494+02:00 pve-14 lvm[11937]: VG vg-dorado-prod finished
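
The timestamps above show the race: qmstart for VM 576 runs at 04:10:06, but vg-dorado-prod only becomes "complete" and autoactivated at 04:10:07-08. One commonly suggested mitigation for shared iSCSI VGs (a sketch, not verified on this cluster; assumes the LVM 2.03 shipped with PVE 8, which supports the per-VG autoactivation flag) is to disable boot-time autoactivation so that PVE activates each LV on demand when the VM starts:

```shell
# Sketch: turn off boot-time autoactivation for the shared iSCSI VGs
# (vg-dorado-* are the VG names from the logs above).
# PVE activates the needed LV itself during qmstart.
vgchange --setautoactivation n vg-dorado-prod
vgchange --setautoactivation n vg-dorado-test
vgchange --setautoactivation n vg-dorado-dev

# Confirm the flag; the autoactivation column should now be empty.
vgs -o vg_name,autoactivation
```

Note this only changes *when* activation happens; it does not by itself fix VM starts that race ahead of the iSCSI login itself.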

iSCSI initialization:
Code:
2026-04-14T04:09:34.824452+02:00 pve-14 systemd[1]: Listening on iscsid.socket - Open-iSCSI iscsid Socket.
2026-04-14T04:09:39.006440+02:00 pve-14 ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl -- --may-exist add-port vmbr0 iscsi1 -- --if-exists clear port iscsi1 bond_active_slave bond_mode cvlans external_ids lacp mac other_config qos tag trunks vlan_mode -- --if-exists clear interface iscsi1 mtu_request external-ids other_config options -- set Port iscsi1 tag=VLANID1 -- set Interface iscsi1 type=internal
2026-04-14T04:09:39.013146+02:00 pve-14 kernel: [   12.267624] iscsi1: entered promiscuous mode
2026-04-14T04:09:39.035847+02:00 pve-14 ovs-vsctl: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl -- --may-exist add-port vmbr0 iscsi2 -- --if-exists clear port iscsi2 bond_active_slave bond_mode cvlans external_ids lacp mac other_config qos tag trunks vlan_mode -- --if-exists clear interface iscsi2 mtu_request external-ids other_config options -- set Port iscsi2 tag=VLANID2 -- set Interface iscsi2 type=internal
2026-04-14T04:09:39.043140+02:00 pve-14 kernel: [   12.297215] iscsi2: entered promiscuous mode
2026-04-14T04:09:39.590367+02:00 pve-14 systemd[1]: Starting open-iscsi.service - Login to default iSCSI targets...
2026-04-14T04:09:39.620659+02:00 pve-14 systemd[1]: Starting iscsid.service - iSCSI initiator daemon (iscsid)...
2026-04-14T04:09:39.638917+02:00 pve-14 iscsid: iSCSI logger with pid=2013 started!
2026-04-14T04:09:39.639537+02:00 pve-14 systemd[1]: Started iscsid.service - iSCSI initiator daemon (iscsid).
2026-04-14T04:09:39.676141+02:00 pve-14 kernel: [   12.930864] iscsi: registered transport (tcp)
2026-04-14T04:09:40.639740+02:00 pve-14 iscsid: iSCSI daemon with pid=2014 started!
2026-04-14T04:09:54.702996+02:00 pve-14 iscsiadm[1975]: iscsiadm: Could not login to [iface: default, target: iqn.2006-08.com.huawei:oceanstor:ID::20003:IP1, portal: IP1,3260].
2026-04-14T04:09:54.703412+02:00 pve-14 iscsiadm[1975]: iscsiadm: initiator reported error (8 - connection timed out)
...
2026-04-14T04:09:54.703713+02:00 pve-14 iscsiadm[1975]: iscsiadm: Could not login to [iface: default, target: iqn.2006-08.com.huawei:oceanstor:ID::1020003:IP8, portal: IP8,3260].
2026-04-14T04:09:54.703730+02:00 pve-14 iscsiadm[1975]: iscsiadm: initiator reported error (8 - connection timed out)
2026-04-14T04:09:54.703745+02:00 pve-14 iscsiadm[1975]: iscsiadm: Could not log into all portals
2026-04-14T04:09:54.703762+02:00 pve-14 iscsiadm[1975]: Logging in to [iface: default, target: iqn.2006-08.com.huawei:oceanstor:ID::20003:IP1, portal: IP1,3260]
...
2026-04-14T04:09:54.703897+02:00 pve-14 iscsiadm[1975]: Logging in to [iface: default, target: iqn.2006-08.com.huawei:oceanstor:ID::1020003:IP8, portal: IP8,3260]
2026-04-14T04:09:54.704062+02:00 pve-14 systemd[1]: open-iscsi.service: Main process exited, code=exited, status=8/n/a
2026-04-14T04:09:54.704198+02:00 pve-14 systemd[1]: open-iscsi.service: Failed with result 'exit-code'.
2026-04-14T04:09:54.704244+02:00 pve-14 systemd[1]: Failed to start open-iscsi.service - Login to default iSCSI targets.
2026-04-14T04:09:55.641371+02:00 pve-14 iscsid: Connection-1:0 to [target: iqn.2006-08.com.huawei:oceanstor:ID::20003:IP1, portal: IP1,3260] through [iface: default] is shutdown.
...
2026-04-14T04:09:55.641630+02:00 pve-14 iscsid: Connection-1:0 to [target: iqn.2006-08.com.huawei:oceanstor:ID::1020003:IP8, portal: IP8,3260] through [iface: default] is shutdown.
2026-04-14T04:10:07.642792+02:00 pve-14 iscsid: Could not set session1 priority. READ/WRITE throughout and latency could be affected.
2026-04-14T04:10:07.642920+02:00 pve-14 iscsid: Connection1:0 to [target: iqn.2006-08.com.huawei:oceanstor:ID::20003:IP6, portal: IP6,3260] through [iface: default] is operational now
...
2026-04-14T04:10:57.647907+02:00 pve-14 iscsid: Could not set session8 priority. READ/WRITE throughout and latency could be affected.
2026-04-14T04:10:57.648367+02:00 pve-14 iscsid: Connection8:0 to [target: iqn.2006-08.com.huawei:oceanstor:ID::20003:IP1, portal: IP1,3260] through [iface: default] is operational now
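
The login failures at 04:09:54 ("connection timed out") happen because iscsiadm starts trying before the OVS iscsi1/iscsi2 ports (created at 04:09:39) are passing traffic; the sessions only come up at 04:10:07 and later. A hedged iscsid.conf fragment that gives the initial logins more time during boot (the values are illustrative defaults-plus, not tuned for this SAN):

```ini
# /etc/iscsi/iscsid.conf (fragment; sketch, values are examples)
# Give the boot-time logins more retries while the network settles.
node.startup = automatic
node.conn[0].timeo.login_timeout = 15
node.session.initial_login_retry_max = 12
```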

Code:
  iscsid.service                                                                                        loaded active     running   iSCSI initiator daemon (iscsid)
● open-iscsi.service                                                                                    loaded failed     failed    Login to default iSCSI targets
  iscsid.socket                                                                                         loaded active     running   Open-iSCSI iscsid Socket
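
Since open-iscsi.service exited with status 8 ("Could not log into all portals") while the OVS ports were still coming up, one option is a systemd drop-in that orders it after the network setup. This is a sketch; the unit names (openvswitch-switch.service, networking.service) are assumptions and should be checked against this host:

```ini
# /etc/systemd/system/open-iscsi.service.d/ordering.conf (sketch)
# Assumed unit names; verify with `systemctl list-units` on this host.
[Unit]
Wants=network-online.target
After=network-online.target openvswitch-switch.service networking.service
```

After `systemctl daemon-reload`, open-iscsi should only attempt portal logins once the iscsi1/iscsi2 VLAN ports exist.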

After PVE booted, there were multiple "can't activate LV '/dev/vg-dorado-prod/vm-VMID-disk-0'" messages in the GUI task log during the bulk VM start. Almost all VMs started anyway, but a small number needed a manual start.
The cluster's iSCSI paths are configured/enabled via the PVE GUI; before that they were set up from the CLI with iscsiadm per the Proxmox documentation, so all paths are defined correctly in /etc/iscsi.
This iSCSI setup has been in use for years.


What's wrong here? Is the problem open-iscsi.service? But it looks like LVM was only fully activated after the VM starts had already begun anyway.