Hello world,
I had a Proxmox 5 cluster with 3 SAN storage bays. When I tried to upgrade to PVE 6, one of the bays was no longer visible under multipath :-(
So, to avoid any risk to my existing infrastructure, I decided to create a new Proxmox 6 cluster from scratch (ISO 6.2-1).
I installed 3 new servers with no real problem, but on the fourth I can't activate this LUN under multipath.
(This server previously ran Proxmox 5 with the same LUN, so the hardware and the LUN configuration on the SAN are correct.)
My problem is with the disks sdc and sdd (the 2 paths to my LUN).
multipath -v4
Dec 05 19:50:05 | sdc: udev property ID_WWN whitelisted
Dec 05 19:50:05 | 3600a0b80002ab136000001865a8d28ba: alias = mpath50 (setting: multipath.conf multipaths section)
Dec 05 19:50:05 | sdd: ownership set to mpath50
Dec 05 19:50:05 | sdd: dev not found in pathvec
Dec 05 19:50:05 | sdd: udev property ID_WWN whitelisted
Dec 05 19:50:05 | sdd: mask = 0xc
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.0/host1/rport-1:0-1/target1:0:1/1:0:1:50/state'
Dec 05 19:50:05 | sdd: path state = running
Dec 05 19:50:05 | sdd: get_state
Dec 05 19:50:05 | sdd: detect_checker = yes (setting: multipath internal)
Dec 05 19:50:05 | sdd: path_checker = rdac (setting: storage device autodetected)
Dec 05 19:50:05 | sdd: checker timeout = 30 s (setting: kernel sysfs)
Dec 05 19:50:05 | sdd: rdac state = up
Dec 05 19:50:05 | sdd: detect_prio = yes (setting: multipath internal)
Dec 05 19:50:05 | sdd: prio = rdac (setting: storage device configuration)
Dec 05 19:50:05 | sdd: prio args = "" (setting: storage device configuration)
Dec 05 19:50:05 | sdd: rdac prio = 6
Dec 05 19:50:05 | sdc: ownership set to mpath50
Dec 05 19:50:05 | sdc: dev not found in pathvec
Dec 05 19:50:05 | sdc: udev property ID_WWN whitelisted
Dec 05 19:50:05 | sdc: mask = 0xc
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.1/host8/rport-8:0-0/target8:0:0/8:0:0:50/state'
Dec 05 19:50:05 | sdc: path state = running
Dec 05 19:50:05 | sdc: get_state
Dec 05 19:50:05 | sdc: detect_checker = yes (setting: multipath internal)
Dec 05 19:50:05 | sdc: path_checker = rdac (setting: storage device autodetected)
Dec 05 19:50:05 | sdc: checker timeout = 30 s (setting: kernel sysfs)
Dec 05 19:50:05 | sdc: rdac state = ghost
Dec 05 19:50:05 | sdc: detect_prio = yes (setting: multipath internal)
Dec 05 19:50:05 | sdc: prio = rdac (setting: storage device configuration)
Dec 05 19:50:05 | sdc: prio args = "" (setting: storage device configuration)
Dec 05 19:50:05 | sdc: rdac prio = 1
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.0/host1/rport-1:0-1/target1:0:1/1:0:1:50/block/sdd/dev'
Dec 05 19:50:05 | mpath50: verified path sdd dev_t 8:48
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.1/host8/rport-8:0-0/target8:0:0/8:0:0:50/block/sdc/dev'
Dec 05 19:50:05 | mpath50: verified path sdc dev_t 8:32
Dec 05 19:50:05 | mpath50: failback = "immediate" (setting: storage device configuration)
Dec 05 19:50:05 | mpath50: path_grouping_policy = group_by_prio (setting: storage device configuration)
Dec 05 19:50:05 | mpath50: path_selector = "service-time 0" (setting: multipath internal)
Dec 05 19:50:05 | mpath50: no_path_retry = 30 (setting: storage device configuration)
Dec 05 19:50:05 | mpath50: retain_attached_hw_handler = yes (setting: implied in kernel >= 4.3.0)
Dec 05 19:50:05 | mpath50: features = "2 pg_init_retries 50" (setting: storage device configuration)
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.0/host1/rport-1:0-1/target1:0:1/1:0:1:50/dh_state'
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.1/host8/rport-8:0-0/target8:0:0/8:0:0:50/dh_state'
Dec 05 19:50:05 | mpath50: hardware_handler = "1 rdac" (setting: storage device configuration)
Dec 05 19:50:05 | mpath50: rr_weight = "uniform" (setting: multipath internal)
Dec 05 19:50:05 | mpath50: minio = 1 (setting: multipath internal)
Dec 05 19:50:05 | mpath50: fast_io_fail_tmo = 5 (setting: multipath internal)
Dec 05 19:50:05 | mpath50: deferred_remove = no (setting: multipath internal)
Dec 05 19:50:05 | mpath50: delay_watch_checks = "no" (setting: multipath internal)
Dec 05 19:50:05 | mpath50: delay_wait_checks = "no" (setting: multipath internal)
Dec 05 19:50:05 | mpath50: marginal_path_err_sample_time = "no" (setting: multipath internal)
Dec 05 19:50:05 | mpath50: marginal_path_err_rate_threshold = "no" (setting: multipath internal)
Dec 05 19:50:05 | mpath50: marginal_path_err_recheck_gap_time = "no" (setting: multipath internal)
Dec 05 19:50:05 | mpath50: marginal_path_double_failed_time = "no" (setting: multipath internal)
Dec 05 19:50:05 | mpath50: skip_kpartx = no (setting: multipath internal)
Dec 05 19:50:05 | mpath50: ghost_delay = "no" (setting: multipath.conf defaults/devices section)
Dec 05 19:50:05 | mpath50: flush_on_last_del = no (setting: multipath internal)
Dec 05 19:50:05 | mpath50: update dev_loss_tmo to 150
Dec 05 19:50:05 | target1:0:1 -> rport-1:0-1
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.0/host1/rport-1:0-1/fc_remote_ports/rport-1:0-1/dev_loss_tmo'
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.0/host1/rport-1:0-1/fc_remote_ports/rport-1:0-1/fast_io_fail_tmo'
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.0/host1/rport-1:0-1/fc_remote_ports/rport-1:0-1/dev_loss_tmo'
Dec 05 19:50:05 | target8:0:0 -> rport-8:0-0
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.1/host8/rport-8:0-0/fc_remote_ports/rport-8:0-0/dev_loss_tmo'
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.1/host8/rport-8:0-0/fc_remote_ports/rport-8:0-0/fast_io_fail_tmo'
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.1/host8/rport-8:0-0/fc_remote_ports/rport-8:0-0/dev_loss_tmo'
Dec 05 19:50:05 | mpath50: assembled map [3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 service-time 0 1 1 8:48 1 service-time 0 1 1 8:32 1]
Dec 05 19:50:05 | mpath50: set ACT_CREATE (map does not exist)
Dec 05 19:50:05 | mpath50: addmap [0 5242880000 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 service-time 0 1 1 8:48 1 service-time 0 1 1 8:32 1]
Dec 05 19:50:05 | libdevmapper: ioctl/libdm-iface.c(1923): device-mapper: reload ioctl on mpath50 failed: Device or resource busy
Dec 05 19:50:05 | mpath50: failed to load map, error 16
Dec 05 19:50:05 | Initialized new file [/dev/shm/multipath/failed_wwids/.lock]
Dec 05 19:50:05 | mpath50: domap (0) failure for create/reload map
Dec 05 19:50:05 | mpath50: ignoring map
Dec 05 19:50:05 | mpath50: remove multipath map
Dec 05 19:50:05 | sdd: orphan path, map flushed
Dec 05 19:50:05 | rdac prioritizer refcount 2
Dec 05 19:50:05 | rdac checker refcount 2
Dec 05 19:50:05 | sdc: orphan path, map flushed
Dec 05 19:50:05 | rdac prioritizer refcount 1
Dec 05 19:50:05 | rdac checker refcount 1
Dec 05 19:50:05 | unloading rdac prioritizer
Dec 05 19:50:05 | unloading const prioritizer
Dec 05 19:50:05 | unloading rdac checker
Dec 05 19:50:05 | unloading tur checker
root@pveproj1:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 136.1G 0 disk
<snip>
sdc 8:32 0 2.5T 0 disk
sdd 8:48 0 2.5T 0 disk
└─sdd1 8:49 0 2.5T 0 part
  ├─ST10_VD1_VG1-vm--109--disk--1 253:0 0 41G 0 lvm
  ├─ST10_VD1_VG1-vm--104--disk--1 253:1 0 50.3G 0 lvm
  ├─ST10_VD1_VG1-vm--109--disk--2 253:2 0 60G 0 lvm
  ├─ST10_VD1_VG1-vm--103--disk--1 253:3 0 40.5G 0 lvm
  ├─ST10_VD1_VG1-vm--103--disk--2 253:4 0 51.5G 0 lvm
  ├─ST10_VD1_VG1-vm--112--disk--1 253:5 0 95G 0 lvm
  ├─ST10_VD1_VG1-vm--110--disk--1 253:6 0 40G 0 lvm
  ├─ST10_VD1_VG1-vm--117--disk--0 253:7 0 97.7G 0 lvm
  ├─ST10_VD1_VG1-vm--120--disk--0 253:8 0 80G 0 lvm
  ├─ST10_VD1_VG1-vm--122--disk--0 253:9 0 100G 0 lvm
  ├─ST10_VD1_VG1-vm--121--disk--0 253:10 0 60G 0 lvm
  ├─ST10_VD1_VG1-vm--131--disk--0 253:11 0 32G 0 lvm
  └─ST10_VD1_VG1-vm--114--disk--0 253:12 0 160G 0 lvm
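The lsblk output above lists the ST10_VD1_VG1 logical volumes under sdd1 rather than under a multipath device. One possible reading, which is an assumption and not something confirmed here, is that these device-mapper volumes are what keep sdd busy. Such kernel-level holders are not process file handles, so they would not appear in lsof. A minimal sketch for listing them from sysfs follows; the function name list_holders and its optional second argument (an alternate sysfs root, useful only for testing) are hypothetical.

```shell
#!/bin/sh
# list_holders DEV [SYSFS_ROOT]
# Print every device-mapper (or md) device that holds DEV or one of its
# partitions open, as reported by the kernel in <sysfs>/block/DEV/.../holders.
# SYSFS_ROOT defaults to /sys; the parameter is an assumption added for testing.
list_holders() {
    dev=$1
    root=${2:-/sys}
    # Check the whole disk first, then each partition directory (sdd1, ...).
    for dir in "$root/block/$dev" "$root/block/$dev"/"$dev"*; do
        [ -d "$dir/holders" ] || continue
        for h in "$dir/holders"/*; do
            [ -e "$h" ] || continue          # empty holders directory
            name=$(basename "$h")
            # For dm devices, also show the mapped name (e.g. an LV name).
            if [ -r "$h/dm/name" ]; then
                name="$name ($(cat "$h/dm/name"))"
            fi
            echo "$(basename "$dir"): held by $name"
        done
    done
}

# Example call on the affected server (no output means no holders):
#   list_holders sdd
#   list_holders sdc
```

If this prints LVM volumes for sdd1, the "Device or resource busy" error from the reload ioctl would be consistent with the paths already being claimed before multipath could build mpath50.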
- I tried temporarily excluding the disks in LVM's configuration (via "global_filter" in /etc/lvm/lvm.conf), but the result was the same.
- lsof shows no use of the disks.
- I stopped the VMs that use this LUN on the other servers as a test, but it made no difference.
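For reference, the kind of "global_filter" exclusion mentioned in the first point could look like the fragment below. This is only an illustration under assumptions: the exact filter used is not shown in this post, and the device names sdc and sdd are the ones on this server and are not guaranteed to be stable across reboots.

```
# /etc/lvm/lvm.conf (fragment, illustrative only)
devices {
    # Reject the two raw SAN paths so that LVM does not scan them directly,
    # then accept everything else (local disks, /dev/mapper devices).
    global_filter = [ "r|/dev/sdc.*|", "r|/dev/sdd.*|", "a|.*|" ]
}
```

Note that a filter only affects later LVM scans. Logical volumes that are already active on sdd1 stay active until the volume group is deactivated, so changing the filter alone may not release the devices.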
I really don't understand why the multipath daemon can't activate this LUN on this server, or why the devices appear busy :-(
Any suggestions are welcome.
Best regards,
GE