Upgrade to PVE 6 - multipath error - "Device or resource busy"

hello world,

I had a Proxmox 5 cluster with 3 SAN storage bays. When I tried an upgrade to PVE 6, one of the bays was no longer visible under multipath :-(


So, to avoid any risk to my existing infrastructure, I decided to create a new Proxmox 6 cluster from scratch (ISO 6.2-1).

I installed 3 new servers with no real problem, but on the fourth I can't activate this LUN under multipath.

(This server was previously running Proxmox 5 with the same LUN, so the hardware and the LUN config on the SAN are correct.)
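For what it's worth, the WWID reported by the paths matches the alias entry in my multipath.conf. A quick way to double-check (the ID below is the one you can see in the log further down):

/lib/udev/scsi_id -g -u -d /dev/sdc    # should print 3600a0b80002ab136000001865a8d28ba
grep -A 2 3600a0b80002ab136000001865a8d28ba /etc/multipath.conf    # the multipaths entry with alias mpath50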


My problem is with the disks sdc and sdd (the 2 paths of my LUN):

multipath -v4
Dec 05 19:50:05 | sdc: udev property ID_WWN whitelisted
Dec 05 19:50:05 | 3600a0b80002ab136000001865a8d28ba: alias = mpath50 (setting: multipath.conf multipaths section)
Dec 05 19:50:05 | sdd: ownership set to mpath50
Dec 05 19:50:05 | sdd: dev not found in pathvec
Dec 05 19:50:05 | sdd: udev property ID_WWN whitelisted
Dec 05 19:50:05 | sdd: mask = 0xc
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.0/host1/rport-1:0-1/target1:0:1/1:0:1:50/state'
Dec 05 19:50:05 | sdd: path state = running
Dec 05 19:50:05 | sdd: get_state
Dec 05 19:50:05 | sdd: detect_checker = yes (setting: multipath internal)
Dec 05 19:50:05 | sdd: path_checker = rdac (setting: storage device autodetected)
Dec 05 19:50:05 | sdd: checker timeout = 30 s (setting: kernel sysfs)
Dec 05 19:50:05 | sdd: rdac state = up
Dec 05 19:50:05 | sdd: detect_prio = yes (setting: multipath internal)
Dec 05 19:50:05 | sdd: prio = rdac (setting: storage device configuration)
Dec 05 19:50:05 | sdd: prio args = "" (setting: storage device configuration)
Dec 05 19:50:05 | sdd: rdac prio = 6
Dec 05 19:50:05 | sdc: ownership set to mpath50
Dec 05 19:50:05 | sdc: dev not found in pathvec
Dec 05 19:50:05 | sdc: udev property ID_WWN whitelisted
Dec 05 19:50:05 | sdc: mask = 0xc
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.1/host8/rport-8:0-0/target8:0:0/8:0:0:50/state'
Dec 05 19:50:05 | sdc: path state = running
Dec 05 19:50:05 | sdc: get_state
Dec 05 19:50:05 | sdc: detect_checker = yes (setting: multipath internal)
Dec 05 19:50:05 | sdc: path_checker = rdac (setting: storage device autodetected)
Dec 05 19:50:05 | sdc: checker timeout = 30 s (setting: kernel sysfs)
Dec 05 19:50:05 | sdc: rdac state = ghost
Dec 05 19:50:05 | sdc: detect_prio = yes (setting: multipath internal)
Dec 05 19:50:05 | sdc: prio = rdac (setting: storage device configuration)
Dec 05 19:50:05 | sdc: prio args = "" (setting: storage device configuration)
Dec 05 19:50:05 | sdc: rdac prio = 1
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.0/host1/rport-1:0-1/target1:0:1/1:0:1:50/block/sdd/dev'
Dec 05 19:50:05 | mpath50: verified path sdd dev_t 8:48
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.1/host8/rport-8:0-0/target8:0:0/8:0:0:50/block/sdc/dev'
Dec 05 19:50:05 | mpath50: verified path sdc dev_t 8:32
Dec 05 19:50:05 | mpath50: failback = "immediate" (setting: storage device configuration)
Dec 05 19:50:05 | mpath50: path_grouping_policy = group_by_prio (setting: storage device configuration)
Dec 05 19:50:05 | mpath50: path_selector = "service-time 0" (setting: multipath internal)
Dec 05 19:50:05 | mpath50: no_path_retry = 30 (setting: storage device configuration)
Dec 05 19:50:05 | mpath50: retain_attached_hw_handler = yes (setting: implied in kernel >= 4.3.0)
Dec 05 19:50:05 | mpath50: features = "2 pg_init_retries 50" (setting: storage device configuration)
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.0/host1/rport-1:0-1/target1:0:1/1:0:1:50/dh_state'
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.1/host8/rport-8:0-0/target8:0:0/8:0:0:50/dh_state'
Dec 05 19:50:05 | mpath50: hardware_handler = "1 rdac" (setting: storage device configuration)
Dec 05 19:50:05 | mpath50: rr_weight = "uniform" (setting: multipath internal)
Dec 05 19:50:05 | mpath50: minio = 1 (setting: multipath internal)
Dec 05 19:50:05 | mpath50: fast_io_fail_tmo = 5 (setting: multipath internal)
Dec 05 19:50:05 | mpath50: deferred_remove = no (setting: multipath internal)
Dec 05 19:50:05 | mpath50: delay_watch_checks = "no" (setting: multipath internal)
Dec 05 19:50:05 | mpath50: delay_wait_checks = "no" (setting: multipath internal)
Dec 05 19:50:05 | mpath50: marginal_path_err_sample_time = "no" (setting: multipath internal)
Dec 05 19:50:05 | mpath50: marginal_path_err_rate_threshold = "no" (setting: multipath internal)
Dec 05 19:50:05 | mpath50: marginal_path_err_recheck_gap_time = "no" (setting: multipath internal)
Dec 05 19:50:05 | mpath50: marginal_path_double_failed_time = "no" (setting: multipath internal)
Dec 05 19:50:05 | mpath50: skip_kpartx = no (setting: multipath internal)
Dec 05 19:50:05 | mpath50: ghost_delay = "no" (setting: multipath.conf defaults/devices section)
Dec 05 19:50:05 | mpath50: flush_on_last_del = no (setting: multipath internal)
Dec 05 19:50:05 | mpath50: update dev_loss_tmo to 150
Dec 05 19:50:05 | target1:0:1 -> rport-1:0-1
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.0/host1/rport-1:0-1/fc_remote_ports/rport-1:0-1/dev_loss_tmo'
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.0/host1/rport-1:0-1/fc_remote_ports/rport-1:0-1/fast_io_fail_tmo'
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.0/host1/rport-1:0-1/fc_remote_ports/rport-1:0-1/dev_loss_tmo'
Dec 05 19:50:05 | target8:0:0 -> rport-8:0-0
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.1/host8/rport-8:0-0/fc_remote_ports/rport-8:0-0/dev_loss_tmo'
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.1/host8/rport-8:0-0/fc_remote_ports/rport-8:0-0/fast_io_fail_tmo'
Dec 05 19:50:05 | open '/sys/devices/pci0000:40/0000:40:03.0/0000:42:00.1/host8/rport-8:0-0/fc_remote_ports/rport-8:0-0/dev_loss_tmo'
Dec 05 19:50:05 | mpath50: assembled map [3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 service-time 0 1 1 8:48 1 service-time 0 1 1 8:32 1]
Dec 05 19:50:05 | mpath50: set ACT_CREATE (map does not exist)
Dec 05 19:50:05 | mpath50: addmap [0 5242880000 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 2 1 service-time 0 1 1 8:48 1 service-time 0 1 1 8:32 1]

Dec 05 19:50:05 | libdevmapper: ioctl/libdm-iface.c(1923): device-mapper: reload ioctl on mpath50 failed: Device or resource busy

Dec 05 19:50:05 | mpath50: failed to load map, error 16
Dec 05 19:50:05 | Initialized new file [/dev/shm/multipath/failed_wwids/.lock]
Dec 05 19:50:05 | mpath50: domap (0) failure for create/reload map
Dec 05 19:50:05 | mpath50: ignoring map
Dec 05 19:50:05 | mpath50: remove multipath map
Dec 05 19:50:05 | sdd: orphan path, map flushed
Dec 05 19:50:05 | rdac prioritizer refcount 2
Dec 05 19:50:05 | rdac checker refcount 2
Dec 05 19:50:05 | sdc: orphan path, map flushed
Dec 05 19:50:05 | rdac prioritizer refcount 1
Dec 05 19:50:05 | rdac checker refcount 1
Dec 05 19:50:05 | unloading rdac prioritizer
Dec 05 19:50:05 | unloading const prioritizer
Dec 05 19:50:05 | unloading rdac checker
Dec 05 19:50:05 | unloading tur checker
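Error 16 is EBUSY, so something must already hold the disks. These standard commands should show any existing device-mapper map or process sitting on them (a sketch, the output obviously depends on the box):

dmsetup info -c                        # list all existing device-mapper maps
dmsetup table | grep -E '8:32|8:48'    # is any map already built on sdc (8:32) or sdd (8:48)?
fuser -v /dev/sdc /dev/sdd             # any process with the block devices open?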


root@pveproj1:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 136.1G 0 disk
<snip>
sdc 8:32 0 2.5T 0 disk
sdd 8:48 0 2.5T 0 disk
└─sdd1 8:49 0 2.5T 0 part
  ├─ST10_VD1_VG1-vm--109--disk--1 253:0 0 41G 0 lvm
  ├─ST10_VD1_VG1-vm--104--disk--1 253:1 0 50.3G 0 lvm
  ├─ST10_VD1_VG1-vm--109--disk--2 253:2 0 60G 0 lvm
  ├─ST10_VD1_VG1-vm--103--disk--1 253:3 0 40.5G 0 lvm
  ├─ST10_VD1_VG1-vm--103--disk--2 253:4 0 51.5G 0 lvm
  ├─ST10_VD1_VG1-vm--112--disk--1 253:5 0 95G 0 lvm
  ├─ST10_VD1_VG1-vm--110--disk--1 253:6 0 40G 0 lvm
  ├─ST10_VD1_VG1-vm--117--disk--0 253:7 0 97.7G 0 lvm
  ├─ST10_VD1_VG1-vm--120--disk--0 253:8 0 80G 0 lvm
  ├─ST10_VD1_VG1-vm--122--disk--0 253:9 0 100G 0 lvm
  ├─ST10_VD1_VG1-vm--121--disk--0 253:10 0 60G 0 lvm
  ├─ST10_VD1_VG1-vm--131--disk--0 253:11 0 32G 0 lvm
  └─ST10_VD1_VG1-vm--114--disk--0 253:12 0 160G 0 lvm
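The kernel's view of what sits on top of the partition can also be listed via sysfs (sketch; if this prints dm-* entries, LVM has sdd1 open, which would match the busy device):

ls /sys/block/sdd/sdd1/holders/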


- I tried to temporarily exclude the disks from LVM's config (via "global_filter" in /etc/lvm/lvm.conf; see the snippet after this list), but with no better result.

- lsof shows no use of the disks.

- I stopped the VMs on this LUN on the other servers to test, but it's no better.
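For reference, the filter I mean is something like this in the devices section of /etc/lvm/lvm.conf (quoting from memory, so treat the exact regex as approximate):

global_filter = [ "r|/dev/sdc|", "r|/dev/sdd|", "a|.*|" ]

followed by "pvscan --cache" to refresh LVM's device cache.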


I really don't understand why, on this server, the multipath daemon can't activate this LUN, and why the devices appear busy :-(

Any suggestion is welcome

Best regards,
GE
 
