How to set up multipathing on Proxmox VE?

Tony

Renowned Member
Nov 4, 2010
I have a PVE node on a Dell PowerEdge R550 connected to an MD1400 array using redundant connections. The controller is a Dell HBA355e. The disks show up twice in the OS, so I guess I must set up multipathing, but I have very little experience with this. I followed some guides I found by googling, but I get stuck at this point: `multipath -ll` shows nothing.

An easy way out would be to use only one cable to connect the MD1400 array, but it would be nice if I could make use of this redundancy.

Any hint please?

Here is a summary of what I did:

Code:
apt-get install multipath-tools
modprobe dm_multipath

cat /etc/multipath.conf
defaults {
    user_friendly_names yes
}

# the Dell HBA355e uses mpt3sas driver:
lsmod|grep mpt3sas
mpt3sas               299008  0
raid_class             16384  1 mpt3sas
scsi_transport_sas     45056  2 ses,mpt3sas

systemctl status multipath-tools.service
● multipathd.service - Device-Mapper Multipath Device Controller
     Loaded: loaded (/lib/systemd/system/multipathd.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2022-04-21 14:48:44 IST; 24min ago
TriggeredBy: ● multipathd.socket
    Process: 125396 ExecStartPre=/sbin/modprobe -a scsi_dh_alua scsi_dh_emc scsi_dh_rdac dm-multipath (code=exited, status=0/SUCCESS)
   Main PID: 125397 (multipathd)
     Status: "up"
      Tasks: 7
     Memory: 11.0M
        CPU: 79ms
     CGroup: /system.slice/multipathd.service
             └─125397 /sbin/multipathd -d -s

Apr 21 14:48:44 teima systemd[1]: Starting Device-Mapper Multipath Device Controller...
Apr 21 14:48:44 teima multipathd[125397]: --------start up--------
Apr 21 14:48:44 teima multipathd[125397]: read /etc/multipath.conf
Apr 21 14:48:44 teima multipathd[125397]: failed to increase buffer size
Apr 21 14:48:44 teima multipathd[125397]: path checkers start up
Apr 21 14:48:44 teima systemd[1]: Started Device-Mapper Multipath Device Controller.
 
Please provide the multipath config: cat /etc/multipath.conf
And the wwids file: cat /etc/multipath/wwids
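
A verbose run can also help show why no maps get created (for example, paths being blacklisted, or WWIDs not registered yet):
Code:
# very verbose; look for lines mentioning 'blacklist', 'skip' or the disk WWIDs
multipath -v3 2>&1 | grep -Ei 'blacklist|skip|wwid'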
 
Thank you @mira, I got it working in the meantime.

Here is a summary of what I did, it might be helpful to someone else:
Code:
apt-get install multipath-tools lsscsi
modprobe dm_multipath
lsscsi --scsi_id # get wwid of disks

multipath -a 35000c500d87d8953 # repeat for each wwid found from prev command
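
(If lsscsi is not available, the WWIDs can also be read with the scsi_id udev helper; a minimal sketch, assuming the disks show up as /dev/sd* and the helper lives at /lib/udev/scsi_id:)
Code:
# print device name and WWID (ID_SERIAL) for every SCSI disk;
# a disk reachable over both paths shows the same WWID twice
for dev in /dev/sd[a-z] /dev/sd[a-z][a-z]; do
    [ -b "$dev" ] || continue
    echo "$dev  $(/lib/udev/scsi_id -g -u -d "$dev")"
done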
- ensure the content of /etc/multipath.conf looks like this:
Code:
defaults {
    polling_interval        2
    path_selector           "round-robin 0"
    path_grouping_policy    multibus
    uid_attribute           ID_SERIAL
    rr_min_io               100
    failback                immediate
    no_path_retry           queue
    user_friendly_names     yes
}

blacklist {
    wwid .*
}

blacklist_exceptions {
    # use the same WWIDs found in the previous step, one wwid line per disk:
    wwid "35000c500d87d8953"
    # ...
}
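
To apply the changes, multipathd has to re-read the config and rebuild the maps (the same commands come up again later in this thread):
Code:
systemctl restart multipathd.service   # re-read /etc/multipath.conf
multipath -r                           # rebuild the multipath maps
multipath -ll                          # the mpath devices should now be listed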
 
Recently I did a reboot and 2 disks came up with different names than the rest:

Code:
  pool: zpool2
 state: ONLINE
config:

    NAME                                              STATE     READ WRITE CKSUM
    zpool2                                            ONLINE       0     0     0
      raidz2-0                                        ONLINE       0     0     0
        dm-name-mpatha                                ONLINE       0     0     0
        dm-name-mpathb                                ONLINE       0     0     0
        dm-name-mpathc                                ONLINE       0     0     0
        dm-name-mpathd                                ONLINE       0     0     0
        dm-name-mpathe                                ONLINE       0     0     0
        dm-name-mpathf                                ONLINE       0     0     0
      raidz2-1                                        ONLINE       0     0     0
        dm-name-mpathg                                ONLINE       0     0     0
        dm-name-mpathh                                ONLINE       0     0     0
        dm-name-mpathi                                ONLINE       0     0     0
        dm-name-mpathj                                ONLINE       0     0     0
        wwn-0x5000c500d8679173                        ONLINE       0     0     0
        wwn-0x5000c500d8735367                        ONLINE       0     0     0

So it seems 2 disks might be in "degraded" mode? Below is the output of multipath -ll:
Code:
multipath -ll
mpatha (35000c500d87d8953) dm-5 SEAGATE,ST20000NM004D
size=18T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 1:0:0:0  sde  8:64   active ready running
  `- 1:0:13:0 sdq  65:0   active ready running
mpathb (35000c500d875b657) dm-6 SEAGATE,ST20000NM004D
size=18T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 1:0:1:0  sdf  8:80   active ready running
  `- 1:0:14:0 sdr  65:16  active ready running
mpathc (35000c500d86fbe07) dm-7 SEAGATE,ST20000NM004D
size=18T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 1:0:10:0 sdo  8:224  active ready running
  `- 1:0:23:0 sdaa 65:160 active ready running
mpathd (35000c500d87214cf) dm-8 SEAGATE,ST20000NM004D
size=18T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 1:0:11:0 sdp  8:240  active ready running
  `- 1:0:24:0 sdab 65:176 active ready running
mpathe (35000c500d8663c5f) dm-9 SEAGATE,ST20000NM004D
size=18T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 1:0:2:0  sdg  8:96   active ready running
  `- 1:0:15:0 sds  65:32  active ready running
mpathf (35000c500d874dc27) dm-10 SEAGATE,ST20000NM004D
size=18T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 1:0:3:0  sdh  8:112  active ready running
  `- 1:0:16:0 sdt  65:48  active ready running
mpathg (35000c500d869acc3) dm-11 SEAGATE,ST20000NM004D
size=18T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 1:0:4:0  sdi  8:128  active ready running
  `- 1:0:17:0 sdu  65:64  active ready running
mpathh (35000c500d86567b3) dm-12 SEAGATE,ST20000NM004D
size=18T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 1:0:5:0  sdj  8:144  active ready running
  `- 1:0:18:0 sdv  65:80  active ready running
mpathi (35000c500d875998f) dm-13 SEAGATE,ST20000NM004D
size=18T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 1:0:6:0  sdk  8:160  active ready running
  `- 1:0:19:0 sdw  65:96  active ready running
mpathj (35000c500d8695aa3) dm-14 SEAGATE,ST20000NM004D
size=18T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 1:0:7:0  sdl  8:176  active ready running
  `- 1:0:20:0 sdx  65:112 active ready running
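
(To see why the two remaining disks got no maps, their WWIDs can be grepped out of a verbose run; the WWIDs below are the zpool wwn-0x... labels with the 0x replaced by a leading 3, matching the format of the other entries:)
Code:
multipath -v3 2>&1 | grep -i 5000c500d8679173
multipath -v3 2>&1 | grep -i 5000c500d8735367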
 
Are wwn-0x5000c500d8735367 and wwn-0x5000c500d8679173 added to the blacklist_exceptions list?
Are those part of /etc/multipath/wwids?

ZFS would tell you if those disks were degraded. It seems multipath either doesn't know about those devices or couldn't create the maps for some reason.
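
If they are missing, they can be registered the same way as the other disks; a sketch using the two WWIDs from the zpool output above:
Code:
multipath -a 35000c500d8679173   # adds the WWID to /etc/multipath/wwids
multipath -a 35000c500d8735367
multipath -r                     # rebuild the maps

They would also need matching wwid entries in the blacklist_exceptions section of /etc/multipath.conf.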
 
I created /etc/multipath.conf as instructed by the Proxmox wiki:

Code:
# cat /etc/multipath.conf
defaults {
    polling_interval        2
    path_selector           "round-robin 0"
    path_grouping_policy    multibus
    uid_attribute           ID_SERIAL
    rr_min_io               100
    failback                immediate
    no_path_retry           queue
    user_friendly_names     yes
}

blacklist {
    wwid .*
}

blacklist_exceptions {
    wwid "35000c500d87d8953"
    wwid "35000c500d875b657"
    wwid "35000c500d8663c5f"
    wwid "35000c500d874dc27"
    wwid "35000c500d869acc3"
    wwid "35000c500d86567b3"
    wwid "35000c500d875998f"
    wwid "35000c500d8695aa3"
    wwid "35000c500d8679173"
    wwid "35000c500d8735367"
    wwid "35000c500d86fbe07"
    wwid "35000c500d87214cf"
}

and here is /etc/multipath/wwids
Code:
# cat /etc/multipath/wwids
# Multipath wwids, Version : 1.0
# NOTE: This file is automatically maintained by multipath and multipathd.
# You should not need to edit this file in normal circumstances.
#
# Valid WWIDs:
/35000c500d87d8953/
/35000c500d875b657/
/35000c500d8663c5f/
/35000c500d874dc27/
/35000c500d869acc3/
/35000c500d86567b3/
/35000c500d875998f/
/35000c500d8695aa3/
/35000c500d8679173/
/35000c500d8735367/
/35000c500d86fbe07/
/35000c500d87214cf/

Some additional info: this is an MD1400 disk array, connected to a PowerEdge R550 via the Dell HBA355e controller in redundant mode (2 ports + 2 cables to the MD array). I recently rebooted the server in order to install 4x SSDs into the internal bays. I am not sure whether this issue is caused by these 4 new SSDs.
 
Thank you for the output. This looks fine.

Could you provide the output of lsblk and multipath -v3?
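
The outputs can be redirected to files for attaching, e.g.:
Code:
lsblk > lsblk.txt
multipath -v3 > multipath-v3.txt 2>&1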
 
I attached the files, since they are too large to be inserted.

Thank you for looking at this; very much appreciated.
 

Attachments

It seems the device was busy when multipath tried to create a map:
Code:
Apr 28 17:47:23 | mpathk: addmap [0 39063650304 multipath 1 queue_if_no_path 0 1 1 round-robin 0 2 1 8:192 1 65:128 1]
Apr 28 17:47:23 | libdevmapper: ioctl/libdm-iface.c(1927): device-mapper: reload ioctl on mpathk (253:15) failed: Device or resource busy
Apr 28 17:47:23 | dm_addmap: libdm task=0 error: Success
Apr 28 17:47:23 | mpathk: failed to load map, error 16
Apr 28 17:47:23 | mpathk: domap (0) failure for create/reload map
Apr 28 17:47:23 | mpathk: ignoring map

Does it work when you restart multipathd? systemctl restart multipathd.service
multipath -r could also work instead.

If the commands above don't work, please provide the journal of the last boot: journalctl -b > journal.txt
 
I ran the above commands, but they didn't seem to fix the problem. I attached the journal of the last boot below.
 

Attachments

It seems it was a mistake to create the pool with multipath names like dm-name-mpatha. Perhaps a better way is to use something like dm-uuid-mpath-35000c500d87d8953?

Or is it better to skip multipath altogether? I thought it was a good idea since it can increase the robustness of the setup. But now I fear that it adds more complexity and increases the chance something can go wrong.
 
It seems it was a mistake to create the pool with multipath names like dm-name-mpatha. Perhaps a better way is to use something like dm-uuid-mpath-35000c500d87d8953?
It's not a mistake per se, more a matter of preference. We never use "friendly names".
https://ubuntu.com/server/docs/device-mapper-multipathing-introduction
Code:
When the user_friendly_names configuration option is set to yes, the name of the multipath device is set to mpathn. When new devices are brought under the control of multipath, the new devices may be seen in two different places under the /dev directory: /dev/mapper/mpathn and /dev/dm-n.

The devices in /dev/mapper are created early in the boot process. Use these devices to access the multipathed devices.

Any devices of the form /dev/dm-n are for internal use only and should never be used directly.
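
With user_friendly_names set to no, the maps in /dev/mapper are named by their WWID instead of mpathN, which keeps the names consistent across hosts and reboots; a minimal sketch of just that setting (not a complete config):
Code:
defaults {
    user_friendly_names no   # maps show up as e.g. /dev/mapper/35000c500d87d8953
}

The dm-uuid-mpath-<WWID> labels mentioned above are derived from the WWID rather than the map name, so they should be stable with either setting.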

Or is it better to skip multipath altogether? I thought it was a good idea since it can increase the robustness of the setup.
Multipath is great; no production system should run without it.

But now I fear that it adds more complexity and increases the chance something can go wrong.
This can be said of any technology, including PVE...
It's a matter of your commitment to setting things up properly, testing, support, and risk tolerance.


Blockbridge: Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Can it be the case that ZFS tries to import the disks and this causes multipath to fail? In the log I see that, right before the mpath device fails (mpathk), there is the message: cannot import 'zpool2': no such pool or dataset

Code:
Apr 28 15:57:45 teima multipathd[1563]: mpathj: addmap [0 39063650304 multipath 1 queue_if_no_path 0 1 1 round-robin 0 2 1 8:176 1 65:112 1]
Apr 28 15:57:45 teima multipath[1795]: mpathi: adding new path sdk
Apr 28 15:57:45 teima multipath[1795]: mpathi: adding new path sdw
Apr 28 15:57:45 teima zpool[1538]: cannot import 'zpool2': no such pool or dataset
Apr 28 15:57:45 teima multipathd[1563]: sdm: No SAS end device for 'end_device-1:0'
Apr 28 15:57:45 teima multipathd[1563]: sdy: No SAS end device for 'end_device-1:1'
Apr 28 15:57:45 teima multipathd[1563]: mpathk: addmap [0 39063650304 multipath 1 queue_if_no_path 0 1 1 round-robin 0 2 1 8:192 1 65:128 1]
Apr 28 15:57:45 teima multipathd[1563]: libdevmapper: ioctl/libdm-iface.c(1927): device-mapper: reload ioctl on mpathk (253:15) failed: Device or resource busy
Apr 28 15:57:45 teima kernel: device-mapper: table: 253:15: multipath: error getting device
Apr 28 15:57:45 teima kernel: device-mapper: ioctl: error adding target to table
Apr 28 15:57:45 teima multipath[1902]: mpathj: adding new path sdl
Apr 28 15:57:45 teima multipath[1902]: mpathj: adding new path sdx
Apr 28 15:57:45 teima multipathd[1563]: dm_addmap: libdm task=0 error: Success

BTW, if I reboot, is there a risk that ZFS would fail to import the pool?
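
(If it really is a start-up race between the ZFS import and multipathd, one way to express the ordering would be a systemd drop-in; this is only a sketch, assuming the pool is imported by zfs-import-cache.service:)
Code:
# /etc/systemd/system/zfs-import-cache.service.d/wait-for-multipath.conf
# hypothetical drop-in; it only orders service start-up and does not by itself
# guarantee that all multipath maps are already assembled
[Unit]
After=multipathd.service
Wants=multipathd.service

followed by a systemctl daemon-reload.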
 
update (for those who might be in a similar situation):
Code:
# export pool (make sure no datasets are in use):
zpool export zpool2

# restart multipathd:
systemctl restart multipathd.service

# check mapper devs:
ls -l /dev/mapper # ensure that all mpath devs are present

# import pool:
zpool import -d /dev/mapper zpool2

After these steps, things look as expected. I did a reboot to verify, too.
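
A quick way to verify the result afterwards:
Code:
zpool status zpool2   # all vdevs should now show up under their multipath names
multipath -ll         # each map should list two active paths
ls -l /dev/mapper     # one mpath device per physical disk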
 
Thanks for the update, but my OS had this persistent error: libdevmapper: ioctl/libdm-iface.c(1927): device-mapper: reload ioctl on 35000c50030256957 (253:258) failed: Device or resource busy

I just backed up my /etc directory, reinstalled the OS, and everything worked well. Thanks
 
update (for those who might be in a similar situation):
Code:
# export pool (make sure no datasets are in use):
zpool export zpool2

# restart multipathd:
systemctl restart multipathd.service

# check mapper devs:
ls -l /dev/mapper # ensure that all mpath devs are present

# import pool:
zpool import -d /dev/mapper zpool2

After these steps, things look as expected. I did a reboot to verify, too.
After doing so, I'm getting this error:

Code:
root@prx:~# zpool export zpool2
cannot open 'zpool2': no such pool
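
(That error usually just means there is no imported pool with that name; 'zpool2' was the pool name from the quoted post and has to be replaced with your own. A quick check, with <poolname> as a placeholder:)
Code:
zpool list                               # pools that are currently imported
zpool import                             # exported pools available for import
zpool import -d /dev/mapper <poolname>   # import using the multipath device names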
 
