[SOLVED] Multipathd fails after some time (when path_checker is rdac)

Kisuke

Member
Hi,

I've run into a very strange problem.

In January I installed 6 Dell servers and set up one storage array as an experimental cluster.
The servers are directly connected to the storage array: 3 to controller A, 3 to controller B. Each server has only one path, since redundancy is not needed for an experimental cluster.

After some time (about 2 months) I found that 4 of the 6 servers could not access the shared storage. So I rebooted them, upgraded the whole cluster from Proxmox 6.0 to 6.1 (via apt-get) and set up some scripts to let me know if the error occurs again.

Yesterday (approx. a month after I fixed it the first time) the first server reported the same error. I did a little investigation and some googling, but found nothing relevant. Nothing in the storage array event log either.
Today, a second server reported the error. I did the same investigation again. The only thing I was able to find is that restarting multipathd with systemctl restart multipathd fixes the problem.
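
For reference, the check script I mean is roughly like this (just a sketch; the map alias is mine and the mail command/recipient obviously have to be adapted):
Code:
#!/bin/bash
# Rough watchdog sketch: send an alert when the multipath map no longer
# reports any "active ready" path. Run it periodically from cron.
MAP=proxmox-storage

ACTIVE=$(multipath -ll "$MAP" 2>/dev/null | grep -c 'active ready')

if [ "$ACTIVE" -eq 0 ]; then
    echo "multipath map $MAP has no active paths on $(hostname)" \
        | mail -s "multipath problem on $(hostname)" root
    # optionally also: systemctl restart multipathd
fi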

I also found this message in the log when multipathd was going down:
Code:
Apr 22 16:24:26 node04 multipathd[957]: exit (signal)
Apr 22 16:24:26 node04 multipathd[957]: uxsock: poll failed with 24
Apr 22 16:24:26 node04 systemd[1]: Stopping Device-Mapper Multipath Device Controller...

So I blame multipathd itself for the error. I found this mail where the error is mentioned in a patch: https://www.mail-archive.com/dm-devel@redhat.com/msg08954.html
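
If that 24 is an errno, it would be EMFILE ("Too many open files"), which would fit some kind of descriptor leak. A quick sanity check I can run from time to time (just a sketch, nothing official) is to compare multipathd's open descriptor count against its limit:
Code:
# Count the open file descriptors of the running multipathd and show
# its limit; a steadily growing count would point to a leak.
PID=$(pidof multipathd)
ls /proc/$PID/fd | wc -l
grep 'Max open files' /proc/$PID/limits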

This was written to journalctl when the problem began:
Code:
Apr 22 15:56:01 node04 multipathd[957]: proxmox-storage: load table [0 4679106560 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:48 1]
Apr 22 15:56:21 node04 multipathd[957]: proxmox-storage: load table [0 4679106560 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:48 1]
Apr 22 15:56:21 node04 multipathd[957]: failed getting dm events: Bad file descriptor
Apr 22 16:07:01 node04 multipathd[957]: sdd: No SAS end device for 'end_device-3:0'
Apr 22 16:07:01 node04 multipathd[957]: proxmox-storage: load table [0 4679106560 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:48 1]
Apr 22 16:07:21 node04 multipathd[957]: checker failed path 8:48 in map proxmox-storage
Apr 22 16:07:21 node04 multipathd[957]: proxmox-storage: Entering recovery mode: max_retries=30
Apr 22 16:07:21 node04 multipathd[957]: proxmox-storage: remaining active paths: 0
Apr 22 16:07:21 node04 kernel: device-mapper: multipath: Failing path 8:48.

The complete journalctl log is attached.

Output of multipath -ll when status is OK:
Code:
proxmox-storage (36782bcb00053557b000008d05e39e10b) dm-2 DELL,MD32xx
size=2.2T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw
`-+- policy='round-robin 0' prio=14 status=active
  `- 3:0:0:0 sdd 8:48 active ready running

My /etc/multipath.conf:
Code:
defaults {
    polling_interval        5
    path_selector           "round-robin 0"
    path_grouping_policy    group_by_prio
    uid_attribute           ID_SERIAL
    rr_min_io               100
    failback                immediate
    no_path_retry           30
    max_fds                 8192
    user_friendly_names     yes
    find_multipaths         no
}

blacklist {
    wwid .*
    device {
        vendor DELL.*
        product Universal.*
    }
    device {
        vendor DELL.*
        product Virtual.*
    }
}

blacklist_exceptions {
    wwid 36782bcb00053557b000008d05e39e10b
}

devices {
    device {
        vendor "DELL"
        product "MD32xxi"
        path_grouping_policy group_by_prio
        prio rdac
        path_checker rdac
        path_selector "round-robin 0"
        hardware_handler "1 rdac"
        failback immediate
        features "2 pg_init_retries 50"
        no_path_retry 30
        rr_min_io 100
    }
    device {
        vendor "DELL"
        product "MD32xx"
        path_grouping_policy group_by_prio
        prio rdac
        path_checker rdac
        path_selector "round-robin 0"
        hardware_handler "1 rdac"
        failback immediate
        features "2 pg_init_retries 50"
        no_path_retry 30
        rr_min_io 100
    }
}

multipaths {
    multipath {
        wwid 36782bcb00053557b000008d05e39e10b
        alias proxmox-storage
    }
}
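
For completeness, I check that this file is actually picked up by dumping the configuration multipathd has merged at runtime (the grep pattern is simply my device entry):
Code:
# Show the merged runtime configuration and the device section that
# matched the DELL MD32xx entry above.
multipathd show config | grep -A 12 'MD32xx'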

If anyone can point me in the right direction on how to resolve this, it would be great.

I also have another experimental cluster with the same setup, but with only 2 nodes and an HP storage array instead of Dell, where path_checker is set to tur and hardware_handler to "0", as HP recommends in the manual. That one works fine.
So maybe the problem only shows up when multipathd relies on rdac? But setting it to rdac is what the Dell guys recommend.

Maybe I will try some mix of those two configurations. But any ideas are welcome.
 

Attachments

  • journalctl.log (66.1 KB)
While I have no comment on how to help, I am wondering about this:

What is the purpose of multipathd in this configuration?

As far as I know, it's the only way to have working shared storage and live migration on this setup. Sure, in reality it's almost for nothing, mainly for testing purposes. But it can handle HA - when one node fails (for example, power is disconnected from the server) the service is started on another node automatically.
In reality it was only for testing and learning how Proxmox HA works.
 
As far as I know, it's the only way to have working shared storage and live migration on this setup
I can understand your confusion, but you have it backwards.

multipathd is necessary when you have multiple paths FROM the initiator (server) to the target (storage), e.g. two different SAS channels with access to the same disk, or iSCSI with multiple NICs.

To be able to live migrate, all that's necessary is that each of your servers can see the same storage, regardless of the number of paths. Do yourself a favor and remove multipathing support unless you intend to add more links from the server to the storage.
 

Hi, interesting. I didn't even know that was possible. I was just following Dell's guide:
https://www.dell.com/downloads/glob...ault-md-linux-device-manager-installation.pdf
Which led me to https://pve.proxmox.com/wiki/ISCSI_Multipath

I am not sure how to configure this...
Do I just leave /etc/iscsi/iscsid.conf at:
Code:
node.startup = automatic
node.session.timeo.replacement_timeout = 15
And then disable the multipath-tools and multipathd services? Or how?

This solution could save me from possible multipathd errors. (But it removes the possibility of adding multiple paths in the future, which is not crucial...)

-------------------------------------------------------

Anyway, it should work with multipathd too, so I still believe there is some kind of bug in that utility.
 
You shouldn't need to touch iscsid.conf; if you modified it from vanilla, just revert the changes. The instructions should largely be the same, but in any case you can refer to https://pve.proxmox.com/wiki/Storage:_iSCSI

As for multipath, to uninstall:
Code:
sudo apt remove multipath*
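
If you'd rather keep the package and just switch it off, something along these lines should do it too (sketch; only flush the maps once nothing on top of the mpath device is in use anymore):
Code:
# Stop and disable the daemon and its activation socket instead of
# removing the package.
systemctl disable --now multipathd.service multipathd.socket
# Flush the now-unused multipath maps so the kernel addresses the
# underlying sd* device directly (busy maps are skipped).
multipath -F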

EDIT: I just noticed you're using a Dell MD. IF it is dual controller, then you SHOULD use multipathing. The key is to have each controller on its own VLAN, and you will need to have two interfaces (physical is preferred, but virtual will do), each on a subnet matching the controller interface.

THEN your multipath can properly load balance/fail over across both those interfaces.
 


Hi,

so you think this could be the reason for my problem with multipath? So the problem might not be related to Proxmox at all?
I have 2 controllers, each with 4 SCSI ports, but I have 6 servers. Controller one is the preferred owner, yet 3 servers are connected to controller two. It still works; only after some time does the strange problem from above appear. The Dell owner's manual also mentions that a virtual disk can be accessed from the non-preferred controller.

If that's the root cause, could some SAN switch solve my issue? Sorry, I am quite a newbie in this. That's why I installed this testing cluster, for learning and experimenting.
 
If that's the root cause, could some SAN switch solve my issue?

You're using iSCSI over Ethernet (is there anything else??). You just need a switch (for HA, better two) and connect both controllers/heads to both switches (2/2), and your nodes also with two NICs to the switches (one NIC per switch). You should now see both iSCSI SAN IPs and should be able to connect (if your node network is correctly set up). This is a standard SAN setup (iSCSI or FC) with multiple heads and multiple switches. In the end, you have multiple paths and will need multipath. If one path fails, which can happen, you will always have another one.

It's crucial to know that if a request is sent on one path and the path fails while processing it, that I/O is lost and will result in an error on a higher layer. The next I/O, however, will go over another path.
 
this illustration might help:

[attached illustration: dual-controller storage with two separate networks to each server]

In this configuration, you would map each LUN created on your storage to BOTH controllers' port one, resulting in two targets on two separate networks. Each server has two interfaces, each able to see both targets presented by the storage, resulting in TWO possible paths for each LUN. Your iSCSI configuration will see those paths, resulting in 2 distinct devices being present despite both being the same actual volume; that's what the multipathing daemon intercepts: it creates a virtual device (/dev/mpathxx) that the system will actually address, and the multipath daemon controls how I/O actually travels to the storage.
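
As a rough sketch of the host side (the portal IPs are made up; substitute the addresses of your two controller ports, one per subnet/VLAN):
Code:
# Discover the iSCSI targets on both controller portals...
iscsiadm -m discovery -t sendtargets -p 192.168.130.101
iscsiadm -m discovery -t sendtargets -p 192.168.131.101
# ...and log in to everything that was discovered. The same LUN then
# shows up as two block devices (e.g. sdb and sdc), which multipathd
# folds into a single /dev/mapper/<alias> device.
iscsiadm -m node --login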
 
Hi,

thanks.

For now, I will leave it non-multipathed, with multipathd turned off, and will see how it works.

If it turns out badly, I will look at my options.

EDIT: BTW, I have the servers connected to the array with SAS cables. So if I need redundancy, I will have to buy some SAS switch.
 
BTW, I have the servers connected to the array with SAS cables.
If the storage really is an MD32xxi, the SAS ports are DRIVE CHANNEL ports; that's the wrong side to connect a server to. If the storage is SAS (MD32xx), then the Ethernet port is for OOB (out-of-band) management only and can't be used for iSCSI. If that's the case, you don't need to worry about any iSCSI configuration.
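
A quick way to double-check which transport your paths really use is lsscsi (sketch; the package may have to be installed first). SAS paths show a sas: address, iSCSI paths show the target IQN:
Code:
# List SCSI devices together with their transport identifiers.
lsscsi -t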
 
In the old days, you needed multi-headed disks (e.g. dual-port).

SAS switches solve the problem of multi-initiator access. Disks still need to be dual-ported, either by using native SAS disks or SATA with an interposer. Generally speaking, SAS switching never took off because Ethernet exists; with the current Ethernet spec reaching 800 Gbit there is little purpose for SAS switching at 12 or even 24 Gbit.
 

Thanks for the explanation.
 
Hi,

I'm reporting back with the resolution of the problem that made me create this thread.

The problem was really not in multipathd or Proxmox itself, so it's not Proxmox-related.

As it seems, the array is just not designed for the usage I was trying to achieve, as someone here suggested.
Disabling multipathd fixed the problem with servers disconnecting from the array. But after that, the array was constantly changing the virtual disk owner like crazy and then complaining that the disk is not on the preferred path. It still worked (as far as I could see), but now that was driving me crazy instead. So I just turned off two nodes and made the array available to all remaining nodes via 2 paths.




For the info, the array is an MD3200i, connected to the servers (HBA cards) with SAS cables (it acts as a bus). Not sure if I'm saying that correctly. It's these bastards:
[attached photo]
 
