iSCSI Settings on Proxmox with multipath

starnetwork

Renowned Member
Dec 8, 2009
Hello everyone,
we have an IBM FlashSystem 5200 storage array with 2 controllers; each controller has 2x 25Gbps connections.
The storage exports 8 volumes, so I have 8 WWIDs.
Each WWID is presented 4 times, once on each of the 4 ports,
for a total of 32 devices under sdXX.
On Proxmox we mapped the 8 volumes through multipath as mpath1 / mpath2 / mpath3, etc...
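For reference, a quick way to sanity-check that layout against the multipath output (counts assume all paths are healthy):
Code:
# expect 8 multipath maps and 32 total paths
multipath -ll | grep -c 'IBM,2145'       # -> 8 maps
multipath -ll | grep -c 'active ready'   # -> 32 paths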

Now, the problem: each time we upgrade the storage firmware, the multipath connections freeze and all servers are down until we reboot the node.
From the storage side, and according to IBM support, everything looks OK.

My feeling is that something is wrong with the multipath settings. Is anyone familiar with this kind of issue?

Kind Regards,
 
sure, here:
Code:
# multipath -ll
mpath1 (000000000000019) dm-32 IBM,2145
size=20T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 5:0:0:0  sdaq    66:160 active ready running
| `- 6:0:0:0  sdar    66:176 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 3:0:0:0  sdaa    65:160 active ready running
  `- 4:0:0:0  sdai    66:32  active ready running
mpath2 (00000000000001a) dm-33 IBM,2145
size=20T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 3:0:0:1  sdab    65:176 active ready running
| `- 4:0:0:1  sdaj    66:48  active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 5:0:0:1  sdas    66:192 active ready running
  `- 6:0:0:1  sdat    66:208 active ready running
mpath3 (00000000000001b) dm-35 IBM,2145
size=20T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 5:0:0:2  sdau    66:224 active ready running
| `- 6:0:0:2  sdav    66:240 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 3:0:0:2  sdac    65:192 active ready running
  `- 4:0:0:2  sdak    66:64  active ready running
mpath4 (00000000000001c) dm-34 IBM,2145
size=20T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 3:0:0:3  sdad    65:208 active ready running
| `- 4:0:0:3  sdal    66:80  active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 5:0:0:3  sdaw    67:0   active ready running
  `- 6:0:0:3  sdax    67:16  active ready running
mpath5 (00000000000001d) dm-36 IBM,2145
size=20T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 5:0:0:4  sday    67:32  active ready running
| `- 6:0:0:4  sdaz    67:48  active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 3:0:0:4  sdae    65:224 active ready running
  `- 4:0:0:4  sdam    66:96  active ready running
mpath6 (00000000000001e) dm-37 IBM,2145
size=20T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 3:0:0:5  sdaf    65:240 active ready running
| `- 4:0:0:5  sdan    66:112 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 5:0:0:5  sdba    67:64  active ready running
  `- 6:0:0:5  sdbb    67:80  active ready running
mpath7 (00000000000001f) dm-59 IBM,2145
size=20T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 5:0:0:6  sdbc    67:96  active ready running
| `- 6:0:0:6  sdbd    67:112 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 3:0:0:6  sdag    66:0   active ready running
  `- 4:0:0:6  sdao    66:128 active ready running
mpath8 (000000000000020) dm-38 IBM,2145
size=20T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 3:0:0:7  sdah    66:16  active ready running
| `- 4:0:0:7  sdap    66:144 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 5:0:0:7  sdbe    67:128 active ready running
  `- 6:0:0:7  sdbf    67:144 active ready running
root@node1

Thanks!
 
Have you reviewed and applied all the vendor-recommended configuration settings as described below? The page claims there is a newer version of the document, but I didn't find it immediately. You may want to reach out to IBM to confirm the documents below are still applicable to your specific hardware.
You may also want to not volunteer that you are using Proxmox; just say it's a Debian host.

https://www.ibm.com/docs/en/flashsystem-9x00/8.2.x?topic=initiator-enabling-multipathing-linux-hosts
https://www.ibm.com/docs/en/flashsy...ystem-settings-linux-hosts#svc_linux_settings


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Thanks for your feedback!
We have the FlashSystem 5200, and we configured it according to the Linux host settings doc: https://www.ibm.com/docs/en/flashsystem-5x00/8.6.x?topic=system-settings-linux-hosts

Kind Regards,
 
There were issues with multipath over the years related to queue_if_no_path and no_path_retry; however, it seems they have been resolved since then.
Is there nothing at all in the journal that sheds any light? If you configured the system based on the vendor specs, that's really as much as you can do on your side. PVE does not implement special multipath or dm subsystems; the standard OS tools are used.
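For what it's worth, your maps show features='1 queue_if_no_path', which queues I/O indefinitely once every path is gone. A minimal sketch of bounding that instead (the values here are examples, not vendor guidance, and a matching device { } section would override them):
Code:
# /etc/multipath.conf (sketch)
defaults {
    polling_interval 5
    no_path_retry 12    # queue ~12 checker intervals after all paths fail, then error out
}

# apply and verify:
#   multipathd reconfigure
#   multipathd show config | grep no_path_retry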

If possible, I would experiment with timeout values. But I imagine you don't have an easy way to reproduce this if a firmware upgrade is required. Does the hang happen when you power down/reboot one of the controllers? If you can reproduce it that way, I'd recommend leaning harder on the vendor to provide a solution.
On the other hand, if you can't reproduce it with a controller reboot/shutdown, perhaps there is a problem with the firmware upgrade procedure?

Good luck

https://dm-devel.redhat.narkive.com...how-do-i-turn-features-1-queue-if-no-path-off
https://www.suse.com/support/kb/doc/?id=000019082


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
the problem is that each time we upgrade the storage firmware, the multipath connections freeze
How often do you do this?! You shouldn't ever update firmware unless it's a critical update OR one applicable to a particular issue you're having, and even then during a downtime window for maximum safety. A firmware update on a couplet always requires you to STOP I/O to the controller being updated; it is normal and expected for the I/O to freeze if you don't. But in any case, this should be a very rare operation.
 
You shouldn't ever update firmware unless it's a critical update OR one applicable to a particular issue you're having [...] this should be a very rare operation.
I agree with your general sentiment: upgrading firmware on an enterprise SAN should be quite rare. I imagine the OP got burned once and is now skittish about doing it again.

Unless the OP has impeccable pre-upgrade records, it is very possible that multipath was _not_ in a good state prior to the last failed upgrade. Maybe some configuration value had not taken effect yet? Maybe a path had already failed from another event?

When we implement our systems, we always do full HA/redundancy testing with the customer's servers prior to going to production. This includes pulling cables, shutting down/rebooting controllers with I/O flowing, etc. When an upgrade is needed, we always check with the customer to ensure that the client hosts are in a good state. But we would never ask a customer to stop all I/O.

It looks like IBM also does not require an I/O stoppage (https://www.ibm.com/docs/en/flashsystem-5x00/8.4.x?topic=sv-updating-system-software). That said, the best path for the OP at this point is to schedule comprehensive controller/path testing to ensure that their environment can handle the interruption.


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
But we would never ask a customer to stop all I/O.
AFAIK no redundant couplet requires you to stop I/O entirely, just to the affected controller; the easiest way to accomplish this (if "easy" is the right word) is to simply remove the paths to that controller from your multipath setup temporarily, as sketched below.
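On iSCSI, one way to do that (a sketch -- the portal IPs are placeholders for the ports of the controller being updated) is to log out the sessions to that controller instead of editing multipath.conf:
Code:
iscsiadm -m session                   # map sessions to portals first
iscsiadm -m node -p 192.0.2.11 -u     # log out of controller A, port 1
iscsiadm -m node -p 192.0.2.12 -u     # log out of controller A, port 2
multipath -ll                         # remaining paths should stay active/ready
# when the controller is back up:
iscsiadm -m node -p 192.0.2.11 -l
iscsiadm -m node -p 192.0.2.12 -l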

It looks like IBM also does not require an I/O stoppage
That is a neat trick; if anyone can accomplish this kind of host/target orchestration, it's IBM. FWIW, we're still on a thread complaining about that exact thing not working ;)
 
That is a neat trick; if anyone can accomplish this kind of host/target orchestration, it's IBM.
Our upgrades are completely transparent. They can be (and are) done at 500k+ IOPS and 8GB/s of throughput, non-disruptively to applications.
FWIW, we're still on a thread complaining about that exact thing not working ;)
The irony is strong with this one :cool:


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Hi,
thanks for your feedback!
1. Yes, it happened one time and I was able to restore it by shutting down the controllers. It's important to know that multipath is working well, so that I know the high-availability mechanism will cover me in case of a failure.

2. Attaching some related logs:
Code:
./messages:Nov  7 23:19:50 node01 kernel: [4329396.135393] device-mapper: multipath: 253:37: Failing path 65:192.
./messages:Nov  7 23:19:50 node01 kernel: [4329396.135465] device-mapper: multipath: 253:72: Failing path 66:0.
./messages:Nov  7 23:19:50 node01 kernel: [4329396.135511] device-mapper: multipath: 253:44: Failing path 66:16.
./messages:Nov  7 23:19:53 node01 kernel: [4329399.149763] device-mapper: multipath: 253:35: Failing path 65:208.
./messages:Nov  7 23:19:55 node01 kernel: [4329401.150609] device-mapper: multipath: 253:34: Reinstating path 65:160.
./messages:Nov  7 23:19:55 node01 kernel: [4329401.150832] device-mapper: multipath: 253:36: Reinstating path 65:176.
./messages:Nov  7 23:19:55 node01 kernel: [4329401.151193] device-mapper: multipath: 253:37: Reinstating path 65:192.
./messages:Nov  7 23:19:55 node01 kernel: [4329401.151377] device-mapper: multipath: 253:71: Reinstating path 65:240.
./messages:Nov  7 23:19:55 node01 kernel: [4329401.151769] device-mapper: multipath: 253:72: Reinstating path 66:0.
./messages:Nov  7 23:19:55 node01 kernel: [4329401.230698] device-mapper: multipath: 253:72: Failing path 66:128.
./messages:Nov  7 23:19:55 node01 kernel: [4329401.230971] device-mapper: multipath: 253:44: Reinstating path 66:16.
./messages:Nov  7 23:19:58 node01 kernel: [4329404.232110] device-mapper: multipath: 253:35: Reinstating path 65:208.
./messages:Nov  7 23:19:58 node01 kernel: [4329404.232516] device-mapper: multipath: 253:43: Reinstating path 65:224.

./kern.log:Nov  7 23:51:56 node01 kernel: [4331322.382542] device-mapper: multipath: 253:71: Reinstating path 65:240.
./kern.log:Nov  7 23:51:56 node01 kernel: [4331322.382728] device-mapper: multipath: 253:44: Reinstating path 66:16.
./kern.log:Nov  7 23:51:56 node01 kernel: [4331322.384494] device-mapper: multipath: 253:37: Failing path 66:240.
./kern.log:Nov  7 23:52:00 node01 kernel: [4331326.387177] device-mapper: multipath: 253:71: Failing path 66:112.
./kern.log:Nov  7 23:52:01 node01 kernel: [4331327.395484] device-mapper: multipath: 253:34: Reinstating path 66:176.
./kern.log:Nov  7 23:52:01 node01 kernel: [4331327.395884] device-mapper: multipath: 253:37: Reinstating path 66:240.
./kern.log:Nov  7 23:52:17 node01 kernel: [4331343.410796] device-mapper: multipath: 253:71: Failing path 65:240.
./kern.log:Nov  7 23:52:17 node01 kernel: [4331343.410844] device-mapper: multipath: 253:44: Failing path 66:16.
./kern.log:Nov  7 23:52:17 node01 kernel: [4331343.410892] device-mapper: multipath: 253:36: Failing path 66:48.
./kern.log:Nov  7 23:52:17 node01 kernel: [4331343.410933] device-mapper: multipath: 253:37: Failing path 66:64.
./kern.log:Nov  7 23:52:17 node01 kernel: [4331343.412077] device-mapper: multipath: 253:72: Failing path 66:128.

./syslog:Nov  7 23:55:44 node01 multipathd[2107482]: sdar: mark as failed
./syslog:Nov  7 23:55:44 node01 multipathd[2107482]: sdai: mark as failed
./syslog:Nov  7 23:55:44 node01 multipathd[2107482]: sdav: mark as failed
./syslog:Nov  7 23:55:52 node01 multipathd[2107482]: mpath2: sdab - tur checker timed out
./syslog:Nov  7 23:55:52 node01 multipathd[2107482]: checker failed path 65:176 in map mpath2
./syslog:Nov  7 23:55:52 node01 multipathd[2107482]: mpath7: sdag - tur checker timed out
./syslog:Nov  7 23:55:52 node01 multipathd[2107482]: checker failed path 66:0 in map mpath7
./syslog:Nov  7 23:55:54 node01 multipathd[2107482]: mpath4: sdad - tur checker reports path is up
./syslog:Nov  7 23:55:54 node01 multipathd[2107482]: 65:208: reinstated
./syslog:Nov  7 23:55:54 node01 multipathd[2107482]: mpath4: remaining active paths: 3
./syslog:Nov  7 23:55:54 node01 multipathd[2107482]: mpath4: switch to path group #1
./syslog:Nov  7 23:55:54 node01 multipathd[2107482]: mpath5: sdae - tur checker reports path is up
./syslog:Nov  7 23:55:54 node01 multipathd[2107482]: 65:224: reinstated
./syslog:Nov  7 23:55:54 node01 multipathd[2107482]: mpath5: remaining active paths: 3
./syslog:Nov  7 23:55:54 node01 multipathd[2107482]: mpath2: sdaj - tur checker timed out
./syslog:Nov  7 23:55:54 node01 multipathd[2107482]: checker failed path 66:48 in map mpath2
./syslog:Nov  7 23:55:54 node01 multipathd[2107482]: mpath3: sdak - tur checker timed out
./syslog:Nov  7 23:55:54 node01 multipathd[2107482]: checker failed path 66:64 in map mpath3
./syslog:Nov  7 23:55:54 node01 kernel: [4331559.727111] device-mapper: multipath: 253:35: Reinstating path 65:208.
./syslog:Nov  7 23:55:54 node01 kernel: [4331559.727344] device-mapper: multipath: 253:43: Reinstating path 65:224.
./syslog:Nov  7 23:55:54 node01 multipath: mpath5: adding new path sday
./syslog:Nov  7 23:55:54 node01 multipath: mpath4: adding new path sdad
./syslog:Nov  7 23:55:54 node01 multipath: mpath5: adding new path sdaz
./syslog:Nov  7 23:55:54 node01 multipath: mpath4: adding new path sdal
./syslog:Nov  7 23:55:54 node01 multipath: mpath5: adding new path sdae
./syslog:Nov  7 23:55:54 node01 multipath: mpath4: adding new path sdaw
./syslog:Nov  7 23:55:54 node01 multipath: mpath5: adding new path sdam
./syslog:Nov  7 23:55:54 node01 multipath: mpath4: adding new path sdax
./syslog:Nov  7 23:55:54 node01 multipathd[2107482]: exit (signal)
./syslog:Nov  7 23:55:54 node01 multipathd[2107482]: --------shut down-------
./syslog:Nov  7 23:56:03 node01 kernel: [4331569.120274] device-mapper: multipath: 253:35: Failing path 65:208.
./syslog:Nov  7 23:56:03 node01 multipath: mpath4: adding new path sdad
./syslog:Nov  7 23:56:03 node01 multipath: mpath4: adding new path sdal
./syslog:Nov  7 23:56:03 node01 multipath: mpath4: adding new path sdaw
./syslog:Nov  7 23:56:03 node01 multipath: mpath4: adding new path sdax
./syslog:Nov  7 23:56:09 node01 systemd[1]: multipathd.service: Succeeded.
./syslog:Nov  7 23:56:09 node01 systemd[1]: multipathd.service: Consumed 1.215s CPU time.
./syslog:Nov  7 23:56:09 node01 multipathd[2125982]: --------start up--------
./syslog:Nov  7 23:56:09 node01 multipathd[2125982]: read /etc/multipath.conf
./syslog:Nov  7 23:56:09 node01 multipathd[2125982]: /etc/multipath.conf line 65, invalid keyword: vendor
./syslog:Nov  7 23:56:09 node01 multipathd[2125982]: /etc/multipath.conf line 66, invalid keyword: product
./syslog:Nov  7 23:56:09 node01 multipathd[2125982]: path checkers start up
./syslog:Nov  7 23:56:09 node01 multipathd[2125982]: failed to increase buffer size
./syslog:Nov  7 23:56:09 node01 multipathd[2125982]: /etc/multipath.conf line 65, invalid keyword: vendor
./syslog:Nov  7 23:56:09 node01 multipathd[2125982]: /etc/multipath.conf line 66, invalid keyword: product
./syslog:Nov  7 23:56:40 node01 kernel: [4331605.984285] device-mapper: multipath: 253:71: Failing path 66:112.
./syslog:Nov  7 23:56:40 node01 multipath: mpath6: adding new path sdaf
./syslog:Nov  7 23:56:40 node01 multipath: mpath6: adding new path sdan
./syslog:Nov  7 23:56:40 node01 multipath: mpath6: adding new path sdba
./syslog:Nov  7 23:56:40 node01 multipath: mpath6: adding new path sdbb
./syslog:Nov  7 23:56:44 node01 kernel: [4331609.823954] device-mapper: multipath: 253:71: Failing path 67:80.
./syslog:Nov  7 23:56:44 node01 kernel: [4331609.825770] device-mapper: multipath: 253:44: Failing path 67:144.
./syslog:Nov  7 23:56:44 node01 multipath: mpath8: adding new path sdah
./syslog:Nov  7 23:56:44 node01 multipath: mpath6: adding new path sdaf
./syslog:Nov  7 23:56:44 node01 multipath: mpath8: adding new path sdap
./syslog:Nov  7 23:56:44 node01 multipath: mpath6: adding new path sdan
./syslog:Nov  7 23:56:44 node01 multipath: mpath8: adding new path sdbe
./syslog:Nov  7 23:56:44 node01 multipath: mpath6: adding new path sdba
./syslog:Nov  7 23:56:44 node01 multipath: mpath8: adding new path sdbf
./syslog:Nov  7 23:56:44 node01 multipath: mpath6: adding new path sdbb
./syslog:Nov  7 23:57:13 node01 multipathd[2125982]: sdav: prio = const (setting: emergency fallback - alua failed)
./syslog:Nov  7 23:57:39 node01 systemd[1]: multipathd.service: start operation timed out. Terminating.
./syslog:Nov  7 23:57:39 node01 multipathd[2125982]: exit (signal)
./syslog:Nov  7 23:57:43 node01 multipathd[2125982]: sdaz: prio = const (setting: emergency fallback - alua failed)
./syslog:Nov  7 23:57:55 node01 multipathd[2125982]: sdbf: prio = const (setting: emergency fallback - alua failed)
./syslog:Nov  7 23:57:55 node01 multipathd[2125982]: mpath1: Using dev_loss_tmo=4294967295 instead of 120 because of no_path_retry setting
./syslog:Nov  7 23:57:55 node01 multipathd[2125982]: mpath1: reload [0 42949672960 multipath 1 queue_if_no_path 1 alua 2 1 service-time 0 2 1 66:160 1 66:176 1 service-time 0 2 1 65:160 1 66:32 1]
./syslog:Nov  7 23:57:55 node01 multipathd[2125982]: mpath2: Using dev_loss_tmo=4294967295 instead of 120 because of no_path_retry setting
./syslog:Nov  7 23:57:55 node01 multipath: mpath1: adding new path sdaq
./syslog:Nov  7 23:57:55 node01 multipath: mpath1: adding new path sdar
./syslog:Nov  7 23:57:55 node01 multipath: mpath1: adding new path sdaa
./syslog:Nov  7 23:57:55 node01 multipathd[2125982]: mpath2: setting up map with 1/4 path checkers pending
./syslog:Nov  7 23:57:55 node01 multipath: mpath1: adding new path sdai
./syslog:Nov  7 23:57:55 node01 multipathd[2125982]: mpath2: reload [0 42949672960 multipath 1 queue_if_no_path 1 alua 2 1 service-time 0 2 1 65:176 1 66:48 1 service-time 0 2 1 66:192 1 66:208 1]
./syslog:Nov  7 23:57:55 node01 multipathd[2125982]: mpath3: Using dev_loss_tmo=4294967295 instead of 120 because of no_path_retry setting
./syslog:Nov  7 23:57:55 node01 multipathd[2125982]: mpath3: reload [0 42949672960 multipath 1 queue_if_no_path 0 3 1 service-time 0 1 1 66:224 1 service-time 0 2 1 65:192 1 66:64 1 service-time 0 1 1 66:240 1]
./syslog:Nov  7 23:57:55 node01 multipath: mpath2: adding new path sdab
./syslog:Nov  7 23:57:55 node01 multipath: mpath2: adding new path sdaj
./syslog:Nov  7 23:57:55 node01 multipath: mpath2: adding new path sdas
./syslog:Nov  7 23:57:55 node01 multipath: mpath2: adding new path sdat
./syslog:Nov  7 23:57:55 node01 multipathd[2125982]: mpath4: Using dev_loss_tmo=4294967295 instead of 120 because of no_path_retry setting
./syslog:Nov  7 23:57:55 node01 multipathd[2125982]: mpath4: reload [0 42949672960 multipath 1 queue_if_no_path 1 alua 2 1 service-time 0 2 1 65:208 1 66:80 1 service-time 0 2 1 67:0 1 67:16 1]
./syslog:Nov  7 23:57:55 node01 multipath: mpath3: adding new path sdau
./syslog:Nov  7 23:57:55 node01 multipath: mpath3: adding new path sdac
./syslog:Nov  7 23:57:55 node01 multipath: mpath3: adding new path sdak
./syslog:Nov  7 23:57:55 node01 multipath: mpath3: adding new path sdav
./syslog:Nov  7 23:58:12 node01 kernel: [4331698.399595] device-mapper: multipath: 253:35: Failing path 67:16.
./syslog:Nov  7 23:58:12 node01 multipath: mpath4: adding new path sdad
./syslog:Nov  7 23:58:12 node01 multipath: mpath4: adding new path sdal
./syslog:Nov  7 23:58:12 node01 multipath: mpath4: adding new path sdaw
./syslog:Nov  7 23:58:12 node01 multipath: mpath4: adding new path sdax
./syslog:Nov  7 23:58:12 node01 multipathd[2125982]: mpath5: Using dev_loss_tmo=4294967295 instead of 120 because of no_path_retry setting
./syslog:Nov  7 23:58:12 node01 multipathd[2125982]: mpath5: reload [0 42949672960 multipath 1 queue_if_no_path 0 3 1 service-time 0 1 1 67:32 1 service-time 0 2 1 65:224 1 66:96 1 service-time 0 1 1 67:48 1]
./syslog:Nov  7 23:58:19 node01 multipathd[2125982]: mpath8: Using dev_loss_tmo=4294967295 instead of 120 because of no_path_retry setting
./syslog:Nov  7 23:58:19 node01 multipathd[2125982]: mpath8: reload [0 42949672960 multipath 1 queue_if_no_path 0 3 1 service-time 0 2 1 66:16 1 66:144 1 service-time 0 1 1 67:128 1 service-time 0 1 1 67:144 1]
./syslog:Nov  7 23:58:19 node01 multipath: mpath7: adding new path sdbc
./syslog:Nov  7 23:58:19 node01 multipath: mpath7: adding new path sdbd
./syslog:Nov  7 23:58:19 node01 multipath: mpath7: adding new path sdag
./syslog:Nov  7 23:59:06 node01 multipathd[2125982]: mpath2: sdat - tur checker reports path is up
./syslog:Nov  7 23:59:06 node01 multipathd[2125982]: 66:208: reinstated
./syslog:Nov  7 23:59:06 node01 multipathd[2125982]: mpath2: remaining active paths: 4
./syslog:Nov  7 23:59:06 node01 multipath: mpath8: adding new path sdah
./syslog:Nov  7 23:59:06 node01 multipath: mpath8: adding new path sdap
./syslog:Nov  7 23:59:06 node01 multipathd[2125982]: path checkers took longer than 87 seconds, consider increasing max_polling_interval
./syslog:Nov  7 23:59:06 node01 multipath: mpath8: adding new path sdbe
./syslog:Nov  7 23:59:06 node01 multipath: mpath8: adding new path sdbf
./syslog:Nov  7 23:59:06 node01 kernel: [4331752.518811] multipathd[2128812]: segfault at 7fa28d0c63ba ip 00007fa28d0c63ba sp 00007fa28c75aa90 error 14
./syslog:Nov  7 23:59:06 node01 systemd[1]: multipathd.service: Main process exited, code=killed, status=11/SEGV
./syslog:Nov  7 23:59:06 node01 systemd[1]: multipathd.service: Failed with result 'timeout'.

root@node1:~# service multipath-tools status
● multipathd.service - Device-Mapper Multipath Device Controller
     Loaded: loaded (/lib/systemd/system/multipathd.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2023-11-10 00:39:48 IST; 39s ago
TriggeredBy: ● multipathd.socket
    Process: 3223415 ExecStartPre=/sbin/modprobe -a scsi_dh_alua scsi_dh_emc scsi_dh_rdac dm-multipath (code=exited, status=0/SUCCESS)
   Main PID: 3223417 (multipathd)
     Status: "up"
      Tasks: 19
     Memory: 16.7M
        CPU: 170ms
     CGroup: /system.slice/multipathd.service
             └─3223417 /sbin/multipathd -d -s

Nov 10 00:40:25 node1 multipathd[3223417]: mpath4: sdf - tur checker timed out
Nov 10 00:40:25 node1 multipathd[3223417]: checker failed path 8:80 in map mpath4
Nov 10 00:40:25 node1 multipathd[3223417]: mpath5: sdg - tur checker timed out
Nov 10 00:40:25 node1 multipathd[3223417]: checker failed path 8:96 in map mpath5
Nov 10 00:40:25 node1 multipathd[3223417]: mpath6: sdh - tur checker timed out
Nov 10 00:40:25 node1 multipathd[3223417]: checker failed path 8:112 in map mpath6
Nov 10 00:40:25 node1 multipathd[3223417]: mpath7: sdi - tur checker timed out
Nov 10 00:40:25 node1 multipathd[3223417]: checker failed path 8:128 in map mpath7
Nov 10 00:40:25 node1 multipathd[3223417]: mpath8: sdj - tur checker timed out
Nov 10 00:40:25 node1 multipathd[3223417]: checker failed path 8:144 in map mpath8

root@node1:~# service multipathd status
● multipathd.service - Device-Mapper Multipath Device Controller
     Loaded: loaded (/lib/systemd/system/multipathd.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2023-11-10 00:39:03 IST; 10min ago
TriggeredBy: ● multipathd.socket
    Process: 3717673 ExecStartPre=/sbin/modprobe -a scsi_dh_alua scsi_dh_emc scsi_dh_rdac dm-multipath (code=exited, status=0/SUCCESS)
   Main PID: 3717674 (multipathd)
     Status: "up"
      Tasks: 22
     Memory: 15.3M
        CPU: 790ms
     CGroup: /system.slice/multipathd.service
             └─3717674 /sbin/multipathd -d -s

Nov 10 00:49:00 node1 multipathd[3717674]: sdp: mark as failed
Nov 10 00:49:00 node1 multipathd[3717674]: sdq: mark as failed
Nov 10 00:49:05 node1 multipathd[3717674]: mpath4: sdn - tur checker timed out
Nov 10 00:49:05 node1 multipathd[3717674]: checker failed path 8:208 in map mpath4
Nov 10 00:49:05 node1 multipathd[3717674]: mpath4: remaining active paths: 3
Nov 10 00:49:05 node1 multipathd[3717674]: mpath5: sdo - tur checker timed out
Nov 10 00:49:05 node1 multipathd[3717674]: checker failed path 8:224 in map mpath5
Nov 10 00:49:05 node1 multipathd[3717674]: mpath5: remaining active paths: 3
Nov 10 00:49:05 node1 multipathd[3717674]: sdn: mark as failed
Nov 10 00:49:05 node1 multipathd[3717674]: sdo: mark as failed
root@node1:~#

suggestions?
 
Yes, it happened one time and I was able to restore it by shutting down the controllers. It's important to know that multipath is working well, so that I know the high-availability mechanism will cover me in case of a failure.
The only way to ensure that you have properly working HA is with testing. Get another compute node and replicate the config. Pull cables, controllers, power, etc.
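For example, something like this while I/O is flowing (a sketch; adapt it to your portals and links):
Code:
# terminal 1: watch path state live while you pull a cable / reboot a controller
watch -n1 'multipath -ll | grep -E "mpath|ready|faulty"'
# terminal 2: note the iSCSI sessions before and after each event
iscsiadm -m session
# keep I/O running in a VM (e.g. an fio or dd loop) and confirm it never stalls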

2. Attaching some related logs:
There are many interesting things in there: complaints about a misconfigured multipath.conf, multipathd restarting in the middle of recovery, timeouts that are likely vendor-specific, and multipathd crashing with a segfault (which could be a legitimate bug).

suggestions?
Go over the log and try to address and understand the messages you see. Use your storage vendor's support channel to confirm the "normality" of the other messages. Make sure your multipathd package is up to date. Confirm with the storage vendor that they are happy with all the versions (kernel, multipath, etc).
Test everything. If you can repro the multipath crash, you may need to report it upstream to the package maintainers, who may ask you to do more testing.

I am guessing that the final nail in your firmware adventure was the segfault. That can't be debugged in a forum. Either you need to tell the developers how to reproduce it, or you need to make it happen again with core dumping enabled and then report it to the developers.
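If it helps, here is a sketch of capturing a core on a Debian-based host (assumes the systemd-coredump package is available):
Code:
apt install systemd-coredump    # collects and indexes core dumps
systemctl edit multipathd       # add an override:
#   [Service]
#   LimitCORE=infinity
systemctl restart multipathd
# after the next crash:
coredumpctl list multipathd
coredumpctl info multipathd     # includes a backtrace for the upstream report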

If something like this happened to one of our customers, we would be all over it, trying to determine why it happened.


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Hi,
1. It looks like the segfault is not the problem; it happened on only one node out of the 4, and the other nodes have no segfault issue.
From my tests today, I see that the problem starts after the first controller boots back up, even before we touch or shut down the second controller.

The error repeated in the logs over and over again is: prio = const (setting: emergency fallback - alua failed)
Maybe I should change the prio / path_grouping_policy settings?

Please advise,
Kind Regards,
 
It looks like the segfault is not the problem; it happened on only one node out of the 4, and the other nodes have no segfault issue.
From my tests today, I see that the problem starts after the first controller boots back up, even before we touch or shut down the second controller.
You haven't shared the logs from the other systems, nor from your latest test. Only you have access to the full set of information from all systems, so anything anyone contributes is really speculation.

./syslog:Nov 7 23:56:09 node01 multipathd[2125982]: /etc/multipath.conf line 65, invalid keyword: vendor
./syslog:Nov 7 23:56:09 node01 multipathd[2125982]: /etc/multipath.conf line 66, invalid keyword: product
Have you reviewed all the configuration files and addressed the errors already? They may have nothing to do with your situation, or they could be part of the problem.
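For what it's worth, "invalid keyword: vendor/product" usually means those keywords sit outside a device { } subsection, where they are not valid. Here is a sketch of the expected nesting, with values paraphrased from the IBM Linux host settings page linked earlier (verify the exact values against the document for your hardware and firmware level); it also covers the prio / path_grouping_policy question:
Code:
devices {
    device {                                   # vendor/product are only valid in here
        vendor "IBM"
        product "2145"
        path_grouping_policy "group_by_prio"   # pairs with prio "alua"
        path_selector "service-time 0"
        prio "alua"
        path_checker "tur"
        failback "immediate"
        no_path_retry 5                        # bounded queueing instead of queueing forever
        dev_loss_tmo 120
    }
}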

Keep in mind that PVE is a Debian system with an Ubuntu kernel. As I mentioned, multipath is part of the base OS and is not developed by "Proxmox Server Solutions GmbH".
Is reaching out to IBM not an option? This is basic storage/multipath interaction that should be supported by them.

One thing I noticed in the log is that you seem to have a LOT of LUNs, judging by the naming (sdax/sdbf/etc). What do "lsblk" and "lsscsi" show?
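For example (lsscsi may need "apt install lsscsi"):
Code:
lsscsi                             # [H:C:T:L] type vendor model /dev node, one line per device
lsblk -S                           # SCSI-only view: name, HCTL, transport, vendor
multipath -ll | grep -c running    # total path count across all maps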


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Hi,
IBM's recommendation is to work with a minimum of 8 volumes (or even 16).
We have 4 iSCSI ports, so 8x4 gives the 32 "disks".

I also created tickets with IBM, but so far we are stuck on "Unsupported Linux OS".

Kind Regards,
 
I also created tickets with IBM, but so far we are stuck on "Unsupported Linux OS"
You could take a server, install a supported OS on it, i.e. Ubuntu (https://www-50.ibm.com/systems/support/storage/ssic/interoperability), reproduce the problem, and have them handle it.
Based on everything we know so far, the problem could range from a misconfigured host, to a software bug, to bad cabling, to a misconfiguration on the storage side, to a bug in the firmware, to a combination of any of the above.

Another option is to get rid of IBM and buy a storage solution that supports Proxmox explicitly :)

Good Luck!


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Based on everything we know so far, the problem could range from a misconfigured host, to a software bug, to bad cabling, to a misconfiguration on the storage side, to a bug in the firmware, to a combination of any of the above.
Full ack. Please provide the syslog output and the multipath -ll output AFTER rebooting the first SAN controller and before rebooting the second, then everything again afterwards.
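For example, something along these lines after each step (the time window and file names are placeholders):
Code:
journalctl -u multipathd --since "30 min ago" > node1-multipathd-step1.log
journalctl -k --since "30 min ago" | grep -iE 'multipath|device-mapper' > node1-kernel-step1.log
multipath -ll > node1-paths-step1.txt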
 
