Proxmox 5.4 multipath LVM issues

May 17, 2019
Hi,

Long story short: our storage vendor has verified everything he could (multipath.conf, udev rules, best practices), but he only officially supports RHEL and SLES. Every time they do an update, they take down one controller on the storage side, update it, and bring it back online. Somewhere in that process problems happen on the Proxmox side, because I'm getting errors like this from pvs:

WARNING: PV f0ve56-x2lh-vdbd-b8oo-RxfH-nA9g-biW2ip on /dev/mapper/pm-cluster01-storage01 was already found on /dev/sdcc.
WARNING: PV MyCoVk-DYuY-vPWc-Tbbo-l3Ay-CNfH-tix0f3 on /dev/mapper/pm-cluster01-online was already found on /dev/sdbm.
WARNING: PV yidCyv-zVBb-DEo5-CkUj-IN8G-N4Xt-RQShxe on /dev/sdai was already found on /dev/mapper/pm-cluster01-XXXXXX.
WARNING: PV 3GzQqQ-AWZ2-GFO6-fUhI-cnrr-Yx5e-tKhha9 on /dev/sdd was already found on /dev/mapper/pm-cluster01-storage01-ZZZZZZZZZ.
WARNING: PV yidCyv-zVBb-DEo5-CkUj-IN8G-N4Xt-RQShxe on /dev/sdaz was already found on /dev/mapper/pm-cluster01-XXXXXX.
WARNING: PV BQEBf7-d4pJ-Nxka-bVyN-amdg-lyQE-n0RonP on /dev/sde was already found on /dev/mapper/pm-cluster01-YYYYYY.
WARNING: PV f0ve56-x2lh-vdbd-b8oo-RxfH-nA9g-biW2ip on /dev/sdf was already found on /dev/sdcc.
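
To double-check that these /dev/sdX devices really are just the component paths behind the multipath maps, one of the maps can be inspected directly, e.g. (alias taken from the warnings above):
Code:
multipath -ll pm-cluster01-storage01
dmsetup deps -o devname /dev/mapper/pm-cluster01-storage01

Both commands list the underlying sdX paths, which should be exactly the devices pvs is warning about.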

And now they asked me whether I could involve the Proxmox support. So here I am.

I've tested the multipath.conf as well as I can: I unplugged an FC cable, watched multipath -ll, plugged it back in and watched multipath -ll again. Everything works fine. But for some reason, when Pure Storage does an update by shutting down one controller and enabling it again once they are done, my multipath LVM goes nuts. I even ended up with wrong LVM headers on my devices.

Currently we are wondering why pvs complains about /dev/sdX block devices at all; the LVM was created on the devices in /dev/mapper/. As far as I understand, it is not guaranteed that a LUN gets the same block device (/dev/sdX) back when it comes online again. Do I have to run rescan-scsi-bus.sh and restart multipathd every time they are done upgrading, or what would be the correct way to handle this?
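
For reference, this is roughly what I would run after such a controller upgrade if a manual rescan really is needed (rescan-scsi-bus.sh comes from the sg3-utils package; whether any of this should be necessary at all is exactly my question):
Code:
rescan-scsi-bus.sh -a          # rescan all SCSI hosts for new/changed LUNs
multipathd -k"reconfigure"     # make multipathd re-read its config and maps
multipath -ll                  # verify every LUN shows all its paths as active
pvscan --cache                 # let LVM re-read the PV labels (if lvmetad is in use)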

Here is my current multipath.conf:
Code:
defaults {
   polling_interval      10
   find_multipaths       yes
}
devices {
   device {
       vendor                "PURE"
       product               "FlashArray"
       path_selector         "queue-length 0"
       path_grouping_policy  group_by_prio
#       path_grouping_policy  multibus
#       rr_min_io             1
       path_checker          tur
       fast_io_fail_tmo      10
       dev_loss_tmo          60
       no_path_retry         0
       hardware_handler      "1 alua"
       prio                  alua
       failback              immediate
   }
}
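
After changing this file I reload multipathd and check the result roughly like this:
Code:
systemctl restart multipathd     # or: multipathd -k"reconfigure"
multipath -ll                    # every LUN should show its paths as active/ready
multipathd -k"show config"       # confirm the PURE device section was merged in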

And here are the udev rules for the array:
Code:
# Recommended settings for Pure Storage FlashArray.

# Use noop scheduler for high-performance solid-state storage
ACTION=="add|change", KERNEL=="sd*[!0-9]", SUBSYSTEM=="block", ENV{ID_VENDOR}=="PURE", ATTR{queue/scheduler}="noop"

# Reduce CPU overhead due to entropy collection
ACTION=="add|change", KERNEL=="sd*[!0-9]", SUBSYSTEM=="block", ENV{ID_VENDOR}=="PURE", ATTR{queue/add_random}="0"

# Spread CPU load by redirecting completions to originating CPU
ACTION=="add|change", KERNEL=="sd*[!0-9]", SUBSYSTEM=="block", ENV{ID_VENDOR}=="PURE", ATTR{queue/rq_affinity}="2"

# Set the HBA timeout to 60 seconds
ACTION=="add|change", SUBSYSTEMS=="scsi", ATTRS{model}=="FlashArray      ", ATTR{timeout}="60"

# Set max_sectors_kb to 4096 (was already the default)
ACTION=="add|change", KERNEL=="sd*[!0-9]", SUBSYSTEM=="block", ENV{ID_VENDOR}=="PURE", ATTR{queue/max_sectors_kb}="4096"
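
To apply the rules without a reboot I reload udev and spot-check one of the Pure path devices (sdX stands for any path device of the array):
Code:
udevadm control --reload-rules
udevadm trigger --subsystem-match=block
cat /sys/block/sdX/queue/scheduler        # should show [noop]
cat /sys/block/sdX/queue/max_sectors_kb   # should show 4096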

Sadly, Debian is missing libstoragemgmt, which would allow us to use more features (like an automatic rescan when the disk size changes).

Does the Proxmox team have any ideas?

Thanks & Best regards,
Daniel

Previous thread about this issue:
https://forum.proxmox.com/threads/s...-5-4-with-fc-multipath-and-lvm-backend.65254/
 
Hi,

I guess the problem is that the storage comes back with a new path after the controller is reloaded.
You could try to create a static mapping with udev rules so the devices do not get reordered.
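
As a rough sketch, such a rule could look like this; the WWID is only a placeholder, and with several paths per LUN the symlink will simply follow one of them (it is essentially what /dev/disk/by-id/ already provides):
Code:
# /etc/udev/rules.d/99-static-lun-names.rules -- sketch only, WWID is a placeholder
ACTION=="add|change", KERNEL=="sd*[!0-9]", SUBSYSTEM=="block", ENV{ID_SERIAL}=="3624a9370000000000000000000000000", SYMLINK+="disk/by-alias/pm-cluster01-online"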

And you can try to install libstoragemgmt [1] from source.

[1] https://libstorage.github.io/libstoragemgmt-doc/doc/install.html
 
Hi,

I've tried to figure out how to create a static mapping with udev but haven't had any luck so far. Do you have documentation / a how-to for that?

Thanks in advance,
Daniel
 
Why don't you use the naming via multipath and restrict LVM to just those names? I didn't have to fiddle around with udev.

For our SAN this looks as follows:

Code:
root@proxmox3 ~ > grep DX100 /etc/lvm/lvm.conf
    filter = [ "a|/dev/mapper/DX100*|", "a|/dev/sda?|", "r|.*|" ]


root@proxmox3 ~ > cat /etc/multipath.conf

multipaths {
    multipath {
        wwid            360000000000000000000000000000001
        alias           DX100_PROXMOX_01
    }
    multipath {
        wwid            360000000000000000000000000000002
        alias           DX100_PROXMOX_02
    }
}

blacklist {
    device {
        vendor "HPQ"
        product ".*"
    }
}

devices {
    device {
        vendor                  "FUJITSU"
        product                 "ETERNUS_DXL"
        prio                    alua
        path_grouping_policy    group_by_prio
        path_selector           "round-robin 0"
        failback                immediate
        no_path_retry           0
        path_checker            tur
        dev_loss_tmo            2097151
        fast_io_fail_tmo        1
    }
}
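
To verify the filter actually takes effect, a quick check after editing lvm.conf:
Code:
pvscan --cache            # rebuild LVM's device cache (when lvmetad is used)
pvs -o pv_name,vg_name    # should only show /dev/mapper/DX100_* plus the local sda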
 
Why don't you use the naming via multipath and restrict LVM to just those names? I didn't have to fiddle around with udev.


I'm already using WWIDs and aliases in multipath.conf. But that does not help at all when all LVM commands access the raw block devices (/dev/sd*) anyway, for whatever reason.

Code:
  multipath {
        wwid "3624a9370b9f225dcede6459700011430"
        alias pm-cluster01-online
  }
  multipath {
        wwid "3624a9370b9f225dcede6459700011432"
        alias pm-cluster01-localtop01
  }

You mentioned an LVM filter could maybe solve this; currently my filter looks like this:
Code:
        global_filter = [ "r|/dev/zd.*|", "r|/dev/mapper/pve-.*|", "r|/dev/mapper/.*-vm--[0-9]+--disk--[0-9]+|", "r|/dev/mapper/vg_.*-brick_.*|", "r|/dev/mapper/vg_.*-tp_.*|" ]

Am I understanding this right that /dev/mapper/ devices are currently restricted but /dev/sd* is not?

I might try to change this to something like:
Code:
        global_filter = [ "a|/dev/mapper/pm.*|", "a|/dev/sda|", "r|.*|" ]

Maybe I can find an empty host to test this.
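
My rough test plan on that host would then be (nothing fancy, just a backup of the config plus a sanity check afterwards):
Code:
cp /etc/lvm/lvm.conf /etc/lvm/lvm.conf.bak   # keep a backup before touching the filter
vi /etc/lvm/lvm.conf                         # adjust global_filter as above
pvs -o pv_name,vg_name                       # should now only list /dev/mapper/pm-* PVs plus the local sda
vgs && lvs                                   # sanity check that all VGs/LVs are still visible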
 
I'm already using WWIDs and aliases in multipath.conf. But that does not help at all when all LVM commands access the raw block devices (/dev/sd*) anyway, for whatever reason.

Ah okay, you hadn't shown us that, so I assumed you did not.


Am I understanding this right that /dev/mapper/ devices are currently restricted but /dev/sd* is not?

Yes, no idea where this configuration comes from.


I might try to change this to something like:
Code:
        global_filter = [ "a|/dev/mapper/pm.*|", "a|/dev/sda|", "r|.*|" ]

Maybe I can find an empty host to test this.

Yes, that looks similar to my setup.
 
