iSCSI direct LUNs with multipath

Hello dear Proxmox users,

we are thinking about migrating from oVirt to Proxmox and have quite a lot of direct LUNs there. I saw that it is possible to use direct LUNs in Proxmox, but I haven't found a way to configure multipath with that. Please don't get me wrong, I found the wiki documentation about iSCSI multipathing with LVM on top. It is great and it works. But we need direct LUNs with multipath and without LVM on top, because we would like to keep the data on the direct LUNs. That way we would not need to reinstall the VMs nor migrate any data.
When I try to add an iSCSI storage in the web GUI, I can only add one portal IP address. But the LUNs are accessible via two or even four (on a different SAN) IP addresses. Of course I can configure multipath by going through the iscsiadm discovery and login steps and adding the LUNs to the multipath daemon. But I haven't found a way to add them as a shared storage device.
We could use only one path to every LUN, but that would deprive us of redundancy and performance.
Is there any other way to accomplish this?

Thanks in advance and regards
Timo
 
we are thinking about migrating from oVirt to Proxmox and have quite a lot of direct LUNs there. I saw that it is possible to use direct LUNs in Proxmox, but I haven't found a way to configure multipath with that.
Direct LUN access is implemented via QEMU's native ability to connect to iSCSI. QEMU lacks support for multipath.

Your best bet is to (a rough command sketch follows below):
a) use iscsiadm to connect paths directly, bypassing PVE's scaffolding
b) implement multipath
c) use the "qm set" command to point the VM to the resulting multipath device
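
For illustration, here is a rough sketch of steps a) and b); the portal addresses and the target IQN are placeholders, adjust them to your SAN:

Code:
# a) discover and log in to the target via both portals (placeholder addresses/IQN)
iscsiadm -m discovery -t sendtargets -p 192.0.2.11
iscsiadm -m discovery -t sendtargets -p 192.0.2.12
iscsiadm -m node -T iqn.2001-04.com.example:storage -p 192.0.2.11 --login
iscsiadm -m node -T iqn.2001-04.com.example:storage -p 192.0.2.12 --login

# b) multipathd assembles the paths into one map; verify the result
multipath -ll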

or: connect to iSCSI directly from the VM. If you are, indeed, bypassing PVE's scaffolding, there is no benefit to connecting the LUN to the hypervisor and then passing it through.

Good luck


 
Thank you for your reply.

Direct LUN access is implemented via QEMU's native ability to connect to iSCSI. QEMU lacks support for multipath.

What do you mean by "it lacks support for multipath"? Does it support LVM LVs or iSCSI LUNs in contrast to that? I thought it is just a device that is given to the QEMU process, isn't it?

Your best bet is to:
a) use iscsiadm to connect paths directly, bypassing PVE's scaffolding
b) implement multipath

Do you mean implementing it in QEMU, or where?

c) use the "qm set" command to point the VM to the resulting multipath device

You mean the /dev/mapper/<wwid> device which results from configuring multipath?

or: connect to iSCSI directly from the VM.

That is not possible, because the OS disk is a direct LUN in oVirt, too. Also, it is some kind of security measure not to pass the storage VLAN into the VMs.
 
What do you mean by "it lacks support for multipath"? Does it support LVM LVs or iSCSI LUNs in contrast to that? I thought it is just a device that is given to the QEMU process, isn't it?
Direct LUNs are implemented via this mechanism: https://www.qemu.org/docs/master/system/qemu-block-drivers.html#iscsi-luns
There is no support for multipath here.
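
For illustration only (the portal address and IQN are placeholders): that driver addresses a LUN through a URL that names exactly one portal, which is why there is no room for a second path:

Code:
# QEMU iSCSI block driver URL format: iscsi://<portal>[:port]/<target-iqn>/<lun>
iscsi://192.0.2.11:3260/iqn.2001-04.com.example:storage/1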

The LVM overlay for iSCSI is implemented via a different mechanism, where it is possible to insert a multipath layer.

Do you mean implementing it in QEMU, or where?
Map the LUN to the hypervisor, add multipath, and use "qm set --scsi0 /dev/md/mpath_device" (the syntax is approximate, please find the correct syntax in "man qm").
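
As a concrete, hedged example (VM ID and device name are placeholders; again, check "man qm" for the exact syntax):

Code:
# attach the assembled multipath map as a raw disk of VM 100 (placeholder names)
qm set 100 --scsi0 /dev/mapper/<wwid-of-the-lun>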

That is not possible, because the OS disk is a direct LUN in oVirt, too
Sometimes you need to make adjustments when migrating between complex ecosystems. I don't know what "directlun" means in the oVirt context and whether there is a direct equivalent in PVE.


 
Direct LUNs are implemented via this mechanism: https://www.qemu.org/docs/master/system/qemu-block-drivers.html#iscsi-luns
There is no support for multipath here.

That looks like the "iscsidirect" variant of connecting iSCSI LUNs to PVE. I saw similar iscsi:// URLs in the man page of iscsi-ls, which seems to be necessary for listing the LUNs. If there is no multipathing, that is not the way we can go.

Map the LUN to the hypervisor, add multipath, and use "qm set --scsi0 /dev/md/mpath_device" (the syntax is approximate, please find the correct syntax in "man qm").

Yeah, that sounds promising, I think I will test that. I wonder if VM live migration works then. If I configure the devices on all PVE nodes and make sure their names are the same, it should work, shouldn't it?

Sometimes you need to make adjustments when migrating between complex ecosystems. I don't know what "directlun" means in the oVirt context and whether there is a direct equivalent in PVE.

I think "direct LUNs" from oVirt are quite similar to what PVE has with "use LUN directly". Nomen est omen: you get the LUN as a normal virtual disk in the VM, sized accordingly. Also, snapshots are not possible in oVirt, just like in PVE.
oVirt does it a little nicer: you can do a discovery against multiple portal IPs, and if it is the same LUN ID it recognizes two or more paths to it. Would be cool to have that in PVE too. Just do a pvesm scan multiple times and the system recognizes the multiple paths.
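
For comparison, the current CLI scan looks roughly like this (portal addresses are placeholders); each invocation only knows about the single portal it is given:

Code:
# scan each portal separately; PVE records only one portal per iSCSI storage
pvesm scan iscsi 192.0.2.11
pvesm scan iscsi 192.0.2.12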
 
As said above, I did a little test and configured a multipath device on all PVE nodes using the iscsiadm discovery, login and multipath commands, just like it is documented in https://pve.proxmox.com/wiki/Multipath, but I stopped at the step where one should continue with LVM on top of the multipath device.
I then went ahead and added the device to a VM directly using qm set <vmid> --scsi1 /dev/mapper/<wwid>,shared=1. With that I was able to live migrate the VM to another PVE node.
The disadvantage I see is that this is not reflected in the web UI under Datacenter -> Storage, but only when you look into the VM hardware details.

I wonder why this is not being implemented in pvesm, for example like pvesm add md ... or similar, so that it appears under the datacenter storage page too. What am I missing here?

Of course, you have to configure a lot manually if you have dozens of such multipath devices, but other than that, are there any other downsides or pitfalls that I don't see?
 
I would say that it is like that because nobody uses them that way, so the GUI doesn't really reflect that possibility. Maybe that could be a nice patch to provide? ;)
I don't think it's a problem, and you probably don't want all those LUNs to be displayed in the GUI anyway... so it may need more than a patch, but a discussion about a way to display those nicely...
 
As said above, I did a little test and configured a multipath device on all PVE nodes using the iscsiadm discovery, login and multipath commands, just like it is documented in https://pve.proxmox.com/wiki/Multipath, but I stopped at the step where one should continue with LVM on top of the multipath device.
I then went ahead and added the device to a VM directly using qm set <vmid> --scsi1 /dev/mapper/<wwid>,shared=1. With that I was able to live migrate the VM to another PVE node.
The disadvantage I see is that this is not reflected in the web UI under Datacenter -> Storage, but only when you look into the VM hardware details.

I wonder why this is not being implemented in pvesm, for example like pvesm add md ... or similar, so that it appears under the datacenter storage page too. What am I missing here?

Of course, you have to configure a lot manually if you have dozens of such multipath devices, but other than that, are there any other downsides or pitfalls that I don't see?
This bypasses the PVE storage stack by passing through a local block device directly.
PVE has no knowledge of any prerequisites (e.g., activating iSCSI storage so that LUNs are available). This could lead to issues where a VM fails to start because the storage is not yet available.
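
One common mitigation outside of PVE itself (these are standard open-iscsi/multipath settings, not a PVE feature; the target IQN and portal are placeholders) is to make the sessions and the multipath daemon come up automatically at boot:

Code:
# make sure the iSCSI and multipath daemons start at boot
systemctl enable --now iscsid multipathd

# set the node record to log in automatically at boot (placeholder IQN/portal)
iscsiadm -m node -T iqn.2001-04.com.example:storage -p 192.0.2.11 \
  -o update -n node.startup -v automatic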
 
@mira true. maybe you want to incorporate this into pvesm ;)
Use the storage stack. If you take a look at the ISCSIPlugin [0], you'll see that multipath devices are preferred over direct block devices [1].

Adding multipath support to the GUI for iSCSI devices (and maybe FC/DAS/NVMe-over-<transport>) is one of the things we're thinking about for future improvements. As always, we can't say with certainty if we will implement this, and if so, when.

[0] https://git.proxmox.com/?p=pve-stor...04530efad3729ff99035f17a10da00e8c8b96;hb=HEAD
[1] https://git.proxmox.com/?p=pve-stor...efad3729ff99035f17a10da00e8c8b96;hb=HEAD#l247
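
If you want to check by hand what the plugin would find, one way (just a sketch; /dev/sdc is a placeholder path device) is to compare the LUN's WWID with the assembled multipath maps:

Code:
# WWID of one path device of the LUN
/lib/udev/scsi_id -g -u -d /dev/sdc

# list the multipath maps; the plugin prefers a map whose WWID matches the LUN
multipath -ll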
 
@mira thank you for the warning and reminder to use the storage stack.

You said that you are thinking about adding multipath support for iSCSI devices to the GUI in the future. We are very interested in that improvement, because we are planning to migrate nearly all of our virtual machines (about 400) from RHV/oVirt to PVE. A lot of them have so-called direct LUNs. I have written about that above already.

You have shared those links to the source code. Thanks for that, but my programming skills have atrophied since my studies, and I am new to PVE too, which doesn't make it better. I suspect that Perl code only implements things in the PVE backend and is not available in the CLI either. I haven't found a way to use the storage stack, as you suggested, to configure multipath devices. I tried the GUI and the CLI (pvesm) using PVE version 8.3.1 for my trials.

Then I found that wiki page about Multipath (https://pve.proxmox.com/wiki/Multipath) that I already mentioned above. Your warning was "This could lead to issues where a VM fails to start because the storage is not yet available." But that document describes it the same way as I did it, up to step 4. So your warning "PVE has no knowledge of any prerequisites (e.g., activating iSCSI storage so that LUNs are available)." applies to that too, doesn't it?

I would appreciate following up on this topic together with you all. I can do tests and provide reports if necessary.

Thanks and regards
Timo
 
You said that you are thinking about adding multipath support for iSCSI devices to the GUI in the future. We are very interested in that improvement, because we are planning to migrate nearly all of our virtual machines (about 400) from RHV/oVirt to PVE. A lot of them have so-called direct LUNs. I have written about that above already.
Yes, and that's easily doable by configuring multipath. PVE will prefer the multipath device over the simple block device. That can be seen in the code I linked.

I haven't found a way to use the storage stack as you suggested to configure multipath devices.
Multipath has to be set up separately, as detailed in the Wiki article you linked.
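
Roughly, that separate setup looks like the sketch below (follow the wiki for the authoritative steps; /dev/sdc is a placeholder path device):

Code:
# install the multipath tools on every node
apt install multipath-tools

# add the WWID of one of the LUN's path devices to /etc/multipath/wwids
multipath -a /dev/sdc

# restart the daemon and check that the map was assembled
systemctl restart multipathd
multipath -ll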

Your warning was "This could lead to issues where a VM fails to start because the storage is not yet available." But that document describes it the same way as I did it, up to step 4. So your warning "PVE has no knowledge of any prerequisites (e.g., activating iSCSI storage so that LUNs are available)." applies to that too, doesn't it?
No, since the storage stack connects to the SAN, queries the LUNs and then looks for a multipath device matching the LUN. If one is available, it will use it.

In your setup you actually pass through a local block device, circumventing the storage stack. There is no knowledge of any requirements. When you use the storage stack instead (add iSCSI storage, select the LUN for the VM disk), it will make sure the connection is up and the LUN is available.
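
For reference, adding such an iSCSI storage ends up as an entry in /etc/pve/storage.cfg roughly like the following (storage ID, portal and IQN are assumed example values):

Code:
iscsi: iscsisan01
        portal 192.0.2.11
        target iqn.2001-04.com.example:ds3000
        content images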
 
@mira I know how to configure a multipath device, but I have not yet understood how to add that multipath device using the storage stack. Can you elaborate on that please?
 
When you've followed the steps of the wiki article and see multipath devices for each of your LUNs, the storage stack will automatically use those over the raw block devices (LUNs directly). No need to manually set the multipath device in the VM configs.
 
@mira thank you for your reply, and I am sorry, but I still don't get it. Can we make a concrete example? Let's say I have the following multipath device available on all of my PVE nodes:

Code:
root@pve01:~# multipath -ll 3600d0231000d4a862ce5b77019886ab6
3600d0231000d4a862ce5b77019886ab6 dm-0 IFT,DS 3000 Series
size=7.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
  |- 19:0:0:0 sdc 8:32 active ready running
  `- 18:0:0:0 sdd 8:48 active ready running

How would you add this multipath device to the VM with ID 100 as a directly accessible disk/LUN?

Would you use pvesm add somehow or what?

Thanks and regards
Timo
 
Just select the iSCSI storage and the LUN in the `Add: Hard Disk` panel.

If there's a multipath device available for the LUN, it will use the multipath device. No need for you to manually select the multipath device. Just make sure that one is available if you want redundancy/failover. The iSCSI plugin will make sure that multipath devices are selected for LUNs that have them when starting the VM.
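
A rough CLI equivalent (storage ID, portal, IQN and the volume name are assumed example values; the volume name follows the format that also shows up in the VM config further below):

Code:
# define the iSCSI storage once for the cluster (placeholder portal/IQN)
pvesm add iscsi iscsisan01 --portal 192.0.2.11 --target iqn.2001-04.com.example:ds3000

# list the LUNs that the storage exposes
pvesm list iscsisan01

# attach one LUN to a VM via the storage stack (placeholder VM ID and volume name)
qm set 100 --scsi1 iscsisan01:0.0.0.scsi-3600d0231000d4a862ce5b77019886ab6,iothread=1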
 
So I would use the GUI and click on Datacenter, then "Storage", then "Add", "iSCSI", and in that dialog enter some ID, for example "iscsisan01", and the portal IP. This then presents the iSCSI target IQN in the chooser field below. But I only see one target IQN. I cannot enter the second portal IP, which would bring me the second IQN of the target and thus the second path.

But you mean I should ignore that, confirm with OK, go to the VM, click on its "Hardware", "Add", "Hard Disk" and choose the ID that I entered above. There I see, for example, "CH 00 ID0 LUN0" as my "Disk Image". And when I add that, I get the following VM config afterwards:

Code:
root@pvenode01:~# qm config 101
agent: 1
boot: order=scsi0;ide2;net0
cores: 4
cpu: x86-64-v3
ide2: none,media=cdrom
machine: q35
memory: 8192
meta: creation-qemu=9.0.2,ctime=1730378334
name: fedora41
net0: virtio=BC:24:11:B0:45:3A,bridge=vmbr1
numa: 0
ostype: l26
scsi0: local-zfs:vm-101-disk-0,aio=threads,iothread=1,size=16G
scsi1: iscsisan01:0.0.0.scsi-3600d0231000d4a862ce5b77019886ab6,iothread=1,size=7298088M
scsihw: virtio-scsi-single
smbios1: uuid=f1eff414-2ac1-4ad9-9587-d9fb263dec00
sockets: 2
vga: virtio
vmgenid: b143b312-c572-44ea-ad08-ca20361e5e7d

And now the storage stack prefers the multipath device.
Indeed, that long ID in the line starting with "scsi1:" is the WWID of the multipath device.
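
If you want to double-check which device QEMU actually opens, one way (just a sketch) is to inspect the generated start command:

Code:
# print the QEMU command line PVE generates for the VM and look at the scsi1 drive
qm showcmd 101 --pretty | grep scsi1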

Is that the way you meant it?
 