Proxmox and iSCSI

icepicknz

Hey guys,

Been using NFS for storage with Proxmox for years, and I've just been playing with iSCSI for the first time. Running fio to benchmark, it's way, way faster!
Running 5 PVEs, all connected to the TrueNAS over NFS & iSCSI, but I'm trying to work out how to add the disks for iSCSI correctly.

Initially I added Datacenter -> Storage -> Add -> iSCSI (left "Use LUNs Directly" ticked) while testing with fio. But when I went to try to migrate a VM to it, I was getting the error "can't allocate space in iscsi storage".

I then discovered some forum posts that suggested unticking "Use LUNs Directly" and then adding an LVM on the LUN. This worked; I got my first VM migrated and working.
For my second VM I added another extent to the target and an associated target LUN 1. I go to add the LVM and it adds, but every time I try to put a VM on it, it shows 0 available space and I just can't get it to work.

Do I need to create a new target for every single VM?

Any help would be greatly appreciated.
 
Hi, when you don't tick "Use LUNs Directly" you need to create an LVM storage on top of the iSCSI storage. This also has the advantage that you can use it as "shared".

When you use the iSCSI directly, you can't use it shared.
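For reference, a minimal sketch of what the finished configuration can look like in /etc/pve/storage.cfg once both pieces are added; the portal address, target IQN, volume group name and LUN ID below are placeholders, not values from this thread:

Code:
iscsi: truenas-iscsi
    portal 192.168.10.10
    target iqn.2005-10.org.freenas.ctl:proxmox
    content none

lvm: truenas-lvm
    vgname truenas-vg
    base truenas-iscsi:0.0.0.scsi-<lun-wwid>
    content rootdir,images
    shared 1
    saferemove 0

The "content none" entry is what "Use LUNs Directly" unticked produces, and the lvm entry with "shared 1" is what makes the volume group usable from every node in the cluster.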
 
Yes, I did this, thanks; the problem is with the second LUN.

I have since discovered that it had only partially added; once I deleted it and re-added it, it worked. The disk was slightly too small, so I had to expand it, but there appears to be a stale LUN mounted. I've deleted it, yet it still shows up in the PVE storage on each machine and I can't seem to remove it so I can re-add it.
 
The purpose of shared iSCSI is to allow live migration etc., right? That I do need.

So what is the workaround if snapshots don't work on shared iSCSI? A ZFS snapshot from the iSCSI side?
 
"ZFS snapshot from the iSCSI side"? What do you mean?
 
Code:
root@truenas[~]# zfs list
NAME                                                         USED  AVAIL     REFER  MOUNTPOINT
Vol1                                                        15.2T  8.44T      186K  /mnt/Vol1
Vol1/Plex-OS                                                 162G  8.44T      162G  -

As above, the storage (TrueNAS) is a ZFS pool with an iSCSI zvol, so I could just run a snapshot on the storage:
Code:
zfs snapshot Vol1/Plex-OS@snapshot-2025-04-25
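If the goal is an actual backup rather than just a local restore point, a snapshot taken this way can also be replicated off the pool with standard zfs send/receive. A rough sketch, where the destination host and pool names are purely illustrative:

Code:
# snapshot the zvol backing the LUN, then replicate the snapshot elsewhere
zfs snapshot Vol1/Plex-OS@snapshot-2025-04-25
zfs send Vol1/Plex-OS@snapshot-2025-04-25 | ssh backuphost zfs recv backuppool/Plex-OS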
 
Hmm, I still can't migrate though:

Code:
2025-04-25 22:28:25 starting migration of CT 301 to node 'proxmox1' (x.x.x.x)
2025-04-25 22:28:25 volume 'os-transmission:vm-301-disk-0' is on shared storage 'os-transmission'
can't deactivate LV '/dev/os-transmission/vm-301-disk-0': Failed to find logical volume "os-transmission/vm-301-disk-0"
2025-04-25 22:28:25 ERROR: volume deactivation failed: os-transmission:vm-301-disk-0 at /usr/share/perl5/PVE/Storage.pm line 1280.
2025-04-25 22:28:25 aborting phase 1 - cleanup resources
2025-04-25 22:28:25 start final cleanup
2025-04-25 22:28:25 ERROR: migration aborted (duration 00:00:01): volume deactivation failed: os-transmission:vm-301-disk-0 at /usr/share/perl5/PVE/Storage.pm line 1280.
TASK ERROR: migration aborted

Code:
root@pve2:~# vgs
  VG              #PV #LV #SN Attr   VSize   VFree
  os-transmission   1   0   0 wz--n-  33.99g 33.99g
  truenas-lun-1     1   1   0 wz--n- 249.99g 49.99g
root@pve2:~# pvs
  PV         VG              Fmt  Attr PSize   PFree
  /dev/sdm   truenas-lun-1   lvm2 a--  249.99g 49.99g
  /dev/sdn   os-transmission lvm2 a--   33.99g 33.99g
root@pve2:~# lvs
  LV            VG            Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  vm-201-disk-0 truenas-lun-1 -wi-ao---- 200.00g                                                   
root@pve2:~#

os-transmission is the disk I'm trying to migrate, but it doesn't appear in lvs.

Code:
lvm: os-transmission
    vgname os-transmission
    base truenas01-iscsi-vol01:0.0.1.scsi-36589cfc000000a1aa19fa2c2b7e5bb74
    content rootdir,images
    saferemove 0
    shared 1
 
@icepicknz, there are many online resources for ZFS/iSCSI configurations. Keep in mind that just yesterday it was reported that one particular TrueNAS-specific third-party plugin was no longer compatible with the most recent TrueNAS version.

If you decide to go back to iSCSI+LVM, you may find this resource helpful: https://kb.blockbridge.com/technote/proxmox-lvm-shared-storage/


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
Thanks for this; that is what I was worried about, a plugin not being fully supported, so I will be going back.

I have read your article, and my takeaway is to get my multipath setup done first. I was going to introduce this later, as I'm waiting on another switch and additional power to the rack to be installed.

I am still somewhat confused by the steps though.

1./ Add the iSCSI target
2./ Add the LVM from a LUN
3./ Use the LVM on a host

The issue I had was when trying to move the VMs to another PVE: it actually wiped the data from the LVM, though it may have been thin provisioned.

I think I ran into the issue of the metadata still existing even after removing the target and LUN, as I had to run "vgremove os-transmission-lun-1" because it was still stuck there.

I ran it on one host and it was removed on all; is there another method to clear the metadata, and do I have to run it on each host?
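For what it's worth, a sketch of the cleanup sequence being described, with placeholder names; this assumes the volume group is genuinely unwanted. Because LVM keeps its metadata on the shared LUN itself, removing the VG on one node removes it for all of them; the other nodes only need to refresh their device cache:

Code:
# on one node, while the LUN is still reachable: remove LVs, VG and PV label
lvremove <stale-vg>
vgremove <stale-vg>
pvremove /dev/sdX
# on every node afterwards (or if the LUN has already been deleted on the
# storage side): rescan the iSCSI session and refresh the LVM cache
iscsiadm -m session --rescan
pvscan --cache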
 
So I've gone and set up a test: created an iSCSI connection in Proxmox to TrueNAS and created a single 30 GB LUN.
I've installed 2 CTs on the same LUN, each with 10 GB of storage.

I can move the containers between hosts (obviously with reboot migration) and have yet to test live migration with actual VMs.

My questions, if you don't mind:
1./ Should I be using a single large LUN and install all VMs onto it, or should I be creating a LUN per VM?
2./ Because snapshots aren't available, what is the suggested backup method? Is it manually going to the storage array (running ZFS/iSCSI) and taking the snapshot there (i.e. zfs snapshot Vol1/Plex-OS@snapshot-2025-04-25)?
3./ The plugin I was looking at was https://github.com/TheGrandWazoo/freenas-proxmox; is this the one you mention became unsupported?
 
The LUN on an iSCSI device is basically a 'full' hard drive (to the Proxmox guest). For thin provisioning and snapshots you have to rely on the storage host (in your case ZFS, which can thin provision/snapshot a zvol), because not all iSCSI systems support overprovisioning.

You only need to set up the configuration in Proxmox once; any LUN the target exports will become visible to the initiator (Proxmox). Likewise, in TrueNAS you configure the settings once (allowed initiators, authentication or IP limits), then keep adding associated targets for each zvol you want to export as a LUN.

LVM on top of iSCSI allows you to use one LUN and share it across multiple VMs. Whether to use it is debatable if you have full access to something like ZFS, since basically all your VMs would be in one zvol. Not very flexible (say you run out of space and deploy another iSCSI system; you can't just ZFS send a selection of zvols). The alternative, a zvol+LUN per VM, does mean you have to export a new zvol+LUN every time you add a VM, hence the use of a plugin that basically SSHs into the TrueNAS and does that configuration, but that could equally well be done with Ansible, for example.
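As a rough illustration of the per-VM route on the TrueNAS side (the zvol name and size are invented, and the extent plus target-to-LUN mapping still has to be added in the TrueNAS UI or API afterwards; this is just the ZFS half of what such a plugin automates):

Code:
# create a sparse (thin-provisioned) zvol to back the new VM's LUN
zfs create -s -V 50G -o volblocksize=16K Vol1/vm-102-disk-0
# and remove it again when the VM is retired
zfs destroy Vol1/vm-102-disk-0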
 

Thanks for this... So from the looks of it you are giving two scenarios: one where I leave "Use LUNs Directly" checked, and a second (LVM on top of iSCSI) where I uncheck "Use LUNs Directly"; do I have that right?

From the sounds of it, I should go for the first option and just have a single LUN per target with the full disk.

Both setups don't allow thin provisioning from Proxmox, but the storage handles that; both options don't allow snapshots, so those have to be done on the target storage array.

Sorry for the confusion and questions; I've been a network engineer for 20+ years but always had people to do the storage side. I've always used NFS or local storage, but after testing iSCSI and seeing the performance boost, I want to move to it, but the right way :)
 
This is where the confusion comes in...
If I don't use LVM over iSCSI, there seem to be limitations to using the storage with just plain iSCSI (not ticking "Use LUNs Directly").

I can add storage to an existing VM as per the screenshot; it shows the initial storage capacity as 0B (no capacity).
[Screenshot: add-hard-disk-snap-0b.png]

But if it is selected, I can choose a disk/LUN and continue. Though in this setup, my first trial had issues with it being shared, and it deleted data when I tried to migrate it.
[Screenshot: add-hard-disk-disk-image.png]

Same goes for adding a VM: when creating one, I choose the 0B storage and then the disk/LUN.
[Screenshot: createvm-snap.png]

With these methods, I cannot create CTs on this storage or move CTs to it.


It appears the only way to allow CTs to be stored on iSCSI is to do LVM over iSCSI?
 
Your storage shows 0 because the target is not a disk; think of it more like an enclosure. The LUNs are the "disks", which in early/simple SCSI implementations was literally the case: you just had a complete physical disk, and that disk had a size. The SCSI enclosure didn't store anything and didn't have RAID etc.; it was just a chassis of disks, and each LUN was just a disk. iSCSI simply transfers the SCSI protocol over IP, so an iSCSI controller just knows it has to deliver a command to a disk, nothing more, nothing less.

Hence snapshotting and thin provisioning have to happen at the underlying filesystem layer such as ZFS, because SCSI doesn't have a snapshot or thin-provisioning command; it knows blocks. A simple disk cannot keep track of whether a block has been written to, holds just ones or zeros, or has the same value as another block (deduplication).

You can't share the same LUN with multiple VMs if you use it directly, unless of course you use a clustered filesystem. You also can't convert a LUN from direct use to LVM without destroying its contents. LVM basically partitions the LUN with its own block-level layout; whatever LVM is capable of (such as sharing a volume) it can do on that LUN. You could just as well put another layer of ZFS on the LUN and then use that to share out a zvol, although that is a nutty proposition that isn't in the GUI.

Likewise, containers can't use complete disks; they can only use file storage. You will indeed have to make an LVM and format the disk into a filesystem on the Proxmox side (although you cannot share that filesystem across multiple hosts).

Think of iSCSI as attaching your disk directly to your machine, just over IP. You get some benefits, such as not having to go through a file layer to write parts of a file, but the limitation is that you have to treat it as an actual disk. A disk is not a filesystem; a filesystem or volume-management system such as ZFS or LVM sits on top of a disk by making partitions and then formatting those partitions.

Passing through a "raw" disk indeed gives you a huge performance boost because you're bypassing all the things a filesystem does for you (such as structure, folders, sparse files, block tracking, etc.).
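To make the layering concrete, this is roughly what happens underneath when an LVM storage is created on a LUN; the device name is illustrative, and the Proxmox GUI normally performs these steps for you:

Code:
# the LUN appears as an ordinary block device, e.g. /dev/sdm
pvcreate /dev/sdm                   # mark the LUN as an LVM physical volume
vgcreate truenas-vg /dev/sdm        # build a volume group on it
# Proxmox then carves one logical volume per guest disk out of that VG,
# roughly equivalent to:
lvcreate -n vm-201-disk-0 -L 200G truenas-vg
# for a container, the LV additionally gets formatted (e.g. ext4) so the
# CT ends up with file storage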
 
Thank you so much for taking the time to explain this to me; with your help and some ChatGPT questions, I think I now have it, and I figured out the filesystem storage issue with CTs. Such a pity I installed all my systems as CTs for ease and speed of setup; I'm going to have to go through and reinstall everything as VMs now to get the speed benefit.

Interestingly, if I create an LVM over iSCSI, I do get the speed benefit, and while I have the ability to migrate between hosts, I lose the ability to back up using the Proxmox backup facility since, as you say, it can't copy the files directly.

Thanks again for your help.
 
The speed benefit may be an illusion; try the nconnect setting on your NFS. That parallelizes your NFS connection (where you classically have the bottleneck of one read and one write stream).

Especially on containers you shouldn't see much of a difference (perhaps in benchmarks, but not in practice), because containers always use filesystems, so they should (perhaps immeasurably) perform worse on an iSCSI LUN.

ZFS is a CoW filesystem, so if you're using QCOW2 on top of that and then a CoW filesystem in your guests, you are amplifying writes. Hence NFS should be slightly faster if you're just doing files (containers). For guest filesystems on top of ZFS over NFS, you should be able to use raw disk images without losing the benefits of copy-on-write, and then you remove that QCOW2 layer.

Also make sure you're comparing apples to apples: a ZFS write call is usually synchronous, meaning your data is actually written before the call returns; some filesystems cheat and may group writes or do other optimizations, which is not a problem until you lose power.
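A sketch of what enabling nconnect could look like for an existing Proxmox NFS storage; the storage name and option values are examples, nconnect needs a reasonably recent NFS client, and the share has to be remounted before it takes effect:

Code:
# add nconnect to the NFS storage's mount options
pvesm set tank-nfs --options vers=4.2,nconnect=4
# after a remount, confirm the option is active
mount | grep nconnect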
 
Thanks... my tests were via fio; read/write/random all performed better on iSCSI. Local disk was around 700 MB/s, NFS was around 300, and iSCSI was close to local disk with no cache, but about 1200 with 'write back' cache.

Interestingly, I've never seen nconnect and it appears I'm not using it. At the moment I only have a single 10 Gb link from each Proxmox host (storage only) and another 10 Gb for WAN and VLANs. I just have to install the second 10 Gb switch to integrate the second 10 Gb link for multipath storage.

Is there any benefit from adding nconnect=4 when I have a single 10 Gb link per host?
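For what it's worth, a sketch of a fio invocation that keeps the comparison honest across local disk, NFS and iSCSI by forcing direct, synced I/O so caching can't flatter one backend; the path and job parameters are placeholders, not the ones used in the tests above:

Code:
fio --name=sync-randwrite \
    --filename=/mnt/pve/testvol/fio.test \
    --rw=randwrite --bs=4k --size=4G \
    --ioengine=libaio --iodepth=16 --numjobs=4 \
    --direct=1 --fsync=1 \
    --runtime=60 --time_based --group_reporting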