Storage Issues: ZFS over ISCSI - istgt

histamineblkr

Member
Sep 28, 2022

Info

PVE: Proxmox 8 - 6.8.12-12-pve
NAS: FreeBSD 14.3-RELEASE releng/14.3-n271432-8c9ce319fef7 GENERIC amd64
ISCSI PKG: istgt - Version: 20150713_1

Preface

I am setting up a new PVE. Currently it consists of the pve server and a freebsd server to be used for storage. It is possible I have set up the configs for istgt incorrectly, but /usr/local/etc/istgt/istgt.conf and /usr/local/etc/istgt/auth.conf are pretty straightforward.

Research/Docs

I have searched the forum and gone through all the posts (back to about 2016) that relate to "zfs over iscsi" and "istgt". There is very little on FreeBSD with proxmox; mostly I see users running istgt on Debian. I could go the route of installing Debian on my storage server, but I wanted to set up zfs on FreeBSD since the zfs developers write openzfs specifically for bsd and I wanted to manage a bsd box.

In the forum I found the following:
  • This github repo: freenas-proxmox that people seemed to use (it hasn't had a commit in about 2 years and isn't what I want)
  • An old ctld (native iscsi freebsd) patch/implementation repo for managing zfs over iscsi - it may work, but I would have to test it
I am pretty familiar with implementing and configuring storage and have been running my current setup (proxmox and truenas) for years with little difficulty, but I consulted the following proxmox docs:
I am able to set up and connect successfully to the storage using:
Code:
zfs: freenas
   blocksize 4k
   target iqn.2025-10.com.<subdomain>.free-nas:vms
   pool tank
   iscsiprovider istgt
   portal 192.168.1.101
   content images

Issue

The issue happens when creating a vm or multiple (not sure about containers, but since they require disks I would assume so). When creating a single vm I am able to create it successfully, but /usr/share/perl5/PVE/Storage/LunCmd/Istgt.pm mangles my /usr/local/etc/istgt/istgt.conf on the freebsd server and creates a broken config.

It removes all comments in the "[LogicalUnit1]" section, a few parameters such as "Comment" and "TargetAlias", and adds "QueueDepth" without a number following and that breaks the config. This happens when a single vm is created.

When I then create a second vm, things are already in a broken state (at that point you don't know it's broken, since the proxmox GUI console appears fine) and various problems follow: the LUN entry for the disk is written incorrectly even though a zvol/device is created for the disk, and the vm doesn't start, giving only a generic: "Error: start failed: QEMU exited with code 1".

I cleaned up the disk and configs and tried to start the second vm with the qm command:
Code:
root@pve-01:~# qm start 102
Odd number of elements in anonymous hash at /usr/share/perl5/PVE/Storage/LunCmd/Istgt.pm line 324.
Odd number of elements in anonymous hash at /usr/share/perl5/PVE/Storage/LunCmd/Istgt.pm line 324.
Odd number of elements in anonymous hash at /usr/share/perl5/PVE/Storage/LunCmd/Istgt.pm line 324.
kvm: -drive file=iscsi://192.168.1.101/iqn.2025-10.com.<subdomain>.free-nas:vms/2,if=none,id=drive-scsi0,format=raw,cache=none,aio=io_uring,detect-zeroes=on: iSCSI: Failed to connect to LUN : SENSE KEY:ILLEGAL_REQUEST(5) ASCQ:LOGICAL_UNIT_NOT_SUPPORTED(0x2500)
start failed: QEMU exited with code 1
and it looks like a parsing error here: vms/2 in the file=iscsi:// URL.
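As a side note on the "Odd number of elements in anonymous hash" warning in that output: it is a generic Perl warning, raised whenever a `{ ... }` hash constructor receives an odd-length list. The sketch below is not taken from Istgt.pm (I have not confirmed what line 324 actually does); `find_storage` is a hypothetical helper that returns an empty list, which is one common way to end up with this warning.

```perl
use strict;
use warnings;

# Hypothetical helper, not from Istgt.pm: a bare "return" yields an
# empty list when the sub is called in list context.
sub find_storage { return; }

# Collect warnings instead of printing them to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Only five elements reach the hash constructor instead of six, so Perl
# warns and the remaining keys and values shift by one position.
my $conf = { lun => 'LUN1', storage => find_storage(), size => 'AUTO' };

print $warnings[0];
print "storage is now '$conf->{storage}'\n";
```

If something similar happens in the plugin, the resulting hash would hold keys and values in the wrong slots, which could explain a LUN entry being written incorrectly. That is only a guess until the code at line 324 is checked.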

I will need to do some more systematic testing to home in on what is happening, but I can be sure that QueueDepth is handled incorrectly in the lun_dumper and parser subroutines:
Perl:
my $lun_dumper = sub {
    my ($lun) = @_;
    my $config = '';

    $config .= "\n[$lun]\n";
    $config .=  'TargetName ' . $SETTINGS->{$lun}->{TargetName} . "\n";
    $config .=  'Mapping ' . $SETTINGS->{$lun}->{Mapping} . "\n";
    $config .=  'AuthGroup ' . $SETTINGS->{$lun}->{AuthGroup} . "\n";
    $config .=  'UnitType ' . $SETTINGS->{$lun}->{UnitType} . "\n";
    $config .=  'QueueDepth ' . $SETTINGS->{$lun}->{QueueDepth} . "\n";

    foreach my $conf (@{$SETTINGS->{$lun}->{luns}}) {
        $config .=  "$conf->{lun} Storage " . $conf->{Storage};
        $config .= ' ' . $size_with_unit->($conf->{Size}) . "\n";
        foreach ($conf->{options}) {
            if ($_) {
                $config .=  "$conf->{lun} Option " . $_ . "\n";
            }
        }
    }
    $config .= "\n";

    return $config;
};

...
my $parser = sub {
...
}

I think there is a regex problem as well. In that same file, we see the base for regex:
Perl:
sub get_base {
    return '/dev/zvol';
}

then in the parser subroutine:
Perl:
my $base = get_base;
...
                    if ($storage =~ /^$base\/$scfg->{pool}\/([\w\-]+)$/) {
                        #print "key: $key, storage: $storage, size: $size, options: @options\n";
                        $conf = {
                            lun => $key,
                            Storage => $storage,
                            Size => $size,
                            options => "",
                        }
                    }

The if conditional and regex should match my zvol, /dev/zvol/tank/vms, and create/delete/add LUNs under that path, but that is not what happens, as we can see on my freebsd server:
Code:
➜  zfs list
NAME                 USED  AVAIL  REFER  MOUNTPOINT
tank                 349G  24.0T   189K  /tank
tank/vm-101-disk-0   175G  24.2T   605M  -
tank/vm-102-disk-0   175G  24.2T  94.3K  -
tank/vms            3.26M  24.0T  94.3K  -
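To see what the quoted pattern accepts on its own, here is a small standalone check. It assumes `$base` is '/dev/zvol' (from get_base above) and the pool is 'tank' (from the storage config); `volume_name` is a hypothetical wrapper added only for this test, not a function in Istgt.pm.

```perl
use strict;
use warnings;

# Values taken from the post: get_base() returns '/dev/zvol', pool is 'tank'.
my $base = '/dev/zvol';
my $scfg = { pool => 'tank' };

# Hypothetical helper wrapping the pattern quoted from the parser.
# Returns the captured volume name, or undef when the path does not match.
sub volume_name {
    my ($storage) = @_;
    return $storage =~ /^$base\/$scfg->{pool}\/([\w\-]+)$/ ? $1 : undef;
}

for my $path (
    '/dev/zvol/tank/vms',
    '/dev/zvol/tank/vm-101-disk-0',
    '/dev/zvol/tank/vms/vm-101-disk-0',
) {
    my $name = volume_name($path);
    printf "%-36s %s\n", $path, defined $name ? "matches ($name)" : 'no match';
}
```

The character class `[\w\-]+` does not include `/`, so only a single path component directly under the pool can match. As far as this one pattern goes, a disk at /dev/zvol/tank/vm-101-disk-0 is recognized while a path nested below tank/vms would be skipped.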

What Steps Can I Take?

I will delete vms, disks, start over, and troubleshoot in a more thorough way.

The source code for this is here and Thomas Lamprecht seems to be the most recent contributor, with Fiona Ebner making a couple of commits. Michael Rasmussen looks to be the original author, but hasn't been active since 2014. Should I reach out and work with Fiona or Thomas?

I am happy to help code and test, but I am unfamiliar with the Debian dependencies since this seems to have cross plugin functionality. Therefore a change that works for FreeBSD could break something for Debian.
 
Does the PVE GUI load the file /usr/share/perl5/PVE/Storage/LunCmd/Istgt.pm into memory, so that code changes made to the file are not reflected in the UI?
 

istgt.conf Issue

This issue is best shown by a diff after creating one single vm using zfs over iscsi. This only affects the section "[LogicalUnit1]", which is my last section and all sections above remain unchanged in the file:
Code:
➜  diff istgt.conf istgt.conf.before.delete
111,118c111,168
< TargetName vms
< Mapping PortalGroup1 InitiatorGroup1
< AuthGroup AuthGroup1
< UnitType Disk
< QueueDepth
< LUN0 Storage /dev/zvol/tank/vms AUTO
< LUN1 Storage /dev/zvol/tank/vm-101-disk-0 AUTO
---
>   Comment "Proxmox VMs"
>   # full specified iqn (same as below)
>   #TargetName iqn.2007-09.jp.ne.peach.istgt:disk1
>   # short specified non iqn (will add NodeBase)
>   TargetName vms
>   TargetAlias "vms"
>   # use initiators in tag1 via portals in tag1
>   Mapping PortalGroup1 InitiatorGroup1
>   # accept both CHAP and None
>   AuthMethod Auto
>   AuthGroup AuthGroup1
>   #UseDigest Header Data
>   UseDigest Auto
>   UnitType Disk
>   # SCSI INQUIRY - Vendor(8) Product(16) Revision(4) Serial(16)
>   #UnitInquiry "FreeBSD" "iSCSI Disk" "0123" "10000001"
>   # Queuing 0=disabled, 1-255=enabled with specified depth.
>   #QueueDepth 32
>
>   # override global setting if need
>   #MaxOutstandingR2T 16
>   #DefaultTime2Wait 2
>   #DefaultTime2Retain 60
>   #FirstBurstLength 262144
>   #MaxBurstLength 1048576
>   #MaxRecvDataSegmentLength 262144
>   #InitialR2T Yes
>   #ImmediateData Yes
>   #DataPDUInOrder Yes
>   #DataSequenceInOrder Yes
>   #ErrorRecoveryLevel 0
>
>   # LogicalVolume for this unit on LUN0
>   # for file extent
>   LUN0 Storage /dev/zvol/tank/vms AUTO
>   # for raw device extent
>   #LUN0 Storage /dev/ad4 Auto
>   # for ZFS volume extent
>   #LUN0 Storage /dev/zvol/tank/istgt-vol1 Auto
>
>   # override the serial of LUN0 specified with UnitInquiry
>   #LUN0 Option Serial "10000001"
>
>   # for 3.5inch, 7200rpm HDD
>   # RPM 0=not reported, 1=non-rotating(SSD), n>1024 rpm
>   #LUN0 Option RPM 7200
>   # FormFactor 0=not reported, 1=5.25, 2=3.5, 3=2.5, 4=1.8, 5=less 1.8 inch
>   #LUN0 Option FormFactor 2
>
>   # for 2.5inch, SSD
>   #LUN0 Option RPM 1
>   #LUN0 Option FormFactor 3
>
>   # for future use (enabled by default)
>   #LUN0 Option ReadCache Disable
>
>   # control WCE(mode page 8) and O_FSYNC/O_SYNC on the backing store (enabled by default)
>   #LUN0 Option WriteCache Disable

4 issues stand out:
  1. All comments are removed
  2. Indentation is lost
  3. LUN0 is "/dev/zvol/tank/vms", but the vm disk for LUN1 is "/dev/zvol/tank/vm-101-disk-0" losing the "vms" directory
  4. QueueDepth is added without a value
I can hazard a guess why "QueueDepth" is added even though in my config it is commented out and left at its default value. On line 172 in the lun_dumper() function we see:
Perl:
$config .= 'QueueDepth ' . $SETTINGS->{$lun}->{QueueDepth} . "\n";
and then later in the parser() function, when the CONFIG is rebuilt, it is assumed that QueueDepth will have a value. I have had a hard time testing that in code and seeing the config being built step by step, so I may be wrong. An easy test would be to delete all vms and disks, reset my config, add the QueueDepth parameter with a value, and see what happens when I create a vm.
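The dumper half of that guess can be reproduced outside PVE. This is a minimal sketch, assuming the parsed section simply has no QueueDepth key because the line was commented out. The `$SETTINGS` contents and the `optional_line` guard are hypothetical and only illustrate one possible fix, not what the plugin does.

```perl
use strict;
use warnings;

# Hypothetical parsed section in which QueueDepth was commented out in the file.
my $SETTINGS = {
    LogicalUnit1 => { TargetName => 'vms', UnitType => 'Disk' },
};
my $lun = 'LogicalUnit1';

# Same concatenation as the quoted line. The missing key is undef, which
# stringifies to '' (with an "uninitialized" warning, silenced here).
my $unguarded = do {
    no warnings 'uninitialized';
    'QueueDepth ' . $SETTINGS->{$lun}->{QueueDepth} . "\n";
};

# Hypothetical guard: emit the line only when the key has a non-empty value.
sub optional_line {
    my ($section, $key) = @_;
    my $value = $section->{$key};
    return '' unless defined($value) && length($value);
    return "$key $value\n";
}

printf "unguarded: %s\n",
    $unguarded eq "QueueDepth \n" ? 'key written with no value' : 'ok';
printf "guarded:   %s\n",
    optional_line($SETTINGS->{$lun}, 'QueueDepth') eq '' ? 'line omitted' : 'line written';
```

Skipping unset keys in the dumper would leave istgt at its default queue depth instead of writing a line the parser cannot read back. Whether that is acceptable for the Debian side of the plugin would need checking.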

Workaround to Test Second VM Creation

Where the code dies and fails for QueueDepth is here:
Perl:
elsif ($lun) {
            next if (($_ =~ /^\s*#/) || ($_ =~ /^\s*$/));
            if ($_ =~ /^\s*(\w+)\s+(.+)\s*/) {
                my $arg1 = $1;
                my $arg2 = $2;
                $arg2 =~ s/^\s+|\s+$|"\s*//g;
                if ($arg2 =~ /^Storage\s*(.+)/i) {
                    $SETTINGS->{$lun}->{$arg1}->{storage} = $1;
                } elsif ($arg2 =~ /^Option\s*(.+)/i) {
                    push @{ $SETTINGS->{$lun}->{$arg1}->{options} }, $1;
                } else {
                    $SETTINGS->{$lun}->{$arg1} = $arg2;
                }
            } else {
                die "$line: parse error [$_]";
            }
        }

specifically when "$arg1" is "QueueDepth" and "$arg2" is empty. Regex isn't my strong suit, but it appears that:
Perl:
if ($_ =~ /^\s*(\w+)\s+(.+)\s*/)
the second capture group "(.+)" expects one or more characters, and since the value is empty the line never matches and the parser dies with: "114: parse error [QueueDepth]".

To get past this and not change anything, I did this:
Perl:
if ($_ =~ /^\s*(\w+)\s*(.*)\s*/) {
                my $arg1 = $1;
                my $arg2 = defined $2 ? $2 : '';
This matches zero or more spaces after the key and sets $2 to an empty string when there is no value.
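The two patterns can be compared side by side without touching the plugin. A minimal sketch, using a few sample lines modelled on the config above (the `$strict` and `$relaxed` names are only for this test):

```perl
use strict;
use warnings;

my $strict  = qr/^\s*(\w+)\s+(.+)\s*/;   # pattern quoted from the parser
my $relaxed = qr/^\s*(\w+)\s*(.*)\s*/;   # pattern from the workaround

for my $line ('QueueDepth', 'QueueDepth ', 'QueueDepth 32', 'TargetName vms') {
    # Each match resets $1 and $2, so build the result string right away.
    my $s = $line =~ $strict  ? "key=$1 value=[$2]" : 'no match';
    my $r = $line =~ $relaxed ? "key=$1 value=[$2]" : 'no match';
    printf "%-16s strict: %-28s relaxed: %s\n", "'$line'", $s, $r;
}
```

With the relaxed pattern `(.*)` always takes part in a successful match, so `$2` is an empty string rather than undef and the `defined $2` check is only a safety net. Note that the empty value is then stored in $SETTINGS and written back out as a bare "QueueDepth" line on the next dump, so this gets past the parse error without fixing the config itself.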
 
Before considering those...

Please note that issues may arise when using ZFS over iSCSI with the ctld target.

The target will stop due to restrictions on the ctld side.

Using a ctld target with ZFS over iSCSI causes the lun_id to keep increasing even when only rolling back snapshots.

Is it absolutely necessary to use FreeBSD (ctld) in this situation?

https://github.com/boomshankerx/proxmox-truenas/issues/56#issuecomment-3315936158