Stopping open-iscsi.service before stopping a particular VM during shutdown.

sjevtic

Nov 24, 2024
On PVE 8.3, I have a VM running TrueNAS Scale, which generally works great. It runs an iSCSI target for which I have iSCSI storage configured in PVE, serving as the boot disk for another guest.

I have set startup orders for the VMs in PVE such that the TrueNAS VM starts before those that depend on its iSCSI storage, and conversely the TrueNAS VM is stopped only after the VMs that depend on its iSCSI storage have shut down. However, shutdown still takes a long time, and I see the following on the console:

Code:
[**    ] Job open-iscsi.service/stop running (5min 13s / 6min 26s)
[ 6930.664224]  session1: session recovery timed out after 120 secs
[ 6930.670307] sd 2:0:0:0: rejecting I/O to offline device
[ 6930.670861] sd 2:0:0:1: rejecting I/O to offline device
[ 6930.675574] I/O error, dev sda, sector 0 op 0x0:(READ) flags 0x0 phys_seg 32 prio class 0
[ 6930.680837] I/O error, dev sdb, sector 0 op 0x0:(READ) flags 0x0 phys_seg 32 prio class 0

So it looks like the iSCSI initiator is still running/connected at this point and, worse, is still trying to issue I/O to the now-offline iSCSI device. How can I cleanly close open iSCSI connections ahead of shutting down the VM that serves them?
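For reference, the start/shutdown ordering mentioned above is configured per VM with the startup option, something like the following (the VM IDs and the delay here are just placeholders):

Code:
# NAS VM: lowest order = started first and stopped last; wait 60s before starting the next guest
qm set 100 --startup order=1,up=60
# dependent VM: higher order = started later and stopped earlier
qm set 101 --startup order=2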

Thanks.
 
I'm getting the same problem, but it's with a VM connected to a Starwind iSCSI target...

Did you ever find a way to resolve this, so that open-iscsi shuts down cleanly on restarts?

In my case, it's also preventing me from connecting to the target again after the reboot. I assume the target thinks the initiator is still logged in and is throwing a hissy fit :)
 

I was able to achieve a satisfactory resolution with a simple hookscript on PVE 8.3. The idea is that you can have a script run custom actions for you before/after startup/shutdown. There is a very brief snippet about hookscripts in the admin guide.

Here is the relevant part of my TrueNAS hookscript:

Code:
#!/usr/bin/perl
# PVE guest hookscript: invoked by PVE with the VM ID and the current phase.
use strict;
use warnings;

my ($vmid, $phase) = @ARGV;

if ($phase eq 'pre-start')
{
    # Nothing to do before the NAS VM starts.
}
elsif ($phase eq 'post-start')
{
    print "$vmid started; enable iSCSI storage.\n";
    system('pvesm set tuxomatic-iscsi --disable 0');
    system('pvesm set platinum-test-iscsi --disable 0');
}
elsif ($phase eq 'pre-stop')
{
    print "$vmid will be stopped; disable iSCSI storage.\n";
    # Disable the storage entries first so the storage manager does not reconnect.
    system('pvesm set tuxomatic-iscsi --disable 1');
    system('pvesm set platinum-test-iscsi --disable 1');
    print "Logout from iSCSI sessions.\n";
    # Then log out of the sessions pointing at the NAS VM's portal.
    system('iscsiadm -m node --portal 10.0.194.41 --logout');
}
elsif ($phase eq 'post-stop')
{
    print "$vmid stopped; re-enable iSCSI storage.\n";
    system('pvesm set tuxomatic-iscsi --disable 0');
    system('pvesm set platinum-test-iscsi --disable 0');
}
else
{
    die "got unknown phase '$phase'\n";
}
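
To wire it up, the script has to sit on a storage with the snippets content type enabled and be attached to the NAS VM, roughly like this (the path, script name, and VM ID are placeholders):

Code:
# put the script on a snippets-enabled storage and make it executable
cp truenas-hook.pl /var/lib/vz/snippets/
chmod +x /var/lib/vz/snippets/truenas-hook.pl
# attach it to the NAS VM (here VM 100, using the default 'local' storage)
qm set 100 --hookscript local:snippets/truenas-hook.pl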

There are several key observations here:
  1. Simply logging out from the iSCSI targets of interest before shutting down the NAS VM is insufficient. PVE's storage manager service will quickly notice the iSCSI connection is lost and re-establish it before the NAS VM can shut down, leading to no improvement over the initial condition.
  2. Simply disabling the storage does not cause existing iSCSI connections to log out; both actions must be performed explicitly.
  3. The storage can be re-enabled as soon as the NAS VM is shut down. The storage manager will then reconnect automatically once the NAS VM is online again, and this also makes the storage visible in the PVE UI again. The post-start logic is added as a failsafe.
  4. This is a maintainability nightmare, since the script must be modified every time the set of PVE iSCSI storages hosted on the NAS VM changes. Some additional intelligence to determine which iSCSI targets used by PVE are hosted on the NAS VM, and to manage that storage automatically, would be advantageous; a rough sketch of one way to do that follows below.
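As a rough sketch of that last point: instead of hard-coding storage names, the hookscript could read /etc/pve/storage.cfg and act on every iSCSI storage whose portal is the NAS VM. This is untested; the helper names, the idea of parsing storage.cfg directly, and the portal address are all just assumptions of mine:

Code:
use strict;
use warnings;

my $nas_portal = '10.0.194.41';    # portal served by the NAS VM

# Return the names of all iSCSI storages in storage.cfg that point at $nas_portal.
sub nas_iscsi_storages {
    my @storages;
    my ($name, $is_iscsi) = ('', 0);
    open(my $fh, '<', '/etc/pve/storage.cfg') or die "storage.cfg: $!\n";
    while (my $line = <$fh>) {
        if ($line =~ /^(\w+):\s+(\S+)/) {        # start of a new storage block
            ($is_iscsi, $name) = ($1 eq 'iscsi', $2);
        }
        elsif ($is_iscsi && $line =~ /^\s+portal\s+\Q$nas_portal\E\b/) {
            push @storages, $name;
        }
    }
    close($fh);
    return @storages;
}

# Enable (0) or disable (1) every matching storage in one go.
sub set_nas_storages {
    my ($disable) = @_;
    system('pvesm', 'set', $_, '--disable', $disable) for nas_iscsi_storages();
}

# In the hookscript above: call set_nas_storages(1) plus the iscsiadm logout in
# 'pre-stop', and set_nas_storages(0) in 'post-start' and 'post-stop'.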
 
Thanks for the quick response. I have a different setup from yours, but your post has given me the idea of achieving the same thing using services during shutdown to work around this. I'll comment back with the result.
 
Hey,

Thanks for using StarWind. We will review our configuration to minimize startup and shutdown times. You can always request some help with StarWind products or configurations on our forum.
https://forums.starwindsoftware.com/

Regards,
Alex
 
I didn't find an answer to this, and because of the potential for data loss from this issue, I ditched the StarWind idea and decided to run SMB shares on the PVE hosts using an LXC. This also avoids the complexity of HBA passthrough, which I also struggled with. As this is a home lab, it should suffice. I just need to replicate the data between hosts using a bespoke method now. Maybe I'll give StarWind another go later, but I had spent too much time on it and needed to get the data copied off its current location ASAP, as it wasn't backed up.