[SOLVED] Condition VM start on iSCSI reachability

mmmzon

New Member
Mar 30, 2024
Good day, forum member!

I have a bit of an interesting challenge to implement using Proxmox. My storage all sits on an iSCSI cluster, and I want to make VM startup conditional on the reachability of that cluster. The logic is very simple: do not start the VM(s) until the iSCSI host becomes available and the storage is accessible. I have been looking at the features available today in the GUI (8.1.10), but I do not see any option that does anything remotely resembling this.

Any thoughts on what could be done and if any function like this could be added into the development queue?

Thanks !

M
 
Any thoughts on what could be done
You could create a fairly simple script that probes the storage status and starts the VM only once it gets an acceptable result. You could then configure this script to run at boot as a systemd unit.
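As a rough sketch of the systemd route, the helper below writes a oneshot unit that runs a storage-probing script at boot. The function name `write_wait_unit`, the unit and script paths, and the `After=` ordering on `pve-cluster.service` are assumptions made for illustration, not something taken from this thread.

```shell
#!/bin/bash
# Hypothetical helper: write a oneshot unit that runs a storage-wait script at boot.
# Both paths are placeholders supplied by the caller.
write_wait_unit() {
    local unit_path="$1"      # e.g. /etc/systemd/system/wait-iscsi-start-vms.service
    local script_path="$2"    # e.g. /usr/local/sbin/wait-iscsi-start-vms.sh
    cat > "$unit_path" <<EOF
[Unit]
Description=Start VMs once iSCSI storage is available
# Assumed ordering: wait for the network and the PVE cluster filesystem.
After=network-online.target pve-cluster.service
Wants=network-online.target

[Service]
Type=oneshot
ExecStart=$script_path

[Install]
WantedBy=multi-user.target
EOF
}
```

After writing the unit into /etc/systemd/system, `systemctl daemon-reload` followed by `systemctl enable <unit name>` would register it for the next boot. This only makes sense if the affected VMs are not also started automatically by PVE at boot.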
if any function like this could be added into the development queue?
Seems to me like a very infrastructure-specific problem. You can submit a request in bugzilla. I suspect this would be low on the development priority list.

Good luck


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
What would be the triggering mechanism? If it's just to prevent the start, the missing storage would take care of that. If there is a programmatic way to START the iSCSI storage cluster, it can be added as a pre-start hook script for your VM.
 
You could create a fairly simple script that probes the storage status and starts the VM only once it gets an acceptable result. You could then configure this script to run at boot as a systemd unit.
That would require custom modifications to each and every VM, though, right? Since the VM runs on iSCSI storage, I am not sure how the logic would work here.
If it's just to prevent the start, the missing storage would take care of that.
Possibly, but today we had an outage, and it seems that the VMs attempted to start, failed, and then never attempted to start again. I am looking for something that would probe the iSCSI volume status and then trigger the start of the VMs hosted on that volume.
If there is a programmatic way to START the iSCSI storage cluster, it can be added as a pre-start hook script for your VM.
The iSCSI cluster will start on its own, but it is a separate host. When power is lost, the Proxmox cluster seems to come back much quicker, and then I have hanging VMs.
 
That would require custom modifications to each and every VM, though, right? Since the VM runs on iSCSI storage, I am not sure how the logic would work here.

First, review this thread on delaying VM start: https://forum.proxmox.com/threads/delay-start-of-a-vm.111212/
However, setting an arbitrary delay in the hope that iSCSI comes up in time is not a great idea for a production system. You could miss the window by a second and still end up with a failed system.
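Instead of a fixed delay, a script can poll until the storage answers or a retry budget runs out. A minimal sketch of such a loop follows; `wait_until` and whatever probe command is passed to it are hypothetical names, not part of PVE.

```shell
#!/bin/bash
# wait_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# Returns 0 on the first success, 1 if every attempt failed.
wait_until() {
    local max_attempts="$1" delay="$2"
    shift 2
    local attempt
    for ((attempt = 1; attempt <= max_attempts; attempt++)); do
        if "$@"; then
            return 0
        fi
        # No point sleeping after the final failed attempt.
        if ((attempt < max_attempts)); then
            sleep "$delay"
        fi
    done
    return 1
}
```

For example, `wait_until 60 5 storage_is_available` would retry a (hypothetical) `storage_is_available` check for roughly five minutes before giving up, which copes with the two sides waking up at different times.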

For the script, the rough workflow would be:
Create, in advance, a list of the VMs that should be started; this could be a PVE tag that you read via the API.
Check via the PVE API that the iSCSI storage is available.
If it is available, run through the list of VMs and start them.

good luck


 
However, setting an arbitrary delay in the hope that iSCSI comes up in time is not a great idea for a production system. You could miss the window by a second and still end up with a failed system.
I added an extra 120 seconds on top of what I already had (60), but I fear I am just guessing right now. The Proxmox cluster and the iSCSI host sit on different UPS banks and will have different wake-up times.
For the script, the rough workflow would be:
Create, in advance, a list of the VMs that should be started; this could be a PVE tag that you read via the API.
Check via the PVE API that the iSCSI storage is available.
If it is available, run through the list of VMs and start them.
Onwards into the API I go, then. Thank you, that is super helpful. Much appreciated!
 
The place to deal with this is in a hook script. See https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_hookscripts
I just tried it with intentionally non-functioning storage. The VM start fails with a storage error and the hook script is not executed. It's possible I am not doing it right, although the hook works once I fix the storage.

Edit: it looks like this was due to having Cloud Init, which is regenerated _before_ the pre-start hook is executed. This is arguably wrong behavior.

I'd feel safer having the control that the OP is looking for outside of the PVE start process.


 
#!/bin/bash
# Proxmox calls a hook script with the VM ID as $1 and the phase as $2.

TARGET_IP="<TARGET_IP>"
TARGET_IQN="<TARGET_IQN>"
MAX_ATTEMPTS=30
DELAY_BETWEEN_ATTEMPTS=10

VMID="$1"
PHASE="$2"

log() {
    echo "$(date +'%Y-%m-%d %H:%M:%S') - $1"
}

wait_for_iscsi() {
    local attempt=1
    while [ "$attempt" -le "$MAX_ATTEMPTS" ]; do
        # Use iscsiadm to perform a discovery on the target
        if iscsiadm -m discovery -t sendtargets -p "$TARGET_IP" | grep -qF -- "$TARGET_IQN"; then
            log "iSCSI target $TARGET_IQN at $TARGET_IP is responding. Proceeding..."
            return 0
        fi
        log "Waiting for iSCSI target $TARGET_IQN at $TARGET_IP to respond... Attempt $attempt/$MAX_ATTEMPTS"
        sleep "$DELAY_BETWEEN_ATTEMPTS"
        ((attempt++))
    done
    log "Failed to connect to iSCSI target $TARGET_IQN at $TARGET_IP after $MAX_ATTEMPTS attempts."
    return 1
}

case "$PHASE" in
    pre-start)
        if [ "$VMID" = "<VMID>" ]; then
            log "Starting pre-start checks for VM $VMID..."
            # A non-zero exit during pre-start aborts the VM start.
            wait_for_iscsi || exit 1
        fi
        ;;
    *)
        # Other hooks can go here
        ;;
esac

exit 0

--edit gah stripped indentations :( sorry...
 
--edit gah stripped indentations :( sorry...
Nice script. As I mentioned, it works if Cloud Init is not on iSCSI storage; otherwise PVE tries to generate it first and never gets to execute the hook script.

Personally, I would use the PVE API to confirm the storage is in the "available" state. That avoids the case where iscsiadm succeeds but pvestatd has not updated yet.


 
Nice script. As I mentioned, it works if Cloud Init is not on iSCSI storage
Cloud Init was not mentioned in the OP's requirements. I'm glad you ran into this anyway; I now know (and will hopefully remember) that there may be a conflict between Cloud Init and hook scripts, so this was most profitable ;)

Personally, I would use the PVE API to confirm the storage is in the "available" state
Yeah, that can work too. Just replace the iSCSI detection logic with pvesm status | grep [storname] | awk '{print $3}'. pvesm may be delayed in detecting the storage, and I'm not entirely sure whether pvesm needs to detect the storage before you can use it. If I remember, I'll test it at some point when I'm feeling particularly motivated. If the storage is hung for pvestatd, the API may not even work.
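One caveat with that pipeline: grep matches substrings, so a storage named iSCSI-Volume1 would also match iSCSI-Volume10. A sketch that matches the first column exactly is below. It assumes the usual `pvesm status` layout (storage name in column 1, status in column 3) and that a healthy storage reports `active`; the function names are hypothetical.

```shell
#!/bin/bash
# Read `pvesm status` output on stdin and print the Status column
# for the storage whose name matches exactly. The header line is skipped.
storage_status() {
    local name="$1"
    awk -v name="$name" 'NR > 1 && $1 == name { print $3 }'
}

# Return 0 only when the named storage reports "active" (assumed healthy value).
storage_is_active() {
    [ "$(pvesm status 2>/dev/null | storage_status "$1")" = "active" ]
}
```

`storage_is_active iSCSI-Volume1` can then stand in for the iscsiadm check. If pvesm itself fails or hangs, the function simply reports the storage as not active, so a timeout around the call may still be needed.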
 
Cloud Init was not mentioned in the OP's requirements. I'm glad you ran into this anyway; I now know (and will hopefully remember) that there may be a conflict between Cloud Init and hook scripts, so this was most profitable
I agree, Cloud Init is our standard config, so I am happy I ran into this as well :-) Good to know.
pvesm status | grep [storname] | awk '{print $3}'.
I'd use pvesh or curl and process the JSON, for the stability of the output. We actually ran into issues when building our Proxmox testing CI where storage should have been available but pvestatd had not detected it yet, so the VM creation/execution failed. The difference was only a few seconds.


 
Something like this:
Bash:
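#!/bin/bash
# Wait up to MYCOUNT seconds (first argument, default 120) for storage to come online.
set -o pipefail
MYCOUNT=${1:-120}
online=0

for ((count = 0; count < MYCOUNT; count++)); do
  # List every storage of the given plugin type that is NOT "available".
  # "blockbridge" is the plugin type used here; for plain iSCSI it would be "iscsi".
  # A failing pvesh or jq call counts as "not online yet".
  if storage_status=$(pvesh get /cluster/resources --type storage --output-format json | jq -r '.[] | select(.plugintype == "blockbridge" and .status != "available")') && [[ -z $storage_status ]]; then
    echo "All storage is online"
    online=1
    break
  fi
  sleep 1
done

if ((online == 0)); then
  echo "iSCSI storage was not available after waiting for $MYCOUNT seconds" >&2
  exit 9
fi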



 
Lots to catch up on, I see. I have been working on a pvesh script as well, and it has been a discovery trip for me for sure. I did not realize at first that shell access to API-like controls was available, so that is a neat part. Where I stopped last night was reliable discovery of the iSCSI storage status.

For reference, this is how the iSCSI storage appears to be attached:

Code:
{
    "content": "none",
    "disk": 0,
    "id": "storage/node1/iSCSI-Volume1",
    "maxdisk": 0,
    "node": "node1",
    "plugintype": "iscsi",
    "shared": 1,
    "status": "available",
    "storage": "iSCSI-Volume1",
    "type": "storage"
  },

This seems to report the storage status correctly. I will have to emulate an outage to see whether it changes the status correctly (I assume so).

Code:
root@node1:~# pvesh get /cluster/resources -type storage --output json|jq -r '.[]|select(.plugintype == "iscsi" and .status == "available") // empty'           
{
  "content": "none",
  "disk": 0,
  "id": "storage/node1/iSCSI-Volume1",
  "maxdisk": 0,
  "node": "node1",
  "plugintype": "iscsi",
  "shared": 1,
  "status": "available",
  "storage": "iSCSI-Volume1",
  "type": "storage"
}
{
  "content": "none",
  "disk": 0,
  "id": "storage/node3/iSCSI-Volume1",
  "maxdisk": 0,
  "node": "node3",
  "plugintype": "iscsi",
  "shared": 1,
  "status": "available",
  "storage": "iSCSI-Volume1",
  "type": "storage"
}
{
  "content": "none",
  "disk": 0,
  "id": "storage/node2/iSCSI-Volume1",
  "maxdisk": 0,
  "node": "node2",
  "plugintype": "iscsi",
  "shared": 1,
  "status": "available",
  "storage": "iSCSI-Volume1",
  "type": "storage"
}

cloudinit was not mentioned in OPs requirement.
It is not, at least for now. The VMs we're running are relatively simple and do not need any cloudinit at this time, so I guess I was lucky not to have run into it.

The only logical problem I still need to solve is where to run this script. I have three nodes in the cluster. Running the script on one of the nodes introduces a potential point of failure; running it on all three nodes introduces a potential conflict when multiple nodes attempt to start a VM. I did notice that the transition from stopped to starting with iSCSI storage does take a good while (sometimes 30-60 seconds or more), so I would have to run very long delay loops in the script to sidestep this problem, or have another way to detect the VM status reliably.
 

I will have to emulate an outage to see whether it changes the status correctly (I assume so)
Just change the condition check from "available" to anything else.
The only logical problem I still need to solve is where to run this script. I have three nodes in the cluster. Running the script on one of the nodes introduces a potential point of failure; running it on all three nodes introduces a potential conflict when multiple nodes attempt to start a VM.
As @alexskysilk mentioned, you could run it as a hook script. For that you will need shared file storage where you can store your snippets/scripts.
I did notice that the transition from stopped to starting with iSCSI storage does take a good while (sometimes 30-60 seconds or more), so I would have to run very long delay loops in the script to sidestep this problem, or have another way to detect the VM status reliably.
For this reason, I would hesitate to run it as a hook script: there is a big chance of a timing conflict. You could still try it; maybe it will be OK. A different approach could be:

Run the script on an external controller. Convert the pvesh calls to curl, and add logic to probe PVE cluster availability and select a healthy node. Use the API to start the VMs serially.
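A rough sketch of the "select a healthy node" part for an external controller follows. It assumes an API token is available in a `PVE_TOKEN` variable in the form USER@REALM!TOKENID=SECRET, that the nodes listen on the default port 8006, and that certificate checking is skipped (`--insecure`) because of self-signed certificates; the function name and host names are hypothetical.

```shell
#!/bin/bash
# Print the first host whose PVE API answers an authenticated request.
# Returns 1 if none of the given hosts respond.
first_healthy_node() {
    local host
    for host in "$@"; do
        # /version is a cheap endpoint; --fail turns HTTP errors into a non-zero exit.
        if curl --silent --fail --insecure --max-time 5 \
            --header "Authorization: PVEAPIToken=${PVE_TOKEN:-}" \
            "https://${host}:8006/api2/json/version" > /dev/null; then
            printf '%s\n' "$host"
            return 0
        fi
    done
    return 1
}
```

A controller could call `first_healthy_node node1 node2 node3` and then send its storage queries and VM start requests (a POST to /api2/json/nodes/<node>/qemu/<vmid>/status/start) to whichever host comes back, one VM at a time.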

If you don't have a controller, run it on all nodes, and create a lock file indicating that an instance is running already.
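A minimal sketch of such a lock uses `mkdir`, which either creates the directory or fails, so only one caller can win. For this to coordinate several nodes, the lock path has to live somewhere all nodes can see; which shared location is suitable, and whether `mkdir` stays atomic there, is an assumption to verify on your setup. The function names are hypothetical.

```shell
#!/bin/bash
# Try to take the lock: succeeds only if the directory did not exist yet.
acquire_lock() {
    local lock_dir="$1"
    mkdir "$lock_dir" 2>/dev/null
}

# Release the lock so the next run can take it.
release_lock() {
    local lock_dir="$1"
    rmdir "$lock_dir" 2>/dev/null
}
```

A script would call `acquire_lock "$LOCK_DIR" || exit 0` near the top and set `trap 'release_lock "$LOCK_DIR"' EXIT` right after, so the lock is dropped even on failure. A node that loses power while holding the lock leaves it behind, so some form of stale-lock cleanup is still needed.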

There are a lot of caveats and edge cases here.

Good luck.


 
After some back and forth with bash, I think I have a working solution that just needs a service wrapped around it. Sharing it in case anybody finds it useful.

Code:
#!/bin/bash
# Start every non-running VM tagged "iscsi" on each node whose iSCSI storage is available.
for node_name in $(pvesh get /nodes --output-format json | jq -r '.[].node')
do
    # Skip nodes that host no VMs.
    vm_count=$(pvesh get /nodes/"$node_name"/qemu --output-format json | jq length)
    if [ "$vm_count" != "0" ]
    then
        # Count the iSCSI storages on this node that report "available".
        available=$(pvesh get /cluster/resources --type storage --output-format json | jq --arg node "$node_name" '[.[] | select(.plugintype == "iscsi" and .status == "available" and .node == $node)] | length')
        if [ "${available:-0}" -gt 0 ]
        then
            # VMs without tags have no "tags" field, so default it to "" before matching.
            for qemu_id in $(pvesh get /nodes/"$node_name"/qemu --output-format json | jq -r '.[] | select((.tags // "") | test("iscsi")) | .vmid')
            do
                vm_status=$(pvesh get /nodes/"$node_name"/qemu/"$qemu_id"/status/current --output-format json | jq -r '.status')
                if [ "$vm_status" != "running" ]
                then
                    pvesh create /nodes/"$node_name"/qemu/"$qemu_id"/status/start
                fi
            done
        fi
    fi
done
 