Why Not Integrate `pct stop <containerID> --kill` into the Web GUI?

Lonnie

I have been a dedicated user of Proxmox for several versions, and I have consistently noticed an issue with the web interface's ability to immediately stop a virtual machine or container. Specifically, the "Stop Immediately" option in the right-click menu has never worked for me.

Today, while using Proxmox 8.2.2, I encountered a significant problem with a Debian container. The summary status indicated that the container was running, but I couldn't SSH into it. Additionally, the console provided by the web interface couldn't log in, similar to the SSH issue. I tried both the "Shutdown" and "Stop" options, but neither could stop the container. When I attempted to reboot the server, it hung indefinitely. Ultimately, I had to physically shut down the node before I could successfully bring it back online.

The error messages in the web UI cluster log mentioned an inability to acquire a lock, but when I tried to review these logs later, the sub-logs were blank, preventing me from relaying the exact messages.

After all of this, I referred to my notes and found that the command
Code:
pct stop <containerID> --kill
has effectively stopped containers for me in the past. This leads me to question why this reliable method isn't integrated into the web UI. The "Stop" option is supposed to achieve this, but when it fails, it would be beneficial to have an alternative. Perhaps "Stop" could be wired to this command or an additional "Kill" option could be introduced to ensure users can stop containers without leaving the web GUI, which excels in other areas.

I appreciate your attention to this matter and look forward to any improvements you can make to address this issue.
 
Hi,

> I have been a dedicated user of Proxmox for several versions, and I have consistently noticed an issue with the web interface's ability to immediately stop a virtual machine or container. Specifically, the "Stop Immediately" option in the right-click menu has never worked for me.
>
> Today, while using Proxmox 8.2.2, I encountered a significant problem with a Debian container. The summary status indicated that the container was running, but I couldn't SSH into it. Additionally, the console provided by the web interface couldn't log in, similar to the SSH issue. I tried both the "Shutdown" and "Stop" options, but neither could stop the container. When I attempted to reboot the server, it hung indefinitely. Ultimately, I had to physically shut down the node before I could successfully bring it back online.
>
> The error messages in the web UI cluster log mentioned an inability to acquire a lock, but when I tried to review these logs later, the sub-logs were blank, preventing me from relaying the exact messages.

The lock is there to prevent multiple tasks from accessing the same config and potentially causing inconsistencies. In Proxmox VE 8.2, it's possible to override shutdown tasks when doing a hard stop. Did you use that checkbox?

> After all of this, I referred to my notes and found that the command
> Code:
> pct stop <containerID> --kill
> has effectively stopped containers for me in the past.

There is no kill option for the pct stop command. Do you mean lxc-stop?

> This leads me to question why this reliable method isn't integrated into the web UI. The "Stop" option is supposed to achieve this, but when it fails, it would be beneficial to have an alternative. Perhaps "Stop" could be wired to this command or an additional "Kill" option could be introduced to ensure users can stop containers without leaving the web GUI, which excels in other areas.

Because, depending on what is going on in the container, this can lead to data loss, so it should only be a last resort, not some routine UI action.

> I appreciate your attention to this matter and look forward to any improvements you can make to address this issue.
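
For reference, the LXC tool mentioned in this reply does have a kill switch: `lxc-stop` accepts `-k`/`--kill`, and on Proxmox VE the LXC container name is the numeric container ID. Below is a minimal sketch of a guarded wrapper. The function name `force_stop_ct` and the `RUN` variable are hypothetical, and it only prints the command unless `RUN=1` is set, in keeping with the last-resort caveat.

```bash
# Hypothetical helper: print (or, with RUN=1, execute) a forced stop of one container.
force_stop_ct() {
    local ctid="${1:-}"
    # Refuse anything that is not a plain numeric container ID.
    case "$ctid" in
        ''|*[!0-9]*)
            echo "usage: force_stop_ct <numeric container ID>" >&2
            return 2
            ;;
    esac
    if [ "${RUN:-0}" = "1" ]; then
        # Kills the container's processes instead of requesting a clean shutdown.
        lxc-stop -n "$ctid" --kill
    else
        # Dry run by default: show what would be executed.
        echo "lxc-stop -n $ctid --kill"
    fi
}
```

Calling `force_stop_ct 101` only echoes the command; `RUN=1 force_stop_ct 101` would run it on the host.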
 
@fiona, thanks for the reply. I didn't actually stop the container with the command I specified. I found that command in my notes after I had already addressed the situation by physically powering off the server. Here are the commands from my notes, which were likely written years ago:

Bash:
# List containers and their IDs
pct list

# Kill container by ID:
pct stop <containerID> --kill

# Get the process ID of the virtual machine you want to kill:
qm list

# Kill that virtual machine's Process
kill -9 <pid>

I did indeed see the checkbox you mentioned, and I tried to stop the container both ways (with the box checked and unchecked).

None of the provided options (with or without checked boxes) could stop this container. Rebooting the node hung indefinitely, causing me to improperly power off the server to stop the container!

I understand the need for caution when it comes to data loss, but consider this: which poses more risk of data loss?

1. Providing a GUI option that can actually kill a runaway container, OR
2. Yanking the power cable out of the wall to stop the container?

See my point? Improperly powering off the node poses no less risk of data loss than a proper kill option.

Therefore, I think the kill option should indeed be available in the web GUI. Upon selecting this option, you can include warnings in red that it should only be used as a last resort and to try other options first. However, I can't see a good reason to omit this option from the Web GUI altogether if all other alternatives risk the same degree of data loss as having the option would.

For example, consider Virtual Machine Manager. Like Proxmox, VMM also provides those less invasive options for stopping a VM. However, when those options don't work, I don't have to do a hard power off of the HOST MACHINE to stop a VM. Instead, VMM has a "Force Off" option that can be used. When you choose this option, it stops the VM immediately.

2024-07-15_17-08.png

Surely, providing this option is better than omitting it because when none of the other options work to stop the container, you have to resort to an even more invasive alternative, potentially risking even more disruption/data loss for other running VMs and containers that are not even hung up. If there is a way to do this with less data loss, those steps could be automated into the event handler of the "Force Off" click.

Thanks for considering this.
 
> Therefore, I think the kill option should indeed be available in the web GUI. Upon selecting this option, you can include warnings in red that it should only be used as a last resort and to try other options first. However, I can't see a good reason to omit this option from the Web GUI altogether if all other alternatives risk the same degree of data loss as having the option would.

First off, warnings in red don't work. People ignore them, especially when things have gone wrong. See all the threads about people wrecking their system after getting a warning about what they were about to do.

Secondly, if a process is in an uninterruptible wait inside the kernel (state D in ps), it can't be killed, not even with kill -9. That can happen because of driver bugs or, more often, with a hard NFS mount whose server is down. Remember, a container is not a VM. It is running on the host's kernel. My point is that there are cases that can't be recovered.

Something like that is probably what happened in your case. A red button would not be able to fix it. If it happens a lot to one of your containers, maybe that workload should be in a VM instead. Then it doesn't matter so much if the guest kernel is hung.
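
To check whether that is what is going on, one can look for processes in state D on the host. A minimal sketch, assuming procps `ps` and `awk` are available; the helper name `filter_dstate` is hypothetical, and it expects the STAT column to be the second field:

```bash
# Hypothetical helper: keep the header line plus every row whose STAT column
# (second field) starts with D, i.e. uninterruptible sleep.
filter_dstate() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# On the Proxmox host (not run here):
#   ps -eo pid,stat,wchan:32,comm | filter_dstate
```

If the hung container's processes show up in that list, no stop or kill request will take effect until the kernel wait itself ends.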
 
It would be great to have the ability to `pct unlock <lxc-id>` from the GUI. I had this issue today due to a failed scheduled backup and I rebooted the node via GUI (on my phone), but even after the restart, my lxc container wouldn't start and a lock icon was showing for that container. I wasn't able to remotely ssh into the host from my phone and I had to wait until I got back to my desk to run `pct unlock 123`. Would've been great to be able to do this from the GUI. This may be slightly off topic from the OP's request, but I'd say it's in a similar vein.
 
Hi,

> It would be great to have the ability to `pct unlock <lxc-id>` from the GUI. I had this issue today due to a failed scheduled backup and I rebooted the node via GUI (on my phone), but even after the restart, my lxc container wouldn't start and a lock icon was showing for that container. I wasn't able to remotely ssh into the host from my phone and I had to wait until I got back to my desk to run `pct unlock 123`. Would've been great to be able to do this from the GUI. This may be slightly off topic from the OP's request, but I'd say it's in a similar vein.

You can use the shell in the UI (select your node, then Shell) instead of SSH. The reason unlock is not prominently exposed is that a leftover lock most often results from an unexpected failure, so an admin should first investigate whether unlocking is safe.
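
As a rough illustration of that "investigate first" step from the node's shell, the sketch below prints which operation left the lock before anything is unlocked. It assumes the lock shows up as a `lock:` line in the `pct config` output; the helper name `ct_lock_reason` is hypothetical, and `123` is just the example ID from the post above.

```bash
# Hypothetical helper: read container config text on stdin and print the value
# of its "lock:" line, if any (for example "backup").
ct_lock_reason() {
    awk -F': ' '$1 == "lock" { print $2 }'
}

# In the node's web shell (not run here):
#   pct config 123 | ct_lock_reason   # see which operation left the lock
#   pct unlock 123                    # only once that operation is confirmed to be gone
```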