Shutdown/Reboot fails, Web GUI fails

silverado

I have been using several Proxmox versions from 2.x to 3.x and I was (nearly always) very happy with them.

But v4.x drives me crazy.

My use scenario: I have a Proxmox standalone server that is running 24/7. The machine is fully encrypted. For backup purposes only, I have an additional file server that offers an NFS share to Proxmox. This file server is powered off most of the time and only gets started when it is really needed for backing up or restoring a virtual machine. Every night the file server wakes up automatically, Proxmox writes automatic backups to the NFS share on the file server, and after that the file server shuts down again.

This has been working flawlessly with Proxmox 2.x and 3.x.

Now I am trying to use Proxmox 4.x in the same way but I get issues:

1) Reboot or shutdown fails
Most of the time the Proxmox server hangs with the message "Reached target shutdown" and I have to press the reset or power button manually.

2) The Web GUI fails
This is happening many times when the file server is up and running and I try to access its content in the web GUI. The GUI shows me the rotating sand clock endlessly. The whole web interface hangs.

"service pveproxy restart" or "service pvestatd restart" via ssh do not help. It even becomes impossible to log in to the web interface any more.

The log says:
Dec 22 07:39:52 server2 pveproxy[1133]: proxy detected vanished client connection
Dec 22 07:39:55 server2 pvedaemon[1125]: <root@pam> successful auth for user 'root@pam'
Dec 22 07:40:29 server2 pveproxy[1133]: proxy detected vanished client connection

I have to reboot to fix the situation, but rebooting fails most of the time as described above.

This makes Proxmox 4.x unusable for me :(
 
Most of the time the Proxmox server hangs with the message "Reached target shutdown" and I have to press the reset or power button manually.

Are you on the newest version? We fixed quite a few bugs/issues regarding systemd and shutdown behaviour.

This is happening many times when the file server is up and running and I try to access its content in the web GUI. The GUI shows me the rotating sand clock endlessly. The whole web interface hangs.

So it does not happen when your file server is down? Is it an NFS target? If not, how are they connected?
What's the output from:
Code:
df -h
pvesm status
from the PVE CLI when the fileserver is up and running?
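
One caveat when collecting this output: if the share is in a bad state, df itself can block on the stale NFS mount. The following is a minimal sketch, not an official PVE tool, that lists the NFS mounts from /proc/mounts and probes each one under a timeout. The function names list_nfs_mounts and probe_mount are made up for illustration, and it assumes the GNU coreutils versions of timeout and stat.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of PVE).

# Print the mount point of every NFS mount found in a mounts table.
# Fields per line: device mountpoint fstype options dump pass
# Assumes mount points without spaces.
list_nfs_mounts() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "${1:-/proc/mounts}"
}

# Probe a mount point with statfs, but give up after N seconds (default 5)
# instead of blocking on a dead server. Also reports "unresponsive" when
# the path does not exist or stat fails for another reason.
probe_mount() {
    if timeout -s KILL "${2:-5}" stat -f "$1" >/dev/null 2>&1; then
        echo "ok $1"
    else
        echo "unresponsive $1"
    fi
}
```

To check all shares in one go, something like `list_nfs_mounts | while read -r mp; do probe_mount "$mp"; done` should do. If a share shows up as unresponsive, a plain df will most likely hang on it as well.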
 
Are you on the newest version? We fixed quite a few bugs/issues regarding systemd and shutdown behaviour.

I made a fresh install yesterday so everything should be up to date. Before that I tested Proxmox 4.0 for about a month and had shutdown/reboot issues nearly all the time and experienced failures of the WebGui every once in a while.

So it does not happen when your file server is down?
Good question. After the fresh install yesterday the file server was up all the time and everything worked fine, even reboots. Then I shut down the file server and went to bed. This morning I powered up the file server, waited a while and then tried to access its content to restore some VMs. Then the web GUI failed/froze as described.

Now I've shut down the file server and the web GUI behaves as expected: it searches for a while and then says that the "storage is not online (500)". The web GUI stays responsive and everything is fine.

Then I reboot. Reboot fails with "Reached target shutdown" :mad:

After Proxmox is up again I start the file server, wait a little and access its content via the web GUI without any problems. :confused:

Is it an NFS target?
Yes.

What's the output from:
Code:
df -h
from the PVE CLI when the fileserver is up and running?
Code:
Filesystem                  Inodes   IUsed     IFree IUse% Mounted on
udev                       2038981     527   2038454    1% /dev
tmpfs                      2042415     704   2041711    1% /run
/dev/dm-2                 42729472   50822  42678650    1% /
tmpfs                      2042415      85   2042330    1% /dev/shm
tmpfs                      2042415      10   2042405    1% /run/lock
tmpfs                      2042415      17   2042398    1% /sys/fs/cgroup
/dev/sda2                    62496     322     62174    1% /boot
/dev/sda1                        0       0         0     - /boot/efi
/dev/mapper/vg00-lvdata   76513280      21  76513259    1% /var/lib/vz
tmpfs                      2042415      11   2042404    1% /run/lxcfs/controllers
cgmfs                      2042415      13   2042402    1% /run/cgmanager/fs
/dev/fuse                    10000      24      9976    1% /etc/pve
192.168.2.24:/data/daten 181579776 1077772 180502004    1% /mnt/pve/datasrv1

pvesm status
Code:
datasrv1  nfs 1  5764768768  5192859648  281381888 95.36%
local  dir 1  1204745768  2588280  1140953216 0.73%

After the fresh install yesterday I was really happy that shutdown&reboot seemed to work, finally. And it was really frustrating to see that f***ing "Reached target shutdown" again this morning.

At least things are starting to make some kind of sense to me: the failure of reboot/shutdown seems to be related to the status of the NFS share. If Proxmox believes that it is up, reboots and shutdowns seem to work. If the NFS share is down (or Proxmox failed to access it for some reason, which also froze the GUI), then Proxmox hangs with "Reached target shutdown".

The failure of the web GUI seems to happen only sporadically.
 
After the fresh install yesterday I was really happy that shutdown&reboot seemed to work, finally. And it was really frustrating to see that f***ing "Reached target shutdown" again this morning.

I understand the frustration, but this is caused by the offline NFS share. First try:
Code:
systemctl reboot
instead of shutdown -r now

If that does not work, try mounting the NFS share with the soft option as a workaround.

This morning I powered up the file server, waited a while and then tried to access its content to restore some VMs. Then the web GUI failed/froze as described.

I will try to reproduce both issues :)
 
I understand the frustration, but this is caused by the offline NFS share,
The simple idea that there seems to be a clear reason, and that there might be a workaround or even a solution, compensates for my frustration.

first try:
Code:
systemctl reboot
instead of shutdown -r now
I've tried different shutdown and reboot methods with Proxmox 4.0 and had no success at all. But I'll try again with Proxmox 4.1.

By the way: As I already said I'm using a fully encrypted Proxmox. That means that I install Debian first and then Proxmox on top of it. Maybe that matters?

If that does not work, try mounting the NFS share with the soft option as a workaround.
How exactly do I do that? Like that in /etc/pve/storage.cfg?

Code:
dir: local
   path /var/lib/vz
   content vztmpl,iso,images,rootdir
   maxfiles 0

nfs: datasrv1
   server 192.168.2.24
   path /mnt/pve/datasrv1
   export /data/daten
   maxfiles 3
   options vers=3,soft
   content backup

What are the disadvantages of the "soft" option?

I will try to reproduce both issues :)
Thank you!
 
The simple idea that there seems to be a clear reason and that there might be a workaround or even a solution compensates my frustration.

It's my first guess, the only thing which makes sense to me. Also, I found some bug reports in other Debian-based distros which were added very recently. It _could_ be a problem with systemd; I will need to look into it more before I can say more. Maybe someone else already knows more.

By the way: As I already said I'm using a fully encrypted Proxmox. That means that I install Debian first and then Proxmox on top of it. Maybe that matters?

Not that I know of, but I'll keep that in mind when testing.

options vers=3,soft

Exactly like this. Sorry, I forgot to mention that :)

Quoting the nfs man page:
[...]
If the soft option is specified, then the NFS client fails an NFS request after retrans retransmissions have been sent, causing the NFS client to return an error to the calling application.

NB: A so-called "soft" timeout can cause silent data corruption in certain cases. As such, use the soft option only when client responsiveness is more important than data integrity.
Using NFS over TCP or increasing the value of the retrans option may mitigate some of the risks of using the soft option.
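
Since an already mounted share keeps the options it was mounted with, it may be worth confirming that soft is really in effect after editing storage.cfg. A minimal sketch, assuming the kernel lists the effective options in the fourth field of /proc/mounts; the function names mount_options and mount_has_option are hypothetical helpers, not PVE commands.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative).

# Print the option string recorded for a mount point.
# Optional second argument: an alternative mounts table (default /proc/mounts).
mount_options() {
    awk -v mp="$1" '$2 == mp { print $4 }' "${2:-/proc/mounts}"
}

# Succeed if the exact option (e.g. "soft") is among the mount options.
# Usage: mount_has_option MOUNTPOINT OPTION [MOUNTS_FILE]
mount_has_option() {
    mount_options "$1" "${3:-/proc/mounts}" | tr ',' '\n' | grep -qxF -- "$2"
}
```

For example, `mount_has_option /mnt/pve/datasrv1 soft && echo "soft is active"` prints the message only when the share is currently mounted with soft. `mount_options /mnt/pve/datasrv1` also shows the timeo and retrans values the client is using.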
 
Thanks a lot for your efforts to help me.

I've tried the proposed workaround "soft":

1) I changed the config file and rebooted while the NFS-server was online > Success.

2) After the reboot (with the soft option active now) I took the NFS server offline and rebooted again after a few minutes > "Reached target shutdown" - The server hangs again. :mad:

So this workaround doesn't seem to help. The only way so far to get Proxmox to reboot and shutdown properly seems to be to keep the NFS server always on. But that's not really an option for me:
- I want my backups to be offline.
- I do not want to pay the power costs for an extra server that has nothing to do for 90% of the time.

So I think I will go back to Debian Wheezy and Proxmox 3. Systemd sucks! Jessie is the first Debian distro that really disappoints me.:(
 
This is happening many times when the file server is up and running and I try to access its content in the web GUI. The GUI shows me the rotating sand clock endlessly. The whole web interface hangs.

So I cannot reproduce this in any way. I guess you are using a reliable and fast network between the PVE4 node and the NFS server?

After the reboot (with the soft option active now) I took the NFS server offline and rebooted again after a few minutes > "Reached target shutdown" - The server hangs again. :mad:

This is reproducible, although the server does restart, but only after 5-10 min, so that's not really a solution.
The general assumption is that NFS shares are always reachable, but naturally I understand your argument.

What you could do is to force unmount the NFS when it's offline on reboots:
Code:
umount -f -l /mnt/pve/nfs-share
systemctl reboot

Solves it for me completely.
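
If you want to wrap this in a script, here is a rough sketch of the idea: go through the NFS mounts and only force-unmount those whose server does not answer. It is untested on a real PVE node. The function names, the MOUNTS_FILE and DRY_RUN variables and the ping-based reachability check are all assumptions for illustration, and it only handles plain host:/path sources (no IPv6 literals, no spaces in mount points).

```shell
#!/bin/sh
# Hypothetical helper script (not part of PVE): lazily force-unmount
# NFS shares whose server is down, e.g. right before a reboot.

MOUNTS_FILE="${MOUNTS_FILE:-/proc/mounts}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print what would be done

# Print "server mountpoint" for every NFS mount in the mounts table.
nfs_mounts() {
    awk '$3 == "nfs" || $3 == "nfs4" { split($1, a, ":"); print a[1], $2 }' "$MOUNTS_FILE"
}

# Assumed reachability test: one ICMP echo with a 2 second timeout.
# A server can answer ping while its NFS service is down, so this is
# only an approximation.
server_up() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# Keep mounts whose server answers; unmount (or report) the others.
unmount_dead_nfs() {
    nfs_mounts | while read -r server mp; do
        if server_up "$server"; then
            echo "keep $mp"
        elif [ "$DRY_RUN" = 1 ]; then
            echo "would run: umount -f -l $mp"
        else
            umount -f -l "$mp"
        fi
    done
}
```

By default, calling unmount_dead_nfs only prints what it would do. With DRY_RUN=0 and root privileges it runs the same umount -f -l as above, after which systemctl reboot can follow.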

Systemd sucks

It has some good and bad sides, as the other init systems also have. And you can map more with it than previously. But yes, there are problems.
 
So I cannot reproduce this in any way. I guess you are using a reliable and fast network between the PVE4 node and the NFS server?
Yes, the network is well-cabled gigabit Ethernet, the NICs are Intel PT server grade and everything works perfectly. As I said before, this happens only from time to time. But it happened first thing this morning and led to this annoying "Reached target shutdown" message. Not even 5 minutes of using Proxmox and I had to go down to the server and press the reset button. That's why I described it as a bigger problem than it may really be.

This is reproducible, although the server does restart, but only after 5-10 min, so that's not really a solution.
The general assumption is that NFS shares are always reachable, but naturally I understand your argument.

What you could do is to force unmount the NFS when it's offline on reboots:
Code:
umount -f -l /mnt/pve/nfs-share
systemctl reboot

Solves it for me completely.
Okay, I'm right now installing my beloved Debian Wheezy, but I have the harddisks with Jessie+Proxmox 4 still available. So when my nerves have calmed down a bit I will surely give your solution a try. A shutdown script would be an acceptable solution as long as it works reliably.

It has some good and bad sides, as the other init systems also have. And you can map more with it than previously. But yes, there are problems.
I'm not really into that subject, but as far as I know the Debian developers had some arguments about systemd and even considered a fork. Personally I have not noticed any advantages, but the word "systemd" comes up frequently in the context of problems.

Thanks again for your great help!
 