Constant "Stop job running for LSB" on reboot

kroem · Oct 13, 2016

I have some issues where I cannot perform a graceful reboot of my PVE 4.3 node.

Sometimes I get "Stop job running for PVE Manager", but that's because of stuck VMs. What is more annoying is the stop job running for LSB, with no limit. I need to hard reboot via IPMI to get around it.

I get some log messages if I leave it hanging, see screengrab. I would greatly appreciate any help, and I'll gather whatever information is needed if I get some pointers.

manu · Oct 13, 2016

It looks to me like the system is unmounting the NFS mount before the VMs are stopped.
Are you using other network shares on the machines besides NFS?

To see if the problem comes from NFS being stopped before the VMs, try the following:
shut down all the VMs on the host by calling

service pve-manager stop

Then try to reboot and see if the host hangs again.
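
To make this test meaningful, it helps to confirm that nothing is still running after pve-manager has stopped and before you reboot. A minimal sketch, assuming `qm list` prints one header line followed by one row per VM with the status in the third column; `count_running` is a hypothetical helper name, and containers (listed by `pct list`) are not covered here:

```shell
#!/bin/sh
# Hypothetical helper: count guests reported as "running" in
# `qm list`-style output read from stdin.
# Assumes one header line and the status in column 3.
count_running() {
    awk 'NR > 1 && $3 == "running" { n++ } END { print n + 0 }'
}

# Intended use on the node (not executed by this sketch):
#   service pve-manager stop
#   qm list | count_running    # should report 0 before you reboot
```

If the count is not zero, a VM is still holding files on the storage, and the hang on reboot would not tell you much about NFS itself.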

kroem · Oct 13, 2016

manu said:
It looks to me like the system is unmounting the NFS mount before the VMs are stopped.
Are you using other network shares on the machines besides NFS?

To see if the problem comes from NFS being stopped before the VMs, try the following:
shut down all the VMs on the host by calling

service pve-manager stop

Then try to reboot and see if the host hangs again.

No, all of the VMs are currently hosted over NFS, shared from the same host actually.

So, I stopped pve-manager and rebooted. First it got a little stuck on PVE API Proxy Server, but not for the full timer. Then it got to NFS like this, waited out the full timer and moved past it

and then "Reached target shutdown"

(These seem like "new" issues, but in reality the process it gets stuck on has differed a little before too - LSB/the unmounting of NFS)

Thank you in advance...

manu · Oct 13, 2016

OK, usually NFS will not shut down properly because processes are still trying to access files residing on the mount point.

Try the following (if your customers can tolerate the downtime):

* stop the pve-manager service
when it's finished:
* stop the nfs service

If NFS hangs, try to see what's using it:

fuser -uvm /path_to_my_nfs_mount
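
If there are several NFS mounts, the same check can be looped over all of them. A rough sketch, assuming the mounts appear in /proc/mounts with type `nfs` or `nfs4`; `list_nfs_mounts` and `show_nfs_users` are hypothetical helper names, and mount points containing spaces (escaped as `\040` in /proc/mounts) are not handled:

```shell
#!/bin/sh
# List mount points whose filesystem type is nfs or nfs4.
# Reads a mounts table (default /proc/mounts): field 2 is the
# mount point, field 3 the filesystem type.
list_nfs_mounts() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "${1:-/proc/mounts}"
}

# For each NFS mount point, show which processes are using it.
# fuser exits non-zero when nothing is found, so that is ignored.
show_nfs_users() {
    list_nfs_mounts "$@" | while read -r mp; do
        echo "== $mp"
        fuser -uvm "$mp" 2>&1 || true
    done
}
```

Run `show_nfs_users` as root once pve-manager has stopped. Note that fuser may itself block if the NFS server is no longer responding, so it is best run while the share is still reachable.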

kroem · Oct 13, 2016

This is not in production - well, yes, Sonos is not working at home when it's down, but I can live with that

root@cat:~# service pve-manager status
* pve-manager.service - PVE VM Manager
Loaded: loaded (/lib/systemd/system/pve-manager.service; enabled)
Active: inactive (dead) since Thu 2016-10-13 16:40:55 CEST; 5min ago
Process: 8040 ExecStop=/usr/bin/pvesh --nooutput create /nodes/localhost/stopall (code=exited, status=0/SUCCESS)
Process: 8036 ExecStop=/usr/bin/vzdump -stop (code=exited, status=0/SUCCESS)
Process: 4697 ExecStart=/usr/bin/pvesh --nooutput create /nodes/localhost/startall (code=exited, status=0/SUCCESS)
Main PID: 4697 (code=exited, status=0/SUCCESS)

Oct 13 16:24:30 cat pve-manager[4699]: <root@pam> starting task UPID:cat:000013E5:000012D3:57FF991E:qmstart:112:root@pam:
Oct 13 16:24:30 cat pve-manager[5093]: start VM 112: UPID:cat:000013E5:000012D3:57FF991E:qmstart:112:root@pam:
Oct 13 16:24:35 cat pve-manager[4699]: <root@pam> starting task UPID:cat:0000142A:000014C9:57FF9923:vzstart:115:root@pam:
Oct 13 16:24:35 cat pve-manager[5162]: starting CT 115: UPID:cat:0000142A:000014C9:57FF9923:vzstart:115:root@pam:
Oct 13 16:24:39 cat pve-manager[4697]: <root@pam> end task UPID:cat:0000125B:00000584:57FF98FC:startall::root@pam: OK
Oct 13 16:24:40 cat systemd[1]: Started PVE VM Manager.
Oct 13 16:40:54 cat systemd[1]: Stopping PVE VM Manager...
Oct 13 16:40:55 cat pve-manager[8040]: <root@pam> starting task UPID:cat:00001F6A:00019387:57FF9CF7:stopall::root@pam:
Oct 13 16:40:55 cat pve-manager[8040]: <root@pam> end task UPID:cat:00001F6A:00019387:57FF9CF7:stopall::root@pam: OK
Oct 13 16:40:55 cat systemd[1]: Stopped PVE VM Manager.

A whole lot of processes are still accessing it, it seems?

root@cat:~# fuser -uvm /mnt/pve/nfs_vol1
nfs_vol1/ nfs_vol1_VM/ nfs_vol1_backup/

The VM storage has nothing using it:

root@cat:~# fuser -uvm /mnt/pve/nfs_vol1_VM/
 USER PID ACCESS COMMAND
/mnt/pve/nfs_vol1_VM:
 root kernel mount (root)/mnt/pve/nfs_vol1_VM

(The output for the others is too large, so I had to put it in gists)

https://gist.github.com/anonymous/edb6b7f9da07f24aff9f6d86ed964497

https://gist.github.com/anonymous/6821640a93b2d109fb3b3419b8e76e9c

kroem · Oct 25, 2016

manu said:
OK, usually NFS will not shut down properly because processes are still trying to access files residing on the mount point.

Try the following (if your customers can tolerate the downtime):

Any ideas?
