[SOLVED] CPU soft lockup

piertho

New Member
Jun 23, 2017
Hi!

This is my first post on this forum, so I'm not sure whether this is the right place to ask.

I have a Proxmox server which crashes several times a day.

Here is the server summary, with the load when all VMs are started:

CPU usage: 1.35% of 16 CPU(s)
IO delay: 0.25%
Load average: 1.25, 1.49, 0.73
RAM usage: 36.23% (11.33 GiB of 31.26 GiB)
KSM sharing: 0 B
HD space (root): 1.37% (1.28 GiB of 93.99 GiB)
SWAP usage: 0.00% (0 B of 8.00 GiB)
CPU(s): 16 x Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz (1 Socket)
Kernel Version: Linux 4.10.15-1-pve #1 SMP PVE 4.10.15-15 (Fri, 23 Jun 2017 08:57:55 +0200)
PVE Manager Version: pve-manager/5.0-23/af4267bf


It runs the following VMs:

One Debian VM with 8 GB RAM and 4 CPUs
Another Debian VM with 4 GB RAM and 2 CPUs
One Ubuntu VM with 4 GB RAM and 2 CPUs
Two Windows Server 2012 VMs with 4 GB RAM and 2 CPUs

Attached to this post is a picture of the console displaying the error (sorry about that, I couldn't access the server from a terminal).

Does anyone have an idea?

Attachment: IMG_20170710_221515.jpg
 
Yes, this forum is the right one :)
This kind of problem tends to happen with either a buggy BIOS or buggy drivers.
Make sure your BIOS is up to date.

After the reboot, check whether the output of

dmesg -T

contains something which could give you a hint of the problem.
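
A minimal sketch of how that search could be narrowed down. The helper name filter_lockup_hints and the keyword list are assumptions chosen for illustration, not an official list of soft lockup indicators. Note that dmesg only covers the current boot; reading the previous boot with journalctl assumes the systemd journal is stored persistently on the host.

```shell
#!/bin/sh
# Hypothetical helper: keep only kernel log lines that commonly show up
# around a lockup. The keyword list is an assumption; extend it as needed.
filter_lockup_hints() {
    grep -iE 'soft lockup|hard lockup|call trace|hardware error|nouveau'
}

# Possible invocations (run as root on the Proxmox host):
#   dmesg -T | filter_lockup_hints             # current boot
#   journalctl -k -b -1 | filter_lockup_hints  # previous boot, needs a persistent journal
```

If the filter prints nothing, the full dmesg -T output is still worth reading, since the relevant message may not contain any of these keywords.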
 
Hi everyone!

Sorry for the late answer, work has been keeping me busy the past few days.
The BIOS seems to be up to date. I am not using ZFS.

I ran a test: I shut down the Debian VM with 8 GB RAM, and the server has now been running for 3 days. I will try with another VM shut down to see whether the problem comes from the Debian one.
 
Yes! I have an NVIDIA GT 710. As the motherboard is SLI, I need it to boot. What do you think I can do on this side?
The server crashed once during the last week with the 8 GB Debian VM shut down. It crashed yesterday, and I manually rebooted it shortly after it happened. It crashed again last night.

EDIT:

It seems I don't have any NVIDIA drivers installed... If I run apt-get purge nvidia* it says there is no such package.
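
The apt check only covers packaged (proprietary) drivers; nouveau ships as a module with the kernel itself, so it does not appear as an nvidia package. A rough sketch of how to see which kernel driver is actually bound to the graphics card, by parsing the output of lspci -k. The function name vga_kernel_driver is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -k` output on stdin and print the
# "Kernel driver in use" value for every VGA controller found.
vga_kernel_driver() {
    awk '
        # Device header lines start with the bus address (hex digits).
        /^[0-9a-f]/ { in_vga = ($0 ~ /VGA compatible controller/) }
        # Indented detail lines belong to the most recent header.
        in_vga && /Kernel driver in use:/ {
            sub(/^.*Kernel driver in use:[ \t]*/, "")
            print
        }
    '
}

# Possible invocation on the host:
#   lspci -k | vga_kernel_driver
```

If this prints nouveau, the open source driver is in use even though no nvidia package is installed.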
 
There is your problem. You can disable the nouveau driver (the open source NVIDIA driver). You will get a lower resolution, but it will work, trust me.

Create /etc/modprobe.d/blacklist-nouveau.conf (for example with nano) and paste this:

blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off

Then run the following three commands in a terminal (over SSH or on the console):

echo options nouveau modeset=0 | tee -a /etc/modprobe.d/nouveau-kms.conf
update-initramfs -u
reboot
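
After the reboot, it may be worth confirming that the blacklist took effect. A small sketch, assuming the standard /proc/modules format where the module name is the first field on each line; the helper name module_listed is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/modules-style text on stdin and succeed
# (exit status 0) only if the module named in $1 is listed.
module_listed() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Possible invocation on the host after rebooting:
#   if module_listed nouveau < /proc/modules; then
#       echo "nouveau is still loaded"
#   else
#       echo "nouveau is not loaded"
#   fi
```

If nouveau is still listed, check that update-initramfs -u completed without errors before the reboot.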

The problem is not the card but the open source driver, which sucks because NVIDIA as a company does not help the open source community (not nouveau's fault). Feel free to report back in a few days, but this is 99.9% certainly the solution (I have the same card).
 
OK, I just applied your solution. I had heard that NVIDIA cards cause problems with open source drivers, but I did not think it could make the processor enter an infinite loop...

I'll edit this message to report what happens.

Thank you!

EDIT:

I'm back to report.
It seems your solution worked! The server has not crashed for the past week.
Thank you again.
 
Thank you very much!
I have a server built with parts from China and I thought it had gone to waste.
I have Proxmox 8.0.4 and your solution worked perfectly.
The curious thing is that I have always used this video card.
:)
 