Proxmox Host - random freeze(?)

Semmo

New Member
May 27, 2019
20
0
1
33
I still have no crash log, but since I disabled the C-states there have been no more freezes... This is on an Intel i7-3770.

But there was also a kernel update... hmmm...
 

n1nj4888

Member
Jan 13, 2019
87
2
8
39
I also haven’t had any similar crashes in the last couple of weeks. I haven’t changed anything C-state related in the BIOS, but I did move all VMs off the suspect node a couple of days ago (so that it was idle, to increase the chance of a C-state crash) and still no crash...

Possibly the kernel update has fixed this?
 

n1nj4888

Have you tried re-enabling the c-state changes to see whether it still crashes?
 

JanN

New Member
Mar 11, 2019
1
0
1
57
I've read several times that C-states higher than C1 can cause freezes/crashes on Linux servers...
BR
Jan
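For anyone who wants to check this without going into the BIOS, here is a minimal sketch. It only reads the idle states the kernel exposes under sysfs and prints the two kernel parameters commonly used to cap C-states (`intel_idle.max_cstate` and `processor.max_cstate`). The function names are made up for illustration, and whether capping at C1 actually cures the freezes in this thread is an assumption, not something confirmed here.

```shell
#!/bin/sh
# List the idle states (C-states) the kernel exposes for one CPU.
# $1: cpuidle directory; defaults to cpu0's sysfs path on a real host.
list_cstates() {
    dir="${1:-/sys/devices/system/cpu/cpu0/cpuidle}"
    for state in "$dir"/state*; do
        # Skip silently if the directory or the name file does not exist.
        [ -r "$state/name" ] || continue
        printf '%s: %s\n' "${state##*/}" "$(cat "$state/name")"
    done
}

# Print the kernel parameters that cap C-states at the given level.
# They would have to be added to the kernel command line by hand;
# this function changes nothing on the host.
cstate_cap_params() {
    max="${1:-1}"
    printf 'intel_idle.max_cstate=%s processor.max_cstate=%s\n' "$max" "$max"
}

# Read-only demo: show the current states and the parameters for a C1 cap.
list_cstates
cstate_cap_params 1
```

The printed parameters only take effect after they are added to the kernel command line and the host is rebooted.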
 

n1nj4888

No, I didn't, because I'm so happy that it works ;) Have you tried disabling your C-states?
Nope! I’ve left the C-states as they are and haven’t had another crash, even though I’ve experimented both with putting the suspect node under load and with leaving it completely idle for prolonged periods... still no crash...

I suspect the kernel upgrade fixed it...
 

Semmo

The problem is back... after about 3 weeks with no problems. I installed the latest updates a few days ago and I'm currently on kernel

"Linux proxmox 4.15.18-18-pve #1 SMP PVE 4.15.18-44 (Wed, 03 Jul 2019 11:19:13 +0200) x86_64 GNU/Linux"

Is anybody else seeing the problem again?
 

n1nj4888

I’ve had no problems for the last few weeks and, just before reading this post, I updated to the following kernel.

Linux pve-host1 4.15.18-18-pve #1 SMP PVE 4.15.18-44 (Wed, 03 Jul 2019 11:19:13 +0200) x86_64 GNU/Linux

I’ve got netconsole running in case of any similar kernel panics and will post any results here but my setup has been solid for the last few weeks...
 

Semmo

I switched back to 4.15.18-17-pve just to try it out. I still cannot use netconsole because it's a dedicated server at a hoster and I don't want to rent another one with a VLAN...

If you get an error, it would be nice to see the problem.
Thx
 

Semmo

The older kernel doesn't help :(

Is it possible to run netconsole over a VPN? Or will the VPN fail the moment the kernel panic happens?
Or is there any way to encrypt the netconsole traffic?
 

Semmo

And this is a never-ending story of bugs...
https://github.com/zfsonlinux/zfs/issues/6476

It's not working for me either. I trigger the crash but get no file in /var/crash -.- Anyone else with kdump + ZFS root here? Is it possible to store the crash dump on an SMB/CIFS share?


EDIT:

I updated to VE 6.0 now and cannot install / use kdump-tools anymore:

Code:
Selecting previously unselected package kdump-tools.
(Reading database ... 91643 files and directories currently installed.)
Preparing to unpack .../kdump-tools_1%3a1.6.5-1_amd64.deb ...
Unpacking kdump-tools (1:1.6.5-1) ...
Setting up kdump-tools (1:1.6.5-1) ...

Creating config file /etc/default/kdump-tools with new version
dpkg: error processing package kdump-tools (--configure):
 installed kdump-tools package post-installation script subprocess returned error exit status 1
Processing triggers for man-db (2.8.5-2) ...
Processing triggers for systemd (241-5) ...
Errors were encountered while processing:
 kdump-tools
E: Sub-process /usr/bin/dpkg returned an error code (1)
I'm too tired after all those hours with no results... maybe someone else has this issue too.

Thanks in advance.
 

Semmo

After some research I found something: https://github.com/zfsonlinux/zfs/issues/6476
You have to add MODULES=most to "/usr/share/initramfs-tools/conf-hooks.d/zfs", otherwise kdump-tools won't install. After that I tried it with the sysrq trigger but still got no kdump file in /var/crash.

So I installed Proxmox in a VM, one installation with ext4 and one with ZFS. I made the same changes (only the MODULES=most wasn't needed with ext4) and yes, with ZFS kdump doesn't work. It crashes but the dump doesn't happen. With ext4 it boots the crash kernel, dumps the files and then reboots properly... so it seems to be a problem with ZFS.
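The checks described above can be sketched roughly like this. It is only a read-only helper under assumptions: the function names are invented for illustration, and it just reports whether `crashkernel=` is on the kernel command line and whether the ZFS initramfs hook file contains MODULES=most. The steps that change or crash the system are left as comments on purpose.

```shell
#!/bin/sh
# Succeeds if the given kernel command line reserves memory for a crash kernel.
has_crashkernel() {
    case " $1 " in
        *" crashkernel="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeeds if the given initramfs hook file sets MODULES=most.
has_modules_most() {
    grep -q '^[[:space:]]*MODULES=most' "$1" 2>/dev/null
}

# Read-only report; nothing on the host is changed.
if [ -r /proc/cmdline ] && has_crashkernel "$(cat /proc/cmdline)"; then
    echo "crashkernel= is set on the running kernel"
else
    echo "crashkernel= is missing from the kernel command line"
fi

if has_modules_most /usr/share/initramfs-tools/conf-hooks.d/zfs; then
    echo "MODULES=most is set in the zfs hook"
else
    echo "MODULES=most is not set in the zfs hook"
fi

# Manual follow-up steps (not executed here):
#   update-initramfs -u -k all      # rebuild the initramfs after editing the hook
#   kdump-config show               # check that a crash kernel is loaded
#   echo 1 > /proc/sys/kernel/sysrq
#   echo c > /proc/sysrq-trigger    # WARNING: crashes the host immediately
```

If both checks pass and /var/crash is still empty after a triggered crash on a ZFS root, that matches the behaviour described above.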

This still doesn't help me find out why my host is crashing, but now I know it's a kdump/ZFS bug that keeps me from capturing the crash... (again)
Since netconsole is not an option for my hosted server (and it doesn't seem to support VLANs, so a VLAN switch + another hosted server is not an option either), I have no way to find a solution.

@n1nj4888 What was your last step when the crashes stopped? Have you ever captured a crash dump since then? Are you running the new version?

BTW: It crashes with VE 6.0 too... :(
 

n1nj4888

I can’t really recall now. I was only getting crashes every so often on 5.4, and after I set up netconsole I didn’t see any further crashes (I’m not suggesting netconsole affected this), even after I did the kernel upgrade to the last 5.4 version I mentioned above. I do recall doing some BIOS updates around that time, so perhaps that improved the situation?

I’ve since moved to PVE6 on ZFS boot rather than ext4 and again haven’t seen any further crashes. I haven’t implemented netconsole on PVE6 yet. I’d have to do a little more research on how to implement that, given that ZFS boot on UEFI now uses systemd-boot instead of GRUB.
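A rough sketch for the systemd-boot case, with assumptions stated up front: on a PVE 6 install with ZFS root booted via UEFI, the kernel command line is expected to live on a single line in /etc/kernel/cmdline and to be applied with `pve-efiboot-tool refresh`. Please verify that against your own installation. The helper name `with_param` is invented for illustration, and it only prints the new line without writing anything.

```shell
#!/bin/sh
# Print the first line of a kernel cmdline file with one parameter appended,
# unless that exact parameter is already present. The file is not modified.
with_param() {
    file="$1"
    param="$2"
    line=$(head -n 1 "$file")
    case " $line " in
        *" $param "*) printf '%s\n' "$line" ;;
        *) printf '%s\n' "${line:+$line }$param" ;;
    esac
}

# Read-only demo: show what the command line would look like with loglevel=7.
if [ -r /etc/kernel/cmdline ]; then
    with_param /etc/kernel/cmdline loglevel=7
fi

# Manual follow-up steps (not executed here, assumed for PVE 6 with systemd-boot):
#   with_param /etc/kernel/cmdline loglevel=7 > /tmp/cmdline.new
#   cp /tmp/cmdline.new /etc/kernel/cmdline
#   pve-efiboot-tool refresh
```

The netconsole module itself is loaded with modprobe after boot, so only the loglevel part should depend on the boot loader.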
 

RCK

Member
Oct 20, 2009
52
0
6
@n1nj4888
Thanks for the detailed description! But unfortunately my host is a single root server and I have no access to another machine in the same network.
Hello,

I managed to use netconsole with a remote Debian server :)
First, verify that you can send UDP messages to your netconsole receiver:
- set up rsyslog on your Debian server as n1nj4888 described (port 5555)
- open UDP port 5555 on every firewall between your Proxmox host and the Debian rsyslog server
- test the connection with the following command on your Proxmox host (the /dev/udp path is a bash feature)
echo "This is my udp data" > /dev/udp/your.debian.ip/5555

Next, add loglevel=7 to your /etc/default/grub and regenerate the GRUB config
GRUB_CMDLINE_LINUX_DEFAULT="quiet loglevel=7 ...
update-grub

Finally, load netconsole after the system has started
- find your local Proxmox IP (192.168.0.99)
ip addr |grep 'inet '
- find your gateway (192.168.0.254)
netstat -rn | grep ^0.0.0.0
- find the MAC address of the gateway (14:dd:a9:4b:b1:10)
ping -c 1 192.168.0.254 > /dev/null
arp -n 192.168.0.254
- run the command with your IP, your network interface, and the correct gateway MAC
modprobe netconsole netconsole=5555@192.168.0.99/vmbr0,5555@your.debian.ip/14:dd:a9:4b:b1:10

And it's working :)
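The last step can be sketched as a small helper that assembles the netconsole= parameter from the values collected above. The function names are invented for illustration, and 203.0.113.10 is only a placeholder address standing in for your.debian.ip. The script prints the modprobe command instead of running it.

```shell
#!/bin/sh
# Build the netconsole= module parameter in the form
#   <src-port>@<src-ip>/<dev>,<dst-port>@<dst-ip>/<dst-mac>
# Arguments: src-port src-ip dev dst-port dst-ip dst-mac
netconsole_param() {
    printf '%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Extract the default gateway from `ip route` output read on stdin.
default_gateway() {
    awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# On a real host the gateway and its MAC could be looked up like this:
#   gw=$(ip route | default_gateway)
#   ping -c 1 "$gw" > /dev/null
#   ip neigh show "$gw"

# Demo with the example values from the post above; the command is only printed.
echo "modprobe netconsole netconsole=$(netconsole_param 5555 192.168.0.99 vmbr0 5555 203.0.113.10 14:dd:a9:4b:b1:10)"
```

Because the receiver is outside the local network, the MAC in the parameter is the gateway's, not the remote server's.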
 

RCK

If you follow my previous post, you can set up netconsole over the internet without a VPN on Proxmox 6.0 :)
 
