[SOLVED] Kernel Panicking, Cannot enable crash dumps

FuriousGeorge

There doesn't seem to be any documentation for enabling crash dumps in Debian, and what I've cobbled together does not seem to be working.

Code:
# cat /etc/default/kexec
# Defaults for kexec initscript
# sourced by /etc/init.d/kexec and /etc/init.d/kexec-load

# Load a kexec kernel (true/false)
LOAD_KEXEC=true

# Kernel and initrd image
KERNEL_IMAGE="/boot/vmlinuz-4.4.6-1-pve"
INITRD="/boot/initrd.img-4.4.6-1-pve"

# If empty, use current /proc/cmdline
APPEND=""

# Load the default kernel from grub config (true/false)
USE_GRUB_CONFIG=false

Code:
# cat /etc/default/grub
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
#   info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="Proxmox Virtual Environment"
GRUB_CMDLINE_LINUX_DEFAULT="crashkernel=128M"
GRUB_CMDLINE_LINUX="root=ZFS=rpool/ROOT/pve-1 boot=zfs crashkernel=128M nmi_watchdog=1"

# Disable os-prober, it might add menu entries for each guest
# root FS on a local partition
GRUB_DISABLE_OS_PROBER=true

Code:
# cat /etc/default/kdump-tools
# kdump-tools configuration
# ---------------------------------------------------------------------------
# USE_KDUMP - controls kdump will be configured
#     0 - kdump kernel will not be loaded
#     1 - kdump kernel will be loaded and kdump is configured
# KDUMP_SYSCTL - controls when a panic occurs, using the sysctl
#     interface.  The contents of this variable should be the
#     "variable=value ..." portion of the 'sysctl -w ' command.
#     If not set, the default value "kernel.panic_on_oops=1" will
#     be used.  Disable this feature by setting KDUMP_SYSCTL=" "
#     Example - also panic on oom:
#         KDUMP_SYSCTL="kernel.panic_on_oops=1 vm.panic_on_oom=1"
#
USE_KDUMP=1
KDUMP_SYSCTL="kernel.panic_on_oops=1"


# ---------------------------------------------------------------------------
# Kdump Kernel:
# KDUMP_KERNEL - A full pathname to a kdump kernel.
# KDUMP_INITRD - A full pathname to the kdump initrd (if used).
#     If these are not set, kdump-config will try to use the current kernel
#     and initrd if it is relocatable.  Otherwise, you will need to specify
#     these manually.
#KDUMP_KERNEL=
#KDUMP_INITRD=


# ---------------------------------------------------------------------------
# vmcore Handling:
# KDUMP_COREDIR - local path to save the vmcore to.
# KDUMP_FAIL_CMD - This variable can be used to cause a reboot or
#     start a shell if saving the vmcore fails.  If not set, "reboot -f"
#     is the default.
#     Example - start a shell if the vmcore copy fails:
#         KDUMP_FAIL_CMD="echo 'makedumpfile FAILED.'; /bin/bash; reboot -f"
KDUMP_COREDIR="/var/crash"
KDUMP_FAIL_CMD="reboot -f"


# ---------------------------------------------------------------------------
# Makedumpfile options:
# DEBUG_KERNEL - a debug version of the running kernel.  If not set,
#     kdump-config will use /usr/lib/debug/vmlinux-$(uname -r) if it is
#     available.  If it is not available, makedumpfile will be limited to
#     dumping all pages in memory.
# MAKEDUMP_ARGS - extra arguments passed to makedumpfile (8).  The default,
#     if unset, is to pass '-c -d 31' telling makedumpfile to use compression
#     and reduce the corefile to in-use kernel pages only.
#DEBUG_KERNEL=
#MAKEDUMP_ARGS="-c -d 31"


# ---------------------------------------------------------------------------
# Kexec/Kdump args
# KDUMP_KEXEC_ARGS - Additional arguments to the kexec command used to load
#     the kdump kernel
#     Example - Use this option on x86 systems with PAE and more than
#     4 gig of memory:
#         KDUMP_KEXEC_ARGS="--elf64-core-headers"
# KDUMP_CMDLINE - The default is to use the contents of /proc/cmdline.
#     Set this variable to override /proc/cmdline.
# KDUMP_CMDLINE_APPEND - Additional arguments to append to the command line
#     for the kdump kernel.  If unset, it defaults to "irqpoll maxcpus=1 nousb"
#KDUMP_KEXEC_ARGS=""
#KDUMP_CMDLINE=""
#KDUMP_CMDLINE_APPEND="irqpoll maxcpus=1 nousb systemd.unit=kdump-tools.service"

# --------------------------------------------

The wiki suggests using netconsole. However, this does not work for me either:

Code:
# modprobe netconsole netconsole=@10.5.0.250/,@10.5.0.251/
modprobe: ERROR: could not insert 'netconsole': Device or resource busy

Any help is appreciated.
 
Crash dumps work perfectly in Debian. This is my running, working configuration (I had crashes in ZFS with an HP MSA-60, and it dumps correctly).
  • Kernel command line
Code:
crashkernel=256M
  • /etc/default/kdump-tools
Code:
USE_KDUMP=1
KDUMP_COREDIR="/var/crash"
KDUMP_SYSCTL="kernel.panic_on_oops=1 kernel.panic_on_unrecovered_nmi=1"
DEBUG_KERNEL=/vmlinuz
MAKEDUMP_ARGS="-c --message-level 7 -d 11,31"
  • Reboot and Check
Code:
root@backup ~ > uname -a
Linux backup 4.4.8-1-pve #1 SMP Tue May 31 07:12:32 CEST 2016 x86_64 GNU/Linux

root@backup ~ > service kdump-tools status
● kdump-tools.service - Kernel crash dump capture service
   Loaded: loaded (/lib/systemd/system/kdump-tools.service; enabled)
   Active: active (exited) since So 2016-06-12 13:37:37 CEST; 2 weeks 4 days ago
  Process: 3522 ExecStart=/etc/init.d/kdump-tools start (code=exited, status=0/SUCCESS)
Main PID: 3522 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/kdump-tools.service

Jun 12 13:37:37 backup kdump-tools[3522]: Starting kdump-tools: loaded kdump kernel.

The general problem is that you need a debug kernel to analyze the dump further, but you still get the dmesg output in the same folder as the core dump.
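
For what it's worth, the service status only tells you that the init script ran. Below is a rough sketch for checking the two things that have to be true before a dump can be captured: the running kernel was booted with a crashkernel= reservation, and a crash kernel is actually loaded. The function names are made up for illustration. /proc/cmdline and /sys/kernel/kexec_crash_loaded are the usual kernel interfaces, and they are taken as optional arguments so the functions can also be pointed at test files.

```shell
#!/bin/sh
# Hypothetical helpers, not part of kdump-tools.

# Print the last crashkernel= argument found on a kernel command line.
# $1: file holding the command line (defaults to /proc/cmdline).
crashkernel_arg() {
    cmdline_file="${1:-/proc/cmdline}"
    # One argument per line, then keep only crashkernel=... entries.
    tr ' ' '\n' < "$cmdline_file" | grep '^crashkernel=' | tail -n 1
}

# Succeed if a crash kernel is loaded.
# $1: flag file (defaults to /sys/kernel/kexec_crash_loaded, where 1 means loaded).
crash_kernel_loaded() {
    flag_file="${1:-/sys/kernel/kexec_crash_loaded}"
    [ "$(cat "$flag_file" 2>/dev/null)" = "1" ]
}
```

On a live host, calling crashkernel_arg with no argument should print the reservation the kernel was booted with, and "crash_kernel_loaded && echo loaded" should print "loaded". Running "kdump-config show" reports similar information.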
 
I've made some progress.

DEBUG_KERNEL was not set on my end. I copied my current kernel to / and named it vmlinuz, so as to match your config exactly.

Code:
# service kdump-tools status
● kdump-tools.service - Kernel crash dump capture service
   Loaded: loaded (/lib/systemd/system/kdump-tools.service; enabled)
   Active: active (exited) since Tue 2016-07-05 02:12:28 EDT; 36s ago
  Process: 2534 ExecStart=/etc/init.d/kdump-tools start (code=exited, status=0/SUCCESS)
 Main PID: 2534 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/kdump-tools.service
Jul 05 02:12:28 ads-proxmox-2 kdump-tools[2534]: Starting kdump-tools: loaded kdump kernel.
Jul 05 02:12:28 ads-proxmox-2 systemd[1]: Started Kernel crash dump capture service.

Then I simulate a crash:

Code:
# sync &&  echo c | tee /proc/sysrq-trigger

However, where I expected a kernel dump and the associated files in /var/crash, I instead got only a link to my debug kernel, which was not there before:

Code:
# ls -la /var/crash/
total 18
drwxr-xr-x  2 root root   4 Jul  5 02:14 .
drwxr-xr-x 12 root root  14 Jun 21 02:04 ..
lrwxrwxrwx  1 root root   8 Jul  5 02:14 kernel_link -> /vmlinuz
-rw-r--r--  1 root root 285 Jul  5 02:14 kexec_cmd

I'm new to this, so any help is much appreciated.
 
I just symlinked mine :-D

I test via a real crash, not a triggered one. I found myself in the same situation as you. I uploaded a zip file (no tars allowed) which contains a kernel module that crashes your system. Just build it with the build script.
 

Attachments

  • kernel-panic-module.zip
    1.2 KB · Views: 14

The machine crashed on me before I could test the module. Once again, all I got was a link to the debug kernel. The date matches the time of the crash.
 
How much space is available on /var/crash? Maybe the crash dump does not fit?

Please try the crash dump and the simulation inside a VM first to play around, and then apply the working setup to the crashing host.
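
On the free-space question, a minimal sketch of one way to check it: compare the space available on the filesystem holding the dump directory against total RAM. Comparing against total RAM is a deliberately pessimistic assumption (an uncompressed dump of every page). With makedumpfile's compression and page filtering the real dump is usually far smaller. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "ok" if the filesystem holding the dump
# directory has at least as much free space as total RAM, else "too small".
# $1: dump directory (defaults to /var/crash)
# $2: meminfo file   (defaults to /proc/meminfo)
dump_space_check() {
    dump_dir="${1:-/var/crash}"
    meminfo="${2:-/proc/meminfo}"
    # MemTotal is reported in kB.
    mem_kb=$(awk '/^MemTotal:/ {print $2}' "$meminfo")
    # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
    avail_kb=$(df -Pk "$dump_dir" | awk 'NR==2 {print $4}')
    if [ "$avail_kb" -ge "$mem_kb" ]; then
        echo "ok"
    else
        echo "too small"
    fi
}
```

Calling dump_space_check with no arguments checks /var/crash against the host's own memory size.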
 
I had to reinstall the system, and then it worked. All I had to do was enable kdump in /etc/default/kdump-tools, reserve 256M of RAM for the dumps in /etc/default/grub, and then run update-grub.

Something I did must have been preventing this from working in the previous installation.
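
For anyone landing here later, the two edits described above could be scripted roughly like this. It is only a sketch under a few assumptions: GNU sed is available (as on Debian), the reservation goes into GRUB_CMDLINE_LINUX_DEFAULT (the post does not say which GRUB variable was used), and the function names are invented for illustration. Both functions take the file to edit as an argument so they can be tried on copies first.

```shell
#!/bin/sh
# Hypothetical helpers; try them on copies of the config files first.

# Set USE_KDUMP=1 in a kdump-tools defaults file.
# Only rewrites an existing USE_KDUMP= line; does nothing if there is none.
enable_kdump() {
    sed -i 's/^USE_KDUMP=.*/USE_KDUMP=1/' "$1"
}

# Put crashkernel=<size> into GRUB_CMDLINE_LINUX_DEFAULT, replacing any
# crashkernel= value already on that line. Other lines are left untouched.
# $1: grub defaults file, $2: size (defaults to 256M)
set_crashkernel() {
    conf="$1"
    size="${2:-256M}"
    sed -i -E \
        -e '/^GRUB_CMDLINE_LINUX_DEFAULT=/ s/ ?crashkernel=[^ "]*//' \
        -e "/^GRUB_CMDLINE_LINUX_DEFAULT=/ s/\"\$/ crashkernel=${size}\"/" \
        -e '/^GRUB_CMDLINE_LINUX_DEFAULT=/ s/=" /="/' \
        "$conf"
}
```

Applied to the real files as root, that would be "enable_kdump /etc/default/kdump-tools" and "set_crashkernel /etc/default/grub 256M", followed by update-grub and a reboot so the reservation takes effect.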
 
