Weird kernel dumps?

Discussion in 'Proxmox VE: Installation and configuration' started by jimmyjoe, Jul 30, 2018.

  1. jimmyjoe

    jimmyjoe Member
    Proxmox Subscriber

    Joined:
    Jan 12, 2015
    Messages:
    80
    Likes Received:
    2
    Since upgrading to 5.2, I've been getting intermittent VM crashes. They don't seem to be tied to any one hypervisor, and I can't tell what's wrong from the console message (below). Does anyone have ideas on where to look for the problem? It looks like what we used to get years ago when the lzo pipe would have a buffer underrun from vzdump or something. I'm not sure how vzdump differs in 5.2. Thanks.
     

    Attached Files:

  2. Alwin

    Alwin Proxmox Staff Member
    Staff Member

    Joined:
    Aug 1, 2017
    Messages:
    2,345
    Likes Received:
    212
    Which version are you on (output of 'pveversion -v')? And please post more of the trace; the screenshot sadly shows only half of it. As a first guess, check your storage.
     
  3. jimmyjoe

    Hi,
    Is there a way to get more info on the crash? The screenshot is all that was on the console. I couldn't page up/down or log in to the VM. I grep'd for 'Task' in the logs after rebooting the VM, but nothing turned up. pveversion output below.
    Thanks

    # pveversion -v
    proxmox-ve: 5.2-2 (running kernel: 4.15.18-1-pve)
    pve-manager: 5.2-5 (running version: 5.2-5/eb24855a)
    pve-kernel-4.15: 5.2-4
    pve-kernel-4.15.18-1-pve: 4.15.18-15
    pve-kernel-4.15.17-3-pve: 4.15.17-14
    pve-kernel-4.15.17-1-pve: 4.15.17-9
    corosync: 2.4.2-pve5
    criu: 2.11.1-1~bpo90
    glusterfs-client: 3.8.8-1
    ksm-control-daemon: 1.2-2
    libjs-extjs: 6.0.1-2
    libpve-access-control: 5.0-8
    libpve-apiclient-perl: 2.0-5
    libpve-common-perl: 5.0-35
    libpve-guest-common-perl: 2.0-17
    libpve-http-server-perl: 2.0-9
    libpve-storage-perl: 5.0-24
    libqb0: 1.0.1-1
    lvm2: 2.02.168-pve6
    lxc-pve: 3.0.0-3
    lxcfs: 3.0.0-1
    novnc-pve: 1.0.0-1
    proxmox-widget-toolkit: 1.0-19
    pve-cluster: 5.0-28
    pve-container: 2.0-24
    pve-docs: 5.2-4
    pve-firewall: 3.0-13
    pve-firmware: 2.0-5
    pve-ha-manager: 2.0-5
    pve-i18n: 1.0-6
    pve-libspice-server1: 0.12.8-3
    pve-qemu-kvm: 2.11.2-1
    pve-xtermjs: 1.0-5
    qemu-server: 5.0-29
    smartmontools: 6.5+svn4324-1
    spiceterm: 3.0-5
    vncterm: 1.5-3
    zfsutils-linux: 0.7.9-pve1~bpo9
     
  4. jimmyjoe

    Hi, just a follow-up: a second VM was found crashed today as well. Sorry, I only have a screen dump of the console.
     

    Attached Files:

  5. Alwin

    So the VMs crash, not the hosts. Check your underlying storage; it may well have issues (slow, defective blocks, ...).
     
  6. jimmyjoe

    OK, thank you. Any idea where to find logs on the Proxmox server that might explain what made the VM crash? Also, the storage server backing the VMs does not seem to be heavily utilized except for one 60% utilization spike, but maybe that would do it (https://imgur.com/a/U9oaFvH). The Proxmox and storage servers were rebooted on 7/28, and they are configured to run `fsck` on boot. The crash happened before 8am in the graph; I don't know the exact time, however.
     
  7. Alwin

    Only the standard logs, syslog/kernel/journal, on PVE and inside the VM. Depending on the issue, it might well be that there is nothing logged on either side.
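
    As a starting point for combing through those logs, here is a minimal Python sketch that flags lines which often accompany storage stalls or guest crashes. The pattern list and the function names (`suspicious_lines`, `scan_path`) are assumptions made up for illustration, not something PVE provides, and an empty result does not rule out a storage problem.

```python
import re

# Substrings that commonly show up in kernel logs around hung tasks or disk errors.
# This list is an illustrative assumption, not an exhaustive or official set.
PATTERNS = [
    r"blocked for more than \d+ seconds",  # hung-task warning
    r"I/O error",                          # block layer or filesystem errors
    r"Call Trace",                         # start of a kernel stack trace
    r"Out of memory",                      # OOM killer activity
]
_REGEX = re.compile("|".join(PATTERNS), re.IGNORECASE)


def suspicious_lines(lines):
    """Return (line_number, text) pairs for lines matching any pattern."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if _REGEX.search(line)
    ]


def scan_path(path):
    """Scan a log file on disk; undecodable bytes are replaced, not fatal."""
    with open(path, errors="replace") as handle:
        return suspicious_lines(handle)
```

    For example, `scan_path("/var/log/syslog")` can be run on the host and inside the VM. Kernel messages from the previous boot can be saved with `journalctl -k -b -1 > prev-boot.log` and scanned the same way, though that only works if the journal is persistent.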

    I don't understand the graph. What kind of utilization does it show: bandwidth, I/O, or fill level? The granularity of the graph might also hide short spikes. Another important factor is the latency of the storage, especially when the network is involved.
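
    To get a rough feel for write latency on a given mount point, something like the following Python sketch can help. It times small synchronous writes and reports the average next to the worst case, which also shows how an averaged graph can hide a single slow request. The function names, sample count, and block size are assumptions chosen for illustration; this is not a proper benchmark, and PVE's own `pveperf` (which reports fsyncs per second) is the more usual first check.

```python
import os
import statistics
import tempfile
import time


def fsync_latencies(directory, samples=50, block_size=4096):
    """Write small blocks to a temp file in `directory`, fsync each one,
    and return the per-write latencies in milliseconds."""
    payload = os.urandom(block_size)
    latencies = []
    with tempfile.NamedTemporaryFile(dir=directory) as handle:
        for _ in range(samples):
            start = time.perf_counter()
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # force the data down to the storage
            latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies):
    """Average, approximate 99th percentile, and maximum of the samples."""
    ordered = sorted(latencies)
    p99_index = min(len(ordered) - 1, int(len(ordered) * 0.99))
    return {
        "avg_ms": statistics.mean(ordered),
        "p99_ms": ordered[p99_index],
        "max_ms": ordered[-1],
    }
```

    Calling `summarize(fsync_latencies("/path/to/storage/mount"))` (the path is a placeholder) while the problem is occurring and comparing `max_ms` with `avg_ms` gives a hint whether single requests stall even when the average looks fine.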
     