4.15 based test kernel for PVE 5.x available

Discussion in 'Proxmox VE: Installation and configuration' started by fabian, Mar 12, 2018.

  1. udo

    udo Well-Known Member
    Proxmox Subscriber

    Joined:
    Apr 22, 2009
    Messages:
    5,829
    Likes Received:
    158
     Hi Thomas,
     I've looked at a cluster with the e1000e driver that is running an older kernel (4.13.13-34).

     There is also trouble on two of the seven nodes, but not as often (as I had on the one node with the 4.15 kernel):
    Code:
    root@pve01:~# dmesg | grep e1000
    ...
    [4815221.163083] NETDEV WATCHDOG: eth4 (e1000e): transmit queue 0 timed out
    [4815221.271681]  joydev intel_cstate pcspkr ipmi_si dcdbas shpchp mei intel_rapl_perf ipmi_devintf lpc_ich wmi mac_hid ipmi_msghandler acpi_pad acpi_power_meter vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 hid_generic usbmouse usbkbd btrfs usbhid xor hid raid6_pq e1000e(O) tg3 ahci ptp libahci megaraid_sas pps_core
    [4815221.628818] e1000e 0000:05:00.0 eth4: Reset adapter unexpectedly
    [4815224.767822] e1000e: eth4 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
    [5506569.903374] e1000e 0000:05:00.1 eth5: Reset adapter unexpectedly
    [5506573.100126] e1000e: eth5 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
    [6454490.510657] e1000e 0000:05:00.0 eth4: Reset adapter unexpectedly
    [6454493.679389] e1000e: eth4 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
    [6885168.270113] e1000e 0000:05:00.1 eth5: Reset adapter unexpectedly
    [6885171.474830] e1000e: eth5 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
    [11810161.032883] e1000e 0000:05:00.0 eth4: Reset adapter unexpectedly
    [11810164.317643] e1000e: eth4 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
    [13020006.535342] e1000e 0000:05:00.1 eth5: Reset adapter unexpectedly
    [13020009.772075] e1000e: eth5 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
    
    root@pve03:~# dmesg | grep e1000
    ...
    [8780811.944900] NETDEV WATCHDOG: eth4 (e1000e): transmit queue 0 timed out
    [8780812.065959]  dcdbas joydev soundcore shpchp pcspkr intel_rapl_perf mei wmi lpc_ich ipmi_si acpi_power_meter mac_hid ipmi_devintf ipmi_msghandler acpi_pad vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 hid_generic usbmouse usbkbd usbhid hid btrfs xor raid6_pq e1000e(O) ahci tg3 ptp libahci megaraid_sas pps_core
    [8780812.461362] e1000e 0000:05:00.0 eth4: Reset adapter unexpectedly
    [8780815.565703] e1000e: eth4 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
    
     Both nodes have a NIC with older firmware:
    Code:
    ethtool -e eth4 length 256
    Offset          Values
    ------          ------
    0x0000:         00 15 17 84 cd 0c 20 05 98 11 62 50 ff ff ff ff
    0x0010:         77 c5 05 21 2f a4 5e 13 86 80 5e 10 86 80 6f b1
    0x0020:         08 00 5e 10 00 54 00 00 01 58 00 00 00 00 00 01
    0x0030:         f6 6c b0 37 ae 07 03 84 83 07 00 00 03 c3 02 06
    0x0040:         08 00 f0 0e 64 21 40 00 01 48 00 00 00 00 00 00
    0x0050:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0060:         00 01 00 40 1e 12 07 40 00 01 00 40 ff ff ff ff
    0x0070:         ff ff ff ff ff ff ff ff ff ff ff ff ff ff dd 41
    0x0080:         ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x0090:         ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x00a0:         ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x00b0:         ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x00c0:         ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x00d0:         ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x00e0:         ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x00f0:         ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    
    root@pve01:~# ethtool -i eth4
    driver: e1000e
    version: 3.3.6-NAPI
    firmware-version: 5.6-2
    expansion-rom-version:
    bus-info: 0000:05:00.0
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: yes
    supports-register-dump: yes
    supports-priv-flags: no
    
     All nodes without trouble have newer firmware:
    Code:
    ethtool -i eth4
    driver: e1000e
    version: 3.3.6-NAPI
    firmware-version: 5.11-2
    expansion-rom-version:
    bus-info: 0000:05:00.0
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: yes
    supports-register-dump: yes
    supports-priv-flags: no
    
     @EDIT: but I just noticed that the NIC which causes trouble on the 4.15 kernel has the same (not-so-old) firmware:
    Code:
    root@pvetest:~# ethtool -i enp5s0f0
    driver: e1000e
    version: 3.2.6-k
    firmware-version: 5.11-2
    expansion-rom-version:
    bus-info: 0000:05:00.0
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: yes
    supports-register-dump: yes
    supports-priv-flags: no
    
    root@pvetest:~# ethtool -e enp5s0f0 length 256
    Offset          Values
    ------          ------
    0x0000:         00 15 17 4a 4a aa 20 04 ff ff b2 50 ff ff ff ff
    0x0010:         08 d5 03 68 2f a4 5e 11 86 80 5e 10 86 80 65 b1
    0x0020:         08 00 5e 10 00 54 00 00 01 50 00 00 00 00 00 01
    0x0030:         f6 6c b0 37 a6 07 03 84 83 07 00 00 03 c3 02 06
    0x0040:         08 00 f0 0e 64 21 40 00 01 40 00 00 00 00 00 00
    0x0050:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0060:         ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x0070:         ff ff ff ff ff ff ff ff ff ff 97 01 ff ff bf 7e
    0x0080:         ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x0090:         ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x00a0:         ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x00b0:         ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x00c0:         ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x00d0:         ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x00e0:         ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    0x00f0:         ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
    
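     To compare firmware across a whole cluster without pasting ethtool output by hand, the check can be scripted. A minimal sketch; the parsing is demonstrated on a captured sample so it runs without the hardware, and the commented per-NIC loop at the end assumes ethtool is installed:

```shell
# get_fw extracts the firmware-version field from `ethtool -i` output.
get_fw() {
    awk -F': ' '/^firmware-version:/ { print $2 }'
}

# Demonstrated on a captured sample so it works without real NICs:
sample='driver: e1000e
version: 3.3.6-NAPI
firmware-version: 5.6-2'
printf '%s\n' "$sample" | get_fw    # prints 5.6-2

# On a real node, per-NIC firmware would be collected like:
#   for n in /sys/class/net/*; do
#       [ -e "$n/device" ] || continue   # skip lo, bridges, bonds
#       printf '%s: ' "${n##*/}"; ethtool -i "${n##*/}" | get_fw
#   done
```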
    Udo
     
    #121 udo, Jun 19, 2018
    Last edited: Jun 19, 2018
  2. resoli

    resoli Member

    Joined:
    Mar 9, 2010
    Messages:
    109
    Likes Received:
    0
     I applied it to my quorum node; drbd9 synchronized immediately. Tomorrow I will update the remaining nodes and report back.

    Thanks,
    rob
     
  3. resoli

    resoli Member

    Joined:
    Mar 9, 2010
    Messages:
    109
    Likes Received:
    0
     All nodes upgraded and now running 4.15.17-13. All is well :)

    Nice job, Thomas!
    rob
     
  4. t.lamprecht

    t.lamprecht Proxmox Staff Member
    Staff Member

    Joined:
    Jul 28, 2015
    Messages:
    1,138
    Likes Received:
    148
    good to hear!

     Hmm, there were some problems with older firmware versions in the past... Couldn't you update the older ones to the newer version?

    Sorry, it seems that the
    Code:
    ethtool -i <DEV>
     output (note -i, not -e) is missing; it would be nice to see whether the firmware versions are related, or at least also older than udo's working ones.
     
    #124 t.lamprecht, Jun 20, 2018
    Last edited by a moderator: Jun 20, 2018
  5. Stefan Radman

    Stefan Radman New Member

    Joined:
    Feb 20, 2018
    Messages:
    1
    Likes Received:
    0
    Hi Thomas,
     With PVE 5.2 and kernel 4.15.17-3-pve I was not able to run jumbo frames (MTU 9000) on an Intel I350 card (igb).
     The same configuration worked on two other nodes in the cluster with Broadcom NICs (tg3).
     Installing the PVE kernel 4.15.17-13 solved the headache for me :)
    Thanks a lot
    Stefan
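     A quick way to verify a jumbo-frame path like this end to end is a ping with the largest unfragmentable payload; the header arithmetic below is standard, and the peer address is only an example:

```shell
# Largest ICMP payload that fits a 9000-byte MTU:
# 9000 minus the 20-byte IPv4 header and the 8-byte ICMP header.
payload=$((9000 - 20 - 8))
echo "payload=$payload"    # prints payload=8972

# Against an example peer on the same jumbo-frame segment:
#   ping -c 3 -M do -s "$payload" 10.10.105.23
# -M do sets the don't-fragment bit, so the ping fails loudly if any
# hop (NIC, bridge, or switch) has an MTU below 9000.
```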
     
  6. sergopotap

    sergopotap New Member

    Joined:
    Jun 28, 2018
    Messages:
    3
    Likes Received:
    0
     Hello,
     We have a five-node PVE 5.2 cluster on kernel 4.15.17-3-pve (HP DL180 G6 servers) and we are experiencing issues with it.
     The servers randomly crash every day, without any kernel panic logs. How can we enable kernel panic logging, and where can we find it?

    Code:
    proxmox-ve: 5.2-2 (running kernel: 4.15.17-3-pve)
    pve-manager: 5.2-3 (running version: 5.2-3/785ba980)
    pve-kernel-4.15: 5.2-3
    pve-kernel-4.13: 5.1-45
    pve-kernel-4.15.17-3-pve: 4.15.17-13
    pve-kernel-4.15.17-2-pve: 4.15.17-10
    pve-kernel-4.13.16-3-pve: 4.13.16-49
    pve-kernel-4.4.128-1-pve: 4.4.128-111
    pve-kernel-4.4.117-1-pve: 4.4.117-109
    pve-kernel-4.4.98-6-pve: 4.4.98-107
    pve-kernel-4.4.98-5-pve: 4.4.98-105
    pve-kernel-4.4.95-1-pve: 4.4.95-99
    pve-kernel-4.4.83-1-pve: 4.4.83-96
    pve-kernel-4.4.49-1-pve: 4.4.49-86
    pve-kernel-4.4.35-2-pve: 4.4.35-79
    pve-kernel-4.4.35-1-pve: 4.4.35-77
    pve-kernel-4.4.24-1-pve: 4.4.24-72
    pve-kernel-4.4.19-1-pve: 4.4.19-66
    pve-kernel-4.4.8-1-pve: 4.4.8-52
    pve-kernel-4.2.8-1-pve: 4.2.8-41
    pve-kernel-4.2.6-1-pve: 4.2.6-36
    pve-kernel-4.2.3-2-pve: 4.2.3-22
    pve-kernel-4.2.3-1-pve: 4.2.3-18
    pve-kernel-4.2.2-1-pve: 4.2.2-16
    ceph: 12.2.5-pve1
    corosync: 2.4.2-pve5
    criu: 2.11.1-1~bpo90
    glusterfs-client: 3.8.8-1
    ksm-control-daemon: 1.2-2
    libjs-extjs: 6.0.1-2
    libpve-access-control: 5.0-8
    libpve-apiclient-perl: 2.0-4
    libpve-common-perl: 5.0-34
    libpve-guest-common-perl: 2.0-17
    libpve-http-server-perl: 2.0-9
    libpve-storage-perl: 5.0-23
    libqb0: 1.0.1-1
    lvm2: 2.02.168-pve6
    lxc-pve: 3.0.0-3
    lxcfs: 3.0.0-1
    novnc-pve: 1.0.0-1
    proxmox-widget-toolkit: 1.0-19
    pve-cluster: 5.0-27
    pve-container: 2.0-23
    pve-docs: 5.2-4
    pve-firewall: 3.0-12
    pve-firmware: 2.0-4
    pve-ha-manager: 2.0-5
    pve-i18n: 1.0-6
    pve-libspice-server1: 0.12.8-3
    pve-qemu-kvm: 2.11.1-5
    pve-xtermjs: 1.0-5
    qemu-server: 5.0-29
    smartmontools: 6.5+svn4324-1
    spiceterm: 3.0-5
    vncterm: 1.5-3
    zfsutils-linux: 0.7.9-pve1~bpo9
    Code:
    auto lo
    iface lo inet loopback
    
    auto bond0
    iface bond0 inet static
            primary eth2
            slaves eth2 eth3
            address 10.10.105.23
            netmask 255.255.255.0
            bond_miimon 100
            bond_mode 1
            pre-up ( ifconfig eth2 mtu 8900 && ifconfig eth3 mtu 8900 )
            mtu 8900
    auto vmbr2
    iface vmbr2 inet static
            address  172.16.4.127
            netmask  255.255.255.0
            gateway  172.16.4.1
            bridge_ports eth0.2
            bridge_stp off
            bridge_fd 0
    Code:
    root@pve4-node1:~# ethtool -i eth2
    driver: ixgbe
    version: 5.3.7
    firmware-version: 0x2b2c0001
    expansion-rom-version:
    bus-info: 0000:04:00.0
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: yes
    supports-register-dump: yes
    supports-priv-flags: yes
    
    root@pve4-node1:~# ethtool -i eth3
    driver: ixgbe
    version: 5.3.7
    firmware-version: 0x2b2c0001
    expansion-rom-version:
    bus-info: 0000:04:00.1
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: yes
    supports-register-dump: yes
    supports-priv-flags: yes
    
    root@pve4-node1:~# ethtool -i eth0
    driver: igb
    version: 5.3.5.18
    firmware-version: 1.7.2
    expansion-rom-version:
    bus-info: 0000:06:00.0
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: yes
    supports-register-dump: yes
    supports-priv-flags: no
     
  7. t.lamprecht

    t.lamprecht Proxmox Staff Member
    Staff Member

    Joined:
    Jul 28, 2015
    Messages:
    1,138
    Likes Received:
    148
    crashes as in "the server just resets (reboots) suddenly"?

     Where did you look? I assume /var/log/kern.log (and its rotations) and journalctl (or syslog, if no persistent journal is enabled)?

    Do you have a watchdog configured?
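     If no persistent journal is enabled, everything logged before a crash is lost at reboot. A minimal sketch for enabling it and reading the previous boot's kernel messages (standard systemd commands, run as root):

```shell
# The existence of /var/log/journal switches journald to persistent storage.
mkdir -p /var/log/journal
systemctl restart systemd-journald

journalctl --list-boots   # one line per recorded boot
journalctl -k -b -1       # kernel messages from the previous boot
```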
     
  8. sergopotap

    sergopotap New Member

    Joined:
    Jun 28, 2018
    Messages:
    3
    Likes Received:
    0

     crashes as in "the server just resets (reboots) suddenly"? - Yes

     Do you have a watchdog configured? - No, my servers are HP G6 with iLO 100 only

     Where did you look? I assume /var/log/kern.log (and rotates) journalctl (or syslog if no persistent journal is enabled)? - the server crashed at Jul 13 00:27:57


    Code:
    root@pve4-node2:~# kdump-config show
    DUMP_MODE:        kdump
    USE_KDUMP:        1
    KDUMP_SYSCTL:     kernel.panic_on_oops=1
    KDUMP_COREDIR:    /var/crash
    crashkernel addr: 0x26000000
       /var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinuz-4.15.18-1-pve
    kdump initrd:
       /var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-4.15.18-1-pve
    current state:    ready to kdump
    
    kexec command:
      /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-4.15.18-1-pve root=/dev/mapper/pve-root ro quiet irqpoll nr_cpus=1 nousb systemd.unit=kdump-tools.service ata_piix.prefer_ms_hyperv=0" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz
    
    
    
     But:
     Code:
    root@pve4-node2:~# kdump-config savecore
    running makedumpfile -c -d 31 /proc/vmcore /var/crash/201807140829/dump-incomplete.
    open_dump_memory: Can't open the dump memory(/proc/vmcore). No such file or directory
    
    makedumpfile Failed.
    kdump-config: makedumpfile failed, falling back to 'cp' ... failed!
    cp: cannot stat '/proc/vmcore': No such file or directory
    kdump-config: failed to save vmcore in /var/crash/201807140829 ... failed!
    running makedumpfile --dump-dmesg /proc/vmcore /var/crash/201807140829/dmesg.201807140829.
    open_dump_memory: Can't open the dump memory(/proc/vmcore). No such file or directory
    
    makedumpfile Failed.
    kdump-config: makedumpfile --dump-dmesg failed. dmesg content will be unavailable ... failed!
    kdump-config: failed to save dmesg content in /var/crash/201807140829 ... failed!
    
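     For what it's worth: kdump-config savecore can only succeed inside the crash-capture kernel, where /proc/vmcore actually exists; on a normally booted system it fails exactly as above, so that output alone doesn't mean kdump is broken. A sketch for checking readiness and, on a test machine only, exercising the whole path:

```shell
# 1 means a capture kernel is loaded and kdump is armed:
cat /sys/kernel/kexec_crash_loaded

# To exercise the full path -- THIS CRASHES THE NODE, test machines only:
#   echo 1 > /proc/sys/kernel/sysrq
#   echo c > /proc/sysrq-trigger
# The capture kernel then writes the dump under /var/crash/<timestamp>/
# and reboots.
```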
     

  9. alsicorp

    alsicorp New Member
    Proxmox Subscriber

    Joined:
    Sep 25, 2013
    Messages:
    8
    Likes Received:
    1
     I have 2 HP servers (HP ProLiant DL360 G6, HP ProLiant DL160 G6)
     that are hitting a kernel panic with the latest enterprise kernel 4.15.18-1.

     I don't know if this matters or not... (I can't imagine an MTU setting causing a kernel panic)
    Both have jumbo frames enabled mtu 9000
    Both have been fine (and still are) with 4.15.17-3-pve

    one server has bnx2 driver
    one server has e1000e driver

    I checked both syslog and kernel logs - I guess it never booted far enough to write the logs...

     I could see the kernel panic on screen through a KVM switch, but the text was VERY large so I couldn't read the complete error.
     
  10. Guillaume

    Guillaume New Member

    Joined:
    Jun 12, 2015
    Messages:
    7
    Likes Received:
    0
    Hi

     Last month we evaluated 4.15 and saw that many people have problems; we tested on a G8 and had problems too.
     We are using older servers, like the Dell R610 and HP G8.
     Do you know whether the problem has been fixed so we can update to 4.15, or must we stay on our 4.13.16-4, which works perfectly?

    Best regards
    Guillaume
     
  11. tsarya

    tsarya New Member

    Joined:
    Sep 15, 2017
    Messages:
    21
    Likes Received:
    4
    Hi,

     Today I upgraded my HP DL360p Gen8 to the latest 4.15.18-1-pve kernel (package version 4.15.18-15) and the system cannot boot; it gets stuck importing the ZFS pool.

    Code:
    proxmox-ve: 5.2-2 (running kernel: 4.15.17-3-pve)
    pve-manager: 5.2-5 (running version: 5.2-5/eb24855a)
    pve-kernel-4.15: 5.2-4
    pve-kernel-4.15.18-1-pve: 4.15.18-15
    pve-kernel-4.15.17-3-pve: 4.15.17-14
    corosync: 2.4.2-pve5
    criu: 2.11.1-1~bpo90
    glusterfs-client: 3.8.8-1
    ksm-control-daemon: 1.2-2
    libjs-extjs: 6.0.1-2
    libpve-access-control: 5.0-8
    libpve-apiclient-perl: 2.0-5
    libpve-common-perl: 5.0-35
    libpve-guest-common-perl: 2.0-17
    libpve-http-server-perl: 2.0-9
    libpve-storage-perl: 5.0-24
    libqb0: 1.0.1-1
    lvm2: 2.02.168-pve6
    lxc-pve: 3.0.0-3
    lxcfs: 3.0.0-1
    novnc-pve: 1.0.0-1
    proxmox-widget-toolkit: 1.0-19
    pve-cluster: 5.0-28
    pve-container: 2.0-24
    pve-docs: 5.2-4
    pve-firewall: 3.0-13
    pve-firmware: 2.0-5
    pve-ha-manager: 2.0-5
    pve-i18n: 1.0-6
    pve-libspice-server1: 0.12.8-3
    pve-qemu-kvm: 2.11.2-1
    pve-xtermjs: 1.0-5
    qemu-server: 5.0-29
    smartmontools: 6.5+svn4324-1
    spiceterm: 3.0-5
    vncterm: 1.5-3
    zfsutils-linux: 0.7.9-pve1~bpo9
    
     It boots fine with kernel 4.15.17-3-pve (package version 4.15.17-14).
     
  12. coppola_f

    coppola_f Member

    Joined:
    Apr 2, 2012
    Messages:
    55
    Likes Received:
    2
  13. Menno

    Menno New Member

    Joined:
    Aug 7, 2018
    Messages:
    6
    Likes Received:
    0
     I can confirm the panic, and it seems to be related to PTI (page-table isolation): adding the nopti flag to the kernel command line makes the server boot again, although I have not yet tested the machine extensively.

     All previous 4.15.17 kernels work fine; ever since kernel 4.15.18 my machines have become unstable and panic on boot with the backtrace added below.

    Hardware used is ProLiant DL380 G6 and G7, please let me know if any other information is needed.
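     The nopti workaround above is a boot-parameter change; a sketch of it for a stock Debian/PVE grub setup (note that nopti disables the Meltdown page-table-isolation mitigation, so it is a stopgap, not a fix):

```shell
# /etc/default/grub -- add nopti to the kernel command line (config fragment):
GRUB_CMDLINE_LINUX_DEFAULT="quiet nopti"

# Apply and verify after the next boot:
#   update-grub && reboot
#   grep -wo nopti /proc/cmdline   # should print: nopti
```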

     The full backtrace is:

    Code:
    [    6.455328] general protection fault: 0000 [#1] SMP PTI
    [    6.714257] Modules linked in: ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbmouse usbkbd usbhid hid psmouse bnx2 sfc mpt3sas mtd ptp raid_class pps_core mdio hpsa scsi_transport_sas
    [    8.417644] CPU: 1 PID: 330 Comm: systemd-modules Tainted: G          I      4.15.18-1-pve #1
    [    8.839981] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
    [    9.153970] RIP: 0010:__kmalloc_node+0x1b0/0x2b0
    [    9.382121] RSP: 0018:ffffbda146d37a20 EFLAGS: 00010286
    [    9.641499] RAX: 0000000000000000 RBX: b9f63dd25eea69bd RCX: 00000000000009a7
    [    9.994883] RDX: 00000000000009a6 RSI: 0000000000000000 RDI: 0000000000027040
    [   10.347905] RBP: ffffbda146d37a58 R08: ffff981ce6e67040 R09: ffff981ce6807c00
    [   10.701943] R10: ffff981cdfebb488 R11: ffffffffc057cd80 R12: 0000000001080020
    [   11.055194] R13: 0000000000000008 R14: 00000000ffffffff R15: ffff981ce6807c00
    [   11.409098] FS:  00007f60956158c0(0000) GS:ffff981ce6e40000(0000) knlGS:0000000000000000
    [   11.810294] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [   12.093672] CR2: 00007ff7b3a215c0 CR3: 00000008ef6fa005 CR4: 00000000000206e0
    [   12.093673] Call Trace:
    [   12.093680]  ? enqueue_task_fair+0xb5/0x800
    [   12.093684]  ? alloc_cpumask_var_node+0x1f/0x30
    [   12.093687]  ? x86_configure_nx+0x50/0x50
    [   12.093689]  alloc_cpumask_var_node+0x1f/0x30
    [   12.093691]  alloc_cpumask_var+0xe/0x10
    [   12.093694]  native_send_call_func_ipi+0x2e/0x130
    [   12.093696]  ? find_next_bit+0xb/0x10
    [   12.093699]  smp_call_function_many+0x1bb/0x260
    [   12.093701]  ? x86_configure_nx+0x50/0x50
    [   12.093703]  on_each_cpu+0x2d/0x60
    [   12.093704]  flush_tlb_kernel_range+0x79/0x80
    [   12.093708]  ? purge_fragmented_blocks_allcpus+0x53/0x1f0
    [   12.093711]  __purge_vmap_area_lazy+0x52/0xc0
    [   12.093713]  vm_unmap_aliases+0xfa/0x130
    [   12.093716]  change_page_attr_set_clr+0xea/0x370
    [   12.093718]  ? 0xffffffffc0578000
    [   12.093721]  set_memory_ro+0x29/0x30
    [   12.093722]  ? 0xffffffffc0578000
    [   12.093724]  frob_text.isra.33+0x23/0x30
    [   12.093726]  module_enable_ro.part.54+0x35/0x90
    [   12.093728]  do_init_module+0x119/0x219
    [   12.093730]  load_module+0x28e6/0x2e00
    [   12.093734]  ? ima_post_read_file+0x83/0xa0
    [   12.093737]  SYSC_finit_module+0xe5/0x120
    [   12.093738]  ? SYSC_finit_module+0xe5/0x120
    [   12.093740]  SyS_finit_module+0xe/0x10
    [   12.093743]  do_syscall_64+0x73/0x130
    [   12.093746]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    [   12.093747] RIP: 0033:0x7f6094b01229
    [   12.093748] RSP: 002b:00007ffe1ac72988 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
    [   12.093750] RAX: ffffffffffffffda RBX: 00005650c392dc40 RCX: 00007f6094b01229
    [   12.093751] RDX: 0000000000000000 RSI: 00007f6094fea265 RDI: 0000000000000006
    [   12.093752] RBP: 00007f6094fea265 R08: 0000000000000000 R09: 0000000000000000
    [   12.093753] R10: 0000000000000006 R11: 0000000000000246 R12: 0000000000000000
    [   12.093754] R13: 00005650c392d930 R14: 0000000000020000 R15: 00007ffe1ac72af0
    [   12.093755] Code: 89 d0 4c 01 d3 48 33 1b 49 33 9f 40 01 00 00 65 48 0f c7 0f 0f 94 c0 84 c0 0f 84 ef fe ff ff 48 85 db 74 14 49 63 47 20 48 01 c3 <48> 33 1b 49 33 9f 40 01 00 00 0f 18 0b 41 f7 c4 00 80 00 00 4c
    [   12.093781] RIP: __kmalloc_node+0x1b0/0x2b0 RSP: ffffbda146d37a20
    [   12.093816] ---[ end trace 6a54e144d0e4034e ]---
    [   12.118561] general protection fault: 0000 [#2] SMP PTI
    [   12.118562] Modules linked in: ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbmouse usbkbd usbhid hid psmouse bnx2 sfc mpt3sas mtd ptp raid_class pps_core mdio hpsa scsi_transport_sas
    [   12.118582] CPU: 1 PID: 335 Comm: mount Tainted: G      D   I      4.15.18-1-pve #1
    [   12.118582] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
    [   12.118586] RIP: 0010:__kmalloc_track_caller+0xc3/0x220
    [   12.118587] RSP: 0018:ffffbda146d7be70 EFLAGS: 00010286
    [   12.118588] RAX: b9f63dd25eea69bd RBX: b9f63dd25eea69bd RCX: 00000000000009a8
    [   12.118589] RDX: 00000000000009a7 RSI: 0000000000000000 RDI: b9f63dd25eea69bd
    [   12.118590] RBP: ffffbda146d7bea0 R08: 0000000000027040 R09: ffff981ce6807c00
    [   12.118591] R10: 8080808080808080 R11: fefefefefefefeff R12: 00000000014000c0
    [   12.118592] R13: 0000000000000005 R14: ffffffff9bdf5506 R15: ffff981ce6807c00
    [   12.118594] FS:  00007ff7b3ba4480(0000) GS:ffff981ce6e40000(0000) knlGS:0000000000000000
    [   12.118595] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [   12.118596] CR2: 00007ffc7504bd68 CR3: 00000008f4e5a003 CR4: 00000000000206e0
    [   12.118597] Call Trace:
    [   12.118601]  memdup_user+0x2c/0x70
    [   12.118603]  strndup_user+0x46/0x60
    [   12.118607]  SyS_mount+0x34/0xd0
    [   12.118609]  do_syscall_64+0x73/0x130
    [   12.118611]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    [   12.118612] RIP: 0033:0x7ff7b326c24a
    [   12.118613] RSP: 002b:00007ffc7504cdb8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
    [   12.118615] RAX: ffffffffffffffda RBX: 0000558d7bde9030 RCX: 00007ff7b326c24a
    [   12.118616] RDX: 0000558d7bde9210 RSI: 0000558d7bde9250 RDI: 0000558d7bde9230
    [   12.118617] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000020
    [   12.118618] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000558d7bde9230
    [   12.118619] R13: 0000558d7bde9210 R14: 0000000000000000 R15: 00000000ffffffff
    [   12.118620] Code: 5e 1c 64 49 83 78 10 00 49 8b 38 0f 84 ea 00 00 00 48 85 ff 0f 84 e1 00 00 00 49 63 5f 20 4d 8b 07 48 8d 4a 01 48 89 f8 48 01 fb <48> 33 1b 49 33 9f 40 01 00 00 65 49 0f c7 08 0f 94 c0 84 c0 74
    [   12.118646] RIP: __kmalloc_track_caller+0xc3/0x220 RSP: ffffbda146d7be70
    [   12.118648] ---[ end trace 6a54e144d0e4034f ]---
    [   12.122166] general protection fault: 0000 [#3] SMP PTI
    [   12.122166] Modules linked in: ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbmouse usbkbd usbhid hid psmouse bnx2 sfc mpt3sas mtd ptp raid_class pps_core mdio hpsa scsi_transport_sas
    [   12.122187] CPU: 1 PID: 333 Comm: mount Tainted: G      D   I      4.15.18-1-pve #1
    [   12.122187] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
    [   12.122190] RIP: 0010:__kmalloc_track_caller+0xc3/0x220
    [   12.122191] RSP: 0018:ffffbda146dc7e70 EFLAGS: 00010286
    [   12.122193] RAX: b9f63dd25eea69bd RBX: b9f63dd25eea69bd RCX: 00000000000009a8
    [   12.122194] RDX: 00000000000009a7 RSI: 0000000000000000 RDI: b9f63dd25eea69bd
    [   12.122195] RBP: ffffbda146dc7ea0 R08: 0000000000027040 R09: ffff981ce6807c00
    [   12.122196] R10: 8080808080808080 R11: fefefefefefefeff R12: 00000000014000c0
    [   12.122197] R13: 0000000000000007 R14: ffffffff9bdf5506 R15: ffff981ce6807c00
    [   12.122198] FS:  00007f5a2fccb480(0000) GS:ffff981ce6e40000(0000) knlGS:0000000000000000
    [   12.122200] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [   12.122201] CR2: 00007f5a2f2f0d30 CR3: 000000091e25c004 CR4: 00000000000206e0
    [   12.122201] Call Trace:
    [   12.122205]  memdup_user+0x2c/0x70
    [   12.122207]  strndup_user+0x46/0x60
    [   12.122209]  SyS_mount+0x51/0xd0
    [   12.122211]  do_syscall_64+0x73/0x130
    [   12.122213]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    [   12.122215] RIP: 0033:0x7f5a2f39324a
    [   12.122216] RSP: 002b:00007ffee4915338 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
    [   12.122217] RAX: ffffffffffffffda RBX: 0000560f461de030 RCX: 00007f5a2f39324a
    [   12.122218] RDX: 0000560f461de210 RSI: 0000560f461de250 RDI: 0000560f461de230
    [   12.122219] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000020
    [   12.122220] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 0000560f461de230
    [   12.122221] R13: 0000560f461de210 R14: 0000000000000000 R15: 00000000ffffffff
    [   12.122222] Code: 5e 1c 64 49 83 78 10 00 49 8b 38 0f 84 ea 00 00 00 48 85 ff 0f 84 e1 00 00 00 49 63 5f 20 4d 8b 07 48 8d 4a 01 48 89 f8 48 01 fb <48> 33 1b 49 33 9f 40 01 00 00 65 49 0f c7 08 0f 94 c0 84 c0 74
    [   12.122249] RIP: __kmalloc_track_caller+0xc3/0x220 RSP: ffffbda146dc7e70
    [   12.122250] ---[ end trace 6a54e144d0e40350 ]---
    
     edit: I might have spoken too soon; one of my machines still does not work with the new kernel. With the nopti flag it boots further than without it, but it still crashes.

     The first backtrace:

    Code:
    [    6.192737] usercopy: kernel memory exposure attempt detected from 000000001790da28 (kmalloc-8) (15 bytes)
    [    6.670605] kernel BUG at mm/usercopy.c:72!
    [    6.877866] invalid opcode: 0000 [#1] SMP NOPTI
    [    7.102128] Modules linked in: tap ib_iser rdma_cm iw_cm ib_cm ib_core sunrpc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq hid_generic usbkbd usbmouse usbhid hid psmouse mpt3sas raid_class bnx2 sfc mtd ptp pps_core mdio hpsa scsi_transport_sas
    [    8.462061] CPU: 3 PID: 314 Comm: udevadm Tainted: G          I      4.15.18-1-pve #1
    [    8.849616] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
    [    9.163444] RIP: 0010:__check_object_size+0x167/0x190
    [    9.413124] RSP: 0018:ffffac9dc6db3db8 EFLAGS: 00010286
    [    9.671357] RAX: 000000000000005e RBX: 000000000000000f RCX: 0000000000000000
    [   10.025387] RDX: 0000000000000000 RSI: ffff9300c2ed6498 RDI: ffff9300c2ed6498
    [   10.378198] RBP: ffffac9dc6db3dd8 R08: 0000000000000003 R09: 00000000000003bc
    [   10.731334] R10: 0000000000000008 R11: ffffffffb155680d R12: 0000000000000001
    [   11.084980] R13: ffff9300b41d3857 R14: ffff9300b41d3848 R15: ffff9300b41d3848
    [   11.437743] FS:  00007fd5189d98c0(0000) GS:ffff9300c2ec0000(0000) knlGS:0000000000000000
    [   11.437744] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [   11.437747] CR2: 000055971ec212b8 CR3: 00000008f7ecc004 CR4: 00000000000206e0
    [   11.437748] Call Trace:
    [   11.437754]  filldir+0xb0/0x140
    [   11.437758]  kernfs_fop_readdir+0x103/0x270
    [   11.437760]  iterate_dir+0xa8/0x1a0
    [   11.437762]  SyS_getdents+0x9e/0x120
    [   11.437763]  ? fillonedir+0x100/0x100
    [   11.437767]  do_syscall_64+0x73/0x130
    [   11.437768]  ? do_syscall_64+0x73/0x130
    [   11.437772]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    [   11.437773] RIP: 0033:0x7fd51782bf2b
    [   11.437774] RSP: 002b:00007fff8a8eec40 EFLAGS: 00000202 ORIG_RAX: 000000000000004e
    [   11.437776] RAX: ffffffffffffffda RBX: 000055971e81a170 RCX: 00007fd51782bf2b
    [   11.437777] RDX: 0000000000008000 RSI: 000055971e81a170 RDI: 0000000000000004
    [   11.437778] RBP: 000055971e81a170 R08: fffe000000000000 R09: 0000000000008040
    [   11.437779] R10: 0000000000000090 R11: 0000000000000202 R12: fffffffffffffe58
    [   11.437780] R13: 0000000000000000 R14: 000055971e81a140 R15: 00007fd5189d9718
    [   11.437781] Code: 48 0f 45 d1 48 c7 c6 ef 11 ce b0 48 c7 c1 e9 11 cf b0 48 0f 45 f1 49 89 d9 49 89 c0 4c 89 f1 48 c7 c7 28 12 cf b0 e8 99 e8 e7 ff <0f> 0b 48 c7 c0 d2 11 cf b0 eb b9 48 c7 c0 e2 11 cf b0 eb b0 48
    [   11.437807] RIP: __check_object_size+0x167/0x190 RSP: ffffac9dc6db3db8
    [   11.437821] ---[ end trace fa877bd9a718e005 ]---
    
    And a bit later:

    Code:
    [  121.361676] general protection fault: 0000 [#2] SMP NOPTI
    [  121.632590] Modules linked in: ipt_REJECT nf_reject_ipv4 iptable_filter bonding 8021q garp mrp softdog nfnetlink_log nfnetlink vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core sunrpc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq hid_generic usbkbd usbmouse usbhid hid psmouse mpt3sas raid_class bnx2 sfc mtd ptp pps_core mdio hpsa scsi_transport_sas
    [  123.476354] CPU: 3 PID: 3860 Comm: (start.sh) Tainted: G      D   I      4.15.18-1-pve #1
    [  123.881161] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
    [  124.196485] RIP: 0010:__kmalloc_track_caller+0xe5/0x220
    [  124.454684] RSP: 0018:ffffac9dccc47ce0 EFLAGS: 00010282
    [  124.714902] RAX: 0000000000000000 RBX: 9a870d7bf26606d3 RCX: 000000000000166b
    [  125.067714] RDX: 000000000000166a RSI: 0000000000000000 RDI: ffff9300b41d3838
    [  125.420275] RBP: ffffac9dccc47d10 R08: 0000000000027040 R09: ffff9300c2807c00
    [  125.775725] R10: 0000000000000155 R11: ffffac9dccc47cf0 R12: 00000000014000c0
    [  126.129125] R13: 0000000000000006 R14: ffffffffafdf50a4 R15: ffff9300c2807c00
    [  126.481481] FS:  00007f1590203940(0000) GS:ffff9300c2ec0000(0000) knlGS:0000000000000000
    [  126.882090] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  127.165534] CR2: 00007f158e867e48 CR3: 00000008cb2f0003 CR4: 00000000000206e0
    [  127.519583] Call Trace:
    [  127.640589]  kstrdup+0x31/0x60
    [  127.792890]  kstrdup_const+0x24/0x30
    [  127.970359]  alloc_vfsmnt+0xb1/0x230
    [  128.146541]  clone_mnt+0x36/0x330
    [  128.309898]  copy_tree+0x17c/0x310
    [  128.477589]  copy_mnt_ns+0x86/0x290
    [  128.650410]  ? create_new_namespaces+0x36/0x1e0
    [  128.874440]  create_new_namespaces+0x61/0x1e0
    [  129.090364]  unshare_nsproxy_namespaces+0x5a/0xb0
    [  129.322606]  SyS_unshare+0x201/0x3a0
    [  129.499390]  do_syscall_64+0x73/0x130
    [  129.680466]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    [  129.930403] RIP: 0033:0x7f158e7e7487
    [  130.107402] RSP: 002b:00007ffe2bb5e248 EFLAGS: 00000a07 ORIG_RAX: 0000000000000110
    [  130.481490] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f158e7e7487
    [  130.836166] RDX: 00007ffe2bb5e238 RSI: 00007ffe2bb5e370 RDI: 0000000000020000
    [  131.189537] RBP: 00007ffe2bb5e4c0 R08: 0000556ddc80e442 R09: 00007ffe2bb5e260
    [  131.541847] R10: 0000000000000000 R11: 0000000000000a07 R12: 00007ffe2bb5e250
    [  131.894958] R13: 00007ffe2bb5e250 R14: 0000556ddc80e441 R15: 0000556ddc8072d9
    [  132.246620] Code: 8d 4a 01 48 89 f8 48 01 fb 48 33 1b 49 33 9f 40 01 00 00 65 49 0f c7 08 0f 94 c0 84 c0 74 b2 48 85 db 74 14 49 63 47 20 48 01 c3 <48> 33 1b 49 33 9f 40 01 00 00 0f 18 0b 41 f7 c4 00 80 00 00 48
    [  133.181775] RIP: __kmalloc_track_caller+0xe5/0x220 RSP: ffffac9dccc47ce0
    [  133.514693] softdog: Initiating system reboot
    [  133.515953] ------------[ cut here ]------------
    [  133.515955] NETDEV WATCHDOG: ens3f1np1 (sfc): transmit queue 1 timed out
    [  133.515976] WARNING: CPU: 1 PID: 3776 at net/sched/sch_generic.c:323 dev_watchdog+0x222/0x230
    [  133.515977] Modules linked in: ipt_REJECT nf_reject_ipv4 iptable_filter bonding 8021q garp mrp softdog nfnetlink_log nfnetlink vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core sunrpc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs xor zstd_compress
    [  133.515996] sfc 0000:0e:00.0 ens3f0np0: TX stuck with port_enabled=1: resetting channels
    [  133.515996]  raid6_pq hid_generic usbkbd usbmouse usbhid hid psmouse mpt3sas raid_class bnx2 sfc mtd ptp pps_core mdio hpsa scsi_transport_sas
    [  133.516008] CPU: 1 PID: 3776 Comm: pmxcfs Tainted: G      D   I      4.15.18-1-pve #1
    [  133.516009] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
    [  133.516010] RIP: 0010:dev_watchdog+0x222/0x230
    [  133.516011] RSP: 0018:ffff9300c2e43e58 EFLAGS: 00010286
    [  133.516013] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 000000000000001f
    [  133.516014] RDX: 0000000000000001 RSI: 0000000000000002 RDI: 0000000000000246
    [  133.516015] RBP: ffff9300c2e43e88 R08: 0000000000000000 R09: 000000000000003c
    [  133.516016] R10: ffff9300c2e5a770 R11: 0000000000028fd0 R12: 0000000000000040
    [  133.516017] R13: ffff9300b1c22000 R14: ffff9300b1c22478 R15: ffff9300b1c2cf40
    [  133.516018] FS:  00007f9cbeffd700(0000) GS:ffff9300c2e40000(0000) knlGS:0000000000000000
    [  133.516019] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  133.516021] CR2: 00007f1b980441b8 CR3: 00000008be846003 CR4: 00000000000206e0
    [  133.516021] Call Trace:
    [  133.516022]  <IRQ>
    [  133.516025]  ? dev_deactivate_queue.constprop.33+0x60/0x60
    [  133.516029]  call_timer_fn+0x32/0x130
    [  133.516032]  run_timer_softirq+0x1dd/0x430
    [  133.516035]  ? timerqueue_add+0x59/0x90
    [  133.516037]  ? ktime_get+0x43/0xa0
    [  133.516040]  __do_softirq+0x109/0x29b
    [  133.516043]  irq_exit+0xb6/0xc0
    [  133.516045]  smp_apic_timer_interrupt+0x71/0x130
    [  133.516046]  apic_timer_interrupt+0x84/0x90
    [  133.516047]  </IRQ>
    [  133.516051] RIP: 0010:finish_task_switch+0x78/0x200
    [  133.516052] RSP: 0018:ffffac9dccad7c10 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff11
    [  133.516053] RAX: ffff9300bc185600 RBX: ffff93007e955600 RCX: 0000000000000000
    [  133.516054] RDX: 0000000000007f9c RSI: 00000000beffd700 RDI: ffff9300c2e628c0
    [  133.516055] RBP: ffffac9dccad7c38 R08: 0000000000001534 R09: 0000000000000002
    [  133.516056] R10: ffffac9dc62a3e08 R11: 0000000000000400 R12: ffff9300c2e628c0
    [  133.516059] R13: ffff9300bc185600 R14: ffff93007e853180 R15: 0000000000000000
    [  133.516063]  __schedule+0x3e8/0x870
    [  133.516068]  ? fuse_copy_one+0x53/0x70
    [  133.516070]  schedule+0x36/0x80
    [  133.516073]  do_wait_intr+0x6f/0x80
    [  133.516075]  fuse_dev_do_read.isra.25+0x47f/0x860
    [  133.516077]  ? wait_woken+0x80/0x80
    [  133.516079]  fuse_dev_read+0x65/0x90
    [  133.516082]  new_sync_read+0xe4/0x130
    [  133.516084]  __vfs_read+0x29/0x40
    [  133.516086]  vfs_read+0x96/0x130
    [  133.516088]  SyS_read+0x55/0xc0
    [  133.516091]  do_syscall_64+0x73/0x130
    [  133.516092]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    [  133.516094] RIP: 0033:0x7f9cccd4c20d
    [  133.516095] RSP: 002b:00007f9cbeffcbf0 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
    [  133.516096] RAX: ffffffffffffffda RBX: 00007f9cbeffcd40 RCX: 00007f9cccd4c20d
    [  133.516097] RDX: 0000000000021000 RSI: 00007f9cce38c010 RDI: 0000000000000007
    [  133.516098] RBP: 00007f9cbeffcd38 R08: 0000000000000000 R09: 0000000000000000
    [  133.516099] R10: 00007f9cb40008c0 R11: 0000000000000293 R12: 000055c7aaf50080
    [  133.516100] R13: 000055c7aaf4f9c0 R14: 00007f9cbeffd698 R15: 0000000000021000
    [  133.516101] Code: 37 00 49 63 4e e8 eb 92 4c 89 ef c6 05 26 29 d8 00 01 e8 a2 21 fd ff 89 d9 48 89 c2 4c 89 ee 48 c7 c7 60 56 d9 b0 e8 1e ce 7f ff <0f> 0b eb c0 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48
    [  133.516128] ---[ end trace fa877bd9a718e006 ]---
    
     
    #133 Menno, Aug 7, 2018
    Last edited: Aug 7, 2018
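    To spot these events quickly across nodes, the kernel log can be filtered for the failure patterns seen in the trace above and in the earlier e1000e reports. A minimal sketch (the patterns are taken from this thread; adjust for your NICs):

```shell
# scan_netdev_events: keep only kernel-log lines matching the failure
# patterns from this thread (transmit-queue timeouts, unexpected adapter
# resets, stuck TX on the sfc driver)
scan_netdev_events() {
  grep -E 'NETDEV WATCHDOG|Reset adapter unexpectedly|TX stuck'
}

# typical use on a node:
#   dmesg | scan_netdev_events
```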
  14. coppola_f

    coppola_f Member

    Joined:
    Apr 2, 2012
    Messages:
    55
    Likes Received:
    2
    Menno,
    we're working with HP DL380 G6 servers too....
    we solved it by rolling back to the 4.13.xx kernel

    many thanks again for your time,
    regards,
    Francesco
     
  15. Alwin

    Alwin Proxmox Staff Member
    Staff Member

    Joined:
    Aug 1, 2017
    Messages:
    2,097
    Likes Received:
    184
  16. coppola_f

    coppola_f Member

    Joined:
    Apr 2, 2012
    Messages:
    55
    Likes Received:
    2
    I'm currently unable to retrieve the BIOS version,
    as I really can't reboot any node right now.
    Hoping to give you feedback on this value for our 4x DL380 G6 ASAP!!

    regards,
    Francesco
     
  17. Alwin

    Alwin Proxmox Staff Member
    Staff Member

    Joined:
    Aug 1, 2017
    Messages:
    2,097
    Likes Received:
    184
    @coppola_f, dmidecode should give you the information too.
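    For reference, the same data can also be read from sysfs without rebooting, and without root on most systems. A small sketch (the DMI_DIR variable is only a hook so the function can be pointed at test data):

```shell
# Read BIOS version and release date from sysfs (the same fields dmidecode reports).
DMI_DIR="${DMI_DIR:-/sys/class/dmi/id}"

bios_info() {
  printf 'BIOS version: %s\nBIOS date: %s\n' \
    "$(cat "$DMI_DIR/bios_version")" "$(cat "$DMI_DIR/bios_date")"
}

# or, as root:
#   dmidecode -s bios-version
#   dmidecode -s bios-release-date
```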
     
  18. coppola_f

    coppola_f Member

    Joined:
    Apr 2, 2012
    Messages:
    55
    Likes Received:
    2
    @Alwin

    here are the results:
    (the BIOS release date is 02/22/2018!!)

    # dmidecode 3.0
    Getting SMBIOS data from sysfs.
    SMBIOS 2.7 present.
    133 structures occupying 4124 bytes.
    Table at 0xDF7FE000.

    Handle 0x0000, DMI type 0, 24 bytes
    BIOS Information
    Vendor: HP
    Version: P62
    Release Date: 02/22/2018
    Address: 0xF0000
    Runtime Size: 64 kB
    ROM Size: 8192 kB
    Characteristics:
    PCI is supported
    PNP is supported
    BIOS is upgradeable
    BIOS shadowing is allowed
    ESCD support is available
    Boot from CD is supported
    Selectable boot is supported
    EDD is supported
    5.25"/360 kB floppy services are supported (int 13h)
    5.25"/1.2 MB floppy services are supported (int 13h)
    3.5"/720 kB floppy services are supported (int 13h)
    Print screen service is supported (int 5h)
    8042 keyboard services are supported (int 9h)
    Serial services are supported (int 14h)
    Printer services are supported (int 17h)
    CGA/mono video services are supported (int 10h)
    ACPI is supported
    USB legacy is supported
    BIOS boot specification is supported
    Function key-initiated network boot is supported
    Targeted content distribution is supported
    Firmware Revision: 2.33
     
  19. Menno

    Menno New Member

    Joined:
    Aug 7, 2018
    Messages:
    6
    Likes Received:
    0
    Thanks Alwin, I was not aware there was an updated BIOS.

    These machines, however, are only for testing and are out of warranty, so I'm unable to upgrade the BIOS. We do have newer hardware available to continue our testing, so I'm in the process of upgrading the machines as we speak. Hopefully they play nicely with the latest kernel.

    I do consider this issue a regression, though, as a kernel upgrade should never break things; it also occurs on multiple machines of different generations (Gen 6 and 7) and has been reported by multiple users. Perhaps someone else is able to upgrade their BIOS to see if it resolves the issue, so it can either be marked as fixed that way or be debugged some more.
     
  20. David Herselman

    David Herselman Active Member
    Proxmox Subscriber

    Joined:
    Jun 8, 2016
    Messages:
    183
    Likes Received:
    38
    We have a 3-node HP ProLiant DL380 G7 cluster that is working perfectly:

    kvm1:
    Code:
    HP ProLiant DL380 G7 (583914-B21)
    BIOS: 05/05/2011
    kvm2:
    Code:
    HP ProLiant DL380 G7 (583914-B21)
    BIOS: 12/01/2010
    kvm3:
    Code:
    HP ProLiant DL380 G7 (583914-B21)
    BIOS: 12/01/2010
    Running Ceph 12.2.7, with two NICs in a LACP bond for VM traffic and another two NICs in a LACP bond for Ceph traffic.

    We're running OVS; perhaps that's different from your environment?
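    For comparison, a PVE Open vSwitch LACP bond is typically declared in /etc/network/interfaces along these lines. This is only a sketch; the interface names eno1/eno2 and the bridge name vmbr0 are placeholders, not taken from this cluster:

```
# /etc/network/interfaces (sketch)
allow-vmbr0 bond0
iface bond0 inet manual
    ovs_bridge vmbr0
    ovs_type OVSBond
    ovs_bonds eno1 eno2
    ovs_options bond_mode=balance-tcp lacp=active

allow-ovs vmbr0
iface vmbr0 inet manual
    ovs_type OVSBridge
    ovs_ports bond0
```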


    Code:
    [root@kvm1 ~]# pveversion -v
    proxmox-ve: 5.2-2 (running kernel: 4.15.18-1-pve)
    pve-manager: 5.2-5 (running version: 5.2-5/eb24855a)
    pve-kernel-4.15: 5.2-4
    pve-kernel-4.15.18-1-pve: 4.15.18-15
    corosync: 2.4.2-pve5
    criu: 2.11.1-1~bpo90
    glusterfs-client: 3.8.8-1
    ksm-control-daemon: 1.2-2
    libjs-extjs: 6.0.1-2
    libpve-access-control: 5.0-8
    libpve-apiclient-perl: 2.0-5
    libpve-common-perl: 5.0-35
    libpve-guest-common-perl: 2.0-17
    libpve-http-server-perl: 2.0-9
    libpve-storage-perl: 5.0-24
    libqb0: 1.0.1-1
    lvm2: 2.02.168-pve6
    lxc-pve: 3.0.0-3
    lxcfs: 3.0.0-1
    novnc-pve: 1.0.0-1
    openvswitch-switch: 2.7.0-3
    proxmox-widget-toolkit: 1.0-19
    pve-cluster: 5.0-28
    pve-container: 2.0-24
    pve-docs: 5.2-4
    pve-firewall: 3.0-13
    pve-firmware: 2.0-5
    pve-ha-manager: 2.0-5
    pve-i18n: 1.0-6
    pve-libspice-server1: 0.12.8-3
    pve-qemu-kvm: 2.11.2-1
    pve-xtermjs: 1.0-5
    qemu-server: 5.0-29
    smartmontools: 6.5+svn4324-1
    spiceterm: 3.0-5
    vncterm: 1.5-3
    zfsutils-linux: 0.7.9-pve1~bpo9
    (The pveversion -v output on kvm2 and kvm3 is identical to kvm1's above.)

    Hrm... I don't see Ceph in the pveversion -v output. We are using the Proxmox apt sources, though:
    Code:
    [root@kvm1 ~]# cat /etc/apt/sources.list.d/ceph.list
    deb http://download.proxmox.com/debian/ceph-luminous stretch main
    
    [root@kvm1 sources.list.d]# dpkg -l | grep ceph
    ii  ceph-base                            12.2.7-pve1                    amd64        common ceph daemon libraries and management tools
    ii  ceph-common                          12.2.7-pve1                    amd64        common utilities to mount and interact with a ceph storage cluster
    ii  ceph-fuse                            12.2.7-pve1                    amd64        FUSE-based client for the Ceph distributed file system
    ii  ceph-mds                             12.2.7-pve1                    amd64        metadata server for the ceph distributed file system
    ii  ceph-mgr                             12.2.7-pve1                    amd64        manager for the ceph distributed storage system
    ii  ceph-mon                             12.2.7-pve1                    amd64        monitor server for the ceph storage system
    ii  ceph-osd                             12.2.7-pve1                    amd64        OSD server for the ceph storage system
    ii  libcephfs1                           10.2.10-1~bpo80+1              amd64        Ceph distributed file system client library
    ii  libcephfs2                           12.2.7-pve1                    amd64        Ceph distributed file system client library
    ii  python-ceph                          12.2.7-pve1                    amd64        Meta-package for python libraries for the Ceph libraries
    ii  python-cephfs                        12.2.7-pve1                    amd64        Python 2 libraries for the Ceph libcephfs library
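    To compare the installed Ceph package versions across nodes at a glance, the dpkg -l listing above can be reduced to name/version pairs. A small sketch:

```shell
# pkg_versions: extract "name version" from the installed ('ii') rows
# of dpkg -l style output
pkg_versions() {
  awk '/^ii/ {print $2, $3}'
}

# typical use:
#   dpkg -l | grep ceph | pkg_versions
```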
     