soft lockup bug

ipkpjersi

Hello,

I have Proxmox on 3 whitebox hosts, and 2 of them are flawless. The third one, however, keeps locking up, and each time I have to physically reboot the host. I have the latest updates for Proxmox 5.0 installed.

I have attached the error screenshot.

Here is the pveversion output:
Code:
pveversion --verbose
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
   LANGUAGE = (unset),
   LC_ALL = (unset),
   LC_CTYPE = "en_CA.UTF-8",
   LANG = "en_US.UTF-8"
   are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("en_US.UTF-8").
proxmox-ve: 5.0-15 (running kernel: 4.10.15-1-pve)
pve-manager: 5.0-23 (running version: 5.0-23/af4267bf)
pve-kernel-4.10.15-1-pve: 4.10.15-15
libpve-http-server-perl: 2.0-5
lvm2: 2.02.168-pve2
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-10
qemu-server: 5.0-12
pve-firmware: 2.0-2
libpve-common-perl: 5.0-16
libpve-guest-common-perl: 2.0-11
libpve-access-control: 5.0-5
libpve-storage-perl: 5.0-12
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.0-6
pve-qemu-kvm: 2.9.0-2
pve-container: 2.0-14
pve-firewall: 3.0-1
pve-ha-manager: 2.0-2
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.0.8-3
lxcfs: 2.0.7-pve2
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.6.5.9-pve16~bpo90



Here are the specs of the machine:
CPU: Intel - Xeon E5-2620 V3 2.4GHz 6-Core Processor ($404.99 @ SuperBiiz)
CPU Cooler: Noctua - NH-U12S 55.0 CFM CPU Cooler ($57.99 @ Amazon)
Motherboard: Asus - X99-M WS Micro ATX LGA2011-3 Motherboard ($263.99 @ SuperBiiz)
Storage: Samsung - 850 Pro Series 1TB 2.5" Solid State Drive ($421.39 @ Amazon)
Storage: Western Digital - Red Pro 6TB 3.5" 7200RPM Internal Hard Drive ($330.00)
Storage: Western Digital - Red Pro 6TB 3.5" 7200RPM Internal Hard Drive ($330.00)
Storage: Western Digital - Red Pro 6TB 3.5" 7200RPM Internal Hard Drive ($330.00)
Storage: Western Digital - Red Pro 6TB 3.5" 7200RPM Internal Hard Drive ($330.00)
Storage: Western Digital - Red Pro 6TB 3.5" 7200RPM Internal Hard Drive ($330.00)
Storage: Western Digital - Red Pro 6TB 3.5" 7200RPM Internal Hard Drive ($330.00)
Video Card: EVGA - GeForce GT 730 2GB Video Card ($67.09 @ Amazon)
Case: Fractal Design - Define XL R2 (Black Pearl) ATX Full Tower Case ($109.99 @ SuperBiiz)
Power Supply: EVGA - SuperNOVA G2 750W 80+ Gold Certified Fully-Modular ATX Power Supply ($112.89 @ OutletPC)
Other: Intel X540T2 Ethernet 10 Gbps Network Adapter
Other: Kingston 32GB (2 x 16GB) 288-Pin DDR4 SDRAM ECC Registered DDR4 2133 (PC4 17000) Server Memory Model KVR21R15D4K4
Total: $1438.33
Prices include shipping, taxes, and discounts when available
Generated by PCPartPicker 2017-08-25 16:14 EDT-0400




Please let me know of some things I can try or some more information I can provide. This host was running ESXi 6.0U3 fine for months without any issues. I am running the latest BIOS for my motherboard so it should have fairly recent microcode.

I am not super new to Linux, but I thought that the idea behind watchdog is to reboot a server once there is an error detected, not just warn about it? I may be mistaken.
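
On the watchdog question: as far as I understand, the kernel's soft lockup detector only logs a warning by default on most kernels. It panics only if the kernel.softlockup_panic sysctl is set to 1, and a panic leads to a reboot only if kernel.panic is non-zero. Below is a rough Python sketch for checking those settings. The function names read_knobs and describe are made up for illustration, and it assumes the sysctl files are exposed under /proc/sys/kernel as on a standard Linux kernel.

```python
from pathlib import Path

# Sysctl knobs that govern how the kernel reacts to a soft lockup.
KNOBS = ("watchdog", "watchdog_thresh", "softlockup_panic", "panic")


def read_knobs(base="/proc/sys/kernel"):
    """Return {knob: int or None}; None means the file is missing or unreadable."""
    values = {}
    for name in KNOBS:
        try:
            values[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            values[name] = None
    return values


def describe(values):
    """Summarise the expected reaction to a soft lockup (illustrative wording)."""
    if not values.get("watchdog"):
        return "lockup detector disabled or unavailable"
    if not values.get("softlockup_panic"):
        return "warn only: the kernel logs the lockup and keeps running"
    delay = values.get("panic") or 0
    if delay > 0:
        return f"panic, then reboot after {delay} s"
    if delay < 0:
        return "panic, then reboot immediately"
    return "panic and halt: no automatic reboot"


if __name__ == "__main__":
    knobs = read_knobs()
    for name, value in knobs.items():
        print(f"kernel.{name} = {value}")
    print(describe(knobs))
```

If this reports "warn only", that would match what I am seeing: the host prints the soft lockup message but never reboots on its own.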
 

Attachments

  • SK8NfQC.jpg
You know, I was thinking earlier today that it might be my GPU, because nothing else seemed a likely cause. The weird thing is, I have a nearly identical server, differing only in RAM, motherboard, and CPU (5820K, X99 Deluxe U3.1, etc.), and it has the same GT 730 GPU without any crashing.
 
It could be a combination of chipset + NVIDIA (nouveau driver), but this is 99.9% likely to fix it.
Yup, that fixed it, thanks! I will bump this thread if the issue comes back, but I think you solved it. Thanks again!
 
