Hello,
I have Proxmox on 3 whitebox hosts, and 2 of them are flawless. The third, however, keeps locking up, and each time I have to physically reboot the host. I have the latest updates for Proxmox 5.0 installed.
I have attached the error screenshot.
Here is the pveversion output:
Code:
pveversion --verbose
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LC_CTYPE = "en_CA.UTF-8",
LANG = "en_US.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("en_US.UTF-8").
proxmox-ve: 5.0-15 (running kernel: 4.10.15-1-pve)
pve-manager: 5.0-23 (running version: 5.0-23/af4267bf)
pve-kernel-4.10.15-1-pve: 4.10.15-15
libpve-http-server-perl: 2.0-5
lvm2: 2.02.168-pve2
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-10
qemu-server: 5.0-12
pve-firmware: 2.0-2
libpve-common-perl: 5.0-16
libpve-guest-common-perl: 2.0-11
libpve-access-control: 5.0-5
libpve-storage-perl: 5.0-12
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.0-6
pve-qemu-kvm: 2.9.0-2
pve-container: 2.0-14
pve-firewall: 3.0-1
pve-ha-manager: 2.0-2
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.0.8-3
lxcfs: 2.0.7-pve2
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.6.5.9-pve16~bpo90
Here are the specs of the machine:
CPU: Intel - Xeon E5-2620 V3 2.4GHz 6-Core Processor ($404.99 @ SuperBiiz)
CPU Cooler: Noctua - NH-U12S 55.0 CFM CPU Cooler ($57.99 @ Amazon)
Motherboard: Asus - X99-M WS Micro ATX LGA2011-3 Motherboard ($263.99 @ SuperBiiz)
Storage: Samsung - 850 Pro Series 1TB 2.5" Solid State Drive ($421.39 @ Amazon)
Storage: Western Digital - Red Pro 6TB 3.5" 7200RPM Internal Hard Drive ($330.00)
Storage: Western Digital - Red Pro 6TB 3.5" 7200RPM Internal Hard Drive ($330.00)
Storage: Western Digital - Red Pro 6TB 3.5" 7200RPM Internal Hard Drive ($330.00)
Storage: Western Digital - Red Pro 6TB 3.5" 7200RPM Internal Hard Drive ($330.00)
Storage: Western Digital - Red Pro 6TB 3.5" 7200RPM Internal Hard Drive ($330.00)
Storage: Western Digital - Red Pro 6TB 3.5" 7200RPM Internal Hard Drive ($330.00)
Video Card: EVGA - GeForce GT 730 2GB Video Card ($67.09 @ Amazon)
Case: Fractal Design - Define XL R2 (Black Pearl) ATX Full Tower Case ($109.99 @ SuperBiiz)
Power Supply: EVGA - SuperNOVA G2 750W 80+ Gold Certified Fully-Modular ATX Power Supply ($112.89 @ OutletPC)
Other: Intel X540T2 Ethernet 10 Gbps Network Adapter
Other: Kingston 32GB (2 x 16GB) 288-Pin DDR4 SDRAM ECC Registered DDR4 2133 (PC4 17000) Server Memory Model KVR21R15D4K4
Total: $1438.33
Prices include shipping, taxes, and discounts when available
Generated by PCPartPicker 2017-08-25 16:14 EDT-0400
Please let me know what I can try or what other information I can provide. This host ran ESXi 6.0U3 for months without any issues. I am running the latest BIOS for my motherboard, so it should have fairly recent microcode.
I am not new to Linux, but I thought the idea behind a watchdog was to reboot the server once an error is detected, not just warn about it. I may be mistaken.