VE 4.0 Kernel Panic on HP Proliant servers

Hi, another way could be to disable the motherboard watchdog so that the HP iLO watchdog is used by default.
Edit /etc/default/grub:
Code:
GRUB_CMDLINE_LINUX_DEFAULT="nmi_watchdog=0"
#update-grub
#reboot
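After the reboot it may be worth confirming that the parameter actually reached the running kernel. This is a minimal sketch, assuming a POSIX shell. The helper name has_kernel_param is made up for illustration, and its optional second argument exists only so the check can be tried against a sample file instead of /proc/cmdline.

```shell
# Hypothetical helper: succeed if the given parameter appears as a whole
# word on the kernel command line (default source: /proc/cmdline).
has_kernel_param() {
    # Put each space-separated word on its own line, then match exactly.
    tr -s ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Example use after the reboot:
#   has_kernel_param nmi_watchdog=0 && echo "NMI watchdog is disabled"
```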
Hi, with this approach, and after stopping the service with "service watchdog-mux stop", I get a kernel panic :confused: If I stop the watchdog-mux service and blacklist hpwdt, HA doesn't work properly. If I power off a node, the VM stays in a frozen state and does not migrate. I have an HP ProLiant DL380 G6. Suggestions?
 
Hello, did you reboot the machine before the test?
The machine needs to be up and running before HA can do its work. You also need shared storage for the images.
Code:
Important: note that before enabling HA for a service you should test it thoughtfully. See if migration works, look that NO local resources are used by it. Secure that it may run on all nodes defined by its group and even better on all cluster nodes.
Code:
Note: When you gracefully shut down a node, its services won't get migrated by the HA stack. You have to migrate them manually before you power off your node (for example for hardware maintenance).
more info: https://pve.proxmox.com/wiki/High_Availability_Cluster_4.x
Test whether this works for you:
With HA enabled for the VMID and the VM started, unplug the network cable from the node and check whether HA works or not :)
What version of Proxmox do you have?
Sorry for my bad English :(
HA is working for me on HP ProLiant G5 360L and HP ProLiant SE316M1 G6.
I hope this helps!! :)
 
Hello everybody! This is my first post on forum.proxmox.
Thank you for this post and the help.
I tested this on HP ProLiant servers: iLO + watchdog on Linux produces a kernel panic when you use HA on Proxmox.
But you can solve it like this. The module that causes it is hpwdt. On each HP node, do the following:
Code:
lsmod | grep hpwdt   # check that the module is loaded
Stop the service watchdog-mux
Code:
 service watchdog-mux stop
Add the module to the blacklist:
Code:
nano /etc/modprobe.d/pve-blacklist.conf
Add the following line to the file:
Code:
  blacklist hpwdt
Save the file and reboot
Code:
reboot
Check again that the module is no longer loaded:
Code:
 lsmod|grep hpwdt
My configuration: 2 HP ProLiant servers + 1 other machine with Proxmox 4. HA is working now :)
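The manual steps above could be wrapped in two small helpers so they can be repeated on each node without adding duplicate lines. This is only a sketch under assumptions: the function names module_loaded and blacklist_module are invented here, the default path uses the .conf file name, and the optional file arguments exist so the helpers can be tried on scratch files first. The real run needs root, and the reboot is still required afterwards.

```shell
# Hypothetical helper: succeed if a module is listed in /proc/modules
# (the same information that lsmod prints).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}

# Hypothetical helper: append "blacklist <module>" to a modprobe.d file
# unless that exact line is already present.
blacklist_module() {
    module=$1
    conf=${2:-/etc/modprobe.d/pve-blacklist.conf}
    # -q quiet, -x whole-line match, -F fixed string; the file may not exist yet.
    if ! grep -qxF "blacklist $module" "$conf" 2>/dev/null; then
        printf 'blacklist %s\n' "$module" >> "$conf"
    fi
}

# Example use on a node (as root), followed by a reboot:
#   module_loaded hpwdt && blacklist_module hpwdt
```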

I am also running into the same issue. I blacklisted the module and all seems to be well. However, the blacklist procedure was off a bit.

Instead of
Code:
nano /etc/modprobe.d/pve-blacklist

I had to do
Code:
nano /etc/modprobe.d/pve-blacklist.conf
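The file name matters because modprobe only reads files in /etc/modprobe.d whose names end in .conf. A quick way to spot files that would be silently ignored is sketched below. The function name list_ignored_modprobe_files is made up for this example, and the directory argument is optional so it can be pointed at a test directory.

```shell
# Hypothetical helper: print regular files in a modprobe.d-style directory
# that do not end in .conf and would therefore not be read by modprobe.
list_ignored_modprobe_files() {
    dir=${1:-/etc/modprobe.d}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        case $f in
            *.conf) ;;                   # read by modprobe, nothing to report
            *) printf '%s\n' "$f" ;;     # suffix missing, file is ignored
        esac
    done
}
```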
 
Hi All
Update on this topic:
hpwdt will be disabled by default in the next PVE kernel releases
(similar to what Ubuntu does).

Thanks for bringing up the issue !
 
Yes, thank you, you are right.
I edited my post, solved :)
 
Hi All,

Has anyone tried the latest PVE version, 4.0-64?
Does it resolve this issue?

Thanks,
H
 
Same problem here.
HA doesn't work properly. If I power off a node, the VM stays in a frozen state and does not migrate. I have an HP ProLiant DL380 G8. Suggestions?
 
This is normal:
Code:
Note: When you gracefully shut down a node, its services won't get migrated by the HA stack. You have to migrate them manually before you power off your node (for example for hardware maintenance).
More info : https://pve.proxmox.com/wiki/High_Availability_Cluster_4.x
 
Even after an unexpected power outage, the VMs do not migrate.
How can I test HA?

How did you determine that? What was your test? If it fails, please attach logs from the node that was powered down and from the current master (see "ha-manager status" to determine which node that is).
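To find the current master in a script, the status output can be filtered. The sketch below assumes that "ha-manager status" prints one line of the form "master <node> (...)", which should be checked against the real output on your version. The helper name ha_master is invented for this example, and it reads from standard input so it can be tested without a cluster.

```shell
# Hypothetical helper: read "ha-manager status" output on stdin and print
# the node name from the line that starts with "master".
# Assumed line layout: master <node> (<state>, <timestamp>)
ha_master() {
    awk '$1 == "master" { print $2; exit }'
}

# Example use on a cluster node:
#   ha-manager status | ha_master
```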

Note that we made some changes to the behaviour of a graceful shutdown/reboot. As a result, the services/node will be fenced if the powered-down node does not come back fast enough (within about 2 minutes).

This is included in the pve-ha-manager package with version > 1.0-15, already available in the pvetest repo AFAIK.
 
Hi guys, I tested the hpwdt module again on Proxmox 4.4 on a ProLiant DL360 G7, and I still have bad news :D

In my test lab I did the following:

edit: /etc/default/grub
Code:
GRUB_CMDLINE_LINUX_DEFAULT="nmi_watchdog=0"
#update-grub
hpasmcli disable asr     #asr disabled now
#reboot
Then I added the following to /etc/default/pve-ha-manager:
Code:
WATCHDOG_MODULE=hpwdt
#reboot
#service watchdog-mux status  (this is active)
#journalctl -u watchdog-mux

Jan 19 07:39:24 proxmoxtestlab systemd[1]: Starting Proxmox VE watchdog multiplexer...
Jan 19 07:39:24 proxmoxtestlab systemd[1]: Started Proxmox VE watchdog multiplexer.
Jan 19 07:39:24 proxmoxtestlab watchdog-mux[1205]: Loading watchdog module 'hpwdt'
Jan 19 07:39:24 proxmoxtestlab watchdog-mux[1205]: Watchdog driver 'HP iLO2+ HW Watchdog Timer', version 0

Code:
#echo "A" | socat - UNIX-CONNECT:/var/run/watchdog-mux.sock
I get a kernel panic when I run this.


With ASR enabled there is no kernel panic, but it still does not work. When I run echo "A" | socat - UNIX-CONNECT:/var/run/watchdog-mux.sock, nothing happens, and the watchdog-mux log shows "Client did not stop watchdog -disabled watchdog updates".

I had bad experiences with softdog and Proxmox 4.1 HA, but I will try testing this again with Proxmox 4.4.

I hope this post helps people!
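When repeating a test like this across several nodes, it can help to confirm which watchdog module each node is configured to load before rebooting. This is a rough sketch, assuming the setting is written as a plain WATCHDOG_MODULE=<name> line in /etc/default/pve-ha-manager as in the post above. The function name configured_watchdog_module is hypothetical.

```shell
# Hypothetical helper: print the value of the last uncommented
# WATCHDOG_MODULE= line, or nothing if the setting is absent.
configured_watchdog_module() {
    sed -n 's/^WATCHDOG_MODULE=//p' "${1:-/etc/default/pve-ha-manager}" | tail -n 1
}

# Example use:
#   echo "configured watchdog module: $(configured_watchdog_module)"
```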
 
