HA questions in PVE 4

MasterTH

Hi,

I installed two nodes with Proxmox 4 to test the new features and to prepare for migrating our older clusters to the new version, so I'm currently trying to break the cluster.

One thing I ran into is that an LXC container was in an error state. I didn't know what to do, so I deleted it. I know that was not a good solution; I should be able to bring the VM back from the error state. (Log file: "service ct:100 is not running and in an error state")
In Proxmox 2 and 3 we had rgmanager and its clusvcadm to disable a pvevm:xxx and re-enable it when the VM did not boot because of an error state. What is the equivalent tool in PVE 4?
What can cause a VM to be in an error state?
What can I do when a VM is in this state?

I tried to migrate the LXC container from one host to another (disk stored on NFS), and it failed. Then I rebooted the server where the VM was running to force the migration.


kind regards
 
Hi, I have the same problem. I forgot to detach a local CD-ROM from a VM, and this caused errors in an HA migration test with Proxmox 4. Now the VM is in an error state and I can't find a way to clear that state. The VM no longer starts. Could someone please explain how to clear VM error states in PVE 4? Luis Miguel
 
Hi,

What to do in an error state is documented in the ha-manager man page:
$ man ha-manager

Code:
ERROR RECOVERY
       If after all tries the service state could not be recovered it gets placed in an error state. In this
       state the service won't get touched by the HA stack anymore.  To recover from this state you should
       follow these steps:

       ·   bring the resource back into an safe and consistent state (e.g: killing its process)

       ·   disable the ha resource to place it in an stopped state

       ·   fix the error which led to this failures

       ·   after you fixed all errors you may enable the service again

So simply disable the resource to place it in a stopped state.

I also updated the PVE 4 HA wiki entry, as it seems that people look there rather than in the man pages (both should be used :) ).
 
Thank you, Thomas! I had found that text in the manual :) but I didn't know how to disable the resource. Now I have found it: ha-manager disable vm:101, and the VM is working again. Thanks!
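
For readers landing here, the man page steps and the disable command reported above can be put together roughly as follows. This is only a sketch, not an official procedure: the helper names (run, ha_error_disable, ha_error_reenable) and the DRY_RUN switch are made up for illustration, and vm:101 is just the service ID from this thread. It assumes that the enable subcommand mirrors disable on PVE 4; newer releases may express the same thing through ha-manager set with a --state option, so check man ha-manager on your own node first.

```shell
#!/bin/sh
# Sketch of the man page's error recovery steps for one HA service.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1 (manual): bring the guest into a safe, consistent state first.
# Step 2: disable the HA resource so it moves from "error" to "stopped".
ha_error_disable() {
    run ha-manager disable "$1"
}

# Step 3 (manual): fix whatever caused the failure.
# Step 4: hand the service back to the HA stack.
# Assumption: "enable" is the counterpart of "disable" on this release.
ha_error_reenable() {
    run ha-manager enable "$1"
}

# Example (vm:101 is the service ID used in this thread):
#   ha_error_disable vm:101
#   ... fix the underlying problem ...
#   ha_error_reenable vm:101
```

The two steps are kept as separate functions on purpose, so that the manual fix in between cannot be skipped by accident.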
 
Sorry to revive this old post, but I have a very similar problem that keeps coming back for no apparent reason.
One of my VMs goes into the HA error state and I don't know why. I then disable it in HA and add it again, which fixes it.
But the same thing happens again the next day.
How can I find the cause of this behavior?
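
One possible starting point, offered only as a sketch: look at what the HA stack itself reports and at the logs of its two daemons around the time the service went into the error state. The helper names and the DRY_RUN switch below are hypothetical, "yesterday" is just an example time window, and the sketch assumes the node uses the systemd journal with the pve-ha-lrm and pve-ha-crm units. It does not identify the cause by itself; it only collects the output to read through.

```shell
#!/bin/sh
# Sketch: commands for inspecting why an HA service entered the error state.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Show the current state of the HA-managed services as seen by the HA stack.
ha_show_status() {
    run ha-manager status
}

# Show log messages from the local and cluster HA daemons.
# $1 is a journalctl time specification; "yesterday" is only an example default.
ha_show_logs() {
    since="${1:-yesterday}"
    run journalctl -u pve-ha-lrm -u pve-ha-crm --since "$since"
}

# Example:
#   ha_show_status
#   ha_show_logs "yesterday"
```

Running the log helper on the node where the VM was last active, and again on the other nodes, may help narrow down which start or migration attempt failed before the service was marked as being in the error state.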
 
