Major issue with proxmox

adamb

Hey all, I'm having a major issue with Proxmox at the moment.

I have a virtual machine that the GUI says is stopped, but it is most definitely running. clustat shows the VM running on node 1, while the GUI shows it on node 2 and not running. All the other VMs are running fine. This happens to be the first VM we put into production. Luckily it's running, but the GUI says otherwise.

I appreciate any input!
 
Looks like it is unable to find the configuration file. This VM hasn't been touched since day 1. Odd that the config file just disappears.


root@proxmox2:~# qm start 101
Executing HA start for VM 101
Member proxmox2 trying to enable pvevm:101...Aborted; service failed
command 'clusvcadm -e pvevm:101 -m proxmox2' failed: exit code 254



root@proxmox2:~# clustat
Cluster Status for ccs @ Wed Jul 11 10:29:28 2012
Member Status: Quorate

Member Name    ID   Status
------ ----    ---- ------
proxmox1       1    Online, rgmanager
proxmox2       2    Online, Local, rgmanager

Service Name   Owner (Last)   State
------- ----   ----- ------   -----
pvevm:100      proxmox2       started
pvevm:101      (proxmox1)     failed
pvevm:102      proxmox2       started
pvevm:103      proxmox2       started
pvevm:104      proxmox2       started


root@proxmox2:~# clusvcadm -e pvevm:101
Local machine trying to enable pvevm:101...Aborted; service failed
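
For anyone who wants to spot this state without reading the table by eye, here is a minimal Python sketch that picks the failed services out of clustat text. It assumes the line layout shown above (service, owner, state) and that a parenthesised owner means "last owner", as the column header suggests. The names `SERVICE_LINE`, `parse_services` and `failed_services` are made up for this example and are not part of any Proxmox or rgmanager tooling.

```python
import re

# Service section copied from the clustat output posted above.
SAMPLE = """\
Service Name Owner (Last) State
------- ---- ----- ------ -----
pvevm:100 proxmox2 started
pvevm:101 (proxmox1) failed
pvevm:102 proxmox2 started
pvevm:103 proxmox2 started
pvevm:104 proxmox2 started
"""

# Hypothetical pattern: matches lines such as "pvevm:101 (proxmox1) failed".
SERVICE_LINE = re.compile(r"^\s*(pvevm:\d+)\s+(\(?[\w.-]+\)?)\s+(\w+)\s*$")


def parse_services(text):
    """Return a list of (service, owner, state, owner_is_last) tuples."""
    services = []
    for line in text.splitlines():
        match = SERVICE_LINE.match(line)
        if not match:
            continue  # skip headers, separators and blank lines
        name, owner, state = match.groups()
        # Assumption: parentheses mark the last known owner, not a current one.
        is_last = owner.startswith("(")
        services.append((name, owner.strip("()"), state, is_last))
    return services


def failed_services(text):
    """Keep only the services whose state column reads 'failed'."""
    return [svc for svc in parse_services(text) if svc[2] == "failed"]


if __name__ == "__main__":
    for name, owner, state, is_last in failed_services(SAMPLE):
        label = "last owner" if is_last else "owner"
        print(f"{name} is {state} ({label}: {owner})")
```

On a live node the text would come from running `clustat` rather than from the pasted sample.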
 
Looks like I have all the config files.

root@proxmox1:/etc/pve/nodes/proxmox2/qemu-server# ls -ltr
total 3
-rw-r----- 1 root www-data 207 Jul 11 08:31 104.conf
-rw-r----- 1 root www-data 257 Jul 11 08:32 103.conf
-rw-r----- 1 root www-data 194 Jul 11 09:36 101.conf
-rw-r----- 1 root www-data 208 Jul 11 10:19 102.conf
-rw-r----- 1 root www-data 152 Jul 11 10:19 100.conf
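
Since the listing above shows 101.conf under the proxmox2 directory while clustat reports proxmox1 as the last owner, it may help to check which node directory actually holds a given VM's config. The sketch below is a rough Python helper for that. It only assumes the `/etc/pve/nodes/<node>/qemu-server/<vmid>.conf` layout visible in the path above; the function name `find_vm_config` is made up for this example.

```python
from pathlib import Path


def find_vm_config(vmid, root="/etc/pve/nodes"):
    """Return {node_name: config_path} for every node directory holding <vmid>.conf."""
    hits = {}
    for conf in Path(root).glob(f"*/qemu-server/{vmid}.conf"):
        # Layout assumed from the listing above: <root>/<node>/qemu-server/<vmid>.conf
        hits[conf.parent.parent.name] = conf
    return hits


if __name__ == "__main__":
    found = find_vm_config(101)
    if not found:
        print("No config for VM 101 under any node directory")
    for node, path in found.items():
        print(f"VM 101 config is under node {node}: {path}")
```

If the node reported here differs from the owner in clustat, that mismatch would be consistent with the GUI listing the VM on the wrong node, though this thread does not confirm that as the cause.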
 
I was able to get the VM backup.

- Remove from HA
- Start
- Add to HA
- clusvcadm -e pvevm:101

Not sure why I keep hitting this same issue over and over.
 
You first claimed that 'it is unable to find the configuration file', but then showed that the file is there. Then you posted 'I was able to get the VM backup' - I assume you restored from a backup? All in all this is quite confusing.
 
When I rebooted the 2nd node, the GUI on node 1 stated that it was unable to find the config file for VM 101. The funny thing is that the VM actually migrated to node 1 but still showed as down on node 2. I was able to find the config files manually from the command line, which is what I provided in my previous post.

Once node 2 came back up, I rebooted node 1. VM 101 would no longer come up and was in a failed state. I had to remove the VM from HA, start the VM, then enable it from the CLI. Once that was done I added it back to HA and all is well.

This is the third or fourth time this has happened with a VM. We have no issues other than this. It doesn't happen all the time, but every so often.
 
How do you manage quorum on your 2 node cluster (please post /etc/pve/cluster.conf)?

<?xml version="1.0"?>
<cluster config_version="45" name="ccs">
  <cman expected_votes="1" keyfile="/var/lib/pve-cluster/corosync.authkey" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="10.80.12.126" lanplus="1" login="USERID" name="ipmi1" passwd="PASSW0RD" power_wait="5"/>
    <fencedevice agent="fence_ipmilan" ipaddr="10.80.12.131" lanplus="1" login="USERID" name="ipmi2" passwd="PASSW0RD" power_wait="5"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="proxmox1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="ipmi1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="proxmox2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="ipmi2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <rm>
    <pvevm autostart="1" vmid="100"/>
    <pvevm autostart="1" vmid="102"/>
    <pvevm autostart="1" vmid="103"/>
    <pvevm autostart="1" vmid="104"/>
    <pvevm autostart="1" vmid="101"/>
  </rm>
</cluster>
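
To answer the quorum question at a glance, the config above can be summarised with a few lines of Python. This is only a sketch using the standard library XML parser on a trimmed copy of the posted file (fence sections left out). The function name `summarize_cluster_conf` is made up for this example. In this config, `two_node="1"` together with `expected_votes="1"` is the cman setting that lets a two-node cluster stay quorate with a single node up.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the cluster.conf posted above (fence devices omitted for brevity).
SAMPLE = """<?xml version="1.0"?>
<cluster config_version="45" name="ccs">
  <cman expected_votes="1" keyfile="/var/lib/pve-cluster/corosync.authkey" two_node="1"/>
  <clusternodes>
    <clusternode name="proxmox1" nodeid="1" votes="1"/>
    <clusternode name="proxmox2" nodeid="2" votes="1"/>
  </clusternodes>
  <rm>
    <pvevm autostart="1" vmid="100"/>
    <pvevm autostart="1" vmid="102"/>
    <pvevm autostart="1" vmid="103"/>
    <pvevm autostart="1" vmid="104"/>
    <pvevm autostart="1" vmid="101"/>
  </rm>
</cluster>
"""


def summarize_cluster_conf(xml_text):
    """Pull out the quorum settings, node names and HA-managed VM IDs."""
    root = ET.fromstring(xml_text)
    cman = root.find("cman")
    return {
        "config_version": int(root.get("config_version", "0")),
        "two_node": cman is not None and cman.get("two_node") == "1",
        "expected_votes": int(cman.get("expected_votes", "0")) if cman is not None else None,
        "nodes": [node.get("name") for node in root.iter("clusternode")],
        "ha_vmids": [int(vm.get("vmid")) for vm in root.iter("pvevm")],
    }


if __name__ == "__main__":
    for key, value in summarize_cluster_conf(SAMPLE).items():
        print(f"{key}: {value}")
```

On a real node the text would be read from /etc/pve/cluster.conf instead of the embedded sample.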
 
Hi All,

Sorry to bring this thread back up, but I am having the same problem. In the web browser, all my VMs show a status of not running, but when I check on the server they are actually running. I don't know where to start sorting this out. I'm afraid that if I restart I will lose something.

Hope someone can help me with this.

Note: I'm not running a cluster like the thread starter is. I am only using one server.


Regards,
Rocel
 
Please update to the latest version.
 
