Authentication failed when migrating

I have four Proxmox nodes and I have some VMs set up and running on the first node. When I migrate a VM from node one to node two, three, or four, I lose the console connection even though it is an online migration. When I try to open the console back up, it gives these messages in this order:

Status: X509
Status: Plain Authentication
Error: Failed Authentication

If I move the VM back to the first node, the console will start working again.
 
Any ideas? I am still getting this message. I can't access any of the VMs that aren't running on my first node.
 
I have tried that. I have even tried stopping the VM after migration and starting it back up.
If I move the VM back to Node 1 then the console window will come up.
 
/var/log/syslog
Jun 14 17:06:09 KNTCLCN002 qm[33961]: start VM 101: UPID:KNTCLCN002:000084A9:006528CC:4FDA6051:qmstart:101:root@pam:
Jun 14 17:06:09 KNTCLCN002 qm[33958]: <root@pam> starting task UPID:KNTCLCN002:000084A9:006528CC:4FDA6051:qmstart:101:root@pam:
Jun 14 17:06:09 KNTCLCN002 kernel: device tap101i0 entered promiscuous mode
Jun 14 17:06:09 KNTCLCN002 kernel: vmbr0: port 2(tap101i0) entering forwarding state
Jun 14 17:06:09 KNTCLCN002 kernel: New device tap101i0 does not support netpoll
Jun 14 17:06:09 KNTCLCN002 kernel: Disabling netpoll for vmbr0
Jun 14 17:06:09 KNTCLCN002 qm[33958]: <root@pam> end task UPID:KNTCLCN002:000084A9:006528CC:4FDA6051:qmstart:101:root@pam: OK
Jun 14 17:06:19 KNTCLCN002 pmxcfs[1425]: [status] notice: received log
Jun 14 17:06:20 KNTCLCN002 kernel: tap101i0: no IPv6 routers present
Jun 14 17:06:20 KNTCLCN002 pmxcfs[1425]: [status] notice: received log
Jun 14 17:06:26 KNTCLCN002 pmxcfs[1425]: [status] notice: received log
Jun 14 17:06:26 KNTCLCN002 pmxcfs[1425]: [status] notice: received log
Jun 14 17:06:27 KNTCLCN002 ntpd[1326]: Listen normally on 12 tap101i0 fe80::4c07:d2ff:fed1:ecf3 UDP 123
Jun 14 17:06:31 KNTCLCN002 pmxcfs[1425]: [status] notice: received log
Jun 14 17:06:31 KNTCLCN002 pmxcfs[1425]: [status] notice: received log
Jun 14 17:06:33 KNTCLCN002 pmxcfs[1425]: [status] notice: received log
Jun 14 17:06:33 KNTCLCN002 pmxcfs[1425]: [status] notice: received log
Jun 14 17:06:36 KNTCLCN002 pmxcfs[1425]: [status] notice: received log
Jun 14 17:06:40 KNTCLCN002 rgmanager[34078]: [pvevm] VM 101 is running
Jun 14 17:06:41 KNTCLCN002 pmxcfs[1425]: [status] notice: received log
Jun 14 17:06:41 KNTCLCN002 pmxcfs[1425]: [status] notice: received log
Jun 14 17:06:49 KNTCLCN002 pvedaemon[2407]: authentication failure; rhost= user=root@pam msg=Authentication failure
Jun 14 17:06:49 KNTCLCN002 pmxcfs[1425]: [status] notice: received log
Jun 14 17:06:50 KNTCLCN002 rgmanager[34128]: [pvevm] VM 101 is running
Jun 14 17:07:06 KNTCLCN002 pmxcfs[1425]: [status] notice: received log
Jun 14 17:07:20 KNTCLCN002 rgmanager[34243]: [pvevm] VM 101 is running

/var/log/apache/error.log
[Wed Jun 13 22:41:37 2012] [warn] RSA server certificate CommonName (CN) `KNTCLCN002.XXXXXXX.com' does NOT match server name!?
[Wed Jun 13 22:41:37 2012] [warn] RSA server certificate CommonName (CN) `KNTCLCN002.XXXXXXX.com' does NOT match server name!?
[Wed Jun 13 22:41:38 2012] [warn] RSA server certificate CommonName (CN) `KNTCLCN002.XXXXXXX.com' does NOT match server name!?
[Wed Jun 13 22:41:38 2012] [warn] RSA server certificate CommonName (CN) `KNTCLCN002.XXXXXXX.com' does NOT match server name!?
[Wed Jun 13 22:41:38 2012] [notice] Apache/2.2.16 (Debian) mod_ssl/2.2.16 OpenSSL/0.9.8o mod_perl/2.0.4 Perl/v5.10.1 configured -- resuming normal operations
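Those repeated CN warnings can be compared directly against what the node thinks its own name is, roughly like this (paths are the stock PVE locations; adjust if yours differ):

# hostname -f
# openssl x509 -in /etc/pve/local/pve-ssl.pem -noout -subject

If the two disagree, regenerating the node certificates (as suggested in the next reply) is the usual way to clear the warning.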
 
Maybe something is wrong with the certificates. Try:

# pvecm -f updatecerts

Then restart that server.
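Spelled out as a rough sequence (a sketch only; the updatecerts syntax can differ between PVE releases, so check man pvecm on your version, and a full reboot covers the same ground if that is easier):

# pvecm -f updatecerts        # regenerate the node/cluster SSL certificates
# service apache2 restart     # web interface (PVE 2.x serves it through Apache)
# service pvedaemon restart   # the daemon that logged the authentication failures above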
 
I tried that. I am still having the same issue. I do have the Proxmox nodes set up in an HA cluster.
 
I tried an offline migration. Proxmox says:
Executing HA migrate for VM 100 to node KNTCLCN003
Trying to migrate pvevm:100 to KNTCLCN003...Temporary failure; try again
TASK ERROR: command 'clusvcadm -M pvevm:100 -m KNTCLCN003' failed: exit code 250


When I execute the command "clusvcadm -M pvecm:100 -m KNTCLCN003" directly into the console, I get:
Trying to migrate pvecm:100 to KNTCLCN003...Service does not exist
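For what it's worth, rgmanager only recognizes the service names exactly as they are defined in cluster.conf below (pvevm:<vmid>), so it can help to list them first and then retry the manual migrate with the exact name, along these lines:

# clustat                                  # lists pvevm:100 / pvevm:101 and their current owner
# clusvcadm -M pvevm:100 -m KNTCLCN003     # note the service name: pvevm, not pvecm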
 
On all four of my nodes, /etc/pve/cluster.conf is as follows:
<?xml version="1.0"?>
<cluster name="Cluster" config_version="6">
<cman keyfile="/var/lib/pve-cluster/corosync.authkey">
</cman>
<fencedevices>
<fencedevice agent="fence_ipmilan" name="KNTCLCN001" lanplus="1" ipaddr="10.89.99.51" login="root" passwd="password" power_wait="5"/>
<fencedevice agent="fence_ipmilan" name="KNTCLCN002" lanplus="1" ipaddr="10.89.99.50" login="root" passwd="password" power_wait="5"/>
<fencedevice agent="fence_ipmilan" name="KNTCLCN003" lanplus="1" ipaddr="10.89.99.49" login="root" passwd="password" power_wait="5"/>
<fencedevice agent="fence_ipmilan" name="KNTCLCN004" lanplus="1" ipaddr="10.89.99.48" login="root" passwd="password" power_wait="5"/>
</fencedevices>

<clusternodes>
<clusternode name="KNTCLCN001" votes="1" nodeid="1">
<fence>
<method name="1">
<device name="KNTCLCN001"/>
</method>
</fence>
</clusternode>
<clusternode name="KNTCLCN002" votes="1" nodeid="2">
<fence>
<method name="1">
<device name="KNTCLCN002"/>
</method>
</fence>
</clusternode>
<clusternode name="KNTCLCN003" votes="1" nodeid="3">
<fence>
<method name="1">
<device name="KNTCLCN003"/>
</method>
</fence>
</clusternode>
<clusternode name="KNTCLCN004" votes="1" nodeid="4">
<fence>
<method name="1">
<device name="KNTCLCN004"/>
</method>
</fence>
</clusternode>

</clusternodes>
<rm>
<service autostart="1" exclusive="0" name="ha_test_ip" recovery="relocate">
<ip address="10.89.99.30"/>
</service>
<pvevm autostart="1" vmid="100"/>
<pvevm autostart="1" vmid="101"/>
</rm>
</cluster>
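As a side note, after any hand edits to cluster.conf it can help to confirm that the file parses and that every node is running the same config version, for example (ccs_config_validate ships with the cman tooling; skip it if it is not installed on your build):

# ccs_config_validate      # syntax-check the active cluster configuration
# cman_tool version        # shows the config version this node is running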

On the node KNTCLCN001, which works, clustat -x:
<?xml version="1.0"?>
<clustat version="4.1.1">
<cluster name="Cluster" id="11316" generation="108"/>
<quorum quorate="1" groupmember="1"/>
<nodes>
<node name="KNTCLCN001" state="1" local="1" estranged="0" rgmanager="1" rgmanager_master="0" qdisk="0" nodeid="0x00000001"/>
<node name="KNTCLCN003" state="1" local="0" estranged="0" rgmanager="1" rgmanager_master="0" qdisk="0" nodeid="0x00000002"/>
<node name="KNTCLCN004" state="1" local="0" estranged="0" rgmanager="1" rgmanager_master="0" qdisk="0" nodeid="0x00000003"/>
</nodes>
<groups>
<group name="pvevm:100" state="112" state_str="started" flags="0" flags_str="" owner="KNTCLCN003" last_owner="KNTCLCN004" restarts="3" last_transition="1340250847" last_transition_str="Wed Jun 20 22:54:07 2012"/>
<group name="pvevm:101" state="112" state_str="started" flags="0" flags_str="" owner="KNTCLCN003" last_owner="KNTCLCN004" restarts="0" last_transition="1340250893" last_transition_str="Wed Jun 20 22:54:53 2012"/>
</groups>
</clustat>

On the node KNTCLCN003, which doesn't work, clustat -x:
<clustat version="4.1.1">
<cluster name="Cluster" id="11316" generation="108"/>
<quorum quorate="1" groupmember="1"/>
<nodes>
<node name="KNTCLCN001" state="1" local="0" estranged="0" rgmanager="1" rgmanager_master="0" qdisk="0" nodeid="0x00000001"/>
<node name="KNTCLCN003" state="1" local="1" estranged="0" rgmanager="1" rgmanager_master="0" qdisk="0" nodeid="0x00000002"/>
<node name="KNTCLCN004" state="1" local="0" estranged="0" rgmanager="1" rgmanager_master="0" qdisk="0" nodeid="0x00000003"/>
</nodes>
<groups>
<group name="pvevm:100" state="112" state_str="started" flags="0" flags_str="" owner="KNTCLCN003" last_owner="KNTCLCN004" restarts="3" last_transition="1340251271" last_transition_str="Wed Jun 20 23:01:11 2012"/>
<group name="pvevm:101" state="112" state_str="started" flags="0" flags_str="" owner="KNTCLCN003" last_owner="KNTCLCN004" restarts="0" last_transition="1340250893" last_transition_str="Wed Jun 20 22:54:53 2012"/>
</groups>
</clustat>

On the node KNTCLCN004, which doesn't work, clustat -x:
<?xml version="1.0"?>
<clustat version="4.1.1">
<cluster name="Cluster" id="11316" generation="108"/>
<quorum quorate="1" groupmember="1"/>
<nodes>
<node name="KNTCLCN001" state="1" local="0" estranged="0" rgmanager="1" rgmanager_master="0" qdisk="0" nodeid="0x00000001"/>
<node name="KNTCLCN003" state="1" local="0" estranged="0" rgmanager="1" rgmanager_master="0" qdisk="0" nodeid="0x00000002"/>
<node name="KNTCLCN004" state="1" local="1" estranged="0" rgmanager="1" rgmanager_master="0" qdisk="0" nodeid="0x00000003"/>
</nodes>
<groups>
<group name="pvevm:100" state="112" state_str="started" flags="0" flags_str="" owner="KNTCLCN003" last_owner="KNTCLCN004" restarts="3" last_transition="1340251271" last_transition_str="Wed Jun 20 23:01:11 2012"/>
<group name="pvevm:101" state="112" state_str="started" flags="0" flags_str="" owner="KNTCLCN003" last_owner="KNTCLCN004" restarts="0" last_transition="1340250893" last_transition_str="Wed Jun 20 22:54:53 2012"/>
</groups>
</clustat>
 
I noticed this in the Proxmox interface under syslog for KNTCLCN003:

Jun 21 23:41:54 KNTCLCN003 pmxcfs[1416]: [status] notice: RRDC update error /var/lib/rrdcached/db/pve2-vm/101: -1
Jun 21 23:42:04 KNTCLCN003 pmxcfs[1416]: [status] notice: RRDC update error /var/lib/rrdcached/db/pve2-vm/101: -1
Jun 21 23:42:14 KNTCLCN003 rrdcached[1362]: queue_thread_main: rrd_update_r (/var/lib/rrdcached/db/pve2-vm/101) failed with status -1. (/var/lib/rrdcached/db/pve2-vm/101: illegal attempt to update using time 1340339867 when last update time is 1340340124 (minimum one second step))
Jun 21 23:42:15 KNTCLCN003 pmxcfs[1416]: [status] notice: received log
Jun 21 23:42:15 KNTCLCN003 pmxcfs[1416]: [status] notice: received log
Jun 21 23:42:19 KNTCLCN003 rgmanager[24812]: [pvevm] VM 101 is running
Jun 21 23:42:22 KNTCLCN003 pvedaemon[24666]: authentication failure; rhost= user=root@pam msg=Authentication failure
Jun 21 23:42:22 KNTCLCN003 pmxcfs[1416]: [status] notice: received log
Jun 21 23:42:31 KNTCLCN003 pvedaemon[2365]: worker 17081 finished
Jun 21 23:42:31 KNTCLCN003 pvedaemon[2365]: starting 1 worker(s)
Jun 21 23:42:31 KNTCLCN003 pvedaemon[2365]: worker 24866 started
Jun 21 23:42:39 KNTCLCN003 rgmanager[24887]: [pvevm] VM 101 is running
Jun 21 23:42:49 KNTCLCN003 rgmanager[24937]: [pvevm] VM 101 is running
Jun 21 23:43:19 KNTCLCN003 rgmanager[25027]: [pvevm] VM 101 is running
Jun 21 23:43:30 KNTCLCN003 kernel: kvm: emulating exchange as write
Jun 21 23:43:39 KNTCLCN003 rgmanager[25061]: [pvevm] VM 101 is running
Jun 21 23:43:49 KNTCLCN003 rgmanager[25088]: [pvevm] VM 101 is running

That got me thinking. I placed all my nodes in a cluster before ever opening the Proxmox interface. I accessed the web interface through the IP address of the first node in the cluster and never went to the other nodes. It never asked me to log into nodes KNTCLCN003 and KNTCLCN004 when I logged into the first node. Maybe, because nodes 3 and 4 never asked for a username/password, I am not authenticated on them as far as Proxmox is concerned?

So I went to the IP address of each node and logged in. Still no luck. Then I tried migrating a VM from KNTCLCN001 over to KNTCLCN003, went to the IP address of KNTCLCN003, and was able to pull up the console for the VM. So apparently I can only access a VM's console from the node it is running on. Odd, but at least there is a workaround.
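A rough way to check whether this is really a per-node login problem or a ticket problem is to compare the shared ticket-signing key and the clock between the node you log in on and the node running the VM (standard pmxcfs path below; hostnames are the ones from this cluster):

# md5sum /etc/pve/priv/authkey.key; date
# ssh KNTCLCN003 'md5sum /etc/pve/priv/authkey.key; date'

Matching checksums with clocks that are minutes apart would point at time skew rather than credentials, which fits the suggestion in the next reply.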

 

Jun 21 23:42:14 KNTCLCN003 rrdcached[1362]: queue_thread_main: rrd_update_r (/var/lib/rrdcached/db/pve2-vm/101) failed with status -1. (/var/lib/rrdcached/db/pve2-vm/101: illegal attempt to update using time 1340339867 when last update time is 1340340124 (minimum one second step))

It seems there is something wrong with the time on that host. Please make sure that all nodes have the correct time.
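A quick check, assuming root SSH between the nodes (which a PVE cluster normally has) and using ntpq against the ntpd already running per the syslog above:

# for n in KNTCLCN001 KNTCLCN002 KNTCLCN003 KNTCLCN004; do ssh $n date; done
# ntpq -p      # run on any node that drifts, to see its NTP peers

All four should agree to within a second or two; a node that is minutes off would explain rrdcached refusing updates "from the past" and could also make ticket-based console authentication fail on that node.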
 
