Proxmox node goes offline randomly

Hello all, I'm pretty new to Proxmox and KVM/OpenVZ in general. When I log into the web interface, I sometimes get alerts that the node is offline. I can still access the consoles from the web GUI on both KVM guests and OpenVZ containers, but I can't create new images or guests. A reboot of the whole server usually resolves the issue, but I can't keep rebooting the machine.

It just happened right before I posted this. Here is the relevant portion of my syslog:

Code:
Oct 31 06:25:01 proxmox rsyslogd: [origin software="rsyslogd" swVersion="4.6.4" x-pid="1237" x-info="http://www.rsyslog.com"] rsyslogd was HUPed, type 'lightweight'.
Oct 31 06:28:43 proxmox rrdcached[1348]: flushing old values
Oct 31 06:28:43 proxmox rrdcached[1348]: rotating journals
Oct 31 06:28:43 proxmox rrdcached[1348]: started new journal /var/lib/rrdcached/journal//rrd.journal.1351679323.568538
Oct 31 06:28:43 proxmox rrdcached[1348]: removing old journal /var/lib/rrdcached/journal//rrd.journal.1351672123.568544
Oct 31 07:17:01 proxmox /USR/SBIN/CRON[55510]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Oct 31 07:28:43 proxmox rrdcached[1348]: flushing old values
Oct 31 07:28:43 proxmox rrdcached[1348]: rotating journals
Oct 31 07:28:43 proxmox rrdcached[1348]: started new journal /var/lib/rrdcached/journal//rrd.journal.1351682923.568499
Oct 31 07:28:43 proxmox rrdcached[1348]: removing old journal /var/lib/rrdcached/journal//rrd.journal.1351675723.568533
Oct 31 08:17:01 proxmox /USR/SBIN/CRON[61163]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Oct 31 08:28:43 proxmox rrdcached[1348]: flushing old values
Oct 31 08:28:43 proxmox rrdcached[1348]: rotating journals
Oct 31 08:28:43 proxmox rrdcached[1348]: started new journal /var/lib/rrdcached/journal//rrd.journal.1351686523.568536
Oct 31 08:28:43 proxmox rrdcached[1348]: removing old journal /var/lib/rrdcached/journal//rrd.journal.1351679323.568538
Oct 31 08:32:18 proxmox pvedaemon[1565]: <root@pam> successful auth for user 'root@pam'
Oct 31 08:32:25 proxmox pvedaemon[1568]: <root@pam> successful auth for user 'root@pam'
Oct 31 08:32:25 proxmox pvedaemon[62630]: starting vnc proxy UPID:proxmox:0000F4A6:003748D4:50911A59:vncproxy:103:root@pam:
Oct 31 08:32:25 proxmox pvedaemon[1566]: <root@pam> starting task UPID:proxmox:0000F4A6:003748D4:50911A59:vncproxy:103:root@pam:
Oct 31 08:32:35 proxmox pvedaemon[62630]: command '/bin/nc -l -p 5900 -w 10 -c '/usr/sbin/qm vncproxy 103 2>/dev/null'' failed: exit code 1
Oct 31 08:32:35 proxmox pvedaemon[1566]: <root@pam> end task UPID:proxmox:0000F4A6:003748D4:50911A59:vncproxy:103:root@pam: command '/bin/nc -l -p 5900 -w 10 -c '/usr/sbin/qm vncproxy 103 2>/dev/null'' failed: exit code 1
Oct 31 08:32:45 proxmox pvedaemon[62647]: starting vnc proxy UPID:proxmox:0000F4B7:00375073:50911A6D:vncproxy:103:root@pam:
Oct 31 08:32:45 proxmox pvedaemon[1566]: <root@pam> starting task UPID:proxmox:0000F4B7:00375073:50911A6D:vncproxy:103:root@pam:
Oct 31 08:32:45 proxmox pvedaemon[1566]: <root@pam> successful auth for user 'root@pam'
Oct 31 08:32:55 proxmox pvedaemon[62647]: command '/bin/nc -l -p 5900 -w 10 -c '/usr/sbin/qm vncproxy 103 2>/dev/null'' failed: exit code 1
Oct 31 08:32:55 proxmox pvedaemon[1566]: <root@pam> end task UPID:proxmox:0000F4B7:00375073:50911A6D:vncproxy:103:root@pam: command '/bin/nc -l -p 5900 -w 10 -c '/usr/sbin/qm vncproxy 103 2>/dev/null'' failed: exit code 1
Oct 31 08:33:22 proxmox pvedaemon[1565]: <root@pam> successful auth for user 'root@pam'
Oct 31 08:33:27 proxmox pvedaemon[62727]: starting vnc proxy UPID:proxmox:0000F507:003760D3:50911A97:vncproxy:103:root@pam:
Oct 31 08:33:27 proxmox pvedaemon[1568]: <root@pam> starting task UPID:proxmox:0000F507:003760D3:50911A97:vncproxy:103:root@pam:
Oct 31 08:33:27 proxmox pvedaemon[1565]: <root@pam> successful auth for user 'root@pam'
Oct 31 08:33:35 proxmox pvedaemon[1566]: <root@pam> successful auth for user 'root@pam'
Oct 31 08:48:18 proxmox pvedaemon[1568]: <root@pam> successful auth for user 'root@pam'
Oct 31 08:48:27 proxmox pvedaemon[1566]: <root@pam> successful auth for user 'root@pam'
Oct 31 09:03:17 proxmox pvedaemon[1568]: <root@pam> successful auth for user 'root@pam'
Oct 31 09:03:27 proxmox pvedaemon[1565]: <root@pam> successful auth for user 'root@pam'
Oct 31 09:17:01 proxmox /USR/SBIN/CRON[66201]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Oct 31 09:18:17 proxmox pvedaemon[1568]: <root@pam> successful auth for user 'root@pam'
Oct 31 09:18:27 proxmox pvedaemon[1565]: <root@pam> successful auth for user 'root@pam'
Oct 31 09:18:36 proxmox kernel: vmbr1: port 5(tap103i0) entering disabled state
Oct 31 09:18:36 proxmox kernel: vmbr1: port 5(tap103i0) entering disabled state
Oct 31 09:18:36 proxmox pvedaemon[1568]: <root@pam> end task UPID:proxmox:0000F507:003760D3:50911A97:vncproxy:103:root@pam: OK
Oct 31 09:18:52 proxmox pvedaemon[1568]: <root@pam> update VM 103: -net0 virtio=12:74:B4:49:BB:25,bridge=vmbr1
Oct 31 09:18:57 proxmox pvedaemon[66281]: start VM 103: UPID:proxmox:000102E9:003B8B59:50912541:qmstart:103:root@pam:
Oct 31 09:18:57 proxmox pvedaemon[1565]: <root@pam> starting task UPID:proxmox:000102E9:003B8B59:50912541:qmstart:103:root@pam:
Oct 31 09:18:58 proxmox kernel: device tap103i0 entered promiscuous mode
Oct 31 09:18:58 proxmox kernel: vmbr1: port 5(tap103i0) entering forwarding state
Oct 31 09:18:58 proxmox pvedaemon[1565]: <root@pam> end task UPID:proxmox:000102E9:003B8B59:50912541:qmstart:103:root@pam: OK
Oct 31 09:18:58 proxmox pvedaemon[66310]: starting vnc proxy UPID:proxmox:00010306:003B8BC8:50912542:vncproxy:103:root@pam:
Oct 31 09:18:58 proxmox pvedaemon[1566]: <root@pam> starting task UPID:proxmox:00010306:003B8BC8:50912542:vncproxy:103:root@pam:
Oct 31 09:19:02 proxmox pvedaemon[1568]: <root@pam> successful auth for user 'root@pam'
Oct 31 09:19:08 proxmox kernel: tap103i0: no IPv6 routers present
Oct 31 09:19:26 proxmox ntpd[1315]: Listen normally on 18 tap103i0 fe80::9c41:67ff:fee1:2954 UDP 123
Oct 31 09:19:26 proxmox ntpd[1315]: Deleting interface #17 tap103i0, fe80::dcac:43ff:fe5e:ba5#123, interface stats: received=0, sent=0, dropped=0, active_time=36900 secs
Oct 31 09:21:54 proxmox kernel: vmbr1: port 5(tap103i0) entering disabled state
Oct 31 09:21:54 proxmox kernel: vmbr1: port 5(tap103i0) entering disabled state
Oct 31 09:21:54 proxmox pvedaemon[1566]: <root@pam> end task UPID:proxmox:00010306:003B8BC8:50912542:vncproxy:103:root@pam: OK
Oct 31 09:22:03 proxmox pvedaemon[1565]: <root@pam> update VM 103: -net0 e1000=12:74:B4:49:BB:25,bridge=vmbr1
Oct 31 09:23:07 proxmox pvedaemon[1568]: VM 103 monitor command failed - VM 103 not running
Oct 31 09:24:26 proxmox ntpd[1315]: Deleting interface #18 tap103i0, fe80::9c41:67ff:fee1:2954#123, interface stats: received=0, sent=0, dropped=0, active_time=300 secs
Oct 31 09:24:46 proxmox pvedaemon[67090]: starting vnc proxy UPID:proxmox:00010612:003C13A8:5091269E:vncproxy:103:root@pam:
Oct 31 09:24:46 proxmox pvedaemon[1565]: <root@pam> starting task UPID:proxmox:00010612:003C13A8:5091269E:vncproxy:103:root@pam:
Oct 31 09:24:46 proxmox pvedaemon[1566]: <root@pam> successful auth for user 'root@pam'
Oct 31 09:24:47 proxmox pvedaemon[67090]: command '/bin/nc -l -p 5900 -w 10 -c '/usr/sbin/qm vncproxy 103 2>/dev/null'' failed: exit code 111
Oct 31 09:24:47 proxmox pvedaemon[1565]: <root@pam> end task UPID:proxmox:00010612:003C13A8:5091269E:vncproxy:103:root@pam: command '/bin/nc -l -p 5900 -w 10 -c '/usr/sbin/qm vncproxy 103 2>/dev/null'' failed: exit code 111
Oct 31 09:24:52 proxmox pvedaemon[67111]: start VM 103: UPID:proxmox:00010627:003C1609:509126A4:qmstart:103:root@pam:
Oct 31 09:24:52 proxmox pvedaemon[1568]: <root@pam> starting task UPID:proxmox:00010627:003C1609:509126A4:qmstart:103:root@pam:
Oct 31 09:24:53 proxmox kernel: device tap103i0 entered promiscuous mode
Oct 31 09:24:53 proxmox kernel: vmbr1: port 5(tap103i0) entering forwarding state
Oct 31 09:24:53 proxmox pvedaemon[1568]: <root@pam> end task UPID:proxmox:00010627:003C1609:509126A4:qmstart:103:root@pam: OK
Oct 31 09:24:53 proxmox pvedaemon[67137]: starting vnc proxy UPID:proxmox:00010641:003C1677:509126A5:vncproxy:103:root@pam:
Oct 31 09:24:53 proxmox pvedaemon[1568]: <root@pam> starting task UPID:proxmox:00010641:003C1677:509126A5:vncproxy:103:root@pam:
Oct 31 09:24:57 proxmox pvedaemon[1568]: <root@pam> successful auth for user 'root@pam'
Oct 31 09:25:03 proxmox kernel: tap103i0: no IPv6 routers present
Oct 31 09:28:10 proxmox pvedaemon[1568]: <root@pam> successful auth for user 'root@pam'
Oct 31 09:28:10 proxmox pvedaemon[1566]: <root@pam> starting task UPID:proxmox:000107E2:003C636B:5091276A:vncproxy:100:root@pam:
Oct 31 09:28:10 proxmox pvedaemon[67554]: starting openvz vnc proxy UPID:proxmox:000107E2:003C636B:5091276A:vncproxy:100:root@pam:
Oct 31 09:28:11 proxmox pvedaemon[1565]: <root@pam> successful auth for user 'root@pam'
Oct 31 09:28:23 proxmox pvedaemon[1566]: <root@pam> end task UPID:proxmox:000107E2:003C636B:5091276A:vncproxy:100:root@pam: OK
Oct 31 09:28:43 proxmox rrdcached[1348]: flushing old values
Oct 31 09:28:43 proxmox rrdcached[1348]: rotating journals
Oct 31 09:28:43 proxmox rrdcached[1348]: started new journal /var/lib/rrdcached/journal//rrd.journal.1351690123.569516
Oct 31 09:28:43 proxmox rrdcached[1348]: removing old journal /var/lib/rrdcached/journal//rrd.journal.1351682923.568499
Oct 31 09:29:26 proxmox ntpd[1315]: Listen normally on 19 tap103i0 fe80::fc33:feff:fe21:6f46 UDP 123
Oct 31 09:32:31 proxmox pvedaemon[1563]: worker 1566 finished
Oct 31 09:32:31 proxmox pvedaemon[1563]: starting 1 worker(s)
Oct 31 09:32:31 proxmox pvedaemon[1563]: worker 68370 started
Oct 31 09:33:18 proxmox pvedaemon[1565]: <root@pam> successful auth for user 'root@pam'
Oct 31 09:33:27 proxmox pvedaemon[1565]: <root@pam> successful auth for user 'root@pam'
Oct 31 09:33:32 proxmox pvedaemon[1568]: <root@pam> end task UPID:proxmox:00010641:003C1677:509126A5:vncproxy:103:root@pam: OK
Oct 31 09:34:26 proxmox pvedaemon[1565]: authentication failure; rhost=10.10.10.6 user=root@pam msg=Authentication failure
Oct 31 09:34:31 proxmox pvedaemon[1568]: <root@pam> successful auth for user 'root@pam'
Oct 31 09:34:46 proxmox pvedaemon[68416]: starting service corosync: UPID:proxmox:00010B40:003CFDFE:509128F6:srvstart:corosync:root@pam:
Oct 31 09:34:46 proxmox pvedaemon[1565]: <root@pam> starting task UPID:proxmox:00010B40:003CFDFE:509128F6:srvstart:corosync:root@pam:
Oct 31 09:34:46 proxmox pvedaemon[1565]: <root@pam> end task UPID:proxmox:00010B40:003CFDFE:509128F6:srvstart:corosync:root@pam: OK
Oct 31 09:34:57 proxmox pvedaemon[1563]: worker 1565 finished
Oct 31 09:34:57 proxmox pvedaemon[1563]: starting 1 worker(s)
Oct 31 09:34:57 proxmox pvedaemon[1563]: worker 68433 started
Oct 31 09:35:09 proxmox pvedaemon[1563]: worker 1568 finished
Oct 31 09:35:09 proxmox pvedaemon[1563]: starting 1 worker(s)
Oct 31 09:35:09 proxmox pvedaemon[1563]: worker 68440 started
Oct 31 09:35:11 proxmox pvedaemon[68441]: starting service rgmanager: UPID:proxmox:00010B59:003D07A0:5091290F:srvstart:rgmanager:root@pam:
Oct 31 09:35:11 proxmox pvedaemon[68370]: <root@pam> starting task UPID:proxmox:00010B59:003D07A0:5091290F:srvstart:rgmanager:root@pam:
Oct 31 09:35:11 proxmox pvedaemon[68370]: <root@pam> end task UPID:proxmox:00010B59:003D07A0:5091290F:srvstart:rgmanager:root@pam: OK
Oct 31 09:35:34 proxmox pvedaemon[68445]: re-starting service pvedaemon: UPID:proxmox:00010B5D:003D10C7:50912926:srvrestart:pvedaemon:root@pam:
Oct 31 09:35:34 proxmox pvedaemon[68440]: <root@pam> starting task UPID:proxmox:00010B5D:003D10C7:50912926:srvrestart:pvedaemon:root@pam:
Oct 31 09:35:34 proxmox pvedaemon[1563]: received terminate request
Oct 31 09:35:34 proxmox pvedaemon[1563]: worker 68433 finished
Oct 31 09:35:34 proxmox pvedaemon[1563]: worker 68370 finished
Oct 31 09:35:34 proxmox pvedaemon[1563]: worker 68440 finished
Oct 31 09:35:34 proxmox pvedaemon[1563]: server closing
Oct 31 09:35:37 proxmox pvedaemon[68450]: starting server
Oct 31 09:35:37 proxmox pvedaemon[68450]: starting 3 worker(s)
Oct 31 09:35:37 proxmox pvedaemon[68450]: worker 68452 started
Oct 31 09:35:37 proxmox pvedaemon[68450]: worker 68453 started
Oct 31 09:35:37 proxmox pvedaemon[68450]: worker 68454 started
Oct 31 09:36:00 proxmox pvedaemon[68465]: starting vnc proxy UPID:proxmox:00010B71:003D1AEC:50912940:vncproxy:103:root@pam:
Oct 31 09:36:00 proxmox pvedaemon[68453]: <root@pam> starting task UPID:proxmox:00010B71:003D1AEC:50912940:vncproxy:103:root@pam:
Oct 31 09:36:00 proxmox pvedaemon[68452]: <root@pam> successful auth for user 'root@pam'
Oct 31 09:36:09 proxmox pvedaemon[68453]: <root@pam> successful auth for user 'root@pam'

Any help is appreciated.
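Not a fix, but before the next reboot it may help to pull just the failed pvedaemon tasks out of the syslog to see how often the VNC proxy is actually dying. A minimal sketch — the excerpt file `/tmp/pve-syslog-excerpt.txt` is created inline here purely for illustration; on the node itself you would grep `/var/log/syslog` directly:

```shell
# Illustrative excerpt; on a real node, grep /var/log/syslog instead.
cat > /tmp/pve-syslog-excerpt.txt <<'EOF'
Oct 31 08:32:35 proxmox pvedaemon[62630]: command '/bin/nc -l -p 5900 ...' failed: exit code 1
Oct 31 09:18:58 proxmox pvedaemon[1565]: <root@pam> end task UPID:...:qmstart:103:root@pam: OK
Oct 31 09:24:47 proxmox pvedaemon[67090]: command '/bin/nc -l -p 5900 ...' failed: exit code 111
EOF

# Show only the failed pvedaemon tasks, then count them.
grep "pvedaemon.*failed" /tmp/pve-syslog-excerpt.txt
grep -c "pvedaemon.*failed" /tmp/pve-syslog-excerpt.txt
```

On similar reports, the "node offline" state in the GUI often comes from the status daemon rather than the node itself; some users report that restarting it (`/etc/init.d/pvestatd restart`, and if needed `/etc/init.d/pve-cluster restart`) brings the node back without a full reboot. Treat that as a workaround to try, not a confirmed fix.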
 
I have the same problem. Here is the output of pveversion -v:

root@proxmox2:~# pveversion -v
pve-manager: 2.2-31 (pve-manager/2.2/e94e95e9)
running kernel: 2.6.32-16-pve
proxmox-ve-2.6.32: 2.2-82
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-16-pve: 2.6.32-82
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-33
qemu-server: 2.0-69
pve-firmware: 1.0-21
libpve-common-perl: 1.0-39
libpve-access-control: 1.0-25
libpve-storage-perl: 2.0-36
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.2-7
ksm-control-daemon: 1.1-1
root@proxmox2:~#

Anyone?
 
This problem started after the 2.2 update. If I reboot the server, it works for a while.
My pveversion -v:
pve-manager: 2.2-26 (pve-manager/2.2/c1614c8c)
running kernel: 2.6.32-16-pve
proxmox-ve-2.6.32: 2.2-80
pve-kernel-2.6.32-10-pve: 2.6.32-63
pve-kernel-2.6.32-16-pve: 2.6.32-80
pve-kernel-2.6.32-14-pve: 2.6.32-74
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-1
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-28
qemu-server: 2.0-64
pve-firmware: 1.0-21
libpve-common-perl: 1.0-37
libpve-access-control: 1.0-25
libpve-storage-perl: 2.0-34
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.2-7
ksm-control-daemon: 1.1-1
 
I think it's the same problem as in my thread "Proxmox Ve 2.2 2 node cluster: NODES restarting now and then!?"

pveversion -v
pve-manager: 2.2-31 (pve-manager/2.2/e94e95e9)
running kernel: 2.6.32-16-pve
proxmox-ve-2.6.32: 2.2-82
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-16-pve: 2.6.32-82
pve-kernel-2.6.32-14-pve: 2.6.32-74
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-33
qemu-server: 2.0-69
pve-firmware: 1.0-21
libpve-common-perl: 1.0-39
libpve-access-control: 1.0-25
libpve-storage-perl: 2.0-36
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.2-7
ksm-control-daemon: 1.1-1

Did you find a solution?
 
Same issue yesterday. The main Proxmox IPs stayed reachable, but all the virtual machines lost network connectivity. I had to stop/start each virtual machine to recover the network.
Regards,

root@pveabs1:~# pveversion -v
pve-manager: 2.2-31 (pve-manager/2.2/e94e95e9)
running kernel: 2.6.32-16-pve
proxmox-ve-2.6.32: 2.2-82
pve-kernel-2.6.32-16-pve: 2.6.32-82
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-33
qemu-server: 2.0-69
pve-firmware: 1.0-21
libpve-common-perl: 1.0-39
libpve-access-control: 1.0-25
libpve-storage-perl: 2.0-36
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.2-7
ksm-control-daemon: 1.1-1
 
I have the same problem. I hope somebody has already opened a ticket for this issue, because I am seeing many people with this same problem recently. I have just updated to the latest 2.2 release and applied all updates, hoping that will fix it. If not, maybe I should go back to a previous version, perhaps 2.0, where I didn't notice these problems. What do you guys think? Maybe we should stick to the old stable version for now...