[SOLVED] GUI - Node Summary - no graphs shown anymore

bodowalter

New Member
Feb 25, 2016
I have set up a 3-node cluster and everything went fine. No issues so far.
Now I need to take the node on which the cluster was created out of service for maintenance. I need to reinstall it, following the advice at https://pve.proxmox.com/wiki/Proxmox_VE_4.x_Cluster#Re-installing_a_cluster_node

After I stopped the three services
# systemctl stop pvestatd.service
# systemctl stop pvedaemon.service
# systemctl stop pve-cluster.service
the graphs stopped being updated.

What do I need to do to get the graphs working again? Unfortunately, rebooting the remaining nodes is not an option. ;)

Thanks in advance
Bodo
 
Hi Bodo,
is the storage on the remaining nodes fine? E.g., does "pvesm status" work without delay?

If so, restart the PVE services on the running nodes.

Udo
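
For readers following along, the check suggested here can be sketched as a small helper that times a command. This is only an illustration: `elapsed_seconds` is a hypothetical name, and the restart line in the comments assumes the services to restart are the ones named in the first post.

```shell
#!/bin/sh
# Run a command, discard its output, and print how many whole
# seconds it took. elapsed_seconds is a hypothetical helper name.
# Usage: elapsed_seconds COMMAND [ARGS...]
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# On a Proxmox node one might run (not executed here):
#   elapsed_seconds pvesm status
# If that returns quickly, restart the services from the first post
# (assumption: these are the "pve-services" meant in this reply):
#   systemctl restart pvestatd.service pvedaemon.service
```

A result of more than a few seconds suggests that some storage backend is waiting on a timeout, which is what the next post reports.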
 
Hi Udo,
thanks for the fast reply.

Unfortunately, the command "pvesm status" takes a very long time to return. It looks as if it is waiting for some timeout, but it shows no errors:

root@Terminal4:~# pvesm status
ceph rbd 1 0 0 0 100.00%
local dir 1 47930248 1525452 43946992 3.85%
root@Terminal4:~#


In any case, it seems to me that the issue might be caused by the missing "1st node".
The Ceph storage system is still looking for this node:

root@Terminal4:~# ceph status
2016-03-14 16:11:39.450618 7fe08c66e700 0 -- :/1012275 >> 172.17.0.1:6789/0 pipe(0x7fe08405e2c0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fe08405a930).fault
cluster fe76edf9-8c3b-4cef-90e5-1240925da59a
health HEALTH_WARN
clock skew detected on mon.2, mon.3
31 pgs degraded
31 pgs stuck degraded
31 pgs stuck unclean
31 pgs stuck undersized
31 pgs undersized
recovery 47517/968730 objects degraded (4.905%)
1 mons down, quorum 1,2,3 1,2,3
Monitor clock skew detected
monmap e4: 4 mons at {0=172.17.0.1:6789/0,1=172.17.0.2:6789/0,2=172.17.0.3:6789/0,3=172.17.0.4:6789/0}
election epoch 298, quorum 1,2,3 1,2,3
osdmap e2254: 8 osds: 6 up, 6 in
pgmap v2488237: 320 pgs, 5 pools, 1837 GB data, 473 kobjects
3511 GB used, 8585 GB / 12097 GB avail
47517/968730 objects degraded (4.905%)
289 active+clean
31 active+undersized+degraded
client io 539 kB/s wr, 72 op/s
root@Terminal4:~#
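
The unreachable address in the `.fault` line above can be pulled out mechanically. The following is a rough sketch that assumes fault lines always look like the one in this output; `fault_peers` is a hypothetical helper, not a Ceph tool.

```shell
#!/bin/sh
# Read ceph client output on stdin and print each peer address
# that appears in a "... >> ADDR:PORT/NONCE pipe(...).fault" line.
# The pattern is modelled on the single fault line in this thread.
fault_peers() {
    sed -n 's/.*>> \([0-9.]*\):[0-9]*\/[0-9]* pipe(.*\.fault$/\1/p' | sort -u
}

# Possible use on a node (not executed here); 2>&1 is added in case
# the fault messages are written to stderr:
#   ceph status 2>&1 | fault_peers
```

For the output shown here this would print 172.17.0.1, the address of the node that was taken down.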

I didn't take any action on Ceph when bringing down Terminal1; I assumed Proxmox would take care of everything important. :rolleyes:

Any suggestions how to proceed?
 
Solved !!

The output of pvesm and the error message pointing to 172.17.0.1, the node that was brought down, led me in the right direction.

When you shut down your initial cluster node, which is also the first Ceph monitor of your cluster, without having removed the monitor first, the file /etc/pve/storage.cfg still references that monitor. Proxmox therefore keeps trying to contact the monitor even though it is already down. This somehow breaks statistics gathering.

The issue was solved as soon as I removed the monitor from ceph and adjusted the files /etc/pve/storage.cfg and /etc/pve/ceph.conf as described at https://forum.proxmox.com/threads/remove-dead-ceph-monitor-after-removing-cluster-node.22761/

There was no need to restart any service. :)
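
To see which files still mention the dead monitor before editing them, something like the following could be used. It is a minimal sketch: `files_referencing` is a hypothetical helper, and the address and file paths in the usage comment are the ones from this thread.

```shell
#!/bin/sh
# Print the names of the files that still mention a monitor address.
# -F treats the address as a fixed string; -w keeps 172.17.0.1 from
# also matching 172.17.0.10 (assumes a grep with -w, e.g. GNU grep).
# Usage: files_referencing ADDRESS FILE...
files_referencing() {
    addr=$1
    shift
    grep -l -F -w -e "$addr" "$@" 2>/dev/null
}

# On a node from this thread one might run (not executed here):
#   files_referencing 172.17.0.1 /etc/pve/storage.cfg /etc/pve/ceph.conf
```

Any file it lists still needs the monitor entry removed, following the steps in the linked thread.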

Thanks for the help.
Bodo
 
