Hello all,
I'm managing a couple of separate 3-node clusters. I have a service that pulls /cluster/status via the API, and I've noticed that in some cases the IP for a node is missing, usually when that node is offline.
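For context, the service queries the standard endpoint, essentially like this (host and token name here are placeholders, not my actual setup):
Code:
# GET /cluster/status over the HTTP API (same data as pvesh below)
curl -s -k \
    -H 'Authorization: PVEAPIToken=monitor@pam!status=<token-secret>' \
    https://pve1.example.com:8006/api2/json/cluster/status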
Problematic status:
Code:
root@pve2:~# pvesh get /cluster/status
┌───────────┬──────┬─────────┬───────────────┬───────┬───────┬────────┬───────┬────────┬─────────┬─────────┐
│ id        │ name │ type    │ ip            │ level │ local │ nodeid │ nodes │ online │ quorate │ version │
╞═══════════╪══════╪═════════╪═══════════════╪═══════╪═══════╪════════╪═══════╪════════╪═════════╪═════════╡
│ cluster   │ UNI  │ cluster │               │       │       │        │     3 │        │       1 │       3 │
├───────────┼──────┼─────────┼───────────────┼───────┼───────┼────────┼───────┼────────┼─────────┼─────────┤
│ node/pve1 │ pve1 │ node    │               │       │     0 │      1 │       │      0 │         │         │
├───────────┼──────┼─────────┼───────────────┼───────┼───────┼────────┼───────┼────────┼─────────┼─────────┤
│ node/pve2 │ pve2 │ node    │ 169.254.0.102 │       │     1 │      2 │       │      1 │         │         │
├───────────┼──────┼─────────┼───────────────┼───────┼───────┼────────┼───────┼────────┼─────────┼─────────┤
│ node/pve3 │ pve3 │ node    │ 169.254.0.103 │       │     0 │      3 │       │      1 │         │         │
└───────────┴──────┴─────────┴───────────────┴───────┴───────┴────────┴───────┴────────┴─────────┴─────────┘
vs. a working status (with pve2 actually offline in this case too):
Code:
root@pve1:~# pvesh get /cluster/status
┌───────────┬──────┬─────────┬───────────────┬───────┬───────┬────────┬───────┬────────┬─────────┬─────────┐
│ id        │ name │ type    │ ip            │ level │ local │ nodeid │ nodes │ online │ quorate │ version │
╞═══════════╪══════╪═════════╪═══════════════╪═══════╪═══════╪════════╪═══════╪════════╪═════════╪═════════╡
│ cluster   │ UNI  │ cluster │               │       │       │        │     3 │        │       1 │       9 │
├───────────┼──────┼─────────┼───────────────┼───────┼───────┼────────┼───────┼────────┼─────────┼─────────┤
│ node/pve1 │ pve1 │ node    │ 169.254.0.101 │       │     1 │      1 │       │      1 │         │         │
├───────────┼──────┼─────────┼───────────────┼───────┼───────┼────────┼───────┼────────┼─────────┼─────────┤
│ node/pve2 │ pve2 │ node    │ 169.254.0.102 │       │     0 │      2 │       │      0 │         │         │
├───────────┼──────┼─────────┼───────────────┼───────┼───────┼────────┼───────┼────────┼─────────┼─────────┤
│ node/pve3 │ pve3 │ node    │ 169.254.0.103 │       │     0 │      3 │       │      1 │         │         │
└───────────┴──────┴─────────┴───────────────┴───────┴───────┴────────┴───────┴────────┴─────────┴─────────┘
These clusters are set up identically, so I'm unsure what could be causing this.
The API documentation for /cluster/status describes the ip field as "[node] IP of the resolved nodename." The cluster manual says the hostname is resolved with getaddrinfo(). I've confirmed that the name is resolvable using getent.
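To be precise, this is how I checked resolution on the affected node; note that getent ahosts goes through getaddrinfo(), which is what the manual says the cluster code uses:
Code:
getent hosts pve1    # NSS lookup, following the database order in /etc/nsswitch.conf
getent ahosts pve1   # enumerates addresses via getaddrinfo(), like the cluster code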
For the record, on each cluster all nodes are listed in /etc/hosts on every node, and corosync.conf always contains all nodes with their IP addresses; both look roughly like the sketch below.
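This is a sketch using the names, node IDs, and addresses from the status output above, not a verbatim copy of my files:
Code:
# /etc/hosts (same entries on all three nodes)
169.254.0.101   pve1
169.254.0.102   pve2
169.254.0.103   pve3

# /etc/pve/corosync.conf, nodelist section
nodelist {
  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 169.254.0.101
  }
  node {
    name: pve2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 169.254.0.102
  }
  node {
    name: pve3
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 169.254.0.103
  }
}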
Can someone shed some light on this?