Missing Node IP address in /cluster/status API call

sin3vil

Mar 11, 2024
Hello all,

I'm managing a couple of separate 3-node clusters. I have a service that pulls /cluster/status via the API, but I've noticed that in some cases the IP for a node is missing, usually when that node is offline.

Problematic status:

Code:
root@pve2:~# pvesh get /cluster/status
┌───────────┬──────┬─────────┬───────────────┬───────┬───────┬────────┬───────┬────────┬─────────┬─────────┐
│ id        │ name │ type    │ ip            │ level │ local │ nodeid │ nodes │ online │ quorate │ version │
╞═══════════╪══════╪═════════╪═══════════════╪═══════╪═══════╪════════╪═══════╪════════╪═════════╪═════════╡
│ cluster   │ UNI  │ cluster │               │       │       │        │     3 │        │ 1       │       3 │
├───────────┼──────┼─────────┼───────────────┼───────┼───────┼────────┼───────┼────────┼─────────┼─────────┤
│ node/pve1 │ pve1 │ node    │               │       │ 0     │      1 │       │ 0      │         │         │
├───────────┼──────┼─────────┼───────────────┼───────┼───────┼────────┼───────┼────────┼─────────┼─────────┤
│ node/pve2 │ pve2 │ node    │ 169.254.0.102 │       │ 1     │      2 │       │ 1      │         │         │
├───────────┼──────┼─────────┼───────────────┼───────┼───────┼────────┼───────┼────────┼─────────┼─────────┤
│ node/pve3 │ pve3 │ node    │ 169.254.0.103 │       │ 0     │      3 │       │ 1      │         │         │
└───────────┴──────┴─────────┴───────────────┴───────┴───────┴────────┴───────┴────────┴─────────┴─────────┘

Compare with a working status (pve2 is actually offline in this case too):

Code:
root@pve1:~# pvesh get /cluster/status
┌───────────┬──────┬─────────┬───────────────┬───────┬───────┬────────┬───────┬────────┬─────────┬─────────┐
│ id        │ name │ type    │ ip            │ level │ local │ nodeid │ nodes │ online │ quorate │ version │
╞═══════════╪══════╪═════════╪═══════════════╪═══════╪═══════╪════════╪═══════╪════════╪═════════╪═════════╡
│ cluster   │ UNI  │ cluster │               │       │       │        │     3 │        │ 1       │       9 │
├───────────┼──────┼─────────┼───────────────┼───────┼───────┼────────┼───────┼────────┼─────────┼─────────┤
│ node/pve1 │ pve1 │ node    │ 169.254.0.101 │       │ 1     │      1 │       │ 1      │         │         │
├───────────┼──────┼─────────┼───────────────┼───────┼───────┼────────┼───────┼────────┼─────────┼─────────┤
│ node/pve2 │ pve2 │ node    │ 169.254.0.102 │       │ 0     │      2 │       │ 0      │         │         │
├───────────┼──────┼─────────┼───────────────┼───────┼───────┼────────┼───────┼────────┼─────────┼─────────┤
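│ node/pve3 │ pve3 │ node    │ 169.254.0.103 │       │ 0     │      3 │       │ 1      │         │         │
└───────────┴──────┴─────────┴───────────────┴───────┴───────┴────────┴───────┴────────┴─────────┴─────────┘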
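
For a service consuming this endpoint, the difference between the two outputs can be detected programmatically. The following is a minimal sketch, assuming the status is fetched as JSON (for example with `pvesh get /cluster/status --output-format json`) and that a missing address shows up as an absent or empty `ip` key. The function name `nodes_missing_ip` and the sample data are illustrative only; the sample mirrors the problematic table from pve2.

```python
import json


def nodes_missing_ip(status):
    """Return the names of node entries that carry no IP address.

    `status` is the decoded JSON list from /cluster/status.
    Assumption: a missing address appears as an absent or empty "ip" key.
    """
    return [
        entry["name"]
        for entry in status
        if entry.get("type") == "node" and not entry.get("ip")
    ]


# Hypothetical sample mirroring the problematic output shown for pve2.
sample = json.loads("""
[
  {"id": "cluster", "name": "UNI", "type": "cluster", "nodes": 3, "quorate": 1, "version": 3},
  {"id": "node/pve1", "name": "pve1", "type": "node", "local": 0, "nodeid": 1, "online": 0},
  {"id": "node/pve2", "name": "pve2", "type": "node", "ip": "169.254.0.102", "local": 1, "nodeid": 2, "online": 1},
  {"id": "node/pve3", "name": "pve3", "type": "node", "ip": "169.254.0.103", "local": 0, "nodeid": 3, "online": 1}
]
""")

missing = nodes_missing_ip(sample)  # ["pve1"]
```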

These clusters are all the same, so I'm unsure what could be causing this.
The API documentation for /cluster/status describes the ip field as "[node] IP of the resolved nodename.". The cluster manual says that getaddrinfo() is used to resolve the hostname. I've confirmed with getent that the hostname is resolvable.
For the record, on each cluster, all nodes are in /etc/hosts.

corosync.conf always contains all nodes with their IP addresses etc.

Can someone shed some light here?
 
This information is only available if the node has been online and has broadcast it. That is likely the difference between your two examples.
 
Hello Fabian,

This is probably the case. I can see that the "working" cluster has a much longer uptime, so it was probably already running before the problematic node went offline.
Still, why this design? I'd expect this information to be read from corosync.conf or something.
 
It is broadcast when pmxcfs on that node starts up (and then re-broadcast on cluster membership changes).

The contents of the corosync config are queryable using other means, and might contain other IP addresses anyway (the corosync links aren't necessarily identical to the resolved addresses of the nodes' hostnames, after all).
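
To illustrate the point about corosync link addresses, here is a rough sketch of extracting the per-node `ringX_addr` values from the text of a corosync.conf (on Proxmox VE this file normally lives at /etc/pve/corosync.conf). The parser is deliberately simple, the function name `corosync_link_addresses` is hypothetical, and the sample config uses made-up addresses. As noted above, these are link addresses and need not match what /cluster/status reports in its ip column.

```python
def corosync_link_addresses(conf_text):
    """Map node name -> list of ringX_addr values from corosync.conf text.

    Simplified parser: tracks `section {` / `}` nesting and reads
    `key: value` lines inside nodelist -> node blocks only.
    """
    nodes = {}
    stack = []       # names of the currently open sections
    current = None   # key/value pairs of the node block being read
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("{"):
            stack.append(line[:-1].strip())
            if stack == ["nodelist", "node"]:
                current = {}
        elif line == "}":
            if stack == ["nodelist", "node"] and current is not None:
                name = current.get("name")
                if name:
                    nodes[name] = [
                        value
                        for key, value in sorted(current.items())
                        if key.startswith("ring") and key.endswith("_addr")
                    ]
                current = None
            if stack:
                stack.pop()
        elif ":" in line and current is not None:
            # Split on the first colon only, so IPv6 values stay intact.
            key, value = line.split(":", 1)
            current[key.strip()] = value.strip()
    return nodes


# Hypothetical sample; addresses are invented for illustration.
SAMPLE_CONF = """
totem {
  cluster_name: UNI
}
nodelist {
  node {
    name: pve1
    nodeid: 1
    ring0_addr: 10.0.0.1
  }
  node {
    name: pve2
    nodeid: 2
    ring0_addr: 10.0.0.2
  }
}
"""

links = corosync_link_addresses(SAMPLE_CONF)
```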
 
Fabian, I'm slightly confused.

On one hand you say that the information is broadcast by the node, on the other that hostnames are resolved.
I mean, the hostname is already available, so what information does the node need to broadcast that then gets resolved into an IP?

Using the API, how would I be able to get the IP address for a node that isn't currently online (and hasn't broadcast back for a while)?
 
> On one hand you say that the information is broadcast by the node, on the other that hostnames are resolved.
> I mean, the hostname is available, what info is needed to be broadcast by the node that then gets resolved into an IP?

When the pmxcfs service is started, it will resolve the hostname/node name and broadcast the result (or fail if there is no non-loopback IP that it resolves to).
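
The "resolve, then reject loopback-only results" step can be approximated as follows. This is only a Python sketch of the idea for checking a node's resolution by hand; it is not pmxcfs's actual code, and the function names and the injectable `resolver` parameter are assumptions made for illustration.

```python
import ipaddress
import socket


def non_loopback_addresses(addrinfo):
    """Return unique non-loopback IPs from getaddrinfo()-style tuples, in order."""
    found = []
    for _family, _type, _proto, _canonname, sockaddr in addrinfo:
        # sockaddr[0] is the address; drop any IPv6 zone id such as "%eth0".
        ip = sockaddr[0].split("%", 1)[0]
        if ipaddress.ip_address(ip).is_loopback:
            continue
        if ip not in found:
            found.append(ip)
    return found


def resolve_node(name, resolver=socket.getaddrinfo):
    """Resolve `name` and fail if it only maps to loopback addresses."""
    addresses = non_loopback_addresses(resolver(name, None))
    if not addresses:
        raise RuntimeError(f"{name} resolves to no non-loopback address")
    return addresses
```

Running `resolve_node("pve1")` on each node shows what that node would resolve locally, which depends on its own /etc/hosts and DNS setup.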

> Using the API, how would I be able to get the IP address for a node that isn't currently online (and hasn't broadcast back for a while)?

You can't if no node has this information. That information might not be accurate anyhow if the node is not online.
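
Since the API cannot supply the address in that situation, one possible workaround on the consumer side is for the polling service to remember the last address it saw for each node. The sketch below assumes the decoded JSON list from /cluster/status as input; `merge_with_cache` and the returned tuple layout are hypothetical, and a cached address is marked as stale because, as noted above, it may no longer be accurate.

```python
def merge_with_cache(status, cache):
    """Fill in missing node IPs from a caller-maintained cache.

    `status` is the decoded JSON list from /cluster/status.
    `cache` maps node name -> last IP seen and is updated in place.
    Returns node name -> (ip or None, is_stale).
    """
    result = {}
    for entry in status:
        if entry.get("type") != "node":
            continue
        name = entry["name"]
        ip = entry.get("ip")
        if ip:
            cache[name] = ip              # fresh value reported by the cluster
            result[name] = (ip, False)
        else:
            # Fall back to the last known value; it may be outdated.
            result[name] = (cache.get(name), True)
    return result


# Hypothetical usage: first poll sees pve1, second poll does not.
cache = {}
first = merge_with_cache(
    [{"name": "pve1", "type": "node", "ip": "169.254.0.101"}], cache
)
second = merge_with_cache([{"name": "pve1", "type": "node"}], cache)
```

The cache would need to be persisted by the service itself if it should survive restarts.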
 
