Changed primary network interface, now all nodes show gray ? marks

Hi @Daniel Drucker,

How can I fix this?
Your network naming has likely changed.

You need to examine the following outputs and reconcile them:
- "ip a": find your NIC and note its interface name
- "cat /etc/network/interfaces": find the old NIC interface name and replace it with the new one
- reboot, or restart network services: systemctl restart networking
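The reconciliation above can be sketched as a small shell check. This is only a rough aid, not an official Proxmox tool: the function names referenced_ifaces and missing_ifaces are made up for illustration, and it assumes an ifupdown-style file where names appear after "iface" and "bridge-ports". Bridges or VLAN interfaces that have not been brought up yet will also show as missing, so treat the output as a list of names to double-check.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Proxmox or ifupdown).

# Print every interface name referenced in an interfaces(5)-style file:
# names after "iface", plus the members listed after bridge-ports.
referenced_ifaces() {
    awk '
        $1 == "iface" { print $2 }
        $1 == "bridge-ports" || $1 == "bridge_ports" {
            for (i = 2; i <= NF; i++) if ($i != "none") print $i
        }
    ' "$1" | sort -u
}

# Print the referenced names that the kernel does not currently know about.
missing_ifaces() {
    for name in $(referenced_ifaces "$1"); do
        [ -e "/sys/class/net/$name" ] || echo "$name"
    done
}
```

Running `missing_ifaces /etc/network/interfaces` prints nothing when every referenced name exists; any name it prints is a candidate for the old NIC name that still needs replacing.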

Good luck


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
No, I did that. /etc/network/interfaces has the correct name, both for the interface itself and in the bridge name.
You need to provide more information. At the very least:
- output of: ip a
- output of: cat /etc/network/interfaces
- output of: pvecm status
- ping/ssh results between each pair of nodes
- whether the other nodes are reporting properly
- whether you have rebooted
- anything relevant in: journalctl -f (or journalctl -n 500)
etc
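To gather most of the items above in one go, something like the following could be run on each node. It is a sketch under assumptions: run_section and collect_diagnostics are hypothetical helper names, and pvecm is wrapped in the coreutils timeout command (10 seconds, an arbitrary choice) because it may hang on a node with cluster problems.

```shell
#!/bin/sh
# Hypothetical helpers for collecting the requested outputs in one file.

# Print a header, then run the command if it is installed.
run_section() {
    printf '===== %s =====\n' "$*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        printf '(command not found: %s)\n' "$1"
    fi
}

collect_diagnostics() {
    run_section ip a
    run_section cat /etc/network/interfaces
    # pvecm can block when the cluster filesystem is stuck, so bound it.
    run_section timeout 10 pvecm status
    run_section journalctl -n 500 --no-pager
}
```

For example, `collect_diagnostics > /tmp/diag-$(hostname).txt` writes one file per node that can be attached to a reply. The ping/ssh checks between node pairs still need to be done by hand.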

"Nodes with gray ?" is a symptom of a network issue or, possibly, a pvestatd issue. Most pvestatd problems are themselves caused by network issues.

What you described is a generic symptom that can be caused by many different things. More details are needed.

Cheers


 
I have 3 nodes. I think a big underlying problem is probably that ssh suddenly isn't working FROM the node that I changed. I can ssh TO it from both of the other nodes, but if I run ssh -vvv FROM it to either of the other nodes, ssh hangs at:

...
debug1: Remote protocol version 2.0, remote software version OpenSSH_9.2p1 Debian-2+deb12u4
debug1: compat_banner: match: OpenSSH_9.2p1 Debian-2+deb12u4 pat OpenSSH* compat 0x04000000
debug2: fd 3 setting O_NONBLOCK
debug1: Authenticating to proxmox-i9:22 as 'root'
debug1: load_hostkeys: fopen /root/.ssh/known_hosts2: No such file or directory

and can't even be control-c'd.

I get this from pvecm status on two of the nodes (proxmox-i9 and proxmox03), but on proxmox-images it hangs.

Code:
root@proxmox-images:~# pvecm status
Cluster information
-------------------
Name:             mic-proxmox
Config Version:   15
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Fri Feb 21 12:00:04 2025
Quorum provider:  corosync_votequorum
Nodes:            3
Node ID:          0x00000002
Ring ID:          2.5af
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   4
Highest expected: 4
Total votes:      3
Quorum:           3
Flags:            Quorate Qdevice

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
0x00000002          1   A,NV,NMW 172.29.20.236 (local)
0x00000003          1   A,NV,NMW 172.29.158.35
0x00000004          1  NA,NV,NMW 172.29.20.204
0x00000000          0            Qdevice (votes 1)


I also have a qdevice which I need to delete but can't - it hangs at waiting for lock.
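For reference, the vote numbers in the pvecm status output above are consistent with the usual majority rule, quorum = floor(expected votes / 2) + 1. A tiny sketch of that arithmetic, assuming default votequorum behaviour with no special options in play (quorum_threshold is a made-up name):

```shell
#!/bin/sh
# Majority rule: more than half of the expected votes must be present.
# Shell integer division already rounds down.
quorum_threshold() {
    echo $(( $1 / 2 + 1 ))
}

quorum_threshold 4   # expected votes 4 (three nodes plus the qdevice)
```

With 4 expected votes the threshold is 3, which matches the "Quorum: 3" line. Since total votes is also 3, the cluster is quorate with no margin: losing any one more vote would drop it below quorum.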
 
ssh is hanging because it's trying to read /etc/pve/priv/authorized_keys.
The network is reachable from each node to every other. ssh may fail, but telnet to port 22 works just fine; ssh is failing from that one node because the node can't read from /etc/pve/.
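A way to check for this kind of hang without getting stuck in the terminal is to probe the path from a background process and give up after a few seconds. This is a sketch, with probe_path as a hypothetical helper name and the 5-second default chosen arbitrarily; if the probe does hang, the background stat process is simply left behind.

```shell
#!/bin/sh
# Hypothetical helper: report whether a path can be stat'ed within a time limit.
# Prints "ok", "error" (stat failed) or "hung" (no answer within the limit).
probe_path() {
    path=$1
    limit=${2:-5}
    stat "$path" >/dev/null 2>&1 &
    pid=$!
    waited=0
    # Poll instead of waiting, so this shell never blocks on the child.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            echo "hung: $path"
            return 2
        fi
        sleep 1
        waited=$((waited + 1))
    done
    if wait "$pid"; then
        echo "ok: $path"
    else
        echo "error: $path"
        return 1
    fi
}
```

Running `probe_path /etc/pve/priv/authorized_keys` on each node shows which ones can still read the cluster filesystem. A "hung" result on only the changed node would fit the ssh behaviour described here.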
 
In order to get my critical services back up, I ended up following these instructions to detach each node from the cluster, so at least I could bring everything back up.

Of course, that means now I can't join the three nodes into a cluster again, because each node already has VMs on it.

Any way around that?
 
Now I'm back to having problems.

I have two nodes joined right now:

On one of them I see:

Code:
root@proxmox03:~# pvecm status
Cluster information
-------------------
Name:             mic-cluster
Config Version:   2
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Fri Feb 21 16:15:59 2025
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000001
Ring ID:          1.39
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      2
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 172.29.158.35 (local)
0x00000002          1 172.29.20.204

root@proxmox03:~# pvecm nodes

Membership information
----------------------
    Nodeid      Votes Name
         1          1 proxmox03 (local)
         2          1 proxmox-i9

On the other I see:

Code:
root@proxmox-i9:~# pvecm status
Cluster information
-------------------
Name:             mic-cluster
Config Version:   2
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Fri Feb 21 16:16:17 2025
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000002
Ring ID:          1.39
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      2
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 172.29.158.35
0x00000002          1 172.29.20.204 (local)
root@proxmox-i9:~# pvecm nodes

Membership information
----------------------
    Nodeid      Votes Name
         1          1 proxmox03
         2          1 proxmox-i9 (local)
root@proxmox-i9:~#

I can ssh between them in both directions.

But if I try to do anything that modifies any files in /etc/pve - including just "touch /etc/pve/testfile" - the command hangs.