Hello,
I have 4 Proxmox nodes, each with 2x10G interfaces. I am testing out a configuration without a switch. I have followed the tutorial at https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server
The only difference between my setup and the setup in the tutorial is the addition of the 4th node. It works really well... until I unplug one of the interfaces and plug it back in. If I do, the network for Ceph starts going crazy (the entire meshed network stops working):
- Pings do correctly go through from each node to every other node
- The Ceph config panel in Proxmox does work (most of the time; sometimes it takes a long time to load)
- The Ceph storage immediately stops working and accepts no reads or writes; the disk usage in VMs jumps up to 100%
However, if I unplug an interface, wait for `max-lsp-lifetime` to elapse, and then plug it in, it connects up correctly, without any interruption. Every single time.
A few things I have noticed:
- The routes/neighbors/topology in vtysh show up correctly, even while the network is going crazy
- If I replug without waiting and look at `show openfabric neighbors` in vtysh, the neighbor is immediately Up, but when I wait through `max-lsp-lifetime` and then replug the interface, it is in the Initializing state for approx. 2 s
- If I unplug the interface and begin to shuffle the connections around, it all gets connected almost instantly, but when I plug the last one in, it stops working again.
The nodes are cabled as follows:
- prox01 connected to prox03 and prox04
- prox02 connected to prox03 and prox04
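To make the cabling easier to reason about, here is a rough sketch that models it as a graph and counts hops between every pair of nodes. It assumes the four links listed above are the only ones; the names `LINKS`, `build_adjacency`, `hop_count` and `hop_table` are just illustrative helpers, not anything from FRR or Proxmox.

```python
from collections import deque
from itertools import combinations

# Cabling as listed above; assumed to be the complete set of links.
LINKS = [
    ("prox01", "prox03"),
    ("prox01", "prox04"),
    ("prox02", "prox03"),
    ("prox02", "prox04"),
]


def build_adjacency(links):
    """Turn a list of point-to-point links into an undirected adjacency map."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def hop_count(adj, src, dst):
    """Breadth-first search; returns the hop count, or None if unreachable."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def hop_table(links):
    """Hop count for every pair of nodes that appears in the link list."""
    adj = build_adjacency(links)
    return {
        (a, b): hop_count(adj, a, b)
        for a, b in combinations(sorted(adj), 2)
    }


if __name__ == "__main__":
    print("All links plugged in:")
    for pair, hops in hop_table(LINKS).items():
        print(f"  {pair[0]} -> {pair[1]}: {hops} hop(s)")

    # Simulate unplugging the prox01 <-> prox03 cable.
    print("prox01 <-> prox03 unplugged:")
    for pair, hops in hop_table(LINKS[1:]).items():
        print(f"  {pair[0]} -> {pair[1]}: {hops} hop(s)")
```

With this cabling, prox01/prox02 and prox03/prox04 have no direct link, so traffic between them is always forwarded through another node. With one cable unplugged, every node is still reachable in this model, just over a longer path.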
My config is identical to the one in the Proxmox Wiki article. I have also tried both tweaking all of the timings and leaving them at defaults (traditional and datacenter).
Does anyone know what the issue may be?