proxmox-ve: 7.4-1
ceph: 17.2.6-pve1
I have a 4-node Ceph cluster with Proxmox as the host. The public and private networks are separated, each on its own bond of 2 x 10 Gbps ports (two dual-port cards per node, 4 ports total). All nodes are set up in exactly the same way. Here is an example of the Ceph private network config:
auto enp4s0f0
iface enp4s0f0 inet manual
    mtu 9000

auto enp5s0f0
iface enp5s0f0 inet manual
    mtu 9000

auto bond1
iface bond1 inet manual
    bond-slaves enp4s0f0 enp5s0f0
    bond-miimon 100
    bond-mode 802.3ad
    bond-xmit-hash-policy layer2+3
    mtu 9000
#Ceph Private

auto vmbr1
iface vmbr1 inet static
    address 10.221.2.70/24
    bridge-ports bond1
    bridge-stp off
    bridge-fd 0
    mtu 9000
#Ceph Private
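For reference, the bond state and the raw per-slave counters can also be read directly; below is a minimal sketch (assuming the standard /proc/net/bonding and /sys/class/net paths, with the slave names taken from the config above) that dumps the bonding driver status and the TX/RX byte counters per slave:

#!/usr/bin/env python3
"""Dump bond1 driver state and per-slave TX/RX byte counters,
so an idle slave shows up without watching nload."""
from pathlib import Path

BOND = "bond1"
SLAVES = ["enp4s0f0", "enp5s0f0"]  # slave names from the config above

# Bonding driver state: check "MII Status" and "Aggregator ID" per slave.
print(Path(f"/proc/net/bonding/{BOND}").read_text())

# Raw interface counters: a slave that never transmits keeps tx_bytes near zero.
for iface in SLAVES:
    stats = Path(f"/sys/class/net/{iface}/statistics")
    tx = int((stats / "tx_bytes").read_text())
    rx = int((stats / "rx_bytes").read_text())
    print(f"{iface}: tx_bytes={tx} rx_bytes={rx}")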
I was just testing new storage and watching the bandwidth with nload. On one node, and ONLY on that one node, enp4s0f0 seems to be sending no (or almost no) outgoing traffic. Incoming traffic works properly.
NODE 1 enp4s0f0
Outgoing:
Curr: 0.00 Bit/s
Avg: 32.00 Bit/s
Min: 0.00 Bit/s
Max: 976.00 Bit/s
Ttl: 6.43 MByte
- the 6.43 MByte total is only there because I unplugged enp5s0f0 to see whether this port and traffic direction even work, and they do!
NODE 1 enp5s0f0
Outgoing:
Curr: 849.15 MBit/s
Avg: 1.00 GBit/s
Min: 3.42 MBit/s
Max: 4.75 GBit/s
Ttl: 1254.14 GByte
The other 3 nodes behave the way I would expect, using both ports of the bonded interface. Here is an example from node 3.
NODE 3 enp4s0f0
Outgoing:
Curr: 3.40 GBit/s
Avg: 847.84 MBit/s
Min: 1.74 MBit/s
Max: 3.40 GBit/s
Ttl: 576.82 GByte
NODE 3 enp5s0f0
Outgoing:
Curr: 1.07 GBit/s
Avg: 424.29 MBit/s
Min: 732.96 kBit/s
Max: 1.43 GBit/s
Ttl: 272.50 GByte
Even though it is up to the sending server how the traffic is distributed across the bond, I checked the switch and all ports are configured the same way. I compared the port configuration side by side and it is identical on all 4 nodes (apart from the IP addresses). I also verified that all the network cards are on the same firmware.
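As I understand it, the outgoing slave is chosen by the layer2+3 transmit hash on the sending node, so the choice is fixed per peer. Below is a rough sketch of that selection, following the formula given in the Linux bonding documentation rather than the exact kernel code; the MAC addresses and the other nodes' IPs are made-up assumptions for illustration:

#!/usr/bin/env python3
"""Approximate the bonding layer2+3 transmit hash (per the Linux
bonding documentation) to show which of the two slaves each peer
would map to. All MAC addresses below are placeholders."""
import ipaddress


def layer2_3_hash(src_mac, dst_mac, src_ip, dst_ip,
                  pkt_type=0x0800, n_slaves=2):
    """Last MAC bytes XOR packet type, XOR the IPs, fold, then
    reduce modulo the number of slaves (2 in this bond)."""
    h = int(src_mac.split(":")[-1], 16) ^ int(dst_mac.split(":")[-1], 16) ^ pkt_type
    h ^= int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    h ^= h >> 16
    h ^= h >> 8
    return h % n_slaves


# This node (address from the config above) and assumed peers on the
# Ceph private network; real MACs/IPs will shift the result.
my_mac, my_ip = "aa:bb:cc:dd:ee:01", "10.221.2.70"
peers = {
    "node2": ("aa:bb:cc:dd:ee:02", "10.221.2.71"),
    "node3": ("aa:bb:cc:dd:ee:03", "10.221.2.72"),
    "node4": ("aa:bb:cc:dd:ee:04", "10.221.2.73"),
}

for name, (mac, ip) in peers.items():
    print(f"traffic to {name} ({ip}) -> bond slave {layer2_3_hash(my_mac, mac, my_ip, ip)}")

If all three peers happen to hash to the same slave on this particular node (the source MAC and IP go into the hash, so the result can differ per node even with identical configs), that would at least match what I am seeing, but I have not confirmed it.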
Is there anything about the 1st node that could cause this?
Has anybody seen this behavior on only one node in a cluster?
Thank you.