Intel NUC 13 Pro Thunderbolt Ring Network Ceph Cluster

As for my TB4 cables... The ones I have seem to be good... but I'll happily replace them if they're affecting performance.

Edit: Ok, I just ordered these cables to test...

The cables arrived and I ran tests... The results are unexpectedly good, with both performing similarly - though the
OWC ($20) edges out the Cable Matters ($28) in performance (and consistency, too)!

Node 1 --> Nodes 2 & 3
Code:
root@nuc1:~# iperf3 -c fc00::112; iperf3 -c fc00::113
Connecting to host fc00::112, port 5201
[  5] local fc00::111 port 54016 connected to fc00::112 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.08 GBytes  26.5 Gbits/sec    0   1.69 MBytes     
[  5]   1.00-2.00   sec  3.08 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   2.00-3.00   sec  3.08 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   3.00-4.00   sec  3.05 GBytes  26.2 Gbits/sec    0   1.69 MBytes     
[  5]   4.00-5.00   sec  3.07 GBytes  26.3 Gbits/sec    0   1.69 MBytes     
[  5]   5.00-6.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   6.00-7.00   sec  3.07 GBytes  26.3 Gbits/sec   10   1.69 MBytes     
[  5]   7.00-8.00   sec  3.07 GBytes  26.4 Gbits/sec    2   1.62 MBytes     
[  5]   8.00-9.00   sec  3.08 GBytes  26.4 Gbits/sec    0   1.62 MBytes     
[  5]   9.00-10.00  sec  3.07 GBytes  26.4 Gbits/sec   13   1.69 MBytes     
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  30.7 GBytes  26.4 Gbits/sec   25             sender
[  5]   0.00-10.00  sec  30.7 GBytes  26.4 Gbits/sec                  receiver

iperf Done.
Connecting to host fc00::113, port 5201
[  5] local fc00::111 port 32936 connected to fc00::113 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   1.00-2.00   sec  3.08 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   2.00-3.00   sec  3.08 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   3.00-4.00   sec  3.04 GBytes  26.1 Gbits/sec    0   1.69 MBytes     
[  5]   4.00-5.00   sec  3.07 GBytes  26.3 Gbits/sec    0   1.69 MBytes     
[  5]   5.00-6.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   6.00-7.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   7.00-8.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   8.00-9.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   9.00-10.00  sec  3.08 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  30.7 GBytes  26.4 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  30.7 GBytes  26.4 Gbits/sec                  receiver

iperf Done.

Node 2 --> Nodes 1 & 3
Code:
root@nuc2:~# iperf3 -c fc00::111; iperf3 -c fc00::113
Connecting to host fc00::111, port 5201
[  5] local fc00::112 port 48870 connected to fc00::111 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.75 MBytes     
[  5]   1.00-2.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.75 MBytes     
[  5]   2.00-3.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.75 MBytes     
[  5]   3.00-4.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.75 MBytes     
[  5]   4.00-5.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.75 MBytes     
[  5]   5.00-6.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.75 MBytes     
[  5]   6.00-7.00   sec  3.06 GBytes  26.2 Gbits/sec    0   1.75 MBytes     
[  5]   7.00-8.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.75 MBytes     
[  5]   8.00-9.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.75 MBytes     
[  5]   9.00-10.00  sec  3.07 GBytes  26.3 Gbits/sec    0   1.75 MBytes     
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  30.7 GBytes  26.4 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  30.7 GBytes  26.4 Gbits/sec                  receiver

iperf Done.
Connecting to host fc00::113, port 5201
[  5] local fc00::112 port 47710 connected to fc00::113 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  2.96 GBytes  25.5 Gbits/sec    0   1.69 MBytes     
[  5]   1.00-2.00   sec  2.97 GBytes  25.5 Gbits/sec    0   1.69 MBytes     
[  5]   2.00-3.00   sec  2.97 GBytes  25.5 Gbits/sec    0   1.69 MBytes     
[  5]   3.00-4.00   sec  2.96 GBytes  25.4 Gbits/sec    1   1.69 MBytes     
[  5]   4.00-5.00   sec  2.97 GBytes  25.5 Gbits/sec    0   1.69 MBytes     
[  5]   5.00-6.00   sec  2.94 GBytes  25.3 Gbits/sec    0   1.69 MBytes     
[  5]   6.00-7.00   sec  2.96 GBytes  25.4 Gbits/sec    0   1.69 MBytes     
[  5]   7.00-8.00   sec  2.97 GBytes  25.5 Gbits/sec    0   1.69 MBytes     
[  5]   8.00-9.00   sec  2.96 GBytes  25.4 Gbits/sec    0   1.69 MBytes     
[  5]   9.00-10.00  sec  2.95 GBytes  25.4 Gbits/sec    0   1.69 MBytes     
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  29.6 GBytes  25.4 Gbits/sec    1             sender
[  5]   0.00-10.00  sec  29.6 GBytes  25.4 Gbits/sec                  receiver

iperf Done.

Node 3 --> Nodes 1 & 2
Code:
root@nuc3:~# iperf3 -c fc00::111; iperf3 -c fc00::112
Connecting to host fc00::111, port 5201
[  5] local fc00::113 port 34390 connected to fc00::111 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   1.00-2.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   2.00-3.00   sec  3.07 GBytes  26.3 Gbits/sec    0   1.69 MBytes     
[  5]   3.00-4.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   4.00-5.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   5.00-6.00   sec  3.08 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   6.00-7.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   7.00-8.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   8.00-9.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   9.00-10.00  sec  3.07 GBytes  26.4 Gbits/sec    2   2.37 MBytes     
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  30.7 GBytes  26.4 Gbits/sec    2             sender
[  5]   0.00-10.00  sec  30.7 GBytes  26.4 Gbits/sec                  receiver

iperf Done.
Connecting to host fc00::112, port 5201
[  5] local fc00::113 port 43928 connected to fc00::112 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  2.96 GBytes  25.4 Gbits/sec    0   1.69 MBytes     
[  5]   1.00-2.00   sec  2.97 GBytes  25.5 Gbits/sec    0   1.69 MBytes     
[  5]   2.00-3.00   sec  2.97 GBytes  25.5 Gbits/sec    0   1.69 MBytes     
[  5]   3.00-4.00   sec  2.97 GBytes  25.5 Gbits/sec    0   1.69 MBytes     
[  5]   4.00-5.00   sec  2.91 GBytes  25.0 Gbits/sec    0   1.69 MBytes     
[  5]   5.00-6.00   sec  2.92 GBytes  25.0 Gbits/sec    0   1.69 MBytes     
[  5]   6.00-7.00   sec  2.92 GBytes  25.1 Gbits/sec    0   1.69 MBytes     
[  5]   7.00-8.00   sec  2.92 GBytes  25.1 Gbits/sec    2   1.75 MBytes     
[  5]   8.00-9.00   sec  2.97 GBytes  25.5 Gbits/sec    0   1.75 MBytes     
[  5]   9.00-10.00  sec  2.97 GBytes  25.5 Gbits/sec    0   1.75 MBytes     
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  29.5 GBytes  25.3 Gbits/sec    2             sender
[  5]   0.00-10.00  sec  29.5 GBytes  25.3 Gbits/sec                  receiver

iperf Done.

Node 1 --> Nodes 2 & 3
Code:
root@nuc1:~# iperf3 -c fc00::112; iperf3 -c fc00::113
Connecting to host fc00::112, port 5201
[  5] local fc00::111 port 40330 connected to fc00::112 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.10 GBytes  26.6 Gbits/sec    0   2.37 MBytes     
[  5]   1.00-2.00   sec  3.09 GBytes  26.6 Gbits/sec    0   2.37 MBytes     
[  5]   2.00-3.00   sec  3.10 GBytes  26.7 Gbits/sec    0   2.37 MBytes     
[  5]   3.00-4.00   sec  3.10 GBytes  26.6 Gbits/sec    0   2.37 MBytes     
[  5]   4.00-5.00   sec  3.11 GBytes  26.7 Gbits/sec    0   2.37 MBytes     
[  5]   5.00-6.00   sec  3.10 GBytes  26.6 Gbits/sec    0   2.37 MBytes     
[  5]   6.00-7.00   sec  3.06 GBytes  26.3 Gbits/sec    0   2.37 MBytes     
[  5]   7.00-8.00   sec  3.09 GBytes  26.6 Gbits/sec    0   2.37 MBytes     
[  5]   8.00-9.00   sec  3.09 GBytes  26.5 Gbits/sec    0   2.37 MBytes     
[  5]   9.00-10.00  sec  3.11 GBytes  26.7 Gbits/sec    0   2.37 MBytes     
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  31.0 GBytes  26.6 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  31.0 GBytes  26.6 Gbits/sec                  receiver

iperf Done.
Connecting to host fc00::113, port 5201
[  5] local fc00::111 port 38284 connected to fc00::113 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.10 GBytes  26.6 Gbits/sec    0   1.69 MBytes     
[  5]   1.00-2.00   sec  3.10 GBytes  26.7 Gbits/sec    0   1.69 MBytes     
[  5]   2.00-3.00   sec  3.09 GBytes  26.6 Gbits/sec    0   1.69 MBytes     
[  5]   3.00-4.00   sec  3.08 GBytes  26.5 Gbits/sec    0   1.69 MBytes     
[  5]   4.00-5.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   5.00-6.00   sec  3.08 GBytes  26.5 Gbits/sec    0   1.69 MBytes     
[  5]   6.00-7.00   sec  3.07 GBytes  26.4 Gbits/sec    1   1.69 MBytes     
[  5]   7.00-8.00   sec  3.09 GBytes  26.6 Gbits/sec    0   1.69 MBytes     
[  5]   8.00-9.00   sec  3.10 GBytes  26.6 Gbits/sec    0   1.69 MBytes     
[  5]   9.00-10.00  sec  3.11 GBytes  26.7 Gbits/sec    0   1.69 MBytes     
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  30.9 GBytes  26.5 Gbits/sec    1             sender
[  5]   0.00-10.00  sec  30.9 GBytes  26.5 Gbits/sec                  receiver

iperf Done.

Node 2 --> Nodes 1 & 3
Code:
root@nuc2:~# iperf3 -c fc00::111; iperf3 -c fc00::113
Connecting to host fc00::111, port 5201
[  5] local fc00::112 port 59786 connected to fc00::111 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.09 GBytes  26.5 Gbits/sec    0   2.43 MBytes     
[  5]   1.00-2.00   sec  3.08 GBytes  26.5 Gbits/sec    0   2.43 MBytes     
[  5]   2.00-3.00   sec  3.09 GBytes  26.6 Gbits/sec    0   2.43 MBytes     
[  5]   3.00-4.00   sec  3.09 GBytes  26.5 Gbits/sec    0   2.43 MBytes     
[  5]   4.00-5.00   sec  3.09 GBytes  26.5 Gbits/sec    0   2.43 MBytes     
[  5]   5.00-6.00   sec  3.10 GBytes  26.6 Gbits/sec    0   2.43 MBytes     
[  5]   6.00-7.00   sec  3.10 GBytes  26.6 Gbits/sec    0   2.43 MBytes     
[  5]   7.00-8.00   sec  3.11 GBytes  26.7 Gbits/sec    0   2.43 MBytes     
[  5]   8.00-9.00   sec  3.10 GBytes  26.7 Gbits/sec    0   2.43 MBytes     
[  5]   9.00-10.00  sec  3.09 GBytes  26.6 Gbits/sec    0   2.43 MBytes     
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  30.9 GBytes  26.6 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  30.9 GBytes  26.6 Gbits/sec                  receiver

iperf Done.
Connecting to host fc00::113, port 5201
[  5] local fc00::112 port 43812 connected to fc00::113 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.07 GBytes  26.4 Gbits/sec    0   2.50 MBytes     
[  5]   1.00-2.00   sec  3.06 GBytes  26.3 Gbits/sec    0   2.50 MBytes     
[  5]   2.00-3.00   sec  3.07 GBytes  26.4 Gbits/sec    0   2.50 MBytes     
[  5]   3.00-4.00   sec  3.09 GBytes  26.6 Gbits/sec    0   2.50 MBytes     
[  5]   4.00-5.00   sec  3.09 GBytes  26.6 Gbits/sec    0   2.50 MBytes     
[  5]   5.00-6.00   sec  3.06 GBytes  26.3 Gbits/sec    0   2.50 MBytes     
[  5]   6.00-7.00   sec  3.06 GBytes  26.3 Gbits/sec    0   2.50 MBytes     
[  5]   7.00-8.00   sec  3.07 GBytes  26.4 Gbits/sec    0   2.50 MBytes     
[  5]   8.00-9.00   sec  3.08 GBytes  26.5 Gbits/sec    0   2.50 MBytes     
[  5]   9.00-10.00  sec  3.08 GBytes  26.5 Gbits/sec    0   2.50 MBytes     
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  30.7 GBytes  26.4 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  30.7 GBytes  26.4 Gbits/sec                  receiver

iperf Done.

Node 3 --> Nodes 1 & 2
Code:
root@nuc3:~# iperf3 -c fc00::111; iperf3 -c fc00::112
Connecting to host fc00::111, port 5201
[  5] local fc00::113 port 60048 connected to fc00::111 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.07 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
[  5]   1.00-2.00   sec  3.09 GBytes  26.5 Gbits/sec    0   1.69 MBytes     
[  5]   2.00-3.00   sec  3.10 GBytes  26.6 Gbits/sec    0   1.69 MBytes     
[  5]   3.00-4.00   sec  3.08 GBytes  26.5 Gbits/sec    0   1.69 MBytes     
[  5]   4.00-5.00   sec  3.10 GBytes  26.6 Gbits/sec    0   1.69 MBytes     
[  5]   5.00-6.00   sec  3.10 GBytes  26.6 Gbits/sec    0   1.69 MBytes     
[  5]   6.00-7.00   sec  3.08 GBytes  26.5 Gbits/sec    0   1.69 MBytes     
[  5]   7.00-8.00   sec  3.08 GBytes  26.5 Gbits/sec    0   1.69 MBytes     
[  5]   8.00-9.00   sec  3.09 GBytes  26.5 Gbits/sec    0   1.69 MBytes     
[  5]   9.00-10.00  sec  3.09 GBytes  26.5 Gbits/sec    0   1.69 MBytes     
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  30.9 GBytes  26.5 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  30.9 GBytes  26.5 Gbits/sec                  receiver

iperf Done.
Connecting to host fc00::112, port 5201
[  5] local fc00::113 port 55722 connected to fc00::112 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.06 GBytes  26.3 Gbits/sec    0   1.69 MBytes     
[  5]   1.00-2.00   sec  3.10 GBytes  26.6 Gbits/sec    0   1.69 MBytes     
[  5]   2.00-3.00   sec  3.09 GBytes  26.6 Gbits/sec    0   1.69 MBytes     
[  5]   3.00-4.00   sec  3.09 GBytes  26.6 Gbits/sec    0   1.69 MBytes     
[  5]   4.00-5.00   sec  3.09 GBytes  26.5 Gbits/sec    0   1.69 MBytes     
[  5]   5.00-6.00   sec  3.08 GBytes  26.5 Gbits/sec    0   1.69 MBytes     
[  5]   6.00-7.00   sec  3.09 GBytes  26.6 Gbits/sec    0   1.69 MBytes     
[  5]   7.00-8.00   sec  3.10 GBytes  26.6 Gbits/sec    0   1.69 MBytes     
[  5]   8.00-9.00   sec  3.09 GBytes  26.5 Gbits/sec    0   1.69 MBytes     
[  5]   9.00-10.00  sec  3.08 GBytes  26.4 Gbits/sec    0   1.69 MBytes     
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  30.9 GBytes  26.5 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  30.9 GBytes  26.5 Gbits/sec                  receiver

iperf Done.

I can't precisely explain why I'm getting 26Gbps performance instead of 21Gbps now though. Absolutely zero complaints.
No VMs or CTs are running at the moment. Will test with them running, too.
 
My NUC 13 Pro cluster is up and running! I didn't hit any major speed bumps thanks to you guys. :cool:

I'm posting... with a secondary goal of figuring out how to get my Thunderbolt network to perform at 26Gbps (it's currently "only" at 21Gbps).

All 3 nodes perform at this level, even when running simultaneously (Node 1>>2, Node 2>>3, Node 3>>1), and running with -P 10 yields identical performance.

Code:
root@nuc1:~# iperf3 -c fc00::112
Connecting to host fc00::112, port 5201
[  5] local fc00::111 port 42512 connected to fc00::112 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  2.49 GBytes  21.4 Gbits/sec  813   1023 KBytes
[  5]   1.00-2.00   sec  2.49 GBytes  21.4 Gbits/sec  731   1.31 MBytes
[  5]   2.00-3.00   sec  2.49 GBytes  21.4 Gbits/sec  829    575 KBytes
[  5]   3.00-4.00   sec  2.47 GBytes  21.2 Gbits/sec  1032   1.37 MBytes
[  5]   4.00-5.00   sec  2.49 GBytes  21.4 Gbits/sec  1106   1.06 MBytes
[  5]   5.00-6.00   sec  2.48 GBytes  21.3 Gbits/sec  1103   1.19 MBytes
[  5]   6.00-7.00   sec  2.48 GBytes  21.3 Gbits/sec  1013   1.12 MBytes
[  5]   7.00-8.00   sec  2.50 GBytes  21.5 Gbits/sec  1019   1.06 MBytes
[  5]   8.00-9.00   sec  2.03 GBytes  17.5 Gbits/sec  803   1.37 MBytes
[  5]   9.00-10.00  sec  2.52 GBytes  21.6 Gbits/sec  1156   1023 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  24.4 GBytes  21.0 Gbits/sec  9605             sender
[  5]   0.00-10.00  sec  24.4 GBytes  21.0 Gbits/sec                  receiver

I just noticed another significant difference between my previous tests (21Gbps) and the current ones (26Gbps): the slower results were also plagued by 800-1000 retries every second, while today's faster results have virtually no retries (for either cable brand).

I must have had something bad going on. Not sure if it was hardware or software related, but thankfully it didn't affect the stability of the cluster as I was building and configuring it.

Edit: Thinking about it more, since the problem affected all 3 physically-independent links in the same fashion, I suspect it was a software issue.

Edit 2: I started all VMs/CTs and re-tested with the OWC cables - the results are the same - a rock-solid 26Gbps.
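For anyone wanting to double-check the retries outside of iperf3, the kernel's global TCP counters tell the same story. A quick sketch (plain /proc reading; nothing here is specific to the TB link, and the counter is system-wide, so keep other traffic quiet while testing):

```shell
# Read the kernel-wide RetransSegs counter before and after a transfer.
# The last "Tcp:" line in /proc/net/snmp holds the values; RetransSegs is field 13.
snap() { awk '/^Tcp:/ { line = $0 } END { split(line, a, " "); print a[13] }' /proc/net/snmp; }

before=$(snap)
# ... run the iperf3 test here ...
after=$(snap)
echo "TCP segments retransmitted during the run: $((after - before))"
```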
 
So I wonder: how do we delay Ceph startup until Thunderbolt is up and renamed?

I'm not sure that's entirely necessary. Based on observing Ceph, I assume it's built to be robust and handle late timing like this. I'm sure it happens a lot.

This timing problem doesn't seem to affect Ceph directly. I just can't stand the potential failure/instability of not failing back reliably. It seems Proxmox's HA/migration connection logic isn't as robust as Ceph's.

What would be amazing is having a link hierarchy like the cluster (Corosync) has, with assigned priorities. We could assign the TB network as link0 and the LAN as link1.
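If you did want to gate it, a systemd drop-in that orders ceph.target after the Thunderbolt device unit might work (untested sketch; "en05" is an assumed post-rename interface name from scyto's gist - substitute your own):

```shell
# Sketch: make ceph.target wait for the renamed TB interface's device unit
# before Ceph services are pulled in at boot.
mkdir -p /etc/systemd/system/ceph.target.d
cat > /etc/systemd/system/ceph.target.d/wait-for-thunderbolt.conf <<'EOF'
[Unit]
After=sys-subsystem-net-devices-en05.device
Wants=sys-subsystem-net-devices-en05.device
EOF
systemctl daemon-reload
```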

One quick double-check: have you proven you can SSH manually between nodes over IPv6?

Yes, I have always been able to SSH between nodes on Thunderbolt IPv6 (and IPv4 when it's actually up), and I was able to do so not long after the "Network is unreachable" message. But I do vaguely recall trying to either SSH or run an iperf3 test on the TB network while the node was coming up, and I believe I got a message like "No route to host" or something like that. It was very temporary, though. Perhaps that was during the same timeframe as this error.

(As a reminder: to trigger the "Network is unreachable" message, I configured TB as my migration network with a /125 subnet, then rebooted Node 3. As it was rebooting, its VMs/CTs auto-migrated to the other nodes over TB (so SSH was obviously working), and once Node 3 was somewhat online, the failback migration tried to execute but failed.)
 
@esReveRse i wonder if it is a s
I can't precisely explain why I'm getting 26Gbps performance instead of 21Gbps now though. Absolutely zero complaints.
Yeah, I love the length of the short OWC - that's what I used.

I don't know why one node was doing 18Gbps, but only in one direction. It went away, and I'm unclear what I changed; something for us to keep an eye on and see if it comes back...
 
I can't precisely explain why I'm getting 26Gbps performance instead of 21Gbps now though. Absolutely zero complaints.
I'm seeing similar numbers around 22Gbit/s with lots of retries in the iperf3 output.

I do have an older-generation NUC (Intel NUC 12 Pro with an i3-1220P), so that can be a factor, too. Did you do anything special to get rid of the retries?

I think I might also be limited by the i3 CPU in that NUC. When I run the iperf3 test, I see 100% utilization on one core, which suggests the CPU isn't fast enough to push more than what I'm getting. The cables are also nothing special; I might try the OWC one to see if there's any difference.

I also have only IPv4 set up, no IPv6. Are you seeing any speed or retry difference between the two?

In general, the setup is very reliable. After I resolved the issue with the network being down after a reboot, I tested cutting power on one node, and when it came back, it automatically reconnected to the network. I'm really happy with the cluster; it works way better than I was expecting.

Thanks to everybody documenting their process setting it up. It saved me a lot of trouble and headaches :D


Code:
[  5] local 10.0.0.82 port 39828 connected to 10.0.0.81 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  2.48 GBytes  21.3 Gbits/sec  636   1.12 MBytes      
[  5]   1.00-2.00   sec  2.46 GBytes  21.2 Gbits/sec  694   1.19 MBytes      
[  5]   2.00-3.00   sec  2.51 GBytes  21.6 Gbits/sec  666   1.44 MBytes      
[  5]   3.00-4.00   sec  2.48 GBytes  21.3 Gbits/sec  745   1.31 MBytes      
[  5]   4.00-5.00   sec  2.55 GBytes  21.9 Gbits/sec  585   1.31 MBytes      
[  5]   5.00-6.00   sec  2.46 GBytes  21.2 Gbits/sec  767   1.37 MBytes      
[  5]   6.00-7.00   sec  2.52 GBytes  21.7 Gbits/sec  705   1.31 MBytes      
[  5]   7.00-8.00   sec  2.47 GBytes  21.2 Gbits/sec  680   1.12 MBytes      
[  5]   8.00-9.00   sec  2.51 GBytes  21.6 Gbits/sec  735   1.25 MBytes      
[  5]   9.00-10.00  sec  2.53 GBytes  21.7 Gbits/sec  618   1.06 MBytes      
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  25.0 GBytes  21.5 Gbits/sec  6831             sender
[  5]   0.00-10.00  sec  25.0 GBytes  21.5 Gbits/sec                  receiver

EDIT: I'm starting to think it's related to cable quality. I tested with only two nodes directly connected by two Thunderbolt cables, and the Retr value was a lot lower.
Code:
Connecting to host 10.0.0.82, port 5201
[  5] local 10.0.0.81 port 47814 connected to 10.0.0.82 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  2.52 GBytes  21.6 Gbits/sec  159   1.06 MBytes       
[  5]   1.00-2.00   sec  2.61 GBytes  22.4 Gbits/sec  187   1.62 MBytes       
[  5]   2.00-3.00   sec  2.41 GBytes  20.7 Gbits/sec  283   1.37 MBytes       
[  5]   3.00-4.00   sec  2.41 GBytes  20.7 Gbits/sec  255   1.69 MBytes       
[  5]   4.00-5.00   sec  2.37 GBytes  20.3 Gbits/sec  293   1023 KBytes       
[  5]   5.00-6.00   sec  2.54 GBytes  21.9 Gbits/sec  182   1.19 MBytes       
[  5]   6.00-7.00   sec  2.56 GBytes  22.0 Gbits/sec  205    895 KBytes       
[  5]   7.00-8.00   sec  2.42 GBytes  20.7 Gbits/sec  297   1.19 MBytes       
[  5]   8.00-9.00   sec  2.43 GBytes  20.9 Gbits/sec  272   1.25 MBytes       
[  5]   9.00-10.00  sec  2.42 GBytes  20.8 Gbits/sec  275   1.37 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  24.7 GBytes  21.2 Gbits/sec  2408             sender
[  5]   0.00-10.00  sec  24.7 GBytes  21.2 Gbits/sec                  receiver

I also tested with IPv6 set up; there was no significant difference in the speed or `Retr` value. I might just call it at this point; even that speed is more than what my SSD drives can do. Not really sure if putting more time into resolving this will benefit me in any way.

Once the new cables arrive I'll post an update here for anyone else who might be in the same boat.
 
I'm running different hardware than you guys: the Minisforum MS-01. I'm getting a lot of retransmits with iperf3. I started with some no-name Thunderbolt 4 cables ("Connbull"). I ordered and tried a Belkin TB4 cable and an OWC TB4 cable; no apparent change. I've followed @scyto's guide over on GitHub to get this far.

The smaller my MTU, the more retransmits I get. Using an MTU of 65520, I get on average about 700-900 retries per second, and every now and then it drops down to single digits for a second or two.
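For reference, I'm pinning the MTU like this (sketch; the en05/en06 interface names are assumptions from scyto's gist - substitute your own):

```shell
# Set the large MTU on the fly to test:
ip link set en05 mtu 65520
ip link set en06 mtu 65520
# To make it persistent with Proxmox's ifupdown2, add "mtu 65520" under
# each Thunderbolt iface stanza in /etc/network/interfaces.
```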

Any thoughts?


Code:
Connecting to host fc00::84, port 5201
[  5] local fc00::83 port 60596 connected to fc00::84 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  2.56 GBytes  22.0 Gbits/sec  625   1.12 MBytes    
[  5]   1.00-2.00   sec  2.56 GBytes  22.0 Gbits/sec  871   1.19 MBytes    
[  5]   2.00-3.00   sec  2.56 GBytes  22.0 Gbits/sec  954   1.31 MBytes    
[  5]   3.00-4.00   sec  2.02 GBytes  17.3 Gbits/sec  850   1.19 MBytes    
[  5]   4.00-5.00   sec  2.52 GBytes  21.7 Gbits/sec  905   1.37 MBytes    
[  5]   5.00-6.00   sec  2.55 GBytes  21.9 Gbits/sec  747   1.25 MBytes    
[  5]   6.00-7.00   sec  2.54 GBytes  21.9 Gbits/sec  745   1.19 MBytes    
[  5]   7.00-8.00   sec  2.54 GBytes  21.8 Gbits/sec  799   1.12 MBytes    
[  5]   8.00-9.00   sec  2.55 GBytes  21.9 Gbits/sec  799   1.06 MBytes    
[  5]   9.00-10.00  sec  2.56 GBytes  22.0 Gbits/sec  947   1.31 MBytes    
[  5]  10.00-11.00  sec  2.56 GBytes  22.0 Gbits/sec  1037   1.37 MBytes    
[  5]  11.00-12.00  sec  2.03 GBytes  17.5 Gbits/sec  748   1023 KBytes    
[  5]  12.00-13.00  sec   976 MBytes  8.18 Gbits/sec  387   63.9 KBytes    
[  5]  13.00-14.00  sec  2.02 GBytes  17.4 Gbits/sec  837   1.06 MBytes    
[  5]  14.00-15.00  sec  1.99 GBytes  17.1 Gbits/sec  829   63.9 KBytes    
[  5]  15.00-16.00  sec   941 MBytes  7.90 Gbits/sec  377   1.37 MBytes    
[  5]  16.00-17.00  sec  2.56 GBytes  22.0 Gbits/sec  1060   1.25 MBytes    
[  5]  17.00-18.00  sec  2.03 GBytes  17.5 Gbits/sec  800   2.00 MBytes    
[  5]  18.00-19.00  sec  2.53 GBytes  21.8 Gbits/sec  1133   1.25 MBytes    
[  5]  19.00-20.00  sec  2.55 GBytes  21.9 Gbits/sec  1018   1.31 MBytes    
[  5]  20.00-21.00  sec   945 MBytes  7.93 Gbits/sec  343   1.25 MBytes    
[  5]  21.00-22.00  sec  2.87 GBytes  24.7 Gbits/sec  430   3.68 MBytes    
[  5]  22.00-23.00  sec  3.07 GBytes  26.4 Gbits/sec    1   3.68 MBytes    
[  5]  23.00-24.00  sec  2.86 GBytes  24.6 Gbits/sec  511   1.31 MBytes    
[  5]  24.00-25.00  sec  2.10 GBytes  18.1 Gbits/sec  600   3.50 MBytes    
[  5]  25.00-26.00  sec  3.07 GBytes  26.4 Gbits/sec    2   3.50 MBytes    
[  5]  26.00-27.00  sec  3.07 GBytes  26.4 Gbits/sec    1   3.50 MBytes    
[  5]  27.00-28.00  sec  2.62 GBytes  22.5 Gbits/sec  943   1.31 MBytes    
[  5]  28.00-29.00  sec  2.61 GBytes  22.4 Gbits/sec  957   3.37 MBytes    
[  5]  29.00-30.00  sec  2.83 GBytes  24.3 Gbits/sec   15   1.87 MBytes    
[  5]  30.00-31.00  sec  1.14 GBytes  9.81 Gbits/sec  362   3.87 MBytes    
[  5]  31.00-32.00  sec  3.07 GBytes  26.4 Gbits/sec    2   3.93 MBytes    
[  5]  32.00-33.00  sec  3.07 GBytes  26.4 Gbits/sec    3   3.93 MBytes    
[  5]  33.00-34.00  sec  3.04 GBytes  26.1 Gbits/sec   97   3.93 MBytes    
[  5]  34.00-35.00  sec  3.07 GBytes  26.4 Gbits/sec    3   3.93 MBytes    
[  5]  35.00-36.00  sec  3.06 GBytes  26.3 Gbits/sec   59   4.56 MBytes    
[  5]  36.00-37.00  sec  3.07 GBytes  26.4 Gbits/sec    3   4.56 MBytes    
[  5]  37.00-38.00  sec  3.07 GBytes  26.4 Gbits/sec    2   4.56 MBytes    
[  5]  38.00-39.00  sec  2.33 GBytes  20.0 Gbits/sec  203   1.12 MBytes    
[  5]  39.00-40.00  sec  2.03 GBytes  17.5 Gbits/sec  825   1.31 MBytes    
[  5]  40.00-41.00  sec  2.04 GBytes  17.5 Gbits/sec  863   1.31 MBytes    
[  5]  41.00-42.00  sec   939 MBytes  7.87 Gbits/sec  378   1.37 MBytes    
[  5]  42.00-43.00  sec  1008 MBytes  8.45 Gbits/sec  401   1.93 MBytes    
[  5]  43.00-44.00  sec  1.91 GBytes  16.4 Gbits/sec  723   63.9 KBytes    
[  5]  44.00-45.00  sec  1.56 GBytes  13.4 Gbits/sec  558   1023 KBytes    
[  5]  45.00-46.00  sec  2.01 GBytes  17.3 Gbits/sec  719   1023 KBytes    
[  5]  46.00-47.00  sec  2.55 GBytes  21.9 Gbits/sec  721   1.37 MBytes    
[  5]  47.00-48.00  sec  2.56 GBytes  22.0 Gbits/sec  968   1.31 MBytes    
[  5]  48.00-49.00  sec  2.03 GBytes  17.4 Gbits/sec  804   1.19 MBytes    
[  5]  49.00-50.00  sec  1.50 GBytes  12.9 Gbits/sec  535   1.50 MBytes    
[  5]  50.00-51.00  sec  2.02 GBytes  17.3 Gbits/sec  868   1.06 MBytes    
[  5]  51.00-52.00  sec  2.09 GBytes  17.9 Gbits/sec  884   2.00 MBytes    
[  5]  52.00-53.00  sec   878 MBytes  7.36 Gbits/sec  385   1.31 MBytes    
[  5]  53.00-54.00  sec  2.55 GBytes  21.9 Gbits/sec  992   1.31 MBytes    
[  5]  54.00-55.00  sec   999 MBytes  8.38 Gbits/sec  470   2.12 MBytes    
[  5]  55.00-56.00  sec  2.79 GBytes  24.0 Gbits/sec  653   3.99 MBytes    
[  5]  56.00-57.00  sec  3.07 GBytes  26.4 Gbits/sec    2   3.99 MBytes    
[  5]  57.00-58.00  sec  3.07 GBytes  26.4 Gbits/sec    0   3.99 MBytes    
[  5]  58.00-59.00  sec  3.07 GBytes  26.4 Gbits/sec    1   3.99 MBytes    
[  5]  59.00-60.00  sec  3.07 GBytes  26.3 Gbits/sec    3   3.99 MBytes    
[  5]  60.00-61.00  sec  2.81 GBytes  24.2 Gbits/sec  402   1.81 MBytes    
[  5]  61.00-62.00  sec  2.52 GBytes  21.7 Gbits/sec  969   1.19 MBytes    
[  5]  62.00-63.00  sec  2.53 GBytes  21.8 Gbits/sec  1018   1.06 MBytes    
[  5]  63.00-64.00  sec  2.69 GBytes  23.1 Gbits/sec  772   3.25 MBytes    
[  5]  64.00-65.00  sec  3.07 GBytes  26.4 Gbits/sec    2   3.31 MBytes    
[  5]  65.00-66.00  sec  2.93 GBytes  25.2 Gbits/sec  279   1023 KBytes    
[  5]  66.00-67.00  sec  1.99 GBytes  17.1 Gbits/sec  933   1.31 MBytes    
[  5]  67.00-68.00  sec  1.98 GBytes  17.0 Gbits/sec  903   1.12 MBytes    
[  5]  68.00-69.00  sec   939 MBytes  7.88 Gbits/sec  405   1.19 MBytes    
[  5]  69.00-70.00  sec  2.46 GBytes  21.1 Gbits/sec  1153   1.44 MBytes    
[  5]  70.00-71.00  sec  2.46 GBytes  21.2 Gbits/sec  1143   1.06 MBytes    
[  5]  71.00-72.00  sec  1.93 GBytes  16.6 Gbits/sec  840   1023 KBytes    
[  5]  72.00-73.00  sec  2.00 GBytes  17.2 Gbits/sec  839   1.87 MBytes    
[  5]  73.00-74.00  sec  1.48 GBytes  12.7 Gbits/sec  584   1.12 MBytes    
[  5]  74.00-75.00  sec   968 MBytes  8.12 Gbits/sec  367   1.06 MBytes    
[  5]  75.00-76.00  sec  2.01 GBytes  17.2 Gbits/sec  872   1.06 MBytes    
[  5]  76.00-77.00  sec   930 MBytes  7.80 Gbits/sec  398   1023 KBytes    
[  5]  77.00-78.00  sec  2.02 GBytes  17.4 Gbits/sec  847   1.31 MBytes    
[  5]  78.00-79.00  sec  2.56 GBytes  22.0 Gbits/sec  1038   2.00 MBytes    
[  5]  79.00-80.00  sec  2.03 GBytes  17.4 Gbits/sec  758   1.06 MBytes    
[  5]  80.00-81.00  sec  2.01 GBytes  17.3 Gbits/sec  634   1.69 MBytes    
[  5]  81.00-82.00  sec  2.56 GBytes  22.0 Gbits/sec  831   1.31 MBytes    
[  5]  82.00-83.00  sec  2.57 GBytes  22.1 Gbits/sec  899   1.06 MBytes    
[  5]  83.00-84.00  sec  2.56 GBytes  22.0 Gbits/sec  1032   1.12 MBytes    
[  5]  84.00-85.00  sec  2.52 GBytes  21.7 Gbits/sec  988    831 KBytes    
[  5]  85.00-86.00  sec  1.49 GBytes  12.8 Gbits/sec  624   1023 KBytes    
[  5]  86.00-87.00  sec  1.47 GBytes  12.6 Gbits/sec  683   1.06 MBytes    
[  5]  87.00-88.00  sec  2.33 GBytes  20.0 Gbits/sec  1003   2.06 MBytes    
[  5]  88.00-89.00  sec  2.24 GBytes  19.2 Gbits/sec  983   1.19 MBytes    
[  5]  89.00-90.00  sec  1.89 GBytes  16.3 Gbits/sec  732   63.9 KBytes    
[  5]  90.00-91.00  sec  1.34 GBytes  11.5 Gbits/sec  422   1.81 MBytes    
[  5]  91.00-92.00  sec  1.65 GBytes  14.2 Gbits/sec  641   2.18 MBytes    
[  5]  92.00-93.00  sec  2.13 GBytes  18.3 Gbits/sec  1003   1.25 MBytes    
[  5]  93.00-94.00  sec  2.54 GBytes  21.8 Gbits/sec  1091   1.50 MBytes    
[  5]  94.00-95.00  sec  2.54 GBytes  21.8 Gbits/sec  1125   1.12 MBytes    
[  5]  95.00-96.00  sec  2.02 GBytes  17.3 Gbits/sec  769   1.31 MBytes    
[  5]  96.00-97.00  sec  2.51 GBytes  21.5 Gbits/sec  1081   1.19 MBytes    
[  5]  97.00-98.00  sec   950 MBytes  7.97 Gbits/sec  385   1.37 MBytes    
[  5]  98.00-99.00  sec  2.02 GBytes  17.4 Gbits/sec  954   1.06 MBytes    
[  5]  99.00-100.00 sec  2.36 GBytes  20.3 Gbits/sec  735   1.44 MBytes    
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-100.00 sec   224 GBytes  19.2 Gbits/sec  63895             sender
[  5]   0.00-100.00 sec   224 GBytes  19.2 Gbits/sec                  receiver

iperf Done.


EDIT

If I set a target bitrate in iperf3, the retries go down a LOT.
Code:
root@pve3:~# iperf3 -c fc00::84 -fg -t10 -b19G
Connecting to host fc00::84, port 5201
[  5] local fc00::83 port 50910 connected to fc00::84 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  2.21 GBytes  19.0 Gbits/sec   61   2.50 MBytes      
[  5]   1.00-2.00   sec  2.21 GBytes  19.0 Gbits/sec    0   2.50 MBytes      
[  5]   2.00-3.00   sec  2.21 GBytes  19.0 Gbits/sec   10   2.56 MBytes      
[  5]   3.00-4.00   sec  2.21 GBytes  19.0 Gbits/sec   26   2.25 MBytes      
[  5]   4.00-5.00   sec  2.21 GBytes  19.0 Gbits/sec    3   2.68 MBytes      
[  5]   5.00-6.00   sec  2.21 GBytes  19.0 Gbits/sec   12   2.62 MBytes      
[  5]   6.00-7.00   sec  2.21 GBytes  19.0 Gbits/sec   16   2.62 MBytes      
[  5]   7.00-8.00   sec  2.21 GBytes  19.0 Gbits/sec   21   1.75 MBytes      
[  5]   8.00-9.00   sec  2.21 GBytes  19.0 Gbits/sec    1   2.62 MBytes      
[  5]   9.00-10.00  sec  2.21 GBytes  19.0 Gbits/sec   17   1.93 MBytes      
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  22.1 GBytes  19.0 Gbits/sec  167             sender
[  5]   0.00-10.00  sec  22.1 GBytes  19.0 Gbits/sec                  receiver

iperf Done.

For the hell of it I've ordered some more TB4 cables, including the stupidly expensive Apple cable.
 
Hey guys, I had a 3-node cluster built as written in the guide here.

Unfortunately, after a reboot of one node the cluster lost its Ceph config?
Or at least it got timeouts everywhere. I rebooted one node because of the kernel update from 6.5.13-3 to 6.5.13-3.
I waited until everything was online again and restarted the 2nd node, and then it started with the "?" next to the Ceph pool.

Does anyone know how to fix this?

[screenshot showing the "?" next to the Ceph pool]

EDIT: I'm able to ping each node and the fabric topology looks ok to me.

[screenshot of the fabric topology]

Solution for now:
Reinstalled from scratch; working again. I had to restore my backups, but nothing was lost.
 
If I set a target bitrate in iperf3, the retries go down a LOT.
That suggests it may not be a cable-related issue. I know the MS-01 does a lot of PCIe lane splitting, depending on what is utilized. I wouldn't be surprised if the Thunderbolt 4 ports were sharing PCIe lanes with other devices, and you're hitting the bandwidth limit because something else is using the same lanes. You can also see that the throughput drops a lot from time to time.
When you limit the speed, you're not hitting the upper limit as much, and it doesn't get congested. I'm just guessing here. I might have the same problem with the i3 CPU not having enough PCIe lanes and hitting similar limitations. I'll post an update here once I test with the new cables, if there is any difference at all.

When I do the iperf3 speed test with the limit, I see basically no difference in the cwnd or retry values until I get down to 10 Gbit, where they go down a bit (from 200-300 to 100-200)
 
I wouldn't be surprised if the Thunderbolt 4 ports were sharing PCI lanes with other devices
The CPU has its own TB4 controller and dedicated lanes for both ports. I've also gotten 26 Gbps at other points in my testing. I had to send a unit back due to a bad RJ45 port, but that was also with high retries.

I did also hit a weird wall where, if my MTU size exceeded 35,000 or so, Ceph would lock up the system hard. But iperf3 didn't care if the MTU was 65k.
 
OK, I've been messing with this for the better part of 2 hours, and I figured it out. I noticed that when I was getting a lot of retries, ksoftirqd was eating up nearly 100% of a CPU core, but when I told iperf3 to limit to 19 Gbps and got few retries, ksoftirqd was using hardly any CPU. I found it strange that a couple-percent bump in speed caused it to go from low usage to max usage.

As I was diving into understanding more about ksoftirqd, I was looking at /proc/interrupts and noticed the cores that Thunderbolt was utilizing, and thought they might have been little cores... Intel uses a big/little architecture, so I wondered if I could pin the IRQs for TB to big cores. I set the IRQ smp_affinity to do this; in my case I did it with "echo 000AAA | tee /proc/irq/22*/smp_affinity". This will vary based on your CPU: how many cores you have, and which are P cores and which are E cores.
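For anyone following along, smp_affinity takes a hex bitmask with one bit per logical CPU. A minimal sketch of building such a mask (the core list below is only an example; substitute the P cores, or their HT siblings, on your own CPU):

```shell
# Build an smp_affinity hex mask from a list of logical CPU IDs.
# The list below is only an example -- substitute your own cores.
cores="1 3 5 7 9 11"
mask=0
for c in $cores; do
    mask=$(( mask | (1 << c) ))
done
printf '%06x\n' "$mask"   # -> 000aaa
```

For logical CPUs 1, 3, 5, 7, 9 and 11 this produces 000aaa, matching the 000AAA mask above.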

Now I get 26 Gbps and virtually 0 retransmits, and ksoftirqd is only using about 8% of one core.

Code:
root@pve3:~# iperf3 -c fc00::84 -fg -t10
Connecting to host fc00::84, port 5201
[  5] local fc00::83 port 57640 connected to fc00::84 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.09 GBytes  26.5 Gbits/sec    0   2.81 MBytes   
[  5]   1.00-2.00   sec  3.10 GBytes  26.6 Gbits/sec    7   1.93 MBytes   
[  5]   2.00-3.00   sec  3.11 GBytes  26.7 Gbits/sec    0   1.93 MBytes   
[  5]   3.00-4.00   sec  3.06 GBytes  26.3 Gbits/sec    0   1.93 MBytes   
[  5]   4.00-5.00   sec  3.09 GBytes  26.6 Gbits/sec    0   1.93 MBytes   
[  5]   5.00-6.00   sec  3.11 GBytes  26.7 Gbits/sec    0   1.93 MBytes   
[  5]   6.00-7.00   sec  3.10 GBytes  26.7 Gbits/sec    0   1.93 MBytes   
[  5]   7.00-8.00   sec  3.11 GBytes  26.7 Gbits/sec    0   1.93 MBytes   
[  5]   8.00-9.00   sec  3.09 GBytes  26.6 Gbits/sec    0   1.93 MBytes   
[  5]   9.00-10.00  sec  3.10 GBytes  26.6 Gbits/sec    0   1.93 MBytes   
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  31.0 GBytes  26.6 Gbits/sec    7             sender
[  5]   0.00-10.00  sec  31.0 GBytes  26.6 Gbits/sec                  receiver

iperf Done.

Code:
root@pve3:~# iperf3 -c fc00::84 -fg -t10   --bidir
Connecting to host fc00::84, port 5201
[  5] local fc00::83 port 45626 connected to fc00::84 port 5201
[  7] local fc00::83 port 45640 connected to fc00::84 port 5201
[ ID][Role] Interval           Transfer     Bitrate         Retr  Cwnd
[  5][TX-C]   0.00-1.00   sec  2.62 GBytes  22.5 Gbits/sec    3   1.87 MBytes   
[  7][RX-C]   0.00-1.00   sec  2.61 GBytes  22.4 Gbits/sec              
[  5][TX-C]   1.00-2.00   sec  2.64 GBytes  22.6 Gbits/sec   10   2.31 MBytes   
[  7][RX-C]   1.00-2.00   sec  2.66 GBytes  22.8 Gbits/sec              
[  5][TX-C]   2.00-3.00   sec  2.71 GBytes  23.3 Gbits/sec   48   1.37 MBytes   
[  7][RX-C]   2.00-3.00   sec  2.54 GBytes  21.8 Gbits/sec              
[  5][TX-C]   3.00-4.00   sec  2.86 GBytes  24.6 Gbits/sec   32   3.43 MBytes   
[  7][RX-C]   3.00-4.00   sec  2.50 GBytes  21.5 Gbits/sec              
[  5][TX-C]   4.00-5.00   sec  2.68 GBytes  23.0 Gbits/sec    7   1.87 MBytes   
[  7][RX-C]   4.00-5.00   sec  2.44 GBytes  20.9 Gbits/sec              
[  5][TX-C]   5.00-6.00   sec  2.92 GBytes  25.1 Gbits/sec   18   2.25 MBytes   
[  7][RX-C]   5.00-6.00   sec  2.41 GBytes  20.7 Gbits/sec              
[  5][TX-C]   6.00-7.00   sec  2.94 GBytes  25.3 Gbits/sec   31   3.43 MBytes   
[  7][RX-C]   6.00-7.00   sec  2.31 GBytes  19.9 Gbits/sec              
[  5][TX-C]   7.00-8.00   sec  2.66 GBytes  22.9 Gbits/sec    2   2.37 MBytes   
[  7][RX-C]   7.00-8.00   sec  2.54 GBytes  21.8 Gbits/sec              
[  5][TX-C]   8.00-9.00   sec  2.56 GBytes  22.0 Gbits/sec    0   2.37 MBytes   
[  7][RX-C]   8.00-9.00   sec  2.70 GBytes  23.2 Gbits/sec              
[  5][TX-C]   9.00-10.00  sec  2.66 GBytes  22.8 Gbits/sec   22   3.31 MBytes   
[  7][RX-C]   9.00-10.00  sec  2.58 GBytes  22.2 Gbits/sec              
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval           Transfer     Bitrate         Retr
[  5][TX-C]   0.00-10.00  sec  27.2 GBytes  23.4 Gbits/sec  173             sender
[  5][TX-C]   0.00-10.00  sec  27.2 GBytes  23.4 Gbits/sec                  receiver
[  7][RX-C]   0.00-10.00  sec  25.3 GBytes  21.7 Gbits/sec  273             sender
[  7][RX-C]   0.00-10.00  sec  25.3 GBytes  21.7 Gbits/sec                  receiver

iperf Done.
root@pve3:~#

EDIT: Because of Hyper-Threading, I realized I don't want to risk two threads sharing a physical core, so I'm trying "echo 000155 | tee /proc/irq/22*/smp_affinity". Cores 0-11 on my i9-13900H are P cores; 000155 should restrict me to cores 0, 2, 4, 6 and 8 (to also include core 10 the mask would need to be 000555), if I'm right.
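To sanity-check a mask before writing it, you can decode which logical CPUs it selects. A quick shell sketch, using the 000155 value above:

```shell
# Decode an smp_affinity hex mask into the logical CPU IDs it selects
# (here the 000155 mask from the echo above).
mask=$(( 0x155 ))
cpus=""
for c in $(seq 0 23); do
    if [ $(( (mask >> c) & 1 )) -eq 1 ]; then
        cpus="$cpus $c"
    fi
done
echo "${cpus# }"   # -> 0 2 4 6 8
```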


Edit 2: I just tested a whole bunch of cables. The Apple cable and the OWC cable perform nearly identically. The Belkin cable and the no-name cable perform slightly worse, but we're talking imperceptible differences. Looks like I'm going to order some more OWC cables. Amazing value on these things.
 
WOW! Good catch :)

I tried pinning the Thunderbolt IRQs to P cores and their hyperthreaded siblings, and the results are amazing...

Code:
root@nucerone2:~# iperf3 -c 10.0.0.81 -fg -t10 -bidir
Connecting to host 10.0.0.81, port 5201
[  5] local 10.0.0.82 port 57162 connected to 10.0.0.81 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.01 GBytes  25.9 Gbits/sec    3   2.43 MBytes      
[  5]   1.00-2.00   sec  3.04 GBytes  26.1 Gbits/sec    7   2.43 MBytes     
[  5]   2.00-3.00   sec  3.04 GBytes  26.1 Gbits/sec    0   2.43 MBytes      
[  5]   3.00-4.00   sec  3.02 GBytes  25.9 Gbits/sec    6   2.68 MBytes      
[  5]   4.00-5.00   sec  3.04 GBytes  26.1 Gbits/sec   16   2.87 MBytes      
[  5]   5.00-6.00   sec  3.04 GBytes  26.1 Gbits/sec    4   2.06 MBytes      
[  5]   6.00-7.00   sec  3.06 GBytes  26.2 Gbits/sec    0   2.06 MBytes      
[  5]   7.00-8.00   sec  3.04 GBytes  26.1 Gbits/sec    0   2.06 MBytes      
[  5]   8.00-9.00   sec  3.02 GBytes  26.0 Gbits/sec    1   2.06 MBytes      
[  5]   9.00-10.00  sec  3.06 GBytes  26.3 Gbits/sec    0   2.06 MBytes      
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  30.4 GBytes  26.1 Gbits/sec   37             sender
[  5]   0.00-10.00  sec  30.4 GBytes  26.1 Gbits/sec                  receiver

I don't think you should be worried about using shared cores for it. Even on the i3-1220P CPU I have, I've seen max utilization of 40% (that was iperf3) and around 10% for ksoftirqd on one P core. The CPU you have has much more powerful P cores, so that shouldn't be an issue at all. Also, from the looks of it, it seems to be a single-threaded process (or maybe that's because iperf3 itself is single-threaded).

I assume IRQs can change on every reboot or kernel update, so there would have to be some script that runs at startup and dynamically updates the affinity for the Thunderbolt IRQs.
 
Wow, thank you for finding this @anaxagoras

@scyto you should definitely have a look at this and add it to your GitHub gist; this instantly bumps my iperf3 tests to a solid 26 Gbit/s with very low retransmits, even with my small i3!

so there would have to be some script that will run at startup
I created an rc.local to do so:
Bash:
#!/bin/bash
for id in $(grep 'thunderbolt' /proc/interrupts | awk '{print $1}' | cut -d ':' -f1); do
    echo 0f > /proc/irq/$id/smp_affinity
done
Don't forget to make it executable.


Edit (Added iperf):
Code:
Connecting to host fc00::1, port 5201
[  5] local fc00::2 port 49652 connected to fc00::1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.03 GBytes  26.0 Gbits/sec    0   1.81 MBytes       
[  5]   1.00-2.00   sec  3.03 GBytes  26.1 Gbits/sec    9   2.37 MBytes       
[  5]   2.00-3.00   sec  3.02 GBytes  25.9 Gbits/sec    0   2.37 MBytes       
[  5]   3.00-4.00   sec  3.02 GBytes  26.0 Gbits/sec    0   2.37 MBytes       
[  5]   4.00-5.00   sec  3.06 GBytes  26.3 Gbits/sec    0   2.37 MBytes       
[  5]   5.00-6.00   sec  3.04 GBytes  26.1 Gbits/sec    1   2.43 MBytes       
[  5]   6.00-7.00   sec  3.03 GBytes  26.0 Gbits/sec    0   2.43 MBytes       
[  5]   7.00-8.00   sec  2.93 GBytes  25.1 Gbits/sec   43   2.56 MBytes       
[  5]   8.00-9.00   sec  2.95 GBytes  25.3 Gbits/sec   60   2.50 MBytes       
[  5]   9.00-10.00  sec  3.03 GBytes  26.0 Gbits/sec   29   1.87 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  30.1 GBytes  25.9 Gbits/sec  142             sender
[  5]   0.00-10.00  sec  30.1 GBytes  25.9 Gbits/sec                  receiver

iperf Done.
 
Pretty close to what I did, but running this on boot only doesn't work. On my system I've seen the interrupt addresses change when Thunderbolt cables were plugged and unplugged. So I'm thinking it might make more sense to make this part of the udev rule that scyto set up to bring up the Thunderbolt network adapters en0[56].


Here's what I have so far that I added to /usr/local/bin/pve-en0[56].sh which is called from the udev rule /etc/udev/rules.d/10-tb-en.rules from @scyto 's guide.

Code:
grep thunderbolt /proc/interrupts | cut -d ":" -f1 | xargs -I {} sh -c 'echo 0-11 | tee "/proc/irq/{}/smp_affinity_list"'

As an aside, I found smp_affinity_list, which is more human-readable than figuring out the appropriate bitmask.
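For illustration, smp_affinity_list takes the kernel's cpulist syntax (e.g. "0,2,4-6"). A tiny helper sketch (the function name is mine, not a kernel tool) that expands such a list into individual IDs:

```shell
# Expand a kernel-style cpulist (as accepted by smp_affinity_list)
# into individual CPU IDs.
expand_cpulist() {
    echo "$1" | tr ',' '\n' | while IFS=- read -r a b; do
        seq "$a" "${b:-$a}"
    done | tr '\n' ' ' | sed 's/ $//'
}
expand_cpulist "0,2,4-6"   # -> 0 2 4 5 6
```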

EDIT: I hate spamming multiple posts in a row. I think I'm on the right track, because all the IRQs aren't assigned until after the interface is up. I assume this is partially for RX/TX queues. However, it looks like the udev rule I mentioned isn't being processed on my system at all, so it's not actually working. I added a line to "touch /tmp/en05" and create an empty file, and nothing is happening.

EDIT2: My udev issue was a non-issue; udev doesn't have the same PATH variable my terminal login has. That said, this didn't quite work: on reboot it only set affinity on a few IRQs, but not all. My guess is that the interface wasn't fully up; even adding a 2-second sleep made no impact.
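Since udev runs RUN+= programs with a minimal environment (including a minimal PATH), the scripts should be invoked by absolute path. A hedged sketch of the shape such a rule takes — the match keys and paths are assumptions based on the file names mentioned above, not a copy of the guide's actual rule:

```
# /etc/udev/rules.d/10-tb-en.rules -- sketch only; match keys are assumptions.
# udev gives RUN+= programs a minimal environment, so use absolute paths.
ACTION=="move", SUBSYSTEM=="net", KERNEL=="en05", RUN+="/usr/local/bin/pve-en05.sh"
ACTION=="move", SUBSYSTEM=="net", KERNEL=="en06", RUN+="/usr/local/bin/pve-en06.sh"
```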
 
We could run the script when the en05 and en06 interfaces come up; check my response on scyto's gist. I'm using /etc/network/if-up.d/ to restart the frr service, so I assume adding the affinity script there would work, because it only runs when the interface comes up.

One disadvantage is that it will run the script twice on reboot. But it would solve the unplug/replug issue with changing IRQs, because the affinity would get updated after the interface is up.
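A minimal sketch of what such an if-up.d hook could look like. The file name is hypothetical, en05/en06 are the interface names from scyto's guide, and ifupdown exports $IFACE to these scripts:

```shell
#!/bin/sh
# Hypothetical /etc/network/if-up.d/thunderbolt-affinity -- a sketch only.
# ifupdown exports $IFACE to these hooks.
tb_iface() {
    case "$1" in
        en05|en06) return 0 ;;
        *)         return 1 ;;
    esac
}

if tb_iface "${IFACE:-}"; then
    # Pin every thunderbolt IRQ to the P cores (0-11 here; adjust per CPU).
    for id in $(grep thunderbolt /proc/interrupts | awk '{print $1}' | cut -d: -f1); do
        echo 0-11 > "/proc/irq/$id/smp_affinity_list" 2>/dev/null || true
    done
fi
```

Don't forget to make it executable; ifupdown runs every executable in if-up.d each time an interface comes up, which also covers the unplug/replug case.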
 
Maybe calling the scripts as post-up in the /etc/network/interfaces file would fit best.

But I'm still struggling with interfaces that won't come up on a node reboot until I manually run "ifup en0[56]" on the other (attached) host.
 
Putting a post-up script by itself is not working on boot, but does work if I restart networking. Since I'm testing things remotely, I can't check whether it helps with physically plugging cables in.

I also tried @dovh 's method with an if-up.d script, and this did work on a reboot! Tonight I'll test unplugging/replugging a cable and see if it still works. I've got a good feeling about it being fine.

I feel like this isn't the cleanest solution, in that I won't remember it in a couple of years. I'd really like to avoid creating yet another config file, and would rather do this through a file I've already had to touch.
 
The script in if-up.d runs on reboot or when I unplug/replug the cable; it's tied to the en0x interfaces' up state, so if an interface comes UP and it's one of the two, it runs the script. I tested this yesterday and it was working as expected, but what I noticed is that sometimes after a reboot one of the interfaces did not come up on the node while the other one was fine. I assume it's a similar issue to what @rene.bayer is having.
 
So cable pulls worked fine. But I'm having the same problem of frr restarting too early as a post-up command and not surviving a reboot, so I tried your if-up script and I'm having the same issue of only one interface coming up on boot.
 
