Let me start off by saying thanks to all who have figured most of this out. Without you I wouldn't even know where to start.
I have a 3-node cluster of NUC13 i7-1360P machines running PVE 8.2.2, BIOS ANRPL357.0031.2024.0207.1420, with add-on 2.5GbE and the 0.3 m OWC TB4 cables.
I've spent the past week and a half learning and following everyone's advice. I ended up ditching OpenFabric and dual stack, and now have a working IPv6 OSPFv3 config. iperf3 gets about 26Gbps, but at that rate I see interrupts averaging about 400 per second. I don't know that this even matters, because when I cap iperf3 with -b10000M, for example, the interrupt count is negligible, so it only jumps when the link is pushed to the limit. What's a realistic Ceph load on these anyway?! I've tried the smp_affinity_list script to restrict the thunderbolt interrupts to cores 0-7, which are the P-cores on these, and saw no difference whatsoever in the interrupt count.
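In case anyone wants to compare interrupt numbers the same way, here is a rough Python sketch of how the per-second rate can be measured: take two snapshots of /proc/interrupts and divide the difference by the interval. The function names are my own, and it assumes the relevant IRQ lines contain the string "thunderbolt", which may differ on your kernel.

```python
import time


def parse_irq_counts(text, needle="thunderbolt"):
    """Return {irq: total count across CPUs} for lines mentioning `needle`.

    Assumption: the IRQ lines of interest contain the string "thunderbolt".
    """
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row is "CPU0 CPU1 ..."
    totals = {}
    for line in lines[1:]:
        fields = line.split()
        if len(fields) < 2 or needle not in line:
            continue
        irq = fields[0].rstrip(":")
        # The per-CPU counters follow the IRQ label, one column per CPU.
        counts = [int(f) for f in fields[1:1 + ncpu] if f.isdigit()]
        totals[irq] = sum(counts)
    return totals


def irq_rate(before, after, seconds):
    """Interrupts per second for each IRQ between two snapshots."""
    return {irq: (after[irq] - before.get(irq, 0)) / seconds for irq in after}


def sample(interval=1.0, path="/proc/interrupts"):
    """Read the live file twice, `interval` seconds apart (Linux only)."""
    with open(path) as f:
        before = parse_irq_counts(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_irq_counts(f.read())
    return irq_rate(before, after, interval)
```

Calling sample(5.0) on the receiving node while iperf3 is running should give a per-IRQ rate you can post next to your throughput number.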
Here's the big question. I see you all posting your iperf results point-to-point. When I pull the thunderbolt cable between two nodes, so there's no longer a direct link and traffic has to route through the 3rd NUC, the results are abysmal. It goes from 26Gbps to 20Mbps. What are your results?
Also, CPU usage. I noticed the sending node doesn't use much CPU when maxing out iperf, but the receiving side has one CPU pegged at 100%. Capping the iperf test at 10Gbps brings that down to a more reasonable 15%. What are you seeing using OSPFv2, OSPFv3, or OpenFabric?
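For anyone who wants to report per-core numbers in a comparable way, below is a small Python sketch that computes busy percentage per core from two snapshots of /proc/stat. The helper names are mine, and it assumes the usual field order on a cpuN line (user, nice, system, idle, iowait, irq, softirq, steal), counting idle plus iowait as not busy.

```python
import time


def parse_cpu_times(text):
    """Return {cpuN: (total_jiffies, idle_jiffies)} from /proc/stat text."""
    times = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the aggregate "cpu" row and every non-CPU row.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # user, nice, system, idle, iowait, irq, softirq, steal
        values = [int(v) for v in fields[1:9]]
        idle = values[3] + values[4]  # idle + iowait
        times[fields[0]] = (sum(values), idle)
    return times


def busy_percent(before, after):
    """Percentage of non-idle time per core between two snapshots."""
    result = {}
    for cpu, (total_after, idle_after) in after.items():
        total_before, idle_before = before[cpu]
        total_delta = total_after - total_before
        idle_delta = idle_after - idle_before
        if total_delta <= 0:
            result[cpu] = 0.0
        else:
            result[cpu] = 100.0 * (total_delta - idle_delta) / total_delta
    return result


def sample(interval=1.0, path="/proc/stat"):
    """Read the live file twice, `interval` seconds apart (Linux only)."""
    with open(path) as f:
        before = parse_cpu_times(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_times(f.read())
    return busy_percent(before, after)
```

Running sample() on the receiver during an iperf3 test should make it obvious whether a single core is saturated while the rest sit idle.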