Really? I ran into it every time on a network with at least 10 GbE. SSH encryption is single-threaded, so it'll be bound by your CPU: the faster the single-core performance of your CPU, the higher the throughput.
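If you want to sanity-check that per-core ceiling on your own hardware, a rough single-stream test is to push zeroes through ssh and watch the rate (remotehost is a placeholder, and pv is optional):

    # prints the achieved rate when dd finishes
    dd if=/dev/zero bs=1M count=8192 | ssh remotehost 'cat > /dev/null'
    # or, with a live rate display; interrupt with Ctrl-C when done
    pv /dev/zero | ssh remotehost 'cat > /dev/null'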
Did you try tweaking this with the permitted ciphers (and their order) set to something custom on both ends? With AES-NI it should not be capped at 500 MB/s per core. I'm just asking here, I haven't tested it myself.
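For what it's worth, the kind of tweak I'd try looks roughly like this; the dataset names and host are placeholders, and which cipher wins depends on the CPU (AES-GCM vs chacha20-poly1305 is the usual comparison):

    # see which ciphers your OpenSSH build supports
    ssh -Q cipher
    # pin a specific cipher for one transfer (client side)
    zfs send pool/dataset@snap | ssh -c aes128-gcm@openssh.com remotehost zfs receive tank/dataset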
This limit is nowadays in the 300-500 MB/s range, but depending on the storage used, it'll be the bottleneck for most high-end systems. You can run multiple ssh streams, which will combine the throughput until you run into another bottleneck. On a private network, you can also just run socat or nc to get much higher transfer rates with send/receive. Maybe WireGuard can get better throughput if you run over a non-private network, but I haven't tried that.
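For the private-network case, a minimal sketch of the nc/socat variant (flags differ between netcat implementations, and all names are placeholders):

    # receiving side: listen, pipe into zfs receive
    nc -l 9000 | zfs receive tank/dataset
    # sending side (some nc variants need -N to close the socket on EOF)
    zfs send pool/dataset@snap | nc receiver-host 9000

    # socat equivalent, unidirectional
    socat -u TCP-LISTEN:9000,reuseaddr STDOUT | zfs receive tank/dataset
    zfs send pool/dataset@snap | socat -u STDIN TCP:receiver-host:9000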
I would usually run that over something else doing IPsec in between the two ends. You can often squeeze out a lot by "lowering" the HMAC to something like SHA1, because with a good enough cipher it doesn't really compromise anything: by the time the integrity of the data could be attacked, the transmission is already over.
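As a rough illustration of the HMAC part, in strongSwan-style ipsec.conf syntax (addresses are made up and the PSK setup is omitted, so treat it as a sketch rather than a working config):

    conn bulk-transfer
        left=192.0.2.1
        right=192.0.2.2
        authby=psk
        # pick AES with SHA1 integrity instead of SHA2-256/512
        esp=aes256-sha1!
        auto=start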
it's pretty much 500 MB/s on a reasonably modern system, yes. like I said, unless the disks on both ends and the network in between are very fast, it won't be the bottleneck (500 MB/s is already half of what you effectively get out of a 10Gbit line!). note that zfs send/recv itself is also single-threaded, so the stream it generates won't reach your theoretical max storage speed either. but sure, if you are on a private network, with all-flash or massively parallel storage, and 10Gbit or higher, you can benefit from using a more transparent transport layer.
I never realised zfs send/receive is single core, but I totally had over 20Gbps (no ssh there, but mbuffer) between zfs send/receive on a measly Core i3-1115G4 (over Thunderbolt point to point), which I believe was limited by the TB link.
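For anyone curious, the mbuffer pattern I mean looks roughly like this (buffer sizes, port, and names are just examples):

    # receiver: listen on a TCP port, buffer in RAM, feed zfs receive
    mbuffer -s 128k -m 1G -I 9090 | zfs receive tank/dataset
    # sender: zfs send into mbuffer, streamed straight to the receiver
    zfs send pool/dataset@snap | mbuffer -s 128k -m 1G -O receiver-host:9090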