It's been a while and this is still an issue. In the meantime I've discovered that disabling PCIe power management in /etc/kernel/cmdline (pcie_aspm=off) also stops this log spam, with the benefit of not needing manual intervention on each boot.
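For anyone wanting to confirm the parameter actually took effect, here is a minimal sketch. The helper name has_kernel_param is my own, not part of any tool. The hint in the else branch assumes the host boots via proxmox-boot-tool, which is the setup where /etc/kernel/cmdline is used; GRUB-booted hosts set the kernel command line elsewhere.

```shell
# Report whether a kernel parameter is present on a kernel command line.
# Usage: has_kernel_param <param> [cmdline-file]
# has_kernel_param is a hypothetical helper name used for illustration.
has_kernel_param() {
    param=$1
    file=${2:-/proc/cmdline}
    [ -r "$file" ] || return 1
    # Split the command line on whitespace and look for an exact match.
    tr -s ' \t' '\n' < "$file" | grep -Fxq -- "$param"
}

if has_kernel_param pcie_aspm=off; then
    echo "pcie_aspm=off is active on the running kernel"
else
    # Assumption: the host is managed by proxmox-boot-tool.
    echo "pcie_aspm=off is not active; add it to /etc/kernel/cmdline, run 'proxmox-boot-tool refresh' and reboot"
fi
```

The check reads /proc/cmdline, so it reflects the kernel that is currently running, not what is written in the config file.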
Just a note that both intel_idle and intel_pstate are only active on Intel CPUs (as per the names), so those commands won't do anything on your EPYC server. 'processor.max_cstate=2' might be what you want.
The usecs parameters specify how many microseconds to wait after at least one packet is received/transmitted before generating an interrupt. In my case the default is 3. You can check yours by running ethtool -c <interface> on a fresh boot and reading the output before you disable it.
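If you want to record the default before changing anything, a small sketch like the following pulls a single value out of the ethtool -c output. The function name coalesce_value is made up for this post, and it assumes the usual "name: value" layout that ethtool prints for coalesce settings.

```shell
# Print the value of one coalescing setting from `ethtool -c` output on stdin.
# coalesce_value is a hypothetical helper name; it assumes "name: value" lines.
# Usage: ethtool -c <interface> | coalesce_value rx-usecs
coalesce_value() {
    # Match the first field exactly (e.g. "rx-usecs:") and print the second.
    awk -v key="$1:" '$1 == key { print $2; exit }'
}
```

Saving the result somewhere (for example a note in your host documentation) makes it easy to restore the original setting later.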
As this is the first thread on the issue with a staff reply I thought I'd add the following:
It appears that it's 'fixed' not by disabling interrupt coalescing as such but by toggling it; it can safely be turned back on again after being disabled. Also, as noticed by 'ajeffco' in another thread on...
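A rough sketch of the toggle, for illustration only. The function name toggle_coalescing and the DRY_RUN switch are my own inventions. It assumes the original rx-usecs value is non-zero (as far as I know ethtool refuses a change that modifies nothing) and that it is run as root.

```shell
# Toggle RX interrupt coalescing: disable it, then restore the previous value.
# toggle_coalescing and DRY_RUN are hypothetical names used for illustration.
# Usage: toggle_coalescing <interface> <original-rx-usecs>
# Set DRY_RUN=1 to print the commands instead of running them.
toggle_coalescing() {
    iface=$1
    original=$2
    # First pass sets rx-usecs to 0 (off), second pass restores the original.
    for value in 0 "$original"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "ethtool -C $iface rx-usecs $value"
        else
            ethtool -C "$iface" rx-usecs "$value" || return 1
        fi
    done
}
```

Running it with DRY_RUN=1 first shows exactly which ethtool commands would be issued before anything on the card is touched.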
None of the features are disabled or limited. There are two major differences: support and the enterprise repository.
Support should be self-explanatory: should you have an issue without at least a "basic" subscription, your only avenue for help is community settings such as this forum.
I haven't had any issues since turning off interrupt coalescing. The expected side effect of doing this is a slight increase in CPU usage, as there will be more interrupts. In my environment this is immeasurably small.
I have the same issue at the moment with the 82599ES driver on Proxmox. I've found that if you disable interrupt coalescing on the affected cards, the errors stop happening.
sudo ethtool -C <iface> rx-usecs 0