softirq squeezed (dropped) problem

Spiros Pap

Aug 1, 2017
Hi all,

I have installed netdata on my Proxmox nodes and I believe there is a problem.

I am getting about 2800 squeezed (dropped) events per 10 min in the system.softnet_stat chart. This is the number of times, during the last 10 min, that ksoftirq ran out of sysctl net.core.netdev_budget or net.core.netdev_budget_usecs with work remaining (this can be a cause of dropped packets).
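
For reference, the same counter can be read without netdata. Below is a minimal sketch, assuming the usual layout of /proc/net/softnet_stat (one row per CPU, hexadecimal columns, with processed, dropped and time_squeeze as the first three). The function names are my own and hypothetical, not part of netdata or the kernel.

```python
from pathlib import Path


def parse_softnet_stat(text):
    """Decode the first three hex counters of every row.

    Assumption: column 0 = processed, column 1 = dropped,
    column 2 = time_squeeze (what netdata reports as "squeezed").
    """
    rows = []
    for row, line in enumerate(text.splitlines()):
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        rows.append({
            "row": row,  # row index; usually matches the CPU number
            "processed": int(fields[0], 16),
            "dropped": int(fields[1], 16),
            "squeezed": int(fields[2], 16),
        })
    return rows


def total_squeezed(text):
    """Sum the squeezed counter over all rows."""
    return sum(r["squeezed"] for r in parse_softnet_stat(text))


if __name__ == "__main__":
    path = Path("/proc/net/softnet_stat")
    if path.exists():
        print("squeezed since boot:", total_squeezed(path.read_text()))
```

The counters are cumulative since boot, so a rate like "events per 10 min" would come from sampling twice and subtracting.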

The node has very low load in terms of CPU (<3%) and network bandwidth (<100 Mbit/s on 10G adapters, due to Ceph).

Do you have any suggestions about fixing this?


Thanx,
Sp
 
Hi again,

Adjusting net.core.netdev_budget_usecs to 5000 (from 2000), along with netdev_budget and netdev_max_backlog, solved the problem. I would still like an explanation, though, since I doubt that the kernel default is faulty.
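
For anyone who wants to inspect or reproduce the change, here is a rough sketch that reads the three settings from /proc/sys and renders a sysctl.d-style fragment. Only the 5000 value for netdev_budget_usecs comes from this post; the helper names are hypothetical, and the values for netdev_budget and netdev_max_backlog are deliberately left out because they are not stated here.

```python
from pathlib import Path

# The three settings mentioned in this post.
SYSCTL_KEYS = (
    "net.core.netdev_budget",
    "net.core.netdev_budget_usecs",
    "net.core.netdev_max_backlog",
)


def sysctl_path(key, root="/proc/sys"):
    """Map a dotted sysctl key to its file under /proc/sys.

    The simple dot-to-slash mapping is fine for the net.core keys above.
    """
    return Path(root) / key.replace(".", "/")


def read_sysctl(key, root="/proc/sys"):
    """Return the current value as a string, or None if it is unreadable."""
    try:
        return sysctl_path(key, root).read_text().strip()
    except OSError:
        return None


def render_sysctl_conf(settings):
    """Render 'key = value' lines in the format used by sysctl.d files."""
    return "".join(f"{key} = {value}\n" for key, value in settings.items())


if __name__ == "__main__":
    for key in SYSCTL_KEYS:
        print(key, "=", read_sysctl(key))
    # Only the value given in this post; add the others as needed.
    print(render_sysctl_conf({"net.core.netdev_budget_usecs": 5000}), end="")
```

The rendered line could be placed in a file under /etc/sysctl.d/ to make the setting persistent, assuming the node uses the standard sysctl.d mechanism.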

I have the impression (and this is mostly a wild guess) that the problem is related to the behaviour of the i40e driver (ixgbe does not have this problem). The driver does not trigger an interrupt if too few packets (fewer than 4, I think) are in the input queue, which means that by the time the kernel actually processes the packets, some time may have passed since they were received. So the problem might go away once the ports carry enough traffic. I haven't tested this, though.

I would really like a more educated opinion on this, since the i40e driver is very common and I believe that the problem is also common among 10G/i40e systems.



Thanx,
Sp
 
