Hello.
After years of network tuning on many hypervisors such as OpenStack, Proxmox VE, and now also Red Hat/Oracle Virtualization (in general, the Linux-based ones), I can conclude that there is a constant need for tuning in the network section of hypervisors. Here I am talking about situations with multiple 10Gbps+ interfaces under high network utilization.
The same situation occurs in commercial hypervisors like VMware.
So I am talking about network-intensive load on multiple 10Gbps+ interfaces, because in that situation the hypervisor gets overwhelmed first and the load then propagates into the corresponding VMs.
In that situation a VM will not be able to process all the network traffic and, in the worst case, will start to drop it.
As 10Gbps Ethernet is common nowadays, 25Gbps+ Ethernet is used more and more, and even faster Ethernet is coming (50/100Gbps+), the situation will only get worse with regard to the packet-processing capabilities of non-tuned hypervisors.
My proposal is to add at least a few of the most common options to the GUI, on the network interfaces tab (similar to the "Extra CPU Flags" option in the VM CPU tab/configuration).
My proposed options under the "Advanced" tab for a network interface are:
1. Multiqueue is used for parallel packet processing of TX+RX on specific server NICs. Multiqueue allows network interface cards (NICs) to use multiple transmit and receive queues. Each queue can be assigned to a different CPU core, which allows parallel processing of network traffic.
Info with
Code:
ethtool -l eth0
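If exposed in the GUI, such an option could, for example, boil down to something like the following sketch on the host side (eth0 and the queue count of 8 are placeholder values; the NIC driver has to support that many channels):
Code:
ethtool -L eth0 combined 8
On the VM side, Proxmox VE already allows setting a multiqueue count per virtio network device via the queues= property of the netX configuration.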
2. RSS is used also for parallel packet processing, but only for RX on specific server NICs. RSS works by hashing incoming packet headers (typically using information like the IP address and port) to distribute the workload evenly across multiple receive queues. Each queue is then processed by a different CPU core.
Info with
Code:
ethtool -x eth0
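As a sketch of what a GUI toggle could apply (again, eth0 and 8 RX queues are placeholders), the RSS indirection table can be spread evenly across the first 8 queues with:
Code:
ethtool -X eth0 equal 8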
3. LRO is used for aggregating RX traffic at the NIC (hardware). It offloads the task of aggregating incoming TCP packets into larger chunks, reducing the number of interrupts and the load on the CPU. However, LRO only works with TCP traffic and is hardware-dependent, meaning it may not be supported by all NICs.
Info with
Code:
ethtool -k eth0 | grep large-receive-offload
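A checkbox for this could translate to something like the following (eth0 is a placeholder; note that LRO is often recommended to be kept off on interfaces that forward traffic, e.g. bridges, which is the common hypervisor case):
Code:
ethtool -K eth0 lro on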
4. GRO aggregates (merges) small packets into larger ones for RX traffic (generic). GRO is a software-based technique in the Linux kernel that performs a similar function to LRO, but it is more flexible.
Info with
Code:
ethtool -k eth0 |grep generic-receive-offload
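A GUI toggle for this could simply map to (eth0 as a placeholder):
Code:
ethtool -K eth0 gro on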
5. TSO reduces the number of outgoing packets by segmenting them at the NIC level (hardware). TSO is a hardware-based feature where the task of segmenting large TCP packets into smaller ones (suitable for transmission over the network) is offloaded to the network interface card (NIC). This reduces CPU load, since the NIC handles packet segmentation, improving performance for TCP traffic.
Info with
Code:
ethtool -k eth0 |grep tcp-segmentation-offload
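Enabling/disabling it could map to something like (eth0 as a placeholder):
Code:
ethtool -K eth0 tso on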
6. GSO reduces the number of outgoing packets by segmenting them (generic). GSO performs the segmentation in software but still provides efficiency gains by reducing the overhead of handling many small packets.
Info with
Code:
ethtool -k eth0 |grep generic-segmentation-offload
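Similarly, as a sketch (eth0 as a placeholder):
Code:
ethtool -K eth0 gso on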
7. Checksum offload shifts checksum calculations to the NIC hardware.
Info with
Code:
ethtool -k eth0 |grep checksumming
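As a sketch (eth0 as a placeholder; rx/tx are ethtool's short names for RX/TX checksum offload):
Code:
ethtool -K eth0 rx on tx on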
8. Scatter-gather allows sending data from multiple memory regions at once. Scatter-gather is a technique that allows the NIC to send or receive data from multiple non-contiguous memory buffers in a single operation, instead of copying the data into a single contiguous buffer before transmission. This reduces CPU overhead and memory copies, which makes data transmission and reception more efficient.
Info with
Code:
ethtool -k eth0 |grep scatter-gather
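As a sketch (eth0 as a placeholder):
Code:
ethtool -K eth0 sg on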
9. Optimization of txqueuelen, which adjusts the size of the NIC's transmit queue. txqueuelen sets the length of the transmit (TX) queue in the Linux networking stack for a network interface card (NIC). This queue holds packets waiting to be sent by the NIC. A longer TX queue allows more packets to be queued for transmission, which can smooth out traffic spikes and prevent packet drops under high load. However, too long a queue can introduce higher latency and bufferbloat (where excessive buffering leads to longer delays).
To show the current value (the most common default is 1000):
Code:
ip link show INTF-NAME
The qlen value should/may be changeable:
- Per physical interface.
- Per Bond/Bridge interface.
- Per all TAP interfaces (global option).
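A sketch of what such an option could apply (eth0 and the value 10000 are placeholders; the appropriate value depends on link speed and workload):
Code:
ip link set dev eth0 txqueuelen 10000
For bond/bridge/TAP interfaces the same command would just target the corresponding interface name.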
We could also hold a vote on other options necessary for high network throughput and packet-processing speeds.
The complexity of the Linux network stack can be visualized in this diagram:
https://doi.org/10.5281/zenodo.12723600
P.S. Already commented on:
https://forum.proxmox.com/threads/p...-persistent-across-reboots-and-updates.104827
P.P.S. Enhancement/feature proposal opened in Bugzilla: https://bugzilla.proxmox.com/show_bug.cgi?id=5802