Packets dropped on LXC

Paspao

Hello,

I am asking for suggestions on how to troubleshoot packets being dropped on an LXC.

The Proxmox host has a public IP, plus a different public IP range routed onto the bridge.

The bridge runs on top of a Linux bond.

Ping between the Proxmox host IP (on the same bond/bridge) and other servers works without packet loss.

Ping and mtr between the LXC and remote hosts show high packet loss.
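
For reference, this is roughly how the loss can be measured from both sides (the target address 203.0.113.1 and the container ID 226 are placeholders):

Bash:
# From the Proxmox host (203.0.113.1 is a placeholder remote host)
ping -c 100 -i 0.2 203.0.113.1
mtr -rwc 100 203.0.113.1

# From inside the container (226 is a placeholder CTID)
pct exec 226 -- ping -c 100 -i 0.2 203.0.113.1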

Code:
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:9006633 errors:0 dropped:12304932 overruns:0 frame:0
          TX packets:23998 errors:0 dropped:161 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
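
To see where the drop counters grow over time, something like the following can be run (interface names are assumptions; adjust to the actual setup):

Bash:
# Host side: drop counters for the container's veth, the bridge and the bond
watch -n 1 'ip -s link show dev veth226i0; ip -s link show dev vmbr0; ip -s link show dev bond0'

# Inside the container: the same counters for its eth0
pct exec 226 -- ip -s link show dev eth0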

netstat -s

Code:
Ip:
    1085089 total packets received
    0 forwarded
    0 incoming packets discarded
    36914 incoming packets delivered
    32904 requests sent out
Icmp:
    646 ICMP messages received
    20 input ICMP message failed.
    ICMP input histogram:
        destination unreachable: 25
        echo requests: 537
        echo replies: 84
    704 ICMP messages sent
    0 ICMP messages failed
    ICMP output histogram:
        destination unreachable: 74
        echo request: 93
        echo replies: 537
IcmpMsg:
        InType0: 84
        InType3: 25
        InType8: 537
        OutType0: 537
        OutType3: 74
        OutType8: 93
Tcp:
    1466 active connections openings
    354 passive connection openings
    37 failed connection attempts
    20 connection resets received
    3 connections established
    28097 segments received
    27082 segments send out
    1779 segments retransmited
    68 bad segments received.
    3413 resets sent
    InCsumErrors: 28
Udp:
    2782 packets received
    80 packets to unknown port received.
    53 packet receive errors
    3450 packets sent
    InCsumErrors: 53
    IgnoredMulti: 5256
UdpLite:
TcpExt:
    35 resets received for embryonic SYN_RECV sockets
    260 TCP sockets finished time wait in fast timer
    339 delayed acks sent
    Quick ack mode was activated 17 times
    7470 packet headers predicted
    2884 acknowledgments not containing data payload received
    5892 predicted acknowledgments
    2 times recovered from packet loss by selective acknowledgements
    Detected reordering 1 times using SACK
    9 congestion windows recovered without slow start by DSACK
    293 congestion windows recovered without slow start after partial ack
    2 fast retransmits
    1 retransmits in slow start
    510 other TCP timeouts
    TCPLossProbes: 1051
    TCPLossProbeRecovery: 99
    32 DSACKs sent for old packets
    1 DSACKs sent for out of order packets
    622 DSACKs received
    64 connections reset due to unexpected data
    9 connections reset due to early user close
    18 connections aborted due to timeout
    TCPDSACKIgnoredNoUndo: 546
    TCPSackShiftFallback: 70
    TCPRcvCoalesce: 613
    TCPOFOQueue: 217
    TCPOFOMerge: 1
    TCPChallengeACK: 42
    TCPSYNChallenge: 40
    TCPSynRetrans: 387
    TCPOrigDataSent: 16629
IpExt:
    InBcastPkts: 5256
    InOctets: 859622789
    OutOctets: 3159784
    InBcastOctets: 1367496
    InNoECTPkts: 1085204
    InECT0Pkts: 14
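
Since InCsumErrors shows up under both Tcp and Udp above, one thing that may be worth checking (just a sketch, assuming the physical NIC is eno1) is whether hardware checksum offload plays a role:

Bash:
# Current offload settings on the physical NIC (eno1 is an assumption)
ethtool -k eno1 | grep -iE 'checksum|segmentation|gro|gso|tso'

# As a temporary experiment only: disable RX checksum offload and GRO,
# then watch whether the drop / csum counters keep growing
ethtool -K eno1 rx off gro off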

The issue seems to start when more LXCs are active: when only one LXC is up, the packet loss stops.

I am not a networking expert; how would you suggest investigating the cause of the packet drops?

Thank you.
P.
 
I tried removing the bond, but I still see packet loss, and only in the LXC!

Netstat output:

Code:
Kernel Interface table
Iface      MTU    RX-OK     RX-ERR  RX-DRP   RX-OVR  TX-OK    TX-ERR  TX-DRP   TX-OVR  Flg
eno1       1500   15228877  0       193      0       200224   0       0        0       BMRU
lo         65536  278432    0       255      0       278432   0       0        0       LRU
veth226i   1500   19183     0       652      0       9482050  0       5250379  0       BMRU

The ethtool statistics show an increasing rx_queue_0_csum_err counter.
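
To watch that counter live, something like this can be used (eno1 is an assumption):

Bash:
# Watch the per-queue checksum error counters on the NIC
watch -n 1 "ethtool -S eno1 | grep -i csum"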

The odd thing is that there is no packet loss with 2 or 3 LXCs running (with no internal load), but the loss increases when running up to 30 of them (still with no load on the server).

I have another cluster with similar hardware and configuration that shows no issues.

Does anyone have a hint?

Thank you
P.
 
The issue seems to be related to the NIC.

In the other cluster (which works well) I have a BCM5720, and in /proc/interrupts I see 1 TX channel and 4 RX channels assigned to different CPU cores.

On the server with packet loss and the I350, I see only a single TxRx queue assigned to a single CPU.
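
This is roughly how the queue-to-CPU mapping can be checked (eno1 is an assumption):

Bash:
# IRQ-to-CPU mapping for the NIC queues
grep eno1 /proc/interrupts

# Which CPUs each of those IRQs is allowed to run on
for irq in $(grep eno1 /proc/interrupts | awk -F: '{print $1}'); do
    echo "IRQ $irq -> CPUs $(cat /proc/irq/$irq/smp_affinity_list)"
done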

ethtool -l eno1 on the BCM5720:

Code:
Pre-set maximums:
RX:             4
TX:             4
Other:          0
Combined:       0
Current hardware settings:
RX:             4
TX:             1
Other:          0
Combined:       0


On the server with packet loss I have an Intel I350, where ethtool -l eno1 shows:

Code:
Pre-set maximums:
RX:             16
TX:             16
Other:          1
Combined:       16
Current hardware settings:
RX:             0
TX:             0
Other:          1
Combined:       1

Has anyone already solved a similar issue?

Thanks
P.
 
Setting more channels on the NIC seems to have great benefits (I am still getting about 2% packet loss with 6 channels):

Code:
ethtool -L eno1 combined 6
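
One caveat (not shown above, so treat it as an assumption about my setup): the ethtool -L setting does not survive a reboot. If eno1 is managed by ifupdown, a post-up hook in /etc/network/interfaces can re-apply it:

Code:
# /etc/network/interfaces (ifupdown is an assumption)
iface eno1 inet manual
    post-up ethtool -L eno1 combined 6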
 
I think I'm having the same issue - I have an Intel I211, and my `ethtool -l` shows:

Bash:
~ ❯ ethtool -l enp4s0         
Channel parameters for enp4s0:
Pre-set maximums:
RX:             0
TX:             0
Other:          1
Combined:       2
Current hardware settings:
RX:             0
TX:             0
Other:          1
Combined:       2

I guess using consumer hardware has bitten me in the butt.
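
That said, when the hardware tops out at two combined queues, Receive Packet Steering (RPS) should at least spread receive processing across more cores in software; a rough sketch (the CPU mask is just an example for a 4-core box, and the queue names assume the two combined queues shown above):

Bash:
# RPS: let CPUs 0-3 (mask "f") process packets from each hardware RX queue
echo f > /sys/class/net/enp4s0/queues/rx-0/rps_cpus
echo f > /sys/class/net/enp4s0/queues/rx-1/rps_cpus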
 
