Kernel warnings (stack dumps) while a VM downloads at several hundred Mbps

ispirto

Hello,

We see the following kernel warning a few times a minute while a VM is downloading at several hundred Mbit/s. I'm not sure whether it is related to the bridge or to the NIC.

Any insight?

Code:
[7960695.303208] WARNING: CPU: 1 PID: 9426 at net/core/dev.c:2422 skb_warn_bad_offload+0xd3/0x120()
[7960695.303211] ixgbe: caps=(0x00000802202043a1, 0x0000000000000000) len=1598 data_len=1544 gso_size=1460 gso_type=5 ip_summed=0
[7960695.303212] Modules linked in: ipmi_devintf binfmt_misc xt_tcpudp act_police cls_basic sch_ingress sch_htb ip_set ip6table_filter ip6_tables iptable_filter ip_tables x_tables softdog nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nfnetlink_log nfnetlink zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) spl(O) zavl(PO) dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ipmi_ssif aesni_intel aes_x86_64 lrw gf128mul glue_helper snd_pcm ablk_helper snd_timer cryptd snd joydev input_leds soundcore shpchp pcspkr sb_edac ioatdma edac_core mei_me ipmi_si mei 8250_fintek
[7960695.303271]  i2c_i801 lpc_ich ipmi_msghandler wmi mac_hid vhost_net vhost macvtap macvlan autofs4 btrfs xor raid6_pq ses enclosure hid_generic ixgbe(O) vxlan ip6_udp_tunnel udp_tunnel usbmouse usbkbd igb(O) usbhid ahci isci dca hid libahci ptp libsas pps_core scsi_transport_sas megaraid_sas fjes
[7960695.303300] CPU: 1 PID: 9426 Comm: vhost-9400 Tainted: P        W  O    4.4.19-1-pve #1
[7960695.303302] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.2 03/04/2015
[7960695.303304]  0000000000000286 00000000ba9c4c24 ffff88181fa43948 ffffffff813f3de3
[7960695.303306]  ffff88181fa43990 ffffffff81d6782e ffff88181fa43980 ffffffff81081796
[7960695.303309]  ffff8812e3046100 ffff88300c960000 0000000000000005 ffff88300c960000
[7960695.303311] Call Trace:
[7960695.303313]  <IRQ>  [<ffffffff813f3de3>] dump_stack+0x63/0x90
[7960695.303323]  [<ffffffff81081796>] warn_slowpath_common+0x86/0xc0
[7960695.303325]  [<ffffffff8108182c>] warn_slowpath_fmt+0x5c/0x80
[7960695.303328]  [<ffffffff813f9db6>] ? ___ratelimit+0x86/0xe0
[7960695.303331]  [<ffffffff81730da3>] skb_warn_bad_offload+0xd3/0x120
[7960695.303334]  [<ffffffff8173526e>] __skb_gso_segment+0x7e/0xd0
[7960695.303336]  [<ffffffff8173560f>] validate_xmit_skb.isra.99.part.100+0x12f/0x2b0
[7960695.303338]  [<ffffffff81735bcb>] validate_xmit_skb_list+0x3b/0x60
[7960695.303342]  [<ffffffff8175b448>] sch_direct_xmit+0x138/0x220
[7960695.303344]  [<ffffffff81735f23>] __dev_queue_xmit+0x253/0x590
[7960695.303346]  [<ffffffff81736270>] dev_queue_xmit+0x10/0x20
[7960695.303350]  [<ffffffff818269d8>] br_dev_queue_push_xmit+0x88/0x150
[7960695.303353]  [<ffffffff81826ae1>] br_forward_finish+0x41/0xb0
[7960695.303355]  [<ffffffff81826950>] ? deliver_clone+0x50/0x50
[7960695.303358]  [<ffffffff81826d46>] __br_forward+0xa6/0x140
[7960695.303361]  [<ffffffff810abb07>] ? ttwu_do_wakeup+0x87/0xe0
[7960695.303363]  [<ffffffff81826aa0>] ? br_dev_queue_push_xmit+0x150/0x150
[7960695.303366]  [<ffffffff81827167>] br_forward+0x87/0x90
[7960695.303369]  [<ffffffff81828208>] br_handle_frame_finish+0x338/0x610
[7960695.303372]  [<ffffffff81222563>] ? pollwake+0x73/0x90
[7960695.303375]  [<ffffffff8176b72d>] ? nf_iterate+0x5d/0x70
[7960695.303378]  [<ffffffff8182865f>] br_handle_frame+0x17f/0x2c0
[7960695.303380]  [<ffffffff81827ed0>] ? br_handle_local_finish+0xa0/0xa0
[7960695.303383]  [<ffffffff81733430>] __netif_receive_skb_core+0x370/0xa60
[7960695.303384]  [<ffffffff810aba99>] ? ttwu_do_wakeup+0x19/0xe0
[7960695.303387]  [<ffffffff810abbfd>] ? ttwu_do_activate.constprop.89+0x5d/0x70
[7960695.303388]  [<ffffffff81733b36>] __netif_receive_skb+0x16/0x70
[7960695.303390]  [<ffffffff81734928>] process_backlog+0xa8/0x150
[7960695.303392]  [<ffffffff81734085>] net_rx_action+0x215/0x350
[7960695.303395]  [<ffffffff8108629e>] __do_softirq+0x10e/0x2a0
[7960695.303399]  [<ffffffff818561cc>] do_softirq_own_stack+0x1c/0x30
[7960695.303400]  <EOI>  [<ffffffff81085ae8>] do_softirq.part.20+0x38/0x40
[7960695.303404]  [<ffffffff8108649d>] do_softirq+0x1d/0x20
[7960695.303406]  [<ffffffff81732e13>] netif_rx_ni+0x33/0x80
[7960695.303409]  [<ffffffff816044f1>] tun_get_user+0x521/0x930
[7960695.303412]  [<ffffffff81604951>] tun_sendmsg+0x51/0x70
[7960695.303416]  [<ffffffffc00dbe40>] handle_tx+0x2f0/0x500 [vhost_net]
[7960695.303418]  [<ffffffffc00dc085>] handle_tx_kick+0x15/0x20 [vhost_net]
[7960695.303422]  [<ffffffffc010870e>] vhost_worker+0x10e/0x1b0 [vhost]
[7960695.303425]  [<ffffffffc0108600>] ? vhost_dev_reset_owner+0x50/0x50 [vhost]
[7960695.303428]  [<ffffffff810a0f4a>] kthread+0xea/0x100
[7960695.303430]  [<ffffffff810a0e60>] ? kthread_park+0x60/0x60
[7960695.303432]  [<ffffffff8185484f>] ret_from_fork+0x3f/0x70
[7960695.303434]  [<ffffffff810a0e60>] ? kthread_park+0x60/0x60
[7960695.303436] ---[ end trace 49ab5ded8d73dfe9 ]---
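
For readers trying to interpret the second line of the warning, here is a minimal sketch that pulls the numeric fields out of it and decodes two of them. It assumes the SKB_GSO_* bit layout and CHECKSUM_* values of 4.4-era kernels (other kernel versions may number the bits differently); the helper names `field`, `decode_gso_type` and `decode_ip_summed` are made up for this example.

```shell
#!/bin/sh
# Decode fields of a skb_warn_bad_offload log line.
# Assumption: bit names follow the SKB_GSO_* enum of 4.4-era kernels.

# Print the numeric value of field $2 (e.g. gso_type) found in log line $1.
field() {
    printf '%s\n' "$1" | sed -n "s/.*[[:space:]]$2=\([0-9]*\).*/\1/p"
}

# Translate a gso_type bitmask into flag names (lowest five bits only).
decode_gso_type() {
    out=""
    if [ $(( $1 & 1 ))  -ne 0 ]; then out="$out TCPV4"; fi
    if [ $(( $1 & 2 ))  -ne 0 ]; then out="$out UDP"; fi
    if [ $(( $1 & 4 ))  -ne 0 ]; then out="$out DODGY"; fi
    if [ $(( $1 & 8 ))  -ne 0 ]; then out="$out TCP_ECN"; fi
    if [ $(( $1 & 16 )) -ne 0 ]; then out="$out TCPV6"; fi
    printf '%s\n' "${out# }"
}

# Translate an ip_summed value into its CHECKSUM_* name.
decode_ip_summed() {
    case "$1" in
        0) echo CHECKSUM_NONE ;;
        1) echo CHECKSUM_UNNECESSARY ;;
        2) echo CHECKSUM_COMPLETE ;;
        3) echo CHECKSUM_PARTIAL ;;
        *) echo UNKNOWN ;;
    esac
}

# The line from the trace above.
line='ixgbe: caps=(0x00000802202043a1, 0x0000000000000000) len=1598 data_len=1544 gso_size=1460 gso_type=5 ip_summed=0'
decode_gso_type "$(field "$line" gso_type)"
decode_ip_summed "$(field "$line" ip_summed)"
```

Under those assumptions the packet is a TCPv4 GSO packet flagged DODGY (the kernel's marker for an untrusted source, which fits the tun/vhost_net frames in the call trace) with no checksum state set.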

Code:
pveversion -v
proxmox-ve: 4.4-79 (running kernel: 4.4.19-1-pve)
pve-manager: 4.4-12 (running version: 4.4-12/e71b7a74)
pve-kernel-4.4.35-2-pve: 4.4.35-79
pve-kernel-4.4.19-1-pve: 4.4.19-66
lvm2: 2.02.116-pve3
corosync-pve: 2.4.0-1
libqb0: 1.0-1
pve-cluster: 4.0-48
qemu-server: 4.0-108
pve-firmware: 1.1-10
libpve-common-perl: 4.0-91
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-73
pve-libspice-server1: 0.12.8-1
vncterm: 1.2-1
pve-docs: 4.4-3
pve-qemu-kvm: 2.7.1-1
pve-container: 1.0-93
pve-firewall: 2.0-33
pve-ha-manager: 1.0-40
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-1
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-8
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.8-pve14~bpo80
 
Hi,
"...bad_offload"? Perhaps you should try disabling TCP offload for this NIC.

And Supermicro... look for a new BIOS (they don't document which fixes a BIOS release includes - crapware).

Udo

I think I did, but it didn't help. This is what I ran:

Code:
ethtool -K eth5 rx off
ethtool -K eth5 tx off
ethtool -K eth5 sg off
ethtool -K eth5 tso off
ethtool -K eth5 ufo off
ethtool -K eth5 gso off
ethtool -K eth5 gro off
ethtool -K eth5 lro off
ethtool -K eth5 rxvlan off
ethtool -K eth5 txvlan off
ethtool -K eth5 rxhash off
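
As a side note, `ethtool -K` accepts several feature/value pairs in one call, so the commands above can be collapsed. The sketch below only prints the combined command instead of running it; the function name `offload_off_cmd` and the `FEATURES` variable are invented for this example, and taking the device as a parameter simply makes it reusable for another interface.

```shell
#!/bin/sh
# Features switched off in the post above, in the same order.
FEATURES="rx tx sg tso ufo gso gro lro rxvlan txvlan rxhash"

# Print (do not run) one ethtool call that turns every feature off on device $1.
offload_off_cmd() {
    cmd="ethtool -K $1"
    for f in $FEATURES; do
        cmd="$cmd $f off"
    done
    printf '%s\n' "$cmd"
}

# Dry run: show the command for eth5.
offload_off_cmd eth5
```

To apply it for real, the printed line would be run as root. Features the driver reports as [fixed] cannot be changed this way, so ethtool may complain about those.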

Resulting feature list, as reported by ethtool -k eth5:

Code:
Features for eth5:
rx-checksumming: off
tx-checksumming: off
    tx-checksum-ipv4: off
    tx-checksum-ip-generic: off [fixed]
    tx-checksum-ipv6: off
    tx-checksum-fcoe-crc: on [fixed]
    tx-checksum-sctp: off [fixed]
scatter-gather: off
    tx-scatter-gather: off
    tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
    tx-tcp-segmentation: off
    tx-tcp-ecn-segmentation: off [fixed]
    tx-tcp6-segmentation: off
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: off
generic-receive-offload: off
large-receive-offload: off
rx-vlan-offload: off
tx-vlan-offload: off
ntuple-filters: off
receive-hashing: off
highdma: on [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: on [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: on [fixed]
hw-tc-offload: off [fixed]
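
A long feature list like this is easier to check with a small filter. The following is a rough sketch (the function name `features_still_on` is made up) that reads `ethtool -k` output on stdin and prints only the features that are still on. Run against the list above, it should leave just the five entries marked "on [fixed]".

```shell
#!/bin/sh
# Read `ethtool -k <dev>` output on stdin and print the names of features
# whose state starts with "on" (including those marked [fixed]).
features_still_on() {
    awk -F': ' '$2 ~ /^on/ { sub(/^[ \t]+/, "", $1); print $1 }'
}

# Example using a few lines copied from the output above.
printf '%s\n' \
    'Features for eth5:' \
    'tcp-segmentation-offload: off' \
    '    tx-checksum-fcoe-crc: on [fixed]' \
    'highdma: on [fixed]' \
    'tx-fcoe-segmentation: on [fixed]' \
    | features_still_on
```

On a live host the equivalent would be `ethtool -k eth5 | features_still_on`.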

Also, the NIC is not onboard; it's an X520-DA2 add-on card. Would a BIOS update still help?

Oktay
 
