[SOLVED] WARNING: CPU: 0 PID: ... [openvswitch]

dsi

Hello,

Since upgrading to kernel 5.15.35, I get the following warning (and a second, similar one) during boot:
Code:
May 15 17:04:57 pve-2 kernel: WARNING: CPU: 0 PID: 1228 at include/net/netfilter/nf_conntrack.h:175 __ovs_ct_lookup+0x907/0xa40 [openvswitch]
May 15 17:04:57 pve-2 kernel: Modules linked in: bonding tls openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 softdog nfnetlink_log nfnetlink snd_hda_codec_hdmi snd_hda_c>
May 15 17:04:57 pve-2 kernel:  sunrpc ip_tables x_tables autofs4 hid_generic usbmouse usbkbd usbhid hid zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs blak>
May 15 17:04:57 pve-2 kernel: CPU: 0 PID: 1228 Comm: ovs-vswitchd Tainted: P         C O      5.15.35-1-pve #1
May 15 17:04:57 pve-2 kernel: Hardware name: Supermicro Super Server/X11SBA-F, BIOS 1.1d 07/07/2020
May 15 17:04:57 pve-2 kernel: RIP: 0010:__ovs_ct_lookup+0x907/0xa40 [openvswitch]
May 15 17:04:57 pve-2 kernel: Code: 00 49 8b 94 24 c0 00 00 00 44 0f b6 3c 02 41 83 e7 0f 41 c1 e7 02 e9 69 fc ff ff 66 90 0f b6 53 1a 4c 8b 73 10 e9 59 f7 ff ff <0f> 0b e9 90 f7 ff ff 4c 89 d7 4c 89 >
May 15 17:04:57 pve-2 kernel: RSP: 0018:ffffadd7808875b0 EFLAGS: 00010246
May 15 17:04:57 pve-2 kernel: RAX: 0000000000000000 RBX: ffff9097c4042e20 RCX: 0000000000000000
May 15 17:04:57 pve-2 kernel: RDX: 0000000000000002 RSI: ffffadd7808875d0 RDI: 0000000000000000
May 15 17:04:57 pve-2 kernel: RBP: ffffadd780887680 R08: 0000000000000000 R09: ffffffff8f1b6cc0
May 15 17:04:57 pve-2 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff9097d5f9a100
May 15 17:04:57 pve-2 kernel: R13: ffff9097ca47eee8 R14: ffff9097c4042700 R15: 0000000000000000
May 15 17:04:57 pve-2 kernel: FS:  00007f066beeba40(0000) GS:ffff909937c00000(0000) knlGS:0000000000000000
May 15 17:04:57 pve-2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 15 17:04:57 pve-2 kernel: CR2: 00007ffe64fe9fd8 CR3: 000000010850a000 CR4: 00000000001006f0
May 15 17:04:57 pve-2 kernel: Call Trace:
May 15 17:04:57 pve-2 kernel:  <TASK>
May 15 17:04:57 pve-2 kernel:  ? nf_ct_get_tuple+0x13f/0x1e0 [nf_conntrack]
May 15 17:04:57 pve-2 kernel:  ? nf_ct_get_tuplepr+0x64/0xa0 [nf_conntrack]
May 15 17:04:57 pve-2 kernel:  ovs_ct_execute+0x449/0x8d0 [openvswitch]
May 15 17:04:57 pve-2 kernel:  do_execute_actions+0xd4/0x1ab0 [openvswitch]
May 15 17:04:57 pve-2 kernel:  ? call_rcu+0xa8/0x280
May 15 17:04:57 pve-2 kernel:  ? __ovs_nla_copy_actions+0x979/0xf80 [openvswitch]
May 15 17:04:57 pve-2 kernel:  ? __kmalloc_node_track_caller+0x16f/0x3a0
May 15 17:04:57 pve-2 kernel:  ? __kmalloc+0x179/0x330
May 15 17:04:57 pve-2 kernel:  ovs_execute_actions+0x48/0x110 [openvswitch]
May 15 17:04:57 pve-2 kernel:  ? ovs_execute_actions+0x48/0x110 [openvswitch]
May 15 17:04:57 pve-2 kernel:  ovs_packet_cmd_execute+0x280/0x300 [openvswitch]
May 15 17:04:57 pve-2 kernel:  genl_family_rcv_msg_doit+0xe7/0x150
May 15 17:04:57 pve-2 kernel:  genl_rcv_msg+0xe2/0x1e0
May 15 17:04:57 pve-2 kernel:  ? ovs_vport_cmd_del+0x200/0x200 [openvswitch]
May 15 17:04:57 pve-2 kernel:  ? genl_get_cmd+0xd0/0xd0
May 15 17:04:57 pve-2 kernel:  netlink_rcv_skb+0x55/0x100
May 15 17:04:57 pve-2 kernel:  genl_rcv+0x29/0x40
May 15 17:04:57 pve-2 kernel:  netlink_unicast+0x221/0x330
May 15 17:04:57 pve-2 kernel:  netlink_sendmsg+0x23f/0x4a0
May 15 17:04:57 pve-2 kernel:  sock_sendmsg+0x65/0x70
May 15 17:04:57 pve-2 kernel:  ____sys_sendmsg+0x257/0x2a0
May 15 17:04:57 pve-2 kernel:  ? import_iovec+0x31/0x40
May 15 17:04:57 pve-2 kernel:  ? sendmsg_copy_msghdr+0x7e/0xa0
May 15 17:04:57 pve-2 kernel:  ? __check_object_size+0x4d/0x150
May 15 17:04:57 pve-2 kernel:  ? _copy_from_user+0x2e/0x60
May 15 17:04:57 pve-2 kernel:  ___sys_sendmsg+0x82/0xc0
May 15 17:04:57 pve-2 kernel:  ? ___sys_recvmsg+0xa3/0x130
May 15 17:04:57 pve-2 kernel:  ? __fget_files+0x86/0xc0
May 15 17:04:57 pve-2 kernel:  ? __fget_light+0x32/0x80
May 15 17:04:57 pve-2 kernel:  __sys_sendmsg+0x62/0xb0
May 15 17:04:57 pve-2 kernel:  __x64_sys_sendmsg+0x1f/0x30
May 15 17:04:57 pve-2 kernel:  do_syscall_64+0x5c/0xc0
May 15 17:04:57 pve-2 kernel:  ? do_syscall_64+0x69/0xc0
May 15 17:04:57 pve-2 kernel:  ? syscall_exit_to_user_mode+0x27/0x50
May 15 17:04:57 pve-2 kernel:  ? __x64_sys_sendmsg+0x1f/0x30
May 15 17:04:57 pve-2 kernel:  ? do_syscall_64+0x69/0xc0
May 15 17:04:57 pve-2 kernel:  ? syscall_exit_to_user_mode+0x27/0x50
May 15 17:04:57 pve-2 kernel:  ? __x64_sys_recvmsg+0x1f/0x30
May 15 17:04:57 pve-2 kernel:  ? do_syscall_64+0x69/0xc0
May 15 17:04:57 pve-2 kernel:  ? sysvec_apic_timer_interrupt+0x4e/0x90
May 15 17:04:57 pve-2 kernel:  ? asm_sysvec_apic_timer_interrupt+0xa/0x20
May 15 17:04:57 pve-2 kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
May 15 17:04:57 pve-2 kernel: RIP: 0033:0x7f066c817e4d
May 15 17:04:57 pve-2 kernel: Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 ca ee ff ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 >
May 15 17:04:57 pve-2 kernel: RSP: 002b:00007ffe64ffab20 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
May 15 17:04:57 pve-2 kernel: RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f066c817e4d
May 15 17:04:57 pve-2 kernel: RDX: 0000000000000000 RSI: 00007ffe64ffabb0 RDI: 0000000000000015
May 15 17:04:57 pve-2 kernel: RBP: 00007ffe64ffb9a0 R08: 0000000000000000 R09: 0000000000000001
May 15 17:04:57 pve-2 kernel: R10: 0000000000000004 R11: 0000000000000293 R12: 0000560fe267e2b0
May 15 17:04:57 pve-2 kernel: R13: 0000000000000000 R14: 0000560fe267e2b0 R15: 00007ffe64ffabb0
May 15 17:04:57 pve-2 kernel:  </TASK>
May 15 17:04:57 pve-2 kernel: ---[ end trace 007b3b535cde2d0b ]---

The trace appears after the network interfaces are initialised and vmbr0 is brought up. The previous kernel, 5.13.19, showed no such problem.
As stated, it is only a warning, and the system has been stable so far. Different hardware in the same cluster shows the same behaviour.
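
When comparing nodes, it can help to pull just the WARNING header out of each log. Here is a minimal Python sketch, assuming the log text was saved from `journalctl -k` or `dmesg`. The function name `find_kernel_warnings` and the regular expression are illustrative only and are based on the log lines in this thread, not on any existing tool:

```python
import re

# Matches header lines such as:
#   WARNING: CPU: 0 PID: 1228 at include/net/netfilter/nf_conntrack.h:175
#       __ovs_ct_lookup+0x907/0xa40 [openvswitch]
# The pattern is an assumption derived from the logs posted here.
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) "
    r"(?P<symbol>[^\s+]+)\+\S+(?: \[(?P<module>[^\]]+)\])?"
)


def find_kernel_warnings(log_text):
    """Return one dict per WARNING header found in the given log text."""
    warnings = []
    for match in WARNING_RE.finditer(log_text):
        info = match.groupdict()
        # Convert the numeric fields so they can be compared or sorted.
        for key in ("cpu", "pid", "line"):
            info[key] = int(info[key])
        warnings.append(info)
    return warnings


sample = (
    "May 15 17:04:57 pve-2 kernel: WARNING: CPU: 0 PID: 1228 at "
    "include/net/netfilter/nf_conntrack.h:175 "
    "__ovs_ct_lookup+0x907/0xa40 [openvswitch]\n"
)
for w in find_kernel_warnings(sample):
    print(w["file"], w["line"], w["symbol"], w["module"])
```

If the same file, line, and symbol show up on every node, the warning is most likely independent of the hardware.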

Any ideas what's missing here? Thank you for your support.

Dirk
 
Hello Dirk,

I have the same stack trace in my kernel log. Do you also use a 3-node mesh cluster?

I also noticed something odd: the OVSBridge does not work correctly after a reboot until I restart both bridged ports; see the "post-up" command:

Code:
auto vmbr1
iface vmbr1 inet static
        address 172.31.255.241/29
        ovs_type OVSBridge
        ovs_ports enp4s0 enp5s0
        ovs_mtu 9000
        up ovs-vsctl set Bridge ${IFACE} rstp_enable=true other_config:rstp-priority=32768 other_config:rstp-forward-delay=4 other_config:rstp-max-age=6
        post-up sleep 10 && ifdown enp4s0 && ifup enp4s0 && ifdown enp5s0 && ifup enp5s0
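
The workaround above bounces each bridged port once after a delay. Here is a small Python sketch that builds the same `post-up` line from a list of port names, in case the port list differs per node. `build_post_up` is a hypothetical helper, and the 10-second default is simply the value from the config above:

```python
def build_post_up(ports, delay_seconds=10):
    """Build a post-up line that restarts each port after a delay.

    Mirrors the hand-written workaround: sleep, then ifdown/ifup per port.
    """
    if not ports:
        raise ValueError("at least one port is required")
    steps = [f"sleep {delay_seconds}"]
    for port in ports:
        steps.append(f"ifdown {port}")
        steps.append(f"ifup {port}")
    return "post-up " + " && ".join(steps)


print(build_post_up(["enp4s0", "enp5s0"]))
```

This only generates the text of the line; it does not touch the interfaces itself.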
 
I see the same on a newly installed node when it tries to start an OVS bond at boot.

[ 7.152637] softdog: initialized. soft_noboot=0 soft_margin=60 sec soft_panic=0 (nowayout=0)
[ 7.152641] softdog: soft_reboot_cmd=<not set> soft_active_on_boot=0
[ 7.372713] openvswitch: Open vSwitch switching datapath
[ 7.723321] device ovs-system entered promiscuous mode
[ 7.724588] ------------[ cut here ]------------
[ 7.724591] WARNING: CPU: 2 PID: 1382 at include/net/netfilter/nf_conntrack.h:175 __ovs_ct_lookup+0x907/0xa40 [openvswitch]
[ 7.724601] Modules linked in: bonding tls openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 softdog nfnetlink_log nfnetlink ipmi_ssif intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm irqbypass crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd ast drm_vram_helper drm_ttm_helper rapl pcspkr efi_pstore ttm joydev drm_kms_helper cec input_leds rc_core fb_sys_fops usbmouse syscopyarea sysfillrect sysimgblt ptdma ccp k10temp acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi drm sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic xor zstd_compress raid6_pq simplefb dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c hid_generic usbkbd usbhid hid crc32_pclmul xhci_pci igb ahci xhci_pci_renesas
[ 7.724660] i2c_algo_bit bnxt_en dca libahci xhci_hcd i2c_piix4
[ 7.724666] CPU: 2 PID: 1382 Comm: ovs-vswitchd Tainted: P O 5.15.35-1-pve #1
[ 7.724668] Hardware name: Thomas-Krenn.AG 2HE AMD Dual-CPU RA2208 Server/H11DSi, BIOS 2.3 08/02/2021
[ 7.724670] RIP: 0010:__ovs_ct_lookup+0x907/0xa40 [openvswitch]
[ 7.724675] Code: 00 49 8b 94 24 c0 00 00 00 44 0f b6 3c 02 41 83 e7 0f 41 c1 e7 02 e9 69 fc ff ff 66 90 0f b6 53 1a 4c 8b 73 10 e9 59 f7 ff ff <0f> 0b e9 90 f7 ff ff 4c 89 d7 4c 89 95 40 ff ff ff e8 63 45 ee ff
[ 7.724678] RSP: 0018:ffffba16eb467610 EFLAGS: 00010246
[ 7.724680] RAX: 0000000000000000 RBX: ffff937ce0fb7b20 RCX: 0000000000000000
[ 7.724682] RDX: 0000000000000002 RSI: ffffba16eb467630 RDI: 0000000000000000
[ 7.724683] RBP: ffffba16eb4676e0 R08: 0000000000000000 R09: ffffffff9ddb6cc0
[ 7.724684] R10: 0000000000000000 R11: 0000000000000000 R12: ffff937cce04d000
[ 7.724686] R13: ffff937ce18aa7c8 R14: ffff937ce0fb7c00 R15: 0000000000000000
[ 7.724687] FS: 00007f302e063a40(0000) GS:ffff93bb0ea80000(0000) knlGS:0000000000000000
[ 7.724689] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7.724690] CR2: 00007ffc3a1df178 CR3: 0000000150fca000 CR4: 0000000000350ee0
[ 7.724692] Call Trace:
[ 7.724694] <TASK>
[ 7.724696] ? nf_ct_get_tuple+0x13f/0x1e0 [nf_conntrack]
[ 7.724705] ? nf_ct_get_tuplepr+0x64/0xa0 [nf_conntrack]
[ 7.724711] ovs_ct_execute+0x449/0x8d0 [openvswitch]
[ 7.724716] do_execute_actions+0xd4/0x1ab0 [openvswitch]
[ 7.724721] ? __ovs_nla_copy_actions+0x979/0xf80 [openvswitch]
[ 7.724726] ? __kmalloc_node_track_caller+0x16f/0x3a0
[ 7.724730] ? __kmalloc+0x179/0x330
[ 7.724732] ovs_execute_actions+0x48/0x110 [openvswitch]
[ 7.724736] ? ovs_execute_actions+0x48/0x110 [openvswitch]
[ 7.724740] ovs_packet_cmd_execute+0x280/0x300 [openvswitch]
[ 7.724744] genl_family_rcv_msg_doit+0xe7/0x150
[ 7.724749] genl_rcv_msg+0xe2/0x1e0
[ 7.724750] ? ovs_vport_cmd_del+0x200/0x200 [openvswitch]
[ 7.724754] ? genl_get_cmd+0xd0/0xd0
[ 7.724756] netlink_rcv_skb+0x55/0x100
[ 7.724759] genl_rcv+0x29/0x40
[ 7.724760] netlink_unicast+0x221/0x330
[ 7.724762] netlink_sendmsg+0x23f/0x4a0
[ 7.724764] sock_sendmsg+0x65/0x70
[ 7.724767] ____sys_sendmsg+0x257/0x2a0
[ 7.724768] ? import_iovec+0x31/0x40
[ 7.724770] ? sendmsg_copy_msghdr+0x7e/0xa0
[ 7.724772] ___sys_sendmsg+0x82/0xc0
[ 7.724774] ? do_futex+0x13f/0xb90
[ 7.724778] ? __fget_files+0x86/0xc0
[ 7.724781] ? __fget_light+0x32/0x80
[ 7.724782] ? __fget_files+0x86/0xc0
[ 7.724784] ? __fget_light+0x32/0x80
[ 7.724785] __sys_sendmsg+0x62/0xb0
[ 7.724787] ? fib_dump_info.cold+0x93/0x118
[ 7.724790] __x64_sys_sendmsg+0x1f/0x30
[ 7.724792] do_syscall_64+0x5c/0xc0
[ 7.724793] ? do_syscall_64+0x69/0xc0
[ 7.724794] ? do_syscall_64+0x69/0xc0
[ 7.724796] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 7.724798] RIP: 0033:0x7f302e98fe4d
[ 7.724800] Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 ca ee ff ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 44 24 08 e8 fe ee ff ff 48
[ 7.724801] RSP: 002b:00007ffc3a1efbd0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
[ 7.724803] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f302e98fe4d
[ 7.724804] RDX: 0000000000000000 RSI: 00007ffc3a1efc60 RDI: 0000000000000015
[ 7.724805] RBP: 00007ffc3a1f0a50 R08: 0000000000000000 R09: 0000000000000001
[ 7.724806] R10: 0000000000000008 R11: 0000000000000293 R12: 0000556e9fee5280
[ 7.724807] R13: 0000000000000000 R14: 0000556e9fee5280 R15: 00007ffc3a1efc60
[ 7.724808] </TASK>
[ 7.724809] ---[ end trace 3132269414d77199 ]---
[ 7.725113] Timeout policy base is empty
[ 7.725114] Failed to associated timeout policy `ovs_test_tp'
[ 7.725124] ------------[ cut here ]------------
[ 7.725124] WARNING: CPU: 2 PID: 1382 at include/net/netfilter/nf_conntrack.h:175 __ovs_ct_lookup+0x907/0xa40 [openvswitch]
[ 7.725129] Modules linked in: bonding tls openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 softdog nfnetlink_log nfnetlink ipmi_ssif intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm irqbypass crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd ast drm_vram_helper drm_ttm_helper rapl pcspkr efi_pstore ttm joydev drm_kms_helper cec input_leds rc_core fb_sys_fops usbmouse syscopyarea sysfillrect sysimgblt ptdma ccp k10temp acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi drm sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic xor zstd_compress raid6_pq simplefb dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c hid_generic usbkbd usbhid hid crc32_pclmul xhci_pci igb ahci xhci_pci_renesas
[ 7.725164] i2c_algo_bit bnxt_en dca libahci xhci_hcd i2c_piix4
[ 7.725167] CPU: 2 PID: 1382 Comm: ovs-vswitchd Tainted: P W O 5.15.35-1-pve #1
[ 7.725168] Hardware name: Thomas-Krenn.AG 2HE AMD Dual-CPU RA2208 Server/H11DSi, BIOS 2.3 08/02/2021
[ 7.725169] RIP: 0010:__ovs_ct_lookup+0x907/0xa40 [openvswitch]
[ 7.725173] Code: 00 49 8b 94 24 c0 00 00 00 44 0f b6 3c 02 41 83 e7 0f 41 c1 e7 02 e9 69 fc ff ff 66 90 0f b6 53 1a 4c 8b 73 10 e9 59 f7 ff ff <0f> 0b e9 90 f7 ff ff 4c 89 d7 4c 89 95 40 ff ff ff e8 63 45 ee ff
[ 7.725174] RSP: 0018:ffffba16eb4675c0 EFLAGS: 00010246
[ 7.725175] RAX: 0000000000000000 RBX: ffff937ce0fb7c20 RCX: 0000000000000000
[ 7.725176] RDX: 0000000000000002 RSI: ffffba16eb4675e0 RDI: 0000000000000000
[ 7.725177] RBP: ffffba16eb467690 R08: 0000000000000000 R09: ffffffff9ddb6cc0
[ 7.725178] R10: 0000000000000000 R11: 0000000000000000 R12: ffff937cce04d400
[ 7.725179] R13: ffff937d12869fe8 R14: ffff937ce0fb7b00 R15: 0000000000000000
[ 7.725180] FS: 00007f302e063a40(0000) GS:ffff93bb0ea80000(0000) knlGS:0000000000000000
[ 7.725181] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7.725182] CR2: 00007ffc3a1df178 CR3: 0000000150fca000 CR4: 0000000000350ee0
[ 7.725183] Call Trace:
[ 7.725184] <TASK>
[ 7.725185] ? nf_ct_get_tuple+0x13f/0x1e0 [nf_conntrack]
[ 7.725191] ? nf_ct_get_tuplepr+0x64/0xa0 [nf_conntrack]
[ 7.725197] ovs_ct_execute+0x449/0x8d0 [openvswitch]
[ 7.725202] do_execute_actions+0xd4/0x1ab0 [openvswitch]
[ 7.725206] ? __ovs_nla_copy_actions+0x979/0xf80 [openvswitch]
[ 7.725210] ? __kmalloc_node_track_caller+0x16f/0x3a0
[ 7.725212] ? __kmalloc+0x179/0x330
[ 7.725214] ovs_execute_actions+0x48/0x110 [openvswitch]
[ 7.725218] ? ovs_execute_actions+0x48/0x110 [openvswitch]
[ 7.725222] ovs_packet_cmd_execute+0x280/0x300 [openvswitch]
[ 7.725226] genl_family_rcv_msg_doit+0xe7/0x150
[ 7.725228] genl_rcv_msg+0xe2/0x1e0
[ 7.725230] ? ovs_vport_cmd_del+0x200/0x200 [openvswitch]
[ 7.725234] ? genl_get_cmd+0xd0/0xd0
[ 7.725236] netlink_rcv_skb+0x55/0x100
[ 7.725238] genl_rcv+0x29/0x40
[ 7.725239] netlink_unicast+0x221/0x330
[ 7.725241] netlink_sendmsg+0x23f/0x4a0
[ 7.725243] sock_sendmsg+0x65/0x70
[ 7.725244] ____sys_sendmsg+0x257/0x2a0
[ 7.725246] ? import_iovec+0x31/0x40
[ 7.725247] ? sendmsg_copy_msghdr+0x7e/0xa0
[ 7.725249] ___sys_sendmsg+0x82/0xc0
[ 7.725250] ? do_futex+0x13f/0xb90
[ 7.725253] ? ___sys_recvmsg+0xa3/0x130
[ 7.725254] ? __fget_files+0x86/0xc0
[ 7.725256] ? __fget_light+0x32/0x80
[ 7.725257] __sys_sendmsg+0x62/0xb0
[ 7.725259] __x64_sys_sendmsg+0x1f/0x30
[ 7.725261] do_syscall_64+0x5c/0xc0
[ 7.725262] ? do_syscall_64+0x69/0xc0
[ 7.725263] ? __x64_sys_sendmsg+0x1f/0x30
[ 7.725264] ? do_syscall_64+0x69/0xc0
[ 7.725265] ? syscall_exit_to_user_mode+0x27/0x50
[ 7.725267] ? __x64_sys_recvmsg+0x1f/0x30
[ 7.725269] ? do_syscall_64+0x69/0xc0
[ 7.725270] ? do_syscall_64+0x69/0xc0
[ 7.725271] ? do_syscall_64+0x69/0xc0
[ 7.725272] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 7.725274] RIP: 0033:0x7f302e98fe4d
[ 7.725275] Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 ca ee ff ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 44 24 08 e8 fe ee ff ff 48
[ 7.725276] RSP: 002b:00007ffc3a1efbd0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
[ 7.725277] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f302e98fe4d
[ 7.725278] RDX: 0000000000000000 RSI: 00007ffc3a1efc60 RDI: 0000000000000015
[ 7.725279] RBP: 00007ffc3a1f0a50 R08: 0000000000000000 R09: 0000000000000001
[ 7.725279] R10: 0000000000000008 R11: 0000000000000293 R12: 0000556e9fee5280
[ 7.725280] R13: 0000000000000000 R14: 0000556e9fee5280 R15: 00007ffc3a1efc60
[ 7.725282] </TASK>
[ 7.725282] ---[ end trace 3132269414d7719a ]---
 
I am also running into a similar issue on a 3-node cluster (two nodes have very similar hardware and one is very different, but all three use the same 4-port Intel 1Gb card in a 4-interface bond). They all come up without needing any intervention of the kind @Max2048 described above:

Code:
auto bond0
iface bond0 inet manual
        ovs_bonds enp11s0f0 enp11s0f1 enp11s0f2 enp11s0f3
        ovs_type OVSBond
        ovs_bridge vmbr0
        ovs_options lacp=active bond_mode=balance-tcp other_config:lacp-time=fast

auto vmbr0
iface vmbr0 inet manual
        ovs_type OVSBridge
        ovs_ports bond0 vlan100

PVE1 - same hardware as PVE3 below, just with a 5950X instead of a 3900X:

Code:
[   11.117339] openvswitch: Open vSwitch switching datapath
[   11.669947] device ovs-system entered promiscuous mode
[   11.670701] ------------[ cut here ]------------
[   11.672063] WARNING: CPU: 24 PID: 3126 at include/net/netfilter/nf_conntrack.h:175 __ovs_ct_lookup+0x907/0xa40 [openvswitch]
[   11.673491] Modules linked in: bonding tls openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 softdog nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio nouveau snd_hda_codec_hdmi kvm drm_ttm_helper snd_hda_intel irqbypass ttm snd_intel_dspcfg crct10dif_pclmul ghash_clmulni_intel snd_intel_sdw_acpi aesni_intel drm_kms_helper snd_hda_codec crypto_simd eeepc_wmi cec snd_hda_core cryptd rc_core asus_wmi snd_hwdep platform_profile fb_sys_fops snd_pcm rapl syscopyarea sparse_keymap snd_timer sysfillrect pcspkr video efi_pstore wmi_bmof sysimgblt snd mxm_wmi k10temp ccp soundcore ipmi_devintf ipmi_msghandler mac_hid vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi drm sunrpc ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs
[   11.673528]  blake2b_generic xor zstd_compress raid6_pq libcrc32c simplefb hid_generic usbhid hid xhci_pci ahci xhci_pci_renesas crc32_pclmul i2c_piix4 mpt3sas igb libahci r8169 xhci_hcd ehci_pci nvme i2c_algo_bit raid_class ehci_hcd realtek dca scsi_transport_sas nvme_core wmi
[   11.685366] CPU: 24 PID: 3126 Comm: ovs-vswitchd Tainted: P           O      5.15.35-1-pve #1
[   11.687232] Hardware name: ASUS System Product Name/Pro WS X570-ACE, BIOS 3801 07/30/2021
[   11.689101] RIP: 0010:__ovs_ct_lookup+0x907/0xa40 [openvswitch]
[   11.690980] Code: 00 49 8b 94 24 c0 00 00 00 44 0f b6 3c 02 41 83 e7 0f 41 c1 e7 02 e9 69 fc ff ff 66 90 0f b6 53 1a 4c 8b 73 10 e9 59 f7 ff ff <0f> 0b e9 90 f7 ff ff 4c 89 d7 4c 89 95 40 ff ff ff e8 63 15 f4 ff
[   11.692942] RSP: 0018:ffffaf2f18d7b540 EFLAGS: 00010246
[   11.694849] RAX: 0000000000000000 RBX: ffff96a949366220 RCX: 0000000000000000
[   11.696762] RDX: 0000000000000002 RSI: ffffaf2f18d7b560 RDI: 0000000000000000
[   11.698669] RBP: ffffaf2f18d7b610 R08: 0000000000000000 R09: ffffffffa49b6cc0
[   11.700576] R10: 0000000000000000 R11: 0000000000000000 R12: ffff96a949384500
[   11.702439] R13: ffff96a9841b3408 R14: ffff96a949367d00 R15: 0000000000000000
[   11.704260] FS:  00007fa8cd30ba40(0000) GS:ffff96c7ef000000(0000) knlGS:0000000000000000
[   11.706097] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   11.707929] CR2: 00007ffe7221b4d8 CR3: 000000013daca000 CR4: 0000000000750ee0
[   11.709735] PKRU: 55555554
[   11.711527] Call Trace:
[   11.713265]  <TASK>
[   11.714970]  ? nf_ct_get_tuple+0x13f/0x1e0 [nf_conntrack]
[   11.716683]  ? nf_ct_get_tuplepr+0x64/0xa0 [nf_conntrack]
[   11.718345]  ovs_ct_execute+0x449/0x8d0 [openvswitch]
[   11.720003]  do_execute_actions+0xd4/0x1ab0 [openvswitch]
[   11.721668]  ? cpumask_next+0x23/0x30
[   11.723322]  ? tbl_mask_array_reset_counters+0x82/0xc0 [openvswitch]
[   11.724996]  ? __ovs_nla_copy_actions+0x979/0xf80 [openvswitch]
[   11.726676]  ? __kmalloc_node_track_caller+0x16f/0x3a0
[   11.728360]  ? __kmalloc+0x179/0x330
[   11.730043]  ovs_execute_actions+0x48/0x110 [openvswitch]
[   11.731725]  ? ovs_execute_actions+0x48/0x110 [openvswitch]
[   11.733408]  ovs_packet_cmd_execute+0x280/0x300 [openvswitch]
[   11.735098]  genl_family_rcv_msg_doit+0xe7/0x150
[   11.736779]  genl_rcv_msg+0xe2/0x1e0
[   11.738455]  ? ovs_vport_cmd_del+0x200/0x200 [openvswitch]
[   11.740143]  ? genl_get_cmd+0xd0/0xd0
[   11.741833]  netlink_rcv_skb+0x55/0x100
[   11.743511]  genl_rcv+0x29/0x40
[   11.745177]  netlink_unicast+0x221/0x330
[   11.746845]  netlink_sendmsg+0x23f/0x4a0
[   11.748501]  sock_sendmsg+0x65/0x70
[   11.750157]  ____sys_sendmsg+0x257/0x2a0
[   11.751810]  ? import_iovec+0x31/0x40
[   11.753446]  ? sendmsg_copy_msghdr+0x7e/0xa0
[   11.755072]  ___sys_sendmsg+0x82/0xc0
[   11.756678]  ? wake_up_q+0x50/0x90
[   11.758277]  ? futex_wake+0x155/0x180
[   11.759867]  ? __fget_files+0x86/0xc0
[   11.761454]  ? __fget_light+0x32/0x80
[   11.763030]  __sys_sendmsg+0x62/0xb0
[   11.764604]  __x64_sys_sendmsg+0x1f/0x30
[   11.766182]  do_syscall_64+0x5c/0xc0
[   11.767758]  ? exit_to_user_mode_prepare+0x37/0x1b0
[   11.769339]  ? syscall_exit_to_user_mode+0x27/0x50
[   11.770915]  ? do_syscall_64+0x69/0xc0
[   11.772476]  ? __x64_sys_sendmsg+0x1f/0x30
[   11.774040]  ? do_syscall_64+0x69/0xc0
[   11.775601]  ? __x64_sys_recvmsg+0x1f/0x30
[   11.777160]  ? do_syscall_64+0x69/0xc0
[   11.778724]  ? syscall_exit_to_user_mode+0x27/0x50
[   11.780289]  ? __x64_sys_recvmsg+0x1f/0x30
[   11.781849]  ? do_syscall_64+0x69/0xc0
[   11.783399]  ? syscall_exit_to_user_mode+0x27/0x50
[   11.784928]  ? __x64_sys_sendmsg+0x1f/0x30
[   11.786397]  ? do_syscall_64+0x69/0xc0
[   11.787796]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   11.789138] RIP: 0033:0x7fa8cdc37e4d
[   11.790414] Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 ca ee ff ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 44 24 08 e8 fe ee ff ff 48
[   11.791727] RSP: 002b:00007ffe7222b5d0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
[   11.792977] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fa8cdc37e4d
[   11.794176] RDX: 0000000000000000 RSI: 00007ffe7222b660 RDI: 0000000000000015
[   11.795361] RBP: 00007ffe7222c450 R08: 0000000000000000 R09: 0000000000000001
[   11.796529] R10: 0000000000000008 R11: 0000000000000293 R12: 000055680cb0b2f0
[   11.797687] R13: 0000000000000000 R14: 000055680cb0b2f0 R15: 00007ffe7222b660
[   11.798796]  </TASK>
[   11.799852] ---[ end trace aec2f65c06d36200 ]---
[   11.801022] Timeout policy base is empty
[   11.801279] Failed to associated timeout policy `ovs_test_tp'
[   11.801542] ------------[ cut here ]------------
[   11.802610] WARNING: CPU: 8 PID: 3126 at include/net/netfilter/nf_conntrack.h:175 __ovs_ct_lookup+0x907/0xa40 [openvswitch]
[   11.803728] Modules linked in: bonding tls openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 softdog nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio nouveau snd_hda_codec_hdmi kvm drm_ttm_helper snd_hda_intel irqbypass ttm snd_intel_dspcfg crct10dif_pclmul ghash_clmulni_intel snd_intel_sdw_acpi aesni_intel drm_kms_helper snd_hda_codec crypto_simd eeepc_wmi cec snd_hda_core cryptd rc_core asus_wmi snd_hwdep platform_profile fb_sys_fops snd_pcm rapl syscopyarea sparse_keymap snd_timer sysfillrect pcspkr video efi_pstore wmi_bmof sysimgblt snd mxm_wmi k10temp ccp soundcore ipmi_devintf ipmi_msghandler mac_hid vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi drm sunrpc ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs
[   11.803754]  blake2b_generic xor zstd_compress raid6_pq libcrc32c simplefb hid_generic usbhid hid xhci_pci ahci xhci_pci_renesas crc32_pclmul i2c_piix4 mpt3sas igb libahci r8169 xhci_hcd ehci_pci nvme i2c_algo_bit raid_class ehci_hcd realtek dca scsi_transport_sas nvme_core wmi
[   11.813271] CPU: 8 PID: 3126 Comm: ovs-vswitchd Tainted: P        W  O      5.15.35-1-pve #1
[   11.814813] Hardware name: ASUS System Product Name/Pro WS X570-ACE, BIOS 3801 07/30/2021
[   11.816360] RIP: 0010:__ovs_ct_lookup+0x907/0xa40 [openvswitch]
[   11.817915] Code: 00 49 8b 94 24 c0 00 00 00 44 0f b6 3c 02 41 83 e7 0f 41 c1 e7 02 e9 69 fc ff ff 66 90 0f b6 53 1a 4c 8b 73 10 e9 59 f7 ff ff <0f> 0b e9 90 f7 ff ff 4c 89 d7 4c 89 95 40 ff ff ff e8 63 15 f4 ff
[   11.819556] RSP: 0018:ffffaf2f18d7b540 EFLAGS: 00010246
[   11.821178] RAX: 0000000000000000 RBX: ffff96a947353220 RCX: 0000000000000000
[   11.822813] RDX: 0000000000000002 RSI: ffffaf2f18d7b560 RDI: 0000000000000000
[   11.824436] RBP: ffffaf2f18d7b610 R08: 0000000000000000 R09: ffffffffa49b6cc0
[   11.826070] R10: 0000000000000000 R11: 0000000000000000 R12: ffff96a942049d00
[   11.827697] R13: ffff96a983cd0f28 R14: ffff96a947353700 R15: 0000000000000000
[   11.829324] FS:  00007fa8cd30ba40(0000) GS:ffff96c7eec00000(0000) knlGS:0000000000000000
[   11.830966] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   11.832599] CR2: 00007fce83be6840 CR3: 000000013daca000 CR4: 0000000000750ee0
[   11.834251] PKRU: 55555554
[   11.835887] Call Trace:
[   11.837518]  <TASK>
[   11.839138]  ? nf_ct_get_tuple+0x13f/0x1e0 [nf_conntrack]
[   11.840782]  ? nf_ct_get_tuplepr+0x64/0xa0 [nf_conntrack]
[   11.842424]  ovs_ct_execute+0x449/0x8d0 [openvswitch]
[   11.844070]  do_execute_actions+0xd4/0x1ab0 [openvswitch]
[   11.845727]  ? tbl_mask_array_reset_counters+0x82/0xc0 [openvswitch]
[   11.847393]  ? __ovs_nla_copy_actions+0x979/0xf80 [openvswitch]
[   11.849065]  ? __kmalloc_node_track_caller+0x16f/0x3a0
[   11.850741]  ? __kmalloc+0x179/0x330
[   11.852408]  ovs_execute_actions+0x48/0x110 [openvswitch]
[   11.854093]  ? ovs_execute_actions+0x48/0x110 [openvswitch]
[   11.855784]  ovs_packet_cmd_execute+0x280/0x300 [openvswitch]
[   11.857489]  genl_family_rcv_msg_doit+0xe7/0x150
[   11.859198]  genl_rcv_msg+0xe2/0x1e0
[   11.860903]  ? ovs_vport_cmd_del+0x200/0x200 [openvswitch]
[   11.862630]  ? genl_get_cmd+0xd0/0xd0
[   11.864347]  netlink_rcv_skb+0x55/0x100
[   11.866065]  genl_rcv+0x29/0x40
[   11.867776]  netlink_unicast+0x221/0x330
[   11.869491]  netlink_sendmsg+0x23f/0x4a0
[   11.871196]  sock_sendmsg+0x65/0x70
[   11.872899]  ____sys_sendmsg+0x257/0x2a0
[   11.874607]  ? import_iovec+0x31/0x40
[   11.876310]  ? sendmsg_copy_msghdr+0x7e/0xa0
[   11.878019]  ? __smp_call_single_queue+0x55/0x80
[   11.879723]  ___sys_sendmsg+0x82/0xc0
[   11.881432]  ? try_to_wake_up+0x214/0x5c0
[   11.883136]  ? wake_up_q+0x50/0x90
[   11.884832]  ? futex_wake+0x155/0x180
[   11.886531]  ? __fget_files+0x86/0xc0
[   11.888220]  ? __fget_light+0x32/0x80
[   11.889902]  __sys_sendmsg+0x62/0xb0
[   11.891535]  __x64_sys_sendmsg+0x1f/0x30
[   11.893121]  do_syscall_64+0x5c/0xc0
[   11.894700]  ? syscall_exit_to_user_mode+0x27/0x50
[   11.896277]  ? __x64_sys_futex+0x81/0x1d0
[   11.897856]  ? do_syscall_64+0x69/0xc0
[   11.899428]  ? exit_to_user_mode_prepare+0x37/0x1b0
[   11.901009]  ? exit_to_user_mode_prepare+0x37/0x1b0
[   11.902573]  ? syscall_exit_to_user_mode+0x27/0x50
[   11.904130]  ? do_syscall_64+0x69/0xc0
[   11.905686]  ? irqentry_exit+0x19/0x30
[   11.907229]  ? sysvec_call_function+0x4e/0x90
[   11.908768]  ? asm_sysvec_call_function+0xa/0x20
[   11.910288]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   11.911745] RIP: 0033:0x7fa8cdc37e4d
[   11.913135] Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 ca ee ff ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 44 24 08 e8 fe ee ff ff 48
[   11.914568] RSP: 002b:00007ffe7222b5d0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
[   11.915931] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fa8cdc37e4d
[   11.917237] RDX: 0000000000000000 RSI: 00007ffe7222b660 RDI: 0000000000000015
[   11.918489] RBP: 00007ffe7222c450 R08: 0000000000000000 R09: 0000000000000001
[   11.919673] R10: 0000000000000008 R11: 0000000000000293 R12: 000055680cb0b2f0
[   11.920844] R13: 0000000000000000 R14: 000055680cb0b2f0 R15: 00007ffe7222b660
[   11.922002]  </TASK>
[   11.923130] ---[ end trace aec2f65c06d36201 ]---

PVE2 - completely different hardware (read: old, repurposed), but the same 4-port igb OVS bond as on all three nodes:

Attachment: pve2.txt

PVE3 - experiencing some instability on this node that I have not been able to track down yet (likely related to RAM I am RMAing), but newly built 5/8/22 - same hardware as PVE1, just 3900x instead of 5950x processor:

Attachment: pve3.txt
 

The warning is gone with the latest kernel (Linux pve-2 5.15.39-1-pve):
Code:
Jul 06 12:08:03 pve-2 kernel: Timeout policy base is empty
Jul 06 12:08:03 pve-2 kernel: Failed to associated timeout policy `ovs_test_tp'
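
To check which side of the fix a node is on, here is a rough Python sketch that parses a Proxmox kernel release string and compares it with the versions reported in this thread. The only data points are the ones posted here (5.15.35-1-pve shows the warning, 5.15.39-1-pve does not); which release in between actually carries the fix is not known from this thread. `parse_release` and `status_from_thread` are made-up names:

```python
import re


def parse_release(release):
    """Parse '5.15.35-1-pve' into a tuple of ints: (5, 15, 35, 1)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)(?:-(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing package revision (e.g. plain '5.13.19') counts as 0.
    return tuple(int(part or 0) for part in match.groups())


# Data points reported in this thread only; nothing else is assumed.
REPORTED_AFFECTED = parse_release("5.15.35-1-pve")
REPORTED_FIXED = parse_release("5.15.39-1-pve")


def status_from_thread(release):
    """Classify a release against the two versions reported here."""
    version = parse_release(release)
    if version >= REPORTED_FIXED:
        return "at or after the release reported as fixed"
    if version == REPORTED_AFFECTED:
        return "release reported as affected"
    return "not covered by the reports in this thread"


print(status_from_thread("5.15.39-1-pve"))
```

On a node, the input would be the output of `uname -r` (or `platform.release()` from Python).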
 