Strange issue with IPv6

Alex_u-94

Hello.
I have a Proxmox cluster of four physical servers. The cluster has been working successfully for a long time. In addition to the Proxmox cluster, a CEPH cluster was also created. Everything worked fine until this morning.
Today one of the nodes disconnected from CEPH. I ran diagnostics and found that this node has lost the ability to send and receive IPv6 packets through the interface configured for CEPH.

To check that the physical interface itself works, I added an IPv4 address to it and did the same on the other nodes.
Data is transmitted successfully over IPv4 without any restrictions. There are no firewall rules on the cluster that could restrict IPv6 traffic. Another IPv6 network is also used in the cluster, and it is working fine.
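For reference, such a temporary IPv4 test can be sketched as follows (the interface name and addresses are examples only, not necessarily what was actually used):

Code:
# temporarily add an IPv4 address to the interface under test
ip addr add 192.168.9.12/24 dev enp3s0
# do the same on another node (e.g. .11), then check connectivity
ping -c 3 192.168.9.11
# remove the test address afterwards
ip addr del 192.168.9.12/24 dev enp3s0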

Could you please tell me what I can check to find the cause of this malfunction?
 
This sounds odd
* did you install upgrades (on the PVE nodes, but also on the switches that connect the nodes)?
* If you're not the one responsible for the network setup - I'd ask the responsible team (or your ISP/Hosting provider if this applies)
* else - anything that was written to the journal (and also in the ceph logs) around the time when IPv6 stopped working?
* also check the status of all links (`ip -details link`, `ethtool <interface>`) - see the sketch below
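
A minimal sketch of those checks (the interface name and time window are placeholders):

Code:
# journal entries around the time IPv6 stopped working
journalctl --since "2 days ago"
# ceph keeps its own logs under /var/log/ceph/
# link details and NIC status of the CEPH interface
ip -details link show <interface>
ethtool <interface>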

I hope this helps!
 
Hello. I have created a new topic with an update on the current issue: https://forum.proxmox.com/threads/ceph-issue.114670/

* did you install upgrades (on the PVE nodes, but also on the switches that connect the nodes)?
The system was not updated before the problem occurred.
* If you're not the one responsible for the network setup - I'd ask the responsible team (or your ISP/Hosting provider if this applies)
I am the only one responsible for the infrastructure, and no work or changes took place during the period in question (I was on vacation).
* else - anything that was written to the journal (and also in the ceph logs) around the time when IPv6 stopped working?
There is a record of the degradation, but I have not found any data that would clarify the situation.
* also check the status of all links (`ip -details link`, `ethtool <interface>`)
I checked everything I could. I even connected the 10Gb interface to another router, configured the network accordingly, and tried to ping the nodes. Unfortunately, I was not able to find the cause.
 
While the issue there is probably related to this one, you did not provide the outputs of `ip link`, `ip addr`, `ip -6 route` - or the journal - there either.

All three might help in finding out what the problem currently is, but I'd suggest sticking to one thread for now, as it does not help to work on the issue in two places at once.
 
`ip link`, `ip addr`, `ip -6 route` won't help.

On two nodes I created an OVS bridge and assigned the IPv6 addresses to the bridges. With this setup it does not matter which physical interface is attached to the bridge; the output of those commands stays the same.
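
For context, a minimal sketch of what such an OVS bridge definition typically looks like in /etc/network/interfaces (the address matches vmbr2 in the outputs below; the port is whichever physical interface is currently attached, and the actual file on the nodes may differ):

Code:
auto enp1s0f1
iface enp1s0f1 inet manual
        ovs_type OVSPort
        ovs_bridge vmbr2

auto vmbr2
iface vmbr2 inet6 static
        address fd00:dc:ce:192:168:90:0:12/64
        ovs_type OVSBridge
        ovs_ports enp1s0f1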

`ip link`, `ip addr`, `ip -6 route` - with the bridge on the 1Gb interface (nodes have connectivity)

Code:
root@node-B:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether b4:96:91:29:bb:16 brd ff:ff:ff:ff:ff:ff
    inet 192.168.9.12/24 scope global enp3s0
       valid_lft forever preferred_lft forever
    inet6 fe80::b696:91ff:fe29:bb16/64 scope link
       valid_lft forever preferred_lft forever
3: enp1s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr0 state UP group default qlen 1000
    link/ether 0c:c4:7a:32:0e:3e brd ff:ff:ff:ff:ff:ff
4: enp1s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
    link/ether 0c:c4:7a:32:0e:3f brd ff:ff:ff:ff:ff:ff
    inet6 fe80::ec4:7aff:fe32:e3f/64 scope link
       valid_lft forever preferred_lft forever
5: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 9a:32:ee:63:3a:5d brd ff:ff:ff:ff:ff:ff
6: vmbr2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 0c:c4:7a:32:0e:3f brd ff:ff:ff:ff:ff:ff
    inet6 fd00:dc:ce:192:168:90:0:12/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::409e:f7ff:fe35:ea4a/64 scope link
       valid_lft forever preferred_lft forever
7: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 0c:c4:7a:32:0e:3e brd ff:ff:ff:ff:ff:ff
    inet 192.168.90.12/24 scope global vmbr0
       valid_lft forever preferred_lft forever
    inet6 fd00:dc:cc:192:168:90:0:22/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::ec4:7aff:fe32:e3e/64 scope link
       valid_lft forever preferred_lft forever
      
root@node-B:~# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether b4:96:91:29:bb:16 brd ff:ff:ff:ff:ff:ff
3: enp1s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr0 state UP mode DEFAULT group default qlen 1000
    link/ether 0c:c4:7a:32:0e:3e brd ff:ff:ff:ff:ff:ff
4: enp1s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
    link/ether 0c:c4:7a:32:0e:3f brd ff:ff:ff:ff:ff:ff
5: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 9a:32:ee:63:3a:5d brd ff:ff:ff:ff:ff:ff
6: vmbr2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 0c:c4:7a:32:0e:3f brd ff:ff:ff:ff:ff:ff
7: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 0c:c4:7a:32:0e:3e brd ff:ff:ff:ff:ff:ff

root@node-B:~# ip -6 route
::1 dev lo proto kernel metric 256 pref medium
fd00:dc:cc:192::/64 dev vmbr0 proto kernel metric 256 pref medium
fd00:dc:ce:192::/64 dev vmbr2 proto kernel metric 256 pref medium
fe80::/64 dev vmbr2 proto kernel metric 256 pref medium
fe80::/64 dev enp3s0 proto kernel metric 256 pref medium
fe80::/64 dev enp1s0f1 proto kernel metric 256 pref medium
fe80::/64 dev vmbr0 proto kernel metric 256 pref medium

Ping from NodeB to NodeA

Code:
root@node-B:~# ping fd00:dc:ce:192:168:90:0:11
PING fd00:dc:ce:192:168:90:0:11(fd00:dc:ce:192:168:90:0:11) 56 data bytes
64 bytes from fd00:dc:ce:192:168:90:0:11: icmp_seq=1 ttl=64 time=0.171 ms
64 bytes from fd00:dc:ce:192:168:90:0:11: icmp_seq=2 ttl=64 time=0.207 ms

`ip link`, `ip addr`, `ip -6 route` - with the bridge on the 10Gb interface (nodes have no connectivity)

Code:
root@node-B:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
    link/ether b4:96:91:29:bb:16 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::b696:91ff:fe29:bb16/64 scope link
       valid_lft forever preferred_lft forever
3: enp1s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr0 state UP group default qlen 1000
    link/ether 0c:c4:7a:32:0e:3e brd ff:ff:ff:ff:ff:ff
4: enp1s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 0c:c4:7a:32:0e:3f brd ff:ff:ff:ff:ff:ff
    inet6 fe80::ec4:7aff:fe32:e3f/64 scope link
       valid_lft forever preferred_lft forever
5: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 9a:32:ee:63:3a:5d brd ff:ff:ff:ff:ff:ff
6: vmbr2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether b4:96:91:29:bb:16 brd ff:ff:ff:ff:ff:ff
    inet6 fd00:dc:ce:192:168:90:0:12/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::409e:f7ff:fe35:ea4a/64 scope link
       valid_lft forever preferred_lft forever
7: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 0c:c4:7a:32:0e:3e brd ff:ff:ff:ff:ff:ff
    inet 192.168.90.12/24 scope global vmbr0
       valid_lft forever preferred_lft forever
    inet6 fd00:dc:cc:192:168:90:0:22/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::ec4:7aff:fe32:e3e/64 scope link
       valid_lft forever preferred_lft forever
root@node-B:~# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
    link/ether b4:96:91:29:bb:16 brd ff:ff:ff:ff:ff:ff
3: enp1s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master vmbr0 state UP mode DEFAULT group default qlen 1000
    link/ether 0c:c4:7a:32:0e:3e brd ff:ff:ff:ff:ff:ff
4: enp1s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 0c:c4:7a:32:0e:3f brd ff:ff:ff:ff:ff:ff
5: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 9a:32:ee:63:3a:5d brd ff:ff:ff:ff:ff:ff
6: vmbr2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether b4:96:91:29:bb:16 brd ff:ff:ff:ff:ff:ff
7: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 0c:c4:7a:32:0e:3e brd ff:ff:ff:ff:ff:ff
root@node-B:~# ip -6 route
::1 dev lo proto kernel metric 256 pref medium
fd00:dc:cc:192::/64 dev vmbr0 proto kernel metric 256 pref medium
fd00:dc:ce:192::/64 dev vmbr2 proto kernel metric 256 pref medium
fe80::/64 dev vmbr2 proto kernel metric 256 pref medium
fe80::/64 dev enp3s0 proto kernel metric 256 pref medium
fe80::/64 dev enp1s0f1 proto kernel metric 256 pref medium
fe80::/64 dev vmbr0 proto kernel metric 256 pref medium

Ping from NodeB to NodeA

Code:
root@node-B:~# ping fd00:dc:ce:192:168:90:0:11
PING fd00:dc:ce:192:168:90:0:11(fd00:dc:ce:192:168:90:0:11) 56 data bytes
From fd00:dc:ce:192:168:90:0:12 icmp_seq=1 Destination unreachable: Address unreachable
From fd00:dc:ce:192:168:90:0:12 icmp_seq=2 Destination unreachable: Address unreachable
From fd00:dc:ce:192:168:90:0:12 icmp_seq=3 Destination unreachable: Address unreachable
^C
--- fd00:dc:ce:192:168:90:0:11 ping statistics ---
5 packets transmitted, 0 received, +3 errors, 100% packet loss, time 4103ms
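
The "Destination unreachable: Address unreachable" errors come back from the node's own address, which suggests that IPv6 neighbor discovery is failing on this path; the neighbor cache can be checked like this:

Code:
# FAILED or INCOMPLETE entries here would confirm failing neighbor discovery
ip -6 neigh show dev vmbr2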

I can publish the CEPH logs, as well as any others, but given their size I doubt anyone would read them.
 
a) I'd suggest trying plain Linux bridges, as they are simpler, which usually helps when tracking down issues
b) anything in the journal/dmesg that might indicate where the issue with the NIC might be?
c) you can try to see what the traffic does with tcpdump (see the sketch below)
d) make sure you don't have any firewall rules preventing traffic from flowing (`ip6tables -nvL`, `nft list ruleset`, and check all devices on the path between NodeA and NodeB)
e) the /etc/network/interfaces file might help
f) the `ip l`, `ip a`, `ip -6 r` output from NodeA might also be of interest
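
A minimal sketch for point c), assuming the interface names from the outputs above:

Code:
# watch ICMPv6 (including neighbor solicitation/advertisement) on the bridge
# and on the attached physical port while pinging from the other node
tcpdump -i vmbr2 -n icmp6
tcpdump -i enp3s0 -n icmp6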

I hope this helps!
 
