Proxmox 6.2-10 Ceph cluster reboots if one node is shut down or rebooted

dan.ger

Hello,

after updating the Proxmox nodes a few days ago to the latest Ceph version, something strange happens: if a node is rebooted, all HA cluster nodes reboot.

In the log I saw something like this on a node that was not rebooted:

Code:
Jul 23 13:36:28 hyperx-01 ceph-mon[2793]: 2020-07-23 13:36:28.792 7fb71a14c700 -1 mon.pve-01@0(electing) e5 failed to get devid for : fallback method has serial ''but no model
Jul 23 13:36:32 hyperx-01 pvestatd[2980]: got timeout
Jul 23 13:36:32 pve-01 ceph-mon[2793]: 2020-07-23 13:36:32.528 7fb71a14c700 -1 mon.pve-01@0(electing) e5 get_health_metrics reporting 6 slow ops, oldest is auth(proto 0 34 bytes epoch 0)
Jul 23 13:36:32 pve-01 pvestatd[2980]: status update time (5.172 seconds)
Jul 23 13:36:35 pve-01 ceph-osd[2862]: 2020-07-23 13:36:35.052 7f90f53bec80 -1 unable to find any IPv4 address in networks '10.10.10.0/24' interfaces ''
Jul 23 13:36:35 pve-01 ceph-osd[2858]: 2020-07-23 13:36:35.052 7fe504797c80 -1 unable to find any IPv4 address in networks '10.10.10.0/24' interfaces ''
Jul 23 13:36:35 pve-01 ceph-osd[2862]: 2020-07-23 13:36:35.052 7f90f53bec80 -1 unable to find any IPv4 address in networks '10.10.10.0/24' interfaces ''
Jul 23 13:36:35 pve-01 ceph-osd[2858]: 2020-07-23 13:36:35.052 7fe504797c80 -1 unable to find any IPv4 address in networks '10.10.10.0/24' interfaces ''
Jul 23 13:36:35 pve-01 ceph-osd[2864]: 2020-07-23 13:36:35.060 7f5bc1187c80 -1 unable to find any IPv4 address in networks '10.10.10.0/24' interfaces ''
Jul 23 13:36:35 pve-01 ceph-osd[2864]: 2020-07-23 13:36:35.064 7f5bc1187c80 -1 unable to find any IPv4 address in networks '10.10.10.0/24' interfaces ''
Jul 23 13:36:35 pve-01 ceph-osd[2847]: 2020-07-23 13:36:35.076 7f8466cbfc80 -1 unable to find any IPv4 address in networks '10.10.10.0/24' interfaces ''
Jul 23 13:36:35 pve-01 ceph-osd[2847]: 2020-07-23 13:36:35.076 7f8466cbfc80 -1 unable to find any IPv4 address in networks '10.10.10.0/24' interfaces ''

Nodes are:
pve-01: 10.10.10.1
pve-02: 10.10.10.2
pve-03: 10.10.10.3

Each node has two bonds of 10 GbE NICs (MTU 9000), one for the Ceph cluster and one for the default network. I can ping each host in the cluster from every node.
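A plain ping does not catch an MTU mismatch, because small packets fit into 1500-byte frames either way. As a rough sketch (using the node IPs from above; 8972 bytes = 9000 minus 28 bytes of IP/ICMP headers), a ping with the don't-fragment bit set would verify that jumbo frames actually pass end to end on the Ceph bond:

Code:
# from pve-01: test jumbo frames towards the other nodes
# -M do sets the don't-fragment bit, -s 8972 fills a 9000-byte frame
ping -M do -s 8972 -c 3 10.10.10.2
ping -M do -s 8972 -c 3 10.10.10.3
# check the configured MTU of the Ceph bond
ip link show bond1

If a node still runs MTU 1500 somewhere on the path, the large ping fails (locally with "message too long" or by timing out) while a normal ping keeps working.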

Any suggestions?

 
What is the network config of pve-01? And what does the ceph.conf look like?
 
Here is the network config of pve-01 (it is the same for pve-02/pve-03):

Code:
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet manual
    mtu 9000

auto eth1
iface eth1 inet manual
    mtu 9000

auto eth2
iface eth2 inet manual
    mtu 9000

auto eth3
iface eth3 inet manual
    mtu 9000

auto bond0
iface bond0 inet manual
    bond-slaves eth0 eth1
    bond-miimon 100
    bond-mode 802.3ad
    bond-xmit-hash-policy layer2+3
    mtu 9000

auto bond1
iface bond1 inet static
    address 10.10.10.1/24
    bond-slaves eth2 eth3
    bond-miimon 100
    bond-mode 802.3ad
    bond-xmit-hash-policy layer2+3
    mtu 9000

auto vmbr0
iface vmbr0 inet static
    address xx.xxx.xxx.xxx/27
    gateway xx.xxx.xxx.xxx
    bridge-ports bond0
    bridge-stp off
    bridge-fd 0
    mtu 9000
#Wan

ceph.conf:
Code:
[global]
     auth client required = cephx
     auth cluster required = cephx
     auth service required = cephx
     cluster network = 10.10.10.0/24
     fsid = 15ebfe6d-db76-4ed3-bf14-3a31243ca94e
     mon allow pool delete = true
     osd journal size = 5120
     osd pool default min size = 1
     osd pool default size = 2
     public network = 10.10.10.0/24
     mon_host = 10.10.10.1 10.10.10.2 10.10.10.3

[osd]
     keyring = /var/lib/ceph/osd/ceph-$id/keyring

[client]
    keyring = /etc/pve/priv/$cluster.$name.keyring

[mon.pve-01]
     host = pve-01
     mon addr = 10.10.10.1:6789

[mon.pve-02]
     host = pve-02
     mon addr = 10.10.10.2:6789

[mon.pve-03]
     host = pve-03
     mon addr = 10.10.10.3:6789
 
iface bond1 inet static address 10.10.10.1/24
Try to set the netmask in the old style, as a separate config line. Or maybe it doesn't see the bond interface at all.
 
Hmm, really? It was working for more than a year until the upgrade. I do not think that is really the issue, since Proxmox uses Debian Buster under the hood.
 
It's the ceph-osd service that is complaining.
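The error indicates that ceph-osd could not find a local IPv4 address inside the configured cluster/public network (10.10.10.0/24) when it started. A quick check on the affected node could look like this (a sketch, using the network from the posted ceph.conf):

Code:
# is there currently an address in the Ceph network on this host?
ip -4 addr show | grep '10\.10\.10\.'
# which interface carries the route for the cluster/public network?
ip -4 route show 10.10.10.0/24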
 
Changing from

Code:
iface bond1 inet static
    address 10.10.10.1/24

to
Code:
iface bond1 inet static
    address 10.10.10.1
    netmask 255.255.255.0

has no effect; all nodes reboot, same as before. Any other suggestions?
 
has no effect; all nodes reboot, same as before. Any other suggestions?
ATM, I would think it could be the bond interface, since the interface field in the log message is empty.
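If the bond is suspect, its state can be inspected directly; a small sketch using the interface names from the posted config:

Code:
# LACP / slave state of the Ceph bond
cat /proc/net/bonding/bond1
# MTU and operational state of the bond and its members
ip -d link show bond1
ip link show eth2
ip link show eth3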
 
Hmm, I can reproduce it: rebooting pve-01 or pve-02 is not a problem, everything is fine, but rebooting pve-03 causes pve-01 and pve-02 to reboot.
 
Hmm, I can reproduce it: rebooting pve-01 or pve-02 is not a problem, everything is fine, but rebooting pve-03 causes pve-01 and pve-02 to reboot.
Do you reboot both at the same time?
 
The nodes are connected to a 10 GbE switch with configured trunks, so cabling shouldn't be the problem. Ceph is working without issues; only the monitors seem to restart the nodes.
 
Do you reboot both at the same time?


No. I first restart pve-03, wait roughly 30 seconds, and then pve-01 and pve-02 restart automatically... It is a bit annoying and drives me round the bend, because this is a production system...

If I only restart pve-01, everything is fine and all other nodes stay online. The same happens if I restart pve-02.
 
If those nodes get fenced, then you should be able to find out from the logs (syslog/messages) why that was the case. I'd assume some issue with your Ceph/OSD config. I was actually wondering about this config:

Code:
osd pool default min size = 1
osd pool default size = 2

when you have a 3-node setup, since for a 3-node cluster it should probably look more like this:

Code:
osd pool default min size = 2
osd pool default size = 3

This way, you can lose one Ceph node and still run the Ceph cluster in a degraded state; once you lose the second node, the Ceph pool becomes read-only.
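For an existing pool, those values can be checked and adjusted at runtime; a sketch, where <poolname> is a placeholder for the actual RBD pool:

Code:
# current replication settings
ceph osd pool get <poolname> size
ceph osd pool get <poolname> min_size
# recommended for a 3-node cluster: 3 replicas, stay writable with 2 left
ceph osd pool set <poolname> size 3
ceph osd pool set <poolname> min_size 2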
 
No. I first restart pve-03, wait roughly 30 seconds, and then pve-01 and pve-02 restart automatically... It is a bit annoying and drives me round the bend, because this is a production system...
As @budy said, the nodes get fenced. Best check the syslog and remove all HA resources (no fencing). This points to a network issue, which would fit the Ceph message as well.
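A sketch of how to check the fencing side on a surviving node (the timestamp matches the log above; vm:100 is a placeholder resource ID):

Code:
# HA stack and watchdog messages around the time of the reboot
journalctl -u pve-ha-lrm -u pve-ha-crm -u watchdog-mux --since "2020-07-23 13:30"
# list HA resources and temporarily remove them while testing, so nothing gets fenced
ha-manager status
ha-manager remove vm:100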
 
I found the issue: pve-03 had an MTU of 1500 while all other nodes are set to 9000. After configuring pve-03 to MTU 9000, everything works like before.
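For reference, a sketch of that change on pve-03 (assuming ifupdown2 is installed so ifreload works; otherwise a reboot applies it), followed by the jumbo-frame check from earlier in the thread:

Code:
# /etc/network/interfaces on pve-03: bond1 and its slave NICs need mtu 9000
#     iface bond1 inet static
#         address 10.10.10.3/24
#         ...
#         mtu 9000
ifreload -a                            # or reboot the node
ping -M do -s 8972 -c 3 10.10.10.1     # jumbo frames must now pass to the other nodes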
 
