We have 3 nodes with 2 private NICs:
eth0 - 192.168.0.0/24: corosync link 1, Ceph storage, Proxmox GUI and cluster communication, e.g. migration
eth1 - 192.168.1.0/24: corosync link 0
We have set up the corosync nodes with 2 rings (ring0_addr: 192.168.1.2 / ring1_addr: 192.168.0.2).
This works well.
So corosync's primary link (ring0) does not run on the storage network.
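For context, the nodelist in our corosync.conf looks roughly like this (only storage1 shown, the other two nodes are analogous):

nodelist {
  node {
    name: storage1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.1.2
    ring1_addr: 192.168.0.2
  }
  ...
}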
Our ceph.conf:

[global]
    auth_client_required = cephx
    auth_cluster_required = cephx
    auth_service_required = cephx
    cluster network = 192.168.0.0/24
    fsid = 00cf7097-2cb5-47f6-b342-e55094bf839e
    mon_host = 192.168.0.20 192.168.0.2 192.168.0.1
    mon_initial_members = storage1 storage2
    public network = 192.168.0.0/24
[BTW: which is correct in the option names, the underscore or the space? We have seen different statements.]
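We have seen both variants in examples, e.g.:

    public network = 192.168.0.0/24
    public_network = 192.168.0.0/24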
Now if eth0 fails on storage1 (corosync keeps working via the remaining link, ring0 on eth1, so the node is not marked offline):
- the Proxmox GUI cannot reach the host: "no route to host"
- the VMs on storage1 keep running (pvecm says the node is online)
- the problem is that Ceph is no longer reachable, so the VMs hang, but Proxmox HA thinks everything is fine
- ceph -s hangs, of course
How can I tell Ceph to use eth1/192.168.1.0/24 as an alternative if eth0/192.168.0.0/24 fails?
Via public network in [global]?
public network = 192.168.0.0/24, 192.168.1.0/24 does not seem to work; besides, the monitors are only on the 192.168.0.0/24 network.
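I.e. something like this did not help:

[global]
    ...
    public network = 192.168.0.0/24, 192.168.1.0/24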
Or: could Ceph fence the host so that corosync can mark it as faulty?
Or do we have a fundamental design flaw?
Thanks.