Adding a cluster_network to an existing all-public_network configuration

liszca

Since new hardware has arrived, I wanted to configure a separate network for the OSDs.

It's 4 hosts, each with one OSD:

Code:
 # ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME         STATUS  REWEIGHT  PRI-AFF
-1         3.72595  root default                              
-3         0.93149      host aegaeon                          
 0    ssd  0.93149          osd.0         up   1.00000  1.00000
-5         0.93149      host anthe                            
 1    ssd  0.93149          osd.1         up   1.00000  1.00000
-7         0.93149      host atlas                            
 2    ssd  0.93149          osd.2         up   1.00000  1.00000
-9         0.93149      host calypso                          
 3    ssd  0.93149          osd.3         up   1.00000  1.00000

So I configured the extra Ethernet interfaces with static IPs 10.1.0.10..13 for the hosts aegaeon, anthe, atlas and calypso.
Then I changed /etc/ceph/ceph.conf (cluster_network 10.1.0.10/24) and restarted the services: systemctl restart ceph*

After host 10.1.0.10 came back I checked the cluster_network:
ceph config set global cluster_network
This gave me the expected result, so I ran systemctl restart ceph* on every other node.
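
For reference, a minimal sketch of how the setting can be checked and set through the monitors' config database instead of only editing ceph.conf; the subnet is the one from my later config, so treat the value as an example:

Code:
# show everything network-related stored in the config database
ceph config dump | grep network

# set the cluster network centrally for all daemons (10.1.0.0/24 is my OSD subnet, adjust to yours)
ceph config set global cluster_network 10.1.0.0/24

# daemons only pick up the new back-side address after a restart
systemctl restart ceph-osd.target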
But somehow the cluster didn't come back together by itself and kept complaining about slow ops.
Exact message:
"oldest one blocked for 221 sec, mon.aegaeon has slow ops"

Is my approach wrong, or do I have to set the cluster_network differently than I did?
 
The services might need to be recreated, as they might still be running only with public IPs in Ceph. What network bandwidth do you have in the Ceph networks? Separating the Ceph networks usually helps the most when you are on low bandwidth and have more than 3 hosts; the cluster network helps the most during recovery. I would destroy one of the mons (if you have 4), then create a new one, delete the next one, recreate it, and so on until you are finished.
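A rough sketch of that rolling recreation on Proxmox VE; do one monitor at a time and wait for quorum in between (aegaeon is just the example hostname from your tree):

Code:
# on the node whose monitor is to be replaced, e.g. aegaeon
pveceph mon destroy aegaeon
# make sure the remaining monitors still have quorum before continuing
ceph -s
# recreate the monitor on the same node
pveceph mon create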

Can you share your /etc/network/interfaces file? Having only one OSD per host might not help that much; it's likely that you are limiting the performance with the single OSD.
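
If you want to check whether the single OSD per host is already the bottleneck, a quick rados benchmark is one way; the pool name below is just a placeholder:

Code:
# 'testpool' is a placeholder, use an existing (test) pool
# write for 10 seconds and keep the objects for the read test
rados bench -p testpool 10 write --no-cleanup
# sequential read of the objects written above
rados bench -p testpool 10 seq
# remove the benchmark objects afterwards
rados -p testpool cleanup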
 
I planned for 2.5Gb, but 3 of the 5 USB 2.5Gb adapters were not able to operate correctly.

After removing the faulty USB Ethernet adapters I managed to set it up and it is running, but I am using 1Gb Ethernet now.
Recovery doesn't even seem to use the full 1Gb bandwidth; I am curious whether the remaining USB Ethernet adapter is still faulty.
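
To rule out the link itself, the raw throughput over the cluster-network addresses can be measured with iperf3; the IP is the one from my aegaeon config:

Code:
# on aegaeon (10.1.0.10), start the server side
iperf3 -s
# on another node, test towards it over the cluster network for 30 seconds
iperf3 -c 10.1.0.10 -t 30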

In case somebody is interested in which hardware:
https://geizhals.de/inter-tech-argus-it-732-lan-adapter-88885593-a2750561.html
What I noticed on the broken ones:
  • They only managed to negotiate 1Gb (see the ethtool check below)
  • And they didn't get as warm as the working ones, which I think is normal in 1Gb mode
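
Checking what an adapter actually negotiated; the interface name is the one from my aegaeon config, adjust per host:

Code:
# negotiated link speed and duplex
ethtool enx00e04c680029 | grep -E "Speed|Duplex"
# driver and firmware details of the USB adapter
ethtool -i enx00e04c680029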

I have multiple different network configurations:

Host: Aegaeon
Code:
auto lo
iface lo inet loopback

iface enxf4b52021da43 inet manual
    ethernet-wol g

auto enx00e04c680029
iface enx00e04c680029 inet static
    address 10.1.0.10/24

auto vmbr0
iface vmbr0 inet static
    address 192.168.0.10/24
    gateway 192.168.0.1
    bridge-ports enxf4b52021da43
    bridge-stp off
    bridge-fd 0

Host: Anthe
Code:
auto lo
iface lo inet loopback

iface enx6045cba2e668 inet manual
    ethernet-wol g

iface enx001f2955f0d4 inet manual

auto enx001f2955f0d5
iface enx001f2955f0d5 inet static
    address 10.1.0.11/24

auto vmbr0
iface vmbr0 inet static
    address 192.168.0.11/24
    gateway 192.168.0.1
    bridge-ports enx6045cba2e668
    bridge-stp off
    bridge-fd 0

Host: Atlas
Code:
(Atlas is powered off for testing recovery speed)


Host: Calypso
Code:
auto enx00e04c680053
iface enx00e04c680053 inet static
    address 10.1.0.13/24
    ethernet-wol g

iface enxf4b520183dac inet manual

auto vmbr0
iface vmbr0 inet static
    address 192.168.0.13/24
    gateway 192.168.0.1
    bridge-ports enxf4b520183dac
    bridge-stp off
    bridge-fd 0


And the Ceph config; I also added the OSDs to the public network explicitly in addition:


Code:
[global]
     auth_client_required = cephx
     auth_cluster_required = cephx
     auth_service_required = cephx
     cluster_network = 10.1.0.0/24
     #cluster_network = 192.168.0.0/24
     fsid = ddfe12d5-782f-4028-b499-71f3e6763d8a
     mon_allow_pool_delete = true
     mon_host = 192.168.0.10 192.168.0.11 192.168.0.12 192.168.0.13
     ms_bind_ipv4 = true
     ms_bind_ipv6 = false
     osd_pool_default_min_size = 2
     osd_pool_default_size = 3
     public_network = 192.168.0.0/24

[client]
     keyring = /etc/pve/priv/$cluster.$name.keyring

[mds]
     keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mds.aegaeon]
     host = aegaeon
     mds_standby_for_name = pve

[mds.anthe]
     host = anthe
     mds_standby_for_name = pve

[mds.atlas]
     host = atlas
     mds_standby_for_name = pve

[mds.calypso]
     host = calypso
     mds_standby_for_name = pve

[mon.aegaeon]
     public_addr = 192.168.0.10

[mon.anthe]
     public_addr = 192.168.0.11

[mon.atlas]
     public_addr = 192.168.0.12

[mon.calypso]
     public_addr = 192.168.0.13

[osd]
    public_network = 192.168.0.0/24
    cluster_network = 10.1.0.0/24

[osd.0]
    host = aegaeon
    public_addr = 192.168.0.10/24
    cluster_addr = 10.1.0.10/24
[osd.1]
    host = anthe
    public_addr = 192.168.0.11/24
    cluster_addr = 10.1.0.11/24
[osd.2]
    host = atlas
    public_addr = 192.168.0.12/24
    cluster_addr = 10.1.0.12/24
[osd.3]
    host = calypso
    public_addr = 192.168.0.13/24
    cluster_addr = 10.1.0.13/24
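
To verify that the OSDs really bound to both networks after the restarts and are not still public-only, the front (public) and back (cluster) addresses can be checked roughly like this (osd.0 as an example):

Code:
# front and back addresses of all OSDs
ceph osd dump | grep "^osd\."
# more detail for a single OSD
ceph osd metadata 0 | grep -E "front_addr|back_addr"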
 