I'm very new to Ceph and have been trying to synthesize the instructions below to create a 3-node full mesh routed (with fallback) cluster using Minisforum MS-01s (2x 10GbE + 2x 2.5GbE):
- https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server#Routed_Setup_(with_Fallback)
- https://pve.proxmox.com/wiki/Deploy_Hyper-Converged_Ceph_Cluster
Code:
root@pve02:~# pveceph mon create --mon-address 10.15.15.51
Could not connect to ceph cluster despite configured monitors
Looking for any and all input on what to try next.
What I've tried so far:
- Rebuilt the nodes from scratch twice (each time running the Proxmox Helper-Scripts post-install update script afterwards)
- Services seem to be running correctly, but I could use some input here on how to confirm that
- Confirmed node1 is listening on the correct ports (ss -tulw)
- Pings to the IPs all work correctly (both inside and outside 'vtysh')
- Pinging DNS names doesn't work (I haven't added the names to the hosts files, since I'm not 100% sure of the ramifications for the cluster beyond Ceph)
- Disabled the firewall cluster-wide
- Checked iptables; there seem to be no explicit rules affecting anything (I'm an iptables noob)
- Telnet to node 1 fails:
Code:
root@pve02:~# telnet 10.15.15.50 6789
Trying 10.15.15.50...
telnet: Unable to connect to remote host: Connection refused
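Since "connection refused" suggests nothing is bound on 6789 at all, this is roughly how I've been checking whether a monitor is actually up and listening on node1 (assuming the mon ID is pve01; the guards and the fallback echo are just so each line prints something either way):

```shell
# On node1 (pve01): is the monitor service even running?
systemctl status ceph-mon@pve01 --no-pager 2>/dev/null || true

# Is anything bound to the mon ports (3300 = msgr2, 6789 = msgr1)?
ss -tlnp 2>/dev/null | grep -E ':(3300|6789)' || echo "no monitor listening"

# Recent mon log lines, in case it starts and then immediately exits
journalctl -u ceph-mon@pve01 -n 50 --no-pager 2>/dev/null || true
```

On my node1 the grep comes back empty, which matches the refused telnet.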
I'm sure it's a misconfiguration on my part, but I seem to have exhausted the documentation and my Google searches for a potential fix.
FRR config (/etc/frr/frr.conf):
Code:
log syslog informational
frr defaults traditional
hostname pve01
log syslog warning
ip forwarding
no ipv6 forwarding
service integrated-vtysh-config
!
interface lo
 ip address 10.15.15.50/24
 ip router openfabric 1
 openfabric passive
!
interface enp2s0f0np0
 ip router openfabric 1
 openfabric csnp-interval 2
 openfabric hello-interval 1
 openfabric hello-multiplier 2
!
interface enp2s0f1np1
 ip router openfabric 1
 openfabric csnp-interval 2
 openfabric hello-interval 1
 openfabric hello-multiplier 2
!
line vty
!
router openfabric 1
 net 49.0001.1111.1111.1111.00
 lsp-gen-interval 1
 max-lsp-lifetime 600
 lsp-refresh-interval 180
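One thing I keep second-guessing in the config above: my loopback carries the full /24, whereas the wiki's routed (with fallback) example, if I'm reading it right, puts a /32 host address on lo and lets openfabric handle reachability, i.e. something like:

```
interface lo
 ip address 10.15.15.50/32
 ip router openfabric 1
 openfabric passive
!
```

I haven't been able to tell whether the /24 on lo matters here, so flagging it in case it's relevant.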
Networking (/etc/network/interfaces):
Code:
auto lo
iface lo inet loopback

iface enp90s0 inet manual

iface enp87s0 inet manual

auto enp2s0f0np0
iface enp2s0f0np0 inet static
    mtu 9000

auto enp2s0f1np1
iface enp2s0f1np1 inet manual
    mtu 9000

auto vmbr0
iface vmbr0 inet static
    bridge-ports enp90s0
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094

auto Management
iface Management inet static
    address 192.168.100.10/24
    gateway 192.168.100.1
    vlan-id 100
    vlan-raw-device vmbr0

iface wlp90s0 inet manual

auto vmbr1
iface vmbr1 inet manual
    bridge-ports enp87s0
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094

source /etc/network/interfaces.d/*
post-up /usr/bin/systemctl restart frr.service
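After bringing the interfaces up, I've been verifying the fabric itself from vtysh, along the lines of the wiki's verification step (the guard is just so the snippet doesn't abort on a box without FRR installed):

```shell
# Show the openfabric topology learned over the mesh links (run on each node)
if command -v vtysh >/dev/null 2>&1; then
  vtysh -c 'show openfabric topology'
else
  echo "vtysh not installed"
fi
```

This does show both neighbors from every node, which is why I believe the routing layer itself is fine.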
Ceph conf (/etc/pve/ceph.conf):
(I've gone back and forth between 10.15.15.0/24 vs 10.15.15.50/24)
Code:
[global]
    auth_client_required = cephx
    auth_cluster_required = cephx
    auth_service_required = cephx
    cluster_network = 10.15.15.50/24
    fsid = 99639064-1ae4-4f0b-9631-07854c9a6424
    mon_allow_pool_delete = true
    mon_host = 10.15.15.50
    ms_bind_ipv4 = true
    ms_bind_ipv6 = false
    osd_pool_default_min_size = 2
    osd_pool_default_size = 3
    public_network = 10.15.15.50/24

[client]
    keyring = /etc/pve/priv/$cluster.$name.keyring

[client.crash]
    keyring = /etc/pve/ceph/$cluster.$name.keyring

[mon.pve01]
    public_addr = 10.15.15.50
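On the cluster_network/public_network back-and-forth above: as far as I can tell, 10.15.15.50/24 and 10.15.15.0/24 name the same subnet once the host bits are masked off (my assumption being that Ceph normalises them the same way), so I doubt this is the cause. A quick sketch of what I mean:

```shell
#!/usr/bin/env bash
# Sketch: mask off the host bits of an <ip>/<prefixlen> pair to get the
# network address, showing both spellings collapse to the same /24.

ip_to_int() {                      # dotted quad -> 32-bit integer
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

int_to_ip() {                      # 32-bit integer -> dotted quad
  echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

network_of() {                     # network_of <ip> <prefixlen>
  mask=$(( (0xFFFFFFFF << (32 - $2)) & 0xFFFFFFFF ))
  int_to_ip $(( $(ip_to_int "$1") & mask ))
}

echo "$(network_of 10.15.15.50 24)/24"   # -> 10.15.15.0/24
echo "$(network_of 10.15.15.0 24)/24"    # -> 10.15.15.0/24
```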