GUI cannot connect to ceph cluster

gollo
Oct 3, 2014
Hi,

I have set up a Proxmox VE 3.3 cluster with three nodes (HP ProLiant DL380 G6), with a Ceph cluster on the same machines (all three nodes currently have two disks used as OSDs).
I am using one NIC for the public-facing side (vmbr0 on eth0) and three NICs bonded together for the storage and cluster communication (vmbr1 on bond0). My /etc/network/interfaces looks like this:
Code:
auto lo
iface lo inet loopback

auto vmbr0
iface vmbr0 inet static
    address PUBLIC_IPv4
    netmask 255.255.255.192
    gateway GATEWAY
    bridge_ports eth0
    bridge_stp off
    bridge_fd 0

iface vmbr0 inet6 static
    address PUBLIC_IPv6
    netmask 64
    up route -A inet6 add v6PREFIX::/48 dev vmbr0
    up route -A inet6 add default gw v6PREFIX::1 dev vmbr0
    down route -A inet6 del default gw v6PREFIX::1 dev vmbr0
    down route -A inet6 del v6PREFIX::/48 dev vmbr0


auto bond0
iface bond0 inet manual
    slaves eth1 eth2 eth3
    bond_mode 4
    bond_miimon 100
    bond_lacp_rate 1
    bond_xmit_hash_policy 2
    bond_downdelay 200
    bond_updelay 200
    mtu 9000

auto vmbr1
iface vmbr1 inet static
    address 172.30.254.5
    netmask 255.255.255.0
    bridge_ports bond0
    bridge_stp off
    bridge_fd 0
    mtu 9000
    up route add -net 224.0.0.0 netmask 240.0.0.0 dev vmbr1
    down route del -net 224.0.0.0 netmask 240.0.0.0 dev vmbr1
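
In case it matters: the bond and the jumbo frames can be checked with something like this (the peer address is just one of the other nodes on the storage network):
Code:
# LACP / slave status of the bond on this node
cat /proc/net/bonding/bond0
# verify that MTU 9000 works towards another node (8972 = 9000 minus 28 bytes of IP/ICMP headers)
ping -M do -s 8972 172.30.254.10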

In the storage network, the nodes have the addresses .5, .10 and .15.
In /etc/hosts, the storage network IPs are mapped to the hostnames, and AFAICT all cluster and storage traffic is going through the bond.
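For reference, the /etc/hosts entries look roughly like this (the host names here are placeholders, not the real node names):
Code:
172.30.254.5    node1.example.local node1
172.30.254.10   node2.example.local node2
172.30.254.15   node3.example.local node3
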
I was able to set up the OSDs and the pool via the GUI and I could also add the RBD storage that way. My /etc/pve/storage.cfg looks like this:
Code:
rbd: pool-ceph01
    monhost 172.30.254.5 172.30.254.10 172.30.254.15
    pool pool-ha01
    content images
    username admin

The Ceph storage is properly shown on each node in the GUI. However, when I select the storage and open the "Content" tab, I cannot do anything there; I only get the error message
rbd error: rbd: couldn't connect to the cluster! (500)

I am rather new to PVE, so I was not entirely sure where to look for the cause (of course I searched the web, but this error does not seem to be very common). I also looked through the logs in /var/log, but couldn't find anything suspicious there.
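
If it helps to narrow things down, these are the checks I can run directly on one of the nodes (the pool name is the one from storage.cfg above):
Code:
# overall cluster health and monitor quorum
ceph -s
# talk to one of the monitors on the storage network directly
ceph -m 172.30.254.5 -s
# list the pools and the images in the RBD pool
rados lspools
rbd ls pool-ha01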

Any hints on a possible cause or a place to look are highly appreciated. If you need more information about the setup, I will provide it ASAP.

Thank you very much!
 
Thanks for the hint! I have changed it now, but I still get the same error. Should I restart any of the services?
 
I set everything up with pveceph and the GUI. I have just checked the file you mentioned. I found
/etc/pve/priv/ceph.client.admin.keyring but not /etc/pve/priv/ceph/pool-ceph01.keyring. I will copy it now and see if it works. Should this file be copied automatically when working with the GUI?
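For anyone who finds this thread later, this is roughly the copy I mean (if I understand it correctly, the keyring name has to match the storage ID pool-ceph01 from storage.cfg, not the pool name):
Code:
mkdir -p /etc/pve/priv/ceph
cp /etc/pve/priv/ceph.client.admin.keyring /etc/pve/priv/ceph/pool-ceph01.keyring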
 
I actually had (I had even read quite a lot of blog posts on the topic as well), but I obviously overlooked this particular part. I'm very sorry :( But at least now any search engine will find this thread when searching for the error message, so people who make the same mistake will find the solution faster :)