[SOLVED] Using external Ceph cluster with Proxmox

Is the connection to the external ceph cluster working? You can, for example, click into one of the content types for that storage. If you get an error, there is most likely a deeper problem connecting to the cluster.
 
Yes, I can currently connect to the Ceph GUI and see all active nodes, all storage (OSDs), and the Grafana dashboard with all its features.

I've also tried another method, but when I create the RBD storage, it gives me this error:
"Failed to create storage: mount error: job failed. See "journalctl -xe" for details. (500)"
 
You did check it in the Proxmox VE GUI, right?

Can you post the contents of your /etc/pve/storage.cfg file and the output of pvesm status? Ideally inside [CODE][/CODE] tags.
 
Yes, I have checked it there; I use the GUI regularly for troubleshooting and such, so it works well.

storage.cfg:
Code:
dir: local
        path /var/lib/vz
        content iso,vztmpl,backup
        shared 0

lvmthin: local-lvm
        thinpool data
        vgname pve
        content images,rootdir

pbs: storage
        datastore storage
        server X.X.X.X
        content backup
        fingerprint X>
        prune-backups keep-all=1
        username root@pam
    
nfs: VM-PROXMOX
        export /volume1/VM-PROXMOX
        path /mnt/pve/VM-PROXMOX
        server X.X.X.X
        content snippets,iso,rootdir,backup,vztmpl,images
        prune-backups keep-all=1

rbd: P_Pool
        content rootdir,images
        krbd 1
        monhost X.X.X.X
        pool P_Pool
        username admin

pvesm status:
Code:
root@pve-1:~# pvesm status
Not a proper rbd authentication file: /etc/pve/priv/ceph/P_Pool.keyring
Name              Type     Status           Total            Used       Available        %
P_Pool             rbd   inactive               0               0               0    0.00%
VM-PROXMOX         nfs     active      1917967488       181555072      1736293632    9.47%
local              dir     active        98497780        21864692        71583540   22.20%
local-lvm      lvmthin     active       354275328               0       354275328    0.00%
storage            pbs     active        14764660         4732088         9260772   32.05%
 
Okay, it is shown as inactive. Do you get an error message that might shed some light if you run pvesm list P_Pool?

And just to make sure, all Proxmox VE nodes can talk to all the nodes in the Ceph cluster on the network that is configured as the Ceph Public network?
The keyring used when you added the storage has the right permissions for RBD?
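As a basic sanity check for the network part (assuming the monitors listen on the default ports and netcat is installed on the node), something like this from each Proxmox VE node should succeed:
Code:
# replace X.X.X.X with the IP of one of the external Ceph monitors
nc -vz X.X.X.X 3300   # msgr2 port (default since Nautilus)
nc -vz X.X.X.X 6789   # legacy msgr1 port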
 
Code:
root@pve-1:~# pvesm list P_Pool
Not a proper rbd authentication file: /etc/pve/priv/ceph/P_Pool.keyring

All nodes in both clusters can communicate with each other.

Is there a command to check whether the keyring used when adding the storage has the right permissions for RBD?
I'm not sure, but I think it has the right permissions.
 
"Not a proper rbd authentication file" sounds like you did not provide a proper keyring file for the user.

Since the user in the storage config is called admin, you would need to run
Code:
$ ceph auth get client.admin
[client.admin]
    key = AQCNr8BlF2jeHhAAyhew3Vyp8bUnF5BV+yDYrA==
    caps mds = "allow *"
    caps mgr = "allow *"
    caps mon = "allow *"
    caps osd = "allow *"
on the Ceph cluster and paste the output in the keyring field when adding the storage.
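As a side note, reusing client.admin works but grants more permissions than needed; on the Ceph side you could instead create a dedicated client that only has RBD access to that pool, roughly like this (the client name "proxmox" and the pool name are just placeholders):
Code:
# run on the external Ceph cluster
ceph auth get-or-create client.proxmox mon 'profile rbd' osd 'profile rbd pool=P_Pool'
You would then use proxmox as the username in the storage config and paste that keyring instead.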
 
I added the storage from "Datacenter" -> "Storage" -> "Add" -> "RBD".
It was added successfully and shows up on each of the nodes, but no storage space is visible; it says "unavailable".


The second method I tried was to add CephFS storage directly, which returned "create storage failed: mount error: Job failed. See "journalctl -xe" for details. (500)", but journalctl -xe shows nothing relevant.
 
If you check, on one of the nodes where you want to connect to the other Ceph cluster via RBD, does the following file contain the keyring info as you got it from the Ceph cluster and provided it when configuring the storage?
Code:
/etc/pve/priv/ceph/{storage name}.keyring

You can also try it manually to see if it should work:
Code:
rbd -m {ip of monitor} --keyring /etc/pve/priv/ceph/{storage}.keyring ls
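One assumption to be aware of: without -n/--id, the rbd tool defaults to the client.admin user, which happens to match the username in your storage config. If you ever use a different client, pass it explicitly, for example:
Code:
rbd -m {ip of monitor} -n client.admin --keyring /etc/pve/priv/ceph/{storage}.keyring ls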
 
My file
Code:
/etc/pve/priv/ceph/{storage name}.keyring
contains the key I got when I ran the ceph auth get client.admin command.

This command:
Code:
rbd -m {ip of monitor} --keyring /etc/pve/priv/ceph/{storage}.keyring ls
returns:
Code:
root@pve-1:/etc/pve/priv/ceph# rbd -m X.X.X.X --keyring /etc/pve/priv/ceph/P_Pool.keyring ls
2024-02-09T16:02:24.077+0100 7f62b25046c0 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2,1]
rbd: couldn't connect to the cluster!
rbd: listing images failed: (1) Operation not permitted

I also tried with another file that contains the full output (key and caps).

First file:
Code:
<keyring>
It returns:

Code:
root@pve-1:/etc/pve/priv/ceph# rbd -m X.X.X.X --keyring /etc/pve/priv/ceph/P_Pool.keyring ls
2024-02-09T16:28:29.969+0100 7fa309e7d6c0 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2,1]
rbd: couldn't connect to the cluster!
rbd: listing images failed: (1) Operation not permitted
Second file:
Code:
[client.admin]
        key = XXXXXXXXXXXXXXX
        caps mds = "allow *"
        caps mgr = "allow *"
        caps mon = "allow *"
        caps osd = "allow *"
It returns:

Code:
root@pve-1:/etc/pve/priv/ceph# rbd -m X.X.X.X --keyring /etc/pve/priv/ceph/P_Pool.key ls
rbd: error opening default pool 'rbd'
Ensure that the default pool has been created or specify an alternate pool name.
rbd: listing images failed: (2) No such file or directory
 
Ah, the last output looks good; the connection seems to have worked. If you did not call the pool "rbd", you need to specify the pool name as well, i.e. an additional -p {pool}.
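Putting that together with the storage definition from earlier in the thread, the manual test would look like this (monitor IP redacted as before):
Code:
rbd -m X.X.X.X --keyring /etc/pve/priv/ceph/P_Pool.keyring -p P_Pool ls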
 
The command works and I don't get any errors, but my storage is still unavailable.

I've added the pool, but I still get the "unavailable" message.

Furthermore, when I click on "Virtual Machine Disk" and "Container volumes", I get "Not a proper rbd authentication file: /etc/pve/priv/ceph/P_Pool.keyring (500)".
 
So, when you recreate the storage, add the contents of the "second file" in the keyring field as that seems to be good. :)
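Once it is re-added, a quick sanity check on one of the nodes would be to confirm that the keyring ended up in the expected file and that the storage reports as active:
Code:
cat /etc/pve/priv/ceph/P_Pool.keyring
pvesm status
pvesm list P_Pool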
 
I'll try as soon as possible; I'll edit the topic status if it works.

Thanks in advance for your help! ;)
 
Hello, it works fine now, thank you very much!

But I have a Ceph cluster of 3 nodes (300 GB per node), and I only see 300 GB instead of 900 GB in my storage...
Do you have any idea why?
 
Please read up on how Ceph works. https://docs.ceph.com/en/quincy/architecture/
With a replicated pool, you have a size/min_size of 3/2 by default. That means that Ceph will keep 3 replicas of each object, each on a different host, and will keep the pool operational as long as at least 2 replicas are available.

Do NOT set the min_size to anything lower than 2, unless you don't value your data!

So having a raw capacity of 900 GiB in the cluster and ~300 GiB usable sounds absolutely right.
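If you want to verify the replication settings and the resulting usable capacity on your cluster yourself, you can check them like this (pool name as used above):
Code:
ceph osd pool get P_Pool size       # number of replicas, usually 3
ceph osd pool get P_Pool min_size   # usually 2
ceph df                             # MAX AVAIL already takes replication into account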
 
I'll finalise my documentation and change the thread state!

Well, thank you very much for your help! ;)
 
