[SOLVED] Problem add external rbd storage

infrant

New Member
Apr 30, 2021
Good morning. I have a cluster with 16 Proxmox hosts and an external Ceph cluster configured to store the VMs. It was working normally until we recently had to do maintenance on the storage: we moved all the VMs to another storage, removed the RBD storage from the Proxmox cluster, and reformatted the Ceph cluster. Afterwards we re-added the RBD storage with the same IPs, but it no longer connects. I can get it working on another Proxmox host I have, but not on this cluster where it was configured before the removal; now the status is "unknown". My question is whether some leftover was left behind that prevents Ceph from being connected again.

In the picture, the vmstore RBD storage shows as inactive. I don't see an error, only a timeout. On another Proxmox host this same storage works fine, but on this cluster I get this error.

Any ideas?

Many thanks.

(screenshot: vmstore RBD storage shown as inactive)

mira

Proxmox Staff Member
Aug 1, 2018

infrant

Thanks for the reply. I redid the keyring exchange, but the error continues. If I add the same configuration to a fresh Proxmox host, the storage connects without problems; only in this cluster, where the storage existed before, do I have the problem. Could reusing the same IPs have hit some "garbage", some place I can't see that still holds the old settings?

My storage.cfg looks OK, I believe:


(screenshot: storage.cfg contents)
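For comparison, an external RBD storage entry in /etc/pve/storage.cfg typically looks like the following sketch. The storage and pool names are taken from this thread; the monitor IPs and username are placeholders and will differ in your setup:

```
rbd: vmstore
        content images
        krbd 0
        monhost 192.0.2.1 192.0.2.2 192.0.2.3
        pool vmstore
        username admin
```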
 

mira

Are all of those Ceph Monitors?
Is the external pool called vmstore and the file in /etc/pve/priv/ceph is also called vmstore.keyring?

What's the output of pveversion -v?

What Ceph version is running on the external cluster?
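The two points above can be double-checked from the shell with something like this sketch. The storage/pool name vmstore is from this thread; the monitor address is a placeholder, and the client name is assumed to be client.admin:

```shell
# the keyring file name must match the storage ID from storage.cfg
ls -l /etc/pve/priv/ceph/
# expected: vmstore.keyring

# try the connection manually with the same credentials Proxmox uses
rbd ls vmstore -m 192.0.2.1 -n client.admin \
    --keyring /etc/pve/priv/ceph/vmstore.keyring
```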
 

infrant

Are all of those Ceph Monitors?
Is the external pool called vmstore and the file in /etc/pve/priv/ceph is also called vmstore.keyring?

Yes, the keyring file name is the same as the external pool name.
(screenshot: keyring file in /etc/pve/priv/ceph)

On another test Proxmox host, where I never had this storage, it connects without problems. On this one I had to remove the storage and add it again, and that is when the connection problem appeared.

What Ceph version is running on the external cluster?
Ceph Octopus. The pool vmstore is accessible; another Proxmox host can access it, only this cluster cannot.
(screenshot: pool status)

Code:
root@pve1:~# pveversion -v
proxmox-ve: 6.3-1 (running kernel: 5.4.73-1-pve)
pve-manager: 6.3-2 (running version: 6.3-2/22f57405)
pve-kernel-5.4: 6.3-1
pve-kernel-helper: 6.3-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.5
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-6
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.3-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.5-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-1
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-7
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-1
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1
root@pve1:~#
 

mira

Is there anything in the pvestatd logs? journalctl -u pvestatd
 

infrant

Hey Mira, I found these entries in journalctl. This message appears when I enable the RBD storage:

May 03 14:56:52 pve1 pvestatd[2212]: status update time (5.716 seconds)
May 03 14:57:00 pve1 systemd[1]: Starting Proxmox VE replication runner...
May 03 14:57:00 pve1 systemd[1]: pvesr.service: Succeeded.
May 03 14:57:00 pve1 systemd[1]: Started Proxmox VE replication runner.
May 03 14:57:02 pve1 pvestatd[2212]: got timeout
May 03 14:57:03 pve1 pvestatd[2212]: status update time (5.662 seconds)
May 03 14:57:13 pve1 pvestatd[2212]: got timeout
May 03 14:57:13 pve1 pvestatd[2212]: status update time (5.635 seconds)


And this:

May 03 15:04:00 pve1 systemd[1]: Starting Proxmox VE replication runner...
May 03 15:04:00 pve1 systemd[1]: pvesr.service: Succeeded.
May 03 15:04:00 pve1 systemd[1]: Started Proxmox VE replication runner.
May 03 15:04:03 pve1 pvestatd[2212]: got timeout
May 03 15:04:03 pve1 pvestatd[2212]: status update time (5.630 seconds)
May 03 15:04:07 pve1 pvestatd[2212]: rados_connect failed - Operation not supported
May 03 15:04:22 pve1 pvestatd[2212]: got timeout
May 03 15:04:22 pve1 pvestatd[2212]: rados_connect failed - Operation not supported
May 03 15:04:23 pve1 pvestatd[2212]: status update time (5.629 seconds)
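rados_connect failing with "Operation not supported" can point to a client/cluster mismatch. Since the pveversion output above lists ceph-fuse 12.2 (a Luminous-era client) while the external cluster runs Octopus, comparing the client and cluster versions is a reasonable first step. A sketch (the second command must run on a node of the external Ceph cluster, or via the admin keyring):

```shell
# Ceph client version installed on the Proxmox node
ceph --version

# versions of all daemons, queried on the external Ceph cluster
ceph versions
```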
 

infrant

When I connect this storage, it returns this:

(screenshot: connection error)

And the storage:

(screenshot: storage status)

This is my storage configuration in the cluster:

(screenshot: storage configuration)
 

mira

Try adding the Ceph Repository and update the packages via apt update && apt full-upgrade.
For Nautilus:
Code:
# cat /etc/apt/sources.list.d/ceph.list
deb http://download.proxmox.com/debian/ceph-nautilus buster main
For Octopus:
Code:
# cat /etc/apt/sources.list.d/ceph.list
deb http://download.proxmox.com/debian/ceph-octopus buster main

Don't run pveceph install! That is not required, but a newer client version might help in this situation.
 

infrant

Mira, thank you very much, this solution worked perfectly. That's right, some RBD client package was probably out of date after the external cluster was updated to Octopus, which is why it didn't connect. All 16 hosts are now OK after the update. Thanks again.
 

mira

Glad that solved your issue.
When updating the cluster to the latest version, was there a message similar to:
Code:
client is using insecure global_id reclaim
mons are allowing insecure global_id reclaim
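Those warnings relate to CVE-2021-20288 (insecure global_id reclaim). If they appear, the details can be inspected, and once all clients have been updated, the insecure reclaim can be disabled on the monitors. A sketch, assuming admin access to the external cluster:

```shell
# show the full health warning details
ceph health detail

# only after ALL clients have been updated:
ceph config set mon auth_allow_insecure_global_id_reclaim false
```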
 
