Storage Problems With Proxmox/Ceph

tmsg

Member
Jan 13, 2011
11
0
21
I'm trying to test out a Proxmox/Ceph cluster, and the GUI stops working for all storage-related tasks and information once I set up Ceph.

I set up a nested Proxmox cluster (wiki/Nested_Virtualization) and everything seems to work with that. The hardware server is running pve-manager: 3.4-9 with kernel: 3.10.0-11-pve. I have three VMs set up:
OS: PVE 3.4-9, Kernel 2.6.32-40-pve
NICs: VIRTIO0 - Bridge to Internet connected NIC
VIRTIO1 - Bridge to be used for Proxmox/VMs
VIRTIO2 - Bridge to be used for Ceph
Hard Drives: VIRTIO0 - For Proxmox
VIRTIO1 - For Ceph Journal
VIRTIO2 - For Ceph disk #1
VIRTIO3 - For Ceph disk #2

Everything seems to run fine with the nested Proxmox cluster until I set up Ceph (wiki - Ceph_Server). I have Ceph installed, monitors set up, OSD disks set up and a pool created. Now if I try to check content through the GUI I get a communication failure. If I try to create a VM, the Hard Disk -> Storage box is grey/unavailable. Once I get this communication failure, everything related to storage gets a communication failure as well. Anything that makes a call to /api2/json/nodes/pc1/storage generates an error. Once the storage timeouts start, I may get timeouts in other parts of the GUI, and the graphs do not show any data. Making a request through the GUI for storage-related information seems to trigger this. The same thing happens on all three nested Proxmox nodes. I've started over from a bare-metal install of this configuration several times, double-checking every step along the way.

I've tried creating a VM from the command line with local storage and that works, but the storage is not created.

Any ideas on what is going on would be appreciated.

From access.log
xxx - root-at-pam [20/Aug/2015:10:29:52 -0700] "GET /api2/json/nodes/pc1/storage?content=images HTTP/1.1" 596 -
xxx - root-at-pam [20/Aug/2015:10:29:52 -0700] "GET /api2/json/nodes/pc1/storage?content=iso HTTP/1.1" 596 -
xxx - root-at-pam [20/Aug/2015:10:30:19 -0700] "GET /api2/json/nodes/pc1/storage/local/status HTTP/1.1" 596 -
xxx - root-at-pam [20/Aug/2015:10:30:49 -0700] "GET /api2/json/nodes/pc1/storage/RBD_Pool1/status HTTP/1.1" 596 -
xxx - root-at-pam [20/Aug/2015:10:32:20 -0700] "GET /api2/json/nodes/pc1/storage/local/status HTTP/1.1" 596 -
xxx - root-at-pam [20/Aug/2015:10:32:50 -0700] "GET /api2/json/nodes/pc1/storage/RBD_Pool1/status HTTP/1.1" 596 -

From syslog
Aug 20 10:29:36 pc1 pveproxy[87782]: proxy detected vanished client connection
Aug 20 10:33:06 pc1 pvestatd[3807]: status update time (300.102 seconds)
Aug 20 10:35:22 pc1 pveproxy[87780]: proxy detected vanished client connection

Top level version information
proxmox-ve-2.6.32: 3.4-160 (running kernel: 3.10.0-11-pve)
pve-manager: 3.4-9 (running version: 3.4-9/4b51d87a)
pve-kernel-2.6.32-40-pve: 2.6.32-160
pve-kernel-3.10.0-11-pve: 3.10.0-36
pve-kernel-2.6.32-29-pve: 2.6.32-126
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-3
pve-cluster: 3.0-18
qemu-server: 3.4-6
pve-firmware: 1.1-4
libpve-common-perl: 3.0-24
libpve-access-control: 3.0-16
libpve-storage-perl: 3.0-33
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.2-11
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1

Nested version information
proxmox-ve-2.6.32: 3.4-160 (running kernel: 2.6.32-40-pve)
pve-manager: 3.4-9 (running version: 3.4-9/4b51d87a)
pve-kernel-2.6.32-40-pve: 2.6.32-160
pve-kernel-2.6.32-39-pve: 2.6.32-157
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-3
pve-cluster: 3.0-18
qemu-server: 3.4-6
pve-firmware: 1.1-4
libpve-common-perl: 3.0-24
libpve-access-control: 3.0-16
libpve-storage-perl: 3.0-33
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.2-11
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1

Virtualization working on VM. CPUINFO
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx lm
constant_tsc arch_perfmon rep_good unfair_spinlock pni vmx ssse3 cx16 sse4_1 x2apic hypervisor lahf_lm vnmi

VM Config file
args: -enable-kvm
bootdisk: virtio0
cores: 2
cpu: host
ide2: local:iso/proxmox-ve_3.4-102d4547-6.iso,media=cdrom
memory: 12288
name: PC1
net0: virtio=92:E7:5E:18:CC:E7,bridge=vmbr0
net1: virtio=56:07:4F:EE:E6:5F,bridge=vmbr1
net2: virtio=96:78:D1:F5:11:70,bridge=vmbr2
numa: 0
onboot: 1
ostype: l26
smbios1: uuid=e73284d5-4878-4904-beec-3b3100829b4f
sockets: 2
virtio0: local:101/vm-101-disk-1.qcow2,format=qcow2,size=15G
virtio1: local:101/vm-101-disk-2.qcow2,format=qcow2,size=15G
virtio2: local:101/vm-101-disk-3.qcow2,format=qcow2,size=25G
virtio3: local:101/vm-101-disk-4.qcow2,format=qcow2,size=25G

VM network (have tried with configuring eth1/eth2 instead of vmbr1/vmbr2)
auto lo
iface lo inet loopback

auto vmbr1
iface vmbr1 inet static
address 10.10.10.101
netmask 255.255.255.0
bridge_ports eth1
bridge_stp off
bridge_fd 0

auto vmbr2
iface vmbr2 inet static
address 10.10.11.101
netmask 255.255.255.0
bridge_ports eth2
bridge_stp off
bridge_fd 0

auto vmbr0
iface vmbr0 inet static
address xx.xx.xx.xx
netmask xx.xx.xx.xx
gateway xx.xx.xx.xx
bridge_ports eth0
bridge_stp off
bridge_fd 0


Storage.cfg
rbd: RBD_Pool1
monhost pc1
pool pool1
content images
username admin

dir: local
path /var/lib/vz
content images,iso,vztmpl,rootdir
maxfiles 0


Ceph Status
cluster fe377151-e3ac-498f-8fac-daaf98defe56
health HEALTH_OK
monmap e3: 3 mons at {0=10.10.11.101:6789/0,1=10.10.11.102:6789/0,2=10.10.11.103:6789/0}
election epoch 40, quorum 0,1,2 0,1,2
osdmap e185: 6 osds: 5 up, 5 in
pgmap v550: 320 pgs, 2 pools, 0 bytes data, 0 objects
188 MB used, 124 GB / 124 GB avail
320 active+clean
 

Nicolas Dey

New Member
Dec 7, 2016
22
0
1
Hello tmsg,
Did you find what was causing the 'communication failure (0)' error message in Proxmox Web interface? I am facing the exact same symptoms on latest (4.3-1) version…
Thanks.
 

spirit

Famous Member
Apr 2, 2010
4,583
348
103
www.odiso.com
Have you copied the admin key to /etc/pve/priv/ceph/yourstorage.key?
 

Nicolas Dey

New Member
Dec 7, 2016
22
0
1
Hi spirit,
Yes, I have copied my key to '/etc/pve/priv/ceph/<storage>.keyring' (and not '.key', I believe it is a typo?).
 

Nicolas Dey

New Member
Dec 7, 2016
22
0
1
BTW, I am able to access information for: My Node > Ceph (in the right panel) but I am unable to see any information relative to any of the attached storages: My Node > local (in the left panel) for instance: "Summary" is displaying none or partial graphic usage, and 'communication failure (0)' is written.

From a system point of view, the following commands are running (I may be mistaken, but it seems there is one for each storage I've selected, even the ones that are not Ceph-related!):
Code:
/usr/bin/rados -p rbd -m 10.2.x.201,10.2.x.202,10.2.x.203 -n client.root --keyring /etc/pve/priv/ceph/<storage>.keyring --auth_supported cephx df
Where 10.2.x.0/24 is my private network for Ceph storage. To terminate these commands cleanly I have to run 'service pvestatd restart'.

I am still investigating, but isn't running 'rados' against a non-Ceph storage nonsense?
 

Nicolas Dey

New Member
Dec 7, 2016
22
0
1
And all my disks are back if I comment the CEPH rbd storage in '/etc/pve/storage.cfg'…
 

spirit

Famous Member
Apr 2, 2010
4,583
348
103
www.odiso.com
If you launch the command manually, does it hang?
Can you send your ceph.conf file?

Also, do you have the Proxmox firewall enabled?
 

Nicolas Dey

New Member
Dec 7, 2016
22
0
1
If I launch the command manually, it does hang, even though it prints some output:
Code:
# /usr/bin/rados -p rbd -m 10.2.x.201,10.2.x.202,10.2.x.203 -n client.root --keyring /etc/pve/priv/ceph/<storage>.keyring --auth_supported cephx df
2016-12-09 14:39:27.318429 7f9e91f18700  0 -- :/3289528711 >> 10.2.x.202:6789/0 pipe(0x560c3a106160 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x560c3a10b4d0).fault
2016-12-09 14:39:30.318630 7f9e769f1700  0 -- :/3289528711 >> 10.2.x.203:6789/0 pipe(0x7f9e68000c80 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f9e68001f90).fault
2016-12-09 14:39:33.318797 7f9e91f18700  0 -- :/3289528711 >> 10.2.x.202:6789/0 pipe(0x7f9e68005190 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f9e68006450).fault
…
Until I hit CTRL+C.

For the FW:
  • according to the Datacenter > Firewall > Options: Enable Firewall = No.
  • according to all the Nodes > Firewall > Options: Enable Firewall = Yes.
  • according to 'service pve-firewall status', the status is active.
I would answer yes, the FW is enabled.

Here is a copy of my ceph.conf:
Code:
[global]
  auth client required = cephx
  auth cluster required = cephx
  auth service required = cephx
  cluster network = 10.2.x.0/24
  filestore xattr use omap = true
  fsid = 6a28a5eb-49c4-41a6-XXXX-XXXXXXXXXXXX
#  keyring = /etc/pve/priv/$cluster.$name.keyring
  keyring = /etc/ceph/$cluster.$name.keyring
  osd journal size = 5120
  osd pool default min size = 1 # Allow writing 1 copy in a degraded state
  osd pool default size = 2 # Write an object 2 times
  osd pool default pg num = 256
  osd pool default pgp num = 256
  public network = 10.2.0.0/18
  mon initial members = node1,node2,node3
  mon host = 10.2.0.201

[mon]
  debug mon = 9

[mon.2]
  host = node3
  mon addr = 10.2.0.203:6789

[mon.1]
  host = node2
  mon addr = 10.2.0.202:6789

[mon.0]
  host = node1
  mon addr = 10.2.0.201:6789

[osd]
#  debug osd = 9
  keyring = /var/lib/ceph/osd/ceph-$id/keyring

[osd.0]
  public addr = 10.2.0.201
  cluster addr = 10.2.x.201

[osd.1]
  public addr = 10.2.0.201
  cluster addr = 10.2.x.201

[osd.2]
  public addr = 10.2.0.202
  cluster addr = 10.2.x.202

[osd.3]
  public addr = 10.2.0.202
  cluster addr = 10.2.x.202

[osd.4]
  public addr = 10.2.0.203
  cluster addr = 10.2.x.203

[osd.5]
  public addr = 10.2.0.203
  cluster addr = 10.2.x.203

Please note that private network 10.2.x.0/24 is different from the public one 10.2.0.0/18.
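
As a rough aid for reading output like the above: each `.fault` line appears to name a peer address the client could not connect to. The sketch below only summarises which addresses show up in such lines and how often; the function name `faulting_peers` is made up for illustration, and the parsing assumes the line layout shown in this post.

```shell
# Read rados/ceph client output on stdin and print each distinct peer
# address that appears in a ".fault" line, followed by how often it was seen.
# Assumes lines shaped like: "... >> ADDRESS pipe(...).fault"
faulting_peers() {
  grep '\.fault' |
    sed -n 's/.* >> \([^ ]*\) pipe(.*/\1/p' |
    sort | uniq -c | awk '{ print $2, $1 }'
}
```

It could be fed from the hanging command by capturing both output streams for a few seconds, for example with `timeout 15 <the rados command> 2>&1 | faulting_peers`. If every monitor address given with `-m` ends up in the list, the client is probably not reaching the monitors at all.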
 

Nicolas Dey

New Member
Dec 7, 2016
22
0
1
More information… I can also confirm that my cluster is in the private network:
Code:
# cat /etc/pve/storage.cfg
…
rbd: cephstorage
  monhost 10.2.x.201;10.2.x.202;10.2.x.203
  content images
  krbd
  pool test
  username root
…
When the 'rados' commands are launched, they break the Proxmox interface. For instance, I can no longer properly access the Node 1 > Ceph (right panel) options. Some of them show 'communication timeout (0)' after that. I have to restart Ceph using the following:
  • From the Proxmox interface, I make sure I am not on a CEPH page failing,
  • Then I stop CEPH on each node (update OSD # accordingly):
Code:
service ceph stop
service ceph-osd@0 stop
service ceph-osd@1 stop
service pvedaemon restart
service pvestatd  restart
  • I make sure there is no more 'ceph' user process running (and hanging): 'ps aux | grep ceph'. If there are, just run 'service pvestatd restart' again.
  • Then I restart CEPH on each node: 'service ceph start'.
At this moment, I can again get all menus from: Node1 > Ceph (right panel). If I go back to: Node1 > <ceph storage> (left panel) again, I can reproduce the problem…

I am now wondering whether there is a network issue across subnets, e.g. the public network being unable to reach the private network for some reason. But aren't the ceph-osd daemons made for this? Could it be a routing problem?
 

spirit

Famous Member
Apr 2, 2010
4,583
348
103
www.odiso.com
Clients need to reach the monitors and OSDs through the public network, so you need to use the monitors' public network IPs in /etc/pve/storage.cfg.

The cluster network is only used for inter-OSD replication.
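
A quick way to check this is to compare each `monhost` entry in storage.cfg against the `public network` value from ceph.conf. The following is only a sketch: the function names are made up, it handles IPv4 literals only (not hostnames such as `pc1`), and the CIDR has to be passed in by hand rather than read from ceph.conf.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1            # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Return 0 if IPv4 address $1 lies inside CIDR network $2, else 1.
ip_in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Print every monhost entry of a storage.cfg-style file ($1) that is
# outside the given public network ($2).
monhosts_outside_network() {
  # monhost values may be separated by ';', ',' or spaces.
  awk '$1 == "monhost" { $1 = ""; gsub(/[;,]/, " "); print }' "$1" |
    tr -s ' ' '\n' |
    while read -r host; do
      [ -n "$host" ] || continue
      ip_in_cidr "$host" "$2" || echo "$host"
    done
}
```

With the ceph.conf posted above, `monhosts_outside_network /etc/pve/storage.cfg 10.2.0.0/18` should print nothing once the storage definition points at the public monitor addresses (10.2.0.201 and so on); any address it prints lies outside the public network.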
 

gkovacs

Well-Known Member
Dec 22, 2008
506
48
48
Budapest, Hungary
@Nicolas Dey @spirit @dietmar

I am experiencing very similar issues with Ceph. I have set up a two bridge network in my five node Proxmox 4.3 cluster, vmbr0 is for regular cluster traffic (10.10.0.x), and vmbr1 for Ceph traffic (192.168.0.x set up with pveceph -init). After installing Ceph, creating monitors on all 5 cluster nodes, and adding 1-2 OSDs from all 5 cluster nodes, I have created a pool and a storage mount (and copied the keyring file as per the wiki).

I wasn't able to access the pool from Storage even once, regardless of which monitor address I put in. Also the Proxmox cluster communication (which is supposed to be on a different subnet and adapter) has started showing problems: web interface starting to time out, communication problems and failures, statistics not updating.

I ended up removing Ceph, thought I had made a mistake (but I did every installation step as per the wiki). Now it seems others are having the same problem... Has anyone managed to install Ceph nowadays on 4.3?
 

udo

Famous Member
Apr 22, 2009
5,909
169
83
Ahrensburg; Germany
Hi,
Yes, I installed Ceph (hammer) on a 4.3 cluster three or four weeks ago.
I used the wiki too, and I remember no trouble.

BTW. Updated the cluster to jewel this afternoon.

Udo
 

spirit

Famous Member
Apr 2, 2010
4,583
348
103
www.odiso.com
Can you post your ceph.conf?

So you have configured the Ceph public network as 192.168.0.x?
 

spirit

Famous Member
Apr 2, 2010
4,583
348
103
www.odiso.com
Yeah how?

Code:
root@proxmox:~# pveceph install -version jewel
400 Parameter verification failed.
version: value 'jewel' does not have a value in the enumeration 'hammer'

'pveceph install -version' is only for new installs (and Proxmox has not yet updated it to support jewel, as a bug existed in the old version).

To update an existing cluster, you need to change your /etc/apt/sources.list.d/ceph.list and update to jewel.

Read the Ceph release notes on ceph.com before updating.
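
For the ceph.list change, a minimal sketch is shown below. It assumes the file contains a repository line with `debian-hammer` in its URL (for example `deb http://download.ceph.com/debian-hammer jessie main`); check the actual contents first, since that layout is an assumption here. The helper name `switch_ceph_release` is made up.

```shell
# Rewrite the Ceph release name in an APT source list, keeping a .bak copy.
# $1 = list file, $2 = old release name, $3 = new release name.
switch_ceph_release() {
  list_file=$1
  old=$2
  new=$3
  # Refuse to touch the file if it has no entry for the old release.
  grep -q "debian-$old" "$list_file" || {
    echo "no debian-$old entry in $list_file" >&2
    return 1
  }
  sed -i.bak "s/debian-$old/debian-$new/g" "$list_file"
}
```

Usage would be `switch_ceph_release /etc/apt/sources.list.d/ceph.list hammer jewel`, followed by the normal APT update and upgrade. This only edits the repository line; the upgrade order for monitors and OSDs is described in the Ceph release notes and is not covered by this sketch.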
 

Nicolas Dey

New Member
Dec 7, 2016
22
0
1
Hi @spirit , thank you for your answer.
I've replaced the private network addresses with public network ones, and restarted Ceph. Everything looks OK except one thing: I am back to an RBD connection problem. When selecting Node1 > Ceph Storage > Content, a grey message appears: "rbd error: rbd: couldn't connect to the cluster! (500)", and I am still unable to create a VM disk on the Ceph storage (it times out).
In the monitor logs, I have:
Code:
…
2016-12-12 11:36:33.018611 7f201e395700  0 mon.0@0(leader).auth v247 caught error when trying to handle auth request, probably malformed request
2016-12-12 11:36:52.591387 7f201e395700  0 mon.0@0(leader).auth v247 caught error when trying to handle auth request, probably malformed request
…

I believe this is a permission issue… Until now, I was unable to solve it, but now that I am sure public network is correct in storage.cfg, I will try again.
 

spirit

Famous Member
Apr 2, 2010
4,583
348
103
www.odiso.com
Did you only change the monitor IPs in Proxmox's /etc/pve/storage.cfg?

Or did you change the public/private network in ceph.conf? (I ask because you said you restarted Ceph.)
You can't change monitor IPs or networks once your Ceph cluster is already built.
 

Nicolas Dey

New Member
Dec 7, 2016
22
0
1
@spirit, the monitor addresses are on the public network; the only mentions of the private network were in 'storage.cfg'. In 'ceph.conf' (cf. my older post above) I have a reference to the private network for the OSDs. Did I understand correctly what you meant?

@frantek, yes, I have seen this post already, and it helped me a bit. My confusion comes from so many installation attempts at first, with so many different keys. Could you please confirm which keyring has to be copied to '/etc/pve/priv/ceph/', as it is not clear to me? For now, I have:
Code:
…
# ls -la /etc/ceph/ceph.client.admin.keyring /etc/pve/priv/ceph/<ceph_storage>.keyring
-rw-r----- 1 root www-data 137 Dec  8 14:23 /etc/ceph/ceph.client.admin.keyring
-rw------- 1 root www-data 137 Dec  8 16:38 /etc/pve/priv/ceph/<ceph_storage>.keyring
# md5sum /etc/ceph/ceph.client.admin.keyring /etc/pve/priv/ceph/<ceph_storage>.keyring
285f46b706aa3e69ee76135971947bf2  /etc/ceph/ceph.client.admin.keyring
285f46b706aa3e69ee76135971947bf2  /etc/pve/priv/ceph/<ceph_storage>.keyring
 

spirit

Famous Member
Apr 2, 2010
4,583
348
103
www.odiso.com
That seems to be OK.

/etc/ceph/ceph.client.admin.keyring is generated when you install your Ceph cluster. (It could be an external Ceph cluster without Proxmox.)

Then you need to copy this file to /etc/pve/priv/ceph/<ceph_storage>.keyring
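
The copy step can be sketched as a small helper that also verifies the result, much like the md5sum comparison earlier in the thread. The function name and its optional third argument (a target directory, handy for trying it outside /etc/pve) are made up for illustration; the storage ID must match the name used in storage.cfg.

```shell
# Copy a Ceph client keyring to the place Proxmox reads for an RBD storage,
# then confirm the copy is byte-identical to the source.
# $1 = source keyring, $2 = storage ID from storage.cfg,
# $3 = target directory (defaults to /etc/pve/priv/ceph).
install_storage_keyring() {
  src=$1
  storage_id=$2
  dest_dir=${3:-/etc/pve/priv/ceph}
  dest="$dest_dir/$storage_id.keyring"

  [ -r "$src" ] || { echo "cannot read $src" >&2; return 1; }
  mkdir -p "$dest_dir" || return 1
  cp "$src" "$dest" || return 1
  cmp -s "$src" "$dest" || { echo "copy differs from source" >&2; return 1; }
  echo "$dest"   # print the path that was written
}
```

For example: `install_storage_keyring /etc/ceph/ceph.client.admin.keyring <ceph_storage>`, with `<ceph_storage>` replaced by the storage name.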
 
