Proxmox VE Ceph Server released (beta)

lynn_yudi

Member
Nov 27, 2011
80
0
6
Yes, this is a fresh install of PVE, with Ceph installed via pveceph.

#ceph -v
ceph version 0.80 (b78644e7dee100e48dfeca32c9270a6b210d3003)

#ceph -s
cluster b82584ba-4461-4117-a797-6a41f7f1be14
health HEALTH_WARN 198 pgs degraded; 368 pgs stuck unclean; clock skew detected on mon.1
monmap e3: 3 mons at {0=192.168.11.2:6789/0,1=192.168.11.3:6789/0,2=192.168.11.4:6789/0}, election epoch 4, quorum 0,1,2 0,1,2
osdmap e20: 3 osds: 3 up, 3 in
pgmap v34: 368 pgs, 3 pools, 0 bytes data, 0 objects
105 MB used, 11155 GB / 11156 GB avail
62 active
140 active+degraded
108 active+remapped
58 active+degraded+remapped

However, I can delete the rbd pool without any error, and after re-creating it everything looks normal.

In the CLI:
# ceph osd pool delete metadata metadata --yes-i-really-really-mean-it
Error EBUSY: pool 'metadata' is in use by CephFS
 

impire

Member
Jun 10, 2010
106
0
16
Hello,

I am not sure what the problem is. I have to run the command many times before Ceph will install. Almost every time, it gives me the error below:

root@server1:~# pveceph install -version firefly
download and import ceph reqpository keys
update available package list
Reading package lists...
Building dependency tree...
Reading state information...
gdisk is already the newest version.
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
ceph : Depends: binutils but it is not installable
Depends: xfsprogs but it is not installable
Depends: libgoogle-perftools4 but it is not installable
ceph-common : Depends: libgoogle-perftools4 but it is not installable
E: Unable to correct problems, you have held broken packages.
command 'apt-get -q --assume-yes --no-install-recommends -o 'Dpkg::Options::=--force-confnew' install -- ceph ceph-common gdisk' failed: exit code 100
root@server1:~#
 

RobFantini

Renowned Member
May 24, 2012
1,674
37
68
Boston,Mass
On Saturday I reinstalled Ceph on 3 nodes, and one of them had that issue.
It left me having to reinstall. I forget the exact error message.

I reinstalled and had the exact same issue.

I test-installed just one of the packages, because if I let the 'pveceph' command run, it would install only some of the packages and not let me install the rest.

As far as I remember, I ran 'apt-get update' a few times until this worked:
Code:
apt-get update
apt-get install xfsprogs
I did that 3 or 4 times.
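
The repeated attempts described above could be scripted as a small retry loop. This is only a minimal sketch: the `retry` function name and the `RETRY_DELAY` variable are made up for illustration and are not part of pveceph or apt, and retrying does not explain why the packages were reported as not installable in the first place.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times until it succeeds.
# Usage: retry MAX_ATTEMPTS command [args...]
retry() {
    max=$1
    shift
    n=1
    until "$@"; do
        if [ "$n" -ge "$max" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        n=$(( n + 1 ))
        # Wait before the next attempt (5 seconds unless RETRY_DELAY is set).
        sleep "${RETRY_DELAY:-5}"
    done
}

# Harmless demonstration: succeeds on the first attempt.
retry 3 true && echo "ok"

# On the affected node (as root) the call might look like this:
# retry 4 sh -c 'apt-get update && apt-get install -y xfsprogs'
```

If the loop still fails after several attempts, the sources.list entries are the next thing to check.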

I also fiddled with sources.list and ended up with
Code:
deb http://ftp.us.debian.org/debian wheezy main contrib
deb http://security.debian.org/ wheezy/updates main contrib

# wheezy-updates, previously known as 'volatile'
deb http://ftp.us.debian.org/debian/ wheezy-updates main
and /etc/apt/sources.list.d/pve.list
Code:
deb [arch=amd64] http://download.proxmox.com/debian wheezy pve-no-subscription


The weird thing is that I installed all 3 nodes within the same hour and just one of them had the issue.
 

impire

I read everywhere that SAS is supposed to be better and faster than SATA.

I have 3 servers filled with SATA drives for OSDs and also for boot. I filled the 4th server with SAS drives.

After creating the OSDs on all 4 servers, the "OSD" section shows latency on all the SAS drives in the 4th server.

I also noticed that during install, boot, and OSD creation, the SATA drives seem to be faster than the SAS drives.

Am I not supposed to mix SATA and SAS drives for CEPH?

When I want to add more hard drives, do they have to be identical to the existing ones?
 

symmcom

Renowned Member
Oct 28, 2012
1,078
26
68
Calgary, Canada
www.symmcom.com
It is not a good idea to mix and match hard drives of different speeds. SAS is certainly faster than SATA. In your case SATA seems faster because the majority of your drives are SATA and they all work at about the same speed. CEPH tries to write to all drives equally, so your SAS drives may be faster, but they have to wait for the slower drives to finish before they get their share. You can still mix, but do it equally across all nodes. Instead of having all the SAS drives in one node, spread them over the 4 nodes: take some SATA drives out of the other nodes and put them in the 4th one. When you want to replace SATA with SAS, do it in sets of 4. Hope this makes sense.

How many replicas are you using? What's your PG count?
 

felipe

Member
Oct 28, 2013
152
1
18
Isn't it possible to mix SAS and SATA disks by using weights for the disks?
For example, weight 1 for all SATA disks and weight 2 for all SAS disks?
 

symmcom

If I understand correctly, weight in CEPH defines how much data an HDD is going to hold, not its speed. For example, a 2TB drive with weight 2 will keep writing until it is full. If I have a mix of 1TB and 2TB drives in the cluster and I want to distribute data evenly among all of them, I would set weight 1 for all the 2TB HDDs so that they do not hold more than 1TB. Weight allows the use of HDDs of multiple sizes so that certain HDDs do not take more writes than others.
I guess in a way we could use this to set the weight of all 2TB SATA drives to 1.90 while keeping all 2TB SAS HDDs at, say, 2, so the SAS drives take more writes while the 2TB SATA drives catch up. A 2TB SAS drive cannot be set beyond 2, or it will run out of space and cause backfill. Actually, the default weight of a 2TB drive in CEPH is 1.81; I just used the rounded figure of 2 here.
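
To illustrate where a default weight of about 1.81 could come from, here is a rough sketch that converts a drive's capacity in bytes to TiB. It assumes the default weight is simply the capacity expressed in TiB, which is an assumption and not something stated in this thread; the `crush_weight` function is a made-up helper, and the real default may come out slightly lower because it may be based on usable space rather than raw capacity.

```shell
#!/bin/sh
# Hypothetical helper: approximate a default weight as capacity in TiB.
# $1 = drive capacity in bytes
crush_weight() {
    awk -v bytes="$1" 'BEGIN { printf "%.3f\n", bytes / (1024 ^ 4) }'
}

crush_weight 1000000000000   # 1TB drive -> 0.909
crush_weight 2000000000000   # 2TB drive -> 1.819
crush_weight 4000000000000   # 4TB drive -> 3.638
```

Under this assumption a 4TB drive gets about twice the weight of a 2TB drive, so it would receive about twice the data.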
 

impire

Thank you very much. It makes a whole lot of sense.

I have 3 replicas. PG count is 1024 (16 x 2TB HD)
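
For reference, a PG count like this is often estimated with a rule of thumb: (number of OSDs x 100) / replicas, rounded up to the next power of two. The sketch below assumes that rule; it is a guideline rather than something confirmed in this thread, and the `pg_count` function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: estimate a PG count from OSD count and replica count,
# assuming the "(OSDs * 100) / replicas, rounded up to a power of two" rule.
# $1 = number of OSDs, $2 = number of replicas
pg_count() {
    osds=$1
    replicas=$2
    # Integer ceiling of (osds * 100) / replicas.
    target=$(( (osds * 100 + replicas - 1) / replicas ))
    # Round up to the next power of two.
    pg=1
    while [ "$pg" -lt "$target" ]; do
        pg=$(( pg * 2 ))
    done
    echo "$pg"
}

pg_count 16 3   # 16 OSDs, 3 replicas -> 1024
```

With 16 OSDs and 3 replicas the raw figure is about 533, which rounds up to 1024.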

If I want to add more hard drives in the future, can I use 4TB-6TB drives, or am I stuck with 2TB drives to match the current ones? It would be silly if the drives we add to the Ceph nodes had to match the current capacity. Please advise. Thank you for your help.
 

impire

Hello everyone,

1) Is there any detailed documentation for Proxmox and Ceph?

2) Per this link: http://pve.proxmox.com/wiki/Storage:_Ceph

I've completed the install of the Ceph nodes, followed the necessary steps, then created the VMs successfully.

Now I want to add ISO and installation images. Where do I go to upload them? I've tried to upload but keep running into an error. I can't find any documentation showing me how to do it.

Any help pointing in the right direction is greatly appreciated.
 

symmcom

Sure, you can add HDDs of any size in the future. They don't always have to be 2TB. What I was saying is that since you have 3 replicas on 3 nodes, to balance writes you should replace drives in sets of 3. If you are replacing a 2TB drive with a 4TB one, try to replace three 2TB drives with three 4TB drives. Note that it "DOES NOT" have to be that way. You can mix and match any sizes, and CEPH will automatically set the weight based on capacity, but using matched sets gives you a good balance of writes and thus a little more performance.


You cannot upload ISOs to CEPH RBD storage, if that is what you are trying to do. RBD only supports RAW images. If you want to use CEPH to store ISOs and other disk images such as qcow2 or vmdk, then you have to set up CephFS. Here are some simplified steps to create CephFS on a CEPH cluster. These steps need to be done on all Proxmox nodes where you want to use CephFS:
1. Install ceph-fuse on the Proxmox node: # apt-get install ceph-fuse
2. Create a separate pool: # ceph osd pool create <a_name> 512 512
3. Create a mount folder on the Proxmox nodes: # mkdir /mnt/cephfs (or <anything>)
4. # cd /etc/pve
5. # ceph-fuse /mnt/<a_folder> -o nonempty
6. Go to the Proxmox GUI and add the storage as a local directory: /mnt/<a_folder>. You will see that you can use any image type.

To unmount CephFS, simply run: # fusermount -u /mnt/<a_folder>

To mount CephFS automatically when Proxmox reboots, add this to /etc/fstab:
#DEVICE PATH TYPE OPTIONS
id=admin,conf=/etc/pve/ceph.conf /mnt/<a_folder> fuse.ceph defaults 0 0

That is it for creating CephFS.
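
The fstab step above can be made repeatable with a small helper that appends the entry only once. This is a minimal sketch, assuming the `id=admin,conf=/etc/pve/ceph.conf` entry format and the /mnt/cephfs mount point from the steps above; the function names are hypothetical, and the demonstration writes to a temporary file rather than the real /etc/fstab.

```shell
#!/bin/sh
# Path to the Ceph config, as used in the fstab line above.
CEPH_CONF=${CEPH_CONF:-/etc/pve/ceph.conf}

# Hypothetical helper: print the fuse.ceph fstab line for a mount point.
# $1 = mount point
fstab_entry() {
    printf 'id=admin,conf=%s %s fuse.ceph defaults 0 0\n' "$CEPH_CONF" "$1"
}

# Hypothetical helper: append the entry to a file unless it is already there.
# $1 = fstab file, $2 = mount point
add_fstab_entry() {
    entry=$(fstab_entry "$2")
    grep -qxF "$entry" "$1" 2>/dev/null || printf '%s\n' "$entry" >> "$1"
}

# Demonstration against a temporary file, not the real /etc/fstab.
demo=$(mktemp)
add_fstab_entry "$demo" /mnt/cephfs
add_fstab_entry "$demo" /mnt/cephfs   # second call changes nothing
cat "$demo"
rm -f "$demo"
```

To use it for real, the file argument would be /etc/fstab and the script would need to run as root on each node.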

Keep in mind that CephFS is not considered "Production Ready" yet, but I have been using it for the last 11 months without issue. I use it primarily to store ISOs, templates and other test VMs with qcow2. All my production VMs are on RBD, so even if CephFS crashes it won't be a big loss.
Hope this helps.



 

sdutremble

Member
Sep 29, 2011
85
0
6
I thought it was also necessary to have at least one MDS? Could you modify your steps to have this added if I am correct? Thanks, Serge
 
