Proxmox VE Ceph Server released (beta)

impire

Member
Jun 10, 2010
106
0
16
"You cannot upload an ISO to CEPH RBD storage, if that is what you are trying to do. RBD only supports RAW images. If you want to use CEPH to store ISOs and other disk image formats such as qcow2 or vmdk, then you have to set up CephFS. Here are some simplified steps to create CephFS on a CEPH cluster. These steps need to be done on all Proxmox nodes on which you want to use CephFS:
1. Install ceph-fuse on the Proxmox node: #apt-get install ceph-fuse
2. Create a separate pool: #ceph osd pool create <a_name> 512 512
3. Create a mount folder on the Proxmox nodes: #mkdir /mnt/cephfs (or any other folder name)
4. #cd /etc/pve
5. #ceph-fuse /mnt/<a_folder> -o nonempty
6. Go to the Proxmox GUI and add the storage as a local directory: /mnt/<a_folder>. You will then see that you can use any image type.

To unmount CephFS, simply run: #fusermount -u /mnt/<a_folder>

To mount CephFS automatically when Proxmox reboots, add this to /etc/fstab:
#DEVICE PATH TYPE OPTIONS
id=admin,conf=/etc/pve/ceph.conf /mnt/<a_folder> fuse.ceph defaults 0 0

That is it for creating CephFS.

Keep in mind that CephFS is not yet considered "Production Ready". But I have been using it for the last 11 months without issue. I use it primarily to store ISOs, templates and other test VMs with qcow2. All my production VMs are on RBD, so even if CephFS crashes it won't be a big loss.
Hope this helps."
Thank you. This sounds like a lot just to store the ISOs.

- Can I just upload it to the local drive (boot drive) of each server? I can download the templates to the 'local'. But when I tried to upload an ISO, I get the error "Error: 500: can't activate storage 'local' on node 'server1' "

- In the Datacenter -> Storage -> Add section, I see options to create Directory, LVM and such. Can I just create a directory and put the ISOs in it?

I just want a simple method to store the ISO without having to go through the CephFS. Can it be done? Thank you in advance for your help.
 

impire

If I understand correctly, weight in CEPH defines how much data an HDD is going to hold, not its speed. For example, a 2TB HDD with weight 2 will keep writing until it is full. If I have a mix of 1TB and 2TB HDDs in the cluster and I want to distribute data evenly among all of them, I would set weight 1 for all the 2TB HDDs so that they do not hold more than 1TB. Weight allows the use of HDDs of multiple sizes so that certain HDDs do not receive more writes than others.
I guess with this, in a way, we can set the weight of all SATA 2TB HDDs to 1.90 while keeping all SAS 2TB HDDs at, say, 2, so the SAS HDDs will be written to more while the SATA 2TB HDDs catch up. The SAS 2TB HDDs cannot be set beyond 2, or they will run out of space and cause backfill. Actually, the default weight of a 2TB HDD in CEPH is 1.81; I just used the rounded figure of 2 here.
I keep seeing the term "weight" being applied throughout this discussion. Can someone help clarify what exactly "weight" is in the Ceph/Proxmox setup?
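
To make the quoted explanation concrete, here is a minimal sketch. It assumes that the default CRUSH weight of an OSD is simply its capacity expressed in TiB (the exact value Ceph reports depends on the usable size after partitioning), and that CRUSH places data on each OSD roughly in proportion to its weight. The function names are made up for illustration and are not part of Ceph.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def default_weight(size_bytes):
    """Assumed default CRUSH weight: capacity expressed in TiB."""
    return size_bytes / TIB


def expected_share(weights):
    """Fraction of the data each OSD is expected to receive,
    assuming placement is proportional to weight."""
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # A marketing "2TB" disk holds 2 * 10**12 bytes.
    print(round(default_weight(2 * 10**12), 2))
    # Two 1TB disks (weight 1) and one 2TB disk (weight 2).
    print(expected_share([1.0, 1.0, 2.0]))
```

Under these assumptions a 2TB disk comes out at about 1.82, close to the 1.81 quoted above, and the weight acts as a relative share rather than a hard limit: lowering the weight of a 2TB disk to 1 makes it receive about as much data as a 1TB disk with weight 1.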
 

impire

Hello,

Can anyone help? I am sure I am not the only one with this question. I just want to be able to upload an ISO without going through CephFS. Can I just upload it to the local (boot) drive? Thanks in advance for your help.
 

impire

I created a pool with 3 replicas on 30TB of storage. When I go to the datacenter and create the storage, it shows me 30TB of available space. Shouldn't it display 10TB as the available space?
 

dietmar

Proxmox Staff Member
Staff member
Apr 28, 2005
16,599
339
103
Austria
www.proxmox.com
When I go to the datacenter and create the storage, it shows me 30TB of available space. Shouldn't it display 10TB as the available space?
Not sure about that, because you can have different pools with different number of replicas. So that would just add confusion.
 

symmcom

Renowned Member
Oct 28, 2012
1,080
28
68
Calgary, Canada
www.symmcom.com
I created a pool with 3 replicas on 30TB of storage. When I go to the datacenter and create the storage, it shows me 30TB of available space. Shouldn't it display 10TB as the available space?
Available space will always show the total space of the CEPH cluster. Look at the used space: when you save a 1GB file, you will see 3GB being used.
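
The arithmetic behind this answer can be sketched as follows, assuming every pool on the cluster uses the same replica count and ignoring the headroom Ceph needs for recovery and its full-ratio limits. The function names are illustrative only.

```python
def usable_capacity(raw_tb, replicas):
    """Logical data that fits when every object is stored `replicas` times."""
    return raw_tb / replicas


def raw_used(logical_gb, replicas):
    """Raw space consumed on the cluster by `logical_gb` of stored data."""
    return logical_gb * replicas


if __name__ == "__main__":
    print(usable_capacity(30, 3))  # 30TB raw with 3 replicas
    print(raw_used(1, 3))          # a 1GB file with 3 replicas
```

With 30TB raw and 3 replicas this gives 10TB of logical data, and a 1GB file consumes 3GB of raw space, which matches the used-space behaviour described in the post.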

 

symmcom

Yes, of course !

you can also upload it to each proxmox node local storage, easier than cephfs ;)
Local storage is definitely easier than CephFS. But let's say you wanted to play around with OpenVZ or different image types; unless you have plenty of space on the local node, CephFS will be a much better option. Once it is set up, it runs without much tending.


 

felipe

Member
Oct 28, 2013
152
2
18
i want to use different pools for my sata and ssd disks. (one big pool for slow sata data and one small for fast ssd)
as i will use it within the same servers i will need to have different root buckets and then make an ssd and an sata host for each real host. or is there any simpler way?
i think it is impossible to use the proxmox gui afterwards?
 

symmcom

i want to use different pools for my sata and ssd disks. (one big pool for slow sata data and one small for fast ssd)
as i will use it within the same servers i will need to have different root buckets and then make an ssd and an sata host for each real host. or is there any simpler way?
i think it is impossible to use the proxmox gui afterwards?
You are trying what I had been trying for the last several months, until I recently decided to leave it behind.

Basically, you edit the crushmap to create another bucket and tell the SSD or HDD pool to use that ruleset. Yes, the Proxmox GUI cannot do it; it has to be done manually. But the problem is that the Proxmox GUI will only show one OSD list from one bucket. So for the other bucket you will rely fully on command-line management, which is not that hard.
The big reason I left it behind is the performance issue. Both my SSD and HDD pools were on the same node, and both pools took a performance penalty. Now I have decided to get rid of the SSDs and have filled all 3 nodes with HDDs. I did a benchmark with HDDs for OSDs in this thread: http://forum.proxmox.com/threads/18580-CEPH-desktop-class-HDD-benchmark

Another big negative point: when you reboot the node, ALL OSDs in the additional bucket will move to the default bucket!!! You MUST remember to manually move the OSDs back to their bucket every time you reboot. If you don't, the CEPH cluster will go into rebalancing! I could not avoid this every time I rebooted.

This is NOT a Proxmox issue.
 

felipe

>> The big reason i left it behind is performance issue. My both SSD and HDD pool was on same node and both pools took performance penalty
ok and how strong is your server? ram cpu?
you have only 1gig ethernet.
we will use 2 x 10gig cards. so we have 20gig for osd traffic and 20 for mons



>> One another big negative point: When you reboot the node ALL OSDs in additional bucket will move to default bucket!!! You MUST remember to manually move the OSDs back to their bucket everytime you reboot. If you dont, then the >> CEPH cluster will go into rebalancing! I could not avoid this everytime i rebooted.

this should fix it:
'osd crush update on start = false' in your ceph.conf

but generally it seems just a few people use ssd and sata separately. but we have vm machines with totally different workloads - and it makes no sense to use just one big pool for that...
 

symmcom

>> The big reason i left it behind is performance issue. My both SSD and HDD pool was on same node and both pools took performance penalty
ok and how strong is your server? ram cpu?
you have only 1gig ethernet.
we will use 2 x 10gig cards. so we have 20gig for osd traffic and 20 for mons


The CEPH nodes had 32GB RAM each, with i3 CPUs. Indeed I had a 1gig network, but all 3 of these (CPU, RAM and bandwidth) were monitored over time to get accurate consumption data. Yes, network bandwidth may not be enough during rebalancing due to node or OSD loss, but generally none of these 3 was consumed more than 15% at any given time on a day-to-day basis. I don't believe the performance bottleneck was due to any of these. In my case both SSD and HDD OSDs were on the same node.

On the same hardware nodes with 36 OSDs and one pool, things are a lot faster. I didn't add any extra RAM, change the CPU or upgrade the 1gig network.


>> One another big negative point: When you reboot the node ALL OSDs in additional bucket will move to default bucket!!! You MUST remember to manually move the OSDs back to their bucket everytime you reboot. If you dont, then the >> CEPH cluster will go into rebalancing! I could not avoid this everytime i rebooted.

this should fix it:
'osd crush update on start = false' in your ceph.conf

but generally it seems just a few people use ssd and sata separate. but we have vm machines with totaly different workload -and it makes no sense to use just one big pool for that...
I did separate the SSD cluster from the SATA one in my setup. Again, the same configuration of 32GB RAM, i3 CPU and gigabit network. The SSD pools are significantly faster now than they were before with a coexisting SATA pool. I am also using the same SSDs I used before.
 

felipe

what was your bench with mixed ssd & sata?
the question is if a server with a lot more cpu and ram & nic power could handle that without performance loss...
our machines have 2 x hexacore cpu and 128 ram.
 

impire

Local storage definitely easier than cephfs. But let's say if you wanted to play around with openvz or different image type; unless you have plenty of space on local node, cephfs will be much better option. Once it is setup it runs without tending much.


I can't upload to the local drives on the nodes. I get the below error. I've tried searching Google and everywhere else. Nothing seems to help. Can anyone help to point me in the right direction?

"Error: 500: can't activate storage 'local' on node 'CEPH1' "
 

impire

Available space will always show the total space of CEPH cluster. Look at the used space. When u save a 1gb file you will see 3gb being used

Thank you.

1) So I just need to keep in mind of the maximum my users can store?

For example, if I do 3 replicas on a 30TB, then I just need to make sure used space can be no more than 9.5TB or 10TB?

2) Another question. What is the main advantage to having 3 replicas versus 2 replicas? Logically I know we have 3 copies of data. But why do that when the data is already replicated? What's the point of having 3 sets of data? Sorry for the newbie question.

Thanks in advance for your help.
 

symmcom

1) So I just need to keep in mind of the maximum my users can store?

For example, if I do 3 replicas on a 30TB, then I just need to make sure used space can be no more than 9.5TB or 10TB?
Were you talking about image #1 or image #2 when you said 'need to make sure used space can be no more than......'?
Image #1: POOL-1.PNG

Image #2: POOL-2.PNG

Image #1 shows the actual usage by a CEPH pool, and Image #2 shows the cluster-wide available and used space. I think watching Image #2 will keep you informed about whether your cluster is running out of space. Image #1 will give you a better idea of how much each CEPH pool is using.

2) Another question. What is the main advantage to having 3 replicas versus 2 replicas? Logically I know we have 3 copies of data. But why do that when the data is already replicated? What's the point of having 3 sets of data? Sorry for the newbie question.
This is my opinion based on experience; I think using the following formula to decide on the replica number is a good idea:
# of Nodes - 1 = Replicas

So a 3-node CEPH cluster will have 2 replicas, a 5-node CEPH cluster will have 4 replicas, and so on. Having 3 replicas on 3 nodes will allow 2 simultaneous node failures while still functioning fine, whereas with 2 replicas, 2 node failures may cause a huge issue. What are the chances that 2 machines will fail at the same time? Don't be confused when I say that with 3 replicas everything will work just fine even with 2 node failures: the cluster will still face rebalancing, making everything slower while it tries to recover. But I don't believe you will face any data loss due to stuck, unclean or stale PGs. Hope this makes sense.
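
The rule of thumb and the failure scenario described above can be sketched as follows. This assumes CRUSH places at most one replica of each object per node, and it only counts surviving copies in the worst case; it does not model the pool's min_size setting, which can pause I/O before all copies are lost. The function names are hypothetical.

```python
def rule_of_thumb_replicas(nodes):
    """The poster's suggestion: replicas = number of nodes - 1."""
    return nodes - 1


def surviving_copies(replicas, failed_nodes):
    """Worst-case copies of an object left after `failed_nodes` nodes fail,
    assuming at most one replica per node."""
    return max(replicas - failed_nodes, 0)


if __name__ == "__main__":
    for replicas in (2, 3):
        left = surviving_copies(replicas, failed_nodes=2)
        print(f"{replicas} replicas, 2 nodes down: {left} copies left")
```

Under these assumptions, 3 replicas leave one copy of every object after 2 node failures, while 2 replicas can leave none, which is the difference the post is pointing at.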
 

impire

It won't work. "Error: 500: can't activate storage 'local' on node 'CEPH1' "
I just did a fresh install of the nodes from scratch, set up CEPH and activated the OSDs.

Straight out of the box, I cannot upload an ISO to the local (boot) drive of any node. Can anybody shed some light on this problem?

"Error: 500: can't activate storage 'local' on node 'CEPH1' "

Perhaps this is a bug in Proxmox and/or in its combination with CEPH.
 

symmcom

I just did a fresh install of the nodes from scratch. Setup CEPH and activated OSDs.

Straight out of the box, I cannot upload the ISO to the local drive (boot) of any node. Can anybody shed a light on this problem?

"Error: 500: can't activate storage 'local' on node 'CEPH1' "

Perhaps this is a bug of ProxMox and/or combination of CEPH.
It could not possibly be because of the combination of Proxmox and CEPH. Proxmox does not manipulate CEPH in any way; it just uses the CEPH API.

How big is the ISO file you are trying to upload? Are you trying to do it through the web interface? Did you try to upload it through an FTP program such as FileZilla?

I have noticed the GUI has trouble if the ISO is larger than about 400MB. I do not think it is a Proxmox limitation, but rather the HTTP protocol itself for file uploading.
 

impire

It could not possibly be because of Proxmox and CEPH combination. Proxmox does not manipulate CEPH in any way, it just uses API of CEPH.

How big is the ISO file you are trying to upload? Are you trying to do it through the Web interface? Did you try to upload it through FTP program such as FileZilla?

I have noticed the GUI have trouble if the ISO is more than about 400MB. I do not think it is Proxmox limitation but HTTP protocol itself for file uploading.
Hi,

The file size is only 120MB to 250MB.

I have not tried to upload it through an FTP program like FileZilla.

Right now, I can still download the templates to the local directory, but uploading an ISO of any size gives that 500 error.

I re-installed everything from scratch. Before adding CEPH, uploading to the local directory worked fine. After adding CEPH, I get the exact same problem.

Straight out of the box, this is definitely a Proxmox bug! I don't think I am the only one encountering this issue.
 
