Proxmox VE Ceph Server released (beta)

How does it work in a "Production Environment"?

Oh, I just saw what they call a "Production Environment" => min. 100 OSDs @ 4 TB per OSD.

That is a total capacity of 400 TB or more, and not what we target with this setup.

Instead, we want to provide a setup for SMALL/MEDIUM enterprises, 4-8 OSDs per node, fewer than 8 nodes.

If you need more OSDs it is better to split the storage cluster from the VMs (as suggested by ceph.com).

The really good thing about Ceph is that you can start small and then grow depending on your needs.
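To put numbers on it: even the upper end of the target here (e.g. 7 nodes x 8 OSDs x 4 TB) is 224 TB raw, roughly 75 TB usable with 3 replicas, far below that 400 TB figure. And growing really is just adding disks; a minimal sketch, assuming a spare disk /dev/sdc on an existing node (the device name is only an example):

  pveceph createosd /dev/sdc   # turn the new disk into an additional OSD
  ceph -w                      # watch the placement groups rebalance onto it
  ceph osd tree                # verify the new OSD and its weight
  ceph df                      # see the increased raw/usable capacity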
 
Oh, I just saw what they call a "Production Environment" => min. 100 OSDs @ 4 TB per OSD.

That is a total capacity of 400 TB or more, and not what we target with this setup.

Instead, we want to provide a setup for SMALL/MEDIUM enterprises, 4-8 OSDs per node, fewer than 8 nodes.

If you need more OSDs it is better to split the storage cluster from the VMs (as suggested by ceph.com).

The really good thing about Ceph is that you can start small and then grow depending on your needs.
Hi Dietmar,
Is it possible to use PVE only for the monitor nodes, with additional OSD nodes, together with the Ceph repository (http://ceph.com/debian/ wheezy main), or are the PVE (test) repos necessary (for stable PVE now and also for PVE 3.2)?

The background is my wish to split the MON nodes from the OSD nodes without buying new hardware...

Udo
 
Is it possible to use PVE only for the monitor nodes, with additional OSD nodes, together with the Ceph repository (http://ceph.com/debian/ wheezy main), or are the PVE (test) repos necessary (for stable PVE now and also for PVE 3.2)?

The whole Ceph GUI is targeted for 3.2. And we always use the Ceph repositories as soon as you run 'pveceph init'.
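For reference, a minimal sketch of that workflow on a PVE 3.x node (the network and the disk are placeholders, and the exact options can differ between releases):

  pveceph install                       # pulls the Ceph packages from the Ceph repository
  pveceph init --network 10.10.10.0/24  # writes the initial ceph.conf for that network
  pveceph createmon                     # creates and starts a monitor on this node
  pveceph createosd /dev/sdb            # turns a blank disk into an OSD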

The background is my wish to split the MON nodes from the OSD nodes without buying new hardware...

If you run Ceph services outside the PVE cluster, you need to manually sync the Ceph config, and
you cannot manage those services using our GUI.
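For completeness, a rough sketch of such a manual sync; the hostname is made up, and it assumes the usual PVE layout where the cluster-wide config lives in /etc/pve/ceph.conf:

  # Copy config and admin keyring to the external (non-PVE) Ceph node:
  scp /etc/pve/ceph.conf root@osd-node1:/etc/ceph/ceph.conf
  scp /etc/ceph/ceph.client.admin.keyring root@osd-node1:/etc/ceph/
  # Repeat after every config change - the external node is not part of pmxcfs,
  # so nothing keeps it in sync automatically.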
 
I've just deployed pvetest on our test cluster and it works flawlessly so far. Finally these two technologies are married in one product. We had thought about deploying our own Ceph storage system for some time (but didn't have the time to do it ;) ) and now it seems that there's no need to do so ;) Thanks for this!

We are looking to improve our uptime with HA (via PVE, or probably by ourselves using heartbeat and proxies...), so shared storage is the key. As we are using a lot of VZs (OpenVZ containers), is there any (simple) way to use the Ceph storage for them? Is there any file-system layer on Debian? Ploop for VZ would be nice, but I know that you're having issues with it.

We love the VZs as they are ideal for web apps (one per VZ)... If there's no way, we could probably switch over to KVMs and see how they perform against the VZs.
 
As far as I'm aware: unless you want to implement modifications to vzmigrate as demonstrated here: http://openvz.org/Vzmigrate_filesystem_aware (note: I don't know whether the linked code is still being maintained), you HAVE to use NFS for OpenVZ shared storage.

So you could either have an NFS server somewhere that is simultaneously a Ceph client and mounts an RBD as its storage, and use that for CT storage, or
have one RBD per Proxmox node and store the containers in those. Note that the latter setup would increase the duration of live migrations (unless you were to implement a different vzmigrate modification).
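A rough sketch of the first option (an NFS server that is also a Ceph client); pool, image, path and network below are only examples:

  # On the NFS server (needs a ceph.conf and a keyring):
  rbd create containers --size 512000        # ~500 GB image in the default 'rbd' pool
  rbd map containers                         # shows up as /dev/rbd0
  mkfs.ext4 /dev/rbd0
  mkdir -p /srv/containers
  mount /dev/rbd0 /srv/containers
  echo '/srv/containers 10.10.10.0/24(rw,sync,no_root_squash)' >> /etc/exports
  exportfs -ra
  # Then add this export as NFS storage in Proxmox and use it for CT (OpenVZ) containers.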
 
Wow! This is some nice feature. I will install 3 test servers :)


One question about SSDs and disks: it is written that because of the clustering and self-healing there is no need for RAID.
You have no RAID for the SSD with the Proxmox partition? When it fails, the node goes down!
What happens if an SSD with the journal fails?
What happens when an OSD fails? When it is not accessible anymore, that should be OK. But what happens when some OSDs have bad disk data? RAID can normally handle this, or at least run a consistency check once a week...
 
Wow! This is some nice feature. I will install 3 test servers :)


One question about SSDs and disks: it is written that because of the clustering and self-healing there is no need for RAID.
You have no RAID for the SSD with the Proxmox partition? When it fails, the node goes down!
What happens if an SSD with the journal fails?
What happens when an OSD fails? When it is not accessible anymore, that should be OK. But what happens when some OSDs have bad disk data? RAID can normally handle this, or at least run a consistency check once a week...

Don't forget the high availability of the Proxmox cluster. If a node is down, the remaining two still work.
 
What happens if an SSD with the journal fails?

All OSDs using that SSD as their journal will go down.
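To see which OSDs would be affected by a journal SSD failure, you can check where each OSD's journal symlink points (default Ceph paths assumed here):

  ls -l /var/lib/ceph/osd/ceph-*/journal
  # Every journal that is a symlink to a partition of the failed SSD marks an OSD
  # that goes down and has to be recreated (or given a new journal) afterwards.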

What happens when an OSD fails? When it is not accessible anymore, that should be OK. But what happens when some OSDs have bad disk data? RAID can normally handle this, or at least run a consistency check once a week...

Ceph also runs consistency checks (scrub jobs) daily/weekly.
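The intervals are tunable in ceph.conf, and a scrub can also be triggered by hand; the values below are roughly the defaults of that time (light scrub about daily, deep scrub weekly):

  # ceph.conf, [osd] section (example values, in seconds):
  #   osd scrub min interval  = 86400     # light scrub per PG at most once a day
  #   osd scrub max interval  = 604800    # force a light scrub at least weekly
  #   osd deep scrub interval = 604800    # deep (checksum) scrub weekly
  ceph osd scrub 0         # trigger a light scrub of osd.0 manually
  ceph osd deep-scrub 0    # trigger a deep scrub of osd.0 manually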
 
Absolutely great things to read about Proxmox - my congrats to the Proxmox devs and contributors.

For my understanding (and others too?) who are not as familiar with Ceph as you guys:
I read that Ceph needs at least 2 copies for data safety, but more than 2 copies for HA (see: http://ceph.com/docs/master/architecture/),
however the Proxmox wiki suggests 3 nodes as the minimum for Ceph.

Now I understand that Ceph, for production use, wants > 2 copies; that's fair enough, I do see the point.
However: Can it be tested and configured with only 2 nodes?

I'd have 2 servers available for some testing, but not 3 - for production that would be possible though.
 
Although not recommended, a test platform can be set up without issue on just 2 nodes.
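For such a 2-node test you would also lower the replica count of your pools; a sketch, using the default 'rbd' pool as an example (test setups only, never production):

  ceph osd pool set rbd size 2       # keep 2 copies instead of 3
  ceph osd pool set rbd min_size 1   # stay writable while one copy is unavailable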

Hm, isn't a quorum needed? IMO at least an odd number of MONs is required. Though you could probably run 2 MONs on one of the two nodes to get a quorum with 3 MONs...
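Purely as a sketch of that idea (IDs and addresses are placeholders; two MONs on one host need distinct ports and data directories, and that host then stays the weak point for quorum):

  [mon.0]
      host = node1
      mon addr = 192.168.1.1:6789
  [mon.1]
      host = node1
      mon addr = 192.168.1.1:6790    ; second MON on the same node, different port
  [mon.2]
      host = node2
      mon addr = 192.168.1.2:6789

Losing node1 still kills quorum (2 of the 3 MONs go with it), so this only protects against a failure of node2.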
 
I love it so far!

I suggest adding a CRUSH map editor in the GUI. Compiling and decompiling is so 80's :)

I would love to move to production soon. Should I expect the stable 3.2 within two months?
 
The request was "I have only two nodes, how can I try it anyway, even if not for production?"

Marco

OK; but for me at least, the interesting - if not most important - part of "trying" such a setup is "what happens if something happens"... :)
(I'd believe the standard use case for Ceph includes some sort of redundancy - though you might also go without, of course...)
 
I love it so far!

I suggest adding a CRUSH map editor in the GUI. Compiling and decompiling is so 80's :)

I would love to move to production soon. Should I expect the stable 3.2 within two months?

With the new Ceph Emperor release, decompiling/compiling is not necessary anymore. All CRUSH editing can be done from the CLI.
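For example (the bucket and rule names below are made up for illustration):

  ceph osd crush add-bucket rack1 rack              # create a new rack bucket
  ceph osd crush move node1 rack=rack1              # move a host under that rack
  ceph osd crush set osd.3 1.0 host=node1           # (re)place an OSD with weight 1.0
  ceph osd crush reweight osd.3 0.5                 # change an OSD's weight
  ceph osd crush rule create-simple by-rack default rack   # rule that replicates across racks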
 
I have access to this test cluster; just tell me in detail what benchmark I should run for you.
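If it helps: the usual starting point would be rados bench against a throwaway pool, something like (pool name and PG count are examples):

  ceph osd pool create test 512                 # temporary benchmark pool with 512 PGs
  rados bench -p test 60 write --no-cleanup     # 60 seconds of 4 MB object writes
  rados bench -p test 60 seq                    # sequential reads of the data just written
  ceph osd pool delete test test --yes-i-really-really-mean-it   # drop the pool afterwards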
 
