Proxmox VE Ceph Server released (beta)

impire

Member
Jun 10, 2010
106
0
16
Hello,

Which Linux kernel is the Proxmox VE Ceph Server (beta) release based on? Is it RHEL 6.4?

I am installing a set of new QLogic InfiniBand NICs for the Ceph nodes. However, Proxmox does not recognize the cards.

QLogic support states that they have no drivers for Debian, only for RHEL. I understand Proxmox was using the RHEL 6.3 kernel as of last year. Is that still the case with the current version, 3.2?

Is there a list of compatible NICs and motherboards that we can refer to? This would be helpful in selecting the hardware for the Ceph nodes. Thank you.
 

spirit

Well-Known Member
Apr 2, 2010
3,432
146
63
www.odiso.com
pve-kernel-2.6.32 : rhel 6.5
pve-kernel-3.10 : rhel 7
 

symmcom

Well-Known Member
Oct 28, 2012
1,077
26
48
Calgary, Canada
www.symmcom.com
I can confirm that the HA setup for VMs still works on Ceph nodes. This is expected, since the Ceph integration in Proxmox does not really change anything in the hypervisor itself. Proxmox mainly uses the Ceph API to perform basic tasks and show information in the GUI.
 

impire

Member
Jun 10, 2010
106
0
16
Thank you. Do I need to do anything to switch to the Red Hat kernel?

I tried to install a QLogic adapter card with the driver provided by the manufacturer. The installer reported the pve kernel and could not find a matching Red Hat RPM:

"Kernel version 2.6.32-29-pve
Binary rpm not found for the above adapters"
 

impire

Member
Jun 10, 2010
106
0
16
Has anyone successfully run an InfiniBand network at 20 Gbps or 40 Gbps for the Ceph nodes? If so, I would greatly appreciate feedback on which servers (motherboards/processors) you are using that can push such high speeds. Thank you.
 

impire

Member
Jun 10, 2010
106
0
16
Proxmox uses the RHEL kernel. It was stated in the forum that any device that supports RHEL would be compatible with Proxmox.

When I tried to install a QLogic NIC driver, it gave me an error:

"Kernel version 2.6.32-29-pve
Binary rpm not found for the above adapters"

Does this mean it is recognizing Proxmox's own kernel and not RHEL?

Can anyone please point me in the right direction?

Thank you very much in advance for your help.
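
The error text suggests the installer looks at the running kernel's release string (what `uname -r` prints) when picking a binary package. That is an assumption about the vendor installer, not something confirmed in this thread. A minimal Python sketch for checking which kind of kernel a node is running; the helper name `is_pve_kernel` is made up for illustration:

```python
import platform


def is_pve_kernel(release: str) -> bool:
    """Return True if the kernel release string carries the Proxmox '-pve' suffix."""
    return release.endswith("-pve")


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r`
    release = platform.release()
    kind = "Proxmox-built kernel" if is_pve_kernel(release) else "not a -pve kernel"
    print(f"{release}: {kind}")
```

For the release string in the error above, `is_pve_kernel("2.6.32-29-pve")` is true, so a tool that only matches stock RHEL release strings would not find a package for it.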
 

udo

Well-Known Member
Apr 22, 2009
5,857
161
63
Ahrensburg; Germany
Hi,
I have "only" 10 Gbit Ethernet (with a dedicated OSD network).
The OSD network does partially use that speed (after I inserted additional disks, the Ceph cluster used up to 1.4 GByte/s to reorganize itself), but the speed for the client is not so good... I guess that 20 or 40 Gbit won't change much.

But perhaps someone can show better values?

Udo
 

spirit

Well-Known Member
Apr 2, 2010
3,432
146
63
www.odiso.com
Hi, I don't think you can reach 20 Gbit/s with IP over InfiniBand (I think maybe 4 Gbit/s max).
Full RDMA support is coming soon in Ceph, so I think you'll then be able to reach native InfiniBand speed.
 

udo

Well-Known Member
Apr 22, 2009
5,857
161
63
Ahrensburg; Germany
Hi,
a little bit more is possible.

This shows an InfiniBand connection (40 Gbit/s) tested with iperf:
Code:
root@proxtest3:~# iperf -c 172.20.1.82
------------------------------------------------------------
Client connecting to 172.20.1.82, TCP port 5001
TCP window size:  645 KByte (default)
------------------------------------------------------------
[  3] local 172.20.1.83 port 51412 connected with 172.20.1.82 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  15.7 GBytes  13.4 Gbits/sec

lspci
07:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
Udo
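
As a sanity check on the iperf figures above, the reported bandwidth can be recomputed from the transfer size and the interval. This is a minimal Python sketch. It assumes that iperf counts a GByte as 2^30 bytes and a Gbit as 10^9 bits, which fits the numbers shown but is an assumption about this iperf build. The function name is made up for illustration.

```python
def iperf_gbits_per_sec(gbytes_transferred: float, seconds: float) -> float:
    """Convert an iperf transfer figure into Gbit/s.

    Assumption: GBytes are binary (2**30 bytes), Gbits are decimal (10**9 bits).
    """
    bits = gbytes_transferred * 2**30 * 8
    return bits / seconds / 1e9


if __name__ == "__main__":
    # Values from the iperf run above: 15.7 GBytes in 10.0 sec
    rate = iperf_gbits_per_sec(15.7, 10.0)
    print(f"{rate:.1f} Gbit/s")
```

The result lands within rounding of the 13.4 Gbits/sec that iperf printed (the displayed transfer size and interval are themselves rounded), which is roughly a third of the link's nominal 40 Gbit/s.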
 

Florent

Member
Apr 3, 2012
91
2
8
I think I have found a bug in the pveceph tool: when ceph.conf contains custom sections (such as mds) and pveceph needs to rewrite the file, it deletes the sections it doesn't know about (such as the Ceph mds section!).

Fortunately, I had a backup copy of my config file :)
 

impire

Member
Jun 10, 2010
106
0
16
I am drooling while looking at your iperf result. Man, that is some serious speed for the Ceph nodes. An IT guy's wet dream.

May I ask for the brand/model of the:

1) Switches and NICs you are using?

2) Cable type (I think InfiniBand CX4 cables only support up to 10 Gbps)

If you don't mind, please share your server config, hard drive type, etc. I know it takes a lot of CPU horsepower to push that kind of speed. Furthermore, the hard drive transfer speed should also be a bottleneck. I am perplexed as to how you can push this kind of speed.

Thanks in advance for your help.
 

impire

Member
Jun 10, 2010
106
0
16
I tried to find information on RDMA with Ceph, but what I found is very vague. What exactly is RDMA, and how will it help Ceph improve speed? Thanks.
 

udo

Well-Known Member
Apr 22, 2009
5,857
161
63
Ahrensburg; Germany
impire said:
I am drooling while looking at your iperf result. Man, that is some serious speed for the Ceph nodes. An IT guy's wet dream.
May I ask for the brand/model of the:
1) Switches and NICs you are using?

Hi,
as mir already wrote: a Mellanox 40 Gbit card (without a switch; it's a direct connection for DRBD). For Ceph I use 10 Gbit Ethernet.

impire said:
2) Cable type (I think InfiniBand CX4 cables only support up to 10 Gbps)

I'm not sure what this cable is called... QSFP?!

impire said:
If you don't mind, please share your server config, hard drive type, etc. I know it takes a lot of CPU horsepower to push that kind of speed. Furthermore, the hard drive transfer speed should also be a bottleneck. I am perplexed as to how you can push this kind of speed.
Thanks in advance for your help.

In this case the servers are test servers (set up to track down an issue with a DRBD connection): two AMD boxes, one with an FX-8350 and one with a 965.

Normally the cards are in dual-Opteron servers.

Udo
 

felipe

Member
Oct 28, 2013
152
1
18
If you run the same benchmark in parallel, e.g. in 100 guests, you will see the difference between Ceph RBD and your SATA RAID 1. If your goal is a very fast single VM, then Ceph is not the winner. A fast hardware RAID with a lot of cache, SSD only or SSD plus SAS HDDs, is a good choice here.

Now I have a Ceph cluster with 3 × 15 SATA disks (journals on the disks), 10 Gbit Ethernet for external traffic and 10 Gbit for the OSDs...
In my Windows guest I get only 115 MB/s for sequential reads and 7 MB/s for 4k reads (which is OK),
but the sequential reads are really low in the guest...

rados -p test bench -b 4194304 60 seq -t 1 --no-cleanup
gives me:
Total time run: 23.298631
Total reads made: 2829
Read size: 4194304
Bandwidth (MB/sec): 485.694


Average Latency: 0.00823321
Max latency: 0.018985
Min latency: 0.004756

So that's well over 400 MB/s with one thread.
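
The bandwidth figure in the rados bench output above can be reproduced from the other fields, and the average latency explains why a single thread tops out where it does: with one 4 MiB read in flight at a time, throughput cannot exceed read size divided by latency. A minimal Python sketch of that arithmetic follows. The function names are made up for illustration, and it assumes "MB" in the output means 2^20 bytes.

```python
def bandwidth_mb_per_sec(ops: int, op_size_bytes: int, seconds: float) -> float:
    """Aggregate bandwidth in MB/s, with 1 MB taken as 2**20 bytes."""
    return ops * op_size_bytes / 2**20 / seconds


def single_thread_ceiling_mb_per_sec(op_size_bytes: int, avg_latency_s: float) -> float:
    """With one op in flight, throughput is bounded by op size / average latency."""
    return op_size_bytes / 2**20 / avg_latency_s


if __name__ == "__main__":
    # Values taken from the rados bench output above
    print(round(bandwidth_mb_per_sec(2829, 4194304, 23.298631), 3))
    print(round(single_thread_ceiling_mb_per_sec(4194304, 0.00823321), 1))
```

Both calculations come out at roughly 485 MB/s, so the single-thread result is consistent with being latency-bound rather than limited by the network or the disks.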

Write speed is very high in the virtual machine: sequential is around 500 MB/s and 4k is 14 MB/s with a 4000 MB test file (maybe that's the RBD cache), but iostat on one of the Ceph nodes shows between 400 and 700 MB/s, so the data is actually being written there too...

Do you have any ideas for tuning the read speed for guests? (In this case it's important that they are Windows.)
 

felipe

Member
Oct 28, 2013
152
1
18
OK, I got it: virtio makes a BIG speed difference.
ide: 110 MB/s read
virtio: 500 MB/s read

What is funny is that both give me 500 MB/s write, virtio and IDE alike.
The cache modes make no difference either: writeback and none both give 500 MB/s.
I am using 4 GB tests in CrystalDiskMark so I don't hit the cache. With 1 GB or 500 MB it gets a lot faster because of the cache...

What is also very strange is that I get better speeds inside the VM than with rados bench (even with 32 or 64 threads). atop also shows more disk load, and the Ceph log also says that the disk load is higher...

But I still have the bottleneck of 500 MB/s write for the whole cluster (45 disks with journals on the disks, 3 servers with 15 disks each, 10 Gbit for external and 10 Gbit for OSD traffic)...
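
One way to reason about the 500 MB/s cluster-wide write ceiling: with journals on the same disks as the data, each replica is written twice (journal first, then data), and every client write is stored once per replica. A rough Python sketch of the resulting per-disk load follows. The replica count is not stated in this thread, so the value 3 used below is purely an assumed example, and the function name is hypothetical.

```python
def per_disk_write_mb_per_sec(
    client_mb_s: float,
    disks: int,
    replicas: int,
    journal_on_data_disk: bool = True,
) -> float:
    """Estimate raw write load per disk for a given client write rate.

    Assumes writes are spread evenly over all disks and that a journal on the
    data disk doubles the bytes written to that disk.
    """
    journal_factor = 2 if journal_on_data_disk else 1
    return client_mb_s * replicas * journal_factor / disks


if __name__ == "__main__":
    # 500 MB/s client writes, 45 disks; replicas=3 is an assumption, not from the thread
    load = per_disk_write_mb_per_sec(500, disks=45, replicas=3)
    print(f"{load:.1f} MB/s per disk")
```

Under that assumed replica count, 500 MB/s of client writes would mean roughly 67 MB/s of raw, seek-heavy writes per SATA disk, which is why moving journals off the data disks is a common first thing to test.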
 

spirit

Well-Known Member
Apr 2, 2010
3,432
146
63
www.odiso.com
You should try benchmarking from a Linux VM with a virtio disk; the "fio" tool is really good for this.
virtio on Windows is known to be slower than on Linux.
 
