Could you describe your ceph environments? How many servers, how many switches, and so on. 10GbE?
I'm running small 3-node clusters, 18-24 OSDs per cluster (1.6TB Intel S3610 SSD or 3.2TB HGST NVMe).
Fast CPU frequency per node (10-12 cores, 3GHz Intel). Replication x3.
Debian Stretch/Luminous with Bluestore, and Jessie/Jewel with Filestore.
2x10GbE per Ceph node (Ceph public and private networks on the same link).
2x10GbE per Proxmox node (SAN + LAN on the same links, different VLANs).
Proxmox nodes also have fast CPUs (3GHz) to reduce latency.
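For reference, here's a minimal ceph.conf sketch for that kind of layout; the subnet and values are placeholders, not my real addressing:

    [global]
        # public and cluster (private) traffic share the same 2x10GbE link,
        # so both networks point at the same subnet
        public_network  = 10.10.10.0/24
        cluster_network = 10.10.10.0/24
        # replication x3; keep accepting I/O while 2 copies are available
        osd_pool_default_size     = 3
        osd_pool_default_min_size = 2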
I'm also using CephFS and RadosGW for sharing data between my VMs, on a dedicated cluster.
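Inside the VMs it's just a standard kernel CephFS mount against that dedicated cluster, something like this (monitor address, client name and paths here are examples only):

    # mount CephFS from the dedicated cluster inside a VM
    mount -t ceph 10.10.20.1:6789:/ /mnt/shared \
        -o name=vmclient,secretfile=/etc/ceph/vmclient.secret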
Small clusters because they're simpler to upgrade, and if I don't have enough storage for a specific VM, we simply move the disk with Proxmox.
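Moving a disk to another storage is a one-liner with qm (the VM ID, disk and storage name below are examples only):

    # move disk scsi0 of VM 100 to another storage pool
    qm move_disk 100 scsi0 other-ceph-pool
    # add --delete 1 if you don't want to keep the source image around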
I know 2 people who have triggered this bug... The corruption bug (well known and easily triggered: just rebalance a sharded volume to lose data) took years to be fixed.
Also, I don't know if this has changed, but resyncing a VM volume/file required scanning all blocks of the source file.