Big Proxmox installations

Could you describe your Ceph environments? How many servers, how many switches, and so on? 10GbE?

I'm running small 3-node clusters, 18-24 OSDs per cluster (1.6TB Intel S3610 SSD or 3.2TB HGST NVMe).
High CPU frequency per node (10-12 cores, 3GHz Intel). Replication x3.
Debian stretch/Luminous with BlueStore, and jessie/Jewel with FileStore.
2x10Gb per Ceph node (Ceph public and cluster network on the same link).
2x10Gb per Proxmox node (SAN + LAN on the same links, different VLANs).
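
For context, "public and cluster network on the same link" just means pointing both Ceph networks at the same subnet/VLAN; a minimal ceph.conf sketch (the subnet below is only a placeholder):
Code:
# /etc/ceph/ceph.conf - network section only
[global]
public network  = 10.10.10.0/24
cluster network = 10.10.10.0/24   # same subnet, so replication traffic shares the 2x10Gb links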

The Proxmox nodes also have fast CPUs (3GHz) to reduce latency.


I'm also using CephFS and RadosGW for sharing data between my VMs, on a dedicated cluster.

I keep the clusters small because it's simpler to upgrade, and if I don't have enough storage for a specific VM, we simply move the disk with Proxmox.


The corruption bug (always triggered and well known: just rebalance a sharded volume to lose data) took years to be fixed.
I know 2 people who have triggered this bug...
Also, I don't know if it has changed, but resyncing a VM volume/file required scanning all blocks on the source file.
 
I'm running small 3-node clusters, 18-24 OSDs per cluster (1.6TB Intel S3610 SSD or 3.2TB HGST NVMe).
High CPU frequency per node (10-12 cores, 3GHz Intel). Replication x3.
Debian stretch/Luminous with BlueStore, and jessie/Jewel with FileStore.
2x10Gb per Ceph node (Ceph public and cluster network on the same link).
2x10Gb per Proxmox node (SAN + LAN on the same links, different VLANs).

So, you are putting OSDs and MONs on the same servers. Interesting.
If I understood properly: 3 Ceph servers hosting both OSDs and MONs, 2x10Gb for redundancy with the public and cluster networks on the same link, and then Proxmox connects to these 3 servers via a 10Gb link (also used for the LAN).

How much RAM on the Ceph nodes?

Also, I don't know if it has changed, but resyncing a VM volume/file required scanning all blocks on the source file.

With sharding, only changed shards are synced.
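
In case it helps anyone, enabling it is just a couple of volume options; a rough sketch, where the volume name "vmstore" is only a placeholder (and sharding should only be enabled on a new, empty volume):
Code:
# turn on sharding for a GlusterFS volume used for VM images
gluster volume set vmstore features.shard on
# shard size (64MB is the usual default; larger shards mean fewer files per disk image)
gluster volume set vmstore features.shard-block-size 64MB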
 
How much RAM on the Ceph nodes?
64GB for my OSD nodes (6-8 OSDs).
128GB for my Ceph MDS for CephFS (I have around 100,000,000 files).



With sharding, only changed shards are synced.
Great! I see that this has been the case since 3.7. It was really bad before that.

Another Ceph feature I'm using is RBD snapshot export/import for disaster recovery. It works really well.
Also snapshot/rollback. (qcow2 on top of Gluster can sometimes be dangerous for snapshots.)
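
To give an idea of the export/import flow I mean (the pool, image, snapshot names and the "dr-site" host are only placeholders):
Code:
# one-time full copy of the image to the DR cluster
rbd snap create rbd/vm-100-disk-1@base
rbd export rbd/vm-100-disk-1@base - | ssh dr-site rbd import - rbd/vm-100-disk-1
ssh dr-site rbd snap create rbd/vm-100-disk-1@base
# afterwards, only the blocks changed since the last snapshot are shipped
rbd snap create rbd/vm-100-disk-1@daily1
rbd export-diff --from-snap base rbd/vm-100-disk-1@daily1 - | ssh dr-site rbd import-diff - rbd/vm-100-disk-1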
 
We have a 5-node Proxmox cluster and in the early days tried out GlusterFS and found it to be a buggy-as-hell, slow disaster :-( It doesn't help that Proxmox bundled ancient horrendously buggy versions of GlusterFS for a long time (and weren't interested in updating them when I queried it in this forum). GlusterFS actually managed to create 1 *million* open file handles in a few hours and then refused to let you do any new operations after that, making it utterly useless for us. Its I/O performance - particularly for writing - seemed to be dismal too.

We're now using SANs (dual bonded gigabit for speed) and iSCSI to provide the filestore to the Proxmox hosts, which works pretty well and makes live migration quite easy. What I don't like is that recent attempts to upgrade Proxmox on our clusters have generally been a failure. It seems to me that the Proxmox devs don't test cluster upgrades much - quite often you can't live migrate between different Proxmox versions (often between major releases and sometimes even between minor releases!), which is a disaster for a cluster that's supposed to have high uptimes.

Our most recent upgrade was so bad (the first step - just upgrading packages within the current version and rebooting before starting the upgrade to the next version - actually borked the whole install and dumped me into an initramfs prompt after the reboot!) that we ended up ditching Proxmox completely on that node, installing CentOS 7 on the host and using virt-manager to run the VMs we had (that particular setup wasn't using iSCSI).

I'm now very scared to do any Proxmox "warm" updates - my gut feeling is that the only safe update is to wipe the machine with a fresh Proxmox install from an ISO and reconfigure it from a backup of the config!
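
For what it's worth, the "backup of the config" in my case is nothing more than a tarball of the obvious paths, taken while the node is still up - a rough sketch, not a complete recipe:
Code:
# quick snapshot of the node configuration before touching anything (adjust paths per node)
tar czf /root/pve-config-$(hostname)-$(date +%F).tar.gz \
    /etc/pve /etc/network/interfaces /etc/hosts /etc/hostname \
    /etc/apt/sources.list /etc/apt/sources.list.d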
 
It doesn't help that Proxmox bundled ancient horrendously buggy versions of GlusterFS for a long time
Proxmox never bundled GlusterFS packages.
 
quite often you can't live migrate between different Proxmox versions (often between major releases and sometimes even between minor releases!), which is a disaster for a cluster that's supposed to have high uptimes.

I have never had a problem with live migration; I've upgraded my clusters from Proxmox 2 through Proxmox 5 without reinstalling.

Of course, you can't migrate from a newer qemu version to an older qemu version.
But older->newer qemu has never been a problem.
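
In other words: check the qemu package on both nodes and always migrate from the older node to the newer one. A rough example (VMID 100 and the node name "node2" are placeholders):
Code:
# compare pve-qemu-kvm versions on source and target node
pveversion -v | grep qemu
ssh node2 pveversion -v | grep qemu
# live-migrate VM 100 to the already-upgraded node
qm migrate 100 node2 --online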
 
We have a 5-node Proxmox cluster and in the early days tried out GlusterFS and found it to be a buggy-as-hell, slow disaster :-( It doesn't help that Proxmox bundled ancient horrendously buggy versions of GlusterFS for a long time (and weren't interested in updating them when I queried it in this forum). GlusterFS actually managed to create 1 *million* open file handles in a few hours and then refused to let you do any new operations after that, making it utterly useless for us. Its I/O performance - particularly for writing - seemed to be dismal too.

@wolfgang already answered this. That Gluster botched a couple of releases is not our fault.

We're now using SANs (dual bonded gigabit for speed) and iSCSI to provide the filestore to the Proxmox hosts, which works pretty well and makes live migration quite easy. What I don't like is that recent attempts to upgrade Proxmox on our clusters have generally been a failure. It seems to me that the Proxmox devs don't test cluster upgrades much - quite often you can't live migrate between different Proxmox versions (often between major releases and sometimes even between minor releases!), which is a disaster for a cluster that's supposed to have high uptimes.

We do test cluster upgrades - regular updates, minor releases, and major releases. All of our infrastructure runs on PVE as well ;)

Our most recent upgrade was so bad (the first step - just upgrading packages within the current version and rebooting before starting the upgrade to the next version - actually borked the whole install and dumped me into an initramfs prompt after the reboot!) that we ended up ditching Proxmox completely on that node, installing CentOS 7 on the host and using virt-manager to run the VMs we had (that particular setup wasn't using iSCSI).

I'm now very scared to do any Proxmox "warm" updates - my gut feeling is that the only safe update is to wipe the machine with a fresh Proxmox install from an ISO and reconfigure it from a backup of the config!

While that is very unfortunate, most of our users have a different experience. It might be worth re-evaluating and trying to get to the bottom of your issue, since it is not the expected behaviour.
 
We have a 5-node Proxmox cluster and in the early days tried out GlusterFS and found it to be a buggy-as-hell, slow disaster :-( It doesn't help that Proxmox bundled ancient horrendously buggy versions of GlusterFS for a long time (and weren't interested in updating them when I queried it in this forum). GlusterFS actually managed to create 1 *million* open file handles in a few hours and then refused to let you do any new operations after that, making it utterly useless for us. Its I/O performance - particularly for writing - seemed to be dismal too.

We're now using SANs (dual bonded gigabit for speed) and iSCSI to provide the filestore to the Proxmox hosts, which works pretty well and makes live migration quite easy. What I don't like is that recent attempts to upgrade Proxmox on our clusters have generally been a failure. It seems to me that the Proxmox devs don't test cluster upgrades much - quite often you can't live migrate between different Proxmox versions (often between major releases and sometimes even between minor releases!), which is a disaster for a cluster that's supposed to have high uptimes.

Our most recent upgrade was so bad (the first step - just upgrading packages within the current version and rebooting before starting the upgrade to the next version - actually borked the whole install and dumped me into an initramfs prompt after the reboot!) that we ended up ditching Proxmox completely on that node, installing CentOS 7 on the host and using virt-manager to run the VMs we had (that particular setup wasn't using iSCSI).

I'm now very scared to do any Proxmox "warm" updates - my gut feeling is that the only safe update is to wipe the machine with a fresh Proxmox install from an ISO and reconfigure it from a backup of the config!
What you say is not true. Sorry for your experience, but your experience is not ours. @fabian and @wolfgang have answered well. @spirit Alexandre has answered well too; his company is well placed, especially for 100% HA. I have deployed a lot of Proxmox VE clusters with OpenFiler, FreeNAS, Ceph and GlusterFS storage, and they have always worked well. Same for updates. Currently my personal POC alone is 11 servers, running HPC, PaaS, big data, IoT... and it works well. Like Alexandre, I say: thanks Proxmox, and thanks Ceph (and Gluster too, sorry @spirit :) ).
PS: @wolfgang, please migrate the glusterfs-server .deb to version 4.0, I want to test it once more. Thanks.
 
please migrate the glusterfs-server .deb to version 4.0, I want to test it once more. Thanks.

Given Gluster's release schedule, you'd better use the vendor repository rather than waiting for Proxmox. Gluster is updated almost every month or two.
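
If it helps, on Debian the upstream packages are just an extra APT source; roughly like this, but double-check the exact key and repository path on download.gluster.org first (the URLs below are only illustrative):
Code:
# add the upstream GlusterFS 4.0 repository (verify current paths on download.gluster.org)
wget -O - https://download.gluster.org/pub/gluster/glusterfs/4.0/rsa.pub | apt-key add -
echo "deb https://download.gluster.org/pub/gluster/glusterfs/4.0/LATEST/Debian/stretch/amd64/apt stretch main" > /etc/apt/sources.list.d/gluster.list
apt-get update && apt-get install glusterfs-server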
 
PS: @wolfgang, please migrate the glusterfs-server .deb to version 4.0, I want to test it once more. Thanks.
As Alessandro 123 says, you should use the upstream packages.
Aside from GlusterFS's internal bugs (new features that do not work correctly), these packages work well.

It is a lot of work to build clean and perfectly working packages.
So I don't think we will make our own in the near future.
 
As Alessandro 123 says, you should use the upstream packages.
Aside from GlusterFS's internal bugs (new features that do not work correctly), these packages work well.

It is a lot of work to build clean and perfectly working packages.
So I don't think we will make our own in the near future.
I already use them when I create external GlusterFS storage clusters. I was thinking especially of the case where it's done directly with Proxmox VE. There is even an oVirt web UI for Gluster (not the oVirt virtualization web UI) which allows you to automate the deployment of clusters at large scale.
 

Attachment: Cluster-gluster.jpg
