Ceph Octopus

I'm on the Ceph testing repos - but it seems to be 14.2.10 (Nautilus) at the moment, I'm guessing they're not ready for us to test yet.
 
Since Ceph upgrades need to be planned and sometimes require manual steps, we always create a separate repository for each major Ceph version, so that people don't run a major upgrade "by mistake".

For the Ceph Octopus testing repo, use the following apt sources list entry:
deb http://download.proxmox.com/debian/ceph-octopus buster test
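
A minimal sketch of adding that entry, assuming it goes into /etc/apt/sources.list.d/ceph.list (the path is an assumption; the post only gives the entry itself) and that it is run as root. The helper name write_ceph_test_repo is made up for illustration, and it overwrites whatever the target file contained before.

```shell
# Hypothetical helper: write the Octopus test repo entry to an apt list file.
# The default path is an assumption; adjust it to wherever your Ceph repo lives.
# Any previous content of the file (e.g. the Nautilus entry) is overwritten.
write_ceph_test_repo() {
    list_file="${1:-/etc/apt/sources.list.d/ceph.list}"
    echo "deb http://download.proxmox.com/debian/ceph-octopus buster test" > "$list_file"
}

# Usage (as root), followed by a package index refresh:
#   write_ceph_test_repo
#   apt update
```

Switching the repository only makes the Octopus packages visible to apt; the upgrade itself still has to be started deliberately.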
 
Oh this is awesome!

I can try this now on my test cluster.

I already have a 3-node Proxmox cluster setup with Ceph Nautilus. Should I just do the upgrade, or would you recommend re-installing the cluster from scratch, and installing Ceph Octopus right from the beginning?

Also - are there any caveats or gotchas I should be aware of with this Octopus testing repository yet?
 
I already have a 3-node Proxmox cluster setup with Ceph Nautilus. Should I just do the upgrade, or would you recommend re-installing the cluster from scratch, and installing Ceph Octopus right from the beginning?

If it's just a test setup, go for the upgrade. It already gives you a feel for the production upgrade once Octopus is out of the test repo.
Also, if there's something we overlooked we can fix it easily in the test phase.

Also - are there any caveats or gotchas I should be aware of with this Octopus testing repository yet?
The placement-group autoscaler will warn in Octopus about existing pools with a pg_num it deems unfit. If you enable it for those pools, you may experience quite a bit of rebalancing. For new pools, PG autoscaling is on by default.
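
To see what the autoscaler would do before enabling it, something like the following sketch may help. It wraps the standard `ceph osd pool autoscale-status` and `ceph osd pool set ... pg_autoscale_mode` commands; the function names and the pool name `mypool` are made up for illustration.

```shell
# Show what the autoscaler thinks of each pool (current vs. suggested pg_num).
show_autoscale_status() {
    ceph osd pool autoscale-status
}

# Set the autoscaler mode for one pool: "off", "warn" or "on".
# "warn" only raises a health warning; "on" lets Ceph change pg_num itself,
# which is what can trigger the rebalancing described above.
set_autoscale_mode() {
    pool="$1"
    mode="$2"
    case "$mode" in
        off|warn|on) ceph osd pool set "$pool" pg_autoscale_mode "$mode" ;;
        *) echo "mode must be off, warn or on" >&2; return 1 ;;
    esac
}

# Usage, with a hypothetical pool name:
#   show_autoscale_status
#   set_autoscale_mode mypool warn
```

Leaving an existing pool on "warn" first and switching to "on" during a quiet period keeps the rebalancing under your control.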

There's an in-progress upgrade how-to:
https://pve.proxmox.com/wiki/Ceph_Nautilus_to_Octopus
 
I can confirm I was able to successfully upgrade a Proxmox 3-node cluster from Nautilus to Octopus.

Immediately after the upgrade, the OSDs showed up as out-of-date:

[Attached screenshot: Screen Shot 2020-07-08 at 6.11.58 am.png]

I simply rebooted each node one by one, and it seems to be all good now.
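
For anyone who wants to check this from the shell rather than the GUI, here is a rough sketch. It assumes `ceph versions` lists each daemon's version string including the release name (for example "nautilus"); the helper name daemons_still_on is made up.

```shell
# Succeeds if any daemon in `ceph versions` still reports the given release
# name (e.g. "nautilus"), i.e. it is still running the old binaries.
daemons_still_on() {
    ceph versions | grep -q "$1"
}

# Usage, on any node with an admin keyring:
#   ceph versions
#   if daemons_still_on nautilus; then
#       echo "some daemons still need a restart"
#   fi
#   # Restarting the OSDs of one node without a full reboot:
#   systemctl restart ceph-osd.target
```

The upgrade how-to prescribes the order in which daemons are restarted, so treat the restart line as an illustration of the mechanism, not as a replacement for those steps.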

This cluster didn't have any data stored on the OSDs beforehand though - I have another cluster with around 30TB of data I could try an upgrade on, if that provides any useful data?

By the way - is there an easy way to enable the Ceph dashboard, and access it?

(I remember when I tried it before, I had some issues with Ceph keyrings not being in the right place.)

I also saw this more recent post, which gave the commands below.

On each node, I ran
Code:
# apt-get install ceph-mgr-dashboard
On one node:
Code:
# ceph mgr module enable dashboard
# ceph dashboard create-self-signed-cert
Self-signed certificate created
# ceph dashboard ac-user-create victorhooi SANITISED administrator
{"username": "victorhooi", "password": "$2b$12$SANITISED.", "roles": ["administrator"], "name": null, "email": null, "lastUpdate": 1594155899, "enabled": true, "pwdExpirationDate": null, "pwdUpdateRequired": false}

However, I then tried to access the dashboard on port 8443 - and I get a certificate error in Chrome, and Safari says it can't open a secure connection.

I'm assuming at this point it's an SSL cert issue - any advice on how to get it running? (Or possibly how to enable HTTP.)
 
You can add the self-signed cert to your Mac via the 'Keychain Access' app, then set it as 'trusted', but Chrome and Safari will still complain. If it's on a closed/private network anyway, you can issue ceph config set mgr mgr/dashboard/ssl false
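
A sketch of the HTTP route, assuming the dashboard module has to be disabled and re-enabled for the SSL setting to take effect (that is how the Ceph dashboard documentation describes it). The function name dashboard_disable_ssl is made up.

```shell
# Switch the dashboard to plain HTTP and reload the module so the change
# takes effect. Only sensible on a closed/private network.
dashboard_disable_ssl() {
    ceph config set mgr mgr/dashboard/ssl false &&
    ceph mgr module disable dashboard &&
    ceph mgr module enable dashboard
}

# Usage:
#   dashboard_disable_ssl
#   ceph mgr services   # prints the URL the dashboard is now served on
```

The port usually changes when SSL is turned off, so it is safer to read the URL from `ceph mgr services` than to keep using 8443.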
 
We updated our test cluster to Octopus a few days ago using the how-to in the Proxmox Wiki. For us it worked without any issues.
After the upgrade, all OSDs, managers and monitors showed that a newer version was available and had to be restarted.
In our test environment we've also enabled the autoscaler and the Ceph dashboard. That also worked without any issues or service disruptions.
 
You can also enable writeback caching on your VM disks. With Octopus, this gives a big performance boost on writes, without a negative impact on reads :)
 
I saw that Ceph Octopus 15.2.5 was released a couple of weeks back.

Yes, more precisely just two weeks ago - and yes, we are working on this.
 
Please build out the monitoring functionality so that per-RBD disk and per-pool performance stats can be viewed in the PVE GUI rather than the ceph mgr dashboard or external grafana host.
 
Hi,
when will octopus be in the main (not test) repository?
I assume together with the release of Proxmox VE 6.3 (later this year).
 
Is there going to be an automatic upgrade to Octopus as part of the regular PVE 6.3 releases at some point?
No. Ceph upgrades need to be planned and done with care, and we do not want to force admins to deal with that by automatically enabling Ceph Octopus. Note also that people still upgrade from Proxmox VE 5.4 and need to update from Ceph Luminous (12.2) to Ceph Nautilus (14.2) first; a direct upgrade from Luminous to Octopus (15.2) is not possible.

Still, all users should plan their upgrade to Octopus, ideally for the first half of 2021, as Ceph Nautilus will become EOL after 2021-06-01 and won't receive any bug or security fixes after that.
While you need to plan and should take care, you won't have any trouble as long as you start out with a healthy cluster and follow our upgrade guide as closely as possible, without skipping any step or changing the order (unless you are 110% sure):
https://pve.proxmox.com/wiki/Ceph_Nautilus_to_Octopus

Do you guys have dates for PVE 7.x, assuming that either Octopus or Pacific will be present there?
Pacific needs to have a stable release first before we can make any specific planning.
Proxmox VE 7.0 will start out with a Ceph release equal to or later than Ceph Octopus, as Nautilus may already be EOL once PVE 7.0 is released. Any other details cannot be stated yet; doing so would be guessing at best.
 