Search results

  1. [SOLVED] Ceph Luminous to Nautilus upgrade issues

    All hardware was recommended by SMC engineering after consultation. I have two clusters, each cluster contains:
    Ceph Storage Node Hardware: SMC SSG-6029P-E1CR12L (x3)
    24 x Xeon Gold 6128 CPU @ 3.40GHz
    192GB memory
    2x Samsung SM863a 1.9TB SSD (one for system, one for CephDB)
    10x Toshiba...

  2. [SOLVED] Ceph Luminous to Nautilus upgrade issues

    I have upgraded my 6 node cluster (3 ceph-only plus 3 compute-only nodes) from 5.4 to 6. The Ceph config was created on the Luminous release and I am following the upgrade instructions provided at https://pve.proxmox.com/wiki/Ceph_Luminous_to_Nautilus. During the upgrade the OSDs were...
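
    For reference, the Ceph part of that wiki procedure boils down to roughly the following command sequence (a condensed sketch only; the linked wiki page is the authoritative version and also covers the package upgrade itself):

        ceph osd set noout                     # avoid rebalancing while daemons restart
        systemctl restart ceph-mon.target      # on each monitor node, one at a time
        systemctl restart ceph-mgr.target      # on each manager node
        systemctl restart ceph-osd.target      # on each OSD node, one node at a time
        ceph osd require-osd-release nautilus  # once all daemons report nautilus
        ceph mon enable-msgr2                  # enable the new messenger protocol
        ceph osd unset noout
        ceph versions                          # confirm every daemon runs the new release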

  3. BlueFS spillover detected on 30 OSD(s)

    Thanks for confirming that. I was about to post a followup with my observations that the bluestore setting was the only way I could get it to create the partition size I needed. The other thing that got me is that you can't just delete one converted OSD from luminous and re-add it with...

  4. BlueFS spillover detected on 30 OSD(s)

    I thought I would try re-creating the OSDs with Nautilus, but now it's creating the DB LVM size at about 370 GB, which I guess is 10% of the OSD. However, my SSD is only 1.7TB, so after creating 4 of the 10 OSDs in that server, it runs out of space on the SSD. I have tried using the size...
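
    The ~370 GB figure is the 10%-of-OSD default applied when no explicit DB size is configured. As the follow-up quoted in result 3 above notes, pinning the size in ceph.conf before (re)creating the OSDs is what worked; a rough sketch, where ~170 GiB is only an example chosen so ten DBs fit on a 1.7 TB SSD:

        # Add to /etc/pve/ceph.conf before (re)creating the OSDs:
        #   [osd]
        #   bluestore_block_db_size = 182536110080    # 170 GiB, value is in bytes

        # Then create each OSD with its DB on the shared SSD
        # (assuming the PVE 6 pveceph syntax; device names are placeholders):
        pveceph osd create /dev/sdX --db_dev /dev/sdY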

  5. BlueFS spillover detected on 30 OSD(s)

    Hi all, After an upgrade on one cluster from 5.4 to 6.0, I performed the Ceph upgrade procedures listed here: https://pve.proxmox.com/wiki/Ceph_Luminous_to_Nautilus Somewhere along the way, in the midst of all the messages, I got the following WARN: BlueFS spillover detected on 30 OSD(s). In...

  6. Limit node usage through storage availability

    I have three ceph nodes in my Proxmox Cluster that I do NOT want users creating VMs on or the system automatically moving VMs to. At first I thought I could do that through the permissions system, but after reading some posts it looks like removing storage should do it. I'm just posting this...
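
    For reference, that restriction can be expressed directly on the storage definition, so VM disks (and therefore the VMs) can only live on the compute nodes; a small sketch, with the storage name "ceph-vm" and the node names as placeholders:

        # Make the RBD storage available only on the compute nodes:
        pvesm set ceph-vm --nodes node4,node5,node6

        # Equivalent entry in /etc/pve/storage.cfg:
        #   rbd: ceph-vm
        #        content images
        #        nodes node4,node5,node6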

  7. Best Practices for new Ceph cluster

    Just a quick update. In one cluster I was able to simply delete all the OSDs and pools, add new OSDs and create new pools. That worked perfectly. On a second cluster, where all the VM images were already on Ceph, I added all the new OSDs and removed the OSDs from one of the existing nodes...

  8. Best Practices for new Ceph cluster

    Thanks for that advice. I am definitely replacing the older Ceph nodes, not adding to them, and would like to avoid the re-balancing act at all costs. I may have another way to go... The reason the current Ceph cluster is being replaced is that it never quite worked right and had issues when I started...

  9. Best Practices for new Ceph cluster

    Hi all. I have an existing PVE/Ceph cluster that I am currently upgrading. The PVE portion is rather straight forward, I'm adding the new nodes to the PVE cluster, moving VMs off the old ones, then removing the old ones from the cluster. Easy Peasy. However, what I don't know is the best...
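
    A common pattern for the Ceph side, and roughly what the follow-up posts in this thread describe, is to add the new OSDs first and then drain the old nodes one OSD at a time; a rough sketch using plain Ceph commands, with <id> standing for each OSD on a retiring node:

        ceph osd out <id>                            # start moving its PGs elsewhere
        ceph -s                                      # wait for the rebalance to finish / HEALTH_OK
        systemctl stop ceph-osd@<id>                 # on the node hosting that OSD
        ceph osd purge <id> --yes-i-really-mean-it   # remove it from the cluster and CRUSH map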

  10. Ceph performance troubleshooting

    Because all documentation says anything less than enterprise SSD will be disappointing. I agree the PMs have a good price point, though.

  11. Ceph performance troubleshooting

    Thanks for the input and big message to follow. It would be nice if there were a concise guide specifically for troubleshooting ceph, especially what to check in what order. Like how to verify ceph network, osd, and physical hardware performance. How to read the dashboard (Reads/Writes/IOPS)...
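
    In the absence of such a guide, a commonly used bottom-up order is network first, then the cluster as a whole, then individual OSDs; a short sketch (pool and host names are placeholders):

        # 1. Raw network between Ceph nodes (run "iperf3 -s" on the far end first).
        iperf3 -c <other-ceph-node>

        # 2. Cluster-level write/read performance against a pool.
        rados bench -p <pool> 10 write --no-cleanup
        rados bench -p <pool> 10 rand
        rados -p <pool> cleanup

        # 3. Per-OSD commit/apply latency, to spot a single slow disk.
        ceph osd perf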

  12. Help understand relationship between ceph pools

    Now that was worth the price of admission! Thanks!

  13. Ceph performance troubleshooting

    There are two pools, one RBD with 512 PGs, and now a cephFS with 128 PGs. The health reports as HEALTH_OK, but it changes as the problem emerges: 2018-12-11 10:34:02.014813 mon.belle mon.0 192.168.201.241:6789/0 114595 : cluster [WRN] Health check failed: 13 slow requests are blocked > 32 sec...
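
    When that warning shows up, the usual next step is to find out which OSDs the blocked requests are sitting on and what they are waiting for; a short sketch (osd.N is a placeholder):

        ceph health detail                       # names the OSDs with slow requests
        ceph daemon osd.N dump_ops_in_flight     # on the node hosting osd.N: what is stuck right now
        ceph daemon osd.N dump_historic_ops      # recent slow operations and their durations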

  14. PVE 4.4-1 rename VM disks

    It's been a while since I've done that with LVM. You will need to look closer at renaming LVMs. I would expect you would need to shutdown the VM, rename the LV, change the config file to match, then restart the VM. If you are in a cluster, then you have to run that on the node that owns the VM.
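
    Sketched out, that sequence looks roughly like this (VM ID 100, volume group "pve", and the disk names are placeholders):

        # Run on the node that owns the VM.
        qm shutdown 100
        lvrename pve vm-100-disk-1 vm-100-disk-2                                  # rename the logical volume
        sed -i 's/vm-100-disk-1/vm-100-disk-2/' /etc/pve/qemu-server/100.conf     # point the config at the new name
        qm start 100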

  15. Help understand relationship between ceph pools

    I'm trying to understand how the Ceph pools are all working together. Before Proxmox 5.3: on this cluster, I have three nodes with four 2TB drives in each node (for roughly 22 TB total disk space after overhead). I had a single pool with 512 PGs in 3/2 configuration that was used to create a...
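
    On the capacity side, the arithmetic for a 3/2 replicated pool is simply raw space divided by the replica count, and every pool on the same OSDs shares that budget; a quick worked example with the numbers above:

        # 3 nodes x 4 drives x 2 TB = 24 TB raw (about 22 TiB as Ceph reports it).
        # With size=3 each object is stored three times, so usable space is roughly:
        echo "22 / 3" | bc -l    # ~7.3 TiB usable, shared by all pools on these OSDs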

  16. Ceph performance troubleshooting

    Hi all, I've been running Proxmox for a number of years and now have a 13 node cluster where last year I added Ceph to the mix (after a 5.2 upgrade) using the empty drive bays in some of the Proxmox nodes. Last Friday I upgraded all nodes to version 5.3. The Ceph system has always felt slow...

  17. Ceph OSD Performance Issue

    In the first cluster, there are three nodes with four OSDs each. All HDDs (2 TB SAS) and SSD units are identical. I have looked back over the configuration for the first node in the first cluster (OSDs 0-3) and compared it to the configuration on the other nodes and there doesn't appear to be...

  18. Ceph OSD Performance Issue

    I am now getting back to this issue. I haven't found anything that would explain why the OSDs in the first server (OSDs 0, 1, 2, 3) show the write time to be on average 9 seconds, while the other OSDs (4 through 11) all have write times on average about 1.5 seconds. I have a different cluster...

  19. Ceph OSD Performance Issue

    While investigating OSD performance issues on a new ceph cluster, I did the same analysis on my "good" cluster. I discovered something interesting and fixing it may be the solution to my new cluster issue. For the "good" cluster, I have three nearly identical servers. Each server has four...
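
    When one server's OSDs are uniformly slower than identical hardware elsewhere, comparing the Ceph-level numbers and then the devices underneath usually narrows it down; a hedged sketch (OSD IDs and device names are placeholders):

        ceph osd perf             # per-OSD commit/apply latency in ms
        ceph tell osd.0 bench     # built-in write benchmark on an OSD from the slow server
        ceph tell osd.4 bench     # same benchmark on an OSD from a fast server, for comparison

        # If the Ceph numbers differ, look at the hardware underneath:
        smartctl -a /dev/sdX      # drive health / media errors
        hdparm -W /dev/sdX        # write-cache setting differences between servers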

  20. Ceph OSD Journal on USB3.0 SSD?

    I built a ceph cluster earlier this year for one of my Proxmox clusters and it has been working just fine. I had enough drive slots in each storage node to include a dedicated SSD for the OSD journals and that cluster is working fine in terms of performance. On a second cluster, I only had...
