What kind of OSDs do you have in that cluster?
Deleting objects can eat quite a bit of performance. You could try to increase the sleep time between delete operations and observe whether the impact gets smaller.
ceph config set osd osd_delete_sleep 2
See...
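If you want to compare the effect, you can read back the currently active value first; the 2 seconds above are just a starting point to tune from.

ceph config get osd osd_delete_sleep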
There should not be any issue regarding data corruption. Depending on how much data needs to be moved, you can expect higher load on the cluster.
If possible, set up a test cluster that you can test such operations on before you do it on the production system. You could even do so virtualized...
Doesn't seem to be necessary, the current rule looks okay and you can keep it.
Yes. For those pools, Ceph will move the data so that it is only stored on OSDs of the device class hdd.
Right now, with only one OSD of type ssd, there isn't too much that needs to be done. You might see a slight...
If that output is current, then you seem to only have one SSD OSD. If you did not change the defaults, Ceph wants to create 3 replicas on different nodes. You will need to add more SSDs into your nodes and create OSDs on them to give Ceph the chance to create the replicas.
In order to separate...
What is the output of the following commands?
ceph osd crush rule ls
ceph osd df tree
The overall plan, if you only have the replicated_rule, is to create new rules, one for each device class that you want to use. Then assign all pools to one of the new rules, so that you don't have any pools...
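As a rough sketch, assuming you stay with the default root and host as failure domain (the rule names are just examples):

ceph osd crush rule create-replicated replicated_hdd default host hdd
ceph osd crush rule create-replicated replicated_ssd default host ssd
ceph osd pool set <pool> crush_rule replicated_hdd

Once every pool is assigned to one of the device-class specific rules, no pool places data on both classes anymore.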
That will be the reason. When the node is down, some PGs only have one replica left -> fewer than the min_size, and therefore IO is blocked. That will also cause problems if a node dies completely at some point. With 3/2 there would still be 2 replicas left and the cluster would keep running.
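If you are unsure what the pools are currently set to, you can check and adjust it like this (replace <pool> accordingly):

ceph osd pool get <pool> size
ceph osd pool get <pool> min_size
ceph osd pool set <pool> size 3
ceph osd pool set <pool> min_size 2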
If...
I just reproduced this in a small 3-node test cluster: shut down one node while a VM was producing IO. Everything worked, without the VM hanging.
It can take a few moments until Ceph shows the OSDs as down.
Without further information about the cluster, I can...
What was the status of Ceph before the node was shut down? If the pools stall as soon as the node is down, I can imagine that Ceph did not have full redundancy for other reasons at that moment. If shutting down the node then leaves fewer than "min_size" copies of at least...
Erasure coded pools can only be added via the CLI for now. See the man page for pveceph.
For example, pveceph pool create myecpool --erasure-coding k=3,m=2.
Should you already have created an EC profile manually, you can specify it as well and omit the k and m parameters.
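For example (my-ec-profile is just a placeholder; check the man page for the exact property name):

pveceph pool create myecpool --erasure-coding profile=my-ec-profile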
Make sure to read up...
Within a cluster or standalone Proxmox VE nodes?
In a cluster, you can migrate containers. They will be shut down and started on the target node.
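On the CLI that would look roughly like this, if I recall the options correctly (100 and node2 are placeholders):

pct migrate 100 node2 --restart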
I haven't been involved, but AFAIK there were attempts to live migrate containers in a cluster with criu, but it wasn't working as (reliably) as we...
There seems to be a bug in the Kernel when live migrating from newer to older CPUs. We are currently evaluating if we can back port it (see our bugtracker: https://bugzilla.proxmox.com/show_bug.cgi?id=4073#c27).
The CPUs in your cluster seem to span quite a few generations. As a workaround for...
The new PCI card most likely changes the ordering of other PCI devices. One of them is the NIC.
Run ip link to get a list of the network cards currently detected. The enp10s0f3 will not show up anymore. It will have a different name.
Change that in the /etc/network/interfaces file where the old...
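Purely as an illustration, assuming the NIC now shows up as enp11s0f3 (yours will most likely be named differently) and it is used as a bridge port, the relevant part of /etc/network/interfaces would end up looking like this, with the address being a placeholder:

iface enp11s0f3 inet manual

auto vmbr0
iface vmbr0 inet static
    address 192.0.2.10/24
    bridge-ports enp11s0f3
    bridge-stp off
    bridge-fd 0

Afterwards an ifreload -a or a reboot should bring the network back up.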
I am a bit confused ;)
Do the NIC names that you see with ip addr match the ones that are configured in /etc/network/interfaces? Same for the IP addresses that you have in /etc/hosts?
The NIC is down? So ssh and ping on that IP do not work either? If that is the case, then that would be the...
I missed that.
Check that the IP is the same in the /etc/network/interfaces file and in the /etc/hosts file in the line with the hostname. Then restart the node and check again.
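A quick way to compare them on the node itself:

grep -B1 -A3 address /etc/network/interfaces
cat /etc/hosts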
On a hunch, try to enter it completely, with https://w.x.y.z:8006, as it is possible that the browser tries to establish a plain http connection, which won't work.
So the cluster and ceph are working fine so far?
I hope the last octet in the IP addresses is different ;)
Are you talking about the 192.168.107.44 and 192.168.107.55 on node 2?
Do they show up if you reboot the node? What happens if you run ifreload -a?
Also, please post CLI output inside...
Cloud-init probably also swapped/regenerated the SSH host RSA key, which then led to a new server ID.
To be on the safe side, I would also run pvecm updatecerts on the node, so that any changed keys and fingerprints get back into the known_hosts/authorized_keys...
Then check why that is. The optional ceph cluster network (the public network is the mandatory "main" ceph network) is used for the traffic between the OSDs and can take away quite some load from the public network.
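In the ceph.conf that is simply the cluster_network option next to the public_network, for example (the subnets are placeholders):

[global]
    public_network = 10.10.10.0/24
    cluster_network = 10.10.20.0/24

As far as I know, the OSDs need to be restarted to pick up a changed cluster network.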
Is the network config correct on that node? Verify the ip a output with the one...