Tomas, could you please check/confirm that with one-way mirroring the following command:
rbd mirror pool status rbd --verbose
gives normal output when run on the backup node:
root@pve-backup:~# rbd mirror pool status rbd
health: OK
images: 18 total
    18 replaying
and a warning when run on the main cluster...
I had to lower replica/min from 3/2 to 2/1 to get some "extra space". Any ideas why the journaling data is not trimmed after the backup node has pulled it?
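In case it helps to narrow this down, this is how I have been poking at the journals myself (the image name is just a placeholder, and I am not certain these are the right commands to judge trimming):
rbd journal info --pool rbd --image <image>
rbd journal status --pool rbd --image <image>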
If I'm not mistaken, I set up one-way mirroring. How could I check that?
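My own guess (please correct me if this is wrong) is that the mirroring mode and configured peers can be checked on both clusters with:
rbd mirror pool info rbd
and that with a one-way setup the peer should only be registered on the backup cluster.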
Thanks
After upgrading to PVE 6 and Ceph 14.2.4, I enabled pool mirroring to an independent node (following the PVE wiki).
Since then, my pool usage has been growing constantly, even though no VM disk changes are made.
Could anybody help me figure out where my space is going?
Pool usage size is going to...
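For reference, this is how I have been trying to see what is actually eating the space (the journal_data prefix is only what I believe the journal objects are called, so treat it as an assumption):
rados df
rbd du --pool rbd
rados -p rbd ls | grep -c journal_data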
According to the Ceph docs (https://docs.ceph.com/docs/master/rados/configuration/network-config-ref/#id1), several public networks can be defined (useful in the case of rbd mirroring, when the slave Ceph cluster is located in a separate location and/or monitors need to be created on a different network...
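As far as I understand the docs, that would look something like this in ceph.conf (the subnets below are just examples, not my real networks):
[global]
    public network = 10.10.10.0/24, 192.168.100.0/24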
I did it. I even deleted the whole /var/lib/ceph folder and all ceph*-related services under /etc/systemd/.. and rebooted that node,
but pveceph purge still says:
root@pve-node4:~# pveceph purge
detected running ceph services- unable to purge data
What does pveceph purge actually check for as "running ceph...
Not sure if it's somehow related, but I don't have any OSDs in my cluster at the moment.
root@pve-node4:~# systemctl | grep ceph-
● ceph-mon@pve-node4.service loaded failed failed Ceph cluster...
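My guess (I have not checked the pveceph source, so this is an assumption) is that it simply looks at the ceph systemd units, so these are the commands I have been using to list the units it might be seeing and to clear the failed state of the old mon unit:
systemctl list-units --all 'ceph*'
systemctl reset-failed ceph-mon@pve-node4.service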
Yep, the systemd service was enabled, but disabling it changes nothing.
Ceph log on pve-node4 when the mon starts:
Oct 04 13:41:25 pve-node4 systemd[1]: Started Ceph cluster monitor daemon.
Oct 04 13:41:25 pve-node4 ceph-mon[436732]: 2019-10-04 13:41:25.495 7f5aed4ec440 -1 mon.pve-node4@-1(???) e14 not...
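If it is the monmap that is stale (which is only my guess from the truncated message above), I believe it can be inspected and cleaned up from a node with a working monitor, something like:
ceph mon dump
ceph mon remove pve-node4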
After an update from 5.x to 6.x, one of the Ceph monitors became a "ghost",
with status "stopped" and address "unknown".
It can be neither started, recreated nor destroyed; the errors are as below:
create: monitor address '10.10.10.104' already in use (500 )
destroy : no such monitor id 'pve-node4' (500)
I deleted...
In my environment, with libknet* 1.12-pve1 (from the no-subscription repo) the cluster has become much more stable (no "link down" or corosync segfaults so far, >48 hrs).
Here is an answer...
https://github.com/corosync/corosync/commit/0a323ff2ed0f2aff9cb691072906e69cb96ed662
The PVE wiki should be updated accordingly.
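For anyone comparing, this is how I checked which versions ended up on my nodes (nothing special, just dpkg):
dpkg -l | grep -E 'libknet|corosync'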
Damn corosync...
Could anyone explain why corosync (knet) chooses the best link as the one with the highest priority instead of the lowest one (as written in the PVE wiki)?
Very confused by corosync 3 indeed...
quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: amarao-cluster
  config_version: 20
  interface...
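For reference, this is how I understand link priorities are declared (only the interface subsections shown; the link numbers and values below are placeholders, not my real config). With link_mode: passive, knet apparently prefers the link with the HIGHEST knet_link_priority value:
totem {
  interface {
    linknumber: 0
    knet_link_priority: 20
  }
  interface {
    linknumber: 1
    knet_link_priority: 10
  }
}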
Another observation is that in my setups only nodes with no swap (ZFS as root and an NFS share as datastore) and vm.swappiness=0 in sysctl.conf are affected.
I do remember the unresolved issue with PVE 5.x where swap was being used by PVE processes even with vm.swappiness=0. Couldn't this be the case...
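In case anyone wants to compare, this is the quick check I use to see the current swappiness and which processes are actually sitting in swap (plain /proc scraping, nothing PVE-specific):
sysctl vm.swappiness
grep VmSwap /proc/*/status 2>/dev/null | sort -k2 -nr | head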