I recently built a dev cluster to test Ceph performance. Using a Windows Server 2019 guest with CrystalDiskMark, I am getting very slow speeds in both read and write testing.
Reads: 140 MB/s on Ceph vs 4000 MB/s testing on a disk attached to NFS storage.
Writes: 90 MB/s vs 1643 MB/s.
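As a sanity check, it can help to benchmark the RBD layer directly and take the Windows guest and virtio path out of the picture. A minimal sketch using Ceph's built-in RBD benchmark, assuming a pool named vm-pool and a throwaway test image (both names are placeholders):

# create a scratch image, benchmark writes then reads, then clean up
rbd create vm-pool/benchtest --size 10G
rbd bench --io-type write --io-size 4M --io-threads 16 --io-total 2G vm-pool/benchtest
rbd bench --io-type read --io-size 4M --io-threads 16 --io-total 2G vm-pool/benchtest
rbd rm vm-pool/benchtest

If these numbers are close to the CrystalDiskMark results, the bottleneck is likely in Ceph itself (network, replication, or disks); if they are much higher, the guest's disk bus and cache settings are worth a look instead.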
ceph.conf
[global]...
I think what may have happened is that after I removed the 2 "failed" nodes, they were still configured as HA targets. Right now Ceph in the cluster is totally broken and I've disabled it everywhere I can. I won't be able to test again until the weekend, in case it takes everything down.
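Once I can touch it again, the plan is to check for HA entries still pointing at the removed nodes. A rough sketch, where vm:107 is just an example service ID:

ha-manager status        # every HA-managed service and the node it currently sits on
ha-manager config        # resource definitions, including group assignments
ha-manager remove vm:107 # drop a service from HA management entirely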
Here is pveversion -v...
We had a node failure that took down the Ceph manager service. I know there should have been more than one running, and ceph -s said there were 2 on standby, but they never took over.
Ceph was completely pooched and we had to do restorations from backup; luckily we managed to recover some stuff from...
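For next time, what I plan to verify is that the standbys are actually registered and healthy, and to force a failover by hand if needed. A rough sketch (the mgr names here are placeholders for whatever hostnames run the daemons):

ceph mgr stat                           # quick view of the active mgr
systemctl status ceph-mgr@putsproxp02   # on each node, confirm its mgr daemon is running at all
ceph mgr fail putsproxp02               # tell the named active mgr to stand down so a standby takes over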
I have a 12-node cluster, 6 at each of two locations. The nodes at location one use .2.0/24, the others use .39.0/24.
Nodes can all ping one another, but when I try to create a Ceph monitor on any node at the second location (.39), the error states:
Multiple Ceph public networks detected on putsproxp07...
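If I understand that error right, pveceph sees more than one configured public network on the host and needs to be told which address the new monitor should bind to. Something like this, with subnets assumed from the .2/.39 description above and a placeholder monitor IP:

# in /etc/pve/ceph.conf, both subnets listed as public networks
[global]
    public_network = 192.168.2.0/24, 192.168.39.0/24

# then pin the monitor to one address explicitly when creating it
pveceph mon create --mon-address 192.168.39.x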
When I upgraded my test cluster from 6.x to 7.x there were no issues.
Today, when upgrading one of my production nodes, it appears that systemd used a new naming structure and all my interfaces changed as follows:
ens3f0 -> enp175s0f0
ens3f1 -> enp175s0f1
ens6f0 -> enp24s0f0
ens6f1 -> enp24s0f1...
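One way to make the names stable across upgrades is to pin each NIC by MAC address with a systemd .link file, so the bus-based renaming never applies. A sketch with a placeholder MAC, one file per interface, e.g. /etc/systemd/network/10-ens3f0.link:

[Match]
MACAddress=aa:bb:cc:dd:ee:f0   # replace with the NIC's real MAC from 'ip link'

[Link]
Name=ens3f0                    # keep the old, expected name

After creating the files, rebuilding the initramfs (update-initramfs -u) and rebooting should apply the pinned names. The simpler alternative is just updating /etc/network/interfaces to the new enp* names.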
My syslog on all nodes is basically page after page of:
Dec 12 13:03:27 putsproxp10 corosync[3147]: [KNET ] pmtud: Starting PMTUD for host: 7 link: 0
Dec 12 13:03:27 putsproxp10 corosync[3147]: [KNET ] udp: detected kernel MTU: 1500
Dec 12 13:03:27 putsproxp10 corosync[3147]: [KNET ]...
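For what it's worth, those kNet PMTUD lines are periodic path-MTU probes, not errors by themselves. If the sheer log volume is the problem, my understanding is that the totem section of corosync.conf accepts a knet_pmtud_interval option (seconds between probes) that can be raised. A sketch, assuming the usual /etc/pve/corosync.conf layout:

totem {
    ...
    knet_pmtud_interval: 300   # probe path MTU every 5 minutes instead of the default 30s
}

On a PVE cluster, remember to bump config_version in the totem section when editing that file, since it is synced cluster-wide.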
Is there a best practice for restarting the host that serves the virtual disks? The boot drives are all held on a local volume that is replicated to all the nodes, but the data/storage/database disks are housed on network-attached storage.
I'd like to avoid manually shutting down 100+ VMs running...
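If it comes to that, the shutdowns can at least be scripted instead of clicked through one by one. A rough sketch using only stock qm commands (the 120-second timeout is an arbitrary choice; qm list only shows local VMs, so this runs per node):

# on each node: gracefully shut down every VM it hosts
for vmid in $(qm list | awk 'NR>1 {print $1}'); do
    qm shutdown "$vmid" --timeout 120
done

The web UI's bulk stop action does roughly the same thing per node.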
Unchecking the KRBD flag in the RBD storage config seems to have fixed the "sysfs write failed" issue.
If someone can explain what the krbd flag does in the RBD config, that would be great.
Thank you
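For reference, my understanding of the flag: with krbd set, Proxmox maps each RBD image through the kernel's rbd module (giving a /dev/rbdX block device), and without it, QEMU talks to the cluster directly through userspace librbd; the "sysfs write failed" message comes from the kernel mapping path. The flag lives in the storage definition, roughly like this (storage and pool names are examples):

# /etc/pve/storage.cfg
rbd: ceph-vm
    pool vm-pool
    content images
    krbd 1   # 1 = kernel rbd mapping, 0/absent = userspace librbd via QEMU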
I have searched all the logs and I do not see any indicator of what could be at fault here. I just updated Ceph to 14.2.10.
Same error messages:
2020-08-05 07:44:40 starting migration of VM 107 to node 'putsproxp01' (192.168.2.95)
2020-08-05 07:44:41 starting VM 107 on remote node 'putsproxp01'...
At this point I have this issue with RBD: "can't map rbd volume" for a disk image when trying any of these functions (a manual mapping check is sketched after the list):
Migrate a VM from any node (1-6) to any node (1-6)
Back up any VM on any node (1-6)
Start a new VM on any node (1-6)
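As mentioned above, when every map attempt fails, trying the map by hand usually surfaces the real kernel error. A sketch, with a placeholder image name:

rbd info vm-pool/vm-107-disk-0   # check which image features are enabled
rbd map vm-pool/vm-107-disk-0    # attempt the mapping manually
dmesg | tail                     # the kernel logs the actual reason for a failed map
# older kernels can't map images with newer features enabled; disabling them is one known workaround:
rbd feature disable vm-pool/vm-107-disk-0 object-map fast-diff deep-flatten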
pvesm status
Name       Type  Status       Total        Used   Available       %
isoRepo    nfs   active  2108488704   336919552  1664457728  15.98%
local      dir   active    25413876    14911972     9187908  58.68%
local-lvm...