Oh and, since that might have come across the wrong way: I do not suggest that it has to do with upgrading from an older ceph version per se, but rather that the OSDs affected had been the longest-running and seen the most reorganizing, reshuffling, etc.
Interesting. Thank you for sharing this!
Since I reformatted after downgrading to octopus, I guess I will have to keep the information about the allocator handy for when / if I upgrade to pacific again.
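For future reference, the allocator can be inspected and switched via the config database. A minimal sketch, assuming the bitmap-allocator workaround that was commonly suggested for pacific (OSD id 0 is a placeholder):

```shell
# Check which allocator a given OSD is configured to use
ceph config get osd.0 bluestore_allocator

# Hypothetical workaround: switch OSDs to the bitmap allocator
ceph config set osd bluestore_allocator bitmap

# OSDs must be restarted for the change to take effect
systemctl restart ceph-osd@0
```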
My OSDs all were enterprise / DC class SSDs, albeit SATA ones (Samsung SM863, Samsung PM883...
Hi there,
I noticed that sometimes backups will just fail with either of two messages:
or
The problem is that this occurs randomly with some VM backups while others just run.
If I test the backup storage from a PVE node by executing this multiple times in a row:
It mostly works and show...
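The exact command is truncated above; one hypothetical way to poll a backup storage repeatedly from a PVE node ('pbs-store' is a placeholder storage ID, not from the original post):

```shell
# Query the status of one storage several times in a row,
# logging any attempt that fails
for i in $(seq 1 10); do
    pvesm status --storage pbs-store || echo "attempt $i failed"
    sleep 1
done
```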
After having restored everything from backup, I looked through the fsck attempts and saw loads of these:
I cannot recall having skipped any steps on this (or previous) upgrades of Ceph, but my guess is that this means the OSD was not working according to the latest and greatest bluestore...
24 1 TB SSDs, 8 per node; everything (DB/WAL included) was on the OSD itself.
No snapshots, usage at roughly 65% / 70%.
One problem seems to have been a backup going to a CephFS share that (for reasons I have yet to understand) was much bigger than on previous days and seems to have caused the out-of-space issue.
Log...
Hi all,
after an upgrade (on Friday night) to Proxmox 7.x and Ceph 16.2, everything seemed to work perfectly.
Sometime early morning today (Sunday), the cluster crashed.
17 out of 24 OSDs will no longer start
most of them will do a successful
ceph-bluestore-tool fsck
but some will have an...
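For anyone following along, a sketch of the offline fsck workflow on a single OSD (OSD id 0 and the default data path are placeholders):

```shell
# The OSD must be stopped before an offline fsck
systemctl stop ceph-osd@0

# Plain consistency check; add --deep for a much slower full data scan
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-0

# If fsck reports repairable errors, a repair pass can be attempted
ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-0
```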
During the night, I had all VMs of one node frozen with the qmp socket no longer responding.
The freeze happened during a PBS backup.
What is weird is this output from one of the backup jobs:
2021-03-09 00:35:29 INFO: Starting Backup of VM 156068 (qemu)
2021-03-09 00:35:29 INFO: status = running...
Hi there,
I just noticed that the shipped version of openvswitch (2.12.x on Proxmox VE 6.3) does not yet support specifying a primary member for an active-backup bond. Unfortunately, this means there is no deterministic way to control which member becomes active after a failover or recovery.
Newer versions (since 2.14.x, I believe) do have that...
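On those newer OVS versions, the primary member can be pinned via the port's `other_config`. A sketch, assuming a bond named bond0 with eno1 as the preferred NIC (both placeholders):

```shell
# Pin the primary member of an active-backup bond (OVS >= 2.14)
ovs-vsctl set port bond0 other_config:bond-primary=eno1

# Verify which member is currently active
ovs-appctl bond/show bond0
```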
Indeed. It turned out to be a firewall thing after all.
As for the socket: apparently, if you create a dual-stack socket on ::, it will still listen on all interfaces ipv4 and v6, but not show up separately. I'll have to keep that in mind - it is somewhat misleading when trying to diagnose...
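To illustrate the dual-stack behaviour described above, assuming PBS's default port 8007: a listener bound to `::` appears only once in `ss` output, yet still accepts IPv4 connections via IPv4-mapped addresses unless the sysctl below is set to 1.

```shell
# The dual-stack socket shows up only as [::]:8007, not 0.0.0.0:8007
ss -tlnp | grep 8007

# 0 (the Linux default) means :: sockets also accept IPv4;
# 1 restricts them to IPv6 only
sysctl net.ipv6.bindv6only
```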
I just added a node to my test cluster to try and work with PBS.
My cluster network is done using openvswitch, and I was disappointed that the PBS server software does not seem to work with this setup.
root@test07:~# proxmox-backup-manager network list...
This results in a new / different error. :-(
I tried the following:
- Setup a new storage for ceph rbd using the krbd flag named 'kdisks'
- removed /etc/pve/priv/ceph/kdisks.keyring
- Tried to add a new disk to a VM using the new storage
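As far as I understand, PVE looks up the keyring for an RBD storage at `/etc/pve/priv/ceph/<storage-id>.keyring`, so removing it would break the mapping. A hedged sketch of how the file for the 'kdisks' storage could be recreated (assuming the storage uses the client.admin identity):

```shell
# Export the client.admin keyring to where PVE expects it
# for the 'kdisks' RBD storage
ceph auth get client.admin > /etc/pve/priv/ceph/kdisks.keyring
```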
/etc/pve/ceph.conf auth stuff reads like this...
OK. Resetting ceph.conf to the default auth stuff:
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
solved the problem, but I would still be very glad if someone could point out why that is?
AFAIK ceph auth does incur a (small) performance...