Recent content by ikogan

  1. Some client(s) are unable to authenticate to Ceph but which ones?

    This past weekend I completed migrating my config management to Ansible and updated all PVE packages, rebooting each node as I did so. Afterwards, 4 of my 5 nodes started throwing errors that some Ceph client(s) using the client.admin identity are unable to authenticate...
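    In case it helps narrow this down, the approach I'm trying is to turn up auth logging on the monitors and watch where the failed attempts come from. A rough sketch (the debug levels and log path are assumptions on my part):

        # Temporarily raise auth logging on the monitors, then revert when done
        ceph config set mon debug_auth 5/5
        ceph config set mon debug_ms 1

        # Watch the monitor logs for cephx failures and note the source IPs
        tail -f /var/log/ceph/ceph-mon.*.log | grep -iE 'cephx|auth'

        # Revert once the offending client(s) are identified
        ceph config rm mon debug_auth
        ceph config rm mon debug_ms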
  2. Proxmox Datacenter Manager - First Alpha Release

    Is there any reason why this couldn't be placed in a docker image? Additionally, how do y'all feel about releasing an official image?
  3. Installing "ceph-exporter" Daemon

    No, I'm just taking the performance hit for now.
  4. Installing "ceph-exporter" Daemon

    According to the Ceph documentation, at least as of Reef, the mgrs no longer export perf counters by default (https://docs.ceph.com/en/reef/mgr/prometheus/#id1), which I thought wouldn't be a big deal for me. However, some of these counters include OSD storage information, in particular...
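    For anyone else hitting this, the linked Reef page also describes turning the old behaviour back on instead of running ceph-exporter; a minimal sketch, assuming I'm reading the option name there correctly:

        # Re-enable per-daemon perf counters in the mgr's prometheus module
        # (this carries the scaling caveat the Reef docs mention)
        ceph config set mgr mgr/prometheus/exclude_perf_counters false

        # Confirm the counters are being exported again (9283 is the module's default port)
        curl -s http://<active-mgr>:9283/metrics | grep -m5 ceph_osd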
  5. Corosync KNET Flapping

    So the node got fenced again yesterday and now I'm not seeing errors anymore. There doesn't seem to be anything more in `daemon.log`. It includes the usual flapping followed by what looks like startup logs. Monitoring corosync stats might be a good idea. I'll try and get that worked in there to...
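    What I have in mind for the monitoring is just a periodic dump of the stats map, something along these lines (output path and interval are arbitrary):

        # Snapshot corosync's stats map and link status every minute so there's
        # something to compare against the next time a node gets fenced
        mkdir -p /var/log/corosync-stats
        while true; do
            ts=$(date -Is)
            corosync-cmapctl -m stats > "/var/log/corosync-stats/${ts}.txt"
            corosync-cfgtool -s >> "/var/log/corosync-stats/${ts}.txt"
            sleep 60
        done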
  6. Corosync KNET Flapping

    I've attached the cmap stats for the other nodes here. Node 1 is the node that fenced Node 5. The metrics up there are from Node 5.
  7. Corosync KNET Flapping

    Lucky for me, I have that info. From one of the other nodes:
    2023-03-31T12:50:05-04:00 service 'vm:112': state changed from 'fence' to 'recovery'
    2023-03-31T12:50:05-04:00 service 'vm:109': state changed from 'fence' to 'recovery'
    2023-03-31T12:50:05-04:00 node 'zorya': state changed...
  8. Corosync KNET Flapping

    There are 0 errors on all nodes... There are retries across all nodes, but no node is an outlier. In the logs, it always seems to be link 1 that's failing, never link 0. Looking at the logs some more, every node claims that "host 5 joined", while host 5 claims every other node joined. Since this node...
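    For reference, this is roughly how I'm pulling the per-link numbers (stats key names may vary a bit between corosync versions):

        # Link state as corosync sees it, per host and per link (link 0 vs link 1)
        corosync-cfgtool -s

        # Retransmit/error/down counters for each link from the stats map
        corosync-cmapctl -m stats | grep -E 'link[01]\.' | grep -iE 'retr|error|down|connected'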
  9. Corosync KNET Flapping

    So this cluster has:
    1. Intel Core i7-9700k w/Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
    2. Intel Xeon E3-1240L v5 w/Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
    3. Intel Xeon D-1521 w/Intel Corporation 82599ES 10-Gigabit SFI/SFP+...
  10. Corosync KNET Flapping

    The interfaces I have for my corosync rings are not bonded. This just started happening again today and I'm not sure what changed. I don't see any errors so far on the links, but I'm seeing constant flapping. If this is related to MAC address aging, why would it have such a weird pattern...
  11. ceph-mgr crash when enabling perf stats

    I'm having some performance issues on CephFS that I'm trying to track down. I tried enabling stats following the information on https://docs.ceph.com/en/quincy/cephfs/cephfs-top/. The second I do `ceph mgr module enable stats`, all of my MDSes start to puke the following:
    2023-02-19T21:19:31.164-0500...
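    For context, these are the steps from that page I was following when it blew up (client.fstop and its caps are what the cephfs-top docs use):

        # Enable the mgr stats module that cephfs-top depends on
        ceph mgr module enable stats

        # Create the client cephfs-top expects by default, then run it
        ceph auth get-or-create client.fstop mon 'allow r' mds 'allow r' osd 'allow r' mgr 'allow r'
        cephfs-top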
  12. Corosync KNET Flapping

    Sorry, I've been on a trip for the past week or so. Anyway, thanks for the tip, I've disabled RSTP and LLDP on ports connected to the cluster nodes. We'll see if that helps.
  13. Corosync KNET Flapping

    Thanks for all of your help! I do have VMs running on vmbr1; those consume Ceph client traffic. Not ideal, but as the graph shows, they're not saturated in this situation and the issue is happening on vmbr2.13. vmbr2 is shared between the Ceph backside network and the secondary ring. That...
  14. Corosync KNET Flapping

    The 10 GbE interfaces are not bonded; only the 1 GbE interfaces are bonded, and those are used for "public" VM traffic, not for Proxmox clustering or Ceph. They're using LACP and the switch reports that it's fine. Here's an example of one of the host's `/etc/network/interfaces`: ❯...
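    Roughly the shape of that file, with made-up interface names and addresses rather than my actual config:

        # /etc/network/interfaces (illustrative only)
        # 1 GbE pair: LACP bond, public VM traffic only
        auto bond0
        iface bond0 inet manual
            bond-slaves eno1 eno2
            bond-mode 802.3ad
            bond-miimon 100

        auto vmbr0
        iface vmbr0 inet static
            address 192.0.2.10/24
            gateway 192.0.2.1
            bridge-ports bond0
            bridge-stp off
            bridge-fd 0

        # 10 GbE, unbonded: Ceph client traffic + corosync ring 0
        auto vmbr1
        iface vmbr1 inet static
            address 198.51.100.10/24
            bridge-ports enp1s0f0
            bridge-stp off
            bridge-fd 0

        # 10 GbE, unbonded: Ceph backside network + secondary ring on VLAN 13
        auto vmbr2
        iface vmbr2 inet manual
            bridge-ports enp1s0f1
            bridge-stp off
            bridge-fd 0
            bridge-vlan-aware yes
            bridge-vids 2-4094

        auto vmbr2.13
        iface vmbr2.13 inet static
            address 203.0.113.10/24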
  15. pvestatd[15200]: status update time

    Not to resurrect an old thread, but I'm seeing these errors as well as InfluxDB read timeouts, even though `time pvesm status` always returns in less than a second:
    ❯ time pvesm status
    Name        Type     Status     Total     Used     Available     %
    cluster...
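    In case anyone lands here with the same symptom, this is how I'd check whether one specific storage is the slow one (`--storage` is a standard pvesm option):

        # Time each configured storage individually to see if any single one
        # is what pushes pvestatd past its update window
        for s in $(pvesm status | awk 'NR>1 {print $1}'); do
            printf '== %s\n' "$s"
            time pvesm status --storage "$s" > /dev/null
        done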