[SOLVED] ceph bonded network one port failed cannot access vms via vnc

inxamc

Dec 2, 2020
Hello Forum,

We are running a hyperconverged 3-node Proxmox Ceph cluster (PVE 7.4.16, ceph 16.3).
We use a meshed 10 GBit bonded network for Ceph traffic with 3 x dual-port Intel NICs.
It seems that one port on node 3 has failed: node 1 cannot ping node 3 and vice versa, but node 2 can ping node 3.
We cannot log into any VM.
Current state: after a reboot of node 3, no VM can start.
Ceph on node 3 still has the noout and noscrub flags set, and Ceph is peering.
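The ping pattern described above (node 1 and node 3 cannot reach each other, node 2 can reach node 3) could be re-checked from each node with a small loop like the sketch below. The function name `probe_peers`, the `PROBE` override, and the peer addresses in the usage comment are placeholders for illustration, not values from this cluster.

```shell
#!/usr/bin/env bash
# Minimal sketch: probe each mesh peer once and report whether it answers.
# PROBE is a hypothetical override hook; by default one ICMP echo with a
# 1 second timeout is sent per peer.
probe_peers() {
    local probe="${PROBE:-ping -c 1 -W 1}"
    local peer
    for peer in "$@"; do
        # $probe is intentionally unquoted so its options are split into words.
        if $probe "$peer" >/dev/null 2>&1; then
            echo "$peer reachable"
        else
            echo "$peer UNREACHABLE"
        fi
    done
}

# Usage (placeholder addresses on the ceph mesh network):
#   probe_peers 10.15.15.51 10.15.15.52
```

In a full mesh each pair of nodes is connected directly, so a failure between node 1 and node 3 alone points at that one link; the port on either end or the cable could be at fault.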
root@amcvh13:~# ceph -s
  cluster:
    id:     ae713943-83f3-48b4-a0c2-124c092c250b
    health: HEALTH_WARN
            noout,noscrub,nodeep-scrub flag(s) set
            Reduced data availability: 291 pgs inactive, 291 pgs peering
            1388 slow ops, oldest one blocked for 3578 sec, daemons [osd.1,osd.10,osd.12,osd.17,osd.2,osd.3,osd.45,osd.47,osd.5,osd.6]... have slow ops.

  services:
    mon: 3 daemons, quorum amcvh11,amcvh12,amcvh13 (age 59m)
    mgr: amcvh12(active, since 65m), standbys: amcvh11, amcvh13
    mds: 1/1 daemons up, 2 standby
    osd: 45 osds: 45 up (since 59m), 45 in (since 12M)
         flags noout,noscrub,nodeep-scrub

  data:
    volumes: 1/1 healthy
    pools:   6 pools, 433 pgs
    objects: 600.49k objects, 1.1 TiB
    usage:   3.4 TiB used, 8.8 TiB / 12 TiB avail
    pgs:     67.206% pgs not active
             291 peering
             142 active+clean
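
As a quick cross-check of the output above, the "pgs not active" percentage is simply the peering PG count divided by the total PG count (291 of 433). A minimal sketch, where the helper name `pg_inactive_pct` is made up for illustration:

```shell
#!/usr/bin/env bash
# Percentage of PGs that are not active, computed from the counts shown by `ceph -s`.
# Arguments: <inactive_pgs> <total_pgs>
pg_inactive_pct() {
    local inactive="$1" total="$2"
    awk -v i="$inactive" -v t="$total" 'BEGIN { printf "%.3f\n", 100 * i / t }'
}

pg_inactive_pct 291 433   # prints 67.206
```

So roughly two thirds of all PGs are stuck in peering, while the remaining 142 are active+clean.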

Our biggest problem is that we had to stop our mail server, which had stopped responding, before we became aware of the underlying Ceph network problem, so mail on node 2 is offline now.
How can I proceed without causing any damage?
Is it appropriate to shut down node 3 and continue with a degraded cluster (2 of 3 nodes) just in order to start and run the mail server until we replace the NIC in node 3?
P.S. We have a subscription, but our login to the customer portal failed, so I wasn't able to open a ticket.

Any help is very much appreciated...
 
