Search results

  1. Single ring failure causes cluster reboot? (AKA: We hates the fencing my precious.. we hates it..)

    RE: pmx2 - Good catch - no, that wasn't intentional; fixing it already. From a network standpoint, 198.18.50-53.xxx can all ping each other, so the network pieces, yes, were all operational. Based on the config, however, it looks like pmx2 wasn't on ring2 correctly. That in and of itself shouldn't...
  2. Single ring failure causes cluster reboot? (AKA: We hates the fencing my precious.. we hates it..)

    Here's what the same event looked like from pmx4 (node 3) Oct 03 23:17:58 pmx4 corosync[6951]: [TOTEM ] Token has not been received in 4687 ms Oct 03 23:17:58 pmx4 corosync[6951]: [KNET ] link: host: 6 link: 0 is down Oct 03 23:17:58 pmx4 corosync[6951]: [KNET ] link: host: 6 link: 1 is...
  3. Single ring failure causes cluster reboot? (AKA: We hates the fencing my precious.. we hates it..)

    For reference, from a topology standpoint, pmx1/2/3/4/5 (nodes 6,5,4,3,2) sit in the same rack, whereas pmx6/7 (nodes 1,7) sit in another room, connected to different switches with shared infra between. root@pmx1:~# pveversion pve-manager/7.2-11/b76d3178 (running kernel: 5.15.39-3-pve)
  4. Single ring failure causes cluster reboot? (AKA: We hates the fencing my precious.. we hates it..)

    And the conf file. logging { debug: off to_syslog: yes } nodelist { node { name: pmx1 nodeid: 6 quorum_votes: 1 ring0_addr: 10.4.5.101 ring1_addr: 198.18.50.101 ring2_addr: 198.18.51.101 ring3_addr: 198.18.53.101 } node { name: pmx2 nodeid: 5...
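
    A quick way to confirm that each ring in a layout like the one above is actually passing traffic is corosync's own status tooling; a minimal check, run on any node (output details vary with the corosync version):

      # show the local node id and the state of every knet link/ring
      corosync-cfgtool -s
      # show membership and quorum as corosync currently sees it
      corosync-quorumtool -s
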
  5. Single ring failure causes cluster reboot? (AKA: We hates the fencing my precious.. we hates it..)

    Here are logs from a (7) node cluster; this is from node 1 - I notice that there's nothing in the logs that explicitly says "hey, we've failed, I'm rebooting", so I hope this makes sense to you @fabian . I read this as "lost 0, lost 1, 2 is fine, we shuffle a bit to make 2 happy, then pull the...
  6. Single ring failure causes cluster reboot? (AKA: We hates the fencing my precious.. we hates it..)

    Will dig them out tonight, thanks. From an operational standpoint, is there any way to tweak the behavior of fencing? For example, this cluster has CEPH, and as long as CEPH is happy, I'm fine with all the VMs being shut down, but by all means, don't ()*@#$ reboot!!! It's easily 20 minutes...
  7. Single ring failure causes cluster reboot? (AKA: We hates the fencing my precious.. we hates it..)

    Someone please explain to me why the loss of a single ring should force the entire cluster (9 hosts) to reboot? Topology - isn't 4 rings enough?? ring0_addr: 10.4.5.0/24 -- eth0/bond0 - switch1 (1ge) ring1_addr: 198.18.50.0/24 -- eth1/bond1 - switch2 (1ge) ring2_addr...
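
    One knob that often comes up in these fencing threads is corosync's per-link priority for knet; a hedged sketch of what that looks like in the totem section is below (the link numbers and priorities are illustrative assumptions, not the poster's actual config):

      totem {
        # with the default knet "passive" link mode, the highest-priority live link carries the traffic
        interface {
          linknumber: 0
          knet_link_priority: 10
        }
        interface {
          linknumber: 1
          knet_link_priority: 5
        }
      }
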
  8. [SOLVED] Making zfs root mirror bootable (uefi)

    This solved my problem as well - thank you. Somewhere there should be a short "Admin HowTo" list, because this would be part of a document labeled: "How to replace a ZFS boot disk in a mirror set for Proxmox" (To be fair - this : https://pve.proxmox.com/pve-docs/chapter-sysadmin.html - is a...
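
    The sysadmin chapter linked above does describe the replacement flow; for a failed member of a UEFI ZFS boot mirror it boils down to roughly the following sketch (device names are placeholders - verify partition numbers against your own layout before running anything):

      # copy the partition table from the healthy mirror member to the new disk
      sgdisk /dev/disk/by-id/HEALTHY -R /dev/disk/by-id/NEW
      sgdisk -G /dev/disk/by-id/NEW              # randomize GUIDs on the copy
      # resilver the ZFS data partition
      zpool replace -f rpool /dev/disk/by-id/OLD-part3 /dev/disk/by-id/NEW-part3
      # make the new ESP bootable
      proxmox-boot-tool format /dev/disk/by-id/NEW-part2
      proxmox-boot-tool init /dev/disk/by-id/NEW-part2
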
  9. Zabbix template

    How are people monitoring CEPH on their proxmox clusters? The default "Ceph by Zabbix Agent 2" quickly goes "unsupported" when added to a PMX host.
  10. [SOLVED] Ceph - Schedule deep scrubs to prevent service degradation

    Fantastic work, thanks for sharing... favorited this one. (honestly, this should be a set of options in the PMX Ceph admin page.)
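
    For context, this is distinct from (but related to) the stock Ceph options that govern when scrubs may run; a hedged example of confining them to a quiet window, with illustrative values not taken from the linked thread:

      # only start (deep) scrubs between 01:00 and 06:00
      ceph config set osd osd_scrub_begin_hour 1
      ceph config set osd osd_scrub_end_hour 6
      # keep concurrent scrubs per OSD to a minimum
      ceph config set osd osd_max_scrubs 1
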
  11. [SOLVED] Some LXC CT not starting after 7.0 update

    Happy to help! Glad it worked for you too! :)
  12. PMX7.0 - HA - preventing entire cluster reboot

    Having read all the other threads (including : https://forum.proxmox.com/threads/pve-5-4-11-corosync-3-x-major-issues.56124/page-11#post-269235) , wanted to add - I'm running (4) different "rings" for corosync, spread across (4) different physical interfaces, and (2) different switches...
  13. PMX7.0 - HA - preventing entire cluster reboot

    pve-manager/7.0-11/63d82f4e (running kernel: 5.11.22-5-pve) - (5) node cluster, full HA setup, CEPH filesystem How do I prevent HA from rebooting the entire cluster? 20:05:39 up 22 min, 2 users, load average: 6.58, 6.91, 5.18 20:05:39 up 22 min, 1 user, load average: 4.34, 6.79, 6.23...
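
    For anyone landing here with the same question: the watchdog-based self-fencing is only armed on nodes that are actively running HA-managed resources, so one blunt way to avoid whole-cluster reboots while debugging is to take guests out of HA management entirely (the resource id below is a placeholder):

      # list resources currently under HA management
      ha-manager status
      # remove a guest from HA so the local resource manager can go idle and disarm the watchdog
      ha-manager remove vm:100
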
  14. CEPH multiple MDS on the same node

    Fair question, I'd like to see an answer as well - given pretty solid test data showing significant advantages to multiple MDSs (like this: https://croit.io/blog/ceph-performance-test-and-optimization) I'd love to see support for more than one MDS per server.
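
    Worth noting that the croit numbers are mostly about multiple *active* MDS ranks; once extra daemons exist somewhere in the cluster, enabling a second active rank is a one-liner (the filesystem name "cephfs" is an assumption):

      # allow two active MDS ranks for the filesystem named "cephfs"
      ceph fs set cephfs max_mds 2
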
  15. Best way to access CephFS from within VM (high perf)

    I am seeing an issue, however, with CephFS performance in VMs when one of the "mounted" IPs is down, for example: 198.18.53.101,198.18.53.102,198.18.53.103,198.18.53.104,198.18.53.105:/ /mnt/pve/cephfs when .103 was offline for a while today (crashed) VMs using things mounted in that path...
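
    For reference, the kernel CephFS mount shown above hands the client a list of monitor addresses to fail over between; a minimal /etc/fstab line of the same shape looks like this (addresses, client name and secret path are placeholders):

      198.18.53.101,198.18.53.102,198.18.53.103:/  /mnt/cephfs  ceph  name=guest,secretfile=/etc/ceph/guest.secret,_netdev  0  0
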
  16. [SOLVED] pveceph osd destroy is not cleaning device properly

    Just wanted to be a big +1 for this command - I've been doing it the hard way any time a drive fails, and was pleasantly surprised to find it cleaned up all the pv/vg's correctly, even in the case of DB on NVME. 10/10 will use again. :)
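
    For anyone searching for the exact invocation being praised here, the cleanup behaviour is a flag on the same command; a sketch, assuming the OSD (id is a placeholder) has already been stopped and marked out:

      # destroy the OSD and wipe the backing LVM volumes/partitions, including a separate DB device
      pveceph osd destroy 12 --cleanup
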
  17. Ceph 16.2.6 - CEPHFS failed after upgrade from 16.2.5

    TL;DR - Upgrade from 16.2.5 to 16.2.6 - CEPHFS fails to start after upgrade, all MDS in "standby" - requires ceph fs compat <fs name> add_incompat 7 "mds uses inline data" to work again. Longer version : pve-manager/7.0-11/63d82f4e (running kernel: 5.11.22-5-pve) apt dist-upgraded, CEPH...
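
    Spelled out with a concrete filesystem name (assumed to be "cephfs" here), the fix from the TL;DR is:

      # allow the standby MDS daemons to join the existing filesystem again after the 16.2.6 upgrade
      ceph fs compat cephfs add_incompat 7 "mds uses inline data"
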
  18. [SOLVED] Some LXC CT not starting after 7.0 update

    Ran into this exact issue this week, upgrading some older ubuntu-14-LTS containers, didn't realize rolling to 16 would kill them :( What I did to fix it: lxc mount $CTID chroot /var/lib/lxc/$CTID/rootfs apt update apt dist-upgrade do-release-upgrade ((( none found - had to do it by hand ))...
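
    Laid out step by step, and using the PVE pct wrapper with a placeholder container id rather than the raw lxc call from the quote, the fix above amounts to something like:

      CTID=101                                   # placeholder container id
      pct mount $CTID                            # exposes the CT rootfs on the host
      chroot /var/lib/lxc/$CTID/rootfs /bin/bash
      apt update && apt dist-upgrade
      do-release-upgrade                         # if no release is found, adjust sources.list by hand
      exit                                       # leave the chroot
      pct unmount $CTID
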
  19. New Mobile App for Proxmox VE!

    Sweet - works great on a 7.x cluster.
  20. [SOLVED] proxmox 7 / linux 5.11.22 issue with LSI 2008 controllers?

    As a datapoint: I just completed a new 5-node build, with several sets of 92xx cards * AOC-USAS-L8i (Broadcom 1068E) * LSI 9207-8i (IBM M5110) * AOC-S2308L-L8E (LSI 9207-8i) * two other random LSI 92xx cards I tried the cards as they were, and cross-flashed them to v20 (as appropriate), and...