Search results

  1. download.proxmox.com unreachable

    Hi, we've noticed increasingly frequent occasions where we are unable to retrieve updates from download.proxmox.com from South Africa. We get directed to af.cdn.proxmox.com and are unable to establish a connection to the resulting IP on TCP port 80 (HTTP). [admin@backup1 ~]# host download.proxmox.com...
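
    A quick way to confirm the symptom described above is to resolve the names involved and then test TCP port 80 directly. This is only an illustrative sketch using standard tools (host, nc, curl), not the exact commands from the post:

      # Resolve the download host and the regional CDN edge it redirects to
      host download.proxmox.com
      host af.cdn.proxmox.com

      # Test raw TCP connectivity to port 80 with a 5 second timeout
      nc -vz -w 5 af.cdn.proxmox.com 80

      # Or attempt an actual HTTP request with a short connect timeout
      curl -v --connect-timeout 5 -o /dev/null http://download.proxmox.com/
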
  2. OVS Bridge Between VMs (QinQ?)

    We've been running QinQ VMs for almost 2 years, i.e. virtual routers or firewalls carrying multiple VLANs over a single virtual Ethernet uplink assigned to the VM. PVE 6 only requires a very simple change to the network initialization script, details here...
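
    The network change referenced above is not shown in the snippet. As a rough illustration only (the port name tap100i0 and the outer tag 100 are made up, and this is not necessarily the change the post refers to), Open vSwitch can push an outer service tag onto a VM-facing port while preserving the VM's inner VLAN tags by putting the port into dot1q-tunnel mode:

      # Hypothetical example: wrap all frames from the VM-facing port tap100i0
      # in outer VLAN 100, keeping the VM's own inner VLAN tags intact (QinQ)
      ovs-vsctl set port tap100i0 vlan_mode=dot1q-tunnel tag=100
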
  3. [SOLVED] PVE 5.4-11 + Corosync 3.x: major issues

    Hi, we run Corosync on the VLAN Ceph replicates on, over a redundant LACP channel, instead of on a dedicated NIC. False-positive fencing events are extremely disruptive, so we continue to run with those settings in place...
  4. [SOLVED] PVE 5.4-11 + Corosync 3.x: major issues

    I'm also able to report 7 healthy clusters with zero false-positive fencing events over the last week. We always configure Corosync to run on LACP OvS bonds, so the changes @spirit recommended are perfect for our use case (detailed here). The cluster where nodes would get fenced regularly (the...
  5. [SOLVED] PVE 5.4-11 + Corosync 3.x: major issues

    Ultimately, in our environments, it reduces false-positive events. We typically have VMs bridged on a dedicated LAG, with Ceph and Corosync on another LAG. Failover for real failure events takes around 2 minutes, but unnecessarily fencing nodes is massively disruptive...
  6. [SOLVED] PVE 5.4-11 + Corosync 3.x: major issues

    @spirit Good thinking. I scanned through the documentation on Corosync 3 and my understanding is that the token timeout is automatically adjusted by the token coefficient when there are 3 or more nodes, so I made the following changes. On all three nodes initially: systemctl stop pve-ha-lrm...
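
    For context, the truncated steps above follow the usual pattern: stop the HA services so a Corosync restart cannot trigger fencing, raise the totem token timeout in /etc/pve/corosync.conf (incrementing config_version in the same edit), then restart Corosync and bring HA back. The sketch below is an approximation reconstructed from the snippet, not the post's full command list; note that with 3 or more nodes Corosync adds token_coefficient (650 ms per node beyond two, by default) on top of the configured token value.

      # On every node first, so restarting corosync cannot trigger self-fencing
      systemctl stop pve-ha-lrm
      systemctl stop pve-ha-crm

      # Edit the cluster-wide config: increment config_version and raise the
      # token timeout inside the totem { } section, e.g. token: 10000
      pico /etc/pve/corosync.conf

      # Restart corosync on each node, then re-enable HA
      systemctl restart corosync
      systemctl start pve-ha-crm
      systemctl start pve-ha-lrm
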
  7. [SOLVED] pveproxy fails to load local certificate chain after upgrade to pve 6

    Was upgrading a standalone PVE 5 to 6 today and ran into this... To fix: rm -f /etc/pve/pve-root-ca.pem /etc/pve/priv/pve-root-ca.* /etc/pve/local/pve-ssl.*; pvecm updatecerts -f;
  8. [SOLVED] PVE 5.4-11 + Corosync 3.x: major issues

    We have two clusters in which we host virtual routers and firewalls. Heavy network traffic causes jitter and sometimes even packet loss with the default LACP OvS configuration, so we run a sort of hybrid. The root cause is that Intel X520 network cards support receive side steering, where they...
  9. [SOLVED] PVE 5.4-11 + Corosync 3.x: major issues

    We increased the totem timeout to 10000 and yes, we're running libknet 1.11-pve2. We haven't had a false-positive fencing scenario since the 13th, but the logs indicate continuing problems, so I assume it's a matter of time... Symptoms appear very similar, in that relatively minor network...
  10. [SOLVED] PVE 5.4-11 + Corosync 3.x: major issues

    Those 3 nodes are in a client's cluster using older equipment. They were previously 3 x VMware hosts using an HP SAN device that the IT guy's predecessor had set up as RAID-0... Regardless, their Proxmox + Ceph cluster has been super stable for the last 2 years but plagued by frequent...
  11. [SOLVED] PVE 5.4-11 + Corosync 3.x: major issues

    I upgraded libknet to 1.11-pve2 and restarted corosync with debugging enabled. Was 'lucky' to have fairly recently started concurrent pings between all three nodes when a corosync host down event occurred. I hope there's something useful in these logs. kvm1 = 1.1.7.9 kvm2 = 1.1.7.10 kvm3 =...
  12. [SOLVED] PVE 5.4-11 + Corosync 3.x: major issues

    The following could perhaps be a separate but related bug. These logs are from a remaining node when one of the nodes dropped off. Cluster membership updates successfully to 5/6, but notification messages then start escalating, with processing pauses being reported. It eventually settles, but I'm sure...
  13. [SOLVED] Ceph 14.2.3 / 14.2.4

    Apologies, I fat-fingered something. For others, switch /etc/apt/sources.list.d/ceph.list to the test repository and upgrade:

      pico /etc/apt/sources.list.d/ceph.list
      #deb http://download.proxmox.com/debian/ceph-nautilus buster main
      deb http://download.proxmox.com/debian/ceph-nautilus buster test

      apt-get update; apt-get dist-upgrade; systemctl restart ceph.target...
  14. [SOLVED] Ceph 14.2.3 / 14.2.4

    Great news. I just checked, though, and it unfortunately doesn't appear to have replicated to the mirrors yet... For those affected by this: one can generally mitigate by restarting the downed Ceph OSD processes. Systemd would have tried this numerous times and failed though, so the...
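
    The snippet above is cut off before the actual commands. As an illustrative sketch only (the OSD id 12 below is made up), restarting an OSD that systemd has already given up on usually means clearing the failed state first:

      # See which OSDs are currently down
      ceph osd tree

      # systemd stops retrying once the start limit is hit, so clear the failed
      # state before starting the OSD again
      systemctl reset-failed ceph-osd@12.service
      systemctl start ceph-osd@12.service
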
  15. [SOLVED] PVE 5.4-11 + Corosync 3.x: major issues

    Just a heads-up that the second filter also matches only protocol 1 (ICMP)...
  16. [SOLVED] Intel S5520HC Xeon reboots

    We repurposed some old hardware to set up a proper sandbox environment. The cluster has 4 Dell R620 servers and a relatively old Intel S5520HC system with Intel E5620 (Westmere) CPUs. This last node isn't going to be used for VMs, primarily serving as a dedicated Ceph storage node. System...
  17. [SOLVED] Ceph 14.2.3 / 14.2.4

    We are being affected by OSDs failing when another node is restarted. The issue is detailed in Ceph bug tracker entry 39693 (https://tracker.ceph.com/issues/39693). The issue has apparently been addressed and included in Ceph 14.2.3; will there be binaries in the testing repository soon...
  18. [SOLVED] PVE 5.4-11 + Corosync 3.x: major issues

    Nothing at all, only the usual boot-time initialisation messages. The QLogic firmware might be newer/older than others':

      [root@kvm1 ~]# ethtool -i eth0
      driver: bnx2
      version: 2.2.6
      firmware-version: bc 5.2.3 NCSI 2.0.6
      expansion-rom-version:
      bus-info: 0000:03:00.0
      supports-statistics: yes...
  19. [SOLVED] PVE 5.4-11 + Corosync 3.x: major issues

    I don't believe so; Corosync hasn't crashed on any of our nodes, and switching to udpu made no difference, so we're back on knet. Our small HP system cluster, which has bnx2 NICs, is the only one still experiencing regular problems...