Search results

  1. Cluster reset when one node can only be reached over corosync ring0 -- configuration problem?

    Since we had this reset twice, I analyzed the log from one node in more detail. I see: 2024-03-06T10:00:12.396742+01:00 pve52 corosync[1172]: [KNET ] link: host: 9 link: 0 is down 2024-03-06T10:00:12.396906+01:00 pve52 corosync[1172]: [KNET ] host: host: 9 (passive) best link: 0 (pri...
  2. Cluster reset when one node can only be reached over corosync ring0 -- configuration problem?

    Thank you for your reply. I attached the corosync.conf. This is the same on all nodes: it is located in /etc/pve/corosync.conf and has the same content as /etc/corosync/corosync.conf, as expected. That only link0 is up on pve58 was expected. This is because it got a new network card on "link1" and...
  3. Cluster reset when one node can only be reached over corosync ring0 -- configuration problem?

    Thank you for your time, Fabian. Here are the logs; the node name is in the filename. Node pve58 was the one switched off and rebooted without ring0 and the Ceph network. I changed some of the storage names.
  4. Cluster reset when one node can only be reached over corosync ring0 -- configuration problem?

    Dear Fabian, thank you for your reply. All of the hosts have these log entries in syslog; on the line [QUORUM], member 9 is missing. 2024-03-06T10:15:35.484881+01:00 pve40 corosync[1561]: [QUORUM] Sync members[6]: 1 2 3 6 7 8 2024-03-06T10:15:35.485344+01:00 pve40 corosync[1561]: [TOTEM ]...
  5. Cluster reset when one node can only be reached over corosync ring0 -- configuration problem?

    Hey all, I got a complete cluster reset (watchdog-based reset of all nodes) in the following scenario. I have a cluster of 7 hosts. corosync has 2 rings: the ring0 network 192.168.xx.n/24 uses a dedicated copper switch, the ring1 network 192.168.yy.n/24 uses a VLAN on a 10G fibre link. Here is a part of... (a minimal corosync.conf sketch of this two-link layout follows after these results)
  6. Is it possible to add a version 7.2 node to a version 6 cluster before upgrade

    Hi all. We are in the process of upgrading and extending our 4-node cluster. When we set up a new node, is it possible to add this node to the existing version 6.4 cluster? We have Ceph version 15.2 running on the existing cluster. So first installing Ceph version 15 on the version 7 node should...
  7. reboot of all cluster nodes when corosync is restarted on a specific member

    Hi Fabian, thanks for your reply. I viewed the bug report and this could well be right. I will test the packages when they "arrive" and come back. Best regards, Lukas
  8. reboot of all cluster nodes when corosync is restarted on a specific member

    Hey all, I observed a strange reboot of all my cluster nodes as soon as corosync is restarted on one specific host, or that host is rebooted. I have 7 hosts in one cluster. Corosync has 2 links configured: ring0 is on a separate network on a separate switch, ring1 is shared as a VLAN over 10G fibre... (see the link-status check sketched after these results)
  9. Proxmox_special_agent (check_mk 2.0) ... json response

    Hey all, this does seem to be a bug in Check_mk after all. All clusters can be queried correctly. But as soon as one host in a cluster is down (even intentionally), the special agent runs into the JSONDecodeError. As soon as all hosts are up again, the output is correct again. Regards, Lukas
  10. Proxmox_special_agent (check_mk 2.0) ... json response

    OK... what I found is that I can reach and query both clusters via curl, e.g. /nodes. Against the "running" cluster, the one the cmk special agent can query: curl --insecure --cookie "$(<cookie)" https://10.1.0.11:8006/api2/json/nodes/... (a hedged curl sketch of the API login and /nodes query follows after these results)
  11. Proxmox_special_agent (check_mk 2.0) ... json response

    Unfortunately only: json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0). Passing --debug additionally gives this trace: Traceback (most recent call last): File "/omd/sites/mc/share/check_mk/agents/special/agent_proxmox_ve", line 10, in <module> main() File...
  12. Proxmox_special_agent (check_mk 2.0) ... json response

    Hello Stoiko Ivanov, thanks for the feedback. cmk -d hostname runs both the agent check on the host, which works flawlessly, and the "special agent" described above. The latter queries the Proxmox cluster over HTTPS from the cmk host via the Proxmox API. The call expects...
  13. Proxmox_special_agent (check_mk 2.0) ... json response

    I am trying to monitor Proxmox VE 6.4 clusters with a check_mk upgraded to version 2.0. Check_mk 2.0 ships a special agent that uses the Proxmox API. On one cluster (both clusters are at the same patch level) I do get usable responses from the API: on the...
  14. Proxmox 6.2-15 Problem with HotPlug Hard Drives

    To reply to my own post: the problem disappeared after upgrading to libpve-common-perl 6.2-4. HTH
  15. Proxmox 6.2-15 Problem with HotPlug Hard Drives

    Same problem here with Ceph storage on PVE 6.2-15
  16. watchdog timeout on slow NFS backups

    You are right, it's strange. CPU load is around or below 1 when doing the backups. We have 2 NFS mounts on the cluster: one to a local QNAP, which runs fine without any issues; the other is remote, connected via a gateway and IPsec. That one produces the "not responding" messages, which is...
  17. watchdog timeout on slow NFS backups

    Thanks for your answer, "spirit"! corosync and backup so far: this is the corosync entry for "one" node: node { name: pve56 nodeid: 7 quorum_votes: 1 ring0_addr: 192.168.24.56 ring1_addr: 192.168.25.56 } where 192.168.24.0/24 is a separate network with a dedicated switch and 192.168.25.0/24 is...
  18. watchdog timeout on slow NFS backups

    Hi all, since version 6.0 and up to the current version 6.2 we see the following behavior when running backups over WAN to NFS. We have an 8-host cluster (all HP DL380, G7 up to W9) running fine. When doing backups over a WAN connection to a QNAP, we first see a lot of this: May 21 22:39:03 pve56 kernel... (an illustrative NFS storage entry follows after these results)
  19. Kernel panic on VMs since kernel pve-kernel-5.0.21-4-pve: 5.0.21-8

    I can confirm that version pve-kernel-5.0.21-4-pve: 5.0.21-9 is working correctly on 8 x Intel(R) Xeon(R) CPU E5320. So this is fixed now. Thanks so much for the great work.
  20. Kernel panic on VMs since kernel pve-kernel-5.0.21-4-pve: 5.0.21-8

    I think this "pve-kernel-5.0.21-4-pve causes Debian guests to reboot-loop on older Intel CPUs" is what you are talking about. The same good old dinosaurs. We use these old hosts for testing and as DMZ hosts, having a lot of HDDs for Ceph, a tape drive... So I will follow the other discussion.
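
Results 5, 8, and 17 describe the same two-link corosync layout: ring0 on a dedicated copper switch and ring1 as a VLAN on the 10G fibre uplink. A minimal corosync.conf sketch of such a layout might look roughly like this; only the pve56 node entry and the two subnets come from result 17, while the cluster name, link priorities, and the comments are assumptions.

    totem {
      version: 2
      cluster_name: examplecluster   # placeholder name, not from the threads
      link_mode: passive             # use one link at a time, fail over to the other
      interface {
        linknumber: 0                # ring0: dedicated copper switch, 192.168.24.0/24
        knet_link_priority: 1        # assumed priority: prefer link 0
      }
      interface {
        linknumber: 1                # ring1: VLAN on the 10G fibre uplink, 192.168.25.0/24
        knet_link_priority: 0
      }
    }

    nodelist {
      node {
        name: pve56
        nodeid: 7
        quorum_votes: 1
        ring0_addr: 192.168.24.56
        ring1_addr: 192.168.25.56
      }
      # ... one node { } block per cluster member
    }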
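
When a single link is reported down, as in results 1 and 8, the per-link state can be inspected on each node with the standard corosync and Proxmox tools. This is a generic sketch, not a command sequence taken from the threads; the journalctl time window is only an example built from the timestamps in result 1.

    pvecm status              # quorum and membership as the cluster currently sees it
    corosync-cfgtool -s       # local node: state of link 0 and link 1 (knet)
    journalctl -u corosync --since "2024-03-06 10:00" --until "2024-03-06 10:20"   # KNET/TOTEM messages around the incident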
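
Results 10-13 query the Proxmox VE API the same way the check_mk special agent does. As a rough illustration of that call path: the credentials below are placeholders, the host 10.1.0.11 is taken from result 10, and the cookie handling is simplified.

    # 1. request an authentication ticket (placeholder credentials)
    curl --insecure --data 'username=root@pam' --data 'password=secret' \
         https://10.1.0.11:8006/api2/json/access/ticket

    # 2. pass the returned ticket as PVEAuthCookie when listing the nodes
    curl --insecure --cookie 'PVEAuthCookie=<ticket from step 1>' \
         https://10.1.0.11:8006/api2/json/nodes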
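
Results 16-18 concern an NFS backup target reached over WAN and IPsec. For context only: an NFS storage in Proxmox VE is defined in /etc/pve/storage.cfg roughly as below, where the options line carries mount options passed through to the NFS client. The storage name, server address, export path, and options here are invented placeholders, not the poster's configuration.

    nfs: qnap-wan-backup
        server 192.0.2.10
        export /share/backups
        path /mnt/pve/qnap-wan-backup
        content backup
        options vers=3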