Search results

  1. M

    How to add a 2nd Corosync Link

    I just did that, and it seemed to work. The 2nd link is being displayed in the GUI now, too, along with the new configuration ID/version. journalctl -b -u corosync looked good. However, to test it I ran "ifconfig vmbr0 0.0.0.0" on the interface that ring0 uses. After a few moments it lost quorum and got fenced...
  2. M

    How to add a 2nd Corosync Link

    Hello, I would like to add a 2nd corosync ring/network for redundancy. How can I add "Link 1" now that the cluster is already in production? Thanks, Michael
  3. M

    Change 3 Node Ceph Cluster to 1 Node

    Hello, I have an old Proxmox 4.4 cluster and would like to "reinstall" it. My goal is to change that 3-node cluster with Ceph storage to a single node, freeing up 2 nodes for a fresh install. I guess I just change the quorum to 1 and then migrate all VMs to that one node. Then turn off the two...
  4. M

    Radom Node Freeze/Fence and TOTEM Retransmit List problems

    Hello Alwin, that wiki link seems new to me. But yes, I use "Method 2". I think there was an official Proxmox unicast howto which described Method 2, but I can't find it anymore. Will Method 1 (multicast) fix my problem? Thanks, Mario
  5. M

    Radom Node Freeze/Fence and TOTEM Retransmit List problems

    Hello, I am still struggling with those random node freezes/crashes. I run a 3-node cluster with unicast (NO SWITCH) in a full-mesh network. I think I was able to reproduce the problem with a VMware 3-node cluster setup by setting some packet loss on the corosync network. The fencing...
  6. M

    [SOLVED] periodic Node Crash/freeze

    This thread can be closed; I will carry on in this one: https://forum.proxmox.com/threads/radom-node-freeze-fence-and-totem-retransmit-list-problems.49493/
  7. M

    [SOLVED] Corosync/HA logic

    1.) I tested it by setting packet loss to 100% in my VMware test environment. 2.) Hm... I can't reproduce the random fencing now. Maybe I tested it wrong. 3.) No need to look at the logs, since this thread can be closed. Nice to know: there is no point in fencing nodes without services.
  8. M

    [SOLVED] Corosync/HA logic

    Hello, 3-node test cluster. Test case: cut off the cluster network connection of node10 (master). ha-manager status on node08: quorum OK master node10 (old timestamp - dead?, Tue Jan 8 17:12:52 2019) lrm node08 (active, Tue Jan 8 17:13:31 2019) lrm node09 (idle, Tue Jan 8 17:13:34 2019)...
  9. M

    Radom Node Freeze/Fence and TOTEM Retransmit List problems

    Hello list, I have a 3-node cluster with Ceph. The nodes are connected to each other with unicast. node08: pve-manager/5.3-5/97ae681d (running kernel: 4.15.18-9-pve) node09: pve-manager/5.2-9/4b30e8f9 (running kernel: 4.15.18-7-pve) node10: pve-manager/5.2-10/6f892b40 (running kernel...
  10. M

    [SOLVED] periodic Node Crash/freeze

    Hmm...it crashed again in the same style :-(
  11. M

    [SOLVED] periodic Node Crash/freeze

    Hello dcsapak, thanks a lot for your help. It turned out to be a "Kernel Intel Nic driver problem". I once saw an error indicating a network driver problem. Since I switched to some old Mellanox ConnectX-3 NICs, the problems have disappeared. The Supermicro hardware is quite common with the 10G...
  12. M

    Disable build in Fencing

    Hello, how can I disable the built-in fencing in Proxmox 5.2? I have some weird problems and I would prefer a "stuck" system (to be able to debug it) over a fenced one. Thanks, Mario
  13. M

    [SOLVED] periodic Node Crash/freeze

    It just crashed again: Sep 7 14:01:03 node09 systemd[1309404]: Stopped target Paths. Sep 7 14:01:03 node09 systemd[1309404]: Stopped target Sockets. Sep 7 14:01:03 node09 systemd[1309404]: Reached target Shutdown. Sep 7 14:01:03 node09 systemd[1309404]: Starting Exit the Session... Sep 7...
  14. M

    [SOLVED] periodic Node Crash/freeze

    Hello dcsapak, my network seems fine. I did a 268-hour test with a ping flood: root@node09:~# ping -f -i 0.2 node10 PING node10.cluster3.stuttgart.local (10.15.15.10) 56(84) bytes of data. --- node10.cluster3.stuttgart.local ping statistics --- 4826063 packets transmitted, 4826063 received...
  15. M

    [SOLVED] periodic Node Crash/freeze

    Hello dcsapak, thanks for your reply. I use unicast. All three nodes are connected to each other, following the manual here: https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server /etc/network/interfaces (from node08) auto lo iface lo inet loopback allow-hotplug eth2 iface eth2 inet...
  16. M

    [SOLVED] periodic Node Crash/freeze

    I removed Puppet from the three nodes, and just now one node got stuck again (!!!): Aug 22 16:11:12 node08 corosync[5428]: notice [TOTEM ] Retransmit List: 4b5fc 4b5fd 4b5fe 4b5ff 4b600 4b601 4b602 4b604 Aug 22 16:11:12 node08 corosync[5428]: notice [TOTEM ] Retransmit List: 4b5fc 4b5fd 4b5fe...
  17. M

    [SOLVED] periodic Node Crash/freeze

    I get this a lot: root@node09:~ # grep "Reached target Shutdown." /var/log/syslog ... Aug 21 09:25:17 node09 systemd[1503148]: Reached target Shutdown. Aug 21 09:25:18 node09 systemd[1503202]: Reached target Shutdown. Aug 21 09:27:52 node09 systemd[1504300]: Reached target Shutdown. Aug 21...
  18. M

    [SOLVED] periodic Node Crash/freeze

    Hello, I run a couple of 3-node clusters, and for the past half year I have had a problem with one of them. The cluster has the nodes "node08", "node09" and "node10". All seem to have the same problem: they crash after about two weeks with the following in syslog. They then just sit there with a freeze...