Hi.
I run a cluster of 6 nodes: Dell servers with bonded interfaces on a private switch, with almost no load and no traffic. Everything worked perfectly for years, but since the upgrade to 6.0.4, failures have been appearing one after another.
Sometimes nodes lose their knet connection for no apparent reason (there is no load at all):
Code:
Jul 28 11:36:36 pve2 corosync[28832]: [KNET ] link: host: 6 link: 0 is down
Jul 28 11:36:36 pve2 corosync[28832]: [KNET ] link: host: 3 link: 0 is down
Jul 28 11:36:36 pve2 corosync[28832]: [KNET ] link: host: 1 link: 0 is down
Jul 28 11:36:36 pve2 corosync[28832]: [KNET ] host: host: 6 (passive) best link: 0 (pri: 1)
Jul 28 11:36:36 pve2 corosync[28832]: [KNET ] host: host: 6 has no active links
Jul 28 11:36:36 pve2 corosync[28832]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Jul 28 11:36:36 pve2 corosync[28832]: [KNET ] host: host: 3 has no active links
Jul 28 11:36:36 pve2 corosync[28832]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Jul 28 11:36:36 pve2 corosync[28832]: [KNET ] host: host: 1 has no active links
Jul 28 11:36:38 pve2 corosync[28832]: [KNET ] rx: host: 6 link: 0 is up
Jul 28 11:36:38 pve2 corosync[28832]: [KNET ] rx: host: 3 link: 0 is up
Jul 28 11:36:38 pve2 corosync[28832]: [KNET ] host: host: 6 (passive) best link: 0 (pri: 1)
Jul 28 11:36:38 pve2 corosync[28832]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Jul 28 11:36:38 pve2 corosync[28832]: [KNET ] rx: host: 1 link: 0 is up
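To get a feel for how often each link flaps, the up/down events could be tallied per host with something like the script below. This is only a rough sketch: the regex is derived from the log lines shown above and nothing else, and `count_link_events` and the `sample` list are names made up for illustration.

```python
import re
from collections import Counter

# Built from the log excerpt above (assumption: other corosync versions
# may format these lines differently). Matches lines such as:
#   "... [KNET ] link: host: 6 link: 0 is down"
#   "... [KNET ] rx: host: 6 link: 0 is up"
LINK_RE = re.compile(
    r"\[KNET\s*\]\s+(?:link|rx): host: (\d+) link: (\d+) is (up|down)"
)


def count_link_events(lines):
    """Count knet link state changes, keyed by (host, link, state)."""
    counts = Counter()
    for line in lines:
        match = LINK_RE.search(line)
        if match:
            host, link, state = match.groups()
            counts[(int(host), int(link), state)] += 1
    return counts


# A few lines copied from the excerpt above, used as demo input.
sample = [
    "Jul 28 11:36:36 pve2 corosync[28832]: [KNET ] link: host: 6 link: 0 is down",
    "Jul 28 11:36:36 pve2 corosync[28832]: [KNET ] host: host: 6 has no active links",
    "Jul 28 11:36:38 pve2 corosync[28832]: [KNET ] rx: host: 6 link: 0 is up",
]

for (host, link, state), n in sorted(count_link_events(sample).items()):
    print(f"host {host} link {link} {state}: {n}")
```

To run it on real data, `sample` could be replaced with `sys.stdin` (after `import sys`) and the output of `journalctl -u corosync` piped into the script.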
Sometimes there are MTU problems:
Code:
Jul 28 09:29:16 pve3 corosync[7418]: [KNET ] pmtud: possible MTU misconfiguration detected. kernel is reporting MTU: 1500 bytes for host 6 link 0 but the other node is not acknowledging packets of this size.
Jul 28 09:29:16 pve3 corosync[7418]: [KNET ] pmtud: This can be caused by this node interface MTU too big or a network device that does not support or has been misconfigured to manage MTU of this size, or packet loss. knet will continue to run but performances might be affected.
Jul 28 09:29:42 pve3 corosync[7418]: [KNET ] host: host: 6 (passive) best link: 0 (pri: 1)
Jul 28 09:29:42 pve3 corosync[7418]: [KNET ] host: host: 6 has no active links
Is it possible to go back to multicast in corosync?
===
After the upgrade, the server won't start because it cannot mount /dev/pve/data from fstab. It worked perfectly before.
Honestly, release 6 looks like the worst release ever.