Hello,
I just built 6 new Proxmox VE 6 nodes (fully up to date), all with the same hardware.
Each node has 2 links (2 LACP bonds on 2 Intel X520 NICs; a sketch of the interfaces config is below):
bond0: 2x10 Gb (management and production VMs, MTU 1500)
bond1: 2x10 Gb (Ceph storage, MTU 9000)
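Roughly what the relevant part of /etc/network/interfaces looks like on dc-prox-23 (slave NIC names, the hash policy and the /24 masks are reconstructed from memory and may differ slightly; the addresses match corosync.conf below):
Code:
auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4
        bond-miimon 100

auto vmbr0
iface vmbr0 inet static
        address 10.192.5.57/24
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
#Management and VM bridge on bond0, MTU 1500

auto bond1
iface bond1 inet static
        address 10.199.0.57/24
        bond-slaves eno3 eno4
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4
        bond-miimon 100
        mtu 9000
#Ceph storage on bond1, jumbo frames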
Code:
root@dc-prox-23:~# pveversion -v
proxmox-ve: 6.0-2 (running kernel: 5.0.21-3-pve)
pve-manager: 6.0-9 (running version: 6.0-9/508dcee0)
pve-kernel-5.0: 6.0-9
pve-kernel-helper: 6.0-9
pve-kernel-5.0.21-3-pve: 5.0.21-7
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.2-pve4
criu: 3.11-3
glusterfs-client: 5.5-3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.13-pve1
libpve-access-control: 6.0-2
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-5
libpve-guest-common-perl: 3.0-1
libpve-http-server-perl: 3.0-3
libpve-storage-perl: 6.0-9
libqb0: 1.0.5-1
lvm2: 2.03.02-pve3
lxc-pve: 3.1.0-65
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.0-8
pve-cluster: 6.0-7
pve-container: 3.0-7
pve-docs: 6.0-7
pve-edk2-firmware: 2.20190614-1
pve-firewall: 4.0-7
pve-firmware: 3.0-2
pve-ha-manager: 3.0-2
pve-i18n: 2.0-3
pve-qemu-kvm: 4.0.1-3
pve-xtermjs: 3.13.2-1
qemu-server: 6.0-9
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.2-pve1
bond0 is declared as the primary corosync link (link 0), bond1 as link 1:
Code:
root@dc-prox-25:~# corosync-cfgtool -s
Printing link status.
Local node ID 1
LINK ID 0
addr = 10.192.5.59
status:
nodeid 1: link enabled:1 link connected:1
nodeid 2: link enabled:1 link connected:1
nodeid 3: link enabled:1 link connected:1
nodeid 4: link enabled:1 link connected:1
nodeid 5: link enabled:1 link connected:1
nodeid 6: link enabled:1 link connected:1
LINK ID 1
addr = 10.199.0.59
status:
nodeid 1: link enabled:0 link connected:1
nodeid 2: link enabled:1 link connected:1
nodeid 3: link enabled:1 link connected:1
nodeid 4: link enabled:1 link connected:1
nodeid 5: link enabled:1 link connected:1
nodeid 6: link enabled:1 link connected:1
On another node:
Code:
root@dc-prox-23:~# corosync-cfgtool -s
Printing link status.
Local node ID 4
LINK ID 0
addr = 10.192.5.57
status:
nodeid 1: link enabled:1 link connected:1
nodeid 2: link enabled:1 link connected:1
nodeid 3: link enabled:1 link connected:1
nodeid 4: link enabled:1 link connected:1
nodeid 5: link enabled:1 link connected:1
nodeid 6: link enabled:1 link connected:1
LINK ID 1
addr = 10.199.0.57
status:
nodeid 1: link enabled:1 link connected:1
nodeid 2: link enabled:1 link connected:1
nodeid 3: link enabled:1 link connected:1
nodeid 4: link enabled:0 link connected:1
nodeid 5: link enabled:1 link connected:1
nodeid 6: link enabled:1 link connected:1
Sometimes I see the following lines in syslog:
Code:
Nov 4 09:40:03 dc-prox-23 corosync[34237]: [KNET ] link: host: 1 link: 1 is down
Nov 4 09:40:03 dc-prox-23 corosync[34237]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Nov 4 09:40:06 dc-prox-23 corosync[34237]: [KNET ] rx: host: 1 link: 1 is up
Nov 4 09:40:06 dc-prox-23 corosync[34237]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
There are no VMs on this cluster yet.
It looks like something is wrong with the LAG running at MTU 9000.
Since there are 2 corosync links I'm not really worried, but I don't know whether this could become a real issue or where to look further.
I stressed the LAG with iperf for more than 10 minutes, but nothing showed up in syslog; roughly what I ran is sketched below.
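For reference (exact commands reconstructed from memory; the addresses are the Ceph-network ones from corosync.conf):
Code:
# on dc-prox-25 (10.199.0.59): iperf3 server
iperf3 -s
# on dc-prox-23: ~10 minutes of parallel streams across the MTU-9000 bond
iperf3 -c 10.199.0.59 -P 4 -t 600
# next check I plan to run: confirm jumbo frames pass end to end without fragmentation
# (8972 bytes ICMP payload = 9000 MTU - 20 bytes IP header - 8 bytes ICMP header)
ping -M do -s 8972 -c 10 10.199.0.59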
My corosync.conf:
Code:
root@dc-prox-23:~# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: dc-prox-22
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.192.5.56
    ring1_addr: 10.199.0.56
  }
  node {
    name: dc-prox-23
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 10.192.5.57
    ring1_addr: 10.199.0.57
  }
  node {
    name: dc-prox-24
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.192.5.58
    ring1_addr: 10.199.0.58
  }
  node {
    name: dc-prox-25
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.192.5.59
    ring1_addr: 10.199.0.59
  }
  node {
    name: dc-prox-26
    nodeid: 5
    quorum_votes: 1
    ring0_addr: 10.192.5.60
    ring1_addr: 10.199.0.60
  }
  node {
    name: dc-prox-27
    nodeid: 6
    quorum_votes: 1
    ring0_addr: 10.192.5.61
    ring1_addr: 10.199.0.61
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: CDV6Cluster
  config_version: 6
  interface {
    linknumber: 0
  }
  interface {
    linknumber: 1
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}
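One thing I was considering testing (not applied yet, purely a sketch; I'm not even sure knet_link_priority is the right knob here) is giving the links explicit priorities in the totem section:
Code:
totem {
  cluster_name: CDV6Cluster
  # config_version bumped when editing /etc/pve/corosync.conf
  config_version: 7
  interface {
    linknumber: 0
    # assumption: higher value = preferred link in passive mode
    knet_link_priority: 2
  }
  interface {
    linknumber: 1
    knet_link_priority: 1
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}
Not sure whether that is worth trying, or whether the occasional flap on link 1 is simply harmless.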
Thanks in advance for your help if you can think of something to test!
Antoine