Hello,
unfortunately I can no longer add a new node to my cluster.
All Proxmox systems are running 7.2.7, including the new node - I'm slowly running out of ideas, since all other nodes could be added without any problems so far.
The cluster add was performed via the GUI (root user).
All systems can ping each other (all routes are up and working).
Some logs below.
The node that is supposed to join the cluster:
cat syslog
Aug 1 06:57:06 VMS04ZW pvedaemon[1428]: <root@pam> starting task UPID:VMS04ZW:002D7780:08C091F8:62E75D22:clusterjoin::root@pam:
Aug 1 06:57:06 VMS04ZW systemd[1]: Stopping The Proxmox VE cluster filesystem...
Aug 1 06:57:06 VMS04ZW pmxcfs[1367]: [main] notice: teardown filesystem
Aug 1 06:57:06 VMS04ZW systemd[1]: etc-pve.mount: Succeeded.
Aug 1 06:57:16 VMS04ZW systemd[1]: pve-cluster.service: State 'stop-sigterm' timed out. Killing.
Aug 1 06:57:16 VMS04ZW systemd[1]: pve-cluster.service: Killing process 1367 (pmxcfs) with signal SIGKILL.
Aug 1 06:57:16 VMS04ZW systemd[1]: pve-cluster.service: Main process exited, code=killed, status=9/KILL
Aug 1 06:57:16 VMS04ZW systemd[1]: pve-cluster.service: Failed with result 'timeout'.
Aug 1 06:57:16 VMS04ZW systemd[1]: Stopped The Proxmox VE cluster filesystem.
Aug 1 06:57:16 VMS04ZW systemd[1]: pve-cluster.service: Consumed 8min 53.388s CPU time.
Aug 1 06:57:16 VMS04ZW systemd[1]: Starting Corosync Cluster Engine...
Aug 1 06:57:16 VMS04ZW systemd[1]: Starting The Proxmox VE cluster filesystem...
Aug 1 06:57:16 VMS04ZW pveproxy[2403006]: ipcc_send_rec[1] failed: Permission denied
Aug 1 06:57:16 VMS04ZW pmxcfs[2979728]: [quorum] crit: quorum_initialize failed: 2
Aug 1 06:57:16 VMS04ZW pmxcfs[2979728]: [quorum] crit: can't initialize service
Aug 1 06:57:16 VMS04ZW pmxcfs[2979728]: [confdb] crit: cmap_initialize failed: 2
Aug 1 06:57:16 VMS04ZW pmxcfs[2979728]: [confdb] crit: can't initialize service
Aug 1 06:57:16 VMS04ZW pmxcfs[2979728]: [dcdb] crit: cpg_initialize failed: 2
Aug 1 06:57:16 VMS04ZW pmxcfs[2979728]: [dcdb] crit: can't initialize service
Aug 1 06:57:16 VMS04ZW pmxcfs[2979728]: [status] crit: cpg_initialize failed: 2
Aug 1 06:57:16 VMS04ZW pmxcfs[2979728]: [status] crit: can't initialize service
Aug 1 06:57:16 VMS04ZW corosync[2979725]: [MAIN ] Corosync Cluster Engine 3.1.5 starting up
Aug 1 06:57:16 VMS04ZW corosync[2979725]: [MAIN ] Corosync built-in features: dbus monitoring watchdog systemd xmlconf vqsim nozzle snmp pie relro bindnow
Aug 1 06:57:16 VMS04ZW corosync[2979725]: [TOTEM ] Initializing transport (Kronosnet).
Aug 1 06:57:16 VMS04ZW kernel: [1468391.902467] sctp: Hash tables configured (bind 1024/1024)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [TOTEM ] totemknet initialized
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] common: crypto_nss.so has been loaded from /usr/lib/x86_64-linux-gnu/kronosnet/crypto_nss.so
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [SERV ] Service engine loaded: corosync configuration map access [0]
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [QB ] server name: cmap
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [SERV ] Service engine loaded: corosync configuration service [1]
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [QB ] server name: cfg
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [QB ] server name: cpg
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [SERV ] Service engine loaded: corosync profile loading service [4]
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [SERV ] Service engine loaded: corosync resource monitoring service [6]
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [WD ] Watchdog not enabled by configuration
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [WD ] resource load_15min missing a recovery key.
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [WD ] resource memory_used missing a recovery key.
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [WD ] no resources configured.
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [SERV ] Service engine loaded: corosync watchdog service [7]
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [QUORUM] Using quorum provider corosync_votequorum
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [SERV ] Service engine loaded: corosync vote quorum service v1.0 [5]
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [QB ] server name: votequorum
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1 [3]
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [QB ] server name: quorum
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [TOTEM ] Configuring link 0
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [TOTEM ] Configured link number 0: local addr: 192.168.5.225, port=5405
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 0)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 5 has no active links
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 5 has no active links
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 5 has no active links
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 0)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 1 has no active links
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 1 has no active links
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 1 has no active links
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [QUORUM] Sync members[1]: 7
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [QUORUM] Sync joined[1]: 7
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [TOTEM ] A new membership (7.cca4) was formed. Members joined: 7
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 4 (passive) best link: 0 (pri: 0)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 4 has no active links
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 4 (passive) best link: 0 (pri: 1)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 4 has no active links
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 4 (passive) best link: 0 (pri: 1)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 4 has no active links
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 0)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 3 has no active links
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 3 has no active links
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [QUORUM] Members[1]: 7
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [MAIN ] Completed service synchronization, ready to provide service.
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 3 has no active links
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 0)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 2 has no active links
Aug 1 06:57:17 VMS04ZW systemd[1]: Started Corosync Cluster Engine.
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 2 has no active links
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 2 has no active links
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 6 (passive) best link: 0 (pri: 0)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 6 has no active links
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 6 (passive) best link: 0 (pri: 1)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 6 has no active links
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 6 (passive) best link: 0 (pri: 1)
Aug 1 06:57:17 VMS04ZW corosync[2979725]: [KNET ] host: host: 6 has no active links
Aug 1 06:57:17 VMS04ZW pve-firewall[1399]: status update error: Connection refused
Aug 1 06:57:17 VMS04ZW pve-firewall[1399]: firewall update time (10.050 seconds)
Aug 1 06:57:17 VMS04ZW pve-firewall[1399]: status update error: Connection refused
Aug 1 06:57:17 VMS04ZW systemd[1]: Started The Proxmox VE cluster filesystem.
Aug 1 06:57:17 VMS04ZW pvestatd[1401]: status update time (8.275 seconds)
Aug 1 06:57:20 VMS04ZW corosync[2979725]: [KNET ] rx: host: 2 link: 0 is up
Aug 1 06:57:20 VMS04ZW corosync[2979725]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Aug 1 06:57:20 VMS04ZW corosync[2979725]: [KNET ] rx: host: 5 link: 0 is up
Aug 1 06:57:20 VMS04ZW corosync[2979725]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Aug 1 06:57:20 VMS04ZW corosync[2979725]: [KNET ] pmtud: PMTUD link change for host: 2 link: 0 from 469 to 1397
Aug 1 06:57:20 VMS04ZW corosync[2979725]: [KNET ] pmtud: PMTUD link change for host: 5 link: 0 from 469 to 1397
Aug 1 06:57:20 VMS04ZW corosync[2979725]: [KNET ] pmtud: Global data MTU changed to: 1397
Aug 1 06:57:22 VMS04ZW pmxcfs[2979728]: [status] notice: update cluster info (cluster name proxmox-cluster, version = 13)
Aug 1 06:57:35 VMS04ZW corosync[2979725]: [QUORUM] Sync members[1]: 7
Aug 1 06:57:35 VMS04ZW corosync[2979725]: [TOTEM ] A new membership (7.cd6f) was formed. Members
Aug 1 06:57:35 VMS04ZW corosync[2979725]: [QUORUM] Members[1]: 7
Aug 1 06:57:35 VMS04ZW corosync[2979725]: [MAIN ] Completed service synchronization, ready to provide service.
Aug 1 06:57:49 VMS04ZW corosync[2979725]: [QUORUM] Sync members[1]: 7
Aug 1 06:57:49 VMS04ZW corosync[2979725]: [TOTEM ] A new membership (7.cd73) was formed. Members
Aug 1 06:57:49 VMS04ZW corosync[2979725]: [QUORUM] Members[1]: 7
Aug 1 06:57:49 VMS04ZW corosync[2979725]: [MAIN ] Completed service synchronization, ready to provide service.
Aug 1 06:57:49 VMS04ZW pmxcfs[2979728]: [dcdb] notice: members: 7/2979728
Aug 1 06:57:49 VMS04ZW pmxcfs[2979728]: [dcdb] notice: all data is up to date
Aug 1 06:57:49 VMS04ZW pmxcfs[2979728]: [status] notice: members: 7/2979728
Aug 1 06:57:49 VMS04ZW pmxcfs[2979728]: [status] notice: all data is up to date
Aug 1 06:58:01 VMS04ZW cron[1391]: (*system*vzdump) CAN'T OPEN SYMLINK (/etc/cron.d/vzdump)
Aug 1 06:58:03 VMS04ZW pvescheduler[2979832]: jobs: cfs-lock 'file-jobs_cfg' error: no quorum!
Aug 1 06:58:03 VMS04ZW pvescheduler[2979831]: replication: cfs-lock 'file-replication_cfg' error: no quorum!
Aug 1 06:58:03 VMS04ZW corosync[2979725]: [QUORUM] Sync members[1]: 7
Aug 1 06:58:03 VMS04ZW corosync[2979725]: [TOTEM ] A new membership (7.cd77) was formed. Members
Aug 1 06:58:10 VMS04ZW pmxcfs[2979728]: [status] notice: cpg_send_message retry 10
Aug 1 06:58:11 VMS04ZW pmxcfs[2979728]: [status] notice: cpg_send_message retry 20
Aug 1 06:58:12 VMS04ZW pmxcfs[2979728]: [status] notice: cpg_send_message retry 30
Aug 1 06:58:13 VMS04ZW pmxcfs[2979728]: [status] notice: cpg_send_message retry 40
Aug 1 06:58:14 VMS04ZW pmxcfs[2979728]: [status] notice: cpg_send_message retry 50
Aug 1 06:58:15 VMS04ZW pmxcfs[2979728]: [status] notice: cpg_send_message retry 60
Aug 1 06:58:16 VMS04ZW pmxcfs[2979728]: [status] notice: cpg_send_message retry 70
Aug 1 06:58:17 VMS04ZW pmxcfs[2979728]: [status] notice: cpg_send_message retry 80
Aug 1 06:58:17 VMS04ZW corosync[2979725]: [QUORUM] Sync members[1]: 7
Aug 1 06:58:17 VMS04ZW corosync[2979725]: [TOTEM ] A new membership (7.cd7b) was formed. Members
Aug 1 06:58:17 VMS04ZW corosync[2979725]: [QUORUM] Members[1]: 7
Aug 1 06:58:17 VMS04ZW corosync[2979725]: [MAIN ] Completed service synchronization, ready to provide service.
Aug 1 06:58:17 VMS04ZW pmxcfs[2979728]: [status] notice: cpg_send_message retried 89 times
Aug 1 06:58:17 VMS04ZW pvestatd[1401]: status update time (8.930 seconds)
Aug 1 06:58:32 VMS04ZW corosync[2979725]: [QUORUM] Sync members[1]: 7
pvecm status
Cluster information
-------------------
Name: proxmox-cluster
Config Version: 13
Transport: knet
Secure auth: on
Quorum information
------------------
Date: Mon Aug 1 13:53:36 2022
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000007
Ring ID: 7.e863
Quorate: No
Votequorum information
----------------------
Expected votes: 7
Highest expected: 7
Total votes: 1
Quorum: 4 Activity blocked
Flags:
Membership information
----------------------
Nodeid Votes Name
0x00000007 1 192.168.5.225 (local)
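As a sanity check, the "Quorum: 4" line above matches the usual majority rule for 7 expected votes - this is just a sketch of the arithmetic, not Proxmox code:

```python
# Majority needed for quorum with corosync votequorum defaults:
# strictly more than half of the expected votes.
def majority(expected_votes: int) -> int:
    return expected_votes // 2 + 1

# 7 expected votes -> 4 needed; this node alone has 1, so activity is blocked.
print(majority(7))  # 4
```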
cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: VMS04ZW
    nodeid: 7
    quorum_votes: 1
    ring0_addr: 192.168.5.225
  }
  node {
    name: vms05zwtest
    nodeid: 5
    quorum_votes: 1
    ring0_addr: 192.168.5.226
  }
  node {
    name: vms06zw
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.1.5.162
  }
  node {
    name: vms09zw
    nodeid: 4
    quorum_votes: 1
    ring0_addr: 192.168.5.236
  }
  node {
    name: vms11zw
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.1.5.166
  }
  node {
    name: vms12zw
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.1.5.161
  }
  node {
    name: vms14zw
    nodeid: 6
    quorum_votes: 1
    ring0_addr: 192.168.5.130
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: proxmox-cluster
  config_version: 13
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}
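One thing that stands out in the nodelist: the ring0 addresses sit in two different subnets (routed in our setup, as mentioned above). A quick grouping, assuming /24 masks - that mask is my assumption for illustration, the real netmasks may differ:

```python
import ipaddress

# ring0_addr entries copied from the corosync.conf above
addrs = {
    "VMS04ZW": "192.168.5.225",
    "vms05zwtest": "192.168.5.226",
    "vms06zw": "10.1.5.162",
    "vms09zw": "192.168.5.236",
    "vms11zw": "10.1.5.166",
    "vms12zw": "10.1.5.161",
    "vms14zw": "192.168.5.130",
}

# Group node names by their (assumed /24) subnet
by_subnet: dict[str, list[str]] = {}
for name, ip in addrs.items():
    net = ipaddress.ip_network(f"{ip}/24", strict=False)
    by_subnet.setdefault(str(net), []).append(name)

for net, names in sorted(by_subnet.items()):
    print(net, sorted(names))
```

Corosync traffic between the two groups therefore crosses a router, while ping only proves ICMP reachability, not that UDP 5405 gets through in both directions.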
All Proxmox nodes were rebooted a few days before the node add.
I would really appreciate some ideas - this is already the 4th failed attempt :-(
If more logs are needed (e.g. from a system inside the cluster), I'll gladly attach them.
Best regards, Thorsten