Proxmox does not start after update (with subscription)

ednt

After the latest update, Proxmox no longer starts in the cluster.
No LXC containers or VMs start, and the GUI is unreachable.

One cluster of 4 nodes with Ceph, all configured identically.

Nodes 1 and 2: Proxmox does not start after a reboot. (The reboot happened on its own.)

pveversion -v returns:
proxmox-ve: 5.2-2 (running kernel: 4.15.18-5-pve)
pve-manager: 5.2-9 (running version: 5.2-9/4b30e8f9)
pve-kernel-4.15: 5.2-8
pve-kernel-4.15.18-5-pve: 4.15.18-24
pve-kernel-4.15.18-4-pve: 4.15.18-23
ceph: 12.2.8-pve1
corosync: 2.4.2-pve5
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-38
libpve-guest-common-perl: 2.0-18
libpve-http-server-perl: 2.0-11
libpve-storage-perl: 5.0-29
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 3.0.2+pve1-2
lxcfs: 3.0.2-2
novnc-pve: 1.0.0-2
proxmox-widget-toolkit: 1.0-20
pve-cluster: 5.0-30
pve-container: 2.0-27
pve-docs: 5.2-8
pve-firewall: 3.0-14
pve-firmware: 2.0-5
pve-ha-manager: 2.0-5
pve-i18n: 1.0-6
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.11.2-1
pve-xtermjs: 1.0-5
qemu-server: 5.0-35
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.11-pve1~bpo1

pveproxy status hangs.

service pve-cluster status returns:
pmxcfs[2266]: [status] crit: cpg_send_message failed: 6

service corosync status:
Oct 05 21:29:42 srv-pm-172 corosync[2481]: warning [CPG ] downlist left_list: 0 received
Oct 05 21:29:42 srv-pm-172 corosync[2481]: [CPG ] downlist left_list: 0 received
Oct 05 21:29:42 srv-pm-172 corosync[2481]: warning [CPG ] downlist left_list: 0 received
Oct 05 21:29:42 srv-pm-172 corosync[2481]: [CPG ] downlist left_list: 0 received
Oct 05 21:29:42 srv-pm-172 corosync[2481]: [CPG ] downlist left_list: 0 received
Oct 05 21:29:42 srv-pm-172 corosync[2481]: [CPG ] downlist left_list: 0 received
Oct 05 21:29:42 srv-pm-172 corosync[2481]: notice [QUORUM] Members[5]: 1 2 3 4 5
Oct 05 21:29:42 srv-pm-172 corosync[2481]: notice [MAIN ] Completed service synchronization, ready to provide service.
Oct 05 21:29:42 srv-pm-172 corosync[2481]: [QUORUM] Members[5]: 1 2 3 4 5
Oct 05 21:29:42 srv-pm-172 corosync[2481]: [MAIN ] Completed service synchronization, ready to provide service.
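
When comparing what each node reports, it can help to pull the membership out of the corosync output mechanically. The following is a minimal sketch, assuming the log line format shown above; the function name `parse_quorum_members` and the regular expression are illustrative and not part of any Proxmox or corosync tooling.

```python
import re

# Matches corosync quorum lines such as
# "notice [QUORUM] Members[5]: 1 2 3 4 5" (format taken from the log above).
QUORUM_RE = re.compile(r"\[QUORUM\]\s+Members\[(\d+)\]:\s*([\d ]+)")


def parse_quorum_members(log_text):
    """Return the node IDs from the last quorum line, or None if none is found."""
    members = None
    for line in log_text.splitlines():
        match = QUORUM_RE.search(line)
        if match:
            ids = [int(n) for n in match.group(2).split()]
            # Sanity check: the bracketed count should equal the number of IDs.
            if len(ids) == int(match.group(1)):
                members = ids
    return members


sample = (
    "Oct 05 21:29:42 srv-pm-172 corosync[2481]: notice [QUORUM] Members[5]: 1 2 3 4 5\n"
    "Oct 05 21:29:42 srv-pm-172 corosync[2481]: notice [MAIN ] Completed service synchronization, ready to provide service.\n"
)
print(parse_quorum_members(sample))  # prints [1, 2, 3, 4, 5]
```

Running this against the output of each node shows quickly whether all nodes agree on the same member list.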

service pve-cluster status returns:
pve-cluster.service - The Proxmox VE cluster filesystem
Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2018-10-05 22:43:34 CEST; 6min ago
Main PID: 2288 (pmxcfs)
Tasks: 9 (limit: 4915)
Memory: 72.0M
CPU: 1.920s
CGroup: /system.slice/pve-cluster.service
└─2288 /usr/bin/pmxcfs

Oct 05 22:48:11 srv-pm-171 pmxcfs[2288]: [status] crit: cpg_initialize failed: 2
Oct 05 22:48:11 srv-pm-171 pmxcfs[2288]: [status] crit: can't initialize service
Oct 05 22:48:17 srv-pm-171 pmxcfs[2288]: [status] notice: update cluster info (cluster name ednt-170, version = 7)
Oct 05 22:48:17 srv-pm-171 pmxcfs[2288]: [status] notice: node has quorum
Oct 05 22:48:17 srv-pm-171 pmxcfs[2288]: [dcdb] notice: members: 1/2288, 2/2266, 3/2434, 4/2217, 5/12906
Oct 05 22:48:17 srv-pm-171 pmxcfs[2288]: [dcdb] notice: starting data syncronisation
Oct 05 22:48:17 srv-pm-171 pmxcfs[2288]: [dcdb] notice: received sync request (epoch 1/2288/00000002)
Oct 05 22:48:17 srv-pm-171 pmxcfs[2288]: [status] notice: members: 1/2288, 2/2266, 3/2434, 4/2217, 5/12906
Oct 05 22:48:17 srv-pm-171 pmxcfs[2288]: [status] notice: starting data syncronisation
Oct 05 22:48:17 srv-pm-171 pmxcfs[2288]: [status] notice: received sync request (epoch 1/2288/00000002)
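
The critical messages are easy to overlook between the notices. Below is a small sketch that filters only the `crit` entries from pmxcfs journal text, assuming the line layout seen above (`pmxcfs[<pid>]: [<subsystem>] <level>: <message>`); the helper name `critical_messages` is made up for this example.

```python
import re

# Layout assumed from the pmxcfs lines quoted above:
# "<timestamp> <host> pmxcfs[<pid>]: [<subsystem>] <level>: <message>"
PMXCFS_RE = re.compile(r"pmxcfs\[(\d+)\]: \[(\w+)\] (\w+): (.*)")


def critical_messages(log_text):
    """Return (subsystem, message) pairs for all pmxcfs lines with level 'crit'."""
    found = []
    for line in log_text.splitlines():
        match = PMXCFS_RE.search(line)
        if match and match.group(3) == "crit":
            found.append((match.group(2), match.group(4)))
    return found


sample = "\n".join([
    "Oct 05 22:48:11 srv-pm-171 pmxcfs[2288]: [status] crit: cpg_initialize failed: 2",
    "Oct 05 22:48:11 srv-pm-171 pmxcfs[2288]: [status] crit: can't initialize service",
    "Oct 05 22:48:17 srv-pm-171 pmxcfs[2288]: [status] notice: node has quorum",
])
for subsystem, message in critical_messages(sample):
    print(subsystem, "->", message)
```

For the excerpt above this leaves only the two `cpg_initialize` related lines, which are the ones that point at the communication problem between pmxcfs and corosync.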


Checked the sync network with omping: no packet loss.

journalctl -xe returns:
-- Unit pve-container@112.service has begun starting up.
Oct 05 22:55:21 srv-pm-171 kernel: INFO: task pve-firewall:2901 blocked for more than 120 seconds.
Oct 05 22:55:21 srv-pm-171 kernel: Tainted: P IO 4.15.18-5-pve #1
Oct 05 22:55:21 srv-pm-171 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 05 22:55:21 srv-pm-171 kernel: pve-firewall D 0 2901 1 0x00000000
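
To see at a glance which processes the kernel reports as blocked, the hung-task lines can be extracted from the journal text. This is only a sketch that assumes the kernel message wording quoted above; `hung_tasks` is a hypothetical helper name.

```python
import re

# Wording assumed from the kernel line above:
# "INFO: task <name>:<pid> blocked for more than <n> seconds."
HUNG_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")


def hung_tasks(log_text):
    """Return (task_name, pid, seconds) for every hung-task report in the text."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in HUNG_RE.finditer(log_text)
    ]


sample = (
    "Oct 05 22:55:21 srv-pm-171 kernel: INFO: task pve-firewall:2901 "
    "blocked for more than 120 seconds.\n"
    "Oct 05 22:55:21 srv-pm-171 kernel: Tainted: P IO 4.15.18-5-pve #1\n"
)
print(hung_tasks(sample))  # prints [('pve-firewall', 2901, 120)]
```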

Ceph health is OK.
 
Hi,

please try booting from the previous kernel.
 
Solution:

An update was performed on all 4 nodes.
One node rebooted on its own, after which the entire cluster hung (presumably because of Corosync).
The entire cluster had to be restarted.
After that, the Proxmox processes were running again.
 
