[SOLVED] Strange issues with updated 4.4-13/7ea56165

udo

Hi,
I have a strange effect: I added the last remaining node (pve06) to what is now a 7-node cluster (pve01-07).
All nodes have this version:
Code:
pveversion
pve-manager/4.4-13/7ea56165 (running kernel: 4.4.44-1-pve)
But the last two nodes (pve07 and pve06) have also received some updates that are still missing on node pve04 (and on pve01 and pve02):
Code:
root@pve04:~# apt list --upgradable
Listing... Done
eject/stable 2.1.5+deb1+cvs20081104-13.1+deb8u1 amd64 [upgradable from: 2.1.5+deb1+cvs20081104-13.1]
libjasper1/stable 1.900.1-debian1-2.4+deb8u3 amd64 [upgradable from: 1.900.1-debian1-2.4+deb8u2]
libqb0/stable 1.0.1-1 amd64 [upgradable from: 1.0-1]
libsmbclient/stable 2:4.2.14+dfsg-0+deb8u5 amd64 [upgradable from: 2:4.2.14+dfsg-0+deb8u4]
libwbclient0/stable 2:4.2.14+dfsg-0+deb8u5 amd64 [upgradable from: 2:4.2.14+dfsg-0+deb8u4]
proxmox-ve/stable 4.4-86 all [upgradable from: 4.4-84]
pve-cluster/stable 4.0-49 amd64 [upgradable from: 4.0-48]
pve-container/stable 1.0-97 all [upgradable from: 1.0-96]
pve-docs/stable 4.4-4 all [upgradable from: 4.4-3]
pve-firmware/stable 1.1-11 all [upgradable from: 1.1-10]
qemu-server/stable 4.0-110 amd64 [upgradable from: 4.0-109]
samba-common/stable 2:4.2.14+dfsg-0+deb8u5 all [upgradable from: 2:4.2.14+dfsg-0+deb8u4]
samba-libs/stable 2:4.2.14+dfsg-0+deb8u5 amd64 [upgradable from: 2:4.2.14+dfsg-0+deb8u4]
smbclient/stable 2:4.2.14+dfsg-0+deb8u5 amd64 [upgradable from: 2:4.2.14+dfsg-0+deb8u4]
vncterm/stable 1.3-2 amd64 [upgradable from: 1.3-1]
Nevertheless, all hosts show the same version in pveversion?!
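
The plain pveversion output above names only pve-manager, and pve-manager is not among the upgradable packages, which may be why the version string is identical everywhere (pveversion -v lists the individual packages). As a rough sketch, the apt output could be parsed to see which packages lag behind on a node; the function name parse_upgradable is made up for illustration, and the line format is assumed to match the listing above.

```python
import re

# Matches lines such as:
#   pve-cluster/stable 4.0-49 amd64 [upgradable from: 4.0-48]
# The format is assumed from the listing in this thread.
UPGRADABLE_LINE = re.compile(
    r"^(?P<name>[^/\s]+)/\S+\s+(?P<new>\S+)\s+\S+\s+"
    r"\[upgradable from: (?P<old>[^\]]+)\]$"
)


def parse_upgradable(output):
    """Return {package: (installed_version, candidate_version)}.

    Hypothetical helper: 'output' is the text printed by
    'apt list --upgradable'. Lines that do not match the expected
    format (for example 'Listing... Done') are skipped.
    """
    pending = {}
    for line in output.splitlines():
        match = UPGRADABLE_LINE.match(line.strip())
        if match:
            pending[match["name"]] = (match["old"], match["new"])
    return pending
```

Running this on the output of each node and comparing the results would show, for example, that pve-cluster is pending on pve04 while pve-manager does not appear at all.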

Now the strange thing: if I try to migrate a VM from pve04 to pve06, I get the following error message:
Code:
root@pve04:~# qm migrate 400 pve06 --online --with-local-disks
no such cluster node 'pve06'
The web frontend on pve04 also doesn't show pve06, while the web frontend on pve06 shows all nodes!

The cluster is healthy:
Code:
root@pve04:~# pvecm status
Quorum information
------------------
Date:             Mon Apr 10 20:04:56 2017
Quorum provider:  corosync_votequorum
Nodes:            7
Node ID:          0x00000004
Ring ID:          1/1192
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   7
Highest expected: 7
Total votes:      7
Quorum:           4
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.1.2.11
0x00000002          1 10.1.2.12
0x00000003          1 10.1.2.13
0x00000004          1 10.1.2.14 (local)
0x00000005          1 10.1.2.15
0x00000006          1 10.1.2.16
0x00000007          1 10.1.2.17
and pve06 is in the hosts file:
Code:
root@pve04:~# grep pve06 /etc/hosts
10.1.2.16 pve06.sub.dom.net pve06

Due to strange effects on another cluster, where I updated a node with running VMs at the weekend (the load went up to 20-30), I would only upgrade pve04 after migrating all VMs to other nodes...

Any hints?

Udo
 
Hi Dietmar,
not on pve04:
Code:
root@pve04:~# qm migrate 634 pve06 --with-local-disks
no such cluster node 'pve06'
root@pve04:~# tail /var/log/syslog
Apr 10 21:10:06 pve04 snmpd[2931]: message repeated 3 times: [ Cannot statfs /var/lib/ceph/osd/ceph-3#012: Permission denied]
Apr 10 21:10:24 pve04 systemd-timesyncd[1580]: interval/delta/delay/jitter/drift 2048s/+0.000s/0.022s/0.000s/+20ppm
Apr 10 21:10:47 pve04 corosync[2952]:  [TOTEM ] Retransmit List: 3426b 3426c 3426d 3426e 3426f 34270 34271 34272
Apr 10 21:10:53 pve04 pmxcfs[2918]: [status] notice: received log
Apr 10 21:11:39 pve04 rrdcached[2788]: flushing old values
Apr 10 21:11:39 pve04 rrdcached[2788]: rotating journals
Apr 10 21:11:39 pve04 rrdcached[2788]: started new journal /var/lib/rrdcached/journal/rrd.journal.1491851499.082374
Apr 10 21:11:39 pve04 rrdcached[2788]: removing old journal /var/lib/rrdcached/journal/rrd.journal.1491844299.082338
Apr 10 21:12:02 pve04 postfix/smtpd[15689]: connect from localhost.localdomain[127.0.0.1]
Apr 10 21:12:02 pve04 postfix/smtpd[15689]: disconnect from localhost.localdomain[127.0.0.1]
and also nothing on pve06.

Udo
 
What is the output of

# cat /etc/pve/.members

Is it the same on all hosts? If not, does it help if you restart pve-cluster?
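
One possible way to do that comparison, as a minimal sketch: collect the contents of /etc/pve/.members from each host (how they are fetched is left open here) and report which node names each host does not know about. The helper name nodelist_diff is hypothetical, and the JSON layout is assumed to be the one shown later in this thread (a top-level "nodelist" object keyed by node name).

```python
import json


def nodelist_diff(members_by_host):
    """Report which nodes are missing from each host's view.

    Hypothetical helper. 'members_by_host' maps a host name to the
    text of that host's /etc/pve/.members file. Returns a dict that
    maps each host with an incomplete view to the sorted list of
    node names it lacks; hosts with a complete view are omitted.
    """
    views = {
        host: set(json.loads(text).get("nodelist", {}))
        for host, text in members_by_host.items()
    }
    # The union of all views is treated as the full membership.
    all_nodes = set().union(*views.values())
    return {
        host: sorted(all_nodes - seen)
        for host, seen in views.items()
        if all_nodes - seen
    }
```

An empty result would mean every host sees the same set of nodes; a non-empty one points at the hosts whose view is stale.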
 
Hi Dietmar,
unfortunately I was too fast last night: I moved all VMs to other cluster members, then upgraded and restarted this node.
And on all other nodes that are not updated yet, the migration works…

I will mark this thread as solved, even though it's no longer possible to find the cause.

Thanks

Udo
 
Hi again,
I had this issue again: I added two nodes; one works, and with the second I get the same error from one of the older nodes.
Code:
root@pve07:~# qm migrate 728 pve10 --online
no such cluster node 'pve10'
The .members file on pve07 is outdated:
Code:
{
"nodename": "pve07",
"version": 15,
"cluster": { "name": "pve-xyz", "version": 17, "nodes": 11, "quorate": 1 },
"nodelist": {
  "pve01": { "id": 1, "online": 1, "ip": "10.x.y.11"},
  "pve02": { "id": 2, "online": 1, "ip": "10.x.y.12"},
  "pve03": { "id": 3, "online": 1, "ip": "10.x.y.13"},
  "pve05": { "id": 5, "online": 1, "ip": "10.x.y.15"},
  "pve06": { "id": 6, "online": 1, "ip": "10.x.y.16"},
  "pve07": { "id": 7, "online": 1, "ip": "10.x.y.17"},
  "pve08": { "id": 8, "online": 1, "ip": "10.x.y.18"},
  "pve09": { "id": 9, "online": 1, "ip": "10.x.y.19"},
  "pve04": { "id": 4, "online": 1, "ip": "10.x.y.14"},
  "pve18": { "id": 18, "online": 1, "ip": "10.x.y.28"},
  "pve19": { "id": 19, "online": 1, "ip": "10.x.y.29"}
  }
}
After restarting corosync on pve07, this node also sees pve10 in /etc/pve/.members, but without an IP.
After then restarting corosync on pve10 as well, the IP is there too and the migration works.
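
The three states described here (node missing, node present without IP, node complete) could be checked before a migration with something like the following sketch. The function name migration_target_state and the returned labels are made up for illustration, and it is an assumption that "without IP" shows up as an absent or empty "ip" field.

```python
import json


def migration_target_state(members_text, target):
    """Classify how a node appears in a .members file.

    Hypothetical helper. 'members_text' is the content of
    /etc/pve/.members on the source node, 'target' is the node
    name a VM should be migrated to.
    """
    nodelist = json.loads(members_text).get("nodelist", {})
    entry = nodelist.get(target)
    if entry is None:
        # Matches the "no such cluster node" situation.
        return "missing"
    if not entry.get("ip"):
        # Assumption: a node "without IP" has no or an empty "ip" field.
        return "no-ip"
    if not entry.get("online"):
        return "offline"
    return "ok"
```

With the file shown above, asking for pve10 would give "missing", which is consistent with the error from qm migrate.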

Udo
 
