Hi,
this evening I upgraded Proxmox from 4.2 to 4.3-12/6894c9d9. I wish I had never done it...
After the "upgrade" (better to say destruction) to 4.3, Proxmox behaves like, well...
First of all: IS THERE ANY SAFE WAY TO DOWNGRADE TO 4.2 AND GET A WORKING CLUSTER AGAIN?
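For reference, this is the rollback direction I was considering; the version string below is only an illustration on my part, and I know downgrading PVE packages is generally unsupported, so please correct me if there is a better way:

# list what today's upgrade actually changed
grep "^Upgrade:" /var/log/apt/history.log

# roll back a single package to a specific older version,
# e.g. the QEMU package -- the version shown here is a guess
apt-get install pve-qemu-kvm=2.5-19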
I run a 3-node cluster with one small node acting as an arbiter for GlusterFS. Under Proxmox 4.3, most of the VMs DON'T EVEN FINISH BOOTING THE OS. A VM boots normally for a few seconds, maybe 10-15 (I can see Linux daemons starting, for example), BUT THEN IT TURNS OFF WITHIN A SECOND. For VMs under HA this repeats over and over; VMs without HA do it just once, of course. Some VMs start normally, some don't. I have backups of all the VMs, made daily or weekly, but none of the backups I restored made a VM run normally again. And the whole cluster ran, let's say, nicely until today's damn "upgrade".
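To at least stop the HA restart loop while debugging, I am disabling the HA state of the affected services for now (vm:100 below is just an example ID):

# show current HA service states
ha-manager status

# stop the restart loop for one VM without removing its HA config
ha-manager set vm:100 --state disabled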
GlusterFS says there is NO SPLIT-BRAIN... see below.
Some of the VMs were MikroTik RouterOS, and none of them is able to start again. Some were stopped during the upgrade, some were not, but it makes no difference. Restoring a MikroTik VM didn't help. EVEN INSTALLING A NEW MIKROTIK VM IS NOT POSSIBLE NOW: the VM boots from the ISO, but a few seconds into installing RouterOS the VM turns off, * * * !
I have two Windows XP VMs; neither of them starts.
In syslog and messages, no suspicious entries can be found.
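Besides syslog, these are the places I have been checking (VMID 100 is again just an example):

# exact versions of all PVE components after the upgrade
pveversion -v

# the full KVM command line Proxmox generates for a VM --
# useful for spotting changed defaults after the upgrade
qm showcmd 100

# HA manager logs since the upgrade
journalctl -u pve-ha-lrm -u pve-ha-crm --since today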
Some command outputs:
[root@Proxmox-1 log]$ pvecm status
Quorum information
------------------
Date: Wed Nov 30 00:24:09 2016
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 0x00000001
Ring ID: 1/280
Quorate: Yes
Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 3
Quorum: 2
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 192.168.170.100 (local)
0x00000002 1 192.168.170.102
0x00000003 1 192.168.170.120
[root@Proxmox-1 log]$ pvecm nodes
Membership information
----------------------
Nodeid Votes Name
1 1 Proxmox-1 (local)
2 1 Proxmox-2
3 1 Proxmox-quorum
[root@Proxmox-1 log]$ gluster volume heal gluster_volume_0 info
Brick 192.168.170.100:/export/pve-LV--pro--GlusterFS/brick
Status: Connected
Number of entries: 0
Brick 192.168.170.102:/export/pve-LV--pro--GlusterFS/brick
Status: Connected
Number of entries: 0
Brick 192.168.170.120:/export/pve-LV--pro--GlusterFS/brick
Status: Connected
Number of entries: 0
[root@Proxmox-1 log]$ gluster volume info
Volume Name: gluster_volume_0
Type: Replicate
Volume ID: 014b8ec6-9934-421a-ac33-0a75e884eaec
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.170.100:/export/pve-LV--pro--GlusterFS/brick
Brick2: 192.168.170.102:/export/pve-LV--pro--GlusterFS/brick
Brick3: 192.168.170.120:/export/pve-LV--pro--GlusterFS/brick (arbiter)
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
cluster.quorum-type: auto
[root@Proxmox-1 log]$ gluster volume heal gluster_volume_0 info split-brain
Brick 192.168.170.100:/export/pve-LV--pro--GlusterFS/brick
Status: Connected
Number of entries in split-brain: 0
Brick 192.168.170.102:/export/pve-LV--pro--GlusterFS/brick
Status: Connected
Number of entries in split-brain: 0
Brick 192.168.170.120:/export/pve-LV--pro--GlusterFS/brick
Status: Connected
Number of entries in split-brain: 0
[root@Proxmox-1 log]$ gluster peer status
Number of Peers: 2
Hostname: 192.168.170.120
Uuid: 00807e8e-c600-4025-bd3e-8b2a5c2ebbfd
State: Peer in Cluster (Connected)
Hostname: 192.168.170.102
Uuid: 8837af84-a446-44e3-bcd2-dc8d037a268e
State: Peer in Cluster (Connected)
[root@Proxmox-1 log]$ gluster volume status all detail
Status of volume: gluster_volume_0
------------------------------------------------------------------------------
Brick : Brick 192.168.170.100:/export/pve-LV--pro--GlusterFS/brick
TCP Port : 49152
RDMA Port : 0
Online : Y
Pid : 1818
File System : xfs
Device : /dev/mapper/pve-LV--pro--GlusterFS
Mount Options : rw,relatime,attr2,inode64,noquota
Inode Size : 512
Disk Space Free : 575.2GB
Total Disk Space : 817.0GB
Inode Count : 428529664
Free Inodes : 428529287
------------------------------------------------------------------------------
Brick : Brick 192.168.170.102:/export/pve-LV--pro--GlusterFS/brick
TCP Port : 49152
RDMA Port : 0
Online : Y
Pid : 1689
File System : xfs
Device : /dev/mapper/pve-LV--pro--GlusterFS
Mount Options : rw,relatime,attr2,inode64,noquota
Inode Size : 512
Disk Space Free : 576.4GB
Total Disk Space : 817.0GB
Inode Count : 428529664
Free Inodes : 428529287
------------------------------------------------------------------------------
Brick : Brick 192.168.170.120:/export/pve-LV--pro--GlusterFS/brick
TCP Port : 49152
RDMA Port : 0
Online : Y
Pid : 1613
File System : xfs
Device : /dev/mapper/pve-LV--pro--GlusterFS
Mount Options : rw,relatime,attr2,inode64,noquota
Inode Size : 512
Disk Space Free : 39.9GB
Total Disk Space : 40.0GB
Inode Count : 20971520
Free Inodes : 20971143
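Since all the VM disks live on the Gluster volume, I am also watching the Gluster client log for I/O errors at the moment a VM dies. The log file name is derived from the mount point, so the exact paths below are assumptions for my layout:

# FUSE client log for the volume mount (name mirrors the mount point path)
tail -f /var/log/glusterfs/mnt-pve-gluster_volume_0.log

# brick log on each node
tail -f /var/log/glusterfs/bricks/export-pve-LV--pro--GlusterFS-brick.log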