Proxmox / GlusterFS: method of operation

behr

New Member
May 11, 2022
Hello everyone,

I have two Proxmox servers with GlusterFS storage. The storage is provided by two GlusterFS servers (mode: replicate). Currently the mirror link between them is offline, but the VMs on Proxmox are still running without any problems. Using nload I saw that plenty of data is being written on both servers, so Proxmox seems to write to both GlusterFS hosts.

I am concerned that if I bring the mirror link between the two GlusterFS servers back up, I will end up in a split-brain, since Proxmox seems to write to both servers.

Am I correct in assuming that Proxmox writes the data simultaneously to both GlusterFS servers?

Does anybody have an idea how to resolve the problem between the two GlusterFS hosts and Proxmox without getting into a split-brain, or how to check whether I am in a split-brain situation right now?

Thanks in advance
behr
 
Am I correct in assuming that Proxmox writes the data simultaneously to both GlusterFS servers?
The GlusterFS servers are not managed by Proxmox VE; you need to handle all of this yourself.

(Just to note, Ceph is fully managed and supported by Proxmox VE)
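(For context, on the PVE side a GlusterFS storage is just an entry in /etc/pve/storage.cfg, roughly like the sketch below; the storage name, addresses and volume name are placeholders. PVE mounts the volume as a Gluster client, and the replication between the two servers is handled entirely by GlusterFS.)

Code:
glusterfs: mygluster
        server 10.0.0.1
        server2 10.0.0.2
        volume gv0
        content images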
 
Does anybody have an idea how to resolve the problem between the two GlusterFS hosts and Proxmox without getting into a split-brain, or how to check whether I am in a split-brain situation right now?
I think it's pretty safe to say that you are already in a split-brain situation...
Google "glusterfs split brain" - the first two results seem to be a good starting point.


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
I think it's pretty safe to say that you are already in a split-brain situation...
Google "glusterfs split brain" - the first two results seem to be a good starting point.


I also think so, but there is some strange behaviour: if I create a VM on one of the Proxmox servers, the VM is replicated to both of the GlusterFS storages even though they don't have a mirror route. Can you somehow explain this behaviour?
 
Can you somehow explain this behaviour?
No, I am not an expert or even a casual user of GlusterFS. My impression of the situation is based solely on the information you provided in the OP.
As @tom said - PVE is just a consumer of the filesystem in this case. PVE has no awareness of whether it's GlusterFS or NTFS. You should seek help on GlusterFS-specific resources.

P.S. If you create a file on one side and it appears on the other side, then perhaps the mirror link is not offline...
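If you want to verify that, a rough sketch (the mount point and brick path below are placeholders for your setup):

Code:
# on a Proxmox node: write a test file through the mounted GlusterFS storage
# (/mnt/pve/glusterstore is a placeholder for your storage ID)
touch /mnt/pve/glusterstore/link-test

# on each GlusterFS server: look into the brick directory directly
# (/data/brick1 is a placeholder for your brick path)
ls -l /data/brick1/link-test

# on the GlusterFS servers: check whether the peers still see each other
gluster peer status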


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
No, I am not an expert or even a casual user of GlusterFS. My impression of the situation is based solely on the information you provided in the OP.
As @tom said - PVE is just a consumer of the filesystem in this case. PVE has no awareness of whether it's GlusterFS or NTFS. You should seek help on GlusterFS-specific resources.

P.S. If you create a file on one side and it appears on the other side, then perhaps the mirror link is not offline...


Okay, thanks anyway for your help.

P.S. If you create a file on one side and it appears on the other side, then perhaps the mirror link is not offline...
But I can't ping the servers on the mirror interface. It doesn't work in either direction.
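(Just as a side note: blocked ICMP does not necessarily mean the GlusterFS ports are unreachable. A quick check over the mirror interface could look like this, with the placeholder replaced by the other server's mirror address:)

Code:
# glusterd management port
nc -vz <mirror-ip-of-other-server> 24007

# first brick port (49152 by default; see "gluster volume status" for the actual ports)
nc -vz <mirror-ip-of-other-server> 49152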
 
If you have only two servers for GlusterFS and one server is down, or the communication between the two servers is down, you are in split-brain.

To check for the split-brain, try this: gluster volume status

You will see something like this; there should be a "Y" in the Online column on every line:

Code:
root@p1:~# gluster volume status
Status of volume: GlusterEMMC
Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.10.5.91:/Data/GlusterEMMC/Brick1    49152     0          Y       1197
Brick 10.10.5.92:/Data/GlusterEMMC/Brick1    49152     0          Y       3757
Brick 10.10.5.93:/Data/GlusterEMMC/Brick1_a  49152     0          Y       895
Brick 10.10.5.91:/Data/GlusterEMMC/Brick2    49153     0          Y       1198
Brick 10.10.5.93:/Data/GlusterEMMC/Brick2    49153     0          Y       909
Brick 10.10.5.92:/Data/GlusterEMMC/Brick2_a  49153     0          Y       3766
Brick 10.10.5.93:/Data/GlusterEMMC/Brick3    49154     0          Y       918
Brick 10.10.5.92:/Data/GlusterEMMC/Brick3    49154     0          Y       3775
Brick 10.10.5.91:/Data/GlusterEMMC/Brick3_a  49154     0          Y       1214
Self-heal Daemon on localhost                N/A       N/A        Y       1248
Self-heal Daemon on 10.10.5.93               N/A       N/A        Y       1025
Self-heal Daemon on 10.10.5.92               N/A       N/A        Y       3811

Task Status of Volume GlusterEMMC
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: GlusterSSD
Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.10.5.93:/Data/GlusterSSD/Brick1     49155     0          Y       971
Brick 10.10.5.92:/Data/GlusterSSD/Brick1     49155     0          Y       3789
Brick 10.10.5.91:/Data/GlusterSSD/Brick1     49155     0          Y       1226
Self-heal Daemon on localhost                N/A       N/A        Y       1248
Self-heal Daemon on 10.10.5.92               N/A       N/A        Y       3811
Self-heal Daemon on 10.10.5.93               N/A       N/A        Y       1025

Task Status of Volume GlusterSSD
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: SSDinterne
Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.10.5.92:/Data/SSDinterne/brick1     49156     0          Y       3800
Brick 10.10.5.93:/Data/SSDinterne/brick1     49156     0          Y       982
Brick 10.10.5.91:/Data/SSDinterne/brick1_a   49156     0          Y       1237
Self-heal Daemon on localhost                N/A       N/A        Y       1248
Self-heal Daemon on 10.10.5.92               N/A       N/A        Y       3811
Self-heal Daemon on 10.10.5.93               N/A       N/A        Y       1025

Task Status of Volume SSDinterne
------------------------------------------------------------------------------
There are no active volume tasks

And with this command: gluster volume heal name_of_the_volume info

Code:
root@p1:~# gluster volume heal SSDinterne info
Brick 10.10.5.92:/Data/SSDinterne/brick1
Status: Connected
Number of entries: 0

Brick 10.10.5.93:/Data/SSDinterne/brick1
Status: Connected
Number of entries: 0

Brick 10.10.5.91:/Data/SSDinterne/brick1_a
Status: Connected
Number of entries: 0

You should have "0" everywhere. If not, you have a problem.
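If the count is not 0, the affected files can be listed and, in recent GlusterFS versions, resolved per file from the CLI. A rough sketch (the volume name SSDinterne and the file path are only examples; check the GlusterFS split-brain documentation before running anything):

Code:
# list only the entries that are in split-brain
gluster volume heal SSDinterne info split-brain

# resolve a single file by policy, e.g. keep the copy with the newest mtime
# (/path/inside/volume/disk.qcow2 is a placeholder)
gluster volume heal SSDinterne split-brain latest-mtime /path/inside/volume/disk.qcow2

# or declare one brick the source of truth for that file
gluster volume heal SSDinterne split-brain source-brick 10.10.5.92:/Data/SSDinterne/brick1 /path/inside/volume/disk.qcow2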

dark26
 
