GlusterFS problems with Proxmox 3.4

copymaster

Hi!

Just built a new cluster with 3 nodes (Proxmox 3.4).
I set up GlusterFS on 2 of the nodes and configured a replica 2 volume.
So far so good.
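
For reference, the volume was created roughly like this (run from the first node; the brick paths are only placeholders for the RAID mount points):

# from 192.168.0.80: add the second node and create the replica 2 volume
gluster peer probe 192.168.0.81
gluster volume create proxvolume replica 2 192.168.0.80:/data/brick1 192.168.0.81:/data/brick1
gluster volume start proxvolume
# check that both bricks are listed and the volume is started
gluster volume info proxvolume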

I added the GlusterFS storage in the web UI and also connected an NFS share from a NAS.
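
If it matters, the CLI equivalent of what I clicked in the web UI should be roughly this (storage ID and content type are my assumptions):

# add the gluster volume as Proxmox storage, then check that it shows up
pvesm add glusterfs proxvolume --server 192.168.0.80 --volume proxvolume --content images
pvesm status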

Then I wanted to restore a backup (from the NAS) to the GlusterFS storage.

I got plenty of error lines saying that all subvolumes are down, and I cannot tell whether the VM was restored correctly.

So I decided to add the gluster.org repository and upgraded the GlusterFS packages to 3.6.2.

GlusterFS itself is running fine, but in the Proxmox web UI the storage shows up empty, and when I try to use it I get a mount error in the web UI.
The directory /mnt/pve/<glustervolumename> exists, but nothing is mounted there, hence no data is available.
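
To check, something like this should show whether the volume is really mounted and what the mount itself complains about (storage name as placeholder; the manual mount is only my attempt at what I understand Proxmox does internally):

# is anything actually mounted at the storage directory?
mount | grep /mnt/pve/<glustervolumename>
# try the fuse mount by hand to see the real error
mount -t glusterfs 192.168.0.80:/proxvolume /mnt/pve/<glustervolumename>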

Then I deleted the storage in the web UI and reconfigured it.
The storage is shown under the cluster nodes, but there is no data, and the Gluster volume still does not get mounted under /mnt/pve/.

So my questions are:

a) Are there updated GlusterFS packages for Proxmox where this error is not thrown?
b) How can I fix the mount error? Do I need to downgrade the GlusterFS packages to the Proxmox version 3.5.2? And how would I do that when I already have 3.6.2 installed?

Thank you
 
I just started from scratch, reinstalled the 3 servers with Proxmox 3.4 and updated all nodes via the terminal.
Then I built a cluster and installed glusterfs-server (3.5.2-1) via a wget command.

Two of the nodes have an 18 TB RAID 5 which is now used for Gluster. After configuring it, Gluster is working fine.

Then I used the web interface to add this Gluster storage to Proxmox. No error so far. After this I copied a backup .lzo file to the server and tried to restore it via the web UI.
This is the output:

restore vma archive: lzop -d -c /mnt/pve/proxbackup1/dump/vzdump-qemu-123-2015_03_03-20_56_15.vma.lzo|vma extract -v -r /var/tmp/vzdumptmp52590.fifo - /var/tmp/vzdumptmp52590
CFG: size: 814 name: qemu-server.conf
DEV: dev_id=1 size: 53687091200 devname: drive-ide0
DEV: dev_id=2 size: 1099511627776 devname: drive-sata0
DEV: dev_id=3 size: 1099511627776 devname: drive-sata1
CTIME: Tue Mar 3 20:56:15 2015
[2015-03-06 04:07:56.457845] I [client.c:2229:client_rpc_notify] 0-proxvolume-client-1: disconnected from 192.168.0.81:49152. Client process will keep trying to connect to glusterd until brick's port is available
[2015-03-06 04:07:56.457970] E [afr-common.c:4168:afr_notify] 0-proxvolume-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2015-03-06 04:07:56.641426] E [afr-common.c:4168:afr_notify] 0-proxvolume-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2015-03-06 04:07:56.993414] E [afr-common.c:4168:afr_notify] 0-proxvolume-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
Formatting 'gluster://192.168.0.80/proxvolume/images/123/vm-123-disk-1.qcow2', fmt=qcow2 size=53687091200 encryption=off cluster_size=65536 preallocation='metadata' lazy_refcounts=off
[2015-03-06 04:07:57.127885] E [afr-common.c:4168:afr_notify] 0-proxvolume-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
new volume ID is 'proxvolume:123/vm-123-disk-1.qcow2'
map 'drive-ide0' to 'gluster://192.168.0.80/proxvolume/images/123/vm-123-disk-1.qcow2' (write zeros = 0)
[2015-03-06 04:07:57.337287] I [client.c:2229:client_rpc_notify] 0-proxvolume-client-0: disconnected from 192.168.0.80:49152. Client process will keep trying to connect to glusterd until brick's port is available
[2015-03-06 04:07:57.337456] E [afr-common.c:4168:afr_notify] 0-proxvolume-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2015-03-06 04:07:57.497248] I [client.c:2229:client_rpc_notify] 0-proxvolume-client-1: disconnected from 192.168.0.81:49152. Client process will keep trying to connect to glusterd until brick's port is available
[2015-03-06 04:07:57.497357] E [afr-common.c:4168:afr_notify] 0-proxvolume-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2015-03-06 04:08:02.161368] I [client.c:2229:client_rpc_notify] 0-proxvolume-client-1: disconnected from 192.168.0.81:49152. Client process will keep trying to connect to glusterd until brick's port is available
[2015-03-06 04:08:02.161456] E [afr-common.c:4168:afr_notify] 0-proxvolume-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
Formatting 'gluster://192.168.0.80/proxvolume/images/123/vm-123-disk-2.qcow2', fmt=qcow2 size=1099511627776 encryption=off cluster_size=65536 preallocation='metadata' lazy_refcounts=off
[2015-03-06 04:08:02.297363] E [afr-common.c:4168:afr_notify] 0-proxvolume-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
new volume ID is 'proxvolume:123/vm-123-disk-2.qcow2'
map 'drive-sata0' to 'gluster://192.168.0.80/proxvolume/images/123/vm-123-disk-2.qcow2' (write zeros = 0)
[2015-03-06 04:08:02.468172] I [client.c:2229:client_rpc_notify] 0-proxvolume-client-1: disconnected from 192.168.0.81:49152. Client process will keep trying to connect to glusterd until brick's port is available
[2015-03-06 04:08:02.468255] E [afr-common.c:4168:afr_notify] 0-proxvolume-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2015-03-06 04:08:02.616702] E [afr-common.c:4168:afr_notify] 0-proxvolume-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2015-03-06 04:08:07.147113] I [client.c:2229:client_rpc_notify] 0-proxvolume-client-1: disconnected from 192.168.0.81:49152. Client process will keep trying to connect to glusterd until brick's port is available
[2015-03-06 04:08:07.147222] E [afr-common.c:4168:afr_notify] 0-proxvolume-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
Formatting 'gluster://192.168.0.80/proxvolume/images/123/vm-123-disk-3.qcow2', fmt=qcow2 size=1099511627776 encryption=off cluster_size=65536 preallocation='metadata' lazy_refcounts=off
[2015-03-06 04:08:07.279445] E [afr-common.c:4168:afr_notify] 0-proxvolume-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
new volume ID is 'proxvolume:123/vm-123-disk-3.qcow2'
map 'drive-sata1' to 'gluster://192.168.0.80/proxvolume/images/123/vm-123-disk-3.qcow2' (write zeros = 0)
progress 1% (read 22527148032 bytes, duration 317 sec)
 
[2015-03-06 04:07:56.457845] I [client.c:2229:client_rpc_notify] 0-proxvolume-client-1: disconnected from 192.168.0.81:49152. Client process will keep trying to connect to glusterd until brick's port is available
[2015-03-06 04:07:56.457970] E [afr-common.c:4168:afr_notify] 0-proxvolume-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.

This looks like a GlusterFS problem, nothing related to Proxmox. You should be able to reproduce that problem by writing a large chunk of data to the mounted GlusterFS?
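
For example something like this (the path is just an example for your gluster storage):

# write ~10 GB straight to the fuse mount and watch for the same errors
dd if=/dev/zero of=/mnt/pve/proxvolume/testfile bs=1M count=10240 conv=fsync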
 
Hi Dietmar.

I don't think it is GlusterFS related, because before doing a restore I copied 4 TB(!) of VM backups to the GlusterFS volume from an NFS-mounted store (old file server) with no error messages.

I only get the errors when using the web UI to restore the backups, or from the command line using the qmrestore command.
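
For example, the command-line restore that triggers it looks roughly like this (archive and VMID taken from the log above):

qmrestore /mnt/pve/proxbackup1/dump/vzdump-qemu-123-2015_03_03-20_56_15.vma.lzo 123 --storage proxvolume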

For your information, I also thought at first that it could be Gluster related, so I used the original gluster.org repo and installed the Gluster packages from there (3.6.2).
But with this version the Proxmox web UI is not able to mount any Gluster file system.
It is shown in the web interface under the nodes, but it is not mounted and thus not usable.

So I reinstalled all servers and tested with several RAID controller configurations to see whether it is LSI RAID related. But the error remains.

I am also able to build a new VM from the web interface with really big hard disks (2x 1 TB) with no error.

The errors only appear when using qmrestore.

The virtual machines are restored OK and they are fully working.

When doing a backup of the restored VMs, there's no error either.

I am nearly sure it must be something with qmrestore; it seems like the process is not waiting long enough for GlusterFS to become available.

As you can see in the log I posted earlier, for the FIRST VM there are several error lines. But when I restore a second server, the error line is printed only once per virtual HDD, like this:

restore vma archive: lzop -d -c /mnt/pve/backup1/dump/vzdump-qemu-121-2015_03_07-01_04_29.vma.lzo|vma extract -v -r /var/tmp/vzdumptmp383547.fifo - /var/tmp/vzdumptmp383547
CFG: size: 385 name: qemu-server.conf
DEV: dev_id=1 size: 55834574848 devname: drive-ide0
DEV: dev_id=2 size: 64424509440 devname: drive-ide1
CTIME: Sat Mar 7 01:04:30 2015
Formatting 'gluster://192.168.0.80/proxvolume/images/121/vm-121-disk-1.raw', fmt=raw size=55834574848
[2015-03-10 03:55:45.551331] E [afr-common.c:4168:afr_notify] 0-proxvolume-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
new volume ID is 'proxvolume:121/vm-121-disk-1.raw'
map 'drive-ide0' to 'gluster://192.168.0.80/proxvolume/images/121/vm-121-disk-1.raw' (write zeros = 0)
Formatting 'gluster://192.168.0.80/proxvolume/images/121/vm-121-disk-2.raw', fmt=raw size=64424509440
[2015-03-10 03:55:45.720353] E [afr-common.c:4168:afr_notify] 0-proxvolume-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
new volume ID is 'proxvolume:121/vm-121-disk-2.raw'
map 'drive-ide1' to 'gluster://192.168.0.80/proxvolume/images/121/vm-121-disk-2.raw' (write zeros = 0)
progress 1% (read 1202651136 bytes, duration 32 sec)
progress 2% (read 2405236736 bytes, duration 65 sec)
progress 3% (read 3607822336 bytes, duration 85 sec)


For testing, would it be possible for you to provide up-to-date Gluster packages (3.6.2) that are usable from the web UI? I can test whether this version works from the web UI.
But with the current packages from gluster.org I cannot use GlusterFS from within the web UI.

I use a really big GlusterFS replica 2 setup: on two of the servers I have an 18 TB file store for Gluster (4x 6 TB HDDs in a RAID 5).

According to
http://trac.lliurex.net/pandora/bro...xlators/cluster/afr/src/afr-common.c?rev=6049
it seems that the routine which throws this error was commented out?
So I think it is likely that the error will be gone in the current version of Gluster.

Can someone please give me a hint on how to install the latest Gluster packages so that they work with Proxmox? The web UI should mount the Gluster volume correctly.
 
Hi Dietmar.

I used the binary packages from gluster.org for wheezy. Gluster is working well, but the Proxmox web interface does not seem to be able to mount the volumes.
The volumes are shown, but not mounted, and I don't know how to debug this.
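
The only things I could think of checking are roughly these (storage ID is a placeholder, the log path is my guess based on the mount point):

# confirm that nothing is mounted at the storage directory
findmnt /mnt/pve/proxvolume
# the fuse client log should contain the reason for the failed mount
tail -n 50 /var/log/glusterfs/mnt-pve-proxvolume.log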

From the command line, Gluster works very well with 3.6.2 from gluster.org; it seems only the Proxmox web interface cannot use it.
Do you have any ideas about this?
 
Some news for all who are interested:

I installed Proxmox 3.4 from scratch,
then edited the repos to use GlusterFS from gluster.org.

You can do:
wget -O - http://download.gluster.org/pub/gluster/glusterfs/3.6/3.6.2/Debian/wheezy/pubkey.gpg | apt-key add -
then
echo deb http://download.gluster.org/pub/gluster/glusterfs/3.6/3.6.2/Debian/wheezy/apt wheezy main > /etc/apt/sources.list.d/gluster.list
and afterwards
apt-get update && apt-get upgrade

Now the installed Gluster packages are at 3.6.2.
If you need glusterfs-server, just install it via
apt-get install glusterfs-server

Now you have a current Gluster version to use.
After configuring Gluster as you like, build the Proxmox cluster as usual.

Now it should be possible to mount the storage, and the errors are gone!
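
To verify, a quick check like this should show the right version and a mounted, usable storage (I assume the storage ID proxvolume here, use whatever you named it in the web UI):

# confirm the client version and that proxmox mounted the volume
glusterfs --version
pvesm status
ls /mnt/pve/proxvolume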

Thank you for spending your time reading this!
 
Yes, it does.

The problem was: when the shipped Gluster is already configured and in use, there seem to be problems when you then upgrade to the gluster.org packages.

One should FIRST of all add the Gluster repository, upgrade Gluster and install glusterfs-server, before doing anything else.
 
