[SOLVED] Proxmox and GlusterFS

RRJ

Hello, dear community.

I've been using Proxmox for a while (a total of 5 years :rolleyes: at the moment) and have now decided to attach GlusterFS storage to my Proxmox nodes.

GlusterFS configuration: a replicated volume of two bricks on different servers. The servers are connected over a 10G network and use LSI hardware RAID.
GlusterFS storage configuration on Proxmox: mounted using the Proxmox GUI.
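For reference, a replicated two-brick volume like this is usually created along these lines (a sketch only: the brick paths and the second server name "stor2" are assumptions, since only stor1 and the volume name appear in my config below):
Code:
# run on one of the Gluster servers; brick paths and "stor2" are assumed
gluster volume create HA-Proxmox-TT-fast-150G replica 2 \
    stor1:/export/fast-150G stor2:/export/fast-150G
gluster volume start HA-Proxmox-TT-fast-150G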
The problem:
after restarting the server that was used in the Proxmox GUI to add the GlusterFS volume, the guest qemu VM drops these messages into its logs:
Code:
[ 3462.988707] end_request: I/O error, dev vda, sector 25458896
[ 3462.989474] end_request: I/O error, dev vda, sector 25458896
[ 3763.435225] end_request: I/O error, dev vda, sector 25458896
[ 3987.913744] end_request: I/O error, dev vda, sector 26304696
[ 3987.917413] end_request: I/O error, dev vda, sector 26304720
[ 3987.917716] end_request: I/O error, dev vda, sector 26304752
[ 3987.917728] end_request: I/O error, dev vda, sector 26304792
[ 3987.917728] end_request: I/O error, dev vda, sector 26304848
[ 3987.917728] end_request: I/O error, dev vda, sector 26304880
[ 3987.917728] end_request: I/O error, dev vda, sector 26304896
[ 3987.917728] end_request: I/O error, dev vda, sector 26304912
[ 3987.917728] end_request: I/O error, dev vda, sector 26304960
[ 3987.917728] end_request: I/O error, dev vda, sector 26304992
[ 3987.917728] end_request: I/O error, dev vda, sector 26297448
[ 3987.917728] end_request: I/O error, dev vda, sector 26297408
[ 3987.917728] end_request: I/O error, dev vda, sector 26297384
[ 3987.917728] end_request: I/O error, dev vda, sector 26297312
[ 3987.917728] end_request: I/O error, dev vda, sector 26297272
[ 3987.921830] end_request: I/O error, dev vda, sector 26304696
[ 3997.914129] end_request: I/O error, dev vda, sector 17097768
[ 3997.914982] end_request: I/O error, dev vda, sector 17097768
[ 3997.915640] end_request: I/O error, dev vda, sector 17097768

and acts like this:

Code:
cat: /var/log/syslog: Input/output error
-bash: /sbin/halt: Input/output error

I know that the GlusterFS client basically needs only one server to fetch the volume configuration file from, but it seems like Proxmox doesn't know about the second server?

storage config:
Code:
glusterfs: FAST-HA-150G
        volume HA-Proxmox-TT-fast-150G
        path /mnt/pve/FAST-HA-150G
        content images,rootdir
        server stor1
        nodes pve1
        maxfiles 1
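For reference, the GUI-managed definition above roughly corresponds to a manual FUSE mount like this (a sketch; the exact options Proxmox passes may differ, and qemu itself may also access the images via libgfapi):
Code:
mount -t glusterfs stor1:/HA-Proxmox-TT-fast-150G /mnt/pve/FAST-HA-150G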

vmconfig:
Code:
#debian7
bootdisk: virtio0
cores: 2
cpu: host
ide2: none,media=cdrom
memory: 512
name: cacti
net0: virtio=42:01:8D:5A:2C:6C,bridge=vmbr0
onboot: 1
ostype: l26
sockets: 1
virtio0: FAST-HA-150G:116/vm-116-disk-1.raw,size=17G
and pveversion:
Code:
root@pve1:~# pveversion -v
proxmox-ve-2.6.32: 3.2-132 (running kernel: 2.6.32-29-pve)
pve-manager: 3.2-4 (running version: 3.2-4/e24a91c1)
pve-kernel-2.6.32-20-pve: 2.6.32-100
pve-kernel-2.6.32-22-pve: 2.6.32-107
pve-kernel-2.6.32-29-pve: 2.6.32-126
pve-kernel-2.6.32-31-pve: 2.6.32-132
pve-kernel-2.6.32-26-pve: 2.6.32-114
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.5-1
pve-cluster: 3.0-12
qemu-server: 3.1-16
pve-firmware: 1.1-3
libpve-common-perl: 3.0-18
libpve-access-control: 3.0-11
libpve-storage-perl: 3.0-19
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-6
vzctl: 4.0-1pve5
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.7-8
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.2-1

The GlusterFS server version is 3.4.4.



I was looking for answers on the forums, Google and the wiki, but was not able to find one.

This page says that one has to mount the volume manually in order to use both servers: http://www.jamescoyle.net/how-to/533-glusterfs-storage-mount-in-proxmox

Could someone shed some light on this problem?

[Added later]
It seems like someone has the same problem:
http://forum.proxmox.com/threads/18221-Gluster-KVM-problem-on-gluster-node-reboot

The fix is here:
http://forum.proxmox.com/threads/19058-SOLVED-Proxmox-and-GlusterFS?p=98280#post98280
 
Re: Proxmox and GlusterFS

Seems like the problem was that I didn't give the GlusterFS servers enough time to sync before restarting the other server :).

So it seems like everything is working as expected.
 
Re: Proxmox and GlusterFS

Oh, that's not true.
It seems that after the previous restart the VM was actually being served from GlusterFS server2, so the server1 restart did not affect it. This time, before restarting server2, I checked whether there were any sync jobs pending. There were none, so I restarted the second Gluster server, and the result is the same.
Code:
cat: /var/log/kern.log: Input/output error
So Proxmox VE does see the second Gluster peer, but it seems like the VM can't fail over to it?
Any ideas?

More information:
After the VM restart, fsck did its job and cleaned the root FS.
 
Re: Proxmox and GlusterFS

He-he,
and when I manually mount the GlusterFS volume on the VE host and add the new storage as a directory in the VE GUI, I run into the cache=none problem, and the workaround is to manually edit the qemu config file and again set the path with only one Gluster URI :)

Do I understand correctly that it is not possible to run a KVM guest on a redundant GlusterFS volume in such a way that it would automagically fail over when one of the servers is down?
 
Re: Proxmox and GlusterFS

Yes, it seems it is not possible without problems.
http://joejulian.name/blog/keeping-...hen-encountering-a-ping-timeout-in-glusterfs/

The only uses of HA NFS or GlusterFS storage for VMs are:
1. a full Proxmox HA cluster (which is overkill for most small enterprises);
2. fast recovery after one of the storage servers goes down (one may remount the GFS storage from the second storage server and start the machines in the state they were in before the first storage server went down).

So basically, the main purpose of GFS is distributed storage.

Don't forget about backups, backups, backups.
 
Re: Proxmox and GlusterFS

What about using the "backupvolfile-server=" mount option?

Code:
gluster1:/datastore /mnt/datastore glusterfs defaults,_netdev,backupvolfile-server=gluster2 0 0




And you can change Gluster's ping-timeout... See
http://gluster.org/community/docume...:_Setting_Volume_Options#network.ping-timeout
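For example (a sketch using the volume name from the first post; choose a timeout that suits your environment):
Code:
gluster volume set HA-Proxmox-TT-fast-150G network.ping-timeout 10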

-Josh

Hi joshin, and thank you for your time.

If I mount GlusterFS via fstab, then I can't start any KVM guest with cache=none, as GlusterFS does not support direct I/O :). I have to choose some other cache method and lose performance, and it is not the default setting either. And there is actually no need to do so, because Proxmox does see both servers when I mount GlusterFS via the GUI: it downloads the brick configuration file from the server and knows where the second one is. By the way, I have tried this with cache=writethrough too. Same result. It just won't go. After one HA storage node fails, one has to manually start the VMs from the other HA storage node. At least this way I don't need to remount the GlusterFS storage via the GUI using the second HA server's name in order to start the VM :)
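For reference, setting a cache mode on the disk looks roughly like this in the VM config (a sketch based on the vmconfig posted above; writethrough is just one possible mode):
Code:
# the virtio0 line from the VM config above, with an explicit cache mode added
virtio0: FAST-HA-150G:116/vm-116-disk-1.raw,cache=writethrough,size=17G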

I changed the ping-timeout to 3 seconds. No luck.

Another way I've tried to solve this:

I installed a third machine with the GlusterFS volume mounted on it and shared it to Proxmox via NFS. The result is the same: if one GlusterFS server fails (the one the VM is running from, of course), the VM remounts its root FS read-only and keeps crying about I/O errors :)

So... At this moment the only point of any HA storage for me is a reasonably fast restore of failed VMs (remount the storage or restart the VM, depending on the method one is using). Pretty much the same is said in the Proxmox wiki about DRBD: https://pve.proxmox.com/wiki/Two-Node_High_Availability_Cluster

For this testing configuration, two DRBD resources were created, one for VM images and another one for VM user data. Thanks to DRBD (if properly configured), a mirror RAID is created through the network (be aware that, although possible, using WANs would mean high latencies). As VMs and data are replicated synchronously on both nodes, if one of them fails, it will be possible to restart "dead" machines on the other node without data loss.

And I've found this one. Pretty interesting.

http://supercolony.gluster.org/pipermail/gluster-users/2014-April/039959.html

It is also interesting that there are no replies from the Proxmox team here. At this moment, this behavior looks like a bug to me.

I updated the Proxmox Gluster client to version 3.4.4 (matching the servers) from the Gluster Debian repo. Still no luck.
It seems like GlusterFS and Proxmox do not provide an HA storage model :(

By the way: after the VM locks its FS read-only (following the failure of one of the GlusterFS nodes), it won't come up by simply resetting it via the GUI. I have to stop it and start it! This looks like the initialization problem mentioned earlier on the Gluster forums. In the console I see "no boot device!" even though the downed Gluster node is back up and synced.
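In other words, only a full stop/start cycle brings the VM back, for example (a sketch using the VM ID from the config above; the GUI stop/start buttons do the same):
Code:
qm stop 116
qm start 116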


My next step was:
I created a VM on local storage, added a virtio disk from the Gluster storage via the VE GUI and mounted it. Then I rebooted the Gluster node this disk was served from. Same situation: end_request: I/O error. While trying to remount, it says:
mount: special device UUID=a60204f8-b9ac-472f-ba10-0930db1f626b does not exist
and the VM sees it again only after a full restart and after the Gluster storage is brought back (or after changing the Gluster node name in the GUI).

The next step was:
a. mount GlusterFS locally on the VE host
b. set cache=writethrough
c. boot the previously created local VM with the Gluster storage attached via VE
d. disconnect one of the Gluster servers from the network
Result: the same.
Code:
Buffer I/O error on device vdb1,
root@gluster-vm-local-test:~# umount /mnt/stor
root@gluster-vm-local-test:~# mount -a
mount: special device UUID=a60204f8-b9ac-472f-ba10-0930db1f626b does not exist
After using the reset button:
Code:
mount: special device UUID=a60204f8-b9ac-472f-ba10-0930db1f626b does not exist
After using the stop/start buttons: I got my VM back and didn't have to remount the Gluster storage with the second Gluster node's IP/hostname (thanks to the backupvolfile-server option in fstab).
 
Re: Proxmox and GlusterFS

So two conclusions come out of my tests:

1. GlusterFS and KVM = no go for live failover. VMs keep locking their FS read-only after the GlusterFS node they were booted from goes down (workaround: errors=continue in the guest's fstab, but that is dangerous). To get them back I have to stop the VM, change the GlusterFS node name to one that is UP, and start it again. Reset does not work either; after a reset it says that there is no such storage attached. Of course, the Proxmox GUI allows adding only one Gluster server. Setting ping-timeout to 2 seconds on the Gluster volumes does not help either. Someone should take care of this. It's definitely an initialization BUG, and it seems to be in libgfapi or KVM.
2. Using the Proxmox GUI to add GlusterFS storage, one has to manually remount it with the second Gluster node's name in the GUI to get the machines back if the other node fails. Mounting GlusterFS from fstab with the backupvolfile-server option works like a charm (same when using a Gluster config file in fstab); see the sketch below. The Proxmox team should fix this.
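A sketch of the host fstab entry mentioned in point 2, adapted to the names in this thread ("stor2" and the mount point are assumptions; only stor1 appears in my storage config):
Code:
# /etc/fstab on the Proxmox host; stor2 is an assumed name for the second Gluster server
stor1:/HA-Proxmox-TT-fast-150G /mnt/gluster-fast glusterfs defaults,_netdev,backupvolfile-server=stor2 0 0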
 
Re: Proxmox and GlusterFS

Hey,

I've almost broken my brain.
Has anyone ever managed to set up GlusterFS replicated storage in such a way that it works with automatic failover?

I mean this situation:
1. install a replicated storage
2. mount it in Proxmox
3. create a VM on this storage
4. shut one of the Gluster bricks down
5. the created VM keeps running
6. power the downed brick back on
7. wait for both bricks to sync
8. shut the other brick down
9. the VM keeps running

Anyone? Please share your setup configuration.
 
Re: Proxmox and GlusterFS

If someone ever finds themselves in the same situation, the answer is:
gluster volume set <volname> cluster.self-heal-daemon off on the volume where your VM disks are.
BTW, don't test VMs with files generated from /dev/zero. Due to some kind of bug, such files remain unsynced and you may lose your VM (qemu won't be able to read the qcow2 file headers).
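Applied to the volume named earlier in this thread, that would look something like this (a sketch):
Code:
gluster volume set HA-Proxmox-TT-fast-150G cluster.self-heal-daemon off
# verify that the option shows up under "Options Reconfigured"
gluster volume info HA-Proxmox-TT-fast-150G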

 
Re: Proxmox and GlusterFS


- Can somebody say whether the behaviour has changed with a setup like the one in this thread, with Proxmox 4 and GlusterFS 3.5.2?
- Is it working stably now?
- What is the preferred configuration for a KVM guest's disk options, default or writethrough?
 
