[SOLVED] Problems on shutdown/boot with NFS/GlusterFS filesystems and HA containers / systemd order of services!

mount -t glusterfs uses the native GlusterFS client. This native client is FUSE-based and runs in user space.

As far as I can see (1, 2), QEMU can even avoid the FUSE overhead via libgfapi / the GlusterFS block driver. If you have a VM running on a PVE storage of type glusterfs, you can see that this is being used with
Code:
ps aux | grep "gluster://"
I guess this is what you mean by gflib?

It does not matter whether you mount the Gluster storage via
  • fstab or
  • PVE (GUI):
if you then place a PVE directory storage on top of it and store a VM on that directory storage, repeating the grep above will return nothing in either case. Instead, the kvm command will contain something like
Code:
file=/mnt/glusterfstab/images/102/vm-102-disk-0.qcow2
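
For comparison, with a storage of type glusterfs the disk is referenced directly via a Gluster URI, roughly like this (the host "10.0.0.1" and the volume "gv0" are just placeholders):
Code:
# roughly what the kvm command line contains for a storage of type "glusterfs"
# ("10.0.0.1" and "gv0" are placeholders for the Gluster host and volume)
file=gluster://10.0.0.1/gv0/images/102/vm-102-disk-0.qcow2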

Update: Btw I randomly encountered
Code:
[Fri Aug  7 16:30:53 2020] blk_update_request: I/O error, dev loop0, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
this error on a local storage, too. So it might be unrelated after all.
 
I guess this is what you mean by gflib?
Yes ... sorry for the confusion ... ;-)

Thank you, as said I will try things as soon as I get to it, but I need to consider whether I want to make such fundamental changes to my system right before a two-week vacation ;-)
 
@Dominic I still have one question ... Currently my GlusterFS storage is called "glusterfs", so in my /etc/pve/qemu-server/102.conf, for example, the image location is set as

scsi0: glusterfs:102/vm-102-disk-0.qcow2,size=30G

Ok, so I would need to mount it myself "somewhere" and then add a directory storage with the name "glusterfs" and it should be fine ... ok, understood.
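
I guess such a directory storage would look roughly like this in /etc/pve/storage.cfg (the mount path is only an example from my earlier test):
Code:
# sketch of a directory storage entry in /etc/pve/storage.cfg (path is just an example)
dir: glusterfs
        path /mnt/glusterfstab
        content images,rootdir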

But now you write that as soon as I do that, the qemu start options no longer use gluster:// but the "path". So does the optimization via libgfapi / the GlusterFS block driver still apply somehow? If I understand the links correctly it does not ... and so this is kind of bad :-(

So the question still remains: which PVE process processes storage.cfg and mounts all the storages defined there ...

The alternative would be to mount the Gluster storages via fstab already, at the exact same location PVE would use ... the question is how PVE reacts in this case ... maybe it detects that the mount is already there and is "ok" with it? Then this could be another idea.
 
I have set the is_mountpoint option ... this helps very well for the boot part of the topic ...
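
For reference, this is roughly how I set it on the directory storage (storage name taken from my setup; it can just as well be written directly into /etc/pve/storage.cfg):
Code:
# mark the directory storage as an externally managed mount point,
# so PVE only considers it online once something is actually mounted there
pvesm set glusterfs-container --is_mountpoint yes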

Code:
Aug 10 22:19:40 pm1 systemd[1]: Started PVE Local HA Resource Manager Daemon.
Aug 10 22:19:40 pm1 systemd[1]: Starting PVE guests...
Aug 10 22:19:41 pm1 pve-ha-crm[1276]: status change wait_for_quorum => slave
Aug 10 22:19:43 pm1 systemd[1]: /lib/systemd/system/rpc-statd.service:13: PIDFile= references path below legacy directory /var/run/, updating /var/run/rpc.statd.pid → /run/rpc.statd.pid; please update the unit file accordingly.
Aug 10 22:19:43 pm1 systemd[1]: Reached target Host and Network Name Lookups.
Aug 10 22:19:43 pm1 systemd[1]: Starting Preprocess NFS configuration...
Aug 10 22:19:43 pm1 systemd[1]: nfs-config.service: Succeeded.
Aug 10 22:19:43 pm1 systemd[1]: Started Preprocess NFS configuration.
Aug 10 22:19:43 pm1 systemd[1]: Starting Notify NFS peers of a restart...
Aug 10 22:19:43 pm1 systemd[1]: Starting NFS status monitor for NFSv2/3 locking....
Aug 10 22:19:43 pm1 sm-notify[1345]: Version 1.3.3 starting
Aug 10 22:19:43 pm1 systemd[1]: rpc-statd-notify.service: Succeeded.
Aug 10 22:19:43 pm1 systemd[1]: Started Notify NFS peers of a restart.
Aug 10 22:19:43 pm1 rpc.statd[1347]: Version 1.3.3 starting
Aug 10 22:19:43 pm1 rpc.statd[1347]: Flags: TI-RPC
Aug 10 22:19:43 pm1 systemd[1]: Started NFS status monitor for NFSv2/3 locking..
Aug 10 22:19:43 pm1 kernel: [   27.242558] FS-Cache: Loaded
Aug 10 22:19:43 pm1 kernel: [   27.283334] FS-Cache: Netfs 'nfs' registered for caching
Aug 10 22:19:43 pm1 pve-guests[1288]: <root@pam> starting task UPID:pm1:0000054A:00000AAD:5F31ABDF:startall::root@pam:
Aug 10 22:19:43 pm1 pvesh[1288]: Starting VM 200
Aug 10 22:19:43 pm1 pve-guests[1356]: start VM 200: UPID:pm1:0000054C:00000AB3:5F31ABDF:qmstart:200:root@pam:
Aug 10 22:19:43 pm1 pve-guests[1354]: <root@pam> starting task UPID:pm1:0000054C:00000AB3:5F31ABDF:qmstart:200:root@pam:


Aug 10 22:19:44 pm1 pvestatd[1245]: unable to activate storage 'glusterfs2-container' - directory is expected to be a mount point but is not mounted: '/mnt/pve/glusterfs2'
Aug 10 22:19:44 pm1 systemd[1]: Created slice qemu.slice.
Aug 10 22:19:44 pm1 systemd[1]: Started 200.scope.
Aug 10 22:19:44 pm1 pvestatd[1245]: unable to activate storage 'glusterfs-container' - directory is expected to be a mount point but is not mounted: '/mnt/pve/glusterfs'


Aug 10 22:19:44 pm1 systemd-udevd[471]: Using default interface naming scheme 'v240'.
Aug 10 22:19:44 pm1 systemd-udevd[471]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Aug 10 22:19:44 pm1 systemd-udevd[471]: Could not generate persistent MAC address for tap200i0: No such file or directory
Aug 10 22:19:45 pm1 pve-ha-lrm[1286]: successfully acquired lock 'ha_agent_pm1_lock'
Aug 10 22:19:45 pm1 pve-ha-lrm[1286]: watchdog active
Aug 10 22:19:45 pm1 pve-ha-lrm[1286]: status change wait_for_agent_lock => active
Aug 10 22:19:45 pm1 pve-ha-lrm[1527]: starting service ct:301
Aug 10 22:19:45 pm1 pve-ha-lrm[1528]: starting CT 301: UPID:pm1:000005F8:00000B48:5F31ABE1:vzstart:301:root@pam:
Aug 10 22:19:45 pm1 pve-ha-lrm[1527]: <root@pam> starting task UPID:pm1:000005F8:00000B48:5F31ABE1:vzstart:301:root@pam:
Aug 10 22:19:45 pm1 systemd[1]: Created slice PVE LXC Container Slice.
Aug 10 22:19:45 pm1 systemd[1]: Started PVE LXC Container: 301.
Aug 10 22:19:45 pm1 pve-ha-lrm[1527]: <root@pam> end task UPID:pm1:000005F8:00000B48:5F31ABE1:vzstart:301:root@pam: OK
Aug 10 22:19:45 pm1 pve-ha-lrm[1527]: service status ct:301 started

As you can see, it understands that the mount is not done yet and delays activation ... so no fs errors on startup anymore.

But on shutdown we still have the same problem :-(


Code:
Aug 10 22:16:17 pm1 systemd[1]: pve-container@403.service: Succeeded.
Aug 10 22:16:17 pm1 systemd[1]: pve-container@401.service: Succeeded.
Aug 10 22:16:18 pm1 systemd[1]: pve-container@402.service: Succeeded.
Aug 10 22:16:18 pm1 pve-guests[9108]: end task UPID:pm1:00002395:0000A45A:5F31AB09:vzshutdown:403:root@pam:
Aug 10 22:16:18 pm1 pve-guests[9108]: end task UPID:pm1:00002399:0000A45D:5F31AB09:vzshutdown:402:root@pam:
Aug 10 22:16:18 pm1 pve-guests[9108]: end task UPID:pm1:000023A3:0000A462:5F31AB09:vzshutdown:401:root@pam:
Aug 10 22:17:42 pm1 kernel: [  514.334873] vmbr0: port 6(tap200i0) entered disabled state
Aug 10 22:17:43 pm1 pve-guests[9108]: end task UPID:pm1:000023AE:0000A46C:5F31AB09:qmshutdown:200:root@pam:
Aug 10 22:17:44 pm1 pve-guests[9108]: all VMs and CTs stopped
Aug 10 22:17:44 pm1 pve-guests[9041]: <root@pam> end task UPID:pm1:00002394:0000A454:5F31AB09:stopall::root@pam: OK
Aug 10 22:17:44 pm1 systemd[1]: pve-guests.service: Succeeded.
Aug 10 22:17:44 pm1 systemd[1]: Stopped PVE guests.
Aug 10 22:17:44 pm1 systemd[1]: Stopping PVE Local HA Resource Manager Daemon...
Aug 10 22:17:44 pm1 systemd[1]: Stopping Proxmox VE LXC Syscall Daemon...
Aug 10 22:17:44 pm1 systemd[1]: Stopping Proxmox VE firewall...
Aug 10 22:17:44 pm1 systemd[1]: Stopping PVE Status Daemon...
Aug 10 22:17:44 pm1 systemd[1]: Stopping PVE SPICE Proxy Server...
Aug 10 22:17:44 pm1 systemd[1]: pve-lxc-syscalld.service: Main process exited, code=killed, status=15/TERM
Aug 10 22:17:44 pm1 systemd[1]: pve-lxc-syscalld.service: Succeeded.
Aug 10 22:17:44 pm1 systemd[1]: Stopped Proxmox VE LXC Syscall Daemon.
Aug 10 22:17:45 pm1 spiceproxy[1281]: received signal TERM
Aug 10 22:17:45 pm1 spiceproxy[1281]: server closing
Aug 10 22:17:45 pm1 spiceproxy[1282]: worker exit
Aug 10 22:17:45 pm1 spiceproxy[1281]: worker 1282 finished
Aug 10 22:17:45 pm1 spiceproxy[1281]: server stopped
Aug 10 22:17:46 pm1 pvestatd[1241]: received signal TERM
Aug 10 22:17:46 pm1 pvestatd[1241]: server closing
Aug 10 22:17:46 pm1 pvestatd[1241]: server stopped
Aug 10 22:17:46 pm1 pve-firewall[1240]: received signal TERM
Aug 10 22:17:46 pm1 pve-firewall[1240]: server closing
Aug 10 22:17:46 pm1 pve-firewall[1240]: clear firewall rules
Aug 10 22:17:46 pm1 pve-firewall[1240]: server stopped
Aug 10 22:17:46 pm1 pve-ha-lrm[1283]: received signal TERM
Aug 10 22:17:46 pm1 pve-ha-lrm[1283]: got shutdown request with shutdown policy 'conditional'
Aug 10 22:17:46 pm1 pve-ha-lrm[1283]: reboot LRM, stop and freeze all services
Aug 10 22:17:46 pm1 systemd[1]: spiceproxy.service: Succeeded.
Aug 10 22:17:46 pm1 systemd[1]: Stopped PVE SPICE Proxy Server.
Aug 10 22:17:47 pm1 systemd[1]: mnt-pve-glusterfs.mount: Succeeded.
Aug 10 22:17:47 pm1 systemd[1]: Unmounted /mnt/pve/glusterfs.
Aug 10 22:17:47 pm1 systemd[1]: mnt-pve-glusterfs2.mount: Succeeded.
Aug 10 22:17:47 pm1 systemd[1]: Unmounted /mnt/pve/glusterfs2.
Aug 10 22:17:47 pm1 systemd[1]: pvestatd.service: Succeeded.
Aug 10 22:17:47 pm1 systemd[1]: Stopped PVE Status Daemon.
Aug 10 22:17:47 pm1 systemd[1]: pve-firewall.service: Succeeded.
Aug 10 22:17:47 pm1 systemd[1]: Stopped Proxmox VE firewall.
Aug 10 22:17:47 pm1 pvefw-logger[680]: received terminate request (signal)
Aug 10 22:17:47 pm1 pvefw-logger[680]: stopping pvefw logger
Aug 10 22:17:47 pm1 systemd[1]: Stopping Proxmox VE firewall logger...
Aug 10 22:17:47 pm1 kernel: [  519.377505] loop: Write error at byte offset 17395712, length 4096.
Aug 10 22:17:47 pm1 kernel: [  519.377528] blk_update_request: I/O error, dev loop1, sector 33976 op 0x1:(WRITE) flags 0x3800 phys_seg 1 prio class 0
Aug 10 22:17:47 pm1 kernel: [  519.377538] Buffer I/O error on dev loop1, logical block 4247, lost sync page write
Aug 10 22:17:47 pm1 kernel: [  519.377557] EXT4-fs error (device loop1): kmmpd:178: comm kmmpd-loop1: Error writing to MMP block
Aug 10 22:17:47 pm1 kernel: [  519.377607] loop: Write error at byte offset 0, length 4096.
Aug 10 22:17:47 pm1 kernel: [  519.377621] blk_update_request: I/O error, dev loop1, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
Aug 10 22:17:47 pm1 kernel: [  519.377630] blk_update_request: I/O error, dev loop1, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
Aug 10 22:17:47 pm1 kernel: [  519.377637] Buffer I/O error on dev loop1, logical block 0, lost sync page write
Aug 10 22:17:47 pm1 kernel: [  519.377653] EXT4-fs (loop1): I/O error while writing superblock
Aug 10 22:17:47 pm1 systemd[1]: pvefw-logger.service: Succeeded.
Aug 10 22:17:47 pm1 systemd[1]: Stopped Proxmox VE firewall logger.
Aug 10 22:17:48 pm1 kernel: [  520.401436] loop: Write error at byte offset 34762752, length 4096.
Aug 10 22:17:48 pm1 kernel: [  520.401462] blk_update_request: I/O error, dev loop0, sector 67896 op 0x1:(WRITE) flags 0x3800 phys_seg 1 pri
...
...
...
Aug 10 22:17:58 pm1 pve-ha-lrm[10361]: stopping service ct:301
Aug 10 22:17:58 pm1 pve-ha-lrm[10362]: shutdown CT 301: UPID:pm1:0000287A:0000CF2D:5F31AB76:vzshutdown:301:root@pam:
Aug 10 22:17:58 pm1 pve-ha-lrm[10361]: <root@pam> starting task UPID:pm1:0000287A:0000CF2D:5F31AB76:vzshutdown:301:root@pam:
Aug 10 22:17:58 pm1 kernel: [  530.425914] loop: Write error at byte offset 343932928, length 4096.
Aug 10 22:17:58 pm1 kernel: [  530.425940] blk_update_request: I/O error, dev loop0, sector 671744 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
Aug 10 22:17:58 pm1 kernel: [  530.425959] EXT4-fs warning (device loop0): ext4_end_bio:315: I/O error 10 writing to inode 479 (offset 0 size 0 starting block 83969)
Aug 10 22:17:58 pm1 kernel: [  530.425962] Buffer I/O error on device loop0, logical block 83968
Aug 10 22:17:58 pm1 kernel: [  530.457160] blk_update_request: I/O error, dev loop0, sector 1729136 op 0x0:(READ) flags 0x80700 phys_seg 4 p

It stops all "non-HA" machines, then the shutdown process continues and unmounts the GlusterFS mounts and baammm, the filesystem is gone ... and only then does it stop the HA container 301 :-)

So yes ... mounting via fstab seems the only sensible option ... so the questions above still apply :-)
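
If someone wants to check this ordering on their own node, the systemd dependencies of the mount unit can be inspected like this (just my way of looking at it, unit name taken from the log above):
Code:
# show how the gluster mount unit is ordered relative to other units
systemctl show -p After,Before,RequiredBy,Conflicts mnt-pve-glusterfs.mount
# and which units are ordered after it
systemctl list-dependencies --after mnt-pve-glusterfs.mount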
 
I did a bit more research and found some quite old code of the GlusterFS plugin on GitHub.

Interesting here is that it is allowed to enter "localhost" as the server so that it mounts from the local machine ... never had that idea :) Would be awesome to have that in the docs.

In any case, in this code the activate_storage method checks existing mountpoints for the volume name (ip:volume) and path before mounting it itself ... So if I make sure my fstab mount and the PVE one are "the same" and I only set up one host (so I think localhost is the idea), then it should work, right?!
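
The source (ip:volume) and target that would be compared can be checked on the node like this (just for reference):
Code:
# show source (ip:volume) and target of existing GlusterFS FUSE mounts –
# roughly what the plugin looks at before mounting the volume itself
findmnt -t fuse.glusterfs -o SOURCE,TARGET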
 
Ok, I just tried it (I needed to know :-) ) and yes: creating the exact same mount in fstab that PVE would create, while still leaving the existing "storage definition of type glusterfs" in place, is exactly the solution. The fstab entry is mounted very early and unmounted very late, and the Proxmox storage engine checks whether a "matching mount" already exists; because it is exactly the same (name- and path-wise), it simply detects it as "already mounted". With this (and also the "is_mountpoint" flag on the directory storage, which should be unnecessary now but is a good failsafe) everything works as expected.
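
A sketch of such an fstab entry, assuming the "localhost" idea from above and a volume called "glustervol" (the volume name is a placeholder); the important part is that the mount point matches the path PVE would use:
Code:
# /etc/fstab – mount the volume at the exact path PVE would use
# ("glustervol" is a placeholder for the real volume name)
localhost:/glustervol  /mnt/pve/glusterfs  glusterfs  defaults,_netdev  0  0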

Thank you @Dominic for your support.
Next I will try how it behaves with NFS, but honestly ... why should NFS behave differently from GlusterFS when mounted via PVE? ;-)
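
If the NFS plugin does a similar "already mounted" check (which I only assume), the equivalent fstab line would presumably look like this ("nas", "/export/pve" and "nfsstore" are placeholders):
Code:
# /etc/fstab – hypothetical NFS equivalent, placeholders only
nas:/export/pve  /mnt/pve/nfsstore  nfs  defaults,_netdev  0  0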
 
Great that you could solve the problem! If you have a minute, it would be great if you could edit the opening post and mark the thread as solved by editing the prefix :)
 
As additional information: starting with GlusterFS 8 they introduced a new service that also handles mounts and tears them down again ... so glusterfssharedstorage.service needs to be deactivated and stopped.
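
In case someone needs it, that can be done with the usual systemd commands:
Code:
# stop the service now and prevent it from starting again on boot
systemctl disable --now glusterfssharedstorage.service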
 
