Would Proxmox please consider providing a cephfs storage mode in /etc/pve/storage.cfg, so that a directory can be defined as a shared resource? We are using CephFS to access ISO images, having simply mounted it as /var/lib/vz.
NB: CephFS performance would not make this ideal, if usable at all, for image or backup storage.
We have 5 nodes (kvm5a/b/c/d/e), of which the first 3 (kvm5a/b/c) operate as Ceph monitors. Herewith the steps to provide replicated file storage on all nodes:
Install Ceph MDS binaries on nodes running as monitors:
Code:
apt-get -y install ceph-mds;
Edit the Ceph configuration file (vi /etc/ceph/ceph.conf) and define the nodes running as monitors as CephFS gateways (active/failover):
Code:
[mds]
mds data = /var/lib/ceph/mds/$cluster-$id
keyring = /var/lib/ceph/mds/$cluster-$id/keyring
[mds.kvm5a]
host = kvm5a
[mds.kvm5b]
host = kvm5b
[mds.kvm5c]
host = kvm5c
Run the following on each node running as a monitor (change 'id' to match that node's name):
Code:
id='kvm5a';
mkdir -p /var/lib/ceph/mds/ceph-$id;
ceph auth get-or-create mds.$id mds 'allow ' osd 'allow *' mon 'allow rwx' > /var/lib/ceph/mds/ceph-$id/keyring;
chown -R ceph:ceph /var/lib/ceph/mds;
systemctl enable ceph-mds@$id;
systemctl start ceph-mds@$id;
systemctl status ceph-mds@$id;
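You can optionally confirm that the MDS daemons have registered with the cluster; until the file system is created below they will simply show as standby (this assumes you run it on a node holding the admin keyring):
Code:
ceph mds stat;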
Create the CephFS pools (we set the number of placement groups to 2 x the cluster's OSD count):
Code:
ceph osd pool create cephfs_data 40;
ceph osd pool create cephfs_metadata 40;
ceph fs new cephfs cephfs_metadata cephfs_data;
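For reference, 40 placement groups per pool implies 20 OSDs in our cluster. If you want to derive the OSD count from the cluster itself, something like the following should do (it simply counts the OSD ids that 'ceph osd ls' prints one per line):
Code:
ceph osd ls | wc -l;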
Confirm that everything is healthy and that you don't now have too many placement groups:
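For example, by checking the overall cluster status; the 'fsmap' line mentioned below is taken from that output (the exact format varies slightly between Ceph releases):
Code:
ceph -s;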
PS: Expect to see something like 'fsmap e7: 1/1/1 up {0=kvm5b=up:active}, 2 up:standby'
Ceph nodes not running as monitors need the following package installed (it is automatically pulled in as a dependency of ceph-mds on the nodes running as monitors):
Code:
apt-get -y install ceph-fuse
Lastly, configure the file system table (vi /etc/fstab) to mount the CephFS volume:
Code:
id=admin,conf=/etc/ceph/ceph.conf /var/lib/vz fuse.ceph defaults,_netdev,noauto,nonempty,x-systemd.requires=ceph.target,x-systemd.automount 0 0
PS: You will probably want to mount this somewhere like '/mnt' first to test, then copy the content of the existing '/var/lib/vz' folder across (ie 'rsync -aHvx --delete /var/lib/vz/ /mnt/'), before unmounting '/var/lib/vz' and remounting it using CephFS.
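A minimal sketch of that sequence, assuming the fstab entry above is already in place, '/mnt' is empty and CephFS has not yet been mounted anywhere else:
Code:
# temporarily mount CephFS using the admin credentials from /etc/ceph
ceph-fuse -c /etc/ceph/ceph.conf /mnt;
# copy the existing local content onto CephFS
rsync -aHvx --delete /var/lib/vz/ /mnt/;
# release the temporary mount point
umount /mnt;
# mount CephFS at its final location via the fstab entry above
mount /var/lib/vz;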
Note:
Ceph MDS is not active/active, so failure of the current active MDS master (eg a reboot) results in the current master first timing out before a new master is elected. No manual commands need to be run to initiate this process, and it therefore behaves like a fully redundant and replicated shared file system.
For reference purposes, herewith our Proxmox storage configuration file (/etc/pve/storage.cfg):
Code:
dir: local
        path /var/lib/vz
        maxfiles 0
        content vztmpl,backup,iso,rootdir

rbd: virtuals
        monhost 10.254.1.3;10.254.1.4;10.254.1.5
        content images,rootdir
        pool rbd
        krbd 1
        username admin
NB: Having a guest's CD-ROM attached to an ISO image prevents live migration. We're essentially asking for Proxmox 5 to include functionality to tell Proxmox that a given directory is shared storage (this should also work when mounting a directory using Samba), so as to allow live migration of guests which have mapped ISOs...
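To illustrate the request, something like the following is what we have in mind; note that this is purely hypothetical syntax (a 'shared' flag on a 'dir' storage), not an option we are claiming already exists:
Code:
dir: local
        path /var/lib/vz
        maxfiles 0
        content vztmpl,backup,iso,rootdir
        shared 1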