[TUTORIAL] PVE 7.x Cluster Setup of shared LVM/LV with MSA2040 SAS [partial howto]

glowfisch

New Member
Aug 9, 2020
6
0
1
25
1. As said, this was a research project in which I tried to bend the behaviour to my needs. Fencing gave a lot of issues, so I turned it off and, to be honest, never looked back.
2. I have never had a full cluster/network outage, so I have not reproduced this behaviour.
3. I have not had that issue.
4. I am currently running the latest pve-kernel-5.4/stable 6.2-6.

My setup at the moment: 4x DL360 Gen7 + MSA2040 SAS with 1 shelf added.

Could you tell me about your network configuration? Do you use link aggregation? Which switch do you have: Cisco? MikroTik?
 

glowfisch

New Member
Aug 9, 2020
6
0
1
25
Something I forgot to mention in all of the above is that the directory being offered to Proxmox is not set to shared.
As the GFS2 filesystem takes care of this by itself, it is not necessary to set the directory to 'shared'.
When I try to migrate a VM, Proxmox copies the qcow2 disk to the same storage under a different number.
Only with the "shared" checkbox enabled can I live-migrate a VM without the disk being copied.
Can this option damage data or make the cluster unstable?
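
For reference, the "shared" checkbox ends up as a "shared 1" line in the storage stanza in /etc/pve/storage.cfg; it only tells Proxmox that every node sees the same content under that path, it does not share anything by itself. A small sketch to check whether a stanza carries the flag. The function name and the storage ID "gfs2-shared" are made up for illustration:

```shell
#!/bin/sh
# Sketch: report whether a storage stanza in a storage.cfg-style file
# carries "shared 1". Function name and sample ID are hypothetical.
has_shared_flag() {
    # $1 = path to the config file, $2 = storage ID to look for
    awk -v id="$2" '
        /^[a-z0-9]+:/ { in_stanza = ($2 == id); next }  # header, e.g. "dir: mystore"
        in_stanza && $1 == "shared" && $2 == "1" { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example (on a node):
#   has_shared_flag /etc/pve/storage.cfg gfs2-shared && echo "marked as shared"
```

The same flag can be set from the CLI with "pvesm set <storage-id> --shared 1".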
 

stibila

New Member
Oct 16, 2020
1
0
1
36
Hi, this is an interesting topic and something I am currently researching myself.
I have a question though. If I understand correctly, you have a LUN connected to all nodes and you put LVM on top of it. Then you have gfs2 on top of LVM. Is that correct?
If yes, why do you need LVM and the whole dance with LVM locking when you have gfs2 on top? gfs2 with DLM directly on top of the LUN should be enough, and you should be able to skip the whole LVM layer. Or am I missing something?
 

Glowsome

Active Member
Jul 25, 2017
139
20
38
49
The Netherlands
www.comsolve.nl
Yes, that is correct.

I am using lvmlockd with DLM to provide the lockspace for the availability of the volume groups and the presented volumes.
GFS2 is then used as the filesystem to ensure locking is done at the file level.

It was just my preference for LVM that made me start out this way; I never really looked into the possibility of skipping the dance with LVM.
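
The layering discussed here (a shared VG under lvmlockd/DLM with GFS2 on top) can be sketched as a dry run that only prints the commands. This is an outline under assumptions, not the exact procedure used in this thread: the device path, VG/LV names, size and cluster name are placeholders, and the lock table name simply follows the clustername:fsname form that "mkfs.gfs2 -t" expects.

```shell
#!/bin/sh
# Sketch: print (not execute) a possible command sequence for GFS2 on a shared LV.
# All argument values are placeholders supplied by the caller.
print_gfs2_on_lvm_plan() {
    dev=$1; vg=$2; lv=$3; size=$4; cluster=$5; journals=$6
    echo "# on one node:"
    echo "vgcreate --shared $vg $dev"
    echo "lvcreate -n $lv -L $size $vg"
    echo "mkfs.gfs2 -p lock_dlm -t $cluster:$vg-$lv -j $journals /dev/$vg/$lv"
    echo "lvchange -an $vg/$lv"
    echo "# on every node:"
    echo "vgchange --lock-start $vg"
    echo "lvchange -asy $vg/$lv"
}

# Example: print_gfs2_on_lvm_plan /dev/mapper/mpatha cluster02 backups 100G mycluster 4
```

The cluster name in the lock table has to match the corosync cluster name, and the number of journals should be at least the number of nodes that will mount the filesystem.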
 

glowfisch

New Member
Aug 9, 2020
6
0
1
25
Hm... Well, maybe. But now I use Ceph, and it is much better because it works well without the hassle.
 

Glowsome

Active Member
Jul 25, 2017
139
20
38
49
The Netherlands
www.comsolve.nl
Since the latest kernel/version of PVE I have got rid of some blocking issues I had.
Basically, I was unable to upgrade PVE beyond a specific kernel version: when I did upgrade, I got kernel panics as soon as I launched a VM/CT on it.

This issue seems to be solved with the latest 6.3 release, as I have not encountered anything like it since upgrading to 6.3. Even after a power cut yesterday/last night the whole cluster came up just fine, and all defined hosts came up accordingly.

As I am currently also exploring Ansible as an alternative to Puppet (I find it easier), I am testing a deployment/maintenance playbook on one node of the cluster.
 

glowfisch

New Member
Aug 9, 2020
6
0
1
25
Good! By the way, Ceph needs many GB of RAM.
 

Glowsome

Active Member
Jul 25, 2017
139
20
38
49
The Netherlands
www.comsolve.nl
It has been a long time since I posted in this thread.

A lot has changed in the time that has passed.
  • We have a PVE 7 release (yes, I upgraded to it).
  • I am still working on Ansible as an alternative to Puppet, and I think I am quite close to finalising it.
Just a peek at the playbook I am working on:

YAML:
---
# ./roles/proxmox/tasks/main.yml

# This is HP hardware, and the HP repo is not yet available for bullseye, so for now I am stuck with the buster repo.
- name: Add HP repository into sources list using specified filename
  ansible.builtin.apt_repository:
    repo: deb http://downloads.linux.hpe.com/SDR/repo/mcp buster/current non-free
    state: present
    filename: mcp

- name: Add ProxMox free repository into sources list using specified filename
  ansible.builtin.apt_repository:
    repo: deb http://download.proxmox.com/debian/pve bullseye pve-no-subscription
    state: present
    filename: pve-install-repo

# I don't have a subscription, so this needs to be removed
- name: Remove ProxMox Enterprise repository from sources list using specified filename
  ansible.builtin.apt_repository:
    repo: deb https://enterprise.proxmox.com/debian/pve bullseye pve-enterprise
    state: absent
    filename: pve-enterprise

- name: Install additional packages needed for ProxMox Cluster environment
  ansible.builtin.apt:
    name:
      - lvm2-lockd
      - dlm-controld
      - gfs2-utils
    state: present

- name: Update apt-get repo and cache
  ansible.builtin.apt:
    update_cache: yes
    force_apt_get: yes
    cache_valid_time: 3600

- name: Upgrade all apt packages
  ansible.builtin.apt:
    upgrade: dist
    force_apt_get: yes

# Register whether a reboot is needed; acting on the result still has to be written.
# DANGER: this is a cluster, so it must not go down/reboot as a whole. Reboots need
# to be spread across the nodes; for now this is a manual step.
- name: Check if a reboot is needed for ProxMox boxes
  ansible.builtin.stat:
    path: /var/run/reboot-required
    get_checksum: no
  register: reboot_required_file

- name: Ensure customised dlm.conf is present
  ansible.builtin.template:
    src: 'dlm.conf.j2'
    dest: '/etc/dlm/dlm.conf'
    mode: 0600

- name: Ensure lvm.conf contains lvmlockd = 1
  ansible.builtin.template:
    src: 'lvm.conf.j2'
    dest: '/etc/lvm/lvm.conf'
    mode: 0600

- name: Ensure shared volumes and mountpoint definition file is present
  ansible.builtin.template:
    src: 'lvmshared.conf.j2'
    dest: '/etc/lvm/lvmshared.conf'
    mode: 0600

- name: Ensure the mountscript for shared volume is available
  ansible.builtin.template:
    src: lvmmount.sh.j2
    dest: '/usr/local/share/lvmmount.sh'
    mode: 0700

- name: Ensure Systemd service for shared volumes is present
  ansible.builtin.template:
    src: 'lvshared.service.j2'
    dest: '/usr/lib/systemd/system/lvshared.service'
    mode: 0644
 
- name: Ensure SystemD service pve-guests has a After=lvshared.service entry
  ansible.builtin.lineinfile:
    path: /usr/lib/systemd/system/pve-guests.service
    regexp: '^After=lvshared.service'
    insertafter: '^After=pve-ha-crm.service$'
    line: After=lvshared.service
    mode: 0644

- name: Force systemd to reread configs (2.4 and above)
  ansible.builtin.systemd:
    daemon_reload: yes

# I had issues where known_hosts was symlinked incorrectly, or not at all, so I check and recreate it
- name: Check if /etc/ssh/ssh_known_hosts is present
  ansible.builtin.stat:
    path: /etc/ssh/ssh_known_hosts
  register: ssh_known_hosts_stat

- name: Delete /etc/ssh/ssh_known_hosts
  ansible.builtin.file:
    path: /etc/ssh/ssh_known_hosts
    state: absent
  when: ssh_known_hosts_stat.stat.exists

- name: Symlink /etc/ssh/ssh_known_hosts to /etc/pve/priv/known_hosts
  ansible.builtin.file:
    src: /etc/pve/priv/known_hosts
    dest: /etc/ssh/ssh_known_hosts
    owner: root
    state: link

- name: Add nodes to known_hosts
  ansible.builtin.known_hosts:
    path: /etc/pve/priv/known_hosts
    name: '{{ item.name }}'
    key: '{{ item.name }} {{ item.key }}'
  loop: '{{ my_node_keys }}'
  no_log: true

- name: Check if /root/.ssh/known_hosts is present
  ansible.builtin.stat:
    path: /root/.ssh/known_hosts
  register: root_known_hosts_stat

- name: Delete /root/.ssh/known_hosts
  ansible.builtin.file:
    path: /root/.ssh/known_hosts
    state: absent
  when: root_known_hosts_stat.stat.exists

- name: Symlink /root/.ssh/known_hosts to /etc/pve/priv/known_hosts
  ansible.builtin.file:
    src: /etc/pve/priv/known_hosts
    dest: /root/.ssh/known_hosts
    owner: root
    state: link

- name: Set up Node authorized keys
  ansible.posix.authorized_key:
    manage_dir: no
    path: /etc/pve/priv/authorized_keys
    user: root
    state: present
    key: '{{ item.key }}'
  loop: '{{ my_node_keys }}'
  no_log: true

I hope this info helps in automating and keeping a cluster environment in check the Ansible way.

Edit : 15/07/2021 - straightened out most mixings of style
 

Gregorian84

New Member
Oct 5, 2020
4
0
1
37
Hello smart people! This is a very good thread, Glowsome, thank you all!

For the last few days I have been trying to set up a fresh 2-node cluster (PVE 7.0.11) with one NetApp iSCSI SSD storage. I'd like to have shared storage for VMs, with thin provisioning and snapshots on it (eventually qcow2 in a local directory).

So I decided to do it with multipath iSCSI and a non-direct LUN that both nodes can use as GFS2 directory-mounted storage, with DLM as lock manager (working with corosync). Then I could simply add a shared directory in the GUI (pointing to the GFS2-mounted storage) and voilà, run some tests.

So far I have set up and tested multipath iSCSI (over 120k IOPS); it is visible on the nodes as /dev/mapper/device, which I can partition and format with mkfs.gfs2 without any problems.
Code:
root@mycluster:~# tunegfs2 -l /dev/mapper/my-storage-part1
tunegfs2
File system volume name: KRONOS:fullgfs2
File system UUID: 51e554ba-8643-4db4-8819-8c0abf5181bb
File system magic number: 0x1161970
Block size: 4096
Block shift: 12
Root inode: 263772
Master inode: 131357
Lock protocol: lock_dlm
Lock table: KRONOS:fullgfs2

On both nodes /etc/dlm/dlm.conf looks like this:
Code:
log_debug=1
#debug_logfile=/var/log/dlm
protocol=tcp
post_join_delay=10
enable_fencing=0
#enable_plock=1
#fence_all /bin/true
lockspace KRONOS nodir=1
#master KRONOS node=1
#master KRONOS node=2

On both nodes "dlm_tool status -n" shows:
Code:
cluster nodeid 1 quorate 1 ring seq 487 487
daemon now 4547 fence_pid 0
node 1 M add 2740 rem 0 fail 0 fence 0 at 0 0
node 2 M add 2740 rem 0 fail 0 fence 0 at 0 0

And finally, when I try to mount the filesystem on the directory, all I get is:
Code:
mount -t gfs2 -o noatime /dev/mapper/my-storage-part1 /STORAGE_shared_GFS2_on_iqn_lun-indirect-MPiSCSI
mount: /STORAGE_shared_GFS2_on_iqn_lun-indirect-MPiSCSI: wrong fs type, bad option, bad superblock on /dev/mapper/my-storage-part1, missing codepage or helper program, or other error.

There are no errors in messages; only running "dlm_controld --daemon_debug" manually gives 2 errors on startup:
740 /sys/kernel/config/dlm/cluster/comms: opendir failed: 2
2740 /sys/kernel/config/dlm/cluster/spaces: opendir failed: 2

I've read that a module such as GFS2_FS_LOCKING_DLM could be missing, but the kernel has it built in:
CONFIG_GFS2_FS_LOCKING_DLM=y

So I have no clue why I can't just mount the filesystem on my directory. Please help, even a small hint can save the world :)

P.S.
I forgot to add that when I create the filesystem as local rather than clustered, with
mkfs.gfs2 -p lock_nolock
then I can mount it without complaints. So I was looking for some DLM misconfiguration, but none of the docs on the web gave me a clue.
 

Gregorian84

New Member
Oct 5, 2020
4
0
1
37
So I found the relevant message in syslog:
dlm: TCP protocol can't handle multi-homed hosts, try SCTP

The point is that I set up the cluster (via the GUI) with dual rings for corosync and didn't look at corosync.conf because it was working fine. But DLM somehow ends up using TCP, not SCTP. When I put "protocol=sctp" in dlm.conf it started working. In my opinion it should work with TCP as well, but maybe someone smarter can point out how and why it really should work.
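
A rough sketch of the check behind this fix: count how many distinct ringN_addr keys the corosync nodelist uses and suggest "sctp" as soon as there is more than one. The function name is made up, and the rule "more than one ring means sctp" is only what the kernel message above hints at, not an official decision table:

```shell
#!/bin/sh
# Sketch: suggest a dlm.conf "protocol" value from a corosync.conf-style file.
# Assumption: more than one ringN_addr per node means the host is multi-homed.
suggest_dlm_protocol() {
    # $1 = path to corosync.conf
    links=$(grep -Eo 'ring[0-9]+_addr' "$1" | sort -u | wc -l | tr -d ' ')
    if [ "$links" -gt 1 ]; then
        echo sctp
    else
        echo tcp
    fi
}

# Example (on a node): suggest_dlm_protocol /etc/pve/corosync.conf
```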
 

Gregorian84

New Member
Oct 5, 2020
4
0
1
37
By the way, the current gfs2-utils package in the repo is version 3.3.0-2, and by default mkfs.gfs2 creates journals of 512 MB, which differs from the manuals (the default should be 128 MB). For now I'm testing this filesystem after creating and mounting it as below:

Code:
mkfs.gfs2 -p lock_dlm -t KRONOS:fullgfs2 -j 2 -J 128 /dev/mapper/my-storage-part1
mount -t gfs2 -o noatime /dev/mapper/netapp-ef280-storage-part1 /dysk/
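
For a feel of what the journal size means in disk space: GFS2 needs one journal per node that mounts the filesystem, so the space taken is simply journals times journal size. A tiny sketch (the function name is made up):

```shell
#!/bin/sh
# Sketch: total space used by GFS2 journals, in MB.
gfs2_journal_space_mb() {
    # $1 = number of journals (mkfs.gfs2 -j), $2 = journal size in MB (mkfs.gfs2 -J)
    echo $(( $1 * $2 ))
}

# Example: compare "-j 2 -J 128" with two journals of 512 MB each
#   gfs2_journal_space_mb 2 128
#   gfs2_journal_space_mb 2 512
```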
 

Glowsome

Active Member
Jul 25, 2017
139
20
38
49
The Netherlands
www.comsolve.nl
Just an update from my end: it seems the location of the unit file changed a while ago (I am now running PVE 7.0.14), which introduced some issues, so I had to adapt my Ansible playbook.
Code:
---
# ./roles/proxmox/tasks/main.yml

- name: remove mcp repository to reset content
  ansible.builtin.file:
    path: /etc/apt/sources.list.d/mcp.list
    state: absent

- name: Add HP repository into sources list using specified filename (Debian 10)
  ansible.builtin.apt_repository:
    repo: deb http://downloads.linux.hpe.com/SDR/repo/mcp buster/current non-free
    state: present
    filename: mcp
  when:
    - ansible_facts['distribution'] == "Debian"
    - ansible_facts['distribution_major_version'] == "10"

- name: Add HP repository into sources list using specified filename (Debian 11)
  ansible.builtin.apt_repository:
    repo: deb http://downloads.linux.hpe.com/SDR/repo/mcp bullseye/current non-free
    state: present
    filename: mcp
  when:
    - ansible_facts['distribution'] == "Debian"
    - ansible_facts['distribution_major_version'] == "11"

- name: Add ProxMox free repository into sources list using specified filename (Debian 10)
  ansible.builtin.apt_repository:
    repo: deb http://download.proxmox.com/debian/pve buster pve-no-subscription
    state: present
    filename: pve-install-repo
  when:
    - ansible_facts['distribution'] == "Debian"
    - ansible_facts['distribution_major_version'] == "10"

- name: Remove ProxMox Enterprise repository from sources list using specified filename (Debian 10)
  ansible.builtin.apt_repository:
    repo: deb https://enterprise.proxmox.com/debian/pve buster pve-enterprise
    state: absent
    filename: pve-enterprise
  when:
    - ansible_facts['distribution'] == "Debian"
    - ansible_facts['distribution_major_version'] == "10"

- name: Remove ProxMox Enterprise repository from sources list using specified filename (Debian 11)
  ansible.builtin.apt_repository:
    repo: deb https://enterprise.proxmox.com/debian/pve bullseye pve-enterprise
    state: absent
    filename: pve-enterprise
  when:
    - ansible_facts['distribution'] == "Debian"
    - ansible_facts['distribution_major_version'] == "11"

- name: Add ProxMox free repository into sources list using specified filename (Debian 11)
  ansible.builtin.apt_repository:
    repo: deb http://download.proxmox.com/debian/pve bullseye pve-no-subscription
    state: present
    filename: pve-install-repo
  when:
    - ansible_facts['distribution'] == "Debian"
    - ansible_facts['distribution_major_version'] == "11"

- name: Register hostname to determine if its part of a cluster
  ansible.builtin.command: 'hostname --fqdn'
  register: nodename

#- name: Print information about nodename
#  ansible.builtin.debug:
#    var: nodename.stdout

- name: Install additional packages needed for ProxMox Cluster environment
  ansible.builtin.apt:
    name:
      - lvm2-lockd
      - dlm-controld
      - gfs2-utils
    state: present
  when: nodename.stdout is regex("^node0?\.*.")

- name: Update apt-get repo and cache
  ansible.builtin.apt:
    update_cache: yes
    force_apt_get: yes
    cache_valid_time: 3600

- name: Upgrade all apt packages
  ansible.builtin.apt:
    upgrade: dist
    force_apt_get: yes

- name: Check if a reboot is needed for ProxMox boxes
  ansible.builtin.stat:
    path: /var/run/reboot-required
  register: check_reboot

- name: Print information about reboot
  ansible.builtin.debug:
    var: check_reboot

- name: Ensure customised dlm.conf is present
  ansible.builtin.template:
    src: 'dlm.conf.j2'
    dest: '/etc/dlm/dlm.conf'
    mode: 0600
  when: nodename.stdout is regex("^node0?\.*.")

- name: Ensure lvm.conf contains lvmlockd = 1
  ansible.builtin.template:
    src: 'lvm.conf.j2'
    dest: '/etc/lvm/lvm.conf'
    mode: 0600
  when: nodename.stdout is regex("^node0?\.*.")

- name: Ensure shared volumes and mountpoint definition file is present
  ansible.builtin.template:
    src: 'lvmshared.conf.j2'
    dest: '/etc/lvm/lvmshared.conf'
    mode: 0600
  when: nodename.stdout is regex("^node0?\.*.")

- name: Ensure the mountscript for shared volume is available
  ansible.builtin.template:
    src: lvmmount.sh.j2
    dest: '/usr/local/share/lvmmount.sh'
    mode: 0700
  when: nodename.stdout is regex("^node0?\.*.")

- name: Ensure Systemd service for shared volumes is present
  ansible.builtin.template:
    src: 'lvshared.service.j2'
    dest: '/usr/lib/systemd/system/lvshared.service'
    mode: 0644
  when: nodename.stdout is regex("^node0?\.*.")

- name: Remove possible wrong location of After=lvshared.service
  ansible.builtin.lineinfile:
    path: /lib/systemd/system/pve-guests.service
    regexp: '^After=lvshared.service'
    state: absent
  when: nodename.stdout is regex("^node0?\.*.")

- name: Ensure Systemd service pve-guests has an After=lvshared.service entry
  ansible.builtin.lineinfile:
    path: /lib/systemd/system/pve-guests.service
    regexp: '^After=lvshared.service'
    insertafter: '^After=pve-ha-crm.service.*'
    line: After=lvshared.service
    mode: 0644
  when: nodename.stdout is regex("^node0?\.*.")

- name: Force systemd to reread configs (2.4 and above)
  ansible.builtin.systemd:
    daemon_reload: yes

- name: Check /root/.ssh/authorized_keys
  ansible.builtin.stat:
    path: /root/.ssh/authorized_keys
    get_checksum: no
  register: ssh_authorized_keys_stat

- name: Delete /root/.ssh/authorized_keys when not a symlink or not linked correctly
  ansible.builtin.file:
    path: /root/.ssh/authorized_keys
    state: absent
  when:
    - not (ssh_authorized_keys_stat.stat.islnk | default(false)) or ssh_authorized_keys_stat.stat.lnk_target != "/etc/pve/priv/authorized_keys"

- name: Symlink /root/.ssh/authorized_keys to /etc/pve/priv/authorized_keys
  ansible.builtin.file:
    src: /etc/pve/priv/authorized_keys
    dest: /root/.ssh/authorized_keys
    owner: root
    state: link
  when:
    - not (ssh_authorized_keys_stat.stat.islnk | default(false)) or ssh_authorized_keys_stat.stat.lnk_target != "/etc/pve/priv/authorized_keys"

- name: Check /etc/ssh/ssh_known_hosts
  ansible.builtin.stat:
    path: /etc/ssh/ssh_known_hosts
    get_checksum: no
  register: ssh_known_hosts_stat

- name: Delete /etc/ssh/ssh_known_hosts when not a symlink or not linked correctly
  ansible.builtin.file:
    path: /etc/ssh/ssh_known_hosts
    state: absent
  when:
    - not (ssh_known_hosts_stat.stat.islnk | default(false)) or ssh_known_hosts_stat.stat.lnk_target != "/etc/pve/priv/known_hosts"

- name: Symlink /etc/ssh/ssh_known_hosts to /etc/pve/priv/known_hosts
  ansible.builtin.file:
    src: /etc/pve/priv/known_hosts
    dest: /etc/ssh/ssh_known_hosts
    owner: root
    state: link
  when:
    - not (ssh_known_hosts_stat.stat.islnk | default(false)) or ssh_known_hosts_stat.stat.lnk_target != "/etc/pve/priv/known_hosts"

- name: Add nodes to known_hosts
  ansible.builtin.known_hosts:
    path: /etc/pve/priv/known_hosts
    name: '{{ item.name }}'
    key: '{{ item.name }} {{ item.key }}'
  loop: '{{ my_node_keys }}'
  no_log: true
  when: nodename.stdout is regex("^node0?\.*.")

- name: Check /root/.ssh/known_hosts
  ansible.builtin.stat:
    path: /root/.ssh/known_hosts
    get_checksum: no
  register: root_known_hosts_stat

- name: Delete /root/.ssh/known_hosts when not a symlink or not linked correctly
  ansible.builtin.file:
    path: /root/.ssh/known_hosts
    state: absent
  when:
    - not (root_known_hosts_stat.stat.islnk | default(false)) or root_known_hosts_stat.stat.lnk_target != "/etc/pve/priv/known_hosts"

- name: Symlink /root/.ssh/known_hosts to /etc/pve/priv/known_hosts
  ansible.builtin.file:
    src: /etc/pve/priv/known_hosts
    dest: /root/.ssh/known_hosts
    owner: root
    state: link
  when:
    - not (root_known_hosts_stat.stat.islnk | default(false)) or root_known_hosts_stat.stat.lnk_target != "/etc/pve/priv/known_hosts"

- name: Set up Node authorized keys
  ansible.posix.authorized_key:
    manage_dir: no
    path: /etc/pve/priv/authorized_keys
    user: root
    state: present
    key: '{{ item.key }}'
  loop: '{{ my_node_keys }}'
  no_log: true
  when: nodename.stdout is regex("^node0?\.*.")

- name: Add keys to ssh_known_hosts
  ansible.builtin.known_hosts:
    path: /etc/pve/priv/known_hosts
    name: '{{ item.name }}'
    key: '{{ item.name }} {{ item.key }}'
  loop: '{{ my_host_keys }}'
  no_log: true
  when: nodename.stdout is regex("^node0?\.*.")

The playbook will also correct an incorrect placement of the systemd unit file dependency in the pve-guests unit file.
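
The "delete and re-link when not a symlink or not linked correctly" logic of the playbook can also be checked by hand on a node. A minimal shell sketch of that test; the function name is made up and this is not part of the playbook itself:

```shell
#!/bin/sh
# Sketch: succeed (exit 0) when PATH still needs to be (re)linked to TARGET,
# i.e. when it is missing, a regular file, or a symlink pointing elsewhere.
needs_relink() {
    # $1 = path to check, $2 = expected symlink target
    if [ -L "$1" ] && [ "$(readlink "$1")" = "$2" ]; then
        return 1   # already the expected symlink
    fi
    return 0
}

# Example:
#   needs_relink /root/.ssh/known_hosts /etc/pve/priv/known_hosts && echo "needs fixing"
```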
 

Glowsome

Active Member
Jul 25, 2017
139
20
38
49
The Netherlands
www.comsolve.nl
I am still suffering from a timing issue: when I reboot a node, it hangs after shutting down all VMs/containers.
From the cluster perspective this is quite unwanted.
At the moment I just hard-reboot the node from iLO (HP's management board for managing servers remotely).

All in all, my exotic configuration seems to work out fine for the most part, but as said I am still searching for solutions to the issues I am dealing with.
 

FrancisS

Member
Apr 26, 2019
16
0
6
56
Hello,

About the "sleep 10" added in the unit file "lvmlockd.service": you do not have to modify the file.

You can create a directory "/etc/systemd/system/lvmlockd.service.d/" and in it a file "sleep.conf" (for example) with the content

[Service]
ExecStartPre=/usr/bin/sleep 10

or you can create a directory "/etc/systemd/system/dlm.service.d/" and in it a file "sleep.conf"

[Service]
ExecStartPost=/usr/bin/sleep 10

An update can then overwrite the file "lvmlockd.service" or "dlm.service" without any problem.
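
As a sketch, such a drop-in can be written with a few lines of shell. systemd reads drop-ins from "/etc/systemd/system/<unit>.d/"; the function name and the ROOT prefix argument (handy for trying it out in a scratch directory first) are made up for this example:

```shell
#!/bin/sh
# Sketch: write a "sleep" drop-in for a systemd unit below an optional root prefix.
write_sleep_dropin() {
    # $1 = root prefix ("" for the live system), $2 = unit name, $3 = seconds to sleep
    dir="$1/etc/systemd/system/$2.d"
    mkdir -p "$dir" &&
    printf '[Service]\nExecStartPre=/usr/bin/sleep %s\n' "$3" > "$dir/sleep.conf"
}

# Example on a node (as root):
#   write_sleep_dropin "" lvmlockd.service 10 && systemctl daemon-reload
```

"systemctl cat lvmlockd.service" afterwards shows the unit together with the drop-in.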

Best regards.

Francis
 

FrancisS

Member
Apr 26, 2019
16
0
6
56
Hello,

About "- activate the shared LV" and "- mount the LV on the filesystem": we can do that with only "/etc/fstab" entries and a unit file.

For your purpose, add this line to "/etc/fstab":

/dev/cluster02/backups /data/backups gfs2 noatime,nodiratime,noauto 1 2

"noatime,nodiratime" is for GFS2 performance;
"noauto" disables automatic mounting, because we mount manually once all shared LVM volumes are up.

And add the unit file "/etc/systemd/system/lvmshare.service":

[Unit]
Description=LVM locking LVs and mount LVs start and stop
Documentation=man:lvmlockd(8)
After=lvmlocks.service lvmlockd.service sanlock.service dlm.service

[Service]
Type=oneshot
RemainAfterExit=yes

# activate the shared LVs, then mount each one via its fstab entry
ExecStart=/usr/bin/bash -c "/usr/sbin/vgs --noheadings -o name -S vg_shared=yes | xargs /usr/sbin/lvchange -asy; /usr/sbin/lvs --noheadings -o lv_path -S vg_shared=yes | xargs -n 1 mount"

# unmount the LVs, then deactivate them
ExecStop=/usr/bin/bash -c "/usr/sbin/lvs --noheadings -o lv_path -S vg_shared=yes | xargs umount; /usr/sbin/vgs --noheadings -o name -S vg_shared=yes | xargs /usr/sbin/lvchange -an"

[Install]
WantedBy=multi-user.target

This unit file works like "lvmlocks.service", which starts all shared VGs, so all shared LVs get activated and mounted.

The command "/usr/sbin/vgs --noheadings -o name -S vg_shared=yes | xargs /usr/sbin/lvchange -asy" activates ALL shared LVs in shared mode ("-asy").

The command "/usr/sbin/lvs --noheadings -o lv_path -S vg_shared=yes | xargs -n 1 mount" mounts ALL the LVs using the fstab; "-n 1" runs mount once per LV, so it also works with more than one shared LV.

An LV that is not in the fstab causes an error.
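
To catch that error before the unit runs, one could compare the shared LV paths with the fstab. A small sketch, assuming the LV paths arrive on stdin, one per line; the function name is made up:

```shell
#!/bin/sh
# Sketch: print every LV path from stdin that has no entry in the given fstab file.
lvs_missing_from_fstab() {
    # $1 = path to the fstab file
    while read -r lv; do
        [ -n "$lv" ] || continue   # skip empty lines
        awk -v dev="$lv" '$1 == dev { found = 1 } END { exit(found ? 0 : 1) }' "$1" ||
            echo "$lv"
    done
}

# Example on a node:
#   /usr/sbin/lvs --noheadings -o lv_path -S vg_shared=yes | lvs_missing_from_fstab /etc/fstab
```

No output means every shared LV has an fstab line.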

PS: I am looking for a naming convention for the VGs, LVs and mount points, since potentially we can have multiple VGs, LVs and mount points.
For the GFS2 lock table name I used clustername:vgname-lvname.

Best regards.

Francis
 

FrancisS

Member
Apr 26, 2019
16
0
6
56
Hello,

The LVM shares need to be up before the guests start; for this we can add "Before=pve-guests.service" to the unit lvmshare.service:

[Unit]
Description=LVM locking LVs and mount LVs start and stop
Documentation=man:lvmlockd(8)
After=lvmlocks.service lvmlockd.service sanlock.service dlm.service
Before=pve-guests.service

[Service]
Type=oneshot
RemainAfterExit=yes

# activate the shared LVs, then mount each one via its fstab entry ("-n 1": one mount call per LV)
ExecStart=/usr/bin/bash -c "/usr/sbin/vgs --noheadings -o name -S vg_shared=yes | xargs /usr/sbin/lvchange -asy; /usr/sbin/lvs --noheadings -o lv_path -S vg_shared=yes | xargs -n 1 mount"

# unmount the LVs, then deactivate them
ExecStop=/usr/bin/bash -c "/usr/sbin/lvs --noheadings -o lv_path -S vg_shared=yes | xargs umount; /usr/sbin/vgs --noheadings -o name -S vg_shared=yes | xargs /usr/sbin/lvchange -an"

[Install]
WantedBy=multi-user.target

One more piece of information: for "multipath.conf" I used this config.

defaults {
    user_friendly_names yes
}

blacklist {
    device {
        product ".*"
    }
}

blacklist_exceptions {
    device {
        product "MSA [12]0[45]0 SA[NS]"
    }
}

With this config there is no need to change anything when you add a LUN.
Change the "product" to match your hardware name.
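
Before changing the pattern, it can be tried against product strings with grep. A sketch, assuming multipath's matching behaves like "grep -E" for a simple pattern like this one; the function name and the sample strings are made up:

```shell
#!/bin/sh
# Sketch: test whether a product string matches the blacklist_exceptions pattern.
product_is_whitelisted() {
    # $1 = product string as reported by the storage
    printf '%s\n' "$1" | grep -Eq 'MSA [12]0[45]0 SA[NS]'
}

# Example:
#   product_is_whitelisted "MSA 2040 SAS" && echo "would be handled by multipath"
```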

Best regards.

Francis
 
