iSCSI LUNs or VM image LVs slow down GRUB when updating PVE

stefws

Member
Jan 29, 2015
Denmark
siimnet.dk
We have a 4.4 production cluster attached to a multipathed iSCSI SAN on an HP MSA1040. The MSA is divided into two disk groups, A and B; we created 5+1 iSCSI LUNs per MSA disk group and mapped those to PVs in four volume groups on each hypervisor node, like this:

root@n2:~# vgs
VG     #PV #LV #SN Attr   VSize   VFree
pve      1   3   0 wz--n- 136.57g 16.00g
vgA      5  53   0 wz--n-   3.64t  1.07t
vgAbck   1   1   0 wz--n-   1.82t      0
vgB      5  50   0 wz--n-   3.64t  1.15t
vgBbck   1   1   0 wz--n-   1.82t      0

root@n2:~# pvs
PV                                            VG     Fmt  Attr PSize   PFree
/dev/mapper/3600c0ff000258a36c4cb245601000000 vgAbck lvm2 a--    1.82t       0
/dev/mapper/3600c0ff000258a36decb245601000000 vgA    lvm2 a--  744.49g   4.49g
/dev/mapper/3600c0ff000258a36decb245602000000 vgA    lvm2 a--  744.49g   4.49g
/dev/mapper/3600c0ff000258a36decb245603000000 vgA    lvm2 a--  744.49g  44.49g
/dev/mapper/3600c0ff000258a36dfcb245601000000 vgA    lvm2 a--  744.49g 294.49g
/dev/mapper/3600c0ff000258a36dfcb245602000000 vgA    lvm2 a--  744.49g 744.49g
/dev/mapper/3600c0ff000258cfd1403225601000000 vgBbck lvm2 a--    1.82t       0
/dev/mapper/3600c0ff000258cfd2b03225601000000 vgB    lvm2 a--  744.49g  44.49g
/dev/mapper/3600c0ff000258cfd2b03225602000000 vgB    lvm2 a--  744.49g 744.49g
/dev/mapper/3600c0ff000258cfd2c03225601000000 vgB    lvm2 a--  744.49g 344.49g
/dev/mapper/3600c0ff000258cfd2c03225602000000 vgB    lvm2 a--  744.49g  34.49g
/dev/mapper/3600c0ff000258cfd2c03225603000000 vgB    lvm2 a--  744.49g  14.49g
/dev/sda3                                     pve    lvm2 a--  136.57g  16.00g

The vgXbck LUNs are mapped into an NFS backup server VM (VM ID 200), which is used to store the backup dumps of every VM other than the NFS backup VM itself.

root@n2:~# df -h
Filesystem                    Size  Used Avail Use% Mounted on
udev                           10M     0   10M   0% /dev
tmpfs                          51G  283M   51G   1% /run
/dev/dm-0                      34G  3.7G   28G  12% /
tmpfs                         126G   54M  126G   1% /dev/shm
tmpfs                         5.0M     0  5.0M   0% /run/lock
tmpfs                         126G     0  126G   0% /sys/fs/cgroup
/dev/sda2                     126M  142K  125M   1% /boot/efi
/dev/mapper/pve-data           69G  7.9G   61G  12% /var/lib/vz
cgmfs                         100K     0  100K   0% /run/cgmanager/fs
/dev/fuse                      30M  104K   30M   1% /etc/pve
nfsbackupB:/exports/nfsShareB 1.9T   77G  1.8T   5% /mnt/pve/backupB
nfsbackupA:/exports/nfsShareA 1.9T  432G  1.4T  24% /mnt/pve/backupA

root@n2:~# lvs vgBbck
LV            VG     Attr       LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
vm-200-disk-1 vgBbck -wi------- 1.82t
root@n2:~# lvs vgAbck
LV            VG     Attr       LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
vm-200-disk-1 vgAbck -wi------- 1.82t

The vgX volume groups hold shared VM images and are configured in our storage like this:

root@n2:~# cat /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content iso,backup,vztmpl,images,rootdir
        maxfiles 3

nfs: backupA
        export /exports/nfsShareA
        server nfsbackupA
        path /mnt/pve/backupA
        maxfiles 3
        options vers=4,soft,intr,timeo=300,rsize=262144,wsize=262144
        content iso,backup

nfs: backupB
        export /exports/nfsShareB
        server nfsbackupB
        path /mnt/pve/backupB
        maxfiles 3
        options vers=4,soft,intr,timeo=300,rsize=262144,wsize=262144
        content backup,iso

lvm: vgB
        vgname vgB
        shared
        content images

lvm: vgBbck
        vgname vgBbck
        shared
        content images

lvm: vgA
        vgname vgA
        shared
        content images

lvm: vgAbck
        vgname vgAbck
        shared
        content images

Normally when we patch PVE we do this:

# let's get all non-essential disk devices out of the way...
vgexport -a
umount /mnt/pve/backupA
umount /mnt/pve/backupB
sleep 2
dmsetup remove_all
iscsiadm -m session -u

# run update/upgrade(s)
apt-get update
apt-get -y dist-upgrade
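For completeness, re-attaching the storage afterwards might look roughly like the reverse of the teardown. This is only a sketch based on the commands shown above; the sleep duration and the exact re-login order are assumptions, not a verified procedure:

```shell
# Sketch: reverse of the teardown above (assumption: VG names and
# mountpoints are the ones from this setup). Defined as a function
# here; run it only after the upgrade has finished.
reattach_storage() {
    iscsiadm -m node -l                   # log back in to all known iSCSI targets
    sleep 5                               # give multipath time to assemble its maps
    multipath -r                          # reload the multipath device maps
    vgimport -a                           # clear the exported flag set by 'vgexport -a'
    vgchange -ay vgA vgB vgAbck vgBbck    # reactivate the shared volume groups
    mount /mnt/pve/backupA
    mount /mnt/pve/backupB
}
```

(If the upgrade installed a new kernel, a reboot makes the re-attach unnecessary, since everything comes back on boot anyway.)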

This way GRUB is fast during the update and searches for new boot images etc. properly, as it doesn't probe the VM LVs in vgX nor the iSCSI LUNs directly. Not sure why GRUB is slow otherwise; we were earlier recommended to remove the Debian package os-prober, but we don't seem to have that package installed:

root@n2:~# dpkg -l | egrep -i 'os-pro|grub'
ii grub-common 2.02-pve5 amd64 GRand Unified Bootloader (common files)
ii grub-efi-amd64-bin 2.02-pve5 amd64 GRand Unified Bootloader, version 2 (EFI-AMD64 binaries)
ii grub-efi-ia32-bin 2.02-pve5 amd64 GRand Unified Bootloader, version 2 (EFI-IA32 binaries)
ii grub-pc 2.02-pve5 amd64 GRand Unified Bootloader, version 2 (PC/BIOS version)
ii grub-pc-bin 2.02-pve5 amd64 GRand Unified Bootloader, version 2 (PC/BIOS binaries)
ii grub2-common 2.02-pve5 amd64 GRand Unified Bootloader (common files for version 2)

But the issue with this update approach is the 'vgexport -a': it leaves the vgX volume groups marked as exported on every other PVE host as well, so until the patched node has been rebooted, live migration fails on the other PVE hosts.

Any hints on how to avoid this are appreciated.
 

fabian

Proxmox Staff Member
Jan 7, 2016
how long does "vgs --options vg_uuid,pv_name" take (when the inactive VGs are not exported)?

E.g., when I do "grub-probe --target=fs /", the above vgs command is called twice (because my root is on LVM). The same seems to be true for fs_label and fs_uuid (and probably other probe targets). This can add up, which is probably not a problem for local LVM, but if the scan happens with the added network latency and big volume groups, it might be problematic.
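One way to verify how often update-grub actually invokes vgs is to shadow it with a counting wrapper earlier in PATH. A sketch; the helper name count_calls and the wrapper layout are assumptions, not an existing tool:

```shell
#!/bin/bash
# count_calls NAME REALPATH CMD... - run CMD with a counting wrapper named
# NAME shadowing REALPATH on PATH, then print how often NAME was invoked.
count_calls() {
    local name=$1 real=$2; shift 2
    local dir; dir=$(mktemp -d)
    : > "$dir/hits"
    # The wrapper logs one line per invocation, then execs the real binary.
    printf '#!/bin/bash\necho x >> "%s/hits"\nexec "%s" "$@"\n' "$dir" "$real" > "$dir/$name"
    chmod +x "$dir/$name"
    PATH="$dir:$PATH" "$@" > /dev/null 2>&1
    wc -l < "$dir/hits"
    rm -rf "$dir"
}
```

Something like `count_calls vgs /sbin/vgs update-grub` should then print the number of vgs invocations for a full update-grub run.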
 

stefws

[QUOTE]
root@n2:~# time vgs --options vg_uuid,pv_name
VG UUID                                PV
iujPYu-N6fM-O6Yg-t8sy-OLPS-LP5V-LktlaA /dev/sda3
AnrmOh-1J6Z-H08G-n1Mw-ktvr-hYKh-Kvffp1 /dev/mapper/3600c0ff000258a36decb245601000000
AnrmOh-1J6Z-H08G-n1Mw-ktvr-hYKh-Kvffp1 /dev/mapper/3600c0ff000258a36decb245602000000
AnrmOh-1J6Z-H08G-n1Mw-ktvr-hYKh-Kvffp1 /dev/mapper/3600c0ff000258a36decb245603000000
AnrmOh-1J6Z-H08G-n1Mw-ktvr-hYKh-Kvffp1 /dev/mapper/3600c0ff000258a36dfcb245601000000
AnrmOh-1J6Z-H08G-n1Mw-ktvr-hYKh-Kvffp1 /dev/mapper/3600c0ff000258a36dfcb245602000000
I651kF-l6XQ-0XLA-xpg8-BAKA-wril-4aIFvt /dev/mapper/3600c0ff000258a36c4cb245601000000
Gs9vCI-WfNP-o4pO-JpqF-romv-LcJO-2SUFlZ /dev/mapper/3600c0ff000258cfd2b03225601000000
Gs9vCI-WfNP-o4pO-JpqF-romv-LcJO-2SUFlZ /dev/mapper/3600c0ff000258cfd2c03225603000000
Gs9vCI-WfNP-o4pO-JpqF-romv-LcJO-2SUFlZ /dev/mapper/3600c0ff000258cfd2c03225602000000
Gs9vCI-WfNP-o4pO-JpqF-romv-LcJO-2SUFlZ /dev/mapper/3600c0ff000258cfd2c03225601000000
Gs9vCI-WfNP-o4pO-JpqF-romv-LcJO-2SUFlZ /dev/mapper/3600c0ff000258cfd2b03225602000000
119opU-nZGw-yFwR-Z9Pq-tH5a-xZGS-4oQWtU /dev/mapper/3600c0ff000258cfd1403225601000000

real 0m2.650s
user 0m0.020s
sys  0m0.020s
[/QUOTE]

Might it have to do with the number of VM LVs (all inactive, as the VMs are migrated off the host)?

root@n2:~# lvs | wc -l
109
 

fabian

real 0m2.650s

so ~5 seconds for every grub-probe call, of which there are probably tens in each update-grub run; I guess that is a possible culprit. I'll experiment a bit tomorrow to see how it might be possible to narrow down the number of grub-probe calls.

how many kernel images do you have in your /boot folder?

edit: never mind about the kernel count; the overhead should be negligible, as I expected in your older thread:

1 vs. 18 kernel images is 15.2 vs. 15.6 seconds for the total update-grub run. The bad news is that this probably means all those grub-probe / vgs calls are just for figuring out where / and /boot are...
 
Last edited:

stefws

Usually 2-3 kernel images, normally fewer than 5.

Note that the iSCSI LUNs are connected through a bonded pair of NICs plugged into an Open vSwitch bridge (vmbr1); I don't know if this matters, maybe only when patching the openvswitch package.
 

stefws

I'll experiment a bit tomorrow to see how it might be possible to narrow down the number of grub-probe calls.
Did you ever get around to narrowing down the grub-probe calls?

the bad news is, this probably means that all those grub-probe / vgs calls are just for figuring out where / and /boot are....
Could this be the culprit if vgs is called multiple times:

root@n7:~# time vgs
VG     #PV #LV #SN Attr   VSize   VFree
pve      1   3   0 wz--n- 136.57g 16.00g
vgA      5  53   0 wz--n-   3.64t  1.07t
vgAbck   1   1   0 wz--n-   1.82t      0
vgB      5  50   0 wz--n-   3.64t  1.15t
vgBbck   1   1   0 wz--n-   1.82t      0

real 0m2.906s
user 0m0.012s
sys 0m0.024s
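A back-of-the-envelope estimate under the figures in this thread: ~3 s per vgs scan as timed above, two vgs calls per grub-probe, and "tens" of grub-probe calls per update-grub run. The per-run call count of 20 is a guess, not a measurement:

```shell
# Rough cost of LVM scanning during one update-grub run.
secs_per_scan=3     # measured above: vgs takes ~3 s against the iSCSI PVs
scans_per_probe=2   # grub-probe calls vgs twice when root is on LVM
probe_calls=20      # assumption: "tens" of grub-probe calls per update-grub
echo $(( secs_per_scan * scans_per_probe * probe_calls ))   # -> 120 seconds
```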
 

fabian

Seems likely, but unfortunately I have not found time to debug further. Given the recent corosync fix, this should now be a lot less problematic though: upgrades take a bit long on your systems, but there should no longer be any fencing or other side effects.
 
