Hangs on boot after finding LVM Storage over QLogic FC

mgabriel
Jul 28, 2011
Hi,

I tried to do a test migration from an existing Xen infrastructure to Proxmox 2.0rc1. The current setup consists of two servers with two QLogic FC cards, connected to an Infortrend SAN. The servers are currently running XenServer with XenCenter management.

I could install Proxmox successfully, and after boot the physical volumes on the Infortrend SAN are displayed and the logical volumes (Xen...) are found and displayed, but immediately after that the boot process hangs. I can press Enter and watch the cursor move down, but even after several minutes it doesn't get any further.

Any hints on what to look for?

Best regards,
Marco
 
Hi Marco,


I have a couple of servers booting from an EMC Fibre Channel SAN via QLogic QLA2342 2 Gbps FC HBAs. I ran into similar problems in the beginning, and this is the solution I found:

Boot after install fails, probably because the LUN is not available yet; the kernel has to wait a couple of seconds for the LUN to become ready. Enter the GRUB menu at boot by hitting the "e" key and edit the boot command line to include the rootdelay parameter:



linux /vmlinuz-2.6 [...] ro rootdelay=10



To make this change permanent, use one of the following methods once the system has finished booting:


# sed -i -e "s/DEFAULT=\"quiet\"/DEFAULT=\"rootdelay=10\"/" /etc/default/grub
# update-grub


OR edit the following files so that each contains the line shown below, then run update-grub:


# nano /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="rootdelay=10"


# nano /etc/grub.d/10_linux
linux ${rel_dirname}/${basename} root=${linux_root_device_thisversion} ro rootdelay=10 ${args}


# update-grub


OR change the start parameters for GRUB directly:


# nano /boot/grub/grub.cfg



I prefer the first variant, as it should have the highest probability of surviving a system update :) Either one will enable Proxmox to find the boot LUN, IF you have set up the LUNs and masking / zoning properly and enabled boot support in the HBAs' BIOS.
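Before touching GRUB it is worth confirming that the HBAs actually see the SAN. Something like this should do (the sysfs paths are what the qla2xxx driver normally exposes; lsscsi is an extra package):

# cat /sys/class/fc_host/host*/port_state
# lsscsi

The port state should read "Online" for every HBA, and the Infortrend LUNs should show up in the lsscsi output (or in /proc/scsi/scsi if you don't want to install anything).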



Now, if you want support for multipathing, you need to do some more tweaking (the following example is for EMC Clariion FC SANs with support for ALUA).

Create a config file for multipath:


# nano /etc/multipath.conf


defaults {
    user_friendly_names yes
}

blacklist {
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st|nbd)[0-9]*"
    devnode "^(xvd|vd)[a-z]*"
    devnode "^hd[a-z][[0-9]*]"
    devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
    devnode "^dcssblk[0-9]*"
    devnode "^etherd"

    device {
        vendor "DGC"
        product "LUNZ"
    }
}

blacklist_exceptions {
    # wwid "*"
}

devices {
    device {
        vendor "DGC"
        product ".*"
        product_blacklist "LUNZ"

        hardware_handler "1 emc"
        features "1 queue_if_no_path"
        getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"

        prio emc

        path_grouping_policy group_by_prio
        path_checker emc_clariion
        path_selector "round-robin 0"
        rr_weight uniform

        polling_interval 2
        no_path_retry 60
        dev_loss_tmo 120
        failback immediate
    }
}

# S_BOOT and S_REPO are placeholders, replaced with the real WWIDs below
multipaths {
    multipath {
        wwid "S_BOOT"
        alias "boot"
    }
    multipath {
        wwid "S_REPO"
        alias "repository"
    }
}




Replace the dummy placeholder WWID ("S_BOOT") in the file with the WWID of the boot LUN:


# sed -i -e "s/S_BOOT/`/lib/udev/scsi_id --whitelisted --device=/dev/sda | awk '{print $1}'`/" /etc/multipath.conf
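The second placeholder ("S_REPO") has to be filled in the same way. To see the WWIDs of all detected disks, you can loop scsi_id over them, roughly:

# for d in /dev/sd?; do echo -n "$d "; /lib/udev/scsi_id --whitelisted --device=$d; done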


Write the correct UUID for the boot LUN to /etc/fstab:


# sed -i -e "s/\/dev\/sda1/`blkid | grep -m 1 sd.1 | awk '{print $2}'`/" /etc/fstab
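Double-check the result: the root entry in /etc/fstab should now reference the same UUID that blkid reports for the first partition of the boot LUN:

# blkid | grep -m 1 sd.1
# grep UUID /etc/fstab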




Update the initramfs to enable multipath support at reboot:

# update-initramfs -c -t -k `uname -r`



Update the system and reboot; it should come up without problems:


# aptitude update
# aptitude full-upgrade
# reboot




Install multipath:


# aptitude install multipath-tools-boot




Reboot again!


# reboot




Tidy up the mess that the multipath install created:


# dpkg --configure -a
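At this point you can also check that multipath sees every path to the LUNs:

# multipath -ll

Each LUN should appear once under its alias from multipath.conf, with all of its FC paths listed underneath. If a path is missing, re-check the zoning / masking and the HBA BIOS settings.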




Edit the configuration file for LVM so that it will only "see" the multipath devices:


# sed -r -i -e "s/^([ ]*filter = )(.*)/\1[ \"a|\/dev\/disk\/by-id\/dm-uuid-.*-mpath-.*|\", \"r|.*|\" ]/" /etc/lvm/lvm.conf
# sed -r -i -e "s/^([ ]*)# (types = )(.*)/\1\2[ \"device-mapper\", 1 ]/" /etc/lvm/lvm.conf


OR


# nano /etc/lvm/lvm.conf


filter = [ "a|/dev/disk/by-id/dm-uuid-.*-mpath-.*|", "r|.*|" ]
types = [ "device-mapper", 1 ]
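Either way, a quick sanity check: LVM should now only report physical volumes on the multipath devices, with no plain /dev/sdX paths left:

# pvs -o pv_name,vg_name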




Update the initramfs once again:

# update-initramfs -c -t -k `uname -r`
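If you want to be sure the multipath tools actually ended up in the new initramfs, you can list its contents (the initrd path follows the usual Debian naming):

# lsinitramfs /boot/initrd.img-`uname -r` | grep -i multipath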
 
Thank you for your very detailed hints.

The problem is solved. It took around 4 hours after the LVM discovery, but after that the machine came up. Since that initial session, a boot still takes around 20 minutes, but that's something I can live with.

I just have some problems with multiport Intel network cards, which don't work. They show up, the kernel module e1000e loads and finds the cards, but the connection just does not work. This might be a problem with the specific card and the e1000e version in the kernel. I'm going to compile the module myself from the sources on the Intel homepage. I don't know if that's going to work, but it's my best guess, as XenServer finds and uses the cards correctly.
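Roughly what I plan to try (the version number is just a placeholder for whatever Intel currently offers, and the kernel header package name may differ on your system):

# aptitude install build-essential pve-headers-`uname -r`
# tar xzf e1000e-<version>.tar.gz
# cd e1000e-<version>/src
# make install
# rmmod e1000e && modprobe e1000e
# update-initramfs -u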

Best regards,
Marco
 
I'm still working on the migration from Xen.

Current status is now:
- Proxmox boots from the local hard disk
- finds the FC storage (QLogic card x 2 -> Infortrend SAN)
- enables multipathing and shows the existing Xen volume groups
- after showing them, the boot process hangs for around 20 minutes
- then it continues and boots all necessary VMs

So, it works, but it is slooow.

Even if Xen isn't really fast on this setup, Proxmox is way slower than the Xen installation. The virtual machines respond slowly.

What I discovered:

- If I copy a file from the FC SAN to the local disk, the copy hangs for a second every few seconds, then continues. The transfer speed varies from 170 MB/s down to 2 MB/s and back up again (I can reproduce this with a plain dd read, see below).
- If I copy a file from the local disk to the FC SAN, it works as expected and the file is copied at a constant speed until it is finished.
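The read test is basically just a sequential read with dd, once from the multipath device and once from the local disk for comparison (mpath0 stands for whatever name or alias your multipath LUN got, the local disk here is sda):

# dd if=/dev/mapper/mpath0 of=/dev/null bs=1M count=4096 iflag=direct
# dd if=/dev/sda of=/dev/null bs=1M count=4096 iflag=direct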

The LUN is set up as a RAID5 volume, which was done before we came into the project. It will be reformatted as RAID10 as soon as we do the final migration, but that migration is postponed until we find the reason for the slow (or unreliable) transfer speeds.

Any hints would be greatly appreciated.

Thanks in advance,
Marco
 
I would say that you have some kind of H/W problem. Check the FC cabling, perhaps replace the cables for testing? Check syslog / messages for any errors during boot. Also, the firmware on the QLogic cards may be old. Check the settings in the QLogic BIOS during boot; if you need them, I can supply my QLA2340/2342 and QLE2642 settings.
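For the firmware and error check, the qla2xxx driver exposes both via sysfs, so something along these lines will show the firmware version and whether the links are throwing errors:

# cat /sys/class/fc_host/host*/symbolic_name
# grep . /sys/class/fc_host/host*/statistics/link_failure_count
# grep . /sys/class/fc_host/host*/statistics/loss_of_sync_count
# dmesg | grep -i qla

Rising link-failure / loss-of-sync counters usually point at cabling or SFPs.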
 
It would be great if you could provide your settings; I'd be happy to compare mine against them.

I thought about a H/W problem, but the cabling has been checked several times, and there are two separate FC cables from the host to the storage. I'll check again and maybe replace the cables just to be sure.

Thanks in advance,
Marco
 
