Proxmox VE 1.8 cluster with iSCSI (LUNs on a SAN) : multiple problems

  • Thread starter: CalamityDjenn (Guest)
Hi there,

I set up Proxmox VE 1.8 on an HP ProLiant DL360 G5 server (master) and a small C2D desktop computer (slave). The slave server is only there in case we have a problem with the master, so all the VMs are tied to the master server.
The goal is to use one or more LUNs on the SAN we have on the network, because we only have 79 GB of (RAID) physical storage on the master server. That way, I thought, it would be easy to have the VMs run on the slave server if the master is out of order...
I want to use these LUNs for virtual disks and backups; only ISOs (only one at the moment) and templates (we don't use OpenVZ at the moment) would stay on the physical hard drive.

I am quite a newbie with iSCSI, so I think what I have done is wrong...

I created my new test volume + LUN on the SAN side, then configured the initiator and volume group to allow access from both servers.
Then, on the master server, I initialized the connection to the SAN with iscsiadm commands.
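Roughly, that part looked like this (the portal IP and target IQN below are placeholders, not our real values):

Code:
# discover the targets exposed by the SAN
iscsiadm -m discovery -t sendtargets -p 192.168.0.10
# log in to the discovered target
iscsiadm -m node -T iqn.2011-05.com.example:storage.test -p 192.168.0.10 --login
# make the login automatic at boot
iscsiadm -m node -T iqn.2011-05.com.example:storage.test -p 192.168.0.10 --op update -n node.startup -v automatic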

Here are the steps I did after that:

Code:
pvcreate /dev/sda1

vgcreate stockVM /dev/sda1

lvcreate -L55000 -n lv_vz stockVM

lvcreate -L25000 -n lv_dump stockVM

mkfs.ext3 /dev/stockVM/lv_vz
mkfs.ext3 /dev/stockVM/lv_dump

mkdir /var/lib/vz2
mkdir /var/lib/vz2/dump

mount /dev/stockVM/lv_vz /var/lib/vz2
mount /dev/stockVM/lv_dump /var/lib/vz2/dump

echo "dumpdir: /var/lib/vz2/dump/" >> /etc/vzdump.conf

echo "#  LVM from SAN
/dev/mapper/stockVM-lv_vz /var/lib/vz2 ext3 defaults 0 0
/dev/mapper/stockVM-lv_dump /var/lib/vz2/dump ext3 defaults 0 0
" >>/etc/bak_fstab

From there, I can manage all the files from the command line in case we have a problem with the web interface, and the backups run fine, BUT...
If we reboot the server, the VMs cannot start at boot because /etc/fstab is processed BEFORE the iSCSI connection to the SAN, so our disks are not mounted on the system. We have to run mount -a once the system is ready, and then start the VMs manually! Really annoying :(

Also, I think this is NOT the right solution, because I found out that we can add an iSCSI target + LVM group on the server(s) through the web interface. The problem is I have absolutely no idea how to manage the files with this method: I tried it with a new LUN on the slave server, but I did not see any folder where the LUN was mounted on the system. Is that normal?
How can I then copy a VM from one server to another?

Also, I automated snapshots through the web interface so all my VMs are saved every day, but I only get one file per VM. So I wanted to write a little bash script to duplicate those files so I could keep two per VM: one from today + one from yesterday, for example.
This is in case a problem occurs on the virtual system just before the backup runs (it has happened), for safety reasons.
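Something like this minimal sketch is what I have in mind (the dump directory is mine, the file name pattern is just an example), run right before the nightly backup:

Code:
#!/bin/bash
# keep a copy of the previous vzdump file for each VM before the new backup replaces it
DUMPDIR=/var/lib/vz2/dump
for f in "$DUMPDIR"/vzdump-*; do
    # skip the copies made on a previous run
    case "$f" in *.yesterday) continue ;; esac
    [ -e "$f" ] || continue
    cp -p "$f" "$f.yesterday"
done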

I don't know if everything is clear... The main thing is that our slave server does not run any VMs; it is there ONLY in case the master goes down. It has to be able to start the VMs in the state they were in before the master server stopped. I don't know how to set up my iSCSI storage for that to work properly...

Thank you for your attention.
 
Hi,

In order to migrate a VM from host A to host B, you MUST use shared storage, for example an NFS server (qcow2 or vmdk image files) or iSCSI + LVM storage (raw access).
iSCSI + LVM will give you the best performance.

Note: if your master crashes brutally, you can't migrate or copy your VMs to the slave with the web interface; this is a limitation of the current version of Proxmox and will be fixed in 2.0. The reason is that the configuration file of a VM is only stored on the host that runs the VM, and there is no automatic migration or restart mechanism.
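For example, the KVM configuration files live in /etc/qemu-server/ on each host, so at minimum you can copy them to the other node by hand from time to time (the hostname below is a placeholder):

Code:
# copy the config of VM 101 to the slave so it can be started there after a crash
scp /etc/qemu-server/101.conf root@slave:/etc/qemu-server/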

I hope I can make myself understood :)
 
Hi Jack, and thank you for your answer.

So, how can I solve my problem then? By reinstalling both servers and using DRBD?? I don't know if that is possible, as my hardware is not identical...
Another idea would have been to install Proxmox VE on a LUN and boot the servers from the network, but my slave does not have the same hardware...

I thought that if I used iSCSI LUNs I would not have to do any of that: if my master crashes, I just need to start the VMs from my slave server... So my VMs would be in the state they were in before the master crashed! The .conf files of each VM are not a problem if I store them on the LUNs as well.
But I don't know if I did the right thing by mounting my LUN from the system and not through the web interface... I can manage my LUNs and files as if it were a physical hard drive. I would like to have the same on the slave.

What I thought was to prepare the slave server with Proxmox VE installed on it + the LUNs mounted on the system, exactly like on the master server.
If the master crashes, I just execute a bash script (that I would create) to automatically start up all the VMs from the LUNs (qm start VMID lines).
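Just a rough sketch of what I mean (the VMIDs are examples):

Code:
#!/bin/bash
# start all the production VMs on the slave once the master is dead
for vmid in 101 102 103; do
    qm start "$vmid"
done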

But I don't know how to mount my LUNs properly when the system boots: the iSCSI connection is made AFTER the partitions are mounted from fstab, so nothing is automated. I also noticed some problems with the iSCSI connection at boot: I have to press Ctrl+C for the system to continue booting, and then mount my partitions manually...

Thank you.
 
Hey Jack,

I wish I could send you a PM in French (as I see you are from France); we might understand each other better :p
Local backing is obviously what I have done, as you can see from my initial post. All my virtual disks already use the .raw format.

But as you can see, I created logical partitions on a logical volume from my SAN. If I understood how it works (not sure), those partitions are only accessible from my master server's system. If it crashes, no more mount points, no more VMs.
I just want a solution for my secondary server to be able to take over from my primary server in case of a problem.

What I want is to be able to copy whatever I want from these logical partitions (LVs from the SAN) to another LUN whenever I want, so that those files are accessible from any computer (in this case, my secondary Proxmox VE server).

I may not have understood everything about LVM + iSCSI, but it seems I cannot do what I want to do in my current setup.
Only DRBD seems to be a solution, but I would have to recreate all my current partitions, wouldn't I?
 
Hey Jack,

I wish I could send you a PM in French (as I see you are from France); we might understand each other better :p
Local backing is obviously what I have done, as you can see from my initial post. All my virtual disks already use the .raw format.

But as you can see, I created logical partitions on a logical volume from my SAN. If I understood how it works (not sure), those partitions are only accessible from my master server's system. If it crashes, no more mount points, no more VMs.
I just want a solution for my secondary server to be able to take over from my primary server in case of a problem.
No, lvm-storage works differently. Your SAN slice is one device which both nodes see. You create (like you did before) a physical volume (pvcreate) and a volume group (vgcreate). After that, you define this volume group as lvm-storage in the pve-gui.
Now you can create a hard disk for a KVM VM on this storage (the hard disk is a logical volume on the VG). If one node dies, you can access the LV on the other node (but like Jack wrote before: save the config).
For this solution you don't need mount points!
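For example (the device and VG names below are only examples, check yours with fdisk -l and vgdisplay):

Code:
# on the iSCSI device that both nodes can see
pvcreate /dev/sdb1
vgcreate sanVG /dev/sdb1
# then add sanVG in the web interface as an LVM group storage,
# and create the VM disks on it - each disk is a logical volume, no filesystem, no mount point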

Udo
 
Thank you Udo, I know all of that, but what do I do with my existing VMs?
Once I have added the LVM group in the pve-gui, I know I will have access to my LUN to create KVM VMs.
But I could only manage the VMs I create through the pve-gui.

I wish I could do this: cp /var/lib/vz2/images/101/my_vm101.raw /dev/mapper/stockVM-my_new_LUN-vm101, for example, but I am pretty sure that cannot work ^^

I cannot recreate my VMs (they are production machines); I need to copy/move them to the new place.

(I think I will succeed in making myself understood one day xD)
 
Thank you Udo, I know all of that, but what do I do with my existing VMs?
Once I have added the LVM group in the pve-gui, I know I will have access to my LUN to create KVM VMs.
But I could only manage the VMs I create through the pve-gui.

I wish I could do this: cp /var/lib/vz2/images/101/my_vm101.raw /dev/mapper/stockVM-my_new_LUN-vm101, for example, but I am pretty sure that cannot work ^^
The simple way:
Create a new hard disk for the VM on the lvm-storage with the GUI.
Check the name of the new logical volume with lvdisplay.
Code:
dd if=/var/lib/vz2/images/101/my_vm101.raw of=/dev/stockVM/vm-101-disk-1
After that, remove the old hard disk (do not delete it yet) and select the lvm hard disk as the boot disk.
If everything works you can delete the old raw file.
You can also do all of this by hand without the GUI if you know what you are doing (don't forget the lvm tags).
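For example, assuming the GUI created a volume called vm-101-disk-1 (check the exact name with lvdisplay):

Code:
# make sure the new LV is at least as big as the raw image
lvdisplay /dev/stockVM/vm-101-disk-1
ls -lh /var/lib/vz2/images/101/my_vm101.raw
# copy the image; a bigger block size makes dd much faster than the 512-byte default
dd if=/var/lib/vz2/images/101/my_vm101.raw of=/dev/stockVM/vm-101-disk-1 bs=1M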

Udo
 
If we reboot the server, the VMs cannot start at boot because /etc/fstab is processed BEFORE the iSCSI connection to the SAN, so our disks are not mounted on the system. We have to run mount -a once the system is ready, and then start the VMs manually! Really annoying :(

For this problem, try the _netdev mount option in fstab (see man mount).
It makes the mount wait until the network is up before mounting the filesystem.
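With the entries from your first post, that would look like this:

Code:
/dev/mapper/stockVM-lv_vz    /var/lib/vz2       ext3  _netdev,defaults  0  0
/dev/mapper/stockVM-lv_dump  /var/lib/vz2/dump  ext3  _netdev,defaults  0  0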
 
Thank you very much GillesMo! Everything was in man mount; I just did not think about it!
@Udo, thank you as well, that should do it!

At the moment I have a Windows XP virtual machine which just stopped working 20 minutes ago... It cannot boot from its virtual hard drive anymore... I couldn't stop it from the pve-gui, so I did qm unlock VMID then qm stop VMID. That was OK. But when I did qm start VMID I got a permission denied error. I was in an SSH session on my master server, as root. I have never had a problem like this before!
I thought I might have had a blue screen because of errors writing to the virtual hard drive, but I am not sure that is the reason...
This is bad... >.<

And when I try to dd the .raw file to the lvm hard disk I get an Input/Output error (even though they are the same size). I forced it with conv=noerror,sync and am waiting for it to finish to take a deeper look; it is quite slow (1.5 MB/s to 2.1 MB/s).

So, I'm still waiting to test my clustering now. :)
 
OK, so I have a problem when mounting: the filesystem is read-only... I was unable to umount -f, so I rebooted the master server. Now I no longer have the /dev/mapper/stockVM-* devices or /dev/stockVM/, but my volume group stockVM looks fine in lvdisplay and vgdisplay...

What is missing? I'm going crazy right now...

srv-vm1:~# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/pve-root 17G 12G 4.8G 71% /
tmpfs 4.9G 0 4.9G 0% /lib/init/rw
udev 10M 648K 9.4M 7% /dev
tmpfs 4.9G 0 4.9G 0% /dev/shm
/dev/mapper/pve-data 43G 19G 24G 44% /var/lib/vz
/dev/cciss/c0d0p1 504M 54M 425M 12% /boot
srv-vm1:~# pvscan
Found duplicate PV t9w2exMl1TaOWeSLVbfT4VypnqJCgoWk: using /dev/sdb1 not /dev/sda1
PV /dev/sdb1 VG stockVM lvm2 [79.84 GB / 728.00 MB free]
PV /dev/block/104:2 VG pve lvm2 [67.83 GB / 4.00 GB free]
Total: 2 [147.67 GB] / in use: 2 [147.67 GB] / in no VG: 0 [0 ]
srv-vm1:~# lvscan
Found duplicate PV t9w2exMl1TaOWeSLVbfT4VypnqJCgoWk: using /dev/sdb1 not /dev/sda1
inactive Original '/dev/stockVM/lv_vz' [53.71 GB] inherit
inactive '/dev/stockVM/lv_dump' [24.41 GB] inherit
inactive Snapshot '/dev/stockVM/vzsnap-srv-vm1-0' [1.00 GB] inherit
ACTIVE '/dev/pve/swap' [4.00 GB] inherit
ACTIVE '/dev/pve/root' [17.00 GB] inherit
ACTIVE '/dev/pve/data' [42.84 GB] inherit
srv-vm1:~# vgscan
Reading all physical volumes. This may take a while...
Found duplicate PV t9w2exMl1TaOWeSLVbfT4VypnqJCgoWk: using /dev/sdb1 not /dev/sda1
Found volume group "stockVM" using metadata type lvm2
Found volume group "pve" using metadata type lvm2
srv-vm1:~#

/dev/sda1 (where I did pvcreate /dev/sda1 + vgcreate stockVM /dev/sda1 a few days ago) has become /dev/sdb1. Should I modify something then?
 
MrJack, I love you. I'll try to repair my WXP VM now. I'll let you know about the clustering thing ;)
 
Hi,

Note: if your master crashes brutally, you can't migrate or copy your VMs to the slave with the web interface; this is a limitation of the current version of Proxmox and will be fixed in 2.0. The reason is that the configuration file of a VM is only stored on the host that runs the VM, and there is no automatic migration or restart mechanism.

I hope I can make myself understood :)

I wrote a small script that stores all of the config files on an FTP server. If a server "brutally" crashes, I just click a button and all the configs from that server are uploaded to a backup server of my choosing. It's not seamless like VMware, but it's better than losing the configs.
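Nothing fancy, roughly along these lines (the server address and credentials are placeholders, and the path assumes KVM guests):

Code:
#!/bin/bash
# upload every KVM config file to an FTP server, under a folder named after this host
HOST=ftp.example.com
USER=backup
PASS=secret
for conf in /etc/qemu-server/*.conf; do
    curl -s -T "$conf" --ftp-create-dirs --user "$USER:$PASS" \
        "ftp://$HOST/proxmox-configs/$(hostname)/"
done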