Live migration of KVM VMs from slave to master unreliable


jdcynical

Guest
Hello to all, I hope someone can provide some insight on an issue I'm seeing. I have searched the forum but have been unable to find an answer.

Cluster info:

  • Cluster members: Two Dell PE1950s
  • Shared storage: Dell PE2950, PERC6/i RAID 10. Storage is exposed both as an NFS export and as LVM over iSCSI (a disk image backs the exported block device); it is set up both ways for testing.
  • KVM guests are fresh 64-bit installs of Debian 6 (Squeeze)

Cluster master (a pair of these quad-core Xeons, 8 GB RAM):
Code:
cat /proc/cpuinfo

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Xeon(R) CPU           E5410  @ 2.33GHz
stepping        : 10
cpu MHz         : 2327.833
cache size      : 6144 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 xsave lahf_lm tpr_shadow vnmi flexpriority
bogomips        : 4655.66
clflush size    : 64
cache_alignment : 64
address sizes   : 38 bits physical, 48 bits virtual
power management:
Cluster slave (a single Xeon, 4 GB RAM):
Code:
cat /proc/cpuinfo

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU            5110  @ 1.60GHz
stepping        : 6
cpu MHz         : 1596.559
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx tm2 ssse3 cx16 xtpr pdcm dca lahf_lm tpr_shadow
bogomips        : 3193.11
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:
Version info:

Code:
pveversion -v

pve-manager: 1.7-11 (pve-manager/1.7/5470)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.7-30
pve-kernel-2.6.32-4-pve: 2.6.32-30
qemu-server: 1.1-28
pve-firmware: 1.0-10
libpve-storage-perl: 1.0-16
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-10
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.13.0-3
ksm-control-daemon: 1.0-4
What I am seeing:

Migrating from the master node to the secondary works fine, with perhaps a second of unresponsiveness.

Code:
/usr/sbin/qmigrate --online 10.88.25.72 101
Mar 04 16:35:09 starting migration of VM 101 to host '10.88.25.72'
Mar 04 16:35:09 copying disk images
Mar 04 16:35:09 starting VM on remote host '10.88.25.72'
Mar 04 16:35:10 starting migration tunnel
Mar 04 16:35:10 starting online/live migration
Mar 04 16:35:21 migration status: completed
Mar 04 16:35:21 migration speed: 11.64 MB/s
Mar 04 16:35:22 migration finished successfuly (duration 00:00:14)
VM 101 migration done
Migration from the slave node to the master, however, takes much longer and ends with the VM in a stopped state:

Code:
/usr/bin/ssh -t -t -n -o BatchMode=yes 10.88.25.72 /usr/sbin/qmigrate --online 10.88.24.175 101
Mar 04 16:36:00 starting migration of VM 101 to host '10.88.24.175'
Mar 04 16:36:00 copying disk images
Mar 04 16:36:00 starting VM on remote host '10.88.24.175'
Mar 04 16:36:00 starting migration tunnel
Mar 04 16:36:01 starting online/live migration
Mar 04 16:43:41 migration status: completed
Mar 04 16:43:41 migration speed: 0.28 MB/s
Mar 04 16:43:43 migration finished successfuly (duration 00:07:44)
Connection to 10.88.25.72 closed.
VM 101 migration done
I know that matched CPUs between cluster members are strongly recommended; however, looking at the CPU flags seen by the cluster members, there are only a few flags on the master that the slave doesn't have (est, sse4_1, xsave, vnmi, flexpriority). All of the running VMs see the same virtual CPU:

Code:
/proc/cpuinfo

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 2
model name      : QEMU Virtual CPU version 0.13.0
stepping        : 3
cpu MHz         : 1596.663
cache size      : 4096 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 4
wp              : yes
flags           : fpu de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm up pni cx16 lahf_lm
bogomips        : 3201.03
I have seen this behavior with both iSCSI and NFS shared storage, and I am at a loss as to the cause. The only thing I have been able to find is that KVM might use the xsave/xrstor instructions for live migration, but I'm unable to determine whether this is the cause and whether it's possible to 'hide' it from KVM.

Any comments or pointers? Thank you for reading!
 
Hi,
if you migrate the VM from the node to the master, had the VM run on the master before?

I saw this effect with an old KVM version and AMD CPUs: it was possible to migrate once, but on migrating back the VM hung. That issue has since been solved, though...

Have you tried using the following in the VM config (/etc/qemu-server/101.conf):
args: -cpu qemu64,-nx

But normally this should not be necessary since KVM 0.12.3.
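For context, the line simply goes into the VM's config file alongside the existing settings; a minimal sketch (the other values here are just placeholders, only the args line matters):
Code:
name: debian-test
memory: 256
args: -cpu qemu64,-nx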

Udo
 
Hello!
if you migrate the VM from the node to the master, had the VM run on the master before?
Yes. My testing was to start the VM on one node, migrate it to the other, and migrate it back. During the migration, I had a ping running against the guest and an open SSH session running top for visual feedback.

I saw this effect with an old KVM version and AMD CPUs: it was possible to migrate once, but on migrating back the VM hung. That issue has since been solved, though...
The guest isn't hanging; it's simply shutting down. The UI shows the guest as stopped, and I have to restart it after the migration finishes.

Have you tried using the following in the VM config (/etc/qemu-server/101.conf):
args: -cpu qemu64,-nx
No, I haven't tried that. The threads I can find mentioning that string refer to migrating guests between Intel and AMD chips, whereas I'm dealing with all-Intel hosts here, and the guests already see the CPU as a 64-bit QEMU chip (hence the lm flag). But I'll give it a shot when I'm in the office Monday.

EDIT: Well, I happened to be logged in remotely for something else tonight and tried this. No joy; same deal as before. The migration back to the master takes far longer and ends with the guest in a stopped state.

Thank you for the reply; I'll follow up with the results. Anyone else have any suggestions? I'm more than willing to experiment right now, as this setup isn't in full production yet, so I can break a few things if needed. :)
 
I'd say the Xeon 51xx and 54xx are different enough that you should hide some of the flags using -cpu, just as you would between AMD and Intel.
 
I'd say the Xeon 51xx and 54xx are different enough that you should hide some of the flags using -cpu, just as you would between AMD and Intel.

Looking at the flags on the chips, the master has the following flags that are not present on the slave:

est
sse4_1
xsave
vnmi
flexpriority

Looking into these a bit deeper, my guess is that the last three might be throwing things off, as they deal with virtualization handling (and I do see xsave referenced in dmesg). I have no problem hiding these flags; the question is how. I thought the /etc/qemu-server files were used specifically for exposing and hiding flags from the guests, and since the guests see the same 'CPU and flags' regardless of which host they're running on, I didn't think that would apply.
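As a starting point, and assuming the syntax matches stock qemu-kvm, I can at least list the CPU models (and, on newer builds, the recognized CPUID flag names) that the installed kvm binary knows about before building an args line:
Code:
kvm -cpu ?
Individual flags can then be added or removed with +flag/-flag on the -cpu line, like the qemu64,-nx example above.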
 
Read about the -cpu arg in KVM, and change it as you wish.
Which doesn't apply here as far as I can tell.

From the KVM FAQ:
Does KVM support live migration from an AMD host to an Intel host and back?

Yes. There may be issues on 32-bit Intel hosts which don't support NX (or XD), but for 64-bit hosts back and forth migration should work well. Migration of 32-bit guests should work between 32-bit hosts and 64-bit hosts. If one of your hosts does not support NX, you may consider disabling NX when starting the guest on a NX-capable system. You can do it by passing "-cpu qemu64,-nx" parameter to the guest.

The guests on my cluster are 64-bit, and the Intel CPUs in both cluster members (the host machines, not the guests) support nx. Besides, I already tried specifying "-cpu qemu64,-nx" and it didn't make any difference. If anything, I would expect moving from the master, which has the more capable CPU, to the slave with the 'lesser' CPU to be the problem, but that direction works fine; it's moving from the slave to the master that fails.

So please explain to me, what am I missing when you are referring to the cpu argument?
 
Which doesn't apply here as far as I can tell.

From the KVM FAQ:


The guests on my cluster are 64-bit, and the Intel CPUs in both cluster members (the host machines, not the guests) support nx. Besides, I already tried specifying "-cpu qemu64,-nx" and it didn't make any difference. If anything, I would expect moving from the master, which has the more capable CPU, to the slave with the 'lesser' CPU to be the problem, but that direction works fine; it's moving from the slave to the master that fails.

So please explain to me, what am I missing when you are referring to the cpu argument?
http://www.linux-kvm.org/page/Tuning_KVM

Maybe try -cpu qemu64,+ssse3,+sse4.1,+sse4.2,+x2apic
 
http://www.linux-kvm.org/page/Tuning_KVM

Maybe try -cpu qemu64,+ssse3,+sse4.1,+sse4.2,+x2apic

No luck.

Guest results after trying this args line:
Code:
cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 2
model name      : QEMU Virtual CPU version 0.13.0
stepping        : 3
cpu MHz         : 2327.472
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 4
wp              : yes
flags           : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm up rep_good pni ssse3 cx16 sse4_1 x2apic hypervisor lahf_lm
bogomips        : 4654.94
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

I'm not convinced this is a guest CPU issue: migration from the slave back to the master takes several minutes before it reports 'successful', yet moving from the master to the slave is almost instantaneous (10-15 seconds at most).
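To rule out an asymmetric network problem, one rough check (just a sketch, using the node IPs from the logs above) would be to push data over SSH in the slow direction and compare it with the other direction:
Code:
# from the slave, push ~256 MB over SSH to the master and note the throughput
dd if=/dev/zero bs=1M count=256 | ssh 10.88.24.175 'cat > /dev/null'
# then run the equivalent from the master toward the slave (10.88.25.72)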
 
OK, dumb question, how would I go about doing that?

Add "deb http://download.proxmox.com/debian lenny pvetest" to sources.list?
Do I need to have both nodes using 0.14?
Do I need to use the testing kernel as well?

I want to confirm before I try anything as I don't want to make anything worse. :)
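For my own notes, here is what I assume the steps would look like (not yet run on either node, so please correct me if this is wrong):
Code:
echo "deb http://download.proxmox.com/debian lenny pvetest" >> /etc/apt/sources.list
apt-get update
apt-get install pve-qemu-kvm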

Ok dumb question here also:
Are you sure that you have the same versions of PVE on both nodes? Show us pveversion -v from both.
 
Ok dumb question here also:
Are you sure that you have the same versions of PVE on both nodes? Show us pveversion -v from both.
Funny you should ask, I was just thinking about that this morning, so I checked.

The master is as listed in the first post, but the secondary looked like this:

Code:
pveversion -v

pve-manager: 1.7-11 (pve-manager/1.7/5470)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.35: 1.7-9
pve-kernel-2.6.32-4-pve: 2.6.32-30
pve-kernel-2.6.35-1-pve: 2.6.35-9
qemu-server: 1.1-28
pve-firmware: 1.0-10
libpve-storage-perl: 1.0-16
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-10
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.13.0-3
ksm-control-daemon: 1.0-4
Somewhere, somehow (probably a typo on my part), that 2.6.35-9 kernel was added, which seems to have pulled in proxmox-ve-2.6.35: 1.7-9 and replaced proxmox-ve-2.6.32: 1.7-30.

I've since corrected this on the secondary (gotta love apt-get), and pveversion -v now reports the same as the master, but migration still fails. I plan on rebooting the machine later tonight and trying again to verify.

So yes, I'm going to hold off on adding the testing repo until this is done. :)
 
OK, I rebooted the secondary and verified that the versions on the two machines are the same. I tried a live migration from the slave to the master: same deal... it took several minutes and ended with the guest in a stopped state.

Somewhere during the "starting online/live migration" phase, something goes wrong. Is there a config file somewhere where I can turn the logging up to a debug level and tail it during the process?
 
It does not really make sense to debug an old version - we will release kvm 0.14.0 next week, so I suggest you test with that version first.
 
It does not really make sense to debug an old version - we will release kvm 0.14.0 next week, so I suggest you test with that version first.
OK, I'll wait for it to hit the repository and test again at that point.

*crosses fingers*
 
Greetings. Can you tell me what the status is of the kvm 0.14 release? I've been checking the repo and I'm not seeing a newer version of kvm yet.

Or was I mistaken when I read the previous post and thought that kvm 0.14 was going to be released to the stable repo?
 
Live migration of KVM VMs from slave to master still not working...

I have verified that both cluster members have the same results:

Code:
pveversion -v
pve-manager: 1.8-15 (pve-manager/1.8/5754)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.8-32
pve-kernel-2.6.32-4-pve: 2.6.32-32
qemu-server: 1.1-30
pve-firmware: 1.0-11
libpve-storage-perl: 1.0-16
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-11
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.14.0-2
ksm-control-daemon: 1.0-5
Just updated to 1.8 as per the wiki (apt-get update, aptitude safe-upgrade, reboot), and live migration from host #2 (the lesser of the two machines) to host #1 (the master) still doesn't work right. I am getting somewhat different results this time.

All guests have their drives available via NFS.

Migration from 1 to 2:

Code:
/usr/sbin/qmigrate --online 10.10.10.2 105
Mar 31 01:11:33 starting migration of VM 105 to host '10.10.10.1'
Mar 31 01:11:33 copying disk images
Mar 31 01:11:33 starting VM on remote host '10.10.10.2'
Mar 31 01:11:33 starting migration tunnel
Mar 31 01:11:34 starting online/live migration
Mar 31 01:11:36 migration status: active (transferred 20749KB, remaining 90196KB), total 278976KB)
Mar 31 01:11:38 migration status: active (transferred 39280KB, remaining 59828KB), total 278976KB)
Mar 31 01:11:40 migration status: active (transferred 58404KB, remaining 38676KB), total 278976KB)
Mar 31 01:11:42 migration status: active (transferred 77053KB, remaining 18460KB), total 278976KB)
Mar 31 01:11:44 migration status: completed
Mar 31 01:11:44 migration speed: 25.60 MB/s
Mar 31 01:11:45 migration finished successfuly (duration 00:00:13)
VM 105 migration done
Relatively quick, but what is it transferring that is around 278 MB in size? I'm guessing a memory dump?

Now to move it back from #2 to #1:
Code:
/usr/bin/ssh -t -t -n -o BatchMode=yes 10.10.10.2 /usr/sbin/qmigrate --online 10.10.10.1 105
Mar 31 01:15:01 starting migration of VM 105 to host '10.10.10.1' 
Mar 31 01:15:01 copying disk images 
Mar 31 01:15:02 starting VM on remote host '10.10.10.1' 
Mar 31 01:15:02 starting migration tunnel 
Mar 31 01:15:02 starting online/live migration 
Mar 31 01:15:04 migration status: active (transferred 5403KB, remaining 259900KB), total 278976KB) 
Mar 31 01:15:06 migration status: active (transferred 5883KB, remaining 259412KB), total 278976KB) 
Mar 31 01:15:10 migration status: active (transferred 6651KB, remaining 258644KB), total 278976KB) 
Mar 31 01:15:12 migration status: active (transferred 6871KB, remaining 258424KB), total 278976KB) 
Mar 31 01:15:14 migration status: active (transferred 7255KB, remaining 258036KB), total 278976KB) 
Mar 31 01:15:16 migration status: active (transferred 8023KB, remaining 257196KB), total 278976KB) 
Mar 31 01:15:18 migration status: active (transferred 8275KB, remaining 256424KB), total 278976KB) 
Mar 31 01:15:20 migration status: active (transferred 8787KB, remaining 255724KB), total 278976KB)
...
Mar 31 01:28:37 migration status: active (transferred 204501KB, remaining 1836KB), total 278976KB)
Mar 31 01:28:39 migration status: active (transferred 204725KB, remaining 2240KB), total 278976KB)
Mar 31 01:28:41 migration status: active (transferred 205489KB, remaining 1736KB), total 278976KB)
Mar 31 01:28:43 migration status: active (transferred 206033KB, remaining 1564KB), total 278976KB)
Mar 31 01:28:45 migration status: active (transferred 206545KB, remaining 1500KB), total 278976KB)
Mar 31 01:28:47 migration status: active (transferred 207501KB, remaining 1420KB), total 278976KB)
Mar 31 01:28:49 migration status: active (transferred 207853KB, remaining 1492KB), total 278976KB)
Mar 31 01:28:51 migration status: active (transferred 208557KB, remaining 1644KB), total 278976KB)
Mar 31 01:28:53 migration status: active (transferred 209097KB, remaining 1888KB), total 278976KB)
Mar 31 01:28:55 migration status: active (transferred 209321KB, remaining 1816KB), total 278976KB)
Mar 31 01:28:57 migration status: active (transferred 209865KB, remaining 1644KB), total 278976KB)
Mar 31 01:28:59 migration status: active (transferred 210345KB, remaining 1580KB), total 278976KB)
Mar 31 01:29:01 migration status: active (transferred 210629KB, remaining 1724KB), total 278976KB)
Mar 31 01:29:03 migration status: active (transferred 211397KB, remaining 1608KB), total 278976KB)
Mar 31 01:29:05 migration status: active (transferred 211909KB, remaining 1448KB), total 278976KB)
Mar 31 01:29:07 migration status: active (transferred 212929KB, remaining 1460KB), total 278976KB)
Mar 31 01:29:09 migration status: active (transferred 213697KB, remaining 1456KB), total 278976KB)
Mar 31 01:29:11 migration status: active (transferred 214209KB, remaining 1516KB), total 278976KB)
Mar 31 01:29:13 migration status: active (transferred 214465KB, remaining 1616KB), total 278976KB)
Mar 31 01:29:15 migration status: active (transferred 214941KB, remaining 1520KB), total 278976KB)
Mar 31 01:29:17 migration status: active (transferred 215357KB, remaining 1568KB), total 278976KB)
Mar 31 01:29:19 migration status: active (transferred 216093KB, remaining 1280KB), total 278976KB)
Mar 31 01:29:21 migration status: active (transferred 216381KB, remaining 1624KB), total 278976KB)
Mar 31 01:29:23 migration status: active (transferred 216889KB, remaining 1884KB), total 278976KB)
Mar 31 01:29:25 migration status: active (transferred 217369KB, remaining 1796KB), total 278976KB) 
...
And it keeps going but never finishes. As you can see, it runs a heck of a lot slower, and the amount remaining keeps fluctuating up and down rather than converging.

I really don't know where to look to turn up logging levels to troubleshoot this further. Any ideas?
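One thing I may try (assuming the monitor is reachable this way on this version) is to watch the transfer from the QEMU monitor on the source node while the migration is running; "info migrate" is typed at the monitor prompt and shows transferred/remaining RAM:
Code:
qm monitor 105
info migrate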
 
Re: Live migration of KVM VMs from slave to master still not working...

I will try to reproduce your issue here; please tell me in detail what I need to set up so that I have exactly the same configuration.
 
