jdcynical
Hello to all, I hope someone can provide some insight on an issue I'm seeing. I have searched the forum for an answer, but have been unable to find one.
What I am seeing (cluster details, version info, and full migration logs are listed further down):
- Migrating from the master node (two quad-core Xeons, 8 GB RAM) to the slave node (single Xeon, 4 GB RAM) works fine, with perhaps a second of non-response.
- Migrating from the slave node back to the master, however, takes much longer and ends with the VM in a stopped state.
I know that matching CPUs across cluster members is strongly recommended. However, comparing the CPU flags seen by the cluster members, there are only a few flags on the master that the slave doesn't have (est, sse4_1, xsave, vnmi, flexpriority), and all the running VMs see the same virtual CPU (the guest's /proc/cpuinfo output is included further down).
I have seen this behavior with both iSCSI and NFS shared storage, and I am at a loss as to the cause. The only thing I have been able to find is that KVM might use the xsave/xrstor instructions during live migration, but I have been unable to determine whether this is the cause, or whether it's possible to 'hide' xsave from KVM.
Cluster info:
- Cluster members: Two Dell PE1950s
- Shared storage: Dell PE2950, PERC6/i RAID 10. Storage is provided both as an NFS export and as LVM over iSCSI (a disk image backs the exported block device). It is set up this way for testing.
- KVM guests are fresh 64-bit installs of Debian 6 (Squeeze)
Cluster master (pair of these quad core xeons, 8 gigs RAM):
Code:
cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU E5410 @ 2.33GHz
stepping : 10
cpu MHz : 2327.833
cache size : 6144 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 xsave lahf_lm tpr_shadow vnmi flexpriority
bogomips : 4655.66
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:
Cluster slave (single xeon, 4 gigs RAM):
Code:
cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Xeon(R) CPU 5110 @ 1.60GHz
stepping : 6
cpu MHz : 1596.559
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx tm2 ssse3 cx16 xtpr pdcm dca lahf_lm tpr_shadow
bogomips : 3193.11
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
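For anyone who wants to double-check the flag comparison between the two hosts, here is a minimal Python sketch. It only diffs the two `flags` lines pasted above as sets; the variable names are my own, and it says nothing about which of these flags KVM actually relies on during migration.

```python
# Flag lists copied verbatim from the two /proc/cpuinfo outputs in this post.
MASTER_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc "
    "arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx est "
    "tm2 ssse3 cx16 xtpr pdcm dca sse4_1 xsave lahf_lm tpr_shadow vnmi flexpriority"
)
SLAVE_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc "
    "arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx "
    "tm2 ssse3 cx16 xtpr pdcm dca lahf_lm tpr_shadow"
)

master = set(MASTER_FLAGS.split())
slave = set(SLAVE_FLAGS.split())

# Flags present on one host but not the other.
only_master = sorted(master - slave)
only_slave = sorted(slave - master)

print("only on master:", " ".join(only_master) or "(none)")
print("only on slave: ", " ".join(only_slave) or "(none)")
```

With the flags as pasted, the master-only set is est, flexpriority, sse4_1, vnmi and xsave, and the slave has no flags that the master lacks.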
Version info:
Code:
pveversion -v
pve-manager: 1.7-11 (pve-manager/1.7/5470)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.7-30
pve-kernel-2.6.32-4-pve: 2.6.32-30
qemu-server: 1.1-28
pve-firmware: 1.0-10
libpve-storage-perl: 1.0-16
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-10
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.13.0-3
ksm-control-daemon: 1.0-4
Migrating from the master node to the slave works fine, with perhaps a second of non-response:
Code:
/usr/sbin/qmigrate --online 10.88.25.72 101
Mar 04 16:35:09 starting migration of VM 101 to host '10.88.25.72'
Mar 04 16:35:09 copying disk images
Mar 04 16:35:09 starting VM on remote host '10.88.25.72'
Mar 04 16:35:10 starting migration tunnel
Mar 04 16:35:10 starting online/live migration
Mar 04 16:35:21 migration status: completed
Mar 04 16:35:21 migration speed: 11.64 MB/s
Mar 04 16:35:22 migration finished successfuly (duration 00:00:14)
VM 101 migration done
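Migration from the slave node to the master, however, takes much longer and ends with the VM in a stopped state:
Code: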
/usr/bin/ssh -t -t -n -o BatchMode=yes 10.88.25.72 /usr/sbin/qmigrate --online 10.88.24.175 101
Mar 04 16:36:00 starting migration of VM 101 to host '10.88.24.175'
Mar 04 16:36:00 copying disk images
Mar 04 16:36:00 starting VM on remote host '10.88.24.175'
Mar 04 16:36:00 starting migration tunnel
Mar 04 16:36:01 starting online/live migration
Mar 04 16:43:41 migration status: completed
Mar 04 16:43:41 migration speed: 0.28 MB/s
Mar 04 16:43:43 migration finished successfuly (duration 00:07:44)
Connection to 10.88.25.72 closed.
VM 101 migration done
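To put numbers on the difference between the two directions, here is a rough Python sketch that pulls the reported speed and duration out of the two logs above. The `summarize` helper and the regular expressions are my own and only assume the line format shown in these two logs; I have not checked them against other qmigrate versions.

```python
import re

# The relevant lines from the two migration logs above, copied verbatim.
MASTER_TO_SLAVE = """\
Mar 04 16:35:21 migration speed: 11.64 MB/s
Mar 04 16:35:22 migration finished successfuly (duration 00:00:14)
"""
SLAVE_TO_MASTER = """\
Mar 04 16:43:41 migration speed: 0.28 MB/s
Mar 04 16:43:43 migration finished successfuly (duration 00:07:44)
"""


def summarize(log):
    """Return (speed in MB/s, duration in seconds) reported in a log excerpt."""
    speed = float(re.search(r"migration speed: ([\d.]+) MB/s", log).group(1))
    h, m, s = map(int, re.search(r"duration (\d+):(\d+):(\d+)", log).groups())
    return speed, h * 3600 + m * 60 + s


fast = summarize(MASTER_TO_SLAVE)
slow = summarize(SLAVE_TO_MASTER)

print("master -> slave: %.2f MB/s, %d s" % fast)
print("slave -> master: %.2f MB/s, %d s" % slow)
print("duration ratio: %.1fx" % (slow[1] / fast[1]))
```

So the slave-to-master run reports 464 seconds against 14 seconds in the other direction, for the same VM.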
All the running VMs see the same virtual CPU:
Code:
cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 2
model name : QEMU Virtual CPU version 0.13.0
stepping : 3
cpu MHz : 1596.663
cache size : 4096 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 4
wp : yes
flags : fpu de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm up pni cx16 lahf_lm
bogomips : 3201.03
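One more quick check, again as a Python sketch with names of my own choosing: does the guest see any of the flags that exist only on the master? This only looks at what the guest's /proc/cpuinfo reports, so it cannot tell whether KVM itself uses xsave on the host side.

```python
# Flags line from the guest's /proc/cpuinfo above, copied verbatim.
GUEST_FLAGS = (
    "fpu de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush "
    "mmx fxsr sse sse2 syscall nx lm up pni cx16 lahf_lm"
)

# The flags the master has and the slave lacks, as listed in this post.
MASTER_ONLY = {"est", "sse4_1", "xsave", "vnmi", "flexpriority"}

guest = set(GUEST_FLAGS.split())

# Master-only flags that are visible inside the guest.
exposed = sorted(guest & MASTER_ONLY)

print("master-only flags visible in guest:", " ".join(exposed) or "(none)")
```

With the flags as pasted, none of the master-only flags show up inside the guest, so whatever differs between the two directions does not appear to be a CPU feature the guest itself can see.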
Any comments or pointers? Thank you for reading!