Live migration from one node always fails

thevtek

Sep 18, 2012
I am seeing really strange behavior. I run a cluster of several nodes with a NAS exported over NFS. All my virtual machines are KVM guests with their disks on the NAS, and every node can talk to every other node without problems.

Here is what I have:
athena :
root@athena:~# pveversion -v
pve-manager: 2.1-14 (pve-manager/2.1/f32f3f46)
running kernel: 2.6.32-14-pve
proxmox-ve-2.6.32: 2.1-74
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-14-pve: 2.6.32-74
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.3-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.92-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.8-1
pve-cluster: 1.0-27
qemu-server: 2.0-49
pve-firmware: 1.0-18
libpve-common-perl: 1.0-30
libpve-access-control: 1.0-24
libpve-storage-perl: 2.0-31
vncterm: 1.0-3
vzctl: 3.0.30-2pve5
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.1-8
ksm-control-daemon: 1.1-1

megara :
root@megara:~# pveversion -v
pve-manager: 2.1-14 (pve-manager/2.1/f32f3f46)
running kernel: 2.6.32-14-pve
proxmox-ve-2.6.32: 2.1-74
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-14-pve: 2.6.32-74
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.3-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.92-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.8-1
pve-cluster: 1.0-27
qemu-server: 2.0-49
pve-firmware: 1.0-18
libpve-common-perl: 1.0-30
libpve-access-control: 1.0-24
libpve-storage-perl: 2.0-31
vncterm: 1.0-3
vzctl: 3.0.30-2pve5
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.1-8
ksm-control-daemon: 1.1-1

inachos :
root@inachos:~# pveversion -v
pve-manager: 2.1-14 (pve-manager/2.1/f32f3f46)
running kernel: 2.6.32-14-pve
proxmox-ve-2.6.32: 2.1-74
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-14-pve: 2.6.32-74
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.3-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.92-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.8-1
pve-cluster: 1.0-27
qemu-server: 2.0-49
pve-firmware: 1.0-18
libpve-common-perl: 1.0-30
libpve-access-control: 1.0-24
libpve-storage-perl: 2.0-31
vncterm: 1.0-3
vzctl: 3.0.30-2pve5
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.1-8
ksm-control-daemon: 1.1-1
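
The three listings above look identical, but comparing them by eye is error-prone. A minimal sketch of how they could be compared programmatically, assuming the `pveversion -v` output of each node has been saved as text; the helper names `parse_pveversion` and `diff_versions` are made up for illustration and are not part of any Proxmox tool:

```python
def parse_pveversion(text):
    """Parse `pveversion -v` output into a {package: version} dict."""
    versions = {}
    for line in text.splitlines():
        # Package lines look like "name: version"; shell prompts and
        # node labels contain no ": " separator and are skipped.
        name, sep, value = line.strip().partition(": ")
        if sep:
            versions[name.strip()] = value.strip()
    return versions


def diff_versions(a, b):
    """Return {package: (version_a, version_b)} for every mismatch."""
    return {
        pkg: (a.get(pkg), b.get(pkg))
        for pkg in sorted(set(a) | set(b))
        if a.get(pkg) != b.get(pkg)
    }
```

An empty result from `diff_versions` for each pair of nodes would confirm that the software versions match, which points away from a package mismatch and toward something specific to athena's hardware or settings.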

Here is my problem.

If I do a live migration:

inachos => megara of a KVM machine: works like a charm
megara => inachos of a KVM machine: works like a charm
inachos => athena of a KVM machine: works like a charm
megara => athena of a KVM machine: works like a charm

If I try a live migration out of athena to either of the other nodes (inachos or megara), it seems to work like a charm and the task reports OK like the others, BUT the guest is completely frozen. If I connect a console, it shows the screen and its characters, but FROZEN. I always have to reset the guest machine.

So I assume the problem is with athena. There is one feature that I have had enabled since I got this machine: its BIOS talks about SECURE Virtualization??? On the other machines the BIOS only offers to ENABLE or disable virtualization, with no word like SECURE.

athena :
root@athena:~# lspci
00:00.0 Host bridge: Advanced Micro Devices [AMD] RS880 Host Bridge
00:01.0 PCI bridge: ASUSTeK Computer Inc. RS880 PCI to PCI bridge (int gfx)
00:07.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (PCIE port 3)
00:0a.0 PCI bridge: Advanced Micro Devices [AMD] RS780/RS880 PCI to PCI bridge (PCIE port 5)
00:11.0 SATA controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 SATA Controller [IDE mode] (rev 40)
00:12.0 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:12.2 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:13.0 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:13.2 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:14.0 SMBus: Advanced Micro Devices [AMD] nee ATI SBx00 SMBus Controller (rev 42)
00:14.1 IDE interface: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 IDE Controller (rev 40)
00:14.3 ISA bridge: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 LPC host controller (rev 40)
00:14.4 PCI bridge: Advanced Micro Devices [AMD] nee ATI SBx00 PCI to PCI Bridge (rev 40)
00:14.5 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
00:16.0 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:16.2 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:18.0 Host bridge: Advanced Micro Devices [AMD] Family 15h Processor Function 0
00:18.1 Host bridge: Advanced Micro Devices [AMD] Family 15h Processor Function 1
00:18.2 Host bridge: Advanced Micro Devices [AMD] Family 15h Processor Function 2
00:18.3 Host bridge: Advanced Micro Devices [AMD] Family 15h Processor Function 3
00:18.4 Host bridge: Advanced Micro Devices [AMD] Family 15h Processor Function 4
00:18.5 Host bridge: Advanced Micro Devices [AMD] Family 15h Processor Function 5
01:05.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI RS880 [Radeon HD 4250]
01:05.1 Audio device: Advanced Micro Devices [AMD] nee ATI RS880 HDMI Audio [Radeon HD 4200 Series]
02:00.0 USB controller: ASMedia Technology Inc. ASM1042 SuperSpeed USB Host Controller
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06)
04:05.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10)



It is an 8-core AMD FX 8120. Here is the CPU info for the last core:

processor : 7
vendor_id : AuthenticAMD
cpu family : 21
model : 1
model name : AMD FX(tm)-8120 Eight-Core Processor
stepping : 2
cpu MHz : 3099.691
cache size : 2048 KB
physical id : 0
siblings : 8
core id : 7
cpu cores : 4
apicid : 23
initial apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 popcnt aes xsave avx lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 nodeid_msr topoext perfctr_core cpb npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bogomips : 6199.38
TLB size : 1536 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb
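
One thing that may be worth ruling out is a difference in CPU features between athena and the other nodes, since a guest started on athena could be using instructions that the target host does not offer. This is only a hypothesis, not a confirmed cause, and the CPU flags of inachos and megara are not shown in this post. A minimal sketch for comparing the `flags` lines of two nodes, with hypothetical helper names:

```python
def parse_flags(cpuinfo_text):
    """Return the set of CPU flags from the first `flags` line found."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def flags_missing_on_target(source_cpuinfo, target_cpuinfo):
    """List flags the source CPU has that the target CPU lacks."""
    return sorted(parse_flags(source_cpuinfo) - parse_flags(target_cpuinfo))


# Shortened excerpt of athena's flags from above; the target line is
# invented purely for illustration.
athena = "flags : fpu sse2 avx xop fma4 svm"
other = "flags : fpu sse2 svm"
print(flags_missing_on_target(athena, other))
```

Any flags reported here exist on the source but not on the target, so a guest that was started on athena and detected them at boot might misbehave after being moved.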

If you have any ideas, they would be very much appreciated :)
 
Here is a sample of a migration from node athena to inachos. If I live-migrate this guest from inachos to megara, there is no problem. But again, when I try to live-migrate a guest out of athena, the guest on the other node needs to be reset because it is always 100% frozen. It seems that athena is not properly sending a signal at the end to the new node the guest was moved to.

Also, all my NICs are properly configured. Each node has 2 gigabit NICs, with vmbr0 and vmbr1 defined on all nodes: vmbr0 is LAN and vmbr1 is WAN. The LAN is the same subnet everywhere, and everything works smoothly for live migration from inachos to megara or vice versa. But moving out of athena always ends with a frozen guest, and I have no clue in the logs as to what the problem is. I think I really have a bug here!

tasks/C/UPID:athena:000C1D38:024B73C5:5036DBEC:qmigrate:1004:root@pam::Aug 23 21:42:04 starting migration of VM 1004 to node 'inachos' (10.10.192.13)
tasks/C/UPID:athena:000C1D38:024B73C5:5036DBEC:qmigrate:1004:root@pam::Aug 23 21:42:06 starting migration tunnel
tasks/C/UPID:athena:000C1D38:024B73C5:5036DBEC:qmigrate:1004:root@pam::Aug 23 21:42:06 starting online/live migration on port 60000
tasks/C/UPID:athena:000C1D38:024B73C5:5036DBEC:qmigrate:1004:root@pam::Aug 23 21:42:12 migration status: active (transferred 343735876, remaining 2804273152), total 3162963968)
tasks/C/UPID:athena:000C1D38:024B73C5:5036DBEC:qmigrate:1004:root@pam::Aug 23 21:42:15 migration status: active (transferred 507011009, remaining 2639437824), total 3162963968)
tasks/C/UPID:athena:000C1D38:024B73C5:5036DBEC:qmigrate:1004:root@pam::Aug 23 21:42:17 migration status: active (transferred 679350322, remaining 2466635776), total 3162963968)
tasks/C/UPID:athena:000C1D38:024B73C5:5036DBEC:qmigrate:1004:root@pam::Aug 23 21:42:20 migration status: active (transferred 845766878, remaining 2299514880), total 3162963968)
tasks/C/UPID:athena:000C1D38:024B73C5:5036DBEC:qmigrate:1004:root@pam::Aug 23 21:42:23 migration status: active (transferred 1016599307, remaining 2126401536), total 3162963968)
tasks/C/UPID:athena:000C1D38:024B73C5:5036DBEC:qmigrate:1004:root@pam::Aug 23 21:42:26 migration status: active (transferred 1185436572, remaining 1956970496), total 3162963968)
tasks/C/UPID:athena:000C1D38:024B73C5:5036DBEC:qmigrate:1004:root@pam::Aug 23 21:42:29 migration status: active (transferred 1350939813, remaining 1790382080), total 3162963968)
tasks/C/UPID:athena:000C1D38:024B73C5:5036DBEC:qmigrate:1004:root@pam::Aug 23 21:42:31 migration status: active (transferred 1522197859, remaining 1618345984), total 3162963968)
tasks/C/UPID:athena:000C1D38:024B73C5:5036DBEC:qmigrate:1004:root@pam::Aug 23 21:42:34 migration status: active (transferred 1673897640, remaining 1465315328), total 3162963968)
tasks/C/UPID:athena:000C1D38:024B73C5:5036DBEC:qmigrate:1004:root@pam::Aug 23 21:42:36 migration status: active (transferred 1820567339, remaining 1318109184), total 3162963968)
tasks/C/UPID:athena:000C1D38:024B73C5:5036DBEC:qmigrate:1004:root@pam::Aug 23 21:42:39 migration status: active (transferred 1988880489, remaining 1148493824), total 3162963968)
tasks/C/UPID:athena:000C1D38:024B73C5:5036DBEC:qmigrate:1004:root@pam::Aug 23 21:42:57 migration speed: 58.82 MB/s
tasks/C/UPID:athena:000C1D38:024B73C5:5036DBEC:qmigrate:1004:root@pam::Aug 23 21:42:57 migration status: completed
tasks/C/UPID:athena:000C1D38:024B73C5:5036DBEC:qmigrate:1004:root@pam::Aug 23 21:43:00 migration finished successfuly (duration 00:00:56)
tasks/C/UPID:athena:00096C57:01CE28B9:50359B2C:qmstart:1007:root@pam::migration listens on port 60000
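
In the log above, the last "active" status is at 21:42:39 with about 1.1 GB still remaining, and the next entry appears 18 seconds later at 21:42:57. That pause may be perfectly normal, but it is the one irregularity in the log. A minimal sketch for finding such pauses in a task log, assuming the line format shown above; the log has no year, so the `year` parameter is an assumption, and the month name parsing assumes an English locale:

```python
import re
from datetime import datetime

# Matches "...::Aug 23 21:42:04 message" as seen in the task log above.
LINE_RE = re.compile(r"::(?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2}) (?P<msg>.*)$")


def parse_log(text, year=2012):
    """Return a list of (timestamp, message) for lines with a timestamp."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue  # e.g. the qmstart line, which has no timestamp
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        events.append((ts, m["msg"]))
    return events


def largest_gap(events):
    """Return (seconds, message_before, message_after) for the longest pause."""
    best = None
    for (t0, m0), (t1, m1) in zip(events, events[1:]):
        gap = (t1 - t0).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, m0, m1)
    return best
```

Running the same check on a log from a working migration (for example inachos to megara) would show whether the pause is specific to migrations leaving athena.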
 
