PVE 1.4 beta2: KVM migration on shared storage crashes

We use an AMD processor.

Code:
processor    : 0
vendor_id    : AuthenticAMD
cpu family    : 16
model        : 4
model name    : Quad-Core AMD Opteron(tm) Processor 2376
stepping    : 2
cpu MHz        : 2300.093
cache size    : 512 KB
physical id    : 0
siblings    : 4
core id        : 0
cpu cores    : 4
fpu        : yes
fpu_exception    : yes
cpuid level    : 5
wp        : yes
flags        : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good pni cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt
bogomips    : 4603.66
TLB size    : 1024 4K pages
clflush size    : 64
cache_alignment    : 64
address sizes    : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate
 
To help debug this, I'll describe my installation:

I installed 2 nodes with the 1.4 b2 ISO. I then reduced the default pve/data LV and created a new LV that I'm using for DRBD.

So the new pve/lvmdrbd LV is the backing disk for DRBD itself (/dev/drbd1).
On /dev/drbd1 I created a new volume group and added it on the master node through the web interface (Add LVM Group - check shared).
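Roughly, the steps were as follows (sizes and names here are just examples, not my exact values):

Code:
# shrink the default data LV to make room in the pve VG
umount /var/lib/vz
e2fsck -f /dev/pve/data
resize2fs /dev/pve/data 100G            # shrink the filesystem first
lvreduce -L 100G /dev/pve/data          # then the LV itself
mount /var/lib/vz
# new LV used as the DRBD backing device
lvcreate -L 100G -n lvmdrbd pve
# configure a DRBD resource whose disk is /dev/pve/lvmdrbd; once /dev/drbd1 is up:
pvcreate /dev/drbd1
vgcreate drbdvg /dev/drbd1              # added via "Add LVM Group" with "shared" checked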

Is it correct?
 
We do not use LVM locking, and we do not activate volumes read-only (we have a simple lock manager on the master - not LVM-specific).

Could you explain to me how it works?
What process manages it?
Are there any configuration files?

Thanks.
 
Hi,
I have the same problem.

My configuration: 2 nodes running 1.4 beta2 with primary/primary DRBD.

I migrate the WinXP machine and it crashes (blue screen).
The VM reboots, but the guest OS crashes again. I have to stop the VM and then restart it.

Why?

One more piece of information that I'm verifying now: before the migration I opened a command prompt in the VM and ran "ping -t 192.168.1.1" (192.168.1.1 is my router), and on my notebook I ran "ping -r 192.168.1.20" (192.168.1.20 is the VM). During the migration I see on my notebook that I lose 5 pings (which is normal), and when the VM has completely migrated the pings come back. On the destination host I see that the CPU load is low, so I believe the VM is working well. If I open the VNC console of the VM, I can see that the ping command I started is still running correctly; however, as soon as I move the cursor the VM freezes, the host's CPU load climbs to 100%, and after a few seconds the blue screen appears.

It seems that there is a problem with the virtual video card or with the virtual mouse/keyboard.

What could it be?
 
So the new pve/lvmdrbd LV is the backing disk for DRBD itself (/dev/drbd1).
On /dev/drbd1 I created a new volume group and added it on the master node through the web interface (Add LVM Group - check shared).

Is it correct?

So you use LVM on top of DRBD on top of an LVM volume??

DISK-partition/LVM/DRBD/LVM

Instead, I would use a partition directly:

DISK-partition/DRBD/LVM
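A minimal sketch of that simpler stack, assuming a spare partition /dev/sdb1 on both nodes and a DRBD resource r0 already defined on it (all names are illustrative):

Code:
drbdadm up r0                # bring the resource up on both nodes (initial sync omitted here)
drbdadm primary r0           # primary/primary also needs allow-two-primaries in the resource config
pvcreate /dev/drbd1          # LVM sits on top of the DRBD device, nothing underneath
vgcreate drbdvg /dev/drbd1   # then add drbdvg as a shared LVM group in the web interface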

- Dietmar
 
Could you explain to me how it works?
What process manages it?

When we modify shared storage metadata we do:

Code:
ssh master pvesm lock <STORAGE_ID>

pvesm implements a simple locking protocol.
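For example, purely as an illustration (the storage name and the lvcreate are placeholders for whatever metadata change is being made):

Code:
ssh master pvesm lock drbdvg              # serialize the change through the master node
lvcreate -L 32G -n vm-101-disk-1 drbdvg   # e.g. allocate a new VM disk on the shared VG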

Are there any configuration files?

No.
 
No, because other users report the same error on AMD CPUs.

So the problem is with the version of KVM, right?
I see that 1.4 beta2 ships KVM version 0.11.0.
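(For reference, the KVM build actually installed on a node can be checked with something like this, assuming the standard Proxmox tools are present:)

Code:
pveversion -v    # lists installed Proxmox package versions, including pve-qemu-kvm
kvm -version     # reports the qemu-kvm version the guests run on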

Do you think I could resolve this with an older version of Proxmox?

What version of KVM is in Proxmox 1.4 b1?
And in 1.3?

Thanks a lot for the support.
 
Older versions do not have real live migration.

But in the Roadmap description of 1.4 b1 I see 'Zero downtime live migration (KVM) over ssh channel - all traffic is encrypted'; what version of KVM is used there?
 
Can I remove pve-qemu-kvm and install the qemu-kvm that I find at linux-kvm.org?

Are there any dependencies on other packages?
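(I suppose I can see what would be affected with something like this; the package name is as shipped by Proxmox and the output depends on the release:)

Code:
apt-cache rdepends pve-qemu-kvm   # packages that depend on pve-qemu-kvm
dpkg -s pve-qemu-kvm              # its version and declared dependencies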