After migration from 1.2 to 1.3, Win XP VMs crash with BSOD while booting

nasenmann72

Renowned Member
Dec 9, 2008
Germany, Saarland
Hi,

in my two-node cluster, I upgraded the master server from Proxmox VE 1.2 to 1.3. Now all XP/2003 VMs crash with a BSOD while booting on the 1.3 (master) node, even in safe mode. I can boot the VMs if I disable KVM in the settings, but then they are very slow.
If I migrate a VM to the 1.2 (slave) node, it boots. There is a System Error with event ID 1003, which I think is caused by the failed boot on the 1.3 node. Apart from that, the VM runs normally.

What could be the reason for this problem?

Additional information:
- I use the E1000 driver for network, not the paravirtualized one from Qumranet.
- The cluster nodes are AMD Opteron based servers.

Thanks in advance for any help,
Der Nasenmann
 
Master node:

Code:
asterix:~# pveversion -v
pve-manager: 1.3-1 (pve-manager/1.3/4023)
qemu-server: 1.0-14
pve-kernel: 2.6.24-8
pve-kvm: 86-3
pve-firmware: 1
vncterm: 0.9-2
vzctl: 3.0.23-1pve3
vzdump: 1.1-2
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1

Slave node:

Code:
obelix:~# pveversion -v
pve-manager: 1.2-1 (pve-manager/1.2/3982)
qemu-server: 1.0-12
pve-kernel: 2.6.24-6
pve-kvm: 85-1
pve-firmware: 1
vncterm: 0.9-2
vzctl: 3.0.23-1pve3
vzdump: 1.1-1
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
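
To see at a glance which packages differ between the two nodes, the two `pveversion -v` listings above can be compared with a few lines of Python. This is only a rough sketch: the function names `parse_pveversion` and `changed_packages` are made up for illustration, and it assumes the input contains only the `package: version` lines, without the shell prompt.

```python
# Package lines copied from the two listings above (shell prompt lines left out).
MASTER = """\
pve-manager: 1.3-1 (pve-manager/1.3/4023)
qemu-server: 1.0-14
pve-kernel: 2.6.24-8
pve-kvm: 86-3
pve-firmware: 1
vncterm: 0.9-2
vzctl: 3.0.23-1pve3
vzdump: 1.1-2
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
"""

SLAVE = """\
pve-manager: 1.2-1 (pve-manager/1.2/3982)
qemu-server: 1.0-12
pve-kernel: 2.6.24-6
pve-kvm: 85-1
pve-firmware: 1
vncterm: 0.9-2
vzctl: 3.0.23-1pve3
vzdump: 1.1-1
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
"""


def parse_pveversion(text):
    """Turn 'package: version ...' lines into a {package: version} dict."""
    versions = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or not rest.strip():
            continue  # not a 'package: version' line
        # Keep the first token only, dropping extras like "(pve-manager/1.3/4023)".
        versions[name.strip()] = rest.split()[0]
    return versions


def changed_packages(a, b):
    """Return {package: (version_in_a, version_in_b)} for packages that differ."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


diff = changed_packages(parse_pveversion(MASTER), parse_pveversion(SLAVE))
for name, (master, slave) in diff.items():
    print(f"{name}: master={master} slave={slave}")
```

The listing shows that the kernel and KVM packages changed with the upgrade, which narrows down where to look but does not by itself say which of them triggers the crash.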
 
Windows 2003 Server VM:

Code:
asterix:/etc/qemu-server# cat 105.conf
name: sql1
smp: 4
vlan0: e1000=3A:A8:49:7A:55:4F
bootdisk: ide0
ostype: w2k3
memory: 4096
ide1: SQL2005_CD1.ISO,media=cdrom
onboot: 0
boot: dca
freeze: 0
cpuunits: 1000
acpi: 1
kvm: 1
ide0: vm-105-disk.qcow2

Windows XP Professional VM:

Code:
asterix:/etc/qemu-server# cat 103.conf
name: vreneli
ide2: none,media=cdrom
smp: 1
bootdisk: ide0
ide0: vm-103-disk-1.qcow2
ostype: wxp
memory: 256
onboot: 0
vlan0: e1000=BA:87:94:6A:F1:EC
boot: cad
freeze: 0
cpuunits: 1000
acpi: 1
kvm: 0

Windows XP Professional VM:

Code:
obelix:/etc/qemu-server# cat 109.conf
name: finoffjboss
ide2: none,media=cdrom
smp: 1
vlan0: e1000=7E:5E:06:57:73:5D
bootdisk: ide0
ostype: wxp
memory: 768
onboot: 0
ide0: vm-109-disk.qcow2
boot: cad
freeze: 0
cpuunits: 1000
acpi: 1
kvm: 1
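
Since the workaround was to disable KVM per VM, it can help to check which configs currently have it switched off. The following is a minimal sketch, not a Proxmox tool: `parse_vm_conf` and `kvm_enabled` are hypothetical helper names, and treating a missing `kvm` line as "enabled" is an assumption on my part.

```python
# Excerpts of the 103.conf and 109.conf files shown above.
CONF_103 = """\
name: vreneli
ostype: wxp
vlan0: e1000=BA:87:94:6A:F1:EC
acpi: 1
kvm: 0
"""

CONF_109 = """\
name: finoffjboss
ostype: wxp
vlan0: e1000=7E:5E:06:57:73:5D
acpi: 1
kvm: 1
"""


def parse_vm_conf(text):
    """Parse 'key: value' lines of a qemu-server config into a dict."""
    conf = {}
    for line in text.splitlines():
        # Split on the first colon only, so MAC addresses in the value stay intact.
        key, sep, value = line.partition(":")
        if sep:
            conf[key.strip()] = value.strip()
    return conf


def kvm_enabled(conf):
    """True unless the config explicitly says 'kvm: 0' (assumed default: enabled)."""
    return conf.get("kvm", "1") != "0"


for label, text in (("103", CONF_103), ("109", CONF_109)):
    conf = parse_vm_conf(text)
    state = "enabled" if kvm_enabled(conf) else "disabled"
    print(f"VM {label} ({conf['name']}): KVM {state}")
```

In the configs above, VM 103 is the one running with `kvm: 0`, i.e. with the slow workaround still active.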
 
What does the BSOD say? Could you provide a screenshot?
 
Hi,

taking a screenshot of the Win XP VMs was too difficult because they reboot too fast. But here is one from the 2003 Server VM, which shows the same error:

Bildschirmfoto-sql1 - Proxmox Console - Mozilla Firefox.png

Free hard disk space is available. The graphics adapter driver should be the Cirrus Logic 5446, which Windows installs automatically when virtualized with KVM.

I googled the STOP error 0x0000008E and found the following:

http://patch-info.de/artikel/2008/10/31/572 (in German)

Perhaps the BSOD is caused by the video driver.

I will now try the following: migrate one XP VM to the PVE 1.2 node, boot it, deactivate the display driver, shut down the VM, migrate it back to the PVE 1.3 node and boot it again...
 
Can you try safe mode? (If yes, remove the graphics and network drivers.)

Can you install a fresh Windows on your 1.3 installation?

If nothing helps and a fresh installation works, run the Windows installer CD in repair mode.
 
Hi,

I tried to install a fresh XP VM on the 1.3 node, but it hangs while booting from the CD image after the first hardware detection (black screen). I also followed BitRausch's advice to force the system HAL. I tried different HALs, but no installation was possible. When I boot the fresh XP VM on the 1.2 node, the XP installation is no problem.

Something seems to be really wrong on my 1.3 node...

Der Nasenmann
 
Hi,

I don't think that the hardware is faulty, but I will do some checks.

Then I want to do a clean install of PVE 1.3. Is it OK if I migrate all VMs to the 1.2 slave node, reinstall the 1.3 master node and put it back into the cluster as the master node?

Der Nasenmann
 
You are right! Why should this help? This is not a Windows machine that could do strange things because of a f*cked up registry or something!

I did some hardware checks but did not find any problems, and I do not think there are any.

I installed a third cluster node on a PC with an Intel Core 2 Duo CPU. The XP VM boots normally there.
So it seems that only the AMD machines are affected. In the KVM bug tracker I found the following (fixed) bug: http://sourceforge.net/tracker/?func=detail&aid=2786468&group_id=180599&atid=893831
The bug on my PVE 1.3 AMD node seems to be similar: the Windows XP/2003 VM crashes when the resolution changes from the Windows boot logo to the login screen.

Der Nasenmann
 
Hi guys,

the weekend is knocking at the door, the weather is fine and my Windows VMs on the PVE 1.3 node are running again. So no reason to bother you any more! ;-)

The Opteron server that had the problem runs on a Supermicro H8DME-2 mainboard, which was on firmware version 2.0a. An update to firmware v3.0 solved my problems. So it was actually a hardware problem!
MEA CULPA!

Have a nice weekend,
der Nasenmann
 
Perfect, thanks for reporting. Checking for the latest BIOS always helps to get stable VT/AMD-V, and I suggest this to all users having issues. The same goes for RAID controller cards: always use the latest firmware version.
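
For anyone wanting to check a node before hunting for firmware updates, here is a small Python sketch that looks for the hardware virtualization flags (`svm` for AMD-V, `vmx` for Intel VT) in `/proc/cpuinfo`-style text and reads the BIOS version from sysfs. The helper names are hypothetical, the sample flags line is invented for illustration, and the sysfs path `/sys/class/dmi/id/bios_version` is assumed to be available on the host kernel. Note that a present flag only means the CPU advertises the extension; it says nothing about firmware bugs like the one in this thread.

```python
import os

# Invented sample in /proc/cpuinfo format; on a real node, read /proc/cpuinfo instead.
SAMPLE_CPUINFO = """\
processor\t: 0
vendor_id\t: AuthenticAMD
flags\t\t: fpu vme de pse tsc msr pae svm lahf_lm
"""


def virt_flags(cpuinfo_text):
    """Return the subset of {'svm', 'vmx'} found on any 'flags' line."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            found |= {"svm", "vmx"} & set(value.split())
    return found


def read_bios_version(dmi_dir="/sys/class/dmi/id"):
    """Return the BIOS version string from sysfs, or None if it cannot be read."""
    try:
        with open(os.path.join(dmi_dir, "bios_version")) as f:
            return f.read().strip()
    except OSError:
        return None


print("virtualization flags:", sorted(virt_flags(SAMPLE_CPUINFO)))
print("BIOS version:", read_bios_version())
```

The reported BIOS version can then be compared by hand against the latest release listed by the mainboard vendor.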
 
