Hello,
We moved 2 HP servers into 1 server with virtualisation (Proxmox). We used SelfImage and everything boots well. All Virtio drivers are installed and we removed all HP software.
The 2 servers are:
SBS 2008 (with Exchange 2007, AD, file, print, DNS, DHCP)
Server 2008 Standard (With SQL Express / SAP application)
At random times the servers crash with a blue screen. The SBS 2008 server crashes more often; it is used a bit more because of the Exchange software.
Some debug info of the minidumps:
NTFS_FILE_SYSTEM CI.dll
SYSTEM_SERVICE_EXCEPTION ntoskrnl.exe
IRQL_NOT_LESS_OR_EQUAL ntoskrnl.exe
SYSTEM_SERVICE_EXCEPTION ntoskrnl.exe
KMODE_EXCEPTION_NOT_HANDLED ntoskrnl.exe
DRIVER_VERIFIER_DETECTED_VIOLATION crcdisk.sys
PAGE_FAULT_IN_FREED_SPECIAL_POOL ntoskrnl.exe
SYSTEM_SERVICE_EXCEPTION mup.sys
DRIVER_CORRUPTED_EXPOOL ntoskrnl.exe
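To make the pattern in that list easier to see, here is a rough tally of the stop codes and the modules named in the dumps. It is only a small Python sketch: the `MINIDUMPS` string is copied by hand from the list above, and the `tally` helper is a name made up for this example, not part of any debugging tool.

```python
from collections import Counter

# Stop code and reported module, copied by hand from the minidump list above.
MINIDUMPS = """\
NTFS_FILE_SYSTEM CI.dll
SYSTEM_SERVICE_EXCEPTION ntoskrnl.exe
IRQL_NOT_LESS_OR_EQUAL ntoskrnl.exe
SYSTEM_SERVICE_EXCEPTION ntoskrnl.exe
KMODE_EXCEPTION_NOT_HANDLED ntoskrnl.exe
DRIVER_VERIFIER_DETECTED_VIOLATION crcdisk.sys
PAGE_FAULT_IN_FREED_SPECIAL_POOL ntoskrnl.exe
SYSTEM_SERVICE_EXCEPTION mup.sys
DRIVER_CORRUPTED_EXPOOL ntoskrnl.exe
"""


def tally(text):
    """Count stop codes and modules in lines of the form 'STOP_CODE module'."""
    bugchecks, modules = Counter(), Counter()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        bugchecks[parts[0]] += 1
        modules[parts[1]] += 1
    return bugchecks, modules


bugchecks, modules = tally(MINIDUMPS)

print("Modules named in the dumps:")
for name, count in modules.most_common():
    print(f"  {count} x {name}")

print("Stop codes:")
for name, count in bugchecks.most_common():
    print(f"  {count} x {name}")
```

Of the nine dumps, six name ntoskrnl.exe, and there are seven different stop codes, so no single driver or error stands out.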
What I've tried so far:
- Memtest 24 hours
- Replaced hardware with known good test server
- Deleted an old McAfee and Synology driver
- Changed drivers (virtio/ide/e1000/vga)
- chkdsk /F
- sfc /scannow
- Disabled HP leftover drivers from starting (Device Manager, hidden devices)
- Removed any software (except Windows/Microsoft/SAP related), so no anti-virus/back-up/monitoring etc.
- Checked eventlog for problems/errors, nothing special and nothing just before the crash
The timing of the BSODs has nothing to do with load: the servers also blue-screen when idle. They do not coincide with a back-up or a scheduled task. Also, I'm unable to reproduce the error by creating heavy load on the VM.
The VMs don't blue-screen at the same time; it's mostly just the SBS server, while the APP server keeps running fine. The SBS server gets a BSOD about 3 times a week and the APP server 1 or 2 times a month.
Some info about the current host:
proxmox-ve-2.6.32: 3.1-114 (running kernel: 2.6.32-26-pve)
pve-manager: 3.1-21 (running version: 3.1-21/93bf03d4)
pve-kernel-2.6.32-26-pve: 2.6.32-114
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-2
pve-cluster: 3.0-8
qemu-server: 3.1-8
pve-firmware: 1.0-23
libpve-common-perl: 3.0-8
libpve-access-control: 3.0-7
libpve-storage-perl: 3.0-17
pve-libspice-server1: 0.12.4-2
vncterm: 1.1-4
vzctl: 4.0-1pve4
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.1-1
The host has 2x Xeon 5504 processors, 16 GB ECC RAM and 4x500GB in RAID10 (HP HW RAID). It's an ML350 G6 model.
I don't think this has much to do with Proxmox but maybe someone has experienced this before.