One of our Windows 2008 SP1 (64-bit) servers occasionally crashes under heavy load. It just happened again while compressing an old database backup (~5 GB) on the guest.
The (useless...) Windows event log records two failures after the crash time logged by EventLog ("system has been rebooted at <date> <time>"); both were written after the reboot.
The following entry is logged ~5 and ~15 seconds after the recorded crash time:
viostor / "Reset to device, \Device\RaidPort1, was issued." (translated; the original German entry is: "Ein Zuruecksetzen auf Geraet "\Device\RaidPort1" wurde ausgegeben")
About 1-2 minutes after these events, I get the following log entries on the host system in /var/log/syslog:
Code:
Nov 19 14:45:36 proxmox corosync[1652]: [TOTEM ] A processor failed, forming new configuration.
Nov 19 14:45:36 proxmox corosync[1652]: [CLM ] CLM CONFIGURATION CHANGE
Nov 19 14:45:36 proxmox corosync[1652]: [CLM ] New Configuration:
Nov 19 14:45:36 proxmox corosync[1652]: [CLM ] #011r(0) ip(10.18.89.100)
Nov 19 14:45:36 proxmox corosync[1652]: [CLM ] Members Left:
Nov 19 14:45:36 proxmox corosync[1652]: [CLM ] Members Joined:
Nov 19 14:45:36 proxmox corosync[1652]: [CLM ] CLM CONFIGURATION CHANGE
Nov 19 14:45:36 proxmox corosync[1652]: [CLM ] New Configuration:
Nov 19 14:45:36 proxmox corosync[1652]: [CLM ] #011r(0) ip(10.18.89.100)
Nov 19 14:45:36 proxmox corosync[1652]: [CLM ] Members Left:
Nov 19 14:45:36 proxmox corosync[1652]: [CLM ] Members Joined:
Nov 19 14:45:36 proxmox corosync[1652]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Nov 19 14:45:36 proxmox corosync[1652]: [CPG ] chosen downlist: sender r(0) ip(10.18.89.100) ; members(old:1 left:0)
Nov 19 14:45:36 proxmox corosync[1652]: [MAIN ] Completed service synchronization, ready to provide service.
Nov 19 14:45:40 proxmox kernel: usb 3-2: reset low speed USB device number 3 using uhci_hcd
(The USB device is connected to the crashed VM.)
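To line up host-side events like the ones above with the crash time reported by the guest, the syslog could be filtered around that timestamp. The following is only a minimal sketch: the helper names `parse_syslog_time` and `lines_near` and the 180-second window are my own assumptions, and because classic syslog timestamps carry no year, the year is taken from the crash time you pass in.

```python
from datetime import datetime, timedelta


def parse_syslog_time(line, year):
    """Parse the 'Nov 19 14:45:36' prefix of a classic syslog line."""
    # The timestamp has no year, so the caller has to supply one (assumption).
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def lines_near(lines, crash_time, window_seconds=180):
    """Return the syslog lines logged within window_seconds of crash_time."""
    window = timedelta(seconds=window_seconds)
    hits = []
    for line in lines:
        try:
            stamp = parse_syslog_time(line, crash_time.year)
        except ValueError:
            continue  # skip lines that do not start with a parsable timestamp
        if abs(stamp - crash_time) <= window:
            hits.append(line.rstrip("\n"))
    return hits
```

It could be used by opening /var/log/syslog, passing the file object as `lines`, and passing the crash time from the Windows event log as a `datetime`. The result would show whether the corosync and USB reset messages consistently follow the guest crash or also appear at other times.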
The guest uses virtio drivers for HDD and NIC. However, in the hardware properties of the hard drives ("RedHat VirtIO SCSI Disk Device"), Windows claims to use a Microsoft driver from 2006. Is this normal behavior for the RedHat VirtIO driver?
The other guests (1x W2K8 SP1 32-bit, 3x Debian container, 1x Debian VM, 1x SUSE VM) on this host are running fine: there are no errors in their logs and no unusual behaviour when the W2K8 64-bit guest crashes. So I don't think the problem is host-related, but rather specific to the W2K8 64-bit guest.
I'm not actively using clustering on this host. It was planned and preconfigured, but due to the low-bandwidth connection of the second node, that node was never added to the configuration. So the only node in the cluster is the 10.18.89.100 machine itself.
Is it safe to just stop the clustering service to make sure it isn't responsible for these odd crashes?