Hi all,
Since we migrated our host from 2.3 to 3.1, one of our 32-bit Windows 2003 VMs has been freezing randomly, without a BSOD.
We noticed CPU usage peaks caused by hardware interrupts that did not occur under 2.3.
A second 32-bit W2K3 VM runs on this host without any problems.
To determine whether the VM itself is at fault, we shared /var/lib/vz over NFS from the 3.1 host, added this share to a 2.3 host, and ran the VM on the 2.3 host with exactly the same config file.
On the 2.3 host, the VM runs without any problems.
Further investigation with kernrate shows that ntkrnlpa generates a lot of hits:
Code:
/==============================\
< KERNRATE LOG >
\==============================/
Date: 2013/08/28 Time: 18:19:31
Machine Name: SV113
Number of Processors: 4
PROCESSOR_ARCHITECTURE: x86
PROCESSOR_LEVEL: 6
PROCESSOR_REVISION: 0f0b
Physical Memory: 4096 MB
Pagefile Total: 5976 MB
Virtual Total: 2047 MB
PageFile1: \??\C:\pagefile.sys, 2046MB
OS Version: 5.2 Build 3790 Service-Pack: 2.0
WinDir: C:\WINDOWS
Kernrate User-Specified Command Line:
Kernrate_i386_XP.exe
Kernel Profile (PID = 0): Source= Time,
Using Kernrate Default Rate of 25000 events/hit
------------Overall Summary:--------------
P0 K 0:00:23.968 (10.6%) U 0:00:04.890 ( 2.2%) I 0:03:17.890 (87.3%) DPC 0:00:00.296 ( 0.1%) Interrupt 0:00:13.218 ( 5.8%)
Interrupts= 325251, Interrupt Rate= 1434/sec.
P1 K 0:00:22.234 ( 9.8%) U 0:00:07.500 ( 3.3%) I 0:03:17.015 (86.9%) DPC 0:00:00.156 ( 0.1%) Interrupt 0:00:15.156 ( 6.7%)
Interrupts= 107537, Interrupt Rate= 474/sec.
P2 K 0:00:26.671 (11.8%) U 0:00:02.937 ( 1.3%) I 0:03:17.140 (86.9%) DPC 0:00:00.265 ( 0.1%) Interrupt 0:00:20.500 ( 9.0%)
Interrupts= 107571, Interrupt Rate= 474/sec.
P3 K 0:00:19.484 ( 8.6%) U 0:00:07.468 ( 3.3%) I 0:03:19.796 (88.1%) DPC 0:00:00.203 ( 0.1%) Interrupt 0:00:08.734 ( 3.9%)
Interrupts= 107564, Interrupt Rate= 474/sec.
TOTAL K 0:01:32.359 (10.2%) U 0:00:22.796 ( 2.5%) I 0:13:11.843 (87.3%) DPC 0:00:00.921 ( 0.1%) Interrupt 0:00:57.609 ( 6.4%)
Total Interrupts= 647923, Total Interrupt Rate= 2857/sec.
Total Profile Time = 226750 msec
BytesStart BytesStop BytesDiff.
Available Physical Memory , 2970775552, 2899222528, -71553024
Available Pagefile(s) , 4994494464, 4908650496, -85843968
Available Virtual , 2132312064, 2131263488, -1048576
Available Extended Virtual , 0, 0, 0
Total Avg. Rate
Context Switches , 495835, 2187/sec.
System Calls , 1870490, 8249/sec.
Page Faults , 91366, 403/sec.
I/O Read Operations , 11135, 49/sec.
I/O Write Operations , 12094, 53/sec.
I/O Other Operations , 42708, 188/sec.
I/O Read Bytes , 36180810, 3249/ I/O
I/O Write Bytes , 3572093, 295/ I/O
I/O Other Bytes , 4722415272, 110574/ I/O
-----------------------------
Results for Kernel Mode:
-----------------------------
OutputResults: KernelModuleCount = 114
Percentage in the following table is based on the Total Hits for the Kernel
Time 359459 hits, 25000 events per hit --------
Module Hits msec %Total Events/Sec
intelppm 277297 226734 77 % 30575145
ntkrnlpa 68881 226734 19 % 7594912
hal 12166 226734 3 % 1341439
win32k 354 226734 0 % 39032
klif 154 226734 0 % 16980
Ntfs 135 226734 0 % 14885
fltmgr 95 226734 0 % 10474
e1000325 91 226734 0 % 10033
klflt 66 226734 0 % 7277
tcpip 58 226734 0 % 6395
BALLOON 29 226734 0 % 3197
RDPDD 22 226734 0 % 2425
wdf01000 21 226734 0 % 2315
SCSIPORT 16 226734 0 % 1764
viostor 10 226734 0 % 1102
Dfs 9 226734 0 % 992
kltdi 7 226734 0 % 771
NDIS 7 226734 0 % 771
kneps 5 226734 0 % 551
RDPWD 4 226734 0 % 441
USBPORT 4 226734 0 % 441
usbuhci 4 226734 0 % 441
CLASSPNP 4 226734 0 % 441
atapi 3 226734 0 % 330
ftdisk 3 226734 0 % 330
Npfs 2 226734 0 % 220
termdd 2 226734 0 % 220
watchdog 2 226734 0 % 220
srv 1 226734 0 % 110
afd 1 226734 0 % 110
ipnat 1 226734 0 % 110
TDI 1 226734 0 % 110
cdrom 1 226734 0 % 110
KSecDD 1 226734 0 % 110
PartMgr 1 226734 0 % 110
volsnap 1 226734 0 % 110
================================= END OF RUN ==================================
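As a rough way to read the table above, the hit counts can be turned into shares with and without intelppm. This is only a sketch: the numbers are copied from the log, and treating intelppm samples as idle time is an assumption (the overall summary reports about 87% idle, which is at least consistent with it), not something the log states.

```python
# Hit counts copied from the kernel-mode table in the kernrate log (top entries only).
TOTAL_HITS = 359459
hits = {"intelppm": 277297, "ntkrnlpa": 68881, "hal": 12166, "win32k": 354}


def share(module_hits, total):
    """Return module_hits as a percentage of total."""
    return 100.0 * module_hits / total


# ASSUMPTION: samples attributed to intelppm are the idle loop, so the
# remaining hits approximate the time the kernel was actually busy.
busy_hits = TOTAL_HITS - hits["intelppm"]

for name, count in hits.items():
    print(f"{name:10s} {share(count, TOTAL_HITS):5.1f}% of all kernel hits")

ntkrnlpa_busy_share = share(hits["ntkrnlpa"], busy_hits)
print(f"ntkrnlpa is {ntkrnlpa_busy_share:.1f}% of non-intelppm hits")
```

Under that assumption, ntkrnlpa accounts for roughly 84% of the non-idle kernel samples, which is why the zoom on ntkrnlpa is the interesting part.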
Zooming in on ntkrnlpa gives:
Code:
...
Time 52506 hits, 25000 events per hit --------
Module Hits msec %Total Events/Sec
ExAllocatePoolWithTag 41945 166540 79 % 6296535
KeTerminateThread 4365 166540 8 % 655247
ZwYieldExecution 2363 166540 4 % 354719
RtlCaptureContext 720 166540 1 % 108082
KiDispatchInterrupt 422 166540 0 % 63348
KeFlushEntireTb 419 166540 0 % 62897
NtBuildNumber 242 166540 0 % 36327
NtFreeVirtualMemory 236 166540 0 % 35426
CmRegisterCallback 178 166540 0 % 26720
ObQueryNameString 112 166540 0 % 16812
KeAreAllApcsDisabled 94 166540 0 % 14110
PoShutdownBugCheck 83 166540 0 % 12459
wctomb 77 166540 0 % 11558
ProbeForRead 42 166540 0 % 6304
RtlCompressBuffer 39 166540 0 % 5854
ExRaiseHardError 38 166540 0 % 5704
ObFindHandleForObject 38 166540 0 % 5704
PoQueueShutdownWorkItem 35 166540 0 % 5253
RtlInitializeGenericTable 33 166540 0 % 4953
NtAllocateUuids 29 166540 0 % 4353
ExFreePoolWithTag 28 166540 0 % 4203
...
VM configuration file:
Code:
acpi: 1
balloon: 2048
boot: cad
bootdisk: virtio0
cores: 2
cpu: core2duo
cpuunits: 1000
freeze: 0
ide2: none,media=cdrom
kvm: 1
memory: 4096
name: SV113
net0: e1000=AA:2B:48:6B:E8:E6,bridge=vmbr0
ostype: w2k3
sockets: 2
startup: order=1
virtio0: local:113/vm-113-disk-1.raw,format=raw
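For reference, the vCPU count the guest sees follows from sockets times cores in this config, which matches the "Number of Processors: 4" line in the kernrate log. A minimal sketch of that check, with only the relevant config lines copied in (the `parse_config` helper is hypothetical, written for this example, and not part of any Proxmox tool):

```python
# Relevant lines copied from the VM configuration file above.
CONFIG = """\
cores: 2
cpu: core2duo
sockets: 2
net0: e1000=AA:2B:48:6B:E8:E6,bridge=vmbr0
"""


def parse_config(text):
    """Parse 'key: value' lines into a dict (hypothetical helper).

    Splits on the first colon only, so values that themselves contain
    colons (such as MAC addresses) are kept whole.
    """
    cfg = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            cfg[key.strip()] = value.strip()
    return cfg


cfg = parse_config(CONFIG)
vcpus = int(cfg["sockets"]) * int(cfg["cores"])
print(f"vCPUs presented to the guest: {vcpus}")
```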
PVE version on the 3.1 host:
Code:
proxmox-ve-2.6.32: 3.1-109 (running kernel: 2.6.32-23-pve)
pve-manager: 3.1-4 (running version: 3.1-4/f6816604)
pve-kernel-2.6.32-20-pve: 2.6.32-100
pve-kernel-2.6.32-23-pve: 2.6.32-109
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-2
pve-cluster: 3.0-7
qemu-server: 3.1-1
pve-firmware: 1.0-23
libpve-common-perl: 3.0-6
libpve-access-control: 3.0-6
libpve-storage-perl: 3.0-10
pve-libspice-server1: 0.12.4-1
vncterm: 1.1-4
vzctl: 4.0-1pve3
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.0-2
PVE version on the 2.3 host:
Code:
pve-manager: 2.3-13 (pve-manager/2.3/7946f1f1)
running kernel: 2.6.32-19-pve
proxmox-ve-2.6.32: 2.3-96
pve-kernel-2.6.32-19-pve: 2.6.32-96
pve-kernel-2.6.32-18-pve: 2.6.32-88
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-4
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-36
qemu-server: 2.3-20
pve-firmware: 1.0-21
libpve-common-perl: 1.0-49
libpve-access-control: 1.0-26
libpve-storage-perl: 2.3-7
vncterm: 1.0-4
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.4-10
ksm-control-daemon: 1.1-1
The HAL shown in the hardware manager in the VM is "ACPI Multiprocessor computer".
I tried changing the CPU type to host, downgrading the NIC from virtio to e1000, and deleting all non-present devices in the hardware manager, all without success.
Any ideas?
Regards,
Thomas