I have a Proxmox Server with the following specs:
Version:
pve-manager: 1.7-10 (pve-manager/1.7/5323)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.7-28
pve-kernel-2.6.32-4-pve: 2.6.32-28
qemu-server: 1.1-25
pve-firmware: 1.0-9
libpve-storage-perl: 1.0-16
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-9
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.13.0-2
ksm-control-daemon: 1.0-4
VM Configuration:
name: TS64
ide2: none,media=cdrom
bootdisk: ide0
ostype: w2k3
ide0: data:vm-104-disk-1
memory: 10240
sockets: 1
vlan0: virtio=C6:4C:B3:BB:AD:67
onboot: 1
cores: 4
boot: cad
freeze: 0
cpuunits: 1000
acpi: 1
kvm: 1
CPU 2x Xeon Quad Core E5620 2.4GHZ Processors:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
stepping : 2
cpu MHz : 2400.323
cache size : 12288 KB
physical id : 0
siblings : 8
core id : 9
cpu cores : 4
apicid : 19
initial apicid : 19
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm ida arat tpr_shadow vnmi flexpriority ept vpid
bogomips : 4800.19
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
Performance:
CPU BOGOMIPS: 76803.21
REGEX/SECOND: 850066
HD SIZE: 33.96 GB (/dev/mapper/pve-root)
BUFFERED READS: 333.03 MB/sec
AVERAGE SEEK TIME: 6.10 ms
FSYNCS/SECOND: 2948.85
DNS EXT: 131.42 ms
DNS INT: 1.28 ms
I've been successfully running 2 Windows 2003 32-Bit Standard Edition Servers on this server for over a month now. Both were migrations from actual physical servers. However, I've continued to receive random crashes on a Windows 2003 64-bit standard edition running terminal services, which was a fresh install. The server runs fine for hours under a decent load (20 users) and then will crash with a 3B bug check code (System_Service_Exception). I opened a ticket with Microsoft and submitted multiple memory dumps and their engineers suggested the following:
Dump Analyses Result:
===================
What happened is that the OS initiated an APIC /software interrupt. This is handled by the APIC in real hardware. In our Virtual Environment case where we are using Linux based VM – Proxmox, the VM implementation somehow has to make it happen on a virtual machine with the same latency in the virtual APIC. The problem is that there is a latency between when it was initiated and when it happened.
Below are the details for understanding the process or concept of APIC interrupts:
What the Local APIC Is
The Local APIC (LAPIC) is a circuit that is part of the CPU chip. It contains these basic elements:
A mechanism for generating
1. interrupts
2. A mechanism for accepting interrupts
3. A timer
If you have a multiprocessor system, the APIC's are wired together so they can communicate. So the LAPIC on CPU 0 can communicate with the LAPIC on CPU 1, etc.
What the IO APIC Is
This is a separate chip that is wired to the Local APIC's so it can forward interrupts on to the CPU chips. It is programmed similar to the 8259's but has more flexibility.
It is wired to the same bus as the Local APIC's so it can communicate with them.
Note:- In our scenario, it’s all Virtualized interrupts or calls because of hypervisor in picture and thus we need to contact the VM application vendor to get a check of this latency issue in APIC interrupt.
------------------------------------------------End of Message----------------------------------
Their engineers are saying that there is a latency issue with APIC calls. I'm not exactly sure how this can be corrected. Is this a known issue and is their a solution to this problem. I love Proxmox, but my main reason for using it was to upgrade my terminal server to better hardware, while leveraging it for other virtual machines as well.
Version:
pve-manager: 1.7-10 (pve-manager/1.7/5323)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.7-28
pve-kernel-2.6.32-4-pve: 2.6.32-28
qemu-server: 1.1-25
pve-firmware: 1.0-9
libpve-storage-perl: 1.0-16
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-9
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.13.0-2
ksm-control-daemon: 1.0-4
VM Configuration:
name: TS64
ide2: none,media=cdrom
bootdisk: ide0
ostype: w2k3
ide0: data:vm-104-disk-1
memory: 10240
sockets: 1
vlan0: virtio=C6:4C:B3:BB:AD:67
onboot: 1
cores: 4
boot: cad
freeze: 0
cpuunits: 1000
acpi: 1
kvm: 1
CPU 2x Xeon Quad Core E5620 2.4GHZ Processors:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
stepping : 2
cpu MHz : 2400.323
cache size : 12288 KB
physical id : 0
siblings : 8
core id : 9
cpu cores : 4
apicid : 19
initial apicid : 19
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm ida arat tpr_shadow vnmi flexpriority ept vpid
bogomips : 4800.19
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
Performance:
CPU BOGOMIPS: 76803.21
REGEX/SECOND: 850066
HD SIZE: 33.96 GB (/dev/mapper/pve-root)
BUFFERED READS: 333.03 MB/sec
AVERAGE SEEK TIME: 6.10 ms
FSYNCS/SECOND: 2948.85
DNS EXT: 131.42 ms
DNS INT: 1.28 ms
I've been successfully running 2 Windows 2003 32-Bit Standard Edition Servers on this server for over a month now. Both were migrations from actual physical servers. However, I've continued to receive random crashes on a Windows 2003 64-bit standard edition running terminal services, which was a fresh install. The server runs fine for hours under a decent load (20 users) and then will crash with a 3B bug check code (System_Service_Exception). I opened a ticket with Microsoft and submitted multiple memory dumps and their engineers suggested the following:
Dump Analyses Result:
===================
What happened is that the OS initiated an APIC /software interrupt. This is handled by the APIC in real hardware. In our Virtual Environment case where we are using Linux based VM – Proxmox, the VM implementation somehow has to make it happen on a virtual machine with the same latency in the virtual APIC. The problem is that there is a latency between when it was initiated and when it happened.
Below are the details for understanding the process or concept of APIC interrupts:
What the Local APIC Is
The Local APIC (LAPIC) is a circuit that is part of the CPU chip. It contains these basic elements:
A mechanism for generating
1. interrupts
2. A mechanism for accepting interrupts
3. A timer
If you have a multiprocessor system, the APIC's are wired together so they can communicate. So the LAPIC on CPU 0 can communicate with the LAPIC on CPU 1, etc.
What the IO APIC Is
This is a separate chip that is wired to the Local APIC's so it can forward interrupts on to the CPU chips. It is programmed similar to the 8259's but has more flexibility.
It is wired to the same bus as the Local APIC's so it can communicate with them.
Note:- In our scenario, it’s all Virtualized interrupts or calls because of hypervisor in picture and thus we need to contact the VM application vendor to get a check of this latency issue in APIC interrupt.
------------------------------------------------End of Message----------------------------------
Their engineers are saying that there is a latency issue with APIC calls. I'm not exactly sure how this can be corrected. Is this a known issue and is their a solution to this problem. I love Proxmox, but my main reason for using it was to upgrade my terminal server to better hardware, while leveraging it for other virtual machines as well.