VirtualGuest: Windows XP: system time stalls, time drifts (too slow)

plewka

Member
Sep 28, 2009
This is probably a bug caused by the guest OS, but does anyone know a fix?
We used a tool to sync the time every second, but the updates would need to be faster.
As far as I know, KVM86 (PVE 1.3) has an option for a time fix.
Maybe that would do the job? How do I apply the option parameter?

This is not just a nice-to-have, by the way. Some applications have
time-triggered buttons that pop up after, e.g., 1 second. The app becomes
useless if you have to wait 5 seconds or more.

The most direct way would be to modify the system time code of the OS
and make it synchronize with the hardware more often, or always :-(.
I noticed discussions in this forum about the timer rate (100 Hz/250 Hz/1 kHz)
as a kernel setting; that is not available in XP, AFAIK.

How about a new main thread dedicated to trouble caused by the
different guest OSs?
 
Yes, a virtual network adapter.
I'll try switching to RTL8139 to see whether that helps with the problem.

No effect :-((

Is my understanding correct, by the way? Should I have noticed
a change after switching the network card to RTL8139 while using
no other paravirtualized drivers?
 
Can you test the behavior with 1.4beta2? (It includes a new KVM release.)

The best-tested and most reliable way to run XP is to use IDE for disks and e1000 for the network (you need to download the Intel network drivers from intel.com).

Also, please send the /etc/qemu-server/VMID.conf file of the machine.
 
We'll check beta2 on different hardware next.

Vista2007Enterprise shows the same behaviour (so far
checked with RTL8139 only), by the way... _not_ using virtual I/O.

RHEL 5.1 doesn't install when more than one CPU is selected
because of clock jitter, and Solaris has some trouble
because of IRQ difficulties. Hmm, one could ask oneself
whether there is a general problem in "old" KVM.
 
The problem does not show up with PVE 1.4beta2 on a Core2Duo (a Dell).
We'll take a closer look at whether it's caused by the platform or by the PVE release.


Update:

In detail, it doesn't really drift; it simply freezes because it misses tons of timer interrupts.
Seems to be a bigger if-then-else condition...
We just started the machine with 1.4b2 on the Dell Intel Core Duo (strange, buggy BIOS on that one, by the way) and removed the -rtc-td-hack option.
It "works" now, even though the Intel machine still drifts strongly.

One thing we don't understand so far is why the patch did not fix
our Phenom II X4 platform as well (we ran 1.4b2 there, too). Maybe it depends on a different BIOS setting, or something is different on AMD (CPU/chipset/...) that prevents the patch from being successful.

JP
 
It's a PhenomX4/AM3/MSI issue :-((. We're only losing 0...1000% of
time on that platform, while we don't on the Intel Core2Duo.
 
Hi,

I've had similar issues during my tests.

- Could you please check the power management settings in your BIOS?
- What does your VM syslog show?

I hope that helps.

Regards
 
Re: VirtualGuest: Windows XP: system time stalls, time drifts, Phenom HPET issue?!

We took a journey through the BIOS settings and changed everything we expected could cause even the slightest trouble: no success.

Next update after using Google again:

http://www.mail-archive.com/kvm@vger.kernel.org/msg11454.html

According to that, Phenom + AM780g + HPET causes unreliable timekeeping, and some time later, when running under full load, you're going to have a fat crash...
For a few months now, the problems have even been noticed on Xeons and Opterons.

Hmm, we disabled HPET in the BIOS, which is not enough.
As I understand the posts, the kernel doesn't care about the
BIOS setting; only a kernel parameter (?) could do the job.
We'll keep trying...
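
To see which timer the host kernel actually uses, independent of the BIOS setting, one can read the clocksource entries in sysfs. A minimal Python sketch, assuming a Linux host that exposes /sys/devices/system/clocksource/clocksource0; the function names are made up for illustration and are not part of PVE or KVM:

```python
from pathlib import Path

# Standard sysfs location on Linux hosts; may be missing on very old kernels.
SYSFS_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def parse_clocksources(current_text, available_text):
    """Return (current, available) from the raw contents of the two sysfs files."""
    return current_text.strip(), available_text.split()


def hpet_in_use(current_text):
    """True if the kernel currently keeps time from the HPET."""
    return current_text.strip() == "hpet"


def read_host_clocksources(sysfs_dir=SYSFS_DIR):
    """Read both sysfs files; return None if they cannot be read."""
    try:
        current = (Path(sysfs_dir) / "current_clocksource").read_text()
        available = (Path(sysfs_dir) / "available_clocksource").read_text()
    except OSError:
        return None
    return parse_clocksources(current, available)


if __name__ == "__main__":
    result = read_host_clocksources()
    if result is None:
        print("clocksource information not available on this system")
    else:
        current, available = result
        print("current clocksource:", current)
        print("available clocksources:", ", ".join(available))
```

If the current clocksource turns out to be hpet, the kernel boot parameters hpet=disable or clocksource=<name> (with a name from the available list, e.g. acpi_pm) are the documented ways to override it. Whether that helps on the Phenom board is untested here.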
 
Where exactly should I look in the syslog? These could be bad news:

Clockevents: could not switch to high resolution mode on CPUx
Could not switch to one-shot mode: lapic is not functional
Please enable iommu in BIOS ... can't find/do that in BIOS :-(
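
Messages like these can be pulled out of the host's syslog with a small filter. A rough Python sketch, assuming a Debian-style /var/log/syslog; the keyword list is only a guess for illustration, not an official list of kernel messages:

```python
import re

# Hypothetical keyword list, chosen for illustration; extend as needed.
TIMER_PATTERN = re.compile(
    r"\b(clockevents|clocksource|hpet|lapic|tsc)\b", re.IGNORECASE
)


def timer_related(lines):
    """Return the log lines that mention a timer-related keyword."""
    return [line.rstrip("\n") for line in lines if TIMER_PATTERN.search(line)]


def scan_log(path="/var/log/syslog"):
    """Scan a log file; errors="replace" tolerates odd bytes in old logs."""
    with open(path, errors="replace") as handle:
        return timer_related(handle)


if __name__ == "__main__":
    try:
        for hit in scan_log():
            print(hit)
    except OSError as exc:
        print("could not read log:", exc)
```

Reading the syslog usually needs root rights on the host. The filter only narrows down where to look; it does not say which message is the actual cause.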

Hmm...
 
I have had a closer look at that issue.
Probably many or even most users have this issue on their KVM servers but don't know about it, since they don't run a Windows VM with high-resolution timing. It is known in all virtualisation products, including KVM, QEMU, VirtualBox, VMware and ESX;
some tried to fix it, some are still trying, and some damaged their fixes through
other enhancements.
Most of the time you won't see the difficulties, because you don't switch Windows XP to high-resolution timing. As far as I know so far,
LabVIEW and QuickTime make use of the higher-resolution timing.
LabVIEW is completely unusable because of that, since it has
time-triggered context menus, which simply take 10 times as long or more
to open. If you try to do some automation with it, it will fail even for slow stuff. Depending on load, time freezes for more than 10 seconds and
the machine loses 50% or more of its time.
Since synchronisation at <1 s resolution is required, you simply can't do that via time servers. It takes much too long and has too much overhead.

If you have a look at Linux, you'll notice that timekeeping was a bad thing there, too,
but the kernel developers have put a lot of effort into timing, and still do, looking for accurate results through the various system timers. Linux systems are able to use different sources for system time, while older Windows versions can't. RHEL has a basic check integrated to prevent installation on a
machine that uses more than one core without time integrity, which could
ultimately damage system resources (filesystems) permanently.

Unfortunately, nobody made a mistake, but nobody thought about a useful
concept, either. If the hardware offers timers based on the CPU clock only, while
that clock changes strongly due to dynamic clocking and dynamic under/overclocking
inside the chipset and between different cores on the CPU, it's really hard
for the software to get a useful time base.
Inside (hardware-based) virtualisation, the jitter is simply at its maximum
by design, but that is not really the reason, only a part of it.

The thing that is OK is the RTC, but at least with Windows XP... it's not used. Hardware people integrated a High Precision Event Timer (HPET) to get down to the sub-ms range with precision... but again did not strictly define a non-changing base clock for it. Maybe it would work on our hardware with the enhanced clocking and energy-saving options disabled, but I don't see the HPET as a hardware resource inside the Windows XP SP3 VM (as I understand it, there should be one... or is it limited to Vista?).
According to the docs I have looked at, XP SP3 offers the timer, but I'm not sure whether win32time makes use of it, or whether its use is limited to the newer media API
functions.
Probably the Bochs BIOS doesn't offer the HPET feature inside the VM, either.

The best solution, as I understand it, would be a modified win32time.dll that simply uses the KVM time functions. The <1 ms timer interrupts Windows requires in high resolution (probably 100 µs, as Linux did in the past, too) are simply too fast for virtualisation if the host runs its kernel at a lower rate for efficiency. Linux time on the host at least has full integrity on our hardware, even with energy saving and dynamic clocking enabled. The whole concept of raising timer interrupts without any process waiting for them is a stupid idea, caused by the
prehistoric hardware concept. Other platforms don't do it the same way, do they?

It seems there are things to do regarding that issue. We'll have a look at Win2008R2 and see if it's fixed there at the OS level.

JP
 
I have more details now:
The default tick period of Windows XP (for example) is 15 ms, but any application can ask for a better resolution, down to 1 ms. Unfortunately, 1 ms is equivalent to the tick period of the 2.6 Linux kernel (if it is not >= 2.6.27 and able to run tickless using the new event counters of newer hardware).
Hopefully I understood that correctly. I don't know how they were able to fix it on some machines and why this fix fails on others, though.
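
These numbers can be turned into a rough back-of-the-envelope model. A Python sketch under the assumption that the guest clock advances by one tick period per delivered timer interrupt and never catches up on missed ones (a simplification; the function names and the 50% figure in the example are illustrative only):

```python
def ticks_per_second(period_ms):
    """Timer interrupts per second for a given tick period in milliseconds."""
    return 1000.0 / period_ms


def guest_clock_advance(real_seconds, period_ms, delivered_fraction):
    """Seconds the guest clock advances if it only counts delivered ticks.

    Simplified model: every delivered tick adds one period to the guest
    clock, and missed ticks are never caught up.
    """
    delivered_ticks = real_seconds * ticks_per_second(period_ms) * delivered_fraction
    return delivered_ticks * period_ms / 1000.0


if __name__ == "__main__":
    for period_ms in (15, 1):
        rate = ticks_per_second(period_ms)
        print(f"{period_ms} ms period -> {rate:.1f} interrupts/s")
    # Hypothetical case: only half of the 1 ms ticks reach the guest.
    advance = guest_clock_advance(10, 1, 0.5)
    print(f"guest clock after 10 s of real time: {advance:.1f} s")
```

The point of the model: at a 1 ms period the guest needs roughly 15 times as many interrupts per second as at the default 15 ms, so any ticks the host cannot deliver in time show up directly as lost guest time.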

JP
 