Hi,
We are running several virtual machines (Windows and Linux guests) and are also seeing time drift on all Windows guests (x86 and x64), but on none of the Linux guests (also measured against ntppool time).
After reading through this forum, searching and reading a lot about kvm/qemu regarding this problem, fiddling around with several options like the infamous -rtc-td-hack (which at first seemed to solve the problem), and trying different -clock settings, we still have severe problems.
Today two of our Windows servers each lagged around 1 hour behind.
Of course we could set up frequent time synchronisation against the host's NTP clock, and we probably will, but we would at least like to discuss some observations.
Currently both Windows servers are running with the -rtc-td-hack option, but today I noticed a high load on the Windows 2003 server (32 bit, VMID=103): around 40% of the CPU time was spent inside the kernel (according to Task Manager).
Perfmon also showed high values for “Interrupt Time” on this machine (20%-25%). The other machine did not seem to be affected by this problem.
As far as I understand the -rtc-td-hack, it tries to reinject lost timer interrupts into the Windows guest again and again, so Windows is busy processing all the lost interrupts of the last hour, which leads to the high load.
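To make that guess concrete, here is a toy model of a guest clock that only advances when a timer tick is actually delivered. It is purely illustrative and is not how qemu/kvm implements the option; the function name `simulate`, the per-step "capacity" idea and the numbers are all my own assumptions.

```python
def simulate(capacity_per_step, ticks_per_step=1, reinject=False):
    """Return the guest tick count after each real-time step.

    capacity_per_step: how many ticks the guest can accept in each step
                       (0 models a vCPU that was not scheduled at all).
    reinject:          if True, undelivered ticks are queued and retried;
                       if False, they are simply lost.
    """
    backlog = 0          # ticks still waiting to be delivered
    delivered_total = 0  # what the guest clock has counted so far
    history = []
    for capacity in capacity_per_step:
        pending = ticks_per_step + (backlog if reinject else 0)
        delivered = min(pending, capacity)
        backlog = pending - delivered if reinject else 0
        delivered_total += delivered
        history.append(delivered_total)
    return history


# Guest starved for three steps, then able to take several ticks per step.
load = [0, 0, 0, 4, 4, 1]
lossy = simulate(load)                    # ticks lost for good
caught_up = simulate(load, reinject=True)  # ticks replayed in a burst
```

In this model the lossy clock ends three ticks behind, while the reinjecting clock catches up by taking four ticks in a single step, which is the kind of burst that would also cost extra interrupt time in the guest.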
I have written a small cron job to log the time differences of all four machines, and we are seeing the time lag repeatedly, sometimes “only” 300-400 seconds behind, but sometimes even several hours. This happens with the hack option both active and inactive.
Today it seems worse, probably because the host has been busy copying large amounts of data for several hours.
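For anyone who wants to log the same thing, a minimal sketch of such a check could look like the following. This is not my actual cron job; it assumes each guest answers SNTP queries on UDP port 123 (for example via the Windows Time service), and the function names and the log format are made up for illustration.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def build_request() -> bytes:
    """48-byte SNTP client request: LI=0, VN=3, Mode=3, everything else zero."""
    return b"\x1b" + 47 * b"\x00"


def transmit_time(reply: bytes) -> float:
    """Extract the server's transmit timestamp as Unix time."""
    if len(reply) < 48:
        raise ValueError("SNTP reply too short")
    seconds, fraction = struct.unpack("!II", reply[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def query_offset(server: str, timeout: float = 5.0) -> float:
    """Rough clock offset of `server` relative to this machine, in seconds."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sent = time.time()
        sock.sendto(build_request(), (server, 123))
        reply, _ = sock.recvfrom(512)
        received = time.time()
    # Compare against the midpoint of the round trip on the local clock.
    return transmit_time(reply) - (sent + received) / 2


def log_line(name: str, offset: float, now=None) -> str:
    """One log line: UTC timestamp, guest name, signed offset in seconds."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now))
    return f"{stamp} {name} {offset:+.3f}"
```

Run from cron on the host, something like `print(log_line("VAMP", query_offset("<guest address>")))` appended to a file gives one line per guest per run; a negative offset means the guest is behind the host.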
The interesting observation was this: when I looked at the Windows 2003 server's clock through an RDP connection, the second hand was moving not at all or very seldom, but when I accessed the hard disk, it hurried to catch up, moving about 3-4 seconds in 1 real second. When I stopped accessing the hard disk (“dir /s”), it stopped again.
When I ran dir /s again, at first nothing happened (I guess because the directory structure was already in the file cache), but the clock started running again once the listing got beyond the point where I had stopped initially.
I did some more tests, and it seems that accessing the network also leads to this effect. Searching for a file with Windows Explorer on a network drive on a different computer, or an FTP get of a 2 MB file, also sped up the guest clock, though the latter may be due to the resulting file being written to the hard disk. Reading a file from a share on this server also results in a faster clock.
Does anybody have any idea?
Any help would be appreciated.
Here is the configuration of the most affected VM:
--103.conf:
name: VAMP
ide2: none,media=cdrom
smp: 2
ostype: w2k3
memory: 4096
onboot: 1
description: Der alte
boot: dc
freeze: 0
cpuunits: 1000
acpi: 1
kvm: 1
bootdisk: ide0
vlan0: ne2k_pci=A6:39:4E:42:27:77
hostusb: 057c:1900
ide0: vm-103-disk-0.img
ide1: vm-103-disk-1.qcow2
ide3: vm-103-disk-2.qcow2
args: -rtc-td-hack
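Since only one of the two Windows servers shows the high interrupt load, comparing their configurations might be worthwhile. A small sketch for that, assuming the conf files are plain `key: value` lines as above (the helper names `parse_conf` and `diff_conf` are made up):

```python
def parse_conf(text: str) -> dict:
    """Parse 'key: value' lines into a dict, splitting on the first colon only."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue  # skip blank lines and anything that is not key: value
        key, _, value = line.partition(":")
        settings[key.strip()] = value.strip()
    return settings


def diff_conf(a: dict, b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for every setting that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}
```

Splitting on the first colon only matters here because values such as the MAC address in the vlan0 line contain colons themselves.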
And here is some information about the host:
vmhost:~# pveversion -v
pve-manager: 1.3-1 (pve-manager/1.3/4023)
qemu-server: 1.0-14
pve-kernel: 2.6.24-8
pve-kvm: 86-1
pve-firmware: 1
vncterm: 0.9-2
vzctl: 3.0.23-1pve3
vzdump: 1.1-2
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
vmhost:~# uname -a
Linux vmhost 2.6.24-7-pve #1 SMP PREEMPT Tue Jun 2 08:00:29 CEST 2009 x86_64 GNU/Linux
vmhost:~# qm showcmd 103
/usr/bin/kvm -monitor unix:/var/run/qemu-server/103.mon,server,nowait -vnc unix:/var/run/qemu-server/103.vnc,password -pidfile /var/run/qemu-server/103.pid -daemonize -usbdevice tablet -usbdevice host:057c:1900 -name VAMP -smp 2 -id 103 -cpuunits 1000 -boot dc -vga cirrus -tdf -localtime -k de -drive file=/var/lib/vz/images/103/vm-103-disk-1.qcow2,if=ide,index=1 -drive file=/var/lib/vz/images/103/vm-103-disk-0.img,if=ide,index=0,boot=on -drive file=,if=ide,index=2,media=cdrom -drive file=/var/lib/vz/images/103/vm-103-disk-2.qcow2,if=ide,index=3 -m 4096 -net tap,vlan=0,ifname=vmtab103i0,script=/var/lib/qemu-server/bridge-vlan0 -net nic,vlan=0,model=ne2k_pci,macaddr=A6:39:4E:42:27:77 -rtc-td-hack
The host is running on brand new hardware:
Dual CPU board with two Intel Xeon E5530 at 2.40GHz
24 GB RAM
Two 3ware 9650 RAID controllers with eight 2.5” SATA drives attached.
Its clock is running very stable (measured through ntppery against ntppool.org)