Hi,
I'm running a PVE cluster of two bare-metal installs[1]. I'm testing online migration of Windows and Linux VMs, and found that a Karmic 64-bit machine often freezes after a live migration.
The migration itself (machines are LVM-based, on a SAN-provided disk presented to both cluster nodes) succeeds, but after some time the machine freezes.
Windows (Server 2003, 32-bit) machines migrate without problems, and the same appears to hold for Karmic 32-bit.
In particular, the freeze happens only when migrating back from node to master, not from master to node.
Furthermore, I found that there is a significant clock drift, which appears only after a migration from node to master. This is evidenced by running ping[2] during migration:
the "Warning: time of day goes back (-<microseconds>us), taking countermeasures." message appears just after migration, and only when migrating from node to master.
Both cluster nodes run ntpd, configured against our reference NTP server on the internal LAN.
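For reference, these are the kinds of checks I've been running to inspect the guest clock; this is just a sketch, and it assumes the guest exposes the sysfs clocksource interface and that ntpq is installed (a TSC clocksource in the guest is a common suspect for post-migration drift, while kvm-clock is usually safer):

```shell
# Inside the guest: which clocksource is the kernel using?
# "tsc" is known to drift across live migrations between hosts;
# "kvm_clock" (paravirtual) generally behaves better.
cat /sys/devices/system/clocksource/clocksource0/current_clocksource 2>/dev/null \
  || echo "clocksource interface not available"

# On both hosts: confirm NTP is actually synchronised (look for a '*' peer),
# not merely that the daemon is running.
ntpq -pn 2>/dev/null || echo "ntpq not installed"
```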
Any hint?
bye,
rob
[1] # pveversion -v
pve-manager: 1.5-8 (pve-manager/1.5/4674)
running kernel: 2.6.18-2-pve
proxmox-ve-2.6.18: 1.5-5
pve-kernel-2.6.18-2-pve: 2.6.18-5
qemu-server: 1.1-11
pve-firmware: 1.0-3
libpve-storage-perl: 1.0-10
vncterm: 0.9-2
vzctl: 3.0.23-1pve8
vzdump: 1.2-5
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm-2.6.18: 0.9.1-5
[2] $ ping 192.168.10.48
PING 192.168.10.48 (192.168.10.48) 56(84) bytes of data.
64 bytes from 192.168.10.48: icmp_seq=1 ttl=64 time=359 ms
64 bytes from 192.168.10.48: icmp_seq=2 ttl=64 time=0.311 ms
Warning: time of day goes back (-2158us), taking countermeasures.
Warning: time of day goes back (-2046us), taking countermeasures.
64 bytes from 192.168.10.48: icmp_seq=3 ttl=64 time=0.000 ms
64 bytes from 192.168.10.48: icmp_seq=4 ttl=64 time=0.854 ms
64 bytes from 192.168.10.48: icmp_seq=5 ttl=64 time=2.66 ms
Warning: time of day goes back (-2144us), taking countermeasures.
64 bytes from 192.168.10.48: icmp_seq=6 ttl=64 time=0.000 ms