I'm facing a difficult networking problem for the second time.
Here's the situation:
- Proxmox with mostly OpenVZ machines, but also a KVM machine (playing the role of router/firewall)
- Network responses from the KVM machine are slow, although bandwidth is not a problem. Ping times vary from 2 ms to 1000 ms (see below)
- The slow responses do not occur when connecting to the OpenVZ machines (ping stable from 1 to 4 ms)
- The problem is present on both IPv4 and IPv6
Sat 06 Jun 12:03 $ ping vm01 => from laptop to hypervisor = OK
PING vm01.home.xxxxx (10.107.2.240): 56 data bytes
64 bytes from 10.107.2.240: icmp_seq=0 ttl=64 time=4.267 ms
64 bytes from 10.107.2.240: icmp_seq=1 ttl=64 time=0.947 ms
64 bytes from 10.107.2.240: icmp_seq=2 ttl=64 time=0.933 ms
Sat 06 Jun 12:05 $ ping nas01 => from laptop to openvz virtual machine = OK
PING nas01.home.xxxxx (10.107.2.8): 56 data bytes
64 bytes from 10.107.2.8: icmp_seq=0 ttl=64 time=3.919 ms
64 bytes from 10.107.2.8: icmp_seq=1 ttl=64 time=0.952 ms
64 bytes from 10.107.2.8: icmp_seq=2 ttl=64 time=0.844 ms
Sat 06 Jun 12:05 $ ping gw01 => from laptop to KVM firewall = NOK
PING gw01.home.xxxxx (10.107.2.254): 56 data bytes
64 bytes from 10.107.2.254: icmp_seq=0 ttl=64 time=1000.727 ms
64 bytes from 10.107.2.254: icmp_seq=1 ttl=64 time=1000.851 ms
64 bytes from 10.107.2.254: icmp_seq=2 ttl=64 time=1.910 ms
64 bytes from 10.107.2.254: icmp_seq=3 ttl=64 time=1001.805 ms
64 bytes from 10.107.2.254: icmp_seq=4 ttl=64 time=999.146 ms
64 bytes from 10.107.2.254: icmp_seq=5 ttl=64 time=82.865 ms
64 bytes from 10.107.2.254: icmp_seq=6 ttl=64 time=15.473 ms
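To put numbers on the pattern above, here is a minimal Python sketch that extracts the round-trip times from ping output and lists which sequence numbers were slow. The function names and the 100 ms threshold are my own choices for illustration, and the sample is simply the gw01 output pasted from above.

```python
import re

# Matches the "icmp_seq=N ttl=N time=X ms" part of a ping reply line.
RTT_PATTERN = re.compile(r"icmp_seq=(\d+)\s+ttl=\d+\s+time=([\d.]+)\s*ms")

# gw01 replies copied from the ping run above.
GW01_SAMPLE = """\
64 bytes from 10.107.2.254: icmp_seq=0 ttl=64 time=1000.727 ms
64 bytes from 10.107.2.254: icmp_seq=1 ttl=64 time=1000.851 ms
64 bytes from 10.107.2.254: icmp_seq=2 ttl=64 time=1.910 ms
64 bytes from 10.107.2.254: icmp_seq=3 ttl=64 time=1001.805 ms
64 bytes from 10.107.2.254: icmp_seq=4 ttl=64 time=999.146 ms
64 bytes from 10.107.2.254: icmp_seq=5 ttl=64 time=82.865 ms
64 bytes from 10.107.2.254: icmp_seq=6 ttl=64 time=15.473 ms
"""


def parse_rtts(ping_output):
    """Return a list of (icmp_seq, rtt_ms) tuples found in ping output."""
    return [(int(seq), float(rtt)) for seq, rtt in RTT_PATTERN.findall(ping_output)]


def summarize(rtts, slow_threshold_ms=100.0):
    """Summarize RTTs; the threshold is an arbitrary cut-off, not a standard value."""
    values = [rtt for _, rtt in rtts]
    slow_seqs = [seq for seq, rtt in rtts if rtt >= slow_threshold_ms]
    return {
        "count": len(values),
        "min_ms": min(values),
        "max_ms": max(values),
        "slow_seqs": slow_seqs,
    }


if __name__ == "__main__":
    print(summarize(parse_rtts(GW01_SAMPLE)))
```

Running the same summary over a longer ping run would show whether the slow replies keep clustering close to one second, as they do in this short sample.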
From a cabling perspective, everything passes through the same switches and cables.
I captured traffic on my laptop, on the bridge of the hypervisor (vm01), and on gw01 (the KVM machine).
The results are:
- my laptop sends a packet on the network at T
- my laptop sees reply at T + ~1000ms (long delay)
- vm01 sees packet arrive on vmbr2 at T
- vm01 sees reply at T + ~1000ms (long delay)
- gw01 sees packet arrive on machine at T + ~900ms
- gw01 sends reply immediately (no delay)
=> Conclusion: Delay is between the hypervisor and the KVM virtual machine.
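The reasoning behind that conclusion can be sketched in Python. At each capture point, take the time between seeing the request and seeing the reply, using only that machine's own clock, so the three clocks do not need to be synchronized. The difference between two neighbouring capture points is then the round-trip time added between them. The timestamps below are made-up values that only mimic the observations listed above, and the function names are hypothetical.

```python
def turnaround_ms(request_ts, reply_ts):
    """Time in ms between request and reply as seen at one capture point.

    Both timestamps (in seconds) come from the same machine's clock.
    """
    return (reply_ts - request_ts) * 1000.0


def localize_delay(turnarounds):
    """Attribute round-trip delay to the segments between capture points.

    turnarounds: list of (name, ms), ordered from the sender towards the target.
    Returns a list of (segment_name, added_round_trip_ms).
    """
    segments = []
    for (outer, outer_ms), (inner, inner_ms) in zip(turnarounds, turnarounds[1:]):
        segments.append((f"{outer} <-> {inner}", outer_ms - inner_ms))
    return segments


if __name__ == "__main__":
    # Illustrative timestamps only; each pair uses its own unsynchronized clock.
    observed = [
        ("laptop", turnaround_ms(0.0000, 1.0005)),
        ("vmbr2", turnaround_ms(100.0002, 101.0004)),
        ("gw01", turnaround_ms(50.9000, 50.9001)),
    ]
    for segment, added_ms in localize_delay(observed):
        print(f"{segment}: {added_ms:.1f} ms")
```

With values like these, nearly all of the delay lands on the vmbr2 <-> gw01 segment. This approach only gives the sum of both directions; splitting it into inbound and outbound delay would need synchronized clocks on the hypervisor and the guest.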
A month ago I had exactly the same behavior. I fixed it by reinstalling my VM from scratch (I was performing a network migration at that time).
For 1 or 2 weeks there were no problems (none that I noticed), but now it's starting again...
Some details about the gw01 configuration:
- 5 virtio network interfaces
- virtio disk
- Ubuntu Linux 9.04, kernel 2.6.28-11-server, up to date
- the slow behavior affects every interface of the machine (from and to every DMZ)
The hypervisor/vm01:
- standard Proxmox install, no extra repo, up to date
I'm reaching the limit of my knowledge and don't know where to continue debugging...
I could create a new VM again, but that's only a temporary solution and still doesn't explain the cause.
Any ideas for further debugging are welcome.
Thanks for your help.