Network issue

tesquenure

Network issue [Solved]

Hi,

My network goes down. The setup is 1 PVE host, 2 KVM VMs, 7 PCs, 1 10/100/1000 switch and 1 ADSL router.

When users open 7 SSH connections to each VM server, the whole network goes down after 1/2 days.

I have already changed the router and the switch.

================
CONF
================
Code:
pveversion -v
pve-manager: 1.6-2 (pve-manager/1.6/5087)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.6-19
pve-kernel-2.6.32-4-pve: 2.6.32-19
qemu-server: 1.1-18
pve-firmware: 1.0-8
libpve-storage-perl: 1.0-14
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-7
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.12.5-1
ksm-control-daemon: 1.0-4
I changed the network card too.
Code:
cat /etc/network/interfaces 
# network interface settings
auto lo
iface lo inet loopback

iface eth1 inet manual

iface eth2 inet manual

iface eth3 inet manual

iface eth4 inet manual

auto vmbr0
iface vmbr0 inet static
    address  192.168.1.99
    netmask  255.255.255.0
    gateway  192.168.1.20
    bridge_ports eth4
    bridge_stp off
    bridge_fd 0

auto vmbr1
iface vmbr1 inet manual
    bridge_ports eth3
    bridge_stp off
    bridge_fd 0
Code:
brctl show
bridge name    bridge id        STP enabled    interfaces
vmbr0        8000.00005a000df0    no        eth4
vmbr1        8000.00005a000def    no        eth3
                            vmtab102i1
                            vmtab103i1
Code:
netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.1.0     0.0.0.0         255.255.255.0   U         0 0          0 vmbr0
0.0.0.0         192.168.1.20    0.0.0.0         UG        0 0          0 vmbr0
Code:
ifconfig
eth3      Link encap:Ethernet  HWaddr 00:00:5a:00:0d:ef  
          inet6 addr: fe80::200:5aff:fe00:def/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:35974 errors:0 dropped:0 overruns:0 frame:0
          TX packets:35110 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:3643218 (3.4 MiB)  TX bytes:5890890 (5.6 MiB)
          Interrupt:16 

eth4      Link encap:Ethernet  HWaddr 00:00:5a:00:0d:f0  
          inet6 addr: fe80::200:5aff:fe00:df0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:13328 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7992 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1077770 (1.0 MiB)  TX bytes:2733265 (2.6 MiB)
          Interrupt:35 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:917 errors:0 dropped:0 overruns:0 frame:0
          TX packets:917 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:273343 (266.9 KiB)  TX bytes:273343 (266.9 KiB)

venet0    Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          UP BROADCAST POINTOPOINT RUNNING NOARP  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

vmbr0     Link encap:Ethernet  HWaddr 00:00:5a:00:0d:f0  
          inet addr:192.168.1.99  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::200:5aff:fe00:df0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:13328 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7990 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:890050 (869.1 KiB)  TX bytes:2733125 (2.6 MiB)

vmbr1     Link encap:Ethernet  HWaddr 00:00:5a:00:0d:ef  
          inet6 addr: fe80::200:5aff:fe00:def/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:11780 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:1045641 (1021.1 KiB)  TX bytes:468 (468.0 B)

vmtab102i1 Link encap:Ethernet  HWaddr 0e:e9:f7:ad:b0:59  
          inet6 addr: fe80::ce9:f7ff:fead:b059/64 Scope:Link
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
          RX packets:12243 errors:0 dropped:0 overruns:0 frame:0
          TX packets:16563 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500 
          RX bytes:2078910 (1.9 MiB)  TX bytes:1562102 (1.4 MiB)

vmtab103i1 Link encap:Ethernet  HWaddr 4a:58:23:7c:80:93  
          inet6 addr: fe80::4858:23ff:fe7c:8093/64 Scope:Link
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
          RX packets:22883 errors:0 dropped:0 overruns:0 frame:0
          TX packets:25220 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500 
          RX bytes:3812890 (3.6 MiB)  TX bytes:2451242 (2.3 MiB)
 
then

Code:
less /var/log/syslog
Oct  8 12:35:32 ruche202 pvedaemon[2688]: VM 102 started
Oct  8 12:35:33 ruche202 pvedaemon[2719]: starting VM 103 on node 0 (localhost)
Oct  8 12:35:33 ruche202 qm[2720]: VM 103 start
Oct  8 12:35:33 ruche202 kernel: device vmtab103i1 entered promiscuous mode
Oct  8 12:35:33 ruche202 kernel: vmbr1: port 3(vmtab103i1) entering forwarding state
Oct  8 12:35:33 ruche202 pvedaemon[2719]: VM 103 started
Oct  8 12:35:43 ruche202 kernel: vmtab102i1: no IPv6 routers present
Oct  8 12:35:44 ruche202 kernel: vmtab103i1: no IPv6 routers present
Oct  8 12:39:44 ruche202 ntpd[2543]: Listening on interface #7 vmtab102i1, fe80::ce9:f7ff:fead:b059#123 Enabled
Oct  8 12:39:44 ruche202 ntpd[2543]: Listening on interface #8 vmtab103i1, fe80::4858:23ff:fe7c:8093#123 Enabled
Oct  8 12:40:01 ruche202 /USR/SBIN/CRON[2799]: (root) CMD (/usr/share/vzctl/scripts/vpsreboot)
Oct  8 12:40:01 ruche202 /USR/SBIN/CRON[2800]: (root) CMD (test -x /usr/lib/atsar/atsa1 && /usr/lib/atsar/atsa1)
Oct  8 12:40:01 ruche202 /USR/SBIN/CRON[2801]: (root) CMD (/usr/share/vzctl/scripts/vpsnetclean)
Oct  8 12:50:01 ruche202 /USR/SBIN/CRON[2892]: (root) CMD (test -x /usr/lib/atsar/atsa1 && /usr/lib/atsar/atsa1)
Oct  8 12:50:01 ruche202 /USR/SBIN/CRON[2893]: (root) CMD (/usr/share/vzctl/scripts/vpsnetclean)
Oct  8 12:50:01 ruche202 /USR/SBIN/CRON[2894]: (root) CMD (/usr/share/vzctl/scripts/vpsreboot)
Oct  8 12:55:01 ruche202 /USR/SBIN/CRON[2948]: (root) CMD (/usr/share/vzctl/scripts/vpsnetclean)
Oct  8 12:55:01 ruche202 /USR/SBIN/CRON[2949]: (root) CMD (/usr/share/vzctl/scripts/vpsreboot)
Oct  8 13:00:01 ruche202 /USR/SBIN/CRON[2999]: (root) CMD (/usr/share/vzctl/scripts/vpsnetclean)
Oct  8 13:00:01 ruche202 /USR/SBIN/CRON[3000]: (root) CMD (test -x /usr/lib/atsar/atsa1 && /usr/lib/atsar/atsa1)
Oct  8 13:00:01 ruche202 /USR/SBIN/CRON[3001]: (root) CMD (/usr/share/vzctl/scripts/vpsreboot)
Oct  8 13:05:02 ruche202 /USR/SBIN/CRON[3055]: (root) CMD (/usr/share/vzctl/scripts/vpsnetclean)
Oct  8 13:05:02 ruche202 /USR/SBIN/CRON[3056]: (root) CMD (/usr/share/vzctl/scripts/vpsreboot)
Oct  8 13:10:01 ruche202 /USR/SBIN/CRON[3100]: (root) CMD (/usr/share/vzctl/scripts/vpsnetclean)
Oct  8 13:10:01 ruche202 /USR/SBIN/CRON[3099]: (root) CMD (/usr/share/vzctl/scripts/vpsreboot)
Oct  8 13:10:01 ruche202 /USR/SBIN/CRON[3098]: (root) CMD (test -x /usr/lib/atsar/atsa1 && /usr/lib/atsar/atsa1)
Oct  8 13:15:01 ruche202 /USR/SBIN/CRON[3144]: (root) CMD (/usr/share/vzctl/scripts/vpsreboot)
Oct  8 13:15:01 ruche202 /USR/SBIN/CRON[3145]: (root) CMD (/usr/share/vzctl/scripts/vpsnetclean)
Oct  8 13:17:01 ruche202 /USR/SBIN/CRON[3174]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Oct  8 13:17:26 ruche202 kernel: device vmbr1 entered promiscuous mode
Oct  8 13:20:01 ruche202 /USR/SBIN/CRON[3194]: (root) CMD (/usr/share/vzctl/scripts/vpsnetclean)
Oct  8 13:20:01 ruche202 /USR/SBIN/CRON[3195]: (root) CMD (/usr/share/vzctl/scripts/vpsreboot)
Oct  8 13:20:01 ruche202 /USR/SBIN/CRON[3196]: (root) CMD (test -x /usr/lib/atsar/atsa1 && /usr/lib/atsar/atsa1)
Oct  8 13:21:09 ruche202 kernel: device vmbr1 left promiscuous mode
Oct  8 13:21:31 ruche202 kernel: device vmbr1 entered promiscuous mode
Oct  8 13:21:38 ruche202 kernel: device vmbr1 left promiscuous mode
Oct  8 13:21:41 ruche202 kernel: device vmbr0 entered promiscuous mode
Oct  8 13:22:01 ruche202 kernel: device vmbr0 left promiscuous mode
Oct  8 13:22:18 ruche202 kernel: device vmbr1 entered promiscuous mode
Oct  8 13:23:11 ruche202 kernel: device vmbr1 left promiscuous mode
Oct  8 13:25:01 ruche202 /USR/SBIN/CRON[3246]: (root) CMD (/usr/share/vzctl/scripts/vpsreboot)
Oct  8 13:25:01 ruche202 /USR/SBIN/CRON[3247]: (root) CMD (/usr/share/vzctl/scripts/vpsnetclean)
Oct  8 13:30:01 ruche202 /USR/SBIN/CRON[3290]: (root) CMD (/usr/share/vzctl/scripts/vpsreboot)
Oct  8 13:30:01 ruche202 /USR/SBIN/CRON[3291]: (root) CMD (test -x /usr/lib/atsar/atsa1 && /usr/lib/atsar/atsa1)
Oct  8 13:30:01 ruche202 /USR/SBIN/CRON[3292]: (root) CMD (/usr/share/vzctl/scripts/vpsnetclean)
Oct  8 13:33:01 ruche202 kernel: device vmbr1 entered promiscuous mode
Oct  8 13:35:01 ruche202 /USR/SBIN/CRON[3329]: (root) CMD (/usr/share/vzctl/scripts/vpsnetclean)
Oct  8 13:35:01 ruche202 /USR/SBIN/CRON[3330]: (root) CMD (/usr/share/vzctl/scripts/vpsreboot)
Oct  8 13:36:40 ruche202 kernel: device vmbr1 left promiscuous mode
Oct  8 13:37:04 ruche202 kernel: device vmbr1 entered promiscuous mode
Oct  8 13:37:12 ruche202 kernel: device vmbr1 left promiscuous mode
Oct  8 13:40:01 ruche202 /USR/SBIN/CRON[3374]: (root) CMD (/usr/share/vzctl/scripts/vpsreboot)
Oct  8 13:40:01 ruche202 /USR/SBIN/CRON[3375]: (root) CMD (test -x /usr/lib/atsar/atsa1 && /usr/lib/atsar/atsa1)
Oct  8 13:40:01 ruche202 /USR/SBIN/CRON[3376]: (root) CMD (/usr/share/vzctl/scripts/vpsnetclean)
Oct  8 13:42:36 ruche202 kernel: device vmbr1 entered promiscuous mode
Oct  8 13:44:49 ruche202 kernel: device vmbr1 left promiscuous mode
Oct  8 13:45:01 ruche202 /USR/SBIN/CRON[3423]: (root) CMD (/usr/share/vzctl/scripts/vpsnetclean)
Oct  8 13:45:01 ruche202 /USR/SBIN/CRON[3424]: (root) CMD (/usr/share/vzctl/scripts/vpsreboot)
Oct  8 13:50:01 ruche202 /USR/SBIN/CRON[3467]: (root) CMD (/usr/share/vzctl/scripts/vpsreboot)
Oct  8 13:50:01 ruche202 /USR/SBIN/CRON[3468]: (root) CMD (/usr/share/vzctl/scripts/vpsnetclean)
Oct  8 13:50:01 ruche202 /USR/SBIN/CRON[3469]: (root) CMD (test -x /usr/lib/atsar/atsa1 && /usr/lib/atsar/atsa1)
Oct  8 13:55:01 ruche202 /USR/SBIN/CRON[3508]: (root) CMD (/usr/share/vzctl/scripts/vpsnetclean)
Oct  8 13:55:01 ruche202 /USR/SBIN/CRON[3509]: (root) CMD (/usr/share/vzctl/scripts/vpsreboot)
Code:
less /etc/qemu-server/102.conf 
ostype: l26
memory: 1024
sockets: 1
name: server102
bootdisk: scsi0
scsi0: local:102/vm-102-disk-1.raw
onboot: 0
cores: 1
description: IP : 192.168.1.102 
vlan1: rtl8139=F6:71:FD:F0:CA:CC
Code:
less /etc/qemu-server/103.conf
ostype: l26
memory: 1024
sockets: 1
name: server103
bootdisk: scsi0
scsi0: local:103/vm-103-disk-1.raw
onboot: 0
cores: 1
description: IP : 192.168.1.103
serial: /dev/ttyS0
vlan1: rtl8139=E2:B1:0C:E4:D4:09

Am I missing something?

Do I have to downgrade to 2.6.32-2?

Regards,

Tes.
 
Hi,
what kind of NICs do you use?

I know of an issue where the network stops for a single VM under heavy load with the virtio driver (Windows), but that does not affect the host.
Perhaps you can try the e1000 driver for the VMs...

You don't have anything like bonding configured on the switch?!

Udo
 
Hi Udo,

The switch is not manageable. I use a DSL router as the gateway.
I can try with e1000.

I have tried RealTek RTL8139, RTL8168d/8111d and sky2 dual 1000 NICs.

The problem does not just block the PVE host; it also blocks the route to the Internet (I have already changed the DSL router).

Another thing: when the network freezes, I can't log on to the PVE host to see what happens.
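
Since the host cannot be reached during a freeze, one option is to let it record its own state locally. The following is only a minimal sketch: it assumes the gateway address 192.168.1.20 from the interfaces file above, iputils-style `ping` flags, and a hypothetical log path `./netfreeze.log`. It could be run from cron every minute.

```shell
#!/bin/sh
# Sketch: record bridge/NIC state locally when the gateway stops answering.
GATEWAY="${GATEWAY:-192.168.1.20}"   # gateway taken from the interfaces file
LOG="${LOG:-./netfreeze.log}"        # hypothetical log path, change as needed

snapshot() {
    {
        echo "=== gateway $GATEWAY unreachable at $(date) ==="
        # Each tool is optional: a missing command must not abort the snapshot.
        brctl show 2>&1 || true
        ifconfig -a 2>&1 || true
        dmesg 2>&1 | tail -n 30
    } >> "$LOG"
}

check_once() {
    # One ICMP probe with a 2 second timeout (assumes iputils ping).
    if ! ping -c 1 -W 2 "$GATEWAY" >/dev/null 2>&1; then
        snapshot
    fi
}

check_once
```

After the next freeze, the log can be read once the cable has been re-plugged.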

Regards,

Tes.
 
I'm curious about which part of your network freezes.

Can any of the 7 PCs connect to the Internet after the crash?
What do you do to get your network up again after the crash? (Reset the switch/router? Reboot the VM or PVE?)

thanks,
py
 
Hi,

Can any of the 7 PCs connect to the Internet after the crash?
No, at random they cannot.
What do you do to get your network up again after the crash? (Reset the switch/router? Reboot the VM or PVE?)
After unplugging and re-plugging the network cable on the PVE host, the network comes back up.
No router or switch reset is needed.

I am trying another server this morning, with an older PVE version that I already use on the same hardware with the same hosts.
 
Hi,

It's solved.

First, I manually downgraded to kernel 2.6.32-1; that works perfectly.
Then I upgraded to pvetest 2.6.35-1.
All VMs now use the e1000 network card.
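
For anyone applying the same change, here is a minimal sketch of switching a VM's NIC model from rtl8139 to e1000. It assumes the `vlan1: model=MAC` line format shown in the 102.conf listing above and works on a scratch copy (`./102.conf.sample`, a hypothetical path) rather than the live file under /etc/qemu-server.

```shell
#!/bin/sh
# Sketch: switch the NIC model of a VM from rtl8139 to e1000.
# Assumption: the config uses the "vlanN: model=MAC" format shown in this thread.
# CONF defaults to a scratch copy so nothing live is touched.
CONF="${CONF:-./102.conf.sample}"

# Create the sample from the values posted in this thread if it is missing.
if [ ! -f "$CONF" ]; then
    printf '%s\n' 'name: server102' 'vlan1: rtl8139=F6:71:FD:F0:CA:CC' > "$CONF"
fi

# Keep a backup, then rewrite only the model on vlanN lines; the MAC is kept.
cp "$CONF" "$CONF.bak"
sed 's/^\(vlan[0-9]*: \)rtl8139=/\1e1000=/' "$CONF.bak" > "$CONF"

# Show the resulting network line(s).
grep '^vlan' "$CONF"
```

With the sample values this prints `vlan1: e1000=F6:71:FD:F0:CA:CC`. On a real host the VM would need to be stopped and started again for the new model to take effect, and the guest must have an e1000 driver.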

Regards,

Tes.
 
