High average load (idle) PVE 4.1 With Brocade 1020

sander93

Well-Known Member
Hello,

First, sorry for my bad English.

I have 3 servers (HP DL160 G6) with all the same specs, revision and firmware versions (latest).

2 of the 3 servers have no problems after the upgrade to Proxmox 4.1 and are running just fine.

The 3rd server has a problem: the average load never drops below 2.00 (while idle) and only goes higher when I am doing something (like starting a VM).

I noticed that when I remove my Brocade 1020 10Gb network adapter, the load is normal and almost touches zero. The card works great in all other systems, and I even tried a card from one of the other servers, which gives the same problem in this server. I also tested it in another PCI slot, but it does exactly the same.

I also cannot find any errors in any of the logs.

See also the attachment, a screenshot of the problem in version 4.1.
No VMs are running at the moment.

When I wipe the disk and install the older Proxmox 3.4, it runs just fine.

Who can help me?

Thank you!

Kind regards,

Sander
 

Attachments

  • Schermafbeelding 2016-02-13 om 14.12.01.png
From a console what does the command top show?

To list the top 10 CPU-using processes, run:
ps -eo pcpu,pid,user,args | sort -k1 -r | head -10
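
Note that the Linux load average counts not only running processes but also tasks in uninterruptible sleep (state D), which use no CPU at all, so a top-by-CPU listing can miss the cause. As a small sketch, the blocked tasks can be listed like this:
Code:
# list tasks currently in uninterruptible sleep (state D);
# these raise the load average without using any CPU time
ps -eo state,pid,comm | awk '$1 == "D"'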
 
It says almost nothing, or am I wrong?

# ps -eo pcpu,pid,user,args | sort -k1 -r | head -10
%CPU PID USER COMMAND
3.0 1386 root -bash
1.3 1 root /sbin/init
0.6 93 root [migration/12]
0.6 86 root [migration/11]
0.6 79 root [migration/10]
0.6 72 root [migration/9]
0.6 65 root [migration/8]
0.6 58 root [migration/7]
0.6 51 root [migration/6]
 
Is there anybody who can help me with this problem? It would be great!

I'm really out of options..
 
Hi,
does
Code:
ethtool -k ethX
show different output on your nodes?

Do you always use the same switch port/cable for the "load-2" server?
Do you have collisions/dropped packets on the interface?
Are the MTU settings right?
Do you use network connections for disk I/O, like NFS or iSCSI?
If yes, what does the output of mount look like (and what does it look like on another server)?
Is the load gone if you umount/disconnect the network storage?

Enough options? ;)

Udo
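
As a sketch of the ethtool comparison suggested above (the interface name eth2 and the hostnames are placeholders, adjust to your setup):
Code:
# on each node, dump the offload settings of the Brocade port
ethtool -k eth2 > /tmp/offload-$(hostname).txt
# copy the dumps to one node and compare them
diff /tmp/offload-node001.txt /tmp/offload-node002.txt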
 
Hello,

Thank you for the response!

Yes it does, but that's because I have the server with the problem at the office and the other servers in the DC; those are connected to a 10Gb optical switch, while the problem server is connected to nothing..

No, it makes no difference whether the server is at the office or in the datacenter, so a different cable, or even no cable at all, gives the same result.
No interface is even attached at the moment, but in the DC, where it was connected to a switch, it had the same problem.
MTU is default, but again, it isn't even connected.
Normally I do use NFS, but since it's here at the office I don't, and that makes no difference.
Output of mount is below, but there is no NFS or iSCSI connected.
None.

Thank you for the options, I hope you can help me with this information.

Thank you

Kind regards,

Sander



Output of mount:
root@node001:~# mount

sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)

proc on /proc type proc (rw,relatime)

udev on /dev type devtmpfs (rw,relatime,size=10240k,nr_inodes=12371281,mode=755)

devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)

tmpfs on /run type tmpfs (rw,nosuid,relatime,size=19799276k,mode=755)

/dev/mapper/pve-root on / type ext4 (rw,relatime,errors=remount-ro,data=ordered)

securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)

tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)

tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)

tmpfs on /sys/fs/cgroup type tmpfs (rw,mode=755)

cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd)

pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)

cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset,clone_children)

cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)

cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)

cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)

cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)

cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)

cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)

cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event,release_agent=/run/cgmanager/agents/cgm-release-agent.perf_event)

cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb,release_agent=/run/cgmanager/agents/cgm-release-agent.hugetlb)

systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=23,pgrp=1,timeout=300,minproto=5,maxproto=5,direct)

hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime)

debugfs on /sys/kernel/debug type debugfs (rw,relatime)

mqueue on /dev/mqueue type mqueue (rw,relatime)

fusectl on /sys/fs/fuse/connections type fusectl (rw,relatime)

/dev/sda2 on /boot type ext4 (rw,relatime,data=ordered)

/dev/mapper/pve-data on /var/lib/vz type ext4 (rw,relatime,data=ordered)

rpc_pipefs on /run/rpc_pipefs type rpc_pipefs (rw,relatime)

tmpfs on /run/lxcfs/controllers type tmpfs (rw,relatime,size=100k,mode=700)

name=systemd on /run/lxcfs/controllers/name=systemd type cgroup (rw,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd)

cpuset on /run/lxcfs/controllers/cpuset type cgroup (rw,relatime,cpuset,clone_children)

cpu,cpuacct on /run/lxcfs/controllers/cpu,cpuacct type cgroup (rw,relatime,cpu,cpuacct)

blkio on /run/lxcfs/controllers/blkio type cgroup (rw,relatime,blkio)

memory on /run/lxcfs/controllers/memory type cgroup (rw,relatime,memory)

devices on /run/lxcfs/controllers/devices type cgroup (rw,relatime,devices)

freezer on /run/lxcfs/controllers/freezer type cgroup (rw,relatime,freezer)

net_cls,net_prio on /run/lxcfs/controllers/net_cls,net_prio type cgroup (rw,relatime,net_cls,net_prio)

perf_event on /run/lxcfs/controllers/perf_event type cgroup (rw,relatime,perf_event,release_agent=/run/cgmanager/agents/cgm-release-agent.perf_event)

hugetlb on /run/lxcfs/controllers/hugetlb type cgroup (rw,relatime,hugetlb,release_agent=/run/cgmanager/agents/cgm-release-agent.hugetlb)

lxcfs on /var/lib/lxcfs type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)

cgmfs on /run/cgmanager/fs type tmpfs (rw,relatime,size=100k,mode=755)

tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,size=9899640k,mode=700)

/dev/fuse on /etc/pve type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)
 
Hi,
you mean you have a high load due to an unused NIC???

And without this unused NIC the load is low? So it should be the same if you unload the driver (rmmod)?


Udo
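
A minimal sketch of that test, assuming the Brocade 1020 ports use the bna driver (as the later posts confirm):
Code:
# unload the Brocade 10Gb network driver and watch whether the load drops
rmmod bna
watch -n 5 uptime
# load it again afterwards
modprobe bna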
 
Hello,

Yes, correct: if the NIC is in the server the load is high in both situations, unused and used.

If I take the NIC out of the server the load is low, and if I unload the driver the load is low.
Looks like a driver bug or something?

Is there a difference in the driver for the Brocade 1020 between Proxmox 3.4 and Proxmox 4.1?

Sander
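
One way to compare the driver between the two installs (eth2 is a placeholder for the Brocade interface name) could be:
Code:
# driver version shipped with the running kernel
modinfo bna | grep -i version
# driver and firmware version reported by the running interface
ethtool -i eth2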
 
Does anybody have any ideas I can test?

I ordered a new Brocade NIC, so the card itself is not the problem; I also have another server (Supermicro) and it has the same problem.
I have no problem with the (same) card in an HP workstation with the same clean Proxmox install.
 
Any BIOS updates to your Supermicro boards?

I already installed the latest version.. unfortunately no difference.
After a few minutes of running at a load of 2.00 I also get these errors.

[720.901810] INFO task bfaf_worker:424 blocked for more than 120 seconds.
[720.901972] Tainted: P O 4.2.6-1-pve #1
[720.901931] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disable this message.
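
A hedged way to see where such a blocked task is stuck (assuming the magic SysRq interface is available) is to ask the kernel to dump the stacks of all blocked tasks and read them from the kernel log:
Code:
# enable the magic SysRq interface if it is not already on
echo 1 > /proc/sys/kernel/sysrq
# dump stack traces of all blocked (D-state) tasks
echo w > /proc/sysrq-trigger
# read the traces from the kernel ring buffer
dmesg | tail -n 100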
 
When I start a virtual machine the load climbs higher and higher, I also get a high I/O load,
and the VM is very slow.
 
Latest information I have found.

As soon as I unload the BNA module the server load becomes normal; of course the 2 network adapters of this NIC are then gone.
Does anybody have this working in Proxmox 4.x?

Thank you.
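
If the driver really is the culprit, one possible stopgap (it obviously leaves the Brocade ports unusable, so only a sketch of a workaround) is to keep the bna module from loading at boot:
Code:
# prevent the bna module from being loaded automatically
echo "blacklist bna" > /etc/modprobe.d/blacklist-bna.conf
# rebuild the initramfs so the blacklist also applies during early boot
update-initramfs -u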
 
Does anybody know if I can 'easily' downgrade a 4.1 cluster to Proxmox 3.4?
Can VMs from Proxmox 4.1 start in Proxmox 3.4?
 
It is not possible to downgrade, since PVE 3.4 and PVE 4.x are based on different Debian distributions.

OK, clear. And if I reinstall one of the nodes with PVE 3.4 (I use shared storage), can I start a PVE 4.x KVM on it?
 
Is there anybody who knows a possible solution for my problem with the Brocade 1020 and Proxmox 4.1?
2 of my 4 servers keep having problems with these NICs; the 2 other servers are working fine.

Or does anybody know if paid support can help me?
 