We have a dual-socket Intel Wildcat Pass server with Intel Xeon E5-2640 processors. With Hyper-Threading enabled the system reports a total of 40 logical cores.
Reviewing individual core utilisation showed CPU 0 running at 100%, which was causing packet loss. We've subsequently used the taskset utility to keep the KVM and Ceph processes off core 0 and its Hyper-Threaded sibling:
Code:
/etc/rc.local:
# Keep tasks off core 0 and its Hyper-Threaded sibling:
cpus='1-19,21-39';
for pid in `pidof kvm`; do
    taskset -a -cp $cpus $pid &> /dev/null;
    for vhostpid in `pidof vhost-$pid`; do
        taskset -a -cp $cpus $vhostpid &> /dev/null;
    done
done
for pid in `pidof ceph-fuse ceph-mon ceph-osd`; do
    taskset -a -cp $cpus $pid &> /dev/null;
done
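To confirm the pinning actually applied, taskset can also be used read-only; a quick check against the same process list as above:
Code:
# Print the current CPU affinity list of each kvm process:
for pid in `pidof kvm`; do
    taskset -cp $pid
done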
Packet loss is now gone, but we've effectively dedicated one of our twenty physical cores (core 0 and its sibling) to handling interrupts from the network card. The resulting 'top' output now:
Code:
top - 10:45:42 up 49 days, 13:38, 1 user, load average: 25.56, 24.59, 24.25
Tasks: 744 total, 9 running, 364 sleeping, 0 stopped, 0 zombie
%Cpu0 : 0.6 us, 6.7 sy, 0.0 ni, 54.3 id, 0.0 wa, 0.0 hi, 38.4 si, 0.0 st
%Cpu1 : 39.4 us, 24.9 sy, 0.0 ni, 31.3 id, 2.7 wa, 0.0 hi, 1.7 si, 0.0 st
%Cpu2 : 43.1 us, 23.7 sy, 0.0 ni, 32.5 id, 0.0 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu3 : 51.0 us, 19.8 sy, 0.0 ni, 28.9 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu4 : 53.6 us, 19.0 sy, 0.0 ni, 26.1 id, 0.0 wa, 0.0 hi, 1.4 si, 0.0 st
%Cpu5 : 41.2 us, 24.9 sy, 0.0 ni, 33.2 id, 0.3 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu6 : 40.9 us, 25.2 sy, 0.0 ni, 31.5 id, 1.7 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu7 : 47.1 us, 23.2 sy, 0.0 ni, 28.3 id, 0.3 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu8 : 45.6 us, 22.4 sy, 0.0 ni, 27.6 id, 4.1 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu9 : 46.1 us, 25.4 sy, 0.0 ni, 27.8 id, 0.0 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu10 : 23.9 us, 18.4 sy, 0.0 ni, 45.1 id, 11.6 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu11 : 21.2 us, 18.4 sy, 0.0 ni, 57.6 id, 2.4 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu12 : 20.1 us, 18.0 sy, 0.0 ni, 60.2 id, 1.0 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu13 : 20.4 us, 18.0 sy, 0.0 ni, 52.9 id, 7.3 wa, 0.0 hi, 1.4 si, 0.0 st
%Cpu14 : 20.4 us, 16.9 sy, 0.0 ni, 57.7 id, 3.9 wa, 0.0 hi, 1.1 si, 0.0 st
%Cpu15 : 22.1 us, 18.0 sy, 0.0 ni, 54.0 id, 5.5 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu16 : 20.7 us, 16.7 sy, 0.0 ni, 61.6 id, 0.4 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu17 : 22.8 us, 18.6 sy, 0.0 ni, 57.2 id, 1.0 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu18 : 23.3 us, 17.8 sy, 0.0 ni, 57.8 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu19 : 20.3 us, 19.9 sy, 0.0 ni, 54.2 id, 4.9 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu20 : 0.3 us, 9.2 sy, 0.0 ni, 87.8 id, 2.6 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu21 : 34.0 us, 27.2 sy, 0.0 ni, 37.4 id, 0.3 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu22 : 60.3 us, 16.8 sy, 0.0 ni, 22.6 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu23 : 49.5 us, 20.9 sy, 0.0 ni, 29.3 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu24 : 44.1 us, 23.7 sy, 0.0 ni, 30.9 id, 0.0 wa, 0.0 hi, 1.3 si, 0.0 st
%Cpu25 : 38.6 us, 23.9 sy, 0.0 ni, 36.8 id, 0.0 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu26 : 48.0 us, 19.1 sy, 0.0 ni, 31.2 id, 0.3 wa, 0.0 hi, 1.3 si, 0.0 st
%Cpu27 : 41.9 us, 23.7 sy, 0.0 ni, 33.7 id, 0.0 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu28 : 47.5 us, 23.1 sy, 0.0 ni, 28.8 id, 0.3 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu29 : 47.1 us, 22.9 sy, 0.0 ni, 29.0 id, 0.0 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu30 : 21.8 us, 20.0 sy, 0.0 ni, 42.8 id, 14.7 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu31 : 17.7 us, 19.8 sy, 0.0 ni, 61.8 id, 0.4 wa, 0.0 hi, 0.4 si, 0.0 st
%Cpu32 : 22.8 us, 17.6 sy, 0.0 ni, 44.8 id, 14.1 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu33 : 23.2 us, 16.5 sy, 0.0 ni, 58.2 id, 1.8 wa, 0.0 hi, 0.4 si, 0.0 st
%Cpu34 : 21.1 us, 17.5 sy, 0.0 ni, 58.2 id, 1.8 wa, 0.0 hi, 1.4 si, 0.0 st
%Cpu35 : 19.1 us, 18.7 sy, 0.0 ni, 60.4 id, 1.4 wa, 0.0 hi, 0.4 si, 0.0 st
%Cpu36 : 22.5 us, 17.6 sy, 0.0 ni, 58.8 id, 0.0 wa, 0.0 hi, 1.1 si, 0.0 st
%Cpu37 : 19.8 us, 19.1 sy, 0.0 ni, 55.5 id, 4.9 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu38 : 18.7 us, 20.8 sy, 0.0 ni, 56.7 id, 3.5 wa, 0.0 hi, 0.4 si, 0.0 st
%Cpu39 : 18.6 us, 17.9 sy, 0.0 ni, 28.7 id, 33.3 wa, 0.0 hi, 1.4 si, 0.0 st
KiB Mem : 52828739+total, 40350712 free, 21839540+used, 26954128+buff/cache
KiB Swap: 26830438+total, 26819404+free, 110336 used. 31489411+avail Mem
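As an aside, mpstat from the sysstat package (assuming it's installed) gives a more compact rolling view of the per-core softirq split than top:
Code:
# Refresh per-core utilisation (including %soft) every 2 seconds:
mpstat -P ALL 2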
Has anyone got experience with distributing IRQs over multiple cores? The system appears to automatically create Tx/Rx queue pairs equal to the number of cores and associate each pair with a given core. Most interrupts (by roughly an order of magnitude, 10:1) occur on core 0 though:
Code:
[admin@kvm5c ~]# grep -e CPU -e eth0 /proc/interrupts
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9 CPU10 CPU11 CPU12 CPU13 CPU14 CPU15 CPU16 CPU17 CPU18 CPU19 CPU20 CPU21 CPU22 CPU23 CPU24 CPU25 CPU26 CPU27 CPU28 CPU29 CPU30 CPU31
152: 1620602535 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI 3670016-edge eth0-TxRx-0
153: 0 168146477 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI 3670017-edge eth0-TxRx-1
154: 0 0 154649369 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI 3670018-edge eth0-TxRx-2
155: 0 0 0 140435314 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI 3670019-edge eth0-TxRx-3
156: 0 0 0 0 133984352 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI 3670020-edge eth0-TxRx-4
157: 0 0 0 0 0 129604155 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI 3670021-edge eth0-TxRx-5
158: 0 0 0 0 0 0 126145092 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI 3670022-edge eth0-TxRx-6
159: 0 0 0 0 0 0 0 123962115 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 PCI-MSI 3670023-edge eth0-TxRx-7
160: 0 0 0 0 0 0 0 0 40980503 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 PCI-MSI 3670024-edge eth0-TxRx-8
161: 0 0 0 0 0 0 0 0 0 42662918 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 PCI-MSI 3670025-edge eth0-TxRx-9
162: 1 0 0 0 0 0 0 0 0 0 41846450 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI 3670026-edge eth0-TxRx-10
163: 0 1 0 0 0 0 0 0 0 0 0 45015288 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI 3670027-edge eth0-TxRx-11
164: 0 0 1 0 0 0 0 0 0 0 0 0 45015780 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI 3670028-edge eth0-TxRx-12
165: 0 0 0 1 0 0 0 0 0 0 0 0 0 41972828 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI 3670029-edge eth0-TxRx-13
166: 0 0 0 0 1 0 0 0 0 0 0 0 0 0 43592026 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI 3670030-edge eth0-TxRx-14
167: 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 41824424 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI 3670031-edge eth0-TxRx-15
168: 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 85380315 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI 3670032-edge eth0-TxRx-16
169: 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 116711472 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI 3670033-edge eth0-TxRx-17
170: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 115485583 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI 3670034-edge eth0-TxRx-18
<snip>
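One approach we're considering (a rough sketch only, untested on this box; it assumes the irqbalance service is stopped, since irqbalance would otherwise rewrite the affinity masks) is to walk the eth0-TxRx interrupt numbers from /proc/interrupts and pin each queue onto a different core via /proc/irq/<n>/smp_affinity_list, skipping core 0:
Code:
# Sketch: pin each eth0 TxRx queue IRQ round-robin onto cores 1-19,
# leaving core 0 free (assumes irqbalance is stopped, or it will undo this).
cpu=1
for irq in $(awk -F: '/eth0-TxRx/ {print $1}' /proc/interrupts); do
    echo $cpu > /proc/irq/$irq/smp_affinity_list
    cpu=$(( cpu % 19 + 1 ))   # cycle through cores 1..19
done
Alternatively, Receive Packet Steering (RPS) can spread receive processing in software by writing a CPU mask to /sys/class/net/eth0/queues/rx-*/rps_cpus, which may help if the card keeps hashing most flows onto queue 0 anyway.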