don't understand ifconfig stats

mouk

Renowned Member
May 3, 2016
Hi,

Another question: a three-node Proxmox cluster with a meshed Ceph network (10.10.89.1/2/3) on eth2 and eth3, using 10Gbit Intel NICs. eth0 is bridged into vmbr0, which carries the external IP/DNS.

ifconfig looks like this; nota bene the RX and TX byte values:

eth0 Link encap:Ethernet HWaddr 68:05:ca:43:3a:8c
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:4942557097 errors:0 dropped:32942 overruns:0 frame:0
TX packets:3722437289 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3953319763652 (3.5 TiB) TX bytes:4635593470910 (4.2 TiB)
Interrupt:29 Memory:c73a0000-c73c0000

eth2 Link encap:Ethernet HWaddr 0c:c4:7a:6e:81:e2
inet addr:10.10.89.2 Bcast:10.10.89.255 Mask:255.255.255.0
inet6 addr: fe80::ec4:7aff:fe6e:81e2/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:74782497 errors:0 dropped:0 overruns:0 frame:0
TX packets:59444042 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:81229954895 (75.6 GiB) TX bytes:56338716325 (52.4 GiB)

eth3 Link encap:Ethernet HWaddr 0c:c4:7a:6e:81:e3
inet addr:10.10.89.2 Bcast:10.10.89.255 Mask:255.255.255.0
inet6 addr: fe80::ec4:7aff:fe6e:81e3/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:54370350 errors:0 dropped:0 overruns:0 frame:0
TX packets:57351493 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:53756934479 (50.0 GiB) TX bytes:58702860158 (54.6 GiB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:141397335 errors:0 dropped:0 overruns:0 frame:0
TX packets:141397335 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:1820422977978 (1.6 TiB) TX bytes:1820422977978 (1.6 TiB)

tap103i0 Link encap:Ethernet HWaddr 02:18:0a:79:7c:91
inet6 addr: fe80::18:aff:fe79:7c91/64 Scope:Link
UP BROADCAST RUNNING PROMISC MULTICAST MTU:1500 Metric:1
RX packets:1750478 errors:0 dropped:0 overruns:0 frame:0
TX packets:5408562 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:11046378551 (10.2 GiB) TX bytes:1426699038 (1.3 GiB)

tap109i0 Link encap:Ethernet HWaddr 0e:21:e8:7b:ba:25
inet6 addr: fe80::c21:e8ff:fe7b:ba25/64 Scope:Link
UP BROADCAST RUNNING PROMISC MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:648 errors:0 dropped:270043 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:55445 (54.1 KiB)

vmbr0 Link encap:Ethernet HWaddr 68:05:ca:43:3a:8c
inet addr:a.b.c.d Bcast:a.b.c.255 Mask:255.255.255.0
inet6 addr: fe80::6a05:caff:fe43:3a8c/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2990815528 errors:0 dropped:0 overruns:0 frame:0
TX packets:817429329 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3761733493779 (3.4 TiB) TX bytes:4418077825360 (4.0 TiB)

Looking at this, it seems that only 50-75 GiB of traffic has gone through the 10.10.89.0/24 (10G Ceph) network, while the rest (3.4-4 TiB!) has gone through vmbr0, my 1G external NIC.
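For reference, the TiB/GiB figures follow directly from the byte counters above (a quick awk sketch; ifconfig uses binary prefixes, so 1 GiB = 2^30 bytes and 1 TiB = 2^40 bytes):

```shell
# Convert the ifconfig byte counters quoted above to human units
awk 'BEGIN { printf "%.1f TiB\n", 3761733493779 / 2^40 }'  # vmbr0 RX -> 3.4 TiB
awk 'BEGIN { printf "%.1f TiB\n", 4418077825360 / 2^40 }'  # vmbr0 TX -> 4.0 TiB
# eth2 RX: 75.65... -> prints 75.7 (ifconfig truncates and shows 75.6)
awk 'BEGIN { printf "%.1f GiB\n", 81229954895 / 2^30 }'
```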

Here is my ceph.conf:
[global]
auth client required = cephx
auth cluster required = cephx
auth service required = cephx
cluster network = 10.10.89.0/24
filestore xattr use omap = true
fsid = 1397f1dc-7d94-43ea-ab12-8f8792eee9c1
keyring = /etc/pve/priv/$cluster.$name.keyring
osd journal size = 5120
osd pool default min size = 1
public network = a.b.c.0/24

[osd]
keyring = /var/lib/ceph/osd/ceph-$id/keyring

[mon.2]
host = pm3
mon addr = 10.10.89.3:6789

[mon.0]
host = pm1
mon addr = 10.10.89.1:6789

[mon.1]
host = pm2
mon addr = 10.10.89.2:6789

As far as I know, the above tells Ceph to use 10.10.89.0/24 for Ceph traffic, and not the public a.b.c.0/24 NIC.

Why is there so much traffic on vmbr0, and so little on eth2/eth3?

The meshed Ceph network was set up following the wiki, and iperf tells me the Ceph network runs at 10G.

Can someone explain these observations? Do I need to post other config files?
 
What does your /etc/network/interfaces look like?
Also, what does
Code:
ip route
say?
 
Hi Dominik,

cat /etc/network/interfaces:
root@pm2:~# cat /etc/network/interfaces
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage part of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!

auto lo
iface lo inet loopback

iface eth0 inet manual

auto vmbr0
iface vmbr0 inet static
address a.b.c.27
netmask 255.255.255.0
gateway a.b.c.1
bridge_ports eth0
bridge_stp off
bridge_fd 0

auto eth2
iface eth2 inet static
address 10.10.89.2
netmask 255.255.255.0
up route add -net 10.10.89.1 netmask 255.255.255.255 dev eth2
down route del -net 10.10.89.1 netmask 255.255.255.255 dev eth2

auto eth3
iface eth3 inet static
address 10.10.89.2
netmask 255.255.255.0
up route add -net 10.10.89.3 netmask 255.255.255.255 dev eth3
down route del -net 10.10.89.3 netmask 255.255.255.255 dev eth3
And route -n:
root@pm2:~# route -n
Kernel IP routing table
Destination     Gateway     Genmask          Flags  Metric  Ref  Use  Iface
0.0.0.0         a.b.c.1     0.0.0.0          UG     0       0    0    vmbr0
10.10.89.0      0.0.0.0     255.255.255.0    U      0       0    0    eth2
10.10.89.0      0.0.0.0     255.255.255.0    U      0       0    0    eth3
10.10.89.1      0.0.0.0     255.255.255.255  UH     0       0    0    eth2
10.10.89.3      0.0.0.0     255.255.255.255  UH     0       0    0    eth3
a.b.c.0         0.0.0.0     255.255.255.0    U      0       0    0    vmbr0

So I guess that's all correct..?
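A side note on the two overlapping /24 entries: as I understand it, the kernel picks the longest matching prefix, so the /32 host routes pin each peer to its own NIC. This can be checked live with `ip route get`; the `in_net` helper below is just my own illustration of the longest-prefix rule, not a real tool:

```shell
# Live check on pm2 (read-only):
#   ip route get 10.10.89.1   # should say 'dev eth2' (the /32 wins)
#   ip route get 10.10.89.3   # should say 'dev eth3'
# Self-contained illustration of longest-prefix matching:
in_net() {  # in_net <dst> <net> <prefixlen>: succeeds if dst is inside net/prefixlen
  awk -v d="$1" -v n="$2" -v p="$3" 'BEGIN {
    split(d, a, "."); split(n, b, ".")
    di = a[1]*2^24 + a[2]*2^16 + a[3]*2^8 + a[4]   # dotted quad -> 32-bit int
    ni = b[1]*2^24 + b[2]*2^16 + b[3]*2^8 + b[4]
    s = 2^(32 - p)                                 # number of host addresses to ignore
    exit !(int(di / s) == int(ni / s))
  }'
}
in_net 10.10.89.1 10.10.89.0 24 && echo "the /24 matches 10.10.89.1"
in_net 10.10.89.1 10.10.89.1 32 && echo "the /32 via eth2 also matches, is more specific, and wins"
```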

iperf between two machines, targeting the IP ADDRESS, gives:
Client connecting to 10.10.89.2, TCP port 5001
TCP window size: 1.21 MByte (default)
------------------------------------------------------------
[ 3] local 10.10.89.1 port 34438 connected with 10.10.89.2 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 11.0 GBytes 9.41 Gbits/sec
[ 5] 0.0-10.0 sec 11.0 GBytes 9.41 Gbits/sec
So that's 10G speed when connecting by IP address.

And iperf targeting the HOSTNAME 'pm2' gives:
Client connecting to pm2, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 3] local a.b.c.26 port 35010 connected with a.b.c.27 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 1.00 GBytes 860 Mbits/sec
So that's only 1G speed.
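Converting iperf's units by hand (it reports transfer in GBytes with a binary prefix, but bandwidth in decimal bits/sec) confirms the two runs really differ by roughly a factor of ten:

```shell
# transfer * 8 bits / interval, with GBytes = 2^30 bytes
# IP run: ~9.45 (iperf shows 9.41 because the 11.0 GBytes is rounded)
awk 'BEGIN { printf "%.2f Gbit/s\n", 11.0 * 2^30 * 8 / 10 / 1e9 }'
# hostname run: ~859 Mbit/s, i.e. a saturated 1G link
awk 'BEGIN { printf "%.0f Mbit/s\n", 1.00 * 2^30 * 8 / 10 / 1e6 }'
```

The hostname run connects a.b.c.26 to a.b.c.27, so 'pm2' evidently resolves to the public 1G address; `getent hosts pm2` should confirm that.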

Ceph.conf is unchanged from my first post above.

Could it perhaps be that Ceph replication is also talking to the hostnames pm1/pm2/pm3 (which resolve to the 1G addresses) and NOT to the 10G IP addresses?
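One way I could check this, I suppose: `ceph osd dump` lists each OSD's public and cluster addresses, so if the cluster columns show 10.10.89.x, replication is on the mesh and the vmbr0 traffic must be something else. A sketch (the sample line below is illustrative, NOT real output from this cluster):

```shell
# Live checks (read-only):
#   ceph osd dump | grep '^osd\.'    # per-OSD public and cluster addresses
#   ss -tnp | grep ceph-osd          # sockets the OSD daemons actually hold
# Parsing a sample dump line for cluster-network addresses
# (illustrative line, not from this cluster):
line='osd.0 up in weight 1 ... 10.10.89.1:6800/2345 10.10.89.1:6801/2345 exists,up'
echo "$line" | grep -o '10\.10\.89\.[0-9]*' | sort -u    # -> 10.10.89.1
```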