Hi all,
I have 2 server nodes and a NAS (QNAP TS-879). Each server and the NAS has one Intel Ethernet Server Adapter X520-T2 10GbE card (3 total).
The RAID array on the NAS is 5 x 500GB 7200 RPM drives in a RAID 5 configuration.
I have updated Proxmox to the latest 1.9 version currently available. When I issue an lspci command I see the following output on the servers:
04:00.0 Ethernet controller: Intel Corporation 82599EB 10 Gigabit TN Network Connection (rev 01)
04:00.1 Ethernet controller: Intel Corporation 82599EB 10 Gigabit TN Network Connection (rev 01)
So it is definitely seeing the PCIe 10GbE network card, since both ports of the 82599EB controller show up.
PVE Version is currently at:
pveversion
pve-manager/1.9/6567
uname -a shows me the following:
Linux node1 2.6.32-6-pve #1 SMP Mon Jan 23 08:27:52 CET 2012 x86_64 GNU/Linux
I should also note that I have moved my OpenVZ /var/lib/vz over to my iSCSI volume for storage and moved the built-in storage to /var/lib/vzInt, so all VMs are now running against the iSCSI RAID. This is based on the howto located here: http://pve.proxmox.com/wiki/OpenVZ_on_ISCSI_howto
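For reference, the iSCSI session and mount can be double checked with something like the following (the output will of course differ per setup, this is just how I verify it):
iscsiadm -m session
mount | grep /var/lib/vz
df -h /var/lib/vz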
So as far as I can tell it's all at the latest. After installing ethtool, I checked both network ports on both cards; the speeds are included in the rundown below.
The configuration is currently as follows:
1.) The 2 on-board NICs are connected at 1 Gb/s, but soon I will be changing this to an 802.3ad bond to take advantage of my existing switch, since it has the capability to do so, thanks to Dimitri Alexandris's suggestion in my earlier post to the list (a rough sketch of what I have in mind is below, after the cluster config).
2.) The cross connection between node1 and node2 is on the eth3 port of the 10GbE card, direct connected (I have no 10GbE switch at the moment).
When I issue the following I get:
ethtool eth3 | grep Speed
Speed: 10000Mb/s
So it seems to me the server recognizes the card and it is negotiating at 10000 Mb/s, which is 10x faster than the built-in NICs -- good.
3.) The other port on each server's 10GbE card, eth2 in this case, is direct connected to the NAS: node1 to the NAS's 1st port and node2 to its 2nd port, with the following IP addresses:
node1: 10.0.0.10
NAS Server port 1: 10.0.0.1
node2: 10.0.1.11
NAS Server port 2: 10.0.1.2
There is no VLAN between the two of them; they are just directly connected.
If I issue the following on both, I get:
ethtool eth2 | grep Speed
Speed: 10000Mb/s
OK, great, so each node is also talking to the iSCSI NAS host at 10000 Mb/s on both links. So theoretically I have a back-end private network that is communicating at 10 Gb/s.
4.) The cluster is set up as follows on the servers:
Node1:
auto vmbr0
iface vmbr0 inet static
address 10.0.2.10
netmask 255.255.255.0
network 10.0.2.0
broadcast 10.0.2.255
bridge_ports eth3
bridge_stp off
bridge_fd 0
Node 2:
auto vmbr0
iface vmbr0 inet static
address 10.0.2.11
netmask 255.255.255.0
network 10.0.2.0
broadcast 10.0.2.255
bridge_ports eth3
bridge_stp off
bridge_fd 0
The cluster is built on the private direct connection between the two of them (vmbr0).
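For completeness, here is roughly what I have in mind for the 802.3ad bond on the on-board NICs mentioned in point 1. This is only a sketch: eth0/eth1 and the 172.16.10.10 address are placeholders for my setup, and the switch ports will of course need a matching LACP (802.3ad) group configured:
auto bond0
iface bond0 inet static
address 172.16.10.10
netmask 255.255.255.0
slaves eth0 eth1
bond_mode 802.3ad
bond_miimon 100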
So the Issue:
The problem, however, is that the current version of the ixgbe driver, 3.7.17, performs fairly badly compared to when these servers were running Ubuntu 10.04 LTS Server with ixgbe 3.6.7, which I compiled by hand since that OS didn't have native support for the card.
Copying data from one node to another, say a large Windows KVM image (27GB), is only copying at around 74 MB/s when I run an scp on it to test:
Private network copy (10 Gb/s):
scp -r 112/ root@10.0.2.11:/var/lib/vz/dump/
vm-112-disk-1.raw 100% 27GB 53.2MB/s 08:39
Public network copy (1 Gb/s):
scp -r 112/ root@172.16.10.11:/var/lib/vz/dump
vm-112-disk-1.raw 100% 27GB 53.1MB/s 08:40
So as you can see above, there is not much difference between the copy speeds over 10GbE and over 1GbE. I don't think that's right, but perhaps it's a limitation of tools like rsync, ssh, scp, etc.
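To figure out whether the link itself or scp is the limit, I am planning to test raw TCP throughput with iperf over the private link, roughly like this (iperf would need to be installed on both nodes; the IP below is node2's vmbr0 address from above):
On node2: iperf -s
On node1: iperf -c 10.0.2.11 -t 30 -P 4
If that gets anywhere near line rate, then the ~53 MB/s above is probably scp/ssh being CPU-bound on the cipher rather than the driver.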
When I run a dd test on it, I get the following:
/var/lib/vz/dump# dd if=/dev/zero of=output.img bs=8k count=256k
262144+0 records in
262144+0 records out
2147483648 bytes (2.1 GB) copied, 3.43304 s, 626 MB/s
So that's pretty good, but when I compare this to the Ubuntu server that was previously running on this node, here are the results:
/node2Data# dd if=/dev/zero of=output.img bs=8k count=256k
262144+0 records in
262144+0 records out
2147483648 bytes (2.1 GB) copied, 1.84195 s, 1.2 GB/s
However, running this same command against the local internal device, which is a single 500GB 2.5" SATA 7200 RPM drive, yields much the same result as the dd against the iSCSI volume above:
/var/lib/vzInt# dd if=/dev/zero of=output.img bs=8k count=256k
262144+0 records in
262144+0 records out
2147483648 bytes (2.1 GB) copied, 3.51932 s, 610 MB/s
The Ubuntu server seems to be twice as fast when writing to the same iSCSI export.
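I do realize that dd from /dev/zero without a sync largely measures the page cache, so the numbers above may be flattering; I intend to re-run the test with something like this to force the writes out to the iSCSI volume before dd reports:
dd if=/dev/zero of=/var/lib/vz/dump/output.img bs=8k count=256k conv=fdatasync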
Does anyone have any suggestions on how to make communication between the two nodes faster when copying data between them? I would like to keep downtime to a minimum when migrating, backing up, and maintaining my VMs (OpenVZ and KVM) between the nodes, and, as this grows and I introduce a 10GbE switch, between more nodes.
I don't have jumbo frames enabled on the server ports or on the NAS at this time; I don't think that's the issue, since it wasn't needed before when this was running under Ubuntu 10.04 Server. Is this a driver bug or a parameter issue?
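If jumbo frames do end up being worth a try, my understanding is it would just be an MTU bump on the 10GbE interfaces at both ends (including the vmbr0 bridge on eth3) plus the matching setting on the NAS, along these lines (untested on my side so far):
ifconfig eth2 mtu 9000
ifconfig eth3 mtu 9000
ifconfig vmbr0 mtu 9000
and an "mtu 9000" line in the matching iface stanzas in /etc/network/interfaces to make it stick across reboots.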
Thanks for any help you can give. If you need more information, please let me know and I would be more than happy to provide it.
Sorry for the novel