Hi. I am trying to fix some I/O delay issues I seem to be having when running massive MongoDB updates on one of my VMs. It throws the entire system out of whack, and I'm trying to determine whether it is Proxmox or Openfiler (my SAN software) causing these issues.
My IT helped me run some tests, and we found out that the RAID card in my SAN wasn't the best and didn't allow write-back caching (or write-through, I can't remember which). Anyway, we replaced the RAID card and then did a few optimizations on the OS itself, such as switching to the deadline I/O scheduler and similar tweaks. We then ended up with this report:
http://serverbear.com/benchmark/2013/07/13/oQzdyEqjNWSU0gUL
You can see the I/O tab has a dd test and an FIO test. The dd speed is good (though it could still be a bit faster), but the FIO results are garbage, and that is the actual speed my VMs seem to get when writing files to the SAN. When I update MongoDB with gigabytes of data, it practically brings the entire SAN down: the I/O delay lags all the VMs, even the ones on other hypervisors. The I/O delay spikes past the CPU %, and the I/O graph on the VM itself shoots up to about 30 MB/s, but it doesn't actually seem to be getting that throughput.
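For anyone who wants to reproduce the FIO side of that report, the test is a random-write style job; something along these lines should show the same kind of numbers (the parameters are illustrative, not the exact job ServerBear runs):

# 4k random writes with direct I/O against a test file on the SAN-backed disk
fio --name=randwrite --ioengine=libaio --rw=randwrite --bs=4k \
    --direct=1 --size=1G --iodepth=32 --runtime=60 --time_based --group_reporting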
I'm not really sure where to look at this point, so I am asking Proxmox and Openfiler people alike for their opinions and suggestions.
Thanks for any help.
EDIT: Adding a bit more useful information:
SAN HW configuration:
Xeon E3-1230 v2 processor (4 cores / 8 threads w/ HT)
8 GB of RAM
60 GB SSD local OS drive
2 x 300 GB 15k SAS HDDs in RAID 1 (iSCSI targets)
2 x 1 TB in RAID 1 (NFS, used for file storage only)
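Regarding the deadline scheduler tweaks mentioned above, this is the sort of thing I mean (sdb is just an example device name; the actual Openfiler devices may differ):

# show the current I/O scheduler for a data disk (the active one is in [brackets])
cat /sys/block/sdb/queue/scheduler
# switch it to deadline (not persistent across reboots unless set in the boot config)
echo deadline > /sys/block/sdb/queue/scheduler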
Proxmox Nodes:
Xeon E3-1230 v2 processor (4 cores / 8 threads w/ HT)
8 GB of RAM on one node, 16 GB on the other
60 GB SSD local OS drives
1 Gigabit private network connection between the Proxmox nodes and the SAN
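Since all the storage traffic shares that single gigabit link (roughly 110-120 MB/s at best), a quick raw-throughput test between a node and the SAN would at least rule the network itself in or out. Assuming iperf is installed on both ends (the hostname below is an example), something like:

# on the Openfiler box:
iperf -s
# on a Proxmox node (replace the hostname with the SAN's address):
iperf -c openfiler-san -t 30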
Output of pveversion -v on proxmox1:
root@proxmox1:~# pveversion -v
pve-manager: 3.0-23 (pve-manager/3.0/957f0862)
running kernel: 2.6.32-20-pve
proxmox-ve-2.6.32: 3.0-100
pve-kernel-2.6.32-20-pve: 2.6.32-100
pve-kernel-2.6.32-16-pve: 2.6.32-82
pve-kernel-2.6.32-18-pve: 2.6.32-88
lvm2: 2.02.95-pve3
clvm: 2.02.95-pve3
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-1
pve-cluster: 3.0-4
qemu-server: 3.0-20
pve-firmware: 1.0-22
libpve-common-perl: 3.0-4
libpve-access-control: 3.0-4
libpve-storage-perl: 3.0-8
vncterm: 1.1-4
vzctl: 4.0-1pve3
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-13
ksm-control-daemon: 1.1-1
Output of pveversion -v on proxmox2:
root@proxmox2:~# pveversion -v
pve-manager: 3.0-23 (pve-manager/3.0/957f0862)
running kernel: 2.6.32-20-pve
proxmox-ve-2.6.32: 3.0-100
pve-kernel-2.6.32-20-pve: 2.6.32-100
pve-kernel-2.6.32-16-pve: 2.6.32-82
pve-kernel-2.6.32-18-pve: 2.6.32-88
lvm2: 2.02.95-pve3
clvm: 2.02.95-pve3
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-1
pve-cluster: 3.0-4
qemu-server: 3.0-20
pve-firmware: 1.0-22
libpve-common-perl: 3.0-4
libpve-access-control: 3.0-4
libpve-storage-perl: 3.0-8
vncterm: 1.1-4
vzctl: 4.0-1pve3
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-13
ksm-control-daemon: 1.1-1
Output of pveperf on proxmox1 (VMs still running):
root@proxmox1:~# pveperf
CPU BOGOMIPS: 51198.08
REGEX/SECOND: 1436979
HD SIZE: 13.53 GB (/dev/mapper/pve-root)
BUFFERED READS: 394.34 MB/sec
AVERAGE SEEK TIME: 0.15 ms
FSYNCS/SECOND: 4399.14
DNS EXT: 59.80 ms
DNS INT: 317.00 ms (local)
Output of pveperf on proxmox2 (VMs still running):
root@proxmox2:~# pveperf
CPU BOGOMIPS: 52802.56
REGEX/SECOND: 1477414
HD SIZE: 13.53 GB (/dev/mapper/pve-root)
BUFFERED READS: 352.91 MB/sec
AVERAGE SEEK TIME: 0.13 ms
FSYNCS/SECOND: 4134.46
DNS EXT: 63.33 ms
DNS INT: 255.02 ms (local)
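Note that the pveperf numbers above are against the local SSD root filesystems, not the SAN. pveperf takes an optional path, so if it helps I can also run it against the NFS-backed storage mount (the storage name below is just a placeholder for wherever Proxmox mounts it):

# 'san-nfs' is a placeholder storage ID; NFS storages normally mount under /mnt/pve/<storage-id>
pveperf /mnt/pve/san-nfs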