Performance problem

lcfabre

New Member
Hello,

I have a Proxmox 2.0 running inside an OpenVZ VE here.
The host is CentOS 5.
It worked well for months.
A few days ago it was brought down by an internal machine that mistakenly posted thousands of mails through port 26.

We stopped the faulty machine.
We flushed the queue via the web interface.
We shut down and restarted the VE.

What we have now is a responsive SMTP for the first minutes after boot, and then a very slow SMTP.
You have to wait exactly 30 seconds for the greeting (it looks like a timeout).
After that, the SMTP dialog is fast.
During this time the CPU is 99% idle and the web interface is responsive.

DNS and reverse DNS are fine and fast.

Any ideas?


Uptime: 18:01:36 up 00:19, load average: 0.07, 0.20, 0.17
CPU(s): 4 x Intel(R) Xeon(TM) CPU 3.06GHz
Physical memory: (3496MB/495MB)
HD space: (7812MB/1557MB)
Version (package/version/build): proxmox-mailgateway/2.0/2123
Kernel version: Linux 2.6.18-ovz028stab045.1-enterprise #1 SMP Sun Sep 30 19:35:36 MSD 2007
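
The wait for the SMTP greeting described above can be put into numbers. This is a minimal sketch, assuming a POSIX shell; the function name first_line_delay and the host mail.example.com are made up for illustration, and the /dev/tcp redirection in the usage line works in bash only.

```shell
# Print how many whole seconds pass before the first line (the SMTP
# greeting, e.g. "220 ...") arrives on stdin, followed by that line.
# Hypothetical helper, not part of Proxmox or Postfix.
first_line_delay() {
    start=$(date +%s)
    IFS= read -r line                           # blocks until a full line arrives
    end=$(date +%s)
    line=$(printf '%s' "$line" | tr -d '\r')    # SMTP lines end in CRLF
    printf '%s %s\n' "$((end - start))" "$line"
}

# Usage (bash only; the host name is a placeholder):
#   first_line_delay < /dev/tcp/mail.example.com/25
```

Because bash opens the connection before the function starts, the number reflects the wait for the greeting rather than the TCP connect time. There is no timeout here, so a server that never answers will block the call.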
 
Hi,

Please always use the latest version of Proxmox. You can upload Service Pack 2.1 via the web interface. Each release fixes a lot of small bugs and also adds features.

See servicepack-2.1.

OpenVZ on your CentOS: there is also a newer kernel available (see http://openvz.org).

For deeper analysis we need the syslog or an SSH login to the machine; access to the CentOS host would be best.
 
Hello Tom,

I'm aware of 2.1 and the new kernel.
It's only that I would like to test them before putting them into production.
I will move to 2.1 in a full HA configuration.

What I've done for the moment is replace the VE with a copy I made some time ago and restore the Proxmox backup; it should hold like this for the next few hours.

I really want to know why the production VE gets killed.
After further investigation, the problem looks like this:
- restart of smtpd
- no delay on connection
- then more and more delay
- then SMTP clients begin to time out
The whole cycle lasts between 30 minutes and 1 hour (it probably depends on the load).

I can grant access to the physical host on Wednesday afternoon.
Please send your IP by email and I will mail back a login or provide an RSA key.
If it is of interest for debugging, I can also provide the VE.

TIA
 
Hi,

yes, just send the login credentials to support@proxmox.com and open SSH for our IP 213.129.239.114
 
incident close

Hello,

this post is to let the forum know that support handled this case with amazing efficiency.

The point is that we have a rather heavy load here and that Proxmox is running inside an OpenVZ VE (see http://www.openvz.org).

Postfix had to be fine-tuned to cope with the load.
The VE had to be fine-tuned to accommodate the more demanding Proxmox.
(This post may be of interest to the OpenVZ forum.)

All these tunings are standard, but they are hidden among a mass of possible ones.
Support pinpointed them in minutes after a bit of trial and error, and the whole thing is now tuned to accept the load.

The rule-of-thumb numbers are:
- Standard Proxmox VE settings
- The physical host is a dual Xeon 3 GHz with 10K rpm RAID 1 SCSI disks
(proxperf gives
CPU BOGOMIPS: 24428.30
REGEX/SECOND: 441521
FSYNCS/SECOND: 778.60
which is a bit slow on disk)
- You can handle around 18K mails/day
(with 6 s/mail in 2.0 and 14 s/mail in 2.1)
Add 2K mails a day and the VE starts to throttle and refuse mail.
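
As a rough cross-check of these figures, the daily volume can be turned into the average number of mails in flight at once. This assumes the per-mail times are average processing times, which the thread does not state explicitly; avg_in_flight is a hypothetical helper.

```shell
# Average number of mails being processed simultaneously:
#   mails_per_day * seconds_per_mail / 86400 (seconds in a day)
avg_in_flight() {
    awk -v n="$1" -v s="$2" 'BEGIN { printf "%.2f\n", n * s / 86400 }'
}

avg_in_flight 18000 6     # 2.0 figures                 -> 1.25
avg_in_flight 18000 14    # 2.1 figures                 -> 2.92
avg_in_flight 20000 14    # 2.1 figures plus 2K mails   -> 3.24
```

These are daily averages only. Real mail traffic is bursty, so peak concurrency will sit well above them.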

After tuning we tested at double the load with no problem.
Postfix now has 250 smtpd workers.
You can count the active workers with:
ps auxww | grep '[s]mtpd' | wc -l
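
The count above matches any ps line that contains smtpd. A slightly stricter sketch matches the process name exactly and compares it with a limit. Both function names are hypothetical, and the default of 250 is simply the worker count quoted in this post.

```shell
# Count processes whose command name is exactly "smtpd".
# Expects one command name per line on stdin, as printed by `ps -eo comm=`.
count_smtpd() {
    awk '$1 == "smtpd" { n++ } END { print n + 0 }'
}

# Report the count against a limit (default 250, the figure from this post).
smtpd_usage() {
    limit=${1:-250}
    used=$(count_smtpd)
    echo "smtpd workers: $used of $limit"
}

# Usage:
#   ps -eo comm= | smtpd_usage 250
```

Running it every minute or so while the delays build up would show whether the worker count is climbing towards the limit.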

The VE has:
- 1.5 GB of memory used
- kmemsize 36994538 (the maximum held; the limit is set to unlimited)
- numfile 20K
- tcpsndbuf 5121092
- tcprcvbuf 5121092
- othersockbuf 5121092 (can be unlimited)
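
On OpenVZ, a VE that is hitting limits like these shows it in the failcnt column of /proc/user_beancounters. A minimal sketch that lists the resources with a non-zero fail count follows. ubc_failures is a hypothetical helper, and it assumes the usual layout in which the resource name is the sixth field from the end of the line and failcnt is the last.

```shell
# Print "resource failcnt" for every beancounter whose fail counter is > 0.
# Reads the named file(s), or stdin when no file is given.
# Header lines are skipped because their last field is not a number.
ubc_failures() {
    awk 'NF >= 6 && $NF ~ /^[0-9]+$/ && $NF > 0 { print $(NF-5), $NF }' "$@"
}

# Usage (as root, on the host or inside the VE):
#   ubc_failures /proc/user_beancounters
```

An empty result means no limit has been hit since the counters were last reset.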

Regards
 
Hi,

Is the fix for this a configuration change that can be done by users? I am having a similar problem, with fast initial connection and command response, but SMTP banner delays between 1 and 3 minutes. The load average on the server is hovering around 0.1, and our mail volume after all the smtp checks is under 1000/day.

I think the delays started happening after I changed the Verify Receivers option to "Yes (450)", but I'm not sure if that's the cause.

Thanks,

Ahmad
 
Some config had to be done in vi

Hello,

the configuration changes described in my "incident close" post were made at the shell.
The web interface does not expose them.

Another point is network throughput.
Part of the problem was that all those emails saturated the network link, leaving TCP sessions hanging in Postfix.
As we are using a QoS device here, giving more throughput to the Proxmox helped to reduce the number of concurrent sessions in Postfix.

And finally, reverse DNS checks are definitely a way to slow things down if your DNS is slow.
Fine-tune it so that it does not wait too long for spammers' SMTP servers, which rarely have correct reverse DNS.

Good crawl
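
The reverse DNS point can be checked with a quick probe. This is a rough sketch only: rdns_seconds is a hypothetical helper, and it assumes getent is available. getent goes through the system resolver, so Postfix's own lookups may behave differently.

```shell
# Print the whole seconds taken by one reverse lookup of an IP address.
rdns_seconds() {
    start=$(date +%s)
    getent hosts "$1" >/dev/null 2>&1    # result is discarded, only the time matters
    end=$(date +%s)
    echo "$((end - start))"
}

# Usage (192.0.2.10 is a documentation address; use a real client IP):
#   rdns_seconds 192.0.2.10
```

Trying it on a few addresses taken from rejected connections gives an idea of how long a lookup without a valid reverse record keeps a session waiting.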
 
Thanks for the reply, lcfabre. I have decided to turn off the "Reject Unknown Senders" option for now, and that made everything responsive again. Unfortunately, it will mean a little more spam, but we were feeling a little lonely without any spam since we installed Proxmox.

Ahmad
 
