Hi,
I'm looking for some ideas or pointers on where to look.
I'm new to Proxmox. I've used ESXi and Hyper-V before, but never Proxmox. I have taken over a role that supports a few Proxmox servers.
The problem I'm having: we run Postgres with 4 databases on it. When we run pg_dump on one of the databases, it runs for about 2 to 3 minutes and then gets killed by the OOM killer, with the following error in /var/log/kern.log:
Oct 29 15:54:29 www1 kernel: OOM killed process 0 (pg_dump) vm:4104920kB, rss:4045352kB, swap:0kB
When we run pg_dump on any other database we have no issues. This database is around 2 GB in size. The VM is running Debian 6.0.1 with kernel 2.6.32-11-pve. It has 1 GB of swap and 30 GB of RAM, of which 15 GB of RAM and 2 MB of swap are in use. When I run pg_dump the swap usage doesn't move, but physical RAM usage increases by 2 to 3 GB before the OOM killer kills the process.
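To put the kern.log figures in perspective, here is a small Python sketch that pulls the kB values out of that OOM line and converts them to GiB. The helper names (`parse_oom_line`, `kb_to_gib`) are just my own for illustration, not part of any standard tool.

```python
import re

# The OOM line exactly as it appears in kern.log.
OOM_LINE = ("Oct 29 15:54:29 www1 kernel: OOM killed process 0 (pg_dump) "
            "vm:4104920kB, rss:4045352kB, swap:0kB")

def parse_oom_line(line):
    """Return the process name and the vm/rss/swap sizes (in kB) from an OOM line."""
    name = re.search(r"\(([^)]+)\)", line).group(1)
    sizes = {key: int(value)
             for key, value in re.findall(r"(vm|rss|swap):(\d+)kB", line)}
    return name, sizes

def kb_to_gib(kb):
    """Convert kB (1024-byte units) to GiB."""
    return kb / (1024 * 1024)

if __name__ == "__main__":
    name, sizes = parse_oom_line(OOM_LINE)
    for key, kb in sizes.items():
        print(f"{name} {key}: {kb} kB = {kb_to_gib(kb):.2f} GiB")
```

By this calculation the pg_dump process itself was holding roughly 3.9 GiB resident when it was killed, with no swap in use.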
Below are some outputs:
root@www1:/tmp# cat /proc/user_beancounters
Version: 2.5
uid   resource      held       maxheld    barrier              limit                failcnt
200:  kmemsize      300369627  397443072  14641266688          16106127360          0
      lockedpages   0          0          3932160              3932160              0
      privvmpages   5165012    6389310    9223372036854775807  9223372036854775807  0
      shmpages      1079352    1082376    1310720              1310720              0
      dummy         0          0          0                    0                    0
      numproc       1270       1629       9223372036854775807  9223372036854775807  0
      physpages     3885705    4645568    0                    7864320              0
      vmguarpages   0          0          0                    9223372036854775807  0
      oomguarpages  3373288    4584858    0                    9223372036854775807  10
      numtcpsock    590        984        9223372036854775807  9223372036854775807  0
      numflock      21         139        9223372036854775807  9223372036854775807  0
      numpty        2          4          9223372036854775807  9223372036854775807  0
      numsiginfo    0          54         9223372036854775807  9223372036854775807  0
      tcpsndbuf     22161632   32647584   9223372036854775807  9223372036854775807  0
      tcprcvbuf     25075288   37417680   9223372036854775807  9223372036854775807  0
      othersockbuf  240448     1149600    9223372036854775807  9223372036854775807  0
      dgramrcvbuf   0          354736     9223372036854775807  9223372036854775807  0
      numothersock  309        832        9223372036854775807  9223372036854775807  0
      dcachesize    20640208   89834607   7320109056           8053063680           0
      numfile       9516       13075      9223372036854775807  9223372036854775807  0
      dummy         0          0          0                    0                    0
      dummy         0          0          0                    0                    0
      dummy         0          0          0                    0                    0
      numiptent     55         55         9223372036854775807  9223372036854775807  0
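To make the page-based counters easier to read, here is a rough Python sketch that converts a few of the rows above to GiB and lists any resource with a nonzero failcnt. It assumes a 4 KiB page size (the usual value on x86, but I have not confirmed it on this host), and the names `ROWS`, `pages_to_gib`, and `failed_resources` are only illustrative.

```python
PAGE_SIZE = 4096  # ASSUMPTION: 4 KiB pages, not confirmed on this host

# (resource, held, maxheld, barrier, limit, failcnt), copied from the output above.
ROWS = [
    ("shmpages", 1079352, 1082376, 1310720, 1310720, 0),
    ("physpages", 3885705, 4645568, 0, 7864320, 0),
    ("oomguarpages", 3373288, 4584858, 0, 9223372036854775807, 10),
]

def pages_to_gib(pages):
    """Convert a page count to GiB using the assumed page size."""
    return pages * PAGE_SIZE / 2**30

def failed_resources(rows):
    """Return the names of resources whose failcnt is greater than zero."""
    return [name for name, _held, _maxheld, _barrier, _limit, failcnt in rows
            if failcnt > 0]

if __name__ == "__main__":
    for name, held, maxheld, _barrier, limit, _failcnt in ROWS:
        print(f"{name}: held {pages_to_gib(held):.2f} GiB, "
              f"maxheld {pages_to_gib(maxheld):.2f} GiB, "
              f"limit {pages_to_gib(limit):.2f} GiB")
    print("nonzero failcnt:", failed_resources(ROWS))
```

Under that page-size assumption the physpages limit works out to 30 GiB, which matches the RAM figure I quoted, and oomguarpages is the only row in the output with a nonzero failcnt.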
I have even set oom_adj for the postgres PID to -17, the lowest it can be, and it still kills the process.
Does anyone have any ideas or pointers on what to look at?
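To double-check which processes actually carry that value, something like the following Python sketch could be run inside the VM. It reads the process name from /proc/<pid>/status and the setting from /proc/<pid>/oom_adj. The function name `oom_adj_by_name` and the `proc_root` parameter are my own, and I am assuming both files are readable in this environment.

```python
import os

def oom_adj_by_name(name, proc_root="/proc"):
    """Return {pid: oom_adj} for every process whose Name: field equals `name`."""
    result = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "status")) as f:
                first = f.readline()  # first line looks like "Name:\tpostgres"
            if first.split(":", 1)[1].strip() != name:
                continue
            with open(os.path.join(base, "oom_adj")) as f:
                result[int(entry)] = int(f.read().strip())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or the file is missing or unreadable
    return result

if __name__ == "__main__":
    for proc_name in ("postgres", "pg_dump"):
        print(proc_name, oom_adj_by_name(proc_name))
```

Running it while a dump is in progress would show whether the pg_dump process named in the log line has the same oom_adj value as the postgres processes.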
Thanks
Paul