Proxmox crashing when using resources

Baptiste.mrch

New Member
Oct 26, 2023
Hello,

I'm creating this thread because I'm really lost. For some time now my Proxmox host has been crashing, apparently when a lot of resources are in use.
I'm running a Docker container with PhotoPrism installed in it (using the awesome Proxmox helper scripts), and I noticed it uses a lot of resources (I gave it 2 CPU threads and 2 GB of RAM). When I run a library index, both CPU threads and the RAM go to about 80% usage for this container ALONE. The host reported approximately 50-60% CPU and 60-70% RAM usage.

I wonder whether I have bad hardware, because I have tried many things, including updating Proxmox, the processor microcode, and the CPU scaling governor, all via the Proxmox helper scripts. I still have the issue.

Below are 2 logs I extracted from Proxmox using rsyslog:
https://pastebin.com/dHair40M
https://pastebin.com/m2KatVHG

Sorry if my English is bad :)
Thank you!

Here is my config:
root@proxmox:~# pveversion --verbose
proxmox-ve: 8.0.2 (running kernel: 6.2.16-18-pve)
pve-manager: 8.0.4 (running version: 8.0.4/d258a813cfa6b390)
pve-kernel-6.2: 8.0.5
proxmox-kernel-helper: 8.0.3
proxmox-kernel-6.2.16-18-pve: 6.2.16-18
proxmox-kernel-6.2: 6.2.16-18
proxmox-kernel-6.2.16-15-pve: 6.2.16-15
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph-fuse: 17.2.6-pve1+3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx5
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.5
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.9
libpve-guest-common-perl: 5.0.5
libpve-http-server-perl: 5.0.4
libpve-rs-perl: 0.8.5
libpve-storage-perl: 8.0.2
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.4-1
proxmox-backup-file-restore: 3.0.4-1
proxmox-kernel-helper: 8.0.3
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.0.9
pve-cluster: 8.0.4
pve-container: 5.0.4
pve-docs: 8.0.5
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.8-3
pve-ha-manager: 4.0.2
pve-i18n: 3.0.7
pve-qemu-kvm: 8.0.2-7
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.7
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.13-pve1
 
The Proxmox manual and just about every thread about Docker on this forum will warn about running it in a container. Can you run Docker in a VM and see if the problem persists?
 
If this is the case, the script didn't come from the repository you mentioned. The script there installs PhotoPrism in a Linux container, not a Docker container.
Yeah, I'm using the Docker LXC so that I can keep control of the installation. I'm not sure it's necessary, but I'm using an NFS mount in /etc/fstab, and I didn't succeed with your PhotoPrism script.
Many thanks for your scripts btw :)
 
It crashed again. Here are the logs:
Oct 26 22:31:45 proxmox kernel: device veth107i0 entered promiscuous mode
Oct 26 22:31:45 proxmox kernel: eth0: renamed from vethRwSAsW
Oct 26 22:31:45 proxmox pct[15797]: <root@pam> end task UPID:proxmox:00003DB6:0003A8BA:653ACCB0:vzstart:107:root@pam: OK
Oct 26 22:31:46 proxmox kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Oct 26 22:31:46 proxmox kernel: vmbr0: port 8(veth107i0) entered blocking state
Oct 26 22:31:46 proxmox kernel: vmbr0: port 8(veth107i0) entered forwarding state
Oct 26 22:31:53 proxmox postfix/qmgr[909]: 5E84A81012: from=<root@proxmox.home>, size=614, nrcpt=1 (queue active)
Oct 26 22:31:53 proxmox postfix/local[16582]: error: open database /etc/aliases.db: No such file or directory
Oct 26 22:31:53 proxmox postfix/local[16582]: warning: hash:/etc/aliases is unavailable. open database /etc/aliases.db: No such file or directory
Oct 26 22:31:53 proxmox postfix/local[16582]: warning: hash:/etc/aliases: lookup of 'root' failed
Oct 26 22:31:53 proxmox postfix/local[16582]: 5E84A81012: to=<root@proxmox.home>, orig_to=<root>, relay=local, delay=2339, delays=2339/0.01/0/0.03, dsn=4.3.0, status=deferred (alias database unavailable)
Oct 26 22:36:53 proxmox postfix/qmgr[909]: 3198980FC3: from=<root@proxmox.home>, size=614, nrcpt=1 (queue active)
Oct 26 22:36:53 proxmox postfix/local[19515]: error: open database /etc/aliases.db: No such file or directory
Oct 26 22:36:53 proxmox postfix/local[19515]: warning: hash:/etc/aliases is unavailable. open database /etc/aliases.db: No such file or directory
Oct 26 22:36:53 proxmox postfix/local[19515]: warning: hash:/etc/aliases: lookup of 'root' failed
Oct 26 22:36:53 proxmox postfix/local[19515]: 3198980FC3: to=<root@proxmox.home>, orig_to=<root>, relay=local, delay=3331, delays=3331/0.01/0/0.01, dsn=4.3.0, status=deferred (alias database unavailable)
Oct 26 22:40:24 proxmox pveproxy[963]: worker 966 finished
Oct 26 22:40:24 proxmox pveproxy[963]: starting 1 worker(s)
Oct 26 22:40:24 proxmox pveproxy[963]: worker 20604 started
Oct 26 22:40:25 proxmox pveproxy[20600]: worker exit
Oct 26 22:45:42 proxmox pvedaemon[959]: <root@pam> successful auth for user 'root@pam'
Oct 26 22:51:08 proxmox pveproxy[963]: worker 965 finished
Oct 26 22:51:08 proxmox pveproxy[963]: starting 1 worker(s)
Oct 26 22:51:08 proxmox pveproxy[963]: worker 24334 started
Oct 26 22:51:11 proxmox pveproxy[24330]: got inotify poll request in wrong process - disabling inotify
Oct 26 22:51:51 proxmox smartd[601]: Device: /dev/sda [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 75 to 74
-- Reboot --
Oct 26 23:01:05 proxmox kernel: microcode: microcode updated early to revision 0xf4, date = 2023-02-23
Oct 26 23:01:05 proxmox kernel: Linux version 6.2.16-18-pve (build@proxmox) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC PMX 6.2.16-18 (2023-10-11T15:05Z) ()
Oct 26 23:01:05 proxmox kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.2.16-18-pve root=/dev/mapper/pve-root ro quiet
Oct 26 23:01:05 proxmox kernel: KERNEL supported cpus:
Oct 26 23:01:05 proxmox kernel: Intel GenuineIntel
Oct 26 23:01:05 proxmox kernel: AMD AuthenticAMD
Oct 26 23:01:05 proxmox kernel: Hygon HygonGenuine
Oct 26 23:01:05 proxmox kernel: Centaur CentaurHauls
Oct 26 23:01:05 proxmox kernel: zhaoxin Shanghai
 
It crashed again. Here are the logs:
Code:
Oct 26 22:31:45 proxmox kernel: device veth107i0 entered promiscuous mode
Oct 26 22:31:45 proxmox kernel: eth0: renamed from vethRwSAsW
Oct 26 22:31:45 proxmox pct[15797]: <root@pam> end task UPID:proxmox:00003DB6:0003A8BA:653ACCB0:vzstart:107:root@pam: OK
Oct 26 22:31:46 proxmox kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Oct 26 22:31:46 proxmox kernel: vmbr0: port 8(veth107i0) entered blocking state
Oct 26 22:31:46 proxmox kernel: vmbr0: port 8(veth107i0) entered forwarding state
Oct 26 22:31:53 proxmox postfix/qmgr[909]: 5E84A81012: from=<root@proxmox.home>, size=614, nrcpt=1 (queue active)
Oct 26 22:31:53 proxmox postfix/local[16582]: error: open database /etc/aliases.db: No such file or directory
Oct 26 22:31:53 proxmox postfix/local[16582]: warning: hash:/etc/aliases is unavailable. open database /etc/aliases.db: No such file or directory
Oct 26 22:31:53 proxmox postfix/local[16582]: warning: hash:/etc/aliases: lookup of 'root' failed
Oct 26 22:31:53 proxmox postfix/local[16582]: 5E84A81012: to=<root@proxmox.home>, orig_to=<root>, relay=local, delay=2339, delays=2339/0.01/0/0.03, dsn=4.3.0, status=deferred (alias database unavailable)
Oct 26 22:36:53 proxmox postfix/qmgr[909]: 3198980FC3: from=<root@proxmox.home>, size=614, nrcpt=1 (queue active)
Oct 26 22:36:53 proxmox postfix/local[19515]: error: open database /etc/aliases.db: No such file or directory
Oct 26 22:36:53 proxmox postfix/local[19515]: warning: hash:/etc/aliases is unavailable. open database /etc/aliases.db: No such file or directory
Oct 26 22:36:53 proxmox postfix/local[19515]: warning: hash:/etc/aliases: lookup of 'root' failed
Oct 26 22:36:53 proxmox postfix/local[19515]: 3198980FC3: to=<root@proxmox.home>, orig_to=<root>, relay=local, delay=3331, delays=3331/0.01/0/0.01, dsn=4.3.0, status=deferred (alias database unavailable)
Proxmox is trying to tell you something, which might be important, by e-mailing you but fails because mail is not configured properly.
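The repeated `open database /etc/aliases.db: No such file or directory` errors usually mean that the Postfix alias database was never built. Below is a minimal sketch of one way to check for this and repair it, assuming a stock Postfix install where `newaliases` and `postqueue` are available. The function names are made up for illustration.

```shell
# Hypothetical helper: count deliveries deferred because of the missing
# alias database, reading log lines from stdin.
count_alias_deferrals() {
    # grep -c prints 0 but returns non-zero when nothing matches,
    # so the exit status is masked.
    grep -c 'alias database unavailable' || true
}

# Hypothetical helper: rebuild /etc/aliases.db and retry queued mail.
# Has to be run as root on the Proxmox host.
rebuild_aliases() {
    newaliases      # regenerates /etc/aliases.db from /etc/aliases
    postqueue -f    # asks Postfix to retry delivery of deferred mail
    postqueue -p    # lists whatever is still sitting in the queue
}
```

For example, `journalctl -b | count_alias_deferrals` shows how many deliveries were deferred since the last boot, and calling `rebuild_aliases` afterwards should let the queued messages to root go through.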
Code:
Oct 26 22:40:24 proxmox pveproxy[963]: worker 966 finished
Oct 26 22:40:24 proxmox pveproxy[963]: starting 1 worker(s)
Oct 26 22:40:24 proxmox pveproxy[963]: worker 20604 started
Oct 26 22:40:25 proxmox pveproxy[20600]: worker exit
Oct 26 22:45:42 proxmox pvedaemon[959]: <root@pam> successful auth for user 'root@pam'
Oct 26 22:51:08 proxmox pveproxy[963]: worker 965 finished
Oct 26 22:51:08 proxmox pveproxy[963]: starting 1 worker(s)
Oct 26 22:51:08 proxmox pveproxy[963]: worker 24334 started
Oct 26 22:51:11 proxmox pveproxy[24330]: got inotify poll request in wrong process - disabling inotify
Oct 26 22:51:51 proxmox smartd[601]: Device: /dev/sda [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 75 to 74
-- Reboot --
There is no information in the logs about the crash, just that it happened and the system detected a restart. This is usually a hardware issue. Run memtest, start replacing components, update the BIOS, make sure temperatures are not too high. What kind of hardware are you using?
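To check whether anything was logged right before each crash, one option is to pull out the last line written before every `-- Reboot --` marker. This is only a sketch, assuming the journal is stored persistently and that `journalctl` prints the marker exactly as in the excerpt above (other systemd versions may format it differently). `last_lines_before_reboot` is a made-up name.

```shell
# Hypothetical helper: read a journal dump from stdin and print the last
# log line that precedes each "-- Reboot --" marker.
last_lines_before_reboot() {
    awk '
        /^-- Reboot --$/ { if (prev != "") print prev; prev = ""; next }
        { prev = $0 }   # remember the most recent ordinary log line
    '
}
```

It could be used as `journalctl --since "2023-10-26" | last_lines_before_reboot`, and `journalctl -b -1 -p warning` lists only warnings and errors from the previous boot. If the last line before every reboot is routine, as it is here, the machine most likely hung or lost power before the kernel could write anything.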
 
Okay, I think I will try to set up the email configuration so that I can check.

I will run memtest. I bought new RAM sticks a few days ago; I will install them soon and see if it does better.

Thanks for your help!
 
Okay, some news: I configured the email service and it works fine (I tested it via the command line and via backup notifications). I received absolutely no emails regarding the crashes.
Memtest tells me everything is fine. I really don't know what to do.
My config is the following:
- Intel Core i3-8109U @ 3.00GHz
- 8 GB DDR4 2400MT/s
- 1 TB M.2 SSD from Samsung
 
And you are having these problems when running Docker inside a VM? Or only when running it in a container?
 
I have this issue randomly, I think when a lot of resources are in use (my PhotoPrism container is only an example).
I've just reinstalled Proxmox, and for now everything seems to stay online, even with all my containers/VMs turned on. I will let a few days go by and see whether it fixed my issue. I'll let you know :)
 
Okay, some news. Everything appeared to be working fine until Saturday, when my Proxmox crashed again. The main thing that repeats each time I check the logs is some sort of cron execution (the hourly one). I checked, and there is nothing at all inside the /etc/cron.hourly directory.
https://pastebin.com/k082cTGp

If you have any ideas and/or diagnostics, I'll take them.
Thanks!
 
Cron messages appear regularly anyway and are probably unrelated. It's probably a hardware (or cooling) issue that is triggered by stressing the system.
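One way to test the cooling theory is to record temperatures to disk while the heavy job runs, so that the last reading survives a hard crash. Here is a rough sketch, assuming the kernel exposes sensors under `/sys/class/thermal` (not every board does). The function names and the log path in the usage note are made up for illustration.

```shell
# Convert a sysfs reading in millidegrees Celsius to whole degrees.
millideg_to_c() {
    echo $(( $1 / 1000 ))
}

# Print one timestamped line per thermal zone the kernel exposes.
log_temps() {
    for zone in /sys/class/thermal/thermal_zone*/temp; do
        [ -r "$zone" ] || continue                  # no zones or not readable
        raw=$(cat "$zone" 2>/dev/null) || continue  # some zones refuse reads
        printf '%s %s %s C\n' "$(date '+%F %T')" "$zone" "$(millideg_to_c "$raw")"
    done
}
```

Running `while sleep 5; do log_temps >> /root/temps.log; sync; done` in one shell while the PhotoPrism indexing runs would leave a trail to inspect after the next reboot. Steadily climbing values right before the log stops would point towards cooling, while flat values would point elsewhere.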
 
Hi again,

For a while after reinstalling Proxmox, everything seemed to work fine. Unfortunately, I still have crashes. I changed the RAM sticks (upgrading to 16 GB) and it still crashes. I think I will try completely different hardware to be sure.

Here are the logs in case someone finds something interesting:
https://pastebin.com/NLnEMbbx

Thanks
 
