Proxmox crashing when using resources

Baptiste.mrch

New Member
Oct 26, 2023
Hello,

I'm creating this thread because I'm really lost. For some time now my Proxmox host has been crashing, and I've noticed it happens when I use a lot of resources.
I'm running PhotoPrism in a Docker container (set up with the awesome Proxmox helper scripts), and I noticed it uses heavy resources (I allocated 2 CPU threads and 2 GB of RAM). When I run a library indexing, both CPU threads and the RAM reach about 80% usage for this container alone, while the host reports roughly 50-60% CPU and 60-70% RAM.

I wonder if I have faulty hardware, because I've tried many things, including updating Proxmox, the processor microcode, and the CPU scaling governor, all via the Proxmox helper scripts. I still have the issue.

Below are two logs I extracted from Proxmox using rsyslog:
https://pastebin.com/dHair40M
https://pastebin.com/m2KatVHG

Sorry if my English is bad :)
Thank you!

Here is my config:
root@proxmox:~# pveversion --verbose
proxmox-ve: 8.0.2 (running kernel: 6.2.16-18-pve)
pve-manager: 8.0.4 (running version: 8.0.4/d258a813cfa6b390)
pve-kernel-6.2: 8.0.5
proxmox-kernel-helper: 8.0.3
proxmox-kernel-6.2.16-18-pve: 6.2.16-18
proxmox-kernel-6.2: 6.2.16-18
proxmox-kernel-6.2.16-15-pve: 6.2.16-15
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph-fuse: 17.2.6-pve1+3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx5
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.5
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.9
libpve-guest-common-perl: 5.0.5
libpve-http-server-perl: 5.0.4
libpve-rs-perl: 0.8.5
libpve-storage-perl: 8.0.2
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.4-1
proxmox-backup-file-restore: 3.0.4-1
proxmox-kernel-helper: 8.0.3
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.0.9
pve-cluster: 8.0.4
pve-container: 5.0.4
pve-docs: 8.0.5
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.8-3
pve-ha-manager: 4.0.2
pve-i18n: 3.0.7
pve-qemu-kvm: 8.0.2-7
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.7
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.13-pve1
 
The Proxmox manual and just about every thread about Docker on this forum warn against running it in a container. Can you run Docker in a VM and see if the problem persists?
 
If that is the case, the script didn't come from the repository you mentioned. The script there installs PhotoPrism in a Linux container, not a Docker container.
Yeah, I'm using the Docker LXC so that I can keep control over the installation. Not sure if it's necessary, but I'm using an NFS mount in /etc/fstab, and I didn't succeed with your PhotoPrism script.
Many thanks for your scripts, by the way :)
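For anyone reading along, an NFS mount in /etc/fstab on the Proxmox host typically looks like the line below (the NAS address and paths are hypothetical, not from this thread):

```shell
# /etc/fstab entry on the Proxmox host (hypothetical NAS IP and export path)
# _netdev waits for the network before mounting; noatime avoids extra writes
192.168.1.50:/export/photos  /mnt/photos  nfs  defaults,_netdev,noatime  0  0
```

The host-side mount point can then be passed into the container, e.g. with a bind mount like `pct set 107 -mp0 /mnt/photos,mp=/photos` (container ID and paths again hypothetical).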
 
It crashed again. Here are the logs:
Oct 26 22:31:45 proxmox kernel: device veth107i0 entered promiscuous mode
Oct 26 22:31:45 proxmox kernel: eth0: renamed from vethRwSAsW
Oct 26 22:31:45 proxmox pct[15797]: <root@pam> end task UPID:proxmox:00003DB6:0003A8BA:653ACCB0:vzstart:107:root@pam: OK
Oct 26 22:31:46 proxmox kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Oct 26 22:31:46 proxmox kernel: vmbr0: port 8(veth107i0) entered blocking state
Oct 26 22:31:46 proxmox kernel: vmbr0: port 8(veth107i0) entered forwarding state
Oct 26 22:31:53 proxmox postfix/qmgr[909]: 5E84A81012: from=<root@proxmox.home>, size=614, nrcpt=1 (queue active)
Oct 26 22:31:53 proxmox postfix/local[16582]: error: open database /etc/aliases.db: No such file or directory
Oct 26 22:31:53 proxmox postfix/local[16582]: warning: hash:/etc/aliases is unavailable. open database /etc/aliases.db: No such file or directory
Oct 26 22:31:53 proxmox postfix/local[16582]: warning: hash:/etc/aliases: lookup of 'root' failed
Oct 26 22:31:53 proxmox postfix/local[16582]: 5E84A81012: to=<root@proxmox.home>, orig_to=<root>, relay=local, delay=2339, delays=2339/0.01/0/0.03, dsn=4.3.0, status=deferred (alias database unavailable)
Oct 26 22:36:53 proxmox postfix/qmgr[909]: 3198980FC3: from=<root@proxmox.home>, size=614, nrcpt=1 (queue active)
Oct 26 22:36:53 proxmox postfix/local[19515]: error: open database /etc/aliases.db: No such file or directory
Oct 26 22:36:53 proxmox postfix/local[19515]: warning: hash:/etc/aliases is unavailable. open database /etc/aliases.db: No such file or directory
Oct 26 22:36:53 proxmox postfix/local[19515]: warning: hash:/etc/aliases: lookup of 'root' failed
Oct 26 22:36:53 proxmox postfix/local[19515]: 3198980FC3: to=<root@proxmox.home>, orig_to=<root>, relay=local, delay=3331, delays=3331/0.01/0/0.01, dsn=4.3.0, status=deferred (alias database unavailable)
Oct 26 22:40:24 proxmox pveproxy[963]: worker 966 finished
Oct 26 22:40:24 proxmox pveproxy[963]: starting 1 worker(s)
Oct 26 22:40:24 proxmox pveproxy[963]: worker 20604 started
Oct 26 22:40:25 proxmox pveproxy[20600]: worker exit
Oct 26 22:45:42 proxmox pvedaemon[959]: <root@pam> successful auth for user 'root@pam'
Oct 26 22:51:08 proxmox pveproxy[963]: worker 965 finished
Oct 26 22:51:08 proxmox pveproxy[963]: starting 1 worker(s)
Oct 26 22:51:08 proxmox pveproxy[963]: worker 24334 started
Oct 26 22:51:11 proxmox pveproxy[24330]: got inotify poll request in wrong process - disabling inotify
Oct 26 22:51:51 proxmox smartd[601]: Device: /dev/sda [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 75 to 74
-- Reboot --
Oct 26 23:01:05 proxmox kernel: microcode: microcode updated early to revision 0xf4, date = 2023-02-23
Oct 26 23:01:05 proxmox kernel: Linux version 6.2.16-18-pve (build@proxmox) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC PMX 6.2.16-18 (2023-10-11T15:05Z) ()
Oct 26 23:01:05 proxmox kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.2.16-18-pve root=/dev/mapper/pve-root ro quiet
Oct 26 23:01:05 proxmox kernel: KERNEL supported cpus:
Oct 26 23:01:05 proxmox kernel: Intel GenuineIntel
Oct 26 23:01:05 proxmox kernel: AMD AuthenticAMD
Oct 26 23:01:05 proxmox kernel: Hygon HygonGenuine
Oct 26 23:01:05 proxmox kernel: Centaur CentaurHauls
Oct 26 23:01:05 proxmox kernel: zhaoxin Shanghai
 
It crashed again. Here are the logs:
Code:
Oct 26 22:31:45 proxmox kernel: device veth107i0 entered promiscuous mode
Oct 26 22:31:45 proxmox kernel: eth0: renamed from vethRwSAsW
Oct 26 22:31:45 proxmox pct[15797]: <root@pam> end task UPID:proxmox:00003DB6:0003A8BA:653ACCB0:vzstart:107:root@pam: OK
Oct 26 22:31:46 proxmox kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Oct 26 22:31:46 proxmox kernel: vmbr0: port 8(veth107i0) entered blocking state
Oct 26 22:31:46 proxmox kernel: vmbr0: port 8(veth107i0) entered forwarding state
Oct 26 22:31:53 proxmox postfix/qmgr[909]: 5E84A81012: from=<root@proxmox.home>, size=614, nrcpt=1 (queue active)
Oct 26 22:31:53 proxmox postfix/local[16582]: error: open database /etc/aliases.db: No such file or directory
Oct 26 22:31:53 proxmox postfix/local[16582]: warning: hash:/etc/aliases is unavailable. open database /etc/aliases.db: No such file or directory
Oct 26 22:31:53 proxmox postfix/local[16582]: warning: hash:/etc/aliases: lookup of 'root' failed
Oct 26 22:31:53 proxmox postfix/local[16582]: 5E84A81012: to=<root@proxmox.home>, orig_to=<root>, relay=local, delay=2339, delays=2339/0.01/0/0.03, dsn=4.3.0, status=deferred (alias database unavailable)
Oct 26 22:36:53 proxmox postfix/qmgr[909]: 3198980FC3: from=<root@proxmox.home>, size=614, nrcpt=1 (queue active)
Oct 26 22:36:53 proxmox postfix/local[19515]: error: open database /etc/aliases.db: No such file or directory
Oct 26 22:36:53 proxmox postfix/local[19515]: warning: hash:/etc/aliases is unavailable. open database /etc/aliases.db: No such file or directory
Oct 26 22:36:53 proxmox postfix/local[19515]: warning: hash:/etc/aliases: lookup of 'root' failed
Oct 26 22:36:53 proxmox postfix/local[19515]: 3198980FC3: to=<root@proxmox.home>, orig_to=<root>, relay=local, delay=3331, delays=3331/0.01/0/0.01, dsn=4.3.0, status=deferred (alias database unavailable)
Proxmox is trying to tell you something, possibly important, by e-mailing you, but the mail fails because it is not configured properly.
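The deferred-mail errors come from the missing /etc/aliases.db. A minimal fix, assuming you want root's mail delivered to an address you actually read (the address below is a placeholder):

```shell
# Point root's mail at a real mailbox (placeholder address, adjust to yours)
echo "root: admin@example.com" >> /etc/aliases
# Rebuild /etc/aliases.db, which clears the "alias database unavailable" deferrals
newaliases
systemctl reload postfix
```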
Code:
Oct 26 22:40:24 proxmox pveproxy[963]: worker 966 finished
Oct 26 22:40:24 proxmox pveproxy[963]: starting 1 worker(s)
Oct 26 22:40:24 proxmox pveproxy[963]: worker 20604 started
Oct 26 22:40:25 proxmox pveproxy[20600]: worker exit
Oct 26 22:45:42 proxmox pvedaemon[959]: <root@pam> successful auth for user 'root@pam'
Oct 26 22:51:08 proxmox pveproxy[963]: worker 965 finished
Oct 26 22:51:08 proxmox pveproxy[963]: starting 1 worker(s)
Oct 26 22:51:08 proxmox pveproxy[963]: worker 24334 started
Oct 26 22:51:11 proxmox pveproxy[24330]: got inotify poll request in wrong process - disabling inotify
Oct 26 22:51:51 proxmox smartd[601]: Device: /dev/sda [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 75 to 74
-- Reboot --
There is no information in the logs about the crash itself, just that it happened and that the system then restarted. This is usually a hardware issue. Run memtest, start replacing components, update the BIOS, and make sure temperatures are not too high. What kind of hardware are you using?
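A quick way to check the usual suspects from the CLI is sketched below; it assumes lm-sensors and stress-ng are installed (they are not by default, while smartmontools ships with Proxmox):

```shell
# Check temperatures and SSD health, then load the CPU while watching temps
apt install -y lm-sensors stress-ng
sensors                                # current CPU/board temperatures
smartctl -a /dev/sda | grep -i -E "temperature|reallocated|wear"
stress-ng --cpu "$(nproc)" --timeout 10m &
watch -n 2 sensors                     # watch temperatures climb under load
```

If the box locks up or reboots during the stress run while temperatures spike, cooling is the prime suspect; if it dies with normal temperatures, look at PSU/RAM/board instead.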
 
Okay, I think I will try to set up the email configuration so that I can check.

I will run memtest. I bought new RAM sticks a few days ago; I will install them soon and see if things improve.

Thanks for your help!
 
Okay, some news: I configured the email service and it works fine (I tested it via the command line and via backup notifications). I receive absolutely zero emails about the crashes.
Memtest tells me everything is fine. I really don't know what to do.
My config is the following:
- Intel Core i3-8109U @ 3.00 GHz
- 8 GB DDR4 2400 MT/s
- 1 TB Samsung M.2 SSD
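Since the host reboots without leaving anything useful in the logs, it may help to make the systemd journal persistent so the last messages before a crash survive the reboot (standard systemd commands, nothing Proxmox-specific):

```shell
# Keep journal logs across reboots (journald auto-detects this directory)
mkdir -p /var/log/journal
systemd-tmpfiles --create --prefix /var/log/journal
systemctl restart systemd-journald
# After the next crash, jump to the end of the previous boot's log:
journalctl -b -1 -e
```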
 
And you are having these problems when running Docker inside a VM? Or only when running it in a container?
 
I have this issue randomly; I think it happens when I'm using a lot of resources (my PhotoPrism container is only an example).
I've just reinstalled Proxmox, and for now everything seems to stay online even with all my containers/VMs turned on. I'll let a few days go by and see if that fixed the issue. I'll let you know :)
 
Okay, some news. Everything appeared to be working fine until Saturday, when my Proxmox crashed again. The main thing that repeats each time I check the logs is some sort of cron execution (the hourly one). I checked, and I have nothing at all inside the /etc/cron.hourly directory.
https://pastebin.com/k082cTGp

If you have any ideas and/or diagnostics, I'll take them.
Thanks!
 
Cron messages appear regularly anyway and are probably unrelated. It's probably a hardware (or cooling) issue that is triggered by stressing the system.
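One more thing worth checking after a crash is whether the kernel logged any machine-check or thermal events before going down. This requires a persistent journal so the previous boot's messages survive the reboot:

```shell
# Scan the previous boot's kernel messages for hardware error indicators
journalctl -k -b -1 --no-pager \
  | grep -i -E "mce|machine check|thermal|throttl|hardware error"
```

Hits on "machine check" or "hardware error" strongly suggest failing CPU/RAM/board hardware; thermal or throttling messages point at cooling instead.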
 
Hi again,

After a while, the reinstalled Proxmox seemed to work fine. Unfortunately, I still get crashes. I changed the RAM sticks (upgraded to 16 GB) and it still crashes. I think I will try completely different hardware to be sure.

Here are the logs, in case someone finds something interesting:
https://pastebin.com/NLnEMbbx

Thanks