Hello,
I need help with an issue on my Proxmox server.
Here is my config:
CPU: AMD Ryzen 9 3900X
Motherboard: Gigabyte B450M DS3H
RAM: 4x 32 GB
GPU: GeForce GT 710 (only to get access to the console screen)
Disk: SSD (no RAID or anything)
Proxmox: 8.3.3
Kernel: Linux 6.8.12-8-pve
I have had this issue for a long time, probably since last summer. Occasionally, about once a month, my Proxmox host freezes completely. The web UI is unavailable and SSH is impossible. The only thing that still works is ping. Most of the VMs become unavailable (while one or two continue to function correctly).
Here is what I have already tried:
- Changed the kernel version
- Updated the BIOS
- Replaced the SSD
- Changed BIOS settings:
  - Disabled "Global C-State Control"
  - Disabled power optimization (I don't remember the exact names, but I disabled the options that reduce power under low load, and similar ones)
- Updated the Proxmox version
I am running out of ideas. When the freeze or crash occurs, there seem to be no particular log entries before my manual reboot that could help me find out what is happening (here the freeze or crash is at about 05:00:00 and my manual reboot is around 09:00:00):
Code:
Apr 10 02:17:01 sgc CRON[1450196]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 10 02:17:01 sgc CRON[1450197]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Apr 10 02:17:01 sgc CRON[1450196]: pam_unix(cron:session): session closed for user root
Apr 10 02:30:54 sgc smartd[1119]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 55 to 56
Apr 10 03:00:54 sgc smartd[1119]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 56 to 55
Apr 10 03:10:01 sgc CRON[1460175]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 10 03:10:01 sgc CRON[1460176]: (root) CMD (test -e /run/systemd/system || SERVICE_MODE=1 /sbin/e2scrub_all -A -r)
Apr 10 03:10:01 sgc CRON[1460175]: pam_unix(cron:session): session closed for user root
Apr 10 03:17:01 sgc CRON[1461484]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 10 03:17:01 sgc CRON[1461485]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Apr 10 03:17:01 sgc CRON[1461484]: pam_unix(cron:session): session closed for user root
Apr 10 03:30:54 sgc smartd[1119]: Device: /dev/sdb [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 53 to 54
Apr 10 04:00:54 sgc smartd[1119]: Device: /dev/sdb [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 54 to 53
Apr 10 04:17:01 sgc CRON[1472691]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Apr 10 04:17:01 sgc CRON[1472692]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Apr 10 04:17:01 sgc CRON[1472691]: pam_unix(cron:session): session closed for user root
Apr 10 04:30:54 sgc smartd[1119]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 55 to 56
Apr 10 05:00:55 sgc smartd[1119]: Device: /dev/sdb [SAT], SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 53 to 54
-- Boot 66f0503b5c4843f98ef7cdba37695d8f --
Apr 10 09:05:20 sgc kernel: Linux version 6.8.12-8-pve (build@proxmox) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC PMX 6.8.12-8 (2025-01-24T12:32Z) ()
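To make the silent window easier to see, here is a rough Python sketch that finds the last journal entry before the boot marker and the first one after it. It is only an illustration: the function name `silent_gap` is made up, the year is an assumption (the short journal timestamps do not include one), and the sample messages are shortened.

```python
from datetime import datetime

def silent_gap(lines, year=2025):
    """Return (last_before, first_after, gap) around the first '-- Boot' marker.

    `year` is an assumption: short journal timestamps carry no year.
    Returns None if no boot marker with entries on both sides is found.
    """
    last_before = None
    seen_boot = False
    for line in lines:
        if line.startswith("-- Boot "):
            seen_boot = True
            continue
        try:
            # The first 15 characters look like "Apr 10 05:00:55".
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # not a timestamped entry, skip it
        if seen_boot:
            if last_before is None:
                return None
            return last_before, stamp, stamp - last_before
        last_before = stamp
    return None

# Sample taken from the log above (messages shortened).
sample = [
    "Apr 10 04:30:54 sgc smartd[1119]: Device: /dev/sda [SAT]",
    "Apr 10 05:00:55 sgc smartd[1119]: Device: /dev/sdb [SAT]",
    "-- Boot 66f0503b5c4843f98ef7cdba37695d8f --",
    "Apr 10 09:05:20 sgc kernel: Linux version 6.8.12-8-pve",
]

result = silent_gap(sample)
if result is not None:
    last_before, first_after, gap = result
    print(f"last entry: {last_before}, first after boot: {first_after}, gap: {gap}")
```

On this sample, the gap is a little over four hours, during which nothing at all was written to the journal.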
Do you know how I can troubleshoot this?
It is really strange that one or two VMs keep working while the Proxmox host does not even respond on the web UI or SSH.
It seems (I am not sure) that this issue has been happening since I upgraded from Proxmox 7 to 8 last summer. This setup had been running fine for years before that.
Thank you in advance