Proxmox Host Locks Up, N00b needs a little help with diagnostics.

arretx

New Member
Jul 15, 2020
Experience Level: Beginner - Intermediate

Host System: Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz, 8 GB RAM, three 500 GB SATA drives, one external 8 TB USB 3.0 drive

Last week, this box was running Ubuntu 18.04 independent of any VE. I migrated to Proxmox, and the only VM configured is a fresh Ubuntu 20.04 install.

The external USB drive is assigned as a USB device on the VM.

The three internal drives are split up: one is the host system's primary drive, and the other two are a ZFS mirror. The Ubuntu VM is using the entire mirror as a single LV.

Problem:

The host locks up at irregular intervals, multiple times per day. The cursor on the terminal attached to the box stops blinking, everything comes to a complete stop, and I have to force a power-off and restart the system.

There weren't any problems running Ubuntu 20.04 on the hardware without Proxmox, so I don't know if there's some sort of finicky detail that Proxmox finds that Ubuntu ignored.

Question:

Where do I start looking beyond syslog to diagnose this problem and what might I be looking for?

As you can see, the dead spots on this usage graph are where the system has gone down (though I don't notice at the time that it has gone down).

Screen Shot 2021-09-21 at 8.39.10 AM.png

So, I checked the syslog at those exact times:

Sep 20 13:17:02 pve1 CRON[1251095]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Sep 20 13:17:02 pve1 CRON[1251096]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Sep 20 13:17:02 pve1 CRON[1251095]: pam_unix(cron:session): session closed for user root
Sep 20 13:18:00 pve1 systemd[1]: Starting Proxmox VE replication runner...
Sep 20 13:18:01 pve1 systemd[1]: pvesr.service: Succeeded.
Sep 20 13:18:01 pve1 systemd[1]: Finished Proxmox VE replication runner.

----
Crash Here
----

Sep 20 15:26:00 pve1 systemd[1]: Starting Proxmox VE replication runner...
Sep 20 15:26:01 pve1 systemd[1]: pvesr.service: Succeeded.
Sep 20 15:26:01 pve1 systemd[1]: Finished Proxmox VE replication runner.
Sep 20 15:26:18 pve1 pvedaemon[1991]: <root@pam> successful auth for user 'root@pam'
Sep 20 15:27:00 pve1 systemd[1]: Starting Proxmox VE replication runner...
Sep 20 15:27:01 pve1 systemd[1]: pvesr.service: Succeeded.

----
Crash Here
----

Sep 21 00:52:18 pve1 pvedaemon[74904]: <root@pam> successful auth for user 'root@pam'
Sep 21 00:53:00 pve1 systemd[1]: Starting Proxmox VE replication runner...
Sep 21 00:53:00 pve1 systemd[1]: pvesr.service: Succeeded.
Sep 21 00:53:00 pve1 systemd[1]: Finished Proxmox VE replication runner.
Sep 21 00:54:00 pve1 systemd[1]: Starting Proxmox VE replication runner...
Sep 21 00:54:00 pve1 systemd[1]: pvesr.service: Succeeded.
Sep 21 00:54:00 pve1 systemd[1]: Finished Proxmox VE replication runner.

----
Crash Here
----

So, at this point, I have no idea what I'm seeing and I don't see a pattern, and I don't know when the system goes down, so I'm sorta stuck.
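One way to pin down when the host went silent: the replication runner in the excerpts above logs roughly once a minute, so any long hole between consecutive syslog timestamps marks a window where the box was down. Below is a minimal Python sketch of that idea. The function name `find_gaps`, the 5-minute threshold, and the year 2021 (syslog lines carry no year; the screenshot is dated 2021) are all assumptions for illustration.

```python
from datetime import datetime, timedelta

def find_gaps(lines, year=2021, threshold=timedelta(minutes=5)):
    """Return (last_seen, next_seen) pairs where the log went silent.

    `year` is an assumption: classic syslog timestamps omit it.
    """
    gaps = []
    previous = None
    for line in lines:
        stamp = line[:15]  # e.g. "Sep 20 13:18:01"
        try:
            current = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines without a leading syslog timestamp
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current))
        previous = current
    return gaps

# A few lines taken from the excerpts in this post.
SAMPLE = [
    "Sep 20 13:18:01 pve1 systemd[1]: Finished Proxmox VE replication runner.",
    "Sep 20 15:26:00 pve1 systemd[1]: Starting Proxmox VE replication runner...",
    "Sep 20 15:27:01 pve1 systemd[1]: pvesr.service: Succeeded.",
    "Sep 21 00:52:18 pve1 pvedaemon[74904]: <root@pam> successful auth for user 'root@pam'",
    "Sep 21 00:53:00 pve1 systemd[1]: Starting Proxmox VE replication runner...",
]

for last_seen, next_seen in find_gaps(SAMPLE):
    print(f"silent from {last_seen} until {next_seen} ({next_seen - last_seen})")
```

To run it against the real log, pass an open file instead of `SAMPLE`, for example `find_gaps(open("/var/log/syslog", errors="replace"))`. The last timestamp before each gap is the closest thing to a time of death, and the first one after it is the reboot.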
 
Can you ping your PVE while the cursor stops blinking/freezes?

From the attached screenshot, it looks like the IO delay is very high.
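The ping check can also be left running from a second machine, so that the moment the host stops answering gets a timestamp. Here is a rough Python sketch, assuming a Linux `ping` binary (`-c` for count, `-W` for timeout in seconds) and that the host resolves as `pve1`, the name shown in the syslog excerpts. The function names and the `max_checks` parameter are made up for this example.

```python
import subprocess
import time
from datetime import datetime

def is_reachable(host, timeout_s=2):
    """Send one ICMP echo request using the Linux `ping` binary."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

def watch(host, interval_s=10, probe=is_reachable, max_checks=None):
    """Probe the host repeatedly and report each change of state.

    Runs until interrupted unless `max_checks` is given.
    Returns the list of (timestamp, is_up) changes that were seen.
    """
    changes = []
    last_state = None
    checks = 0
    while max_checks is None or checks < max_checks:
        up = probe(host)
        if up != last_state:
            stamp = datetime.now()
            changes.append((stamp, up))
            state = "up" if up else "DOWN"
            print(f"{stamp:%Y-%m-%d %H:%M:%S} {host} is {state}", flush=True)
            last_state = up
        checks += 1
        time.sleep(interval_s)
    return changes
```

Calling `watch("pve1")` prints one line per state change. If the host still answers pings while the console is frozen, that points toward something stalling (storage, for example) more than a full hard lock.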
 
Ahh...that's something I haven't tried...probably because it's a public name server handling MX records for my e-mail and a few clients of mine...

Although, if it's going down anyway, it wouldn't be any worse to just shut down the VM and watch for a crash.

If it was determined that the VM is causing it, what would do that?
 
I would just shut down any VMs/containers running on PVE, leave the host itself running, and check whether it still crashes.
You probably need to add some more RAM.

What filesystem are you using on PVE? ZFS?
 
ZFS also needs RAM, sometimes a lot.
From my point of view, your current server configuration with "just" 8 GB RAM is not enough.

Just leave PVE running without any VMs/containers as mentioned earlier and see if it keeps crashing or not.
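To put a number on the RAM concern, the size of the ZFS cache (ARC) can be compared with total memory on the host. Here is a minimal Python sketch, assuming ZFS on Linux exposes its counters at `/proc/spl/kstat/zfs/arcstats` (rows of `name type data`, with `size` and `c_max` in bytes) and that `/proc/meminfo` reports `MemTotal` and `MemAvailable` in kB. The helper names are invented for this example.

```python
def parse_arcstats(text):
    """Parse `name type data` rows from arcstats into a dict of integers."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Header rows either have more than three fields or a non-numeric value.
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats

def meminfo_kib(text, key):
    """Return the value of one /proc/meminfo field, in KiB."""
    for line in text.splitlines():
        if line.startswith(key + ":"):
            return int(line.split()[1])
    raise KeyError(key)

def arc_summary(arcstats_text, meminfo_text):
    """Summarise ARC usage against host memory (all sizes in bytes)."""
    arc = parse_arcstats(arcstats_text)
    total = meminfo_kib(meminfo_text, "MemTotal") * 1024
    available = meminfo_kib(meminfo_text, "MemAvailable") * 1024
    return {
        "arc_size": arc["size"],
        "arc_max": arc["c_max"],
        "mem_total": total,
        "mem_available": available,
        "arc_share": arc["size"] / total,
    }

def read_host():
    """Read the live files on the Proxmox host (only works where ZFS is loaded)."""
    with open("/proc/spl/kstat/zfs/arcstats") as arc_file:
        arcstats_text = arc_file.read()
    with open("/proc/meminfo") as mem_file:
        meminfo_text = mem_file.read()
    return arc_summary(arcstats_text, meminfo_text)
```

Running `print(read_host())` on the host with the VM up shows how much of the 8 GB the ARC is holding and how much is left over. A large `arc_share` alongside a small `mem_available` would back up the "not enough RAM" theory.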
 
RAM prices are currently dropping.
If you do not want to spend too much money on RAM, you could consider running your Ubuntu in a container instead of a whole VM.
 
I have a container of milk here. But beyond that, I will now need to learn more.;)
 
