No Access every few days

LazeX7

New Member
Jul 13, 2024
Hello,

I'm running Proxmox on the following machine:

HP Elite Mini 400 G4
RAM: 32GB
storage: 1TB
CPU: Intel i5 12000 something

It has been running for a while now without any issues, but all of a sudden I lose connection to the whole machine every few days.
Rebooting the system fixes it for a few days, but after that the connection is gone again.

It runs a few VMs, so it's really annoying that this happens, and I can't figure out what the problem is.
Are there any logs I can access to see what is wrong? I can't really connect it to a monitor to see if anything changes.

Here is what I found while troubleshooting:
1. The Ethernet cable is not broken. The LED indicators on the machine and on the switch it is connected to are on.
2. I can ping the machine, but I can't access the web interface. See screenshot:
1721838013498.png
3. After rebooting, it always works for a few days.
1721838402184.png
4. It happens randomly.
5. The only thing that changed is the switch I'm using. I'm now using a Unifi Pro Max 16 switch, but there is nothing in its logs.

Does anyone else have this issue too?
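A host that answers ping but refuses the web interface is still alive at the network level, so it can help to test the individual TCP services as well. The following is a minimal sketch, assuming bash and the coreutils `timeout` command on the machine you test from; `port_open` and `check_host` are hypothetical helper names. Port 8006 is the Proxmox VE web interface and port 22 is SSH.

```shell
#!/usr/bin/env bash
# Sketch only: port_open and check_host are made-up helper names.

# Return 0 if a TCP connection to host:port succeeds within the timeout.
# Uses bash's built-in /dev/tcp redirection, so no extra tools are needed.
port_open() {
  local host="$1" port="$2" secs="${3:-3}"
  timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Report the state of SSH (22) and the Proxmox web interface (8006).
check_host() {
  local host="$1" port
  for port in 22 8006; do
    if port_open "$host" "$port"; then
      echo "port $port: open"
    else
      echo "port $port: closed or filtered"
    fi
  done
}
```

Running `check_host` with the address of the Proxmox host while the problem is occurring shows whether only the web interface is affected or SSH as well.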
 
When it goes offline, have you checked whether you can still reach it via SSH?
Do the VMs/containers that run on Proxmox also go offline or become unreachable after the interface becomes unreachable?
If you go to the Proxmox node/server (2nd item on the left) and then to the logs, are there any warnings in the last 5-10 minutes before the reboot?
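The same logs can also be read from a shell with `journalctl`. Below is a rough sketch, assuming GNU `date` and the systemd journal on the host; `log_window` and `warnings_before` are hypothetical helper names, and the timestamp you pass in is whatever time you noticed the outage or rebooted.

```shell
#!/usr/bin/env bash
# Sketch only: log_window and warnings_before are made-up helper names.

# Print "START|END" for a window of N minutes ending at the given timestamp.
# Relies on GNU date, which Debian-based systems ship.
log_window() {
  local end="$1" minutes="${2:-10}" end_epoch start_epoch
  end_epoch=$(date -d "$end" +%s) || return 1
  start_epoch=$((end_epoch - minutes * 60))
  printf '%s|%s\n' \
    "$(date -d "@$start_epoch" '+%Y-%m-%d %H:%M:%S')" \
    "$(date -d "@$end_epoch" '+%Y-%m-%d %H:%M:%S')"
}

# Show journal entries of priority "warning" or worse inside that window.
warnings_before() {
  local window start stop
  window=$(log_window "$1" "${2:-10}") || return 1
  start=${window%%|*}
  stop=${window##*|}
  journalctl --no-pager -p warning --since "$start" --until "$stop"
}
```

For example, `warnings_before "2024-07-24 16:39:02" 10` would list warnings from the ten minutes before that time. If the journal is stored persistently, `journalctl -b -1 -e` jumps to the end of the previous boot, which is usually the last thing logged before the reboot.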
 
Thanks for your reply, my answers are below:

1. All VMs become unreachable.
2. I did not try it via SSH yet.
3. I don't know if this is helpful, but when I go to the system logs on the node, I see the following around the time the machine became unreachable. I'm not an expert, so I don't really see an issue here.

But the strange thing I DID notice was the following:
Jul 24 16:39:02 proxmox pvedaemon[798088]: <root@pam> successful auth for user 'root@pam'
and then a reboot.

That time isn't possible. I wasn't home, and nobody other than me has access to Proxmox!
The logs are in the attachment.
 

Attachments

  • Jul 24 162604 proxmox postfixqmgr[9.txt
Those auths are most likely from automated systems, which also "log in" as root with the local private key.
One thing I do notice in the logs is that it seems to be trying to send you an email for... something... but because you're sending it from a .local address, it is getting blocked by Gmail.

Maybe try fixing your notification setup to see both what it is trying to send and whether that helps shed some light on things.
If you want to send with Gmail, maybe take a look at this:
https://gist.github.com/baudneo/04a6e8c42ec4e8f9705824165e6ee0f5
 
Thank you very much!! I will do this right away.
 
It appears that it tried to send me messages to notify me that backups have been created. No errors or anything like that.
 
Ah, shame. On the other hand, it could be that something is triggered during or by the backup-creation process.
Could you see if you can link the time it stops responding to the time of a backup or another backup-related action (cleanup of old backups, for example)?
 
No. I have set the backup to run every night at 00:00. One time I had a disconnection at 03:00, and the last one was around 16:45.

1722002985576.png
I really don't get why it does this. A few days have passed now and it runs OK. But it always disconnects at unexpected times.
 
Well, then let's just hope it's magically gone (for now)

If it does happen again though, a few things to check:
- Can you access it through SSH?
- If you can reach SSH or use the shell with an attached screen/keyboard, what are the responses to the following commands:
Code:
ip a
qm list
pct list
pvecm status
systemctl status pveproxy

Also one more thought before it happens again: could you check/show the server statistics from that period? Is something high or low, or is there a gap around the time you noticed it?
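To make the output of those commands easy to post, they could be captured into a single file. This is only a sketch: `collect_diag` is a hypothetical helper name and the default output path is an assumption. The commands themselves are the ones listed above.

```shell
#!/usr/bin/env bash
# Sketch only: collect_diag is a made-up helper name.

# Run each diagnostic command and write its output, with a header, to one file.
# Commands that are missing or fail are noted instead of aborting the run.
collect_diag() {
  local out="${1:-$HOME/diag-$(date +%Y%m%d-%H%M%S).txt}" cmd
  for cmd in "ip a" "qm list" "pct list" "pvecm status" \
             "systemctl status pveproxy --no-pager"; do
    printf '===== %s =====\n' "$cmd"
    if command -v "${cmd%% *}" >/dev/null 2>&1; then
      # Intentionally unquoted so the string splits into command and arguments.
      $cmd 2>&1 || echo "(exit status $?)"
    else
      echo "(command not found)"
    fi
    echo
  done > "$out"
  echo "$out"
}
```

Calling `collect_diag` without arguments prints the name of the file it wrote, which can then be attached to a reply.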
 
Hi, thanks for your reply.

It happened again! Normally it happens when I'm not at my desk, but this time I was.
I couldn't access it via SSH. After the reboot, I could access it via SSH again.

It does this for no apparent reason every few days...

1722587059023.png
 
It seems to me that some other device may be using the same IP address as Proxmox. When this happens, examine the ARP table to see whether the MAC address listed there matches that of the Proxmox host. If possible, show your network configuration from the /etc/network/interfaces file.
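One way to do that comparison from another Linux machine on the same network is sketched below. It assumes the iproute2 `ip` tool is available there; `parse_lladdr`, `neigh_mac`, and `same_mac` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch only: parse_lladdr, neigh_mac and same_mac are made-up helper names.

# Extract the MAC address (the field after "lladdr") from `ip neigh` output.
parse_lladdr() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "lladdr") { print tolower($(i + 1)); exit } }'
}

# Look up which MAC address this machine currently associates with an IP.
neigh_mac() {
  ip neigh show "$1" | parse_lladdr
}

# Compare two MAC addresses case-insensitively.
same_mac() {
  [ "$(printf '%s' "$1" | tr 'A-F' 'a-f')" = "$(printf '%s' "$2" | tr 'A-F' 'a-f')" ]
}
```

On the Proxmox host, `cat /sys/class/net/vmbr0/address` prints the MAC address of the bridge. If `neigh_mac` for the host's IP returns a different address while the problem is occurring, another device is probably answering for that IP.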
 
Maybe! I have had this issue since I upgraded my network.

auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.254/24
        gateway 192.168.1.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0

iface wlp0s20f3 inet manual


source /etc/network/interfaces.d/*


The IP address has always been reserved.
1722603408019.png
 
I've been fighting this problem for the last few days as well.
It does seem like a common problem lately.

My external services randomly become unreachable. I set up monitoring and am getting a lot of notifications, and I also experience it when I'm at home and connecting locally.
 
It happened again! This is the first time it has happened twice in one day!
I don't really know what the problem is. When I ping every machine I have in Proxmox, I get a reply back.
But I can't access them with RDP or SSH... In my Unifi Controller the devices don't seem to be offline either.

What I did notice is that the device I'm using is very, very hot! But that's pretty much it.
 
If it is hot and has been that hot for a while, it could also be some kind of thermal damage that's now starting to rear its head.
Do you have any way to at least slightly cool it down by either changing position (to a cooler/airier location) or adding airflow to it (pc or normal fan) to see if it happens less?
Do not use something freezing btw, the condensation from that could further damage things.
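To put a number on "hot", the temperatures the kernel exposes can be read from a shell on the host. A minimal sketch, assuming the machine exposes thermal zones under /sys/class/thermal (not every system does); `print_temps` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: print_temps is a made-up helper name.

# Print each thermal zone's type and temperature in degrees Celsius.
# The kernel reports the value in millidegrees, e.g. 45500 means 45.5 C.
print_temps() {
  local base="${1:-/sys/class/thermal}" zone milli kind
  for zone in "$base"/thermal_zone*; do
    [ -r "$zone/temp" ] || continue
    milli=$(cat "$zone/temp")
    kind=$(cat "$zone/type" 2>/dev/null || echo unknown)
    printf '%s: %d.%d C\n' "$kind" $((milli / 1000)) $(((milli % 1000) / 100))
  done
}
```

Running `print_temps` before and after moving the machine or adding a fan gives a rough comparison. The `sensors` command from the lm-sensors package reports similar readings in more detail if it is installed.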
 
That's possible! It happened again today, so I was really done with it. It was really hot to the touch; I couldn't keep my hand on it for 2 seconds. At first it was placed on top of another device. Because it happened again, I relocated the Proxmox device under my TV to connect it with an HDMI cable and see if something is happening.

I must say, it has really cooled down now. Since I relocated it under my TV it hasn't happened again, so if it does happen again I can maybe check it on the device itself. But when I go to the HDMI channel, the only thing I see is this.
I connected a keyboard and mouse to it, but it doesn't react to anything except CTRL+ALT+DEL (which reboots it).
I can't really remember if I'm supposed to see something here, because it has been a while since I did the installation.

1722700218793.jpeg
 
