[SOLVED] Proxmox node is offline but everything else is running

fcarucci

Hello, I have a 5-node Proxmox cluster with Ceph running on 4 of the nodes.
The web UI of one node cannot be accessed (the node shows as offline), but I can ssh into the node, all VMs are running, the Ceph monitor is up, and everything else seems to be working fine.
I can ping the IP the UI is supposed to be running on. I disabled the cluster firewall. Ceph is running fine with no errors.

pveproxy seems to be running:
Code:
  22582 ?        S      0:00 pveproxy worker
  22583 ?        S      0:00 pveproxy worker
  22587 ?        S      0:00 pveproxy worker
Cluster status looks OK:
Code:
Cluster information
-------------------
Name:             Slapdash
Config Version:   30
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Sat Dec 14 13:28:49 2024
Quorum provider:  corosync_votequorum
Nodes:            5
Node ID:          0x00000001
Ring ID:          1.5520
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   5
Highest expected: 5
Total votes:      5
Quorum:           3
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.10.0.1 (local)
0x00000002          1 10.10.0.2
0x00000004          1 10.10.0.4
0x00000005          1 10.10.0.5
0x00000006          1 10.10.0.6

The only substantial difference between this node and the others is that I updated this node to the latest kernel version this morning (6.8.12-5).
What else can I try? Thanks!
 
Does the web UI work again after a "systemctl restart pveproxy"? If not, migrate the VMs and reboot the host.
 
I tried rebooting the host.

I just found this big clue
Code:
Dec 14 13:44:10 pve pveproxy[38402]: unable to open log file '/var/log/pveproxy/access.log' ->
Dec 14 13:44:10 pve pveproxy[38403]: unable to open log file '/var/log/pveproxy/access.log'
 
Ok, I fixed it. If anyone has the same problem, here's what happened.

I was running out of space and mindlessly did an rm -rf of the contents of /var/log, which is probably not a smart thing to do in general.
In this case, it looks like pveproxy fails if it can't find its folder under /var/log (/var/log/pveproxy, per the error above).
I fixed it by recreating the folder and setting the right permissions.
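
For anyone who wants the fix spelled out, here is a rough sketch. The helper name ensure_log_dir is made up for illustration, and the owner (www-data:www-data) and mode (0700) are assumptions about a stock install, not something confirmed in this thread. If you have a healthy node in the cluster, check what it uses with "stat -c '%U:%G %a' /var/log/pveproxy" and copy those values instead.

```shell
#!/bin/sh
# Sketch: recreate a missing service log directory.
# ensure_log_dir is a hypothetical helper, not part of Proxmox.
ensure_log_dir() {
    dir=$1      # directory to create, e.g. /var/log/pveproxy
    owner=$2    # user:group, e.g. www-data:www-data (assumed)
    mode=$3     # permission bits, e.g. 0700 (assumed)

    mkdir -p "$dir" || return 1
    chmod "$mode" "$dir" || return 1

    # Changing ownership needs root; skip it when run unprivileged.
    if [ "$(id -u)" -eq 0 ]; then
        chown "$owner" "$dir" || return 1
    fi
}

# On the affected node, as root (values are assumptions, see above):
#   ensure_log_dir /var/log/pveproxy www-data:www-data 0700
#   systemctl restart pveproxy
```

After the restart, the "unable to open log file" messages should stop appearing in the journal if the missing directory was the only problem.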

I would suggest adding some code to pveproxy to create the log folder if it doesn't exist.

Thanks for your help!
 
So your "/" is too small or you have too much other data ... "/" should never run full :)
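
If "/" does fill up again, a gentler option than removing everything under /var/log is to empty the log files and leave the directories in place, so services still find their folders. A minimal sketch follows. The helper name empty_logs is made up for illustration, and it only touches files ending in .log, so rotated or compressed logs are left alone.

```shell
#!/bin/sh
# Sketch: free space by emptying *.log files without deleting directories.
# empty_logs is a hypothetical helper name.
empty_logs() {
    # -type f: regular files only, so directories are never touched.
    # truncate -s 0 sets each matching file to zero length in place.
    find "$1" -type f -name '*.log' -exec truncate -s 0 {} +
}

# On a node, as root:
#   empty_logs /var/log
```

Because the files are truncated rather than removed, a daemon that already has one open keeps writing to the same file.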
 
