Problem with Virtual Environment 5.3-5

dns173

Active Member
Dec 12, 2011
Here is the problem:

The server has been running for more than 400 days.

Today, KVM guests are working normally, but LXC containers cannot be started. After logging in to the web management interface, everything is displayed as "?".

I restarted the relevant services with the commands below. After logging in to the web management interface again, KVM guests can be managed, but LXC containers still cannot be managed or used. Note: this machine is not in a cluster; it is a single standalone server.



service pvestatd stop
service pvedaemon stop
service cman stop
service pve-cluster stop

service pvestatd start
service pvedaemon start
service cman start
service pve-cluster start
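
(For reference, a minimal sketch of the equivalent native systemd calls on PVE 5.x; cman has no systemd unit on PVE 4.0 and later, so it is left out:)

# restart the PVE services natively via systemd
systemctl restart pve-cluster pvedaemon pvestatd
# verify they came back up
systemctl status pve-cluster pvedaemon pvestatd --no-pager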

I do not dare to restart the server right now. Is there any other way to solve this problem? Please give me some guidance.


Package versions:

proxmox-ve: 5.3-1 (running kernel: 4.15.18-9-pve)
pve-manager: 5.3-5 (running version: 5.3-5/97ae681d)
pve-kernel-4.15: 5.2-12
pve-kernel-4.15.18-9-pve: 4.15.18-30
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-3
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-43
libpve-guest-common-perl: 2.0-18
libpve-http-server-perl: 2.0-11
libpve-storage-perl: 5.0-33
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.0.2+pve1-5
lxcfs: 3.0.2-2
novnc-pve: 1.0.0-2
proxmox-widget-toolkit: 1.0-22
pve-cluster: 5.0-31
pve-container: 2.0-31
pve-docs: 5.3-1
pve-edk2-firmware: 1.20181023-1
pve-firewall: 3.0-16
pve-firmware: 2.0-6
pve-ha-manager: 2.0-5
pve-i18n: 1.0-9
pve-libspice-server1: 0.14.1-1
pve-qemu-kvm: 2.12.1-1
pve-xtermjs: 1.0-5
qemu-server: 5.0-43
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.12-pve1~bpo1
 

Attachments

  • pv1.png (36 KB)
Hi,

First, semi-related: Proxmox VE 5.x has been EOL since July 2020:
https://pve.proxmox.com/pve-docs/chapter-pve-faq.html#faq-support-table

But you are also not on the latest 5.x release, which would be 5.4.

Please update first to the latest minor release, and then plan an upgrade to the supported PVE 6.x:
https://pve.proxmox.com/wiki/Upgrade_from_5.x_to_6.0
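
A minimal sketch of that in-place update, assuming the apt repositories are already configured (enterprise or pve-no-subscription):

apt update
apt dist-upgrade    # pulls in the latest 5.4 packages
pveversion -v       # verify the resulting package versions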

service cman
FYI: Proxmox VE has not used a cman service since PVE 4.0.
I do not dare to restart the server right now. Is there any other way to solve this problem? Please give me some guidance.
Can you check the syslog? Either over the web interface, if it is still working, or with journalctl or less /var/log/syslog when connecting to that server via SSH.

Any odd message or error would be interesting, to see what the actual problem here is.
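
For example, a minimal sketch using the standard PVE 5.x unit names:

# recent entries for the services involved
journalctl -u pve-cluster -u pvedaemon -u pvestatd --since "1 hour ago"
# or page through the classic syslog file
less /var/log/syslog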
 
We want to back up the data in the currently failed LXC containers and then restart the server or upgrade the PVE version. How can we do that? The situation is somewhat risky, so we don't dare to do it yet.
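
(For reference, the standard PVE backup tool here would be vzdump; a minimal sketch, where the CT ID 109 and the storage name "local" are placeholders:)

# stop-mode backup of CT 109 to the storage named "local"
vzdump 109 --mode stop --storage local --compress lzo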
 
At present, LXC does not work properly. How can this problem be solved?
Yes, you repeatedly stated that.
Can you please check for the information requested, as we cannot guess what's wrong without any log (error) messages:
Can you check the syslog? Either over the web interface, if it is still working, or with journalctl or less /var/log/syslog when connecting to that server via SSH.
 
I uploaded two log files and hope there are no unusual error messages. Please take a look. Thank you!
 
Feb 15 06:25:04 vps12 liblogging-stdlog: [origin software="rsyslogd" swVersion="8.24.0" x-pid="1154" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Feb 15 06:25:04 vps12 spiceproxy[1963]: restarting server
Feb 15 06:25:04 vps12 spiceproxy[1963]: starting 1 worker(s)
Feb 15 06:25:04 vps12 spiceproxy[1963]: worker 2166 started
Feb 15 06:25:04 vps12 pveproxy[1933]: restarting server
Feb 15 06:25:04 vps12 pveproxy[1933]: starting 3 worker(s)
Feb 15 06:25:04 vps12 pveproxy[1933]: worker 2169 started
Feb 15 06:25:04 vps12 pveproxy[1933]: worker 2170 started
Feb 15 06:25:04 vps12 pveproxy[1933]: worker 2171 started
Feb 15 06:25:09 vps12 spiceproxy[22194]: worker exit
Feb 15 06:25:09 vps12 spiceproxy[1963]: worker 22194 finished
Feb 15 06:25:09 vps12 pveproxy[22197]: worker exit
Feb 15 06:25:09 vps12 pveproxy[22195]: worker exit
Feb 15 06:25:09 vps12 pveproxy[22196]: worker exit
Feb 15 06:25:09 vps12 pveproxy[1933]: worker 22197 finished
Feb 15 06:25:09 vps12 pveproxy[1933]: worker 22195 finished
Feb 15 06:25:09 vps12 pveproxy[1933]: worker 22196 finished
Feb 15 06:26:00 vps12 systemd[1]: Starting Proxmox VE replication runner...
Feb 15 06:26:01 vps12 systemd[1]: Started Proxmox VE replication runner.
Feb 15 06:27:00 vps12 systemd[1]: Starting Proxmox VE replication runner...
Feb 15 06:27:01 vps12 systemd[1]: Started Proxmox VE replication runner.
Feb 15 06:28:00 vps12 systemd[1]: Starting Proxmox VE replication runner...
Feb 15 06:28:01 vps12 systemd[1]: Started Proxmox VE replication runner.
Feb 15 06:29:00 vps12 systemd[1]: Starting Proxmox VE replication runner...
Feb 15 06:29:01 vps12 systemd[1]: Started Proxmox VE replication runner.
Feb 15 06:30:00 vps12 systemd[1]: Starting Proxmox VE replication runner...
Feb 15 06:30:01 vps12 systemd[1]: Started Proxmox VE replication runner.
Feb 15 06:31:00 vps12 systemd[1]: Starting Proxmox VE replication runner...
Feb 15 06:31:01 vps12 systemd[1]: Started Proxmox VE replication runner.
Feb 15 06:32:00 vps12 systemd[1]: Starting Proxmox VE replication runner...


That's basically all of the messages.
 
This is what is prompted when starting the LXC container:


CT 109 already running (500)

TASK ERROR: you can't start a CT if it's a template
 
This issue has been resolved. Our solution was to restart the server; this is probably a bug on your side. This kind of problem can occur after more than a year of operation. Earlier versions, 2.1 and the like, have had similar issues. So the restart is done.
:eek:
 
TASK ERROR: you can't start a CT if it's a template
Yeah, that's no bug, that's by design: you tried to start a template, which will never work.
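
If the CT really was converted to a template, the usual way to get a runnable instance back is to clone it; a minimal sketch, assuming 110 is a free VMID and ct109-copy is a hypothetical hostname:

# clone the template into a new, startable container
pct clone 109 110 --hostname ct109-copy
pct start 110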

I checked your syslog and there's no obvious message or anything, so definitely strange.

Our solution was to restart the server; this is probably a bug on your side.
It shouldn't be required; we fixed all known issues of that kind a few years ago, when we had an enterprise customer with more than 600 days of uptime. But with no error messages in the log it's hard to know what broke. For what it's worth, it may even work better in the 6.x releases.

This kind of problem can occur after more than a year of operation.
I'd advise updating more frequently and rebooting when there has been a kernel update. That not only avoids such problems but also integrates important bug and security fixes into your system.
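
For example, a quick sketch to check whether a reboot is still pending after a kernel update, by comparing the running kernel with the installed pve-kernel packages:

uname -r                           # kernel currently running
dpkg -l 'pve-kernel-*' | grep ^ii  # pve kernel packages installed on disk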

Anyway, glad you could solve it.
I'd recommend planning an update to the latest 5.4 minor release first, and then an upgrade to the currently supported PVE 6.x: https://pve.proxmox.com/wiki/Upgrade_from_5.x_to_6.0
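
Once on 5.4, the packaged pve5to6 checklist script can be run to flag known blockers before the major upgrade:

pve5to6    # prints pass/warn/fail checks for the 5-to-6 upgrade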
 
