All VMs are down

rukverc

Hi mates,
we have a working Proxmox server, but something weird happened today.
We found all the VMs stopped: something had tried to start all of them earlier, but failed.
I attached the syslog and the task log, and I hope you can tell me where to start debugging.
Thanks for your help.
 

Attachments

  • px_syslog.png (315.2 KB)
  • task_log.png (35.1 KB)
There might be multiple problems. What is the output of the following commands?
Code:
systemctl status pvesr
journalctl -xe
apt update
 
Hi Dominic,
here are the answers:
Code:
root@pve:/var/log# systemctl status pvesr
● pvesr.service - Proxmox VE replication runner
   Loaded: loaded (/lib/systemd/system/pvesr.service; static; vendor preset: enabled)
   Active: inactive (dead) since Thu 2020-02-06 10:36:00 CET; 12s ago
  Process: 31950 ExecStart=/usr/bin/pvesr run --mail 1 (code=exited, status=0/SUCCESS)
 Main PID: 31950 (code=exited, status=0/SUCCESS)
      CPU: 661ms

Feb 06 10:36:00 pve systemd[1]: Starting Proxmox VE replication runner...
Feb 06 10:36:00 pve systemd[1]: Started Proxmox VE replication runner.
root@pve:/var/log#


Code:
root@pve:~# apt update
Ign:1 http://deb.debian.org/debian stretch InRelease
Ign:2 http://ftp.hu.debian.org/debian stretch InRelease                     
Hit:3 http://ftp.hu.debian.org/debian stretch-updates InRelease             
Hit:4 http://security.debian.org stretch/updates InRelease
Hit:5 http://deb.debian.org/debian stretch Release
Hit:6 http://ftp.hu.debian.org/debian stretch Release
Ign:8 https://enterprise.proxmox.com/debian/pve stretch InRelease
Err:10 https://enterprise.proxmox.com/debian/pve stretch Release
  401  Unauthorized
Reading package lists... Done
E: The repository 'https://enterprise.proxmox.com/debian/pve stretch Release' does not have a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
root@pve:~#

The "journalctl -xe" output is in the attached txt file.

Thank you for your help.

rukverc
 

Attachments

  • aa.txt (97.9 KB)
This may have nothing to do with the error, but you are missing the Proxmox updates for non-subscribers. You still have the enterprise repository selected:

Code:
Ign:8 https://enterprise.proxmox.com/debian/pve stretch InRelease
Err:10 https://enterprise.proxmox.com/debian/pve stretch Release
  401  Unauthorized

Here you can find out how to change that: https://pve.proxmox.com/wiki/Package_Repositories#sysadmin_no_subscription_repo
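
For reference, the change described on that wiki page can be sketched roughly as below. This is only a sketch under assumptions: the function name switch_to_no_subscription and the file name pve-no-subscription.list are made up for illustration, and "stretch" is the Debian release shown in the apt output above.

```shell
#!/bin/sh
# Sketch: disable the enterprise repository and add the no-subscription one.
# switch_to_no_subscription and pve-no-subscription.list are illustrative
# names, not something Proxmox VE ships.
switch_to_no_subscription() {
    sources_dir="$1"   # normally /etc/apt/sources.list.d
    codename="$2"      # Debian release name, "stretch" in this thread

    enterprise="$sources_dir/pve-enterprise.list"
    if [ -f "$enterprise" ]; then
        # Comment out the active enterprise entry so apt stops getting 401.
        sed 's|^deb https://enterprise.proxmox.com|# &|' "$enterprise" \
            > "$enterprise.tmp" && mv "$enterprise.tmp" "$enterprise"
    fi

    # Add the no-subscription repository in a separate file.
    echo "deb http://download.proxmox.com/debian/pve $codename pve-no-subscription" \
        > "$sources_dir/pve-no-subscription.list"
}
```

On the host this would be called as `switch_to_no_subscription /etc/apt/sources.list.d stretch`, followed by `apt update`. Back up the directory first if you are unsure.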
 
Thank you Semmo,
I just changed the line in /etc/apt/sources.list.d/pve-enterprise.list that pointed to the enterprise source.
I got 32 updates: see the picture.

apt_new.png

I hope this solves the problem and that it never happens again.
 
Updating is certainly good.

To be sure: you saw the call to the startall function of our API (at nodes/{node}/startall) in your task log, but you are not sure what executed it?
 
Dominic,
first, I am trying to find out why 6 minutes are missing from the system log.
Second, why didn't all the VMs start after the command (maybe issued by PVE)?

We have mail, database, web and other servers running on this Proxmox host, and I just want to be sure these power-downs never happen again.
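
One way to pin down the missing minutes is to scan the log for jumps between consecutive timestamps. A minimal sketch, assuming the traditional syslog line format seen in this thread ("Feb 06 10:36:00 pve ..."); the function name find_log_gaps and the 300-second default are made up for illustration, and gaps that cross midnight are not detected.

```shell
#!/bin/sh
# find_log_gaps FILE [MIN_SECONDS]
# Prints every pause of at least MIN_SECONDS (default 300) between two
# consecutive log lines written on the same day of the month.
find_log_gaps() {
    awk -v min="${2:-300}" '
    {
        # Field 3 is HH:MM:SS in the traditional syslog format.
        split($3, t, ":")
        now = t[1] * 3600 + t[2] * 60 + t[3]
        if (NR > 1 && $2 == prev_day && now - prev >= min)
            printf "gap of %d s between %s and %s\n", now - prev, prev_ts, $3
        prev = now; prev_day = $2; prev_ts = $3
    }' "$1"
}
```

For example, `find_log_gaps /var/log/syslog 300` lists every pause of five minutes or more. The lines just before and after each gap are usually the interesting ones.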
 
Is your Proxmox VE server in a cluster? On a standalone host the replication runner does not do much. Consequently, I'd (also) take a look at the other stuff happening around the 6 minute wait (udev, inserting modules).

By default startall starts only those guests where onboot is set to true. There are two possible reasons why a VM does not start after a call to startall:
  1. onboot is set false in the guest configuration and the force parameter for the startall call has not been set.
    In this case Proxmox VE does not even try to start the guest. There is neither a hint in the Task Viewer (double click in the GUI) of the startall call nor a task ("VM X - Start") in the task protocol (in the GUI at the bottom).
  2. Proxmox VE tried to start the guest but it failed.
    In this case there is a hint in the Task Viewer ("Starting VM X") and a new task ("VM X - Start") with an error message in the task protocol.
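
To check the first case quickly, the onboot setting can be read from each guest's configuration. A small sketch, assuming that `qm config <vmid>` prints an "onboot: 1" line when the option is enabled and omits it (or prints 0) otherwise; the helper names onboot_state and report_onboot are hypothetical.

```shell
#!/bin/sh
# onboot_state: reads a guest configuration (the output of "qm config <vmid>")
# on stdin and prints "enabled" if it contains "onboot: 1", else "disabled".
onboot_state() {
    if grep -q '^onboot: *1$'; then
        echo enabled
    else
        echo disabled
    fi
}

# report_onboot: prints the onboot state of every VM on this node.
# Only defined here, not called; it needs the qm tool from Proxmox VE.
report_onboot() {
    # "qm list" prints a header row first, so skip line 1.
    for vmid in $(qm list | awk 'NR > 1 {print $1}'); do
        printf '%s: %s\n' "$vmid" "$(qm config "$vmid" | onboot_state)"
    done
}
```

Running `report_onboot` on the host shows which guests startall would skip unless the force parameter is set.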
 
Hi,
sorry to say, but it happened again.
Thanks to your efforts the VMs were restarted and the downtime was only 7 minutes, but the restart problem still exists.
Any idea where to start debugging?
 

Attachments

  • daemon_s.png (371.5 KB)
  • auth_log_s.png (307.5 KB)
  • system_s.png (355 KB)
  • px_10_min_blank.png (94.7 KB)
Does the system restart or is it being reset by something?
The latter can have several causes (some ideas):
  • BIOS watchdog that somehow decides the system is stuck and issues a reset
  • Power outage (is the system attached to a UPS?)
  • Bad power supply (how old is it?)
  • Overheating of the system (unlikely given the very low load, and Intel CPUs typically clock down instead of resetting, but who knows...)
  • People in the "datacenter" need a plug - and remove a cord or two (in this case a wrong one)
    • Other people that want to attach a vacuum cleaner and don't find a plug (unlikely though with less than 7 minutes)
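
To tell a host reset apart from guests simply stopping, it helps to see when the kernel actually booted. A rough sketch, assuming rsyslog writes kernel messages to files such as /var/log/kern.log or /var/log/syslog, and that each boot starts with the usual "Linux version ..." kernel line; the function name boot_times is made up for illustration.

```shell
#!/bin/sh
# boot_times FILE...
# Prints the timestamp (month, day, time) of every kernel boot banner
# found in the given log files.
boot_times() {
    grep -h 'kernel: .*Linux version' "$@" | awk '{print $1, $2, $3}'
}
```

For example, `boot_times /var/log/kern.log`. A boot timestamp right after a gap in the log, with no shutdown messages before the gap, points to a hard reset or power loss rather than a clean reboot. `last -x reboot shutdown` gives a similar view from the login records.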
 
Does the system restart or is it being reset by something?
@rukverc Is the Proxmox VE host going down or only the virtual machines?
Do the VMs shut themselves down, or is the host doing this? If the host is responsible, you will see "VM X - Stop" in the task protocol in the GUI. If you open the Task Viewer you can find the unique task ID. You can grep for it in /var/log; you should find hits in pve/tasks/index, pve/tasks/active, syslog, daemon.log and auth.log.
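
The grep step could look something like this. It is just a sketch; the function name find_task is hypothetical, and the task ID has to be copied from the Task Viewer.

```shell
#!/bin/sh
# find_task LOGDIR TASK_ID
# Lists every file below LOGDIR that mentions TASK_ID.
# -F treats the ID as a fixed string, -r searches recursively,
# -l prints only the names of matching files.
find_task() {
    grep -rlF -- "$2" "$1" 2>/dev/null | sort
}
```

Usage would be `find_task /var/log 'UPID:...'` with the full ID in place of the dots. Running `grep -F` with the same ID on each listed file then shows the matching lines and their timestamps.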
 
Hi Dominic,
it happened again yesterday.
There is no 'stop' in the logs or in the task viewer.
 
Did you already take a look at the syslogs in the virtual machines?
 
