Job status pve question

Maksimus

Member
May 16, 2022
We connected the storage to the server (hardware) via Fibre Channel and set up multipath, but any time we try to work with the storage, Proxmox goes into question-mark status.
The VMs inside continue to run normally, but none of the Proxmox tasks work: SSH is available, the console is also reachable via the GUI, but every other section fails with a timeout error.
(Attachment: Screenshot_20.png)

The contents of /etc/multipath.conf:
Code:
defaults {
    user_friendly_names yes
    find_multipaths yes
}

To get back to the normal green-checkmark status, we have to restart the server itself.
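For reference, this is roughly how we verify what multipath actually sees on the FC LUNs (a minimal sketch, assuming the standard multipath-tools package; device names will differ per setup):

Code:
# list the multipath maps and the state of each FC path
multipath -ll
# show the resulting /dev/mapper devices and anything (e.g. LVM) on top
lsblk
# confirm which configuration multipathd actually loaded
multipathd show config | head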
 
You should start by examining the logs via "journalctl -n 1000", and please include the output of "pveversion -verbose".
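For example (pvestatd is the daemon that feeds the status icons in the GUI; treating these as the relevant units is my assumption):

Code:
# current boot only, limited to the PVE daemons
journalctl -b -u pvestatd -u pvedaemon -u pveproxy -u pve-cluster
# or look further back in time
journalctl --since "2 hours ago" -u pvestatd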


The output of journalctl -n 1000 is in the attachment.

Code:
root@HOST800:~# pveversion -verbose
proxmox-ve: 7.4-1 (running kernel: 5.15.102-1-pve)
pve-manager: 7.4-3 (running version: 7.4-3/9002ab8a)
pve-kernel-5.15: 7.3-3
pve-kernel-5.15.102-1-pve: 5.15.102-1
pve-kernel-5.15.85-1-pve: 5.15.85-1
pve-kernel-5.15.74-1-pve: 5.15.74-1
ceph-fuse: 15.2.17-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4-2
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-4
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-1
libpve-rs-perl: 0.7.5
libpve-storage-perl: 7.4-2
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.3.3-1
proxmox-backup-file-restore: 2.3.3-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.6.4
pve-cluster: 7.3-3
pve-container: 4.4-3
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-1
pve-firewall: 4.3-1
pve-firmware: 3.6-4
pve-ha-manager: 3.6.0
pve-i18n: 2.11-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-1
qemu-server: 7.4-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.9-pve1
 


Your cluster seems to be either misconfigured or in some sort of bad state.
The last 1000 lines (-n 1000) are not enough to find when the issue started; you should look further back in the history and/or reboot to start a new log cycle.

I would also recommend employing a firewall if you must expose your PVE host to the internet.

At some point you may need to run "pvecm updatecerts --force", but I think it's too early to say when.

good luck


 
There are a few methods floating around on the web, for example https://easycomputertutorial.com/restart-proxmox-services/
However, I am not endorsing doing that - too many unknowns about your environment.
Output from the following commands might be useful, though by no means decisive:
pvecm nodes
pvecm status

You only mentioned a single node; is it the only one? Check the status with "systemctl | grep pve": has anything failed? The symptoms you have provided are too generic to determine what is broken with any certainty.
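Something along these lines would be a quick health check (a minimal sketch; I am assuming the standard service names shipped with PVE 7):

Code:
# list any failed units
systemctl --failed
# check the core PVE services explicitly
systemctl status pve-cluster pvedaemon pveproxy pvestatd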


 
It looks like SSH is the issue, or at least one of them.
Do an "ssh servername" from and to each server and confirm the connection with yes; the key will then be written to the authxxxx file. The next time you log in, no password or user input should be required. Please check that.

You have a lot of these:
Mar 29 15:47:35 HOST800 sshd[1780111]: pam_unix(sshd:auth): check pass; user unknown
Mar 29 15:47:35 HOST800 sshd[1780111]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=public ip
Mar 29 15:47:37 HOST800 sshd[1780111]: Failed password for invalid user admin from public ip port 2512 ssh2

Something seems not right with the user: are you using the user "admin"?
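For illustration, the check I mean looks like this (HOST800 is just the hostname from the logs above; on a cluster you would repeat it between every pair of nodes):

Code:
# log in from one node to the other; accept the host key with "yes" on the first connect
ssh root@HOST800 'hostname'
# a second login should then succeed without any password or prompt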
 
We do not use the admin user. And check SSH to which servers? This server is standalone, without a cluster (the storage is connected to it via Fibre Channel).
 
This morning I started testing again: I created an LVM disk on the storage and tried to migrate a VM onto it, but at the moment of the migration everything hung.
journal2.txt was captured immediately after the task hung (more precisely, after the GUI stopped updating information),
journal3.txt after the question marks appeared in the GUI.
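Next time it hangs I will also try to capture, from the still-working SSH session, whether the block/LVM layer itself is stuck (a rough sketch on my side, assuming standard LVM/device-mapper tooling, nothing PVE-specific):

Code:
# do the LVM commands return at all, or do they block?
pvs; vgs; lvs
# state of the multipath maps and device-mapper devices
multipath -ll
dmsetup info -c
# kernel complaints about hung I/O
dmesg | grep -i "blocked for more than"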
 

