[SOLVED] Can't create new VMs

FerryCliment

New Member
Apr 1, 2020
Hello.

First things first: a few days back I had a problem with the certs that affected the GUI. I have no idea what happened, but the VMs inside Proxmox were running just fine and SSH still worked, which hinted that it was a web-related issue.

Apparently pveproxy-ssl.pem got corrupted or was somehow invalid or missing. More than likely it was something I did, or someone from my team did (Proxmox is our development VE; people shouldn't be messing with Proxmox itself, but I can't guarantee it), though I'm not sure what or how it got messed up that badly. I mean, I know the disclaimer from the Proxmox wiki.

In the end it was solved with "pvecm updatecerts --force" and "systemctl restart pveproxy", which gave me access to the console via the web again.

I logged back in and checked my VMs; everything seemed fine. (VM performance was never an issue, which was the hint that led me to think it was web-related: it had to be something with the certs or the HTTP daemon.) The thing is, access to local (I can see the ISOs under local/Content) and local-lvm still gives me problems.

Summary -> Connection timeout (596) (Both summary)
Content -> Communication Failure (0) (Local-LVM content)

This causes the actual problem: I can't create new VMs, because the web GUI doesn't find the storage to place the VM disk on.
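One way to check from the shell whether the storages themselves are healthy, independent of the web GUI, is `pvesm status`. Below is a minimal sketch, assuming the usual column layout of that command (name, type, status, then the size columns); the helper name `inactive_storages` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `pvesm status` output on stdin and print the
# name of every storage whose Status column is not "active".
# Assumes the header line comes first and Status is the third column.
inactive_storages() {
    awk 'NR > 1 && $3 != "active" { print $1 }'
}

# Usage on a PVE node (prints nothing if every storage is active):
#   pvesm status | inactive_storages
```

If this prints nothing while the GUI still shows timeouts, the storage layer is probably fine and the fault is more likely between the GUI and the API daemons.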

I checked some of the status commands that I know via the console, as shown in the attached files. This information leads me to think that there is still some similar issue with certificates, trust or communication between the web GUI, Proxmox and its storage.

(I'm not confident enough to create a VM through the CLI and do it correctly), but I'm fairly sure the functionality is there; it has to be some similar error.
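For anyone in the same spot, creating a VM from the CLI is done with `qm create`. The sketch below only prints the command so it can be reviewed before running it. The VMID 900, the name `testvm`, the memory and core counts and the bridge `vmbr0` are example values, not taken from this setup; `local-lvm:32` asks for a new 32 GiB disk on the local-lvm storage.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a `qm create` command line so it can
# be reviewed first. VMID, name, sizes and bridge are example values.
print_qm_create() {
    vmid=$1      # an unused VM ID, e.g. 900
    name=$2      # VM name
    disk_gib=$3  # size of the new disk on local-lvm, in GiB
    echo "qm create $vmid --name $name --memory 2048 --cores 2" \
         "--net0 virtio,bridge=vmbr0 --scsihw virtio-scsi-pci" \
         "--scsi0 local-lvm:$disk_gib"
}

print_qm_create 900 testvm 32
```

If the printed command works when run as root on the node, disk allocation on local-lvm is fine and the problem is limited to the web GUI path.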

Does anyone have an idea how to fix this?

Single node: 28 cores, 100 RAM, 4 TB storage. HP ProLiant G8.
 

Attachments

  • firefox_qEDJCZY75M.png
  • ConEmu64_kFQKM4YWxI.png
hi,

is it a standalone node, or is this a cluster?

can you check journalctl and cat /var/log/syslog for error messages and post them here?
 
Hello Oguz, thanks for the quick response.

Yes, it's a single node, no cluster. As mentioned before, it's our dev virtual environment; we use it internally in our department.


Code:
May 26 15:42:17 proxmox pveproxy[58147]: /etc/pve/local/pveproxy-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1699.
May 26 15:42:17 proxmox pveproxy[58148]: /etc/pve/local/pveproxy-ssl.key: failed to load local private key (key_file or key) at /usr/share/perl5/PVE/APIServer/AnyEvent.pm line 1699.
May 26 15:42:21 proxmox pvecm[58168]: got inotify poll request in wrong process - disabling inotify

Based on the time frame, this is the error that kept me from accessing the web GUI. It doesn't show up anymore; I'm including it just for the record.

Code:
root@proxmox:~# journalctl -p 3 --since="today"
-- Logs begin at Mon 2020-04-13 02:50:07 CEST, end at Wed 2020-05-27 13:40:00 CEST. --
May 27 05:21:42 proxmox pveupdate[36400]: command 'apt-get update' failed: exit code 100
May 27 05:21:42 proxmox pveupdate[36389]: <root@pam> end task UPID:proxmox:00008E30:16B6AB2F:5ECDDCC4:aptupdate::root@pam: command 'apt-get update' failed: exit code 100
May 27 10:07:38 proxmox pvecm[9550]: got inotify poll request in wrong process - disabling inotify
May 27 10:15:29 proxmox pvedaemon[10637]: command '/usr/bin/termproxy 5900 --path /nodes/proxmox --perm Sys.Console -- /bin/login -f root' failed: exit code 4
May 27 10:15:29 proxmox pvedaemon[21267]: <root@pam> end task UPID:proxmox:0000298D:16D18DC1:5ECE2197:vncshell::root@pam: command '/usr/bin/termproxy 5900 --path /nodes/proxmox --perm Sys.Console -- /bin/login -f root' failed: exit code 4
May 27 10:35:23 proxmox pvecm[13413]: got inotify poll request in wrong process - disabling inotify
May 27 10:40:13 proxmox pveproxy[14188]: got inotify poll request in wrong process - disabling inotify
May 27 11:37:51 proxmox pvedaemon[21846]: connection timed out
May 27 11:37:51 proxmox pvedaemon[21313]: <root@pam> end task UPID:proxmox:00005556:16D91817:5ECE34E5:vncproxy:204:root@pam: connection timed out
May 27 12:06:48 proxmox pveproxy[25815]: got inotify poll request in wrong process - disabling inotify
May 27 12:40:23 proxmox pveproxy[30388]: got inotify poll request in wrong process - disabling inotify

In /var/log/syslog, the only notable thing I can see that isn't in the journal is:

Code:
May 27 12:23:57 proxmox pveproxy[25816]: proxy detected vanished client connection
May 27 12:23:58 proxmox pveproxy[25816]: proxy detected vanished client connection
May 27 12:23:58 proxmox pveproxy[13621]: proxy detected vanished client connection

Besides the system-related stuff there is this entry. 204 is a test machine belonging to a colleague in the department; I don't think it has much to do with the issue with Proxmox as a service.

Code:
May 27 11:37:51 proxmox pvedaemon[21313]: <root@pam> end task UPID:proxmox:00005556:16D91817:5ECE34E5:vncproxy:204:root@pam: connection timed out
 

Attachments

  • firefox_RQat1ES57p.png
could you try rebooting the node?
 
Unfortunately not right now.

One of the hosts is running a test for one of our team members; the ETA is approx. 20h.

Either way (whether or not I can fix it without rebooting), once I have a window to reboot it, I will do it.
 
instead of rebooting you can try the following: systemctl restart pveproxy pvedaemon pve-cluster pvestatd
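A restart alone does not tell you whether each service actually came back up. The sketch below restarts the services one by one and reports any that are not active afterwards. The function name and the `SYSTEMCTL` override variable are made up for illustration; `systemctl restart` and `systemctl is-active --quiet` are the standard systemd calls.

```shell
#!/bin/sh
# Restart the given services and report which ones are not active afterwards.
# SYSTEMCTL can be overridden (e.g. for a dry run); it defaults to systemctl.
restart_and_check() {
    ctl=${SYSTEMCTL:-systemctl}
    failed=""
    for svc in "$@"; do
        "$ctl" restart "$svc"
        # `is-active --quiet` exits 0 only when the unit is active
        if ! "$ctl" is-active --quiet "$svc"; then
            failed="$failed $svc"
        fi
    done
    echo "not active:${failed:- none}"
}

# Usage on the node:
#   restart_and_check pveproxy pvedaemon pve-cluster pvestatd
```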
 
Done, same issue.

Code:
May 27 14:28:36 proxmox pvedaemon[45326]: command '/usr/bin/termproxy 5900 --path /nodes/proxmox --perm Sys.Console -- /bin/login -f root' failed: exit code 255
May 27 14:28:36 proxmox pvedaemon[44804]: <root@pam> end task UPID:proxmox:0000B10E:16E8BDC9:5ECE5CF3:vncshell::root@pam: command '/usr/bin/termproxy 5900 --path /nodes/proxmox --perm Sys.Console -- /bin/login -f root

This is the error that appears after those restarts.
 

Attachments

  • firefox_wEhqkoio1j.png
  • firefox_Cul5t3qYmO.png
Do you think it could be something related to the SSL certificate? If I delete the .pem files in /etc/pve, is there a way to regenerate them? I'm fairly sure it has to be related to the problem I had yesterday. I would like to regenerate those certificates; I just don't know whether moving them to .pem.bak and doing a systemctl restart would be enough to regenerate them and hopefully restore the connection between the GUI and local storage.
 
you can try again to do pvecm updatecerts -f ... just to be sure can you post the contents of cat /etc/hosts ??
 
Sorry for being a pain in the ass, especially after having problems because something that the wiki clearly says not to touch... was touched.

The hosts content is in the attached files.

My idea is that somehow we fucked up the SSL certs and the system between the GUI, Proxmox and storage (without affecting plain SSH or the functionality of the VMs themselves).

pvecm updatecerts -f doesn't address what I think is the problem. Reading the forum, I saw a post from someone saying that updatecerts causes problems if the file already exists (in his case, as I think in mine, the file is there but clearly not working), especially since ls -la shows different dates, for the .key and .pem in particular. (Maybe what I did yesterday to solve the first issue with the web GUI created a subsequent problem because the certs are out of sync?)
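Different file dates alone do not prove that a certificate and key are out of sync. A more direct check is to compare the public key embedded in the certificate with the public key derived from the private key. This is a minimal sketch using the standard `openssl x509` and `openssl pkey` subcommands; the function name is made up, and the paths in the usage comment are the ones mentioned in this thread.

```shell
#!/bin/sh
# Succeeds (exit 0) when the certificate and the private key share the same
# public key, i.e. they belong together. Returns 2 if either file is unreadable.
cert_matches_key() {
    cert=$1
    key=$2
    cert_pub=$(openssl x509 -in "$cert" -noout -pubkey) || return 2
    key_pub=$(openssl pkey -in "$key" -pubout) || return 2
    [ "$cert_pub" = "$key_pub" ]
}

# Usage on the node:
#   cert_matches_key /etc/pve/local/pveproxy-ssl.pem /etc/pve/local/pveproxy-ssl.key \
#       && echo "pair matches" || echo "pair does NOT match"
```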

If I delete them (or move them to .pem.bak / .crt.bak?) and restart the service, or reboot when I can, will the certs be recreated? (And hopefully all at once?)

Adding files to support the theory.
 

Attachments

  • firefox_TaMOCy19c0.png
  • firefox_rO0ZZ5cXsP.png
  • ConEmu64_seIxwVSM3O.png
i'm not sure but you can try reinstalling the package pve-manager like this: apt install pve-manager --reinstall. also please make sure you are on the latest package versions (what is your pveversion -v output?)

i'm also suspecting your storage configuration, so the contents of /etc/pve/storage.cfg could be useful.

would also help if you could attach the journal of a larger timeframe here
 
Hello Oguz.

pveversion -v

Code:
root@proxmox:~# pveversion -v
proxmox-ve: 6.1-2 (running kernel: 5.3.10-1-pve)
pve-manager: 6.1-3 (running version: 6.1-3/37248ce6)
pve-kernel-5.3: 6.0-12
pve-kernel-helper: 6.0-12
pve-kernel-5.3.10-1-pve: 5.3.10-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.2-pve4
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 1.2.5-1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.13-pve1
libpve-access-control: 6.0-5
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-9
libpve-guest-common-perl: 3.0-3
libpve-http-server-perl: 3.0-3
libpve-storage-perl: 6.1-2
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve3
lxc-pve: 3.2.1-1
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-1
pve-cluster: 6.1-2
pve-container: 3.0-14
pve-docs: 6.1-3
pve-edk2-firmware: 2.20191002-1
pve-firewall: 4.0-9
pve-firmware: 3.0-4
pve-ha-manager: 3.0-8
pve-i18n: 2.0-3
pve-qemu-kvm: 4.1.1-2
pve-xtermjs: 3.13.2-1
qemu-server: 6.1-2
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.2-pve2

Contents of storage.cfg

Code:
root@proxmox:~# cat /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content iso,vztmpl,backup

lvmthin: local-lvm
        thinpool data
        vgname pve
        content rootdir,images

More bits of info regarding my storage

Code:
root@proxmox:~# lsblk /dev/sda
NAME                                   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                                      8:0    0  4.1T  0 disk 
├─sda1                                   8:1    0 1007K  0 part 
├─sda2                                   8:2    0  512M  0 part 
└─sda3                                   8:3    0  4.1T  0 part 
  ├─pve-swap                           253:0    0    8G  0 lvm  [SWAP]
  ├─pve-root                           253:1    0   96G  0 lvm  /
  ├─pve-data_tmeta                     253:2    0 15.8G  0 lvm  
  │ └─pve-data-tpool                   253:4    0    4T  0 lvm  
  │   ├─pve-data                       253:5    0    4T  0 lvm  
  │   ├─pve-vm--111--disk--0           253:6    0   20G  0 lvm  
  │   ├─pve-vm--200--disk--1           253:7    0   64G  0 lvm  
  │   ├─pve-vm--202--disk--0           253:8    0  200G  0 lvm  
  │   ├─pve-vm--100--disk--0           253:9    0   32G  0 lvm  
  │   ├─pve-vm--201--disk--0           253:10   0   50G  0 lvm  
  │   ├─pve-vm--101--disk--0           253:11   0   50G  0 lvm  
  │   ├─pve-vm--112--disk--0           253:12   0  100G  0 lvm  
  │   ├─pve-vm--110--disk--0           253:13   0   64G  0 lvm  
  │   ├─pve-vm--113--disk--0           253:14   0  128G  0 lvm  
  │   ├─pve-vm--203--disk--0           253:15   0  100G  0 lvm  
  │   ├─pve-vm--114--disk--0           253:16   0   64G  0 lvm  
  │   ├─pve-vm--102--disk--0           253:17   0   64G  0 lvm  
  │   ├─pve-vm--115--disk--0           253:18   0  256G  0 lvm  
  │   ├─pve-vm--116--disk--0           253:19   0  128G  0 lvm  
  │   ├─pve-vm--117--disk--0           253:20   0   64G  0 lvm  
  │   ├─pve-vm--204--disk--0           253:21   0  500G  0 lvm  
  │   └─pve-vm--114--state--ZabbixDone 253:22   0 16.5G  0 lvm  
  └─pve-data_tdata                     253:3    0    4T  0 lvm  
    └─pve-data-tpool                   253:4    0    4T  0 lvm  
      ├─pve-data                       253:5    0    4T  0 lvm  
      ├─pve-vm--111--disk--0           253:6    0   20G  0 lvm  
      ├─pve-vm--200--disk--1           253:7    0   64G  0 lvm  
      ├─pve-vm--202--disk--0           253:8    0  200G  0 lvm  
      ├─pve-vm--100--disk--0           253:9    0   32G  0 lvm  
      ├─pve-vm--201--disk--0           253:10   0   50G  0 lvm  
      ├─pve-vm--101--disk--0           253:11   0   50G  0 lvm  
      ├─pve-vm--112--disk--0           253:12   0  100G  0 lvm  
      ├─pve-vm--110--disk--0           253:13   0   64G  0 lvm  
      ├─pve-vm--113--disk--0           253:14   0  128G  0 lvm  
      ├─pve-vm--203--disk--0           253:15   0  100G  0 lvm  
      ├─pve-vm--114--disk--0           253:16   0   64G  0 lvm  
      ├─pve-vm--102--disk--0           253:17   0   64G  0 lvm  
      ├─pve-vm--115--disk--0           253:18   0  256G  0 lvm  
      ├─pve-vm--116--disk--0           253:19   0  128G  0 lvm  
      ├─pve-vm--117--disk--0           253:20   0   64G  0 lvm  
      ├─pve-vm--204--disk--0           253:21   0  500G  0 lvm  
      └─pve-vm--114--state--ZabbixDone 253:22   0 16.5G  0 lvm


Hope these bits of information help to draw a picture. The fact that the storage configuration shows as active, that I can even select the uploaded ISOs, and that the console shows no errors keeps telling me that there is some kind of problem in the communication between the web GUI and storage.

I will reinstall pve-manager once production stops.
 

Attachments

  • chrome_1btG1q2QZr.png
  • chrome_BU7UwvD4V5.png
please try the following:

Code:
rm /etc/pve/local/pveproxy-ssl.key
pvecm updatecerts -f
systemctl restart pveproxy

and then please check the logs again and post them here
 
rm /etc/pve/local/pveproxy-ssl.key (NOT FOUND)
pvecm updatecerts -f
rm /etc/pve/local/pveproxy-ssl.key (NOT FOUND)
systemctl restart pveproxy pvedaemon pve-cluster pvestatd
rm /etc/pve/local/pveproxy-ssl.key (NOT FOUND)

apt update && apt -y full-upgrade
apt install pve-manager --reinstall
rm /etc/pve/local/pveproxy-ssl.key (NOT FOUND)

rm /etc/pve/local/pveproxy-ssl.key (NOT FOUND)
pvecm updatecerts -f
rm /etc/pve/local/pveproxy-ssl.key (NOT FOUND)
systemctl restart pveproxy pvedaemon pve-cluster pvestatd
rm /etc/pve/local/pveproxy-ssl.key (NOT FOUND)

Code:
root@proxmox:/etc/apt/sources.list.d# journalctl -p 3 --since="1 hour ago"
-- Logs begin at Mon 2020-04-13 02:50:07 CEST, end at Thu 2020-05-28 17:00:01 CEST. --
May 28 16:03:01 proxmox systemd[1]: Failed to start Proxmox VE replication runner.
May 28 16:03:54 proxmox smartd[58915]: In the system's table of devices NO devices found to scan
May 28 16:45:32 proxmox pveproxy[23249]: got inotify poll request in wrong process - disabling inotify
May 28 16:45:32 proxmox pveproxy[23248]: got inotify poll request in wrong process - disabling inotify
May 28 16:49:39 proxmox pveproxy[24157]: got inotify poll request in wrong process - disabling inotify
May 28 16:54:50 proxmox pvecm[24885]: got inotify poll request in wrong process - disabling inotify

Adding a massive 800-line plain journal, captured while performing the actions mentioned above.

Hopefully it makes sense that I'm not able to "access" storage via the web because of that missing .key; from there I can start working on finding something to do about it (since updatecerts doesn't seem to work).
 

the file /etc/pve/local/pveproxy-ssl.key exists on the screenshot that you've sent. how can it be not found?
 
Sorry, I left work as soon as you replied... I guess the find tool gave us the answer. Now that I'm off work, I'm going to go through all the steps again and finish with a reboot.

1590694390366.png

Code:
root@proxmox:~# find / -name pveproxy*
/sys/fs/cgroup/cpu,cpuacct/system.slice/pveproxy.service
/sys/fs/cgroup/pids/system.slice/pveproxy.service
/sys/fs/cgroup/devices/system.slice/pveproxy.service
/sys/fs/cgroup/blkio/system.slice/pveproxy.service
/sys/fs/cgroup/memory/system.slice/pveproxy.service
/sys/fs/cgroup/systemd/system.slice/pveproxy.service
/sys/fs/cgroup/unified/system.slice/pveproxy.service
/run/pveproxy
/run/pveproxy/pveproxy.pid
/run/pveproxy/pveproxy.pid.lock
/run/lock/pveproxy.lck
/etc/pve/nodes/proxmox/pveproxy-ssl.key
/etc/systemd/system/multi-user.target.wants/pveproxy.service
/var/lib/systemd/deb-systemd-helper-enabled/multi-user.target.wants/pveproxy.service
/var/lib/systemd/deb-systemd-helper-enabled/pveproxy.service.dsh-also
/var/log/pveproxy
/usr/bin/pveproxy
/usr/lib/systemd/system/pveproxy.service
/usr/share/bash-completion/completions/pveproxy
/usr/share/pve-docs/pveproxy.8.html
/usr/share/perl5/PVE/Service/pveproxy.pm
/usr/share/man/man8/pveproxy.8.gz
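A note on reading this output: on a PVE node /etc/pve/local is a symlink into /etc/pve/nodes/, and find does not follow symlinks by default, so a file reachable through /etc/pve/local is listed only under its real path. The thread does not explain why the rm calls reported "not found". The sketch below (helper name made up) just prints where a path really resolves and whether the file is there. Separately, quoting the pattern, as in `find / -name 'pveproxy*'`, stops the shell from expanding it before find sees it.

```shell
#!/bin/sh
# Print the real path behind a (possibly symlinked) file and whether it exists.
# `readlink -f` is the GNU coreutils form, as shipped on Debian.
show_real_path() {
    real=$(readlink -f "$1")
    if [ -e "$real" ]; then
        echo "$real exists"
    else
        echo "$real missing"
    fi
}

# Usage on the node:
#   show_real_path /etc/pve/local/pveproxy-ssl.key
```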
 
Hello

I would like to address this to you, @oguz, and to anyone who comes here with a potentially similar issue.

I finally fixed the issue (I had to wait until the weekend to have a free window to reboot the system). The issue was indeed with pve-ssl.pem (still not sure how we did it, but it was a human factor); basically it came down to seeing that the file's modification time matched when the problems started.

pvecm updatecerts -f does not work in this case, and neither does anything else mentioned above. In the end it got fixed by:

Removing the certs and rebooting, without running pvecm updatecerts before the reboot.
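For anyone repeating this, moving the files aside is safer than deleting them outright. The sketch below is an illustration only: this thread does not list exactly which files were removed, so the `*-ssl.pem` / `*-ssl.key` patterns, the helper name and the backup location are all assumptions.

```shell
#!/bin/sh
# Hypothetical helper: move every *-ssl.pem / *-ssl.key file from a directory
# into a timestamped backup directory instead of deleting it outright.
# Which files to remove is an assumption; the thread does not list them.
move_certs_aside() {
    src=$1
    dest=$2/certs-backup-$(date +%Y%m%d-%H%M%S)
    mkdir -p "$dest" || return 1
    for f in "$src"/*-ssl.pem "$src"/*-ssl.key; do
        [ -e "$f" ] || continue   # skip patterns that matched nothing
        mv "$f" "$dest"/ || return 1
    done
    echo "$dest"
}

# Usage on the node, followed by a reboot during a maintenance window:
#   move_certs_aside /etc/pve/local /root
```

The function prints the backup directory, so the old files can be copied back if the reboot does not bring up working certificates.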

I would like to thank oguz for the help, I really appreciate it.
 
glad your problem is solved!

you can mark the thread as [SOLVED] so others in similar situation know what to expect :)
 
