failed 7>8 upgrade

me123

New Member
Jun 29, 2023
Hi!

I tried to upgrade my non-subscription Proxmox 7 to 8 and failed.
The apt dist-upgrade worked fine; nothing caught my eye while doing so, just some minor
questions (e.g. about /etc/issue and a few others). Three libraries were removed (they did not look relevant), and everything seemed to go smoothly.

But after a reboot of the host, it all went bad.
It started with a changed SSH host key.
The web interface on :8006 did not return anything (it was just hanging).
No VMs seem to be running.

I dug a bit deeper and it looks like parts of the old /etc were gone. For example, the /etc/pve directory existed but was empty.

I copied back the backup of /etc I had made and restarted the host. The :8006 port now behaves differently but still fails
(it returns a 501 error).

Another strange thing: the changed hostname is the name of one of the VMs I had. I tried to use hostname and/or just change /etc/hostname
back to proxmox-1, but after a reboot it always reverts to debclone3.
I checked whether this is somehow a wrong reverse DNS, but no, the IP of the Proxmox host has no reverse DNS entry and never had one.
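
(In hindsight, a quick way to see what keeps rewriting the hostname on boot — whether it is a DHCP hook, cloud-init, or something else — would have been something like:)

Code:
# current hostname as systemd sees it
hostnamectl
# look for packages that are known to manage the hostname/hosts file
dpkg -l | grep -E 'cloud-init|dhcpcd'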

(I could find the debclone3 name in the backup:
/root/backup/etc/2023-06-28T23:06:45+02:00/etc/pve/nodes/proxmox-1/qemu-server/104.conf
/root/backup/etc/2023-06-28T23:06:45+02:00/etc/pve/.rrd)

It would be great to bring this server back to life again, so I would highly appreciate hearing what I could try (or
at least what additional information might be handy).


Code:
> GET / HTTP/1.1
> Host: localhost:8006
> User-Agent: curl/7.88.1
> Accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
< HTTP/1.1 501 Connection refused
< Cache-Control: max-age=0
< Connection: close
< Date: Wed, 28 Jun 2023 22:28:41 GMT
< Pragma: no-cache
< Server: pve-api-daemon/3.0
< Expires: Wed, 28 Jun 2023 22:28:41 GMT
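
The 501 with "Connection refused" is returned by pve-api-daemon itself, which (as far as I understand it) means pveproxy is up but cannot reach the cluster filesystem behind it. A quick way to confirm that (my own addition, not something from the upgrade docs):

Code:
# /etc/pve should show up as a fuse mount when pmxcfs is running
mount | grep /etc/pve
systemctl status pve-cluster pveproxy pvedaemon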
Code:
root@debclone3:/etc/pve# systemctl status pveproxy pvedaemon pvecluster|cat
Unit pvecluster.service could not be found.
● pveproxy.service - PVE API Proxy Server
     Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; preset: enabled)
     Active: active (running) since Thu 2023-06-29 00:10:59 CEST; 18min ago
    Process: 1386 ExecStartPre=/usr/bin/pvecm updatecerts --silent (code=exited, status=111)
    Process: 1547 ExecStart=/usr/bin/pveproxy start (code=exited, status=0/SUCCESS)
   Main PID: 1990 (pveproxy)
      Tasks: 4 (limit: 38157)
     Memory: 149.2M
        CPU: 1.066s
     CGroup: /system.slice/pveproxy.service
             ├─1990 pveproxy
             ├─1999 "pveproxy worker"
             ├─2003 "pveproxy worker"
             └─2008 "pveproxy worker"

Jun 29 00:10:58 debclone3 pvecm[1386]: ipcc_send_rec[1] failed: Connection refused
Jun 29 00:10:58 debclone3 pvecm[1386]: ipcc_send_rec[2] failed: Connection refused
Jun 29 00:10:58 debclone3 pvecm[1386]: ipcc_send_rec[3] failed: Connection refused
Jun 29 00:10:58 debclone3 pvecm[1386]: Unable to load access control list: Connection refused
Jun 29 00:10:59 debclone3 pveproxy[1990]: starting server
Jun 29 00:10:59 debclone3 pveproxy[1990]: starting 3 worker(s)
Jun 29 00:10:59 debclone3 pveproxy[1990]: worker 1999 started
Jun 29 00:10:59 debclone3 pveproxy[1990]: worker 2003 started
Jun 29 00:10:59 debclone3 pveproxy[1990]: worker 2008 started
Jun 29 00:10:59 debclone3 systemd[1]: Started pveproxy.service - PVE API Proxy Server.

● pvedaemon.service - PVE API Daemon
     Loaded: loaded (/lib/systemd/system/pvedaemon.service; enabled; preset: enabled)
     Active: active (running) since Thu 2023-06-29 00:10:58 CEST; 18min ago
    Process: 970 ExecStart=/usr/bin/pvedaemon start (code=exited, status=0/SUCCESS)
   Main PID: 1380 (pvedaemon)
      Tasks: 4 (limit: 38157)
     Memory: 213.2M
        CPU: 843ms
     CGroup: /system.slice/pvedaemon.service
             ├─1380 pvedaemon
             ├─1381 "pvedaemon worker"
             ├─1382 "pvedaemon worker"
             └─1383 "pvedaemon worker"

Jun 29 00:10:57 debclone3 systemd[1]: Starting pvedaemon.service - PVE API Daemon...
Jun 29 00:10:58 debclone3 pvedaemon[1380]: starting server
Jun 29 00:10:58 debclone3 pvedaemon[1380]: starting 3 worker(s)
Jun 29 00:10:58 debclone3 pvedaemon[1380]: worker 1381 started
Jun 29 00:10:58 debclone3 pvedaemon[1380]: worker 1382 started
Jun 29 00:10:58 debclone3 pvedaemon[1380]: worker 1383 started
Jun 29 00:10:58 debclone3 systemd[1]: Started pvedaemon.service - PVE API Daemon.

Code:
root@debclone3:~# pve7to8
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LC_CTYPE = "UTF-8",
    LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("en_US.UTF-8").
ipcc_send_rec[1] failed: Connection refused
ipcc_send_rec[2] failed: Connection refused
ipcc_send_rec[3] failed: Connection refused
Unable to load access control list: Connection refused
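
(The locale warnings are a separate, cosmetic issue; a generic Debian way to get rid of them — not Proxmox-specific — is:)

Code:
# either use a locale that is always available for the current shell ...
export LC_ALL=C.UTF-8
# ... or (re)generate the configured locales persistently
dpkg-reconfigure locales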
 
and check that you don't have in /etc/hosts: 127.0.1.1 <yourhostname>

systemctl status pve-cluster
systemctl restart pve-cluster
Do you mean that I should or should not have that?

I have it, but judging by the info in the file, it is auto-created with that value.
As said, the hostname is not what I would expect it to be.

Code:
root@debclone3:~# systemctl status pve-cluster

× pve-cluster.service - The Proxmox VE cluster filesystem
     Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Thu 2023-06-29 03:45:18 CEST; 4h 50min ago
    Process: 20832 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)
        CPU: 6ms

Jun 29 03:45:18 debclone3 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
Jun 29 03:45:18 debclone3 systemd[1]: Stopped pve-cluster.service - The Proxmox VE cluster filesystem.
Jun 29 03:45:18 debclone3 systemd[1]: pve-cluster.service: Start request repeated too quickly.
Jun 29 03:45:18 debclone3 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jun 29 03:45:18 debclone3 systemd[1]: Failed to start pve-cluster.service - The Proxmox VE cluster filesystem.

Code:
root@debclone3:~# systemctl restart pve-cluster

Job for pve-cluster.service failed because the control process exited with error code.
See "systemctl status pve-cluster.service" and "journalctl -xeu pve-cluster.service" for details.


Code:
root@debclone3:~# systemctl status pve-cluster.service
× pve-cluster.service - The Proxmox VE cluster filesystem
     Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Thu 2023-06-29 08:36:40 CEST; 46s ago
    Process: 45589 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)
        CPU: 6ms

Jun 29 08:36:40 debclone3 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
Jun 29 08:36:40 debclone3 systemd[1]: Stopped pve-cluster.service - The Proxmox VE cluster filesystem.
Jun 29 08:36:40 debclone3 systemd[1]: pve-cluster.service: Start request repeated too quickly.
Jun 29 08:36:40 debclone3 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Jun 29 08:36:40 debclone3 systemd[1]: Failed to start pve-cluster.service - The Proxmox VE cluster filesystem.


Code:
Jun 29 08:36:39 debclone3 systemd[1]: Failed to start pve-cluster.service - The Proxmox VE cluster filesystem.
░░ Subject: A start job for unit pve-cluster.service has failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A start job for unit pve-cluster.service has finished with a failure.
░░
░░ The job identifier is 3816 and the job result is failed.
Jun 29 08:36:39 debclone3 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 4.
░░ Subject: Automatic restarting of a unit has been scheduled
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ Automatic restarting of the unit pve-cluster.service has been scheduled, as the result for
░░ the configured Restart= setting for the unit.
Jun 29 08:36:39 debclone3 systemd[1]: Stopped pve-cluster.service - The Proxmox VE cluster filesystem.
░░ Subject: A stop job for unit pve-cluster.service has finished
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A stop job for unit pve-cluster.service has finished.
░░
░░ The job identifier is 4010 and the job result is done.
Jun 29 08:36:39 debclone3 systemd[1]: Starting pve-cluster.service - The Proxmox VE cluster filesystem...
░░ Subject: A start job for unit pve-cluster.service has begun execution
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A start job for unit pve-cluster.service has begun execution.
░░
░░ The job identifier is 4010.
Jun 29 08:36:39 debclone3 pmxcfs[45589]: [main] crit: Unable to get local IP address
Jun 29 08:36:39 debclone3 pmxcfs[45589]: [main] crit: Unable to get local IP address
Jun 29 08:36:39 debclone3 systemd[1]: pve-cluster.service: Control process exited, code=exited, status=255/EXCEPTION
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ An ExecStart= process belonging to unit pve-cluster.service has exited.
░░
░░ The process' exit code is 'exited' and its exit status is 255.
Jun 29 08:36:39 debclone3 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ The unit pve-cluster.service has entered the 'failed' state with result 'exit-code'.
Jun 29 08:36:39 debclone3 systemd[1]: Failed to start pve-cluster.service - The Proxmox VE cluster filesystem.
░░ Subject: A start job for unit pve-cluster.service has failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A start job for unit pve-cluster.service has finished with a failure.
░░
░░ The job identifier is 4010 and the job result is failed.
Jun 29 08:36:40 debclone3 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
░░ Subject: Automatic restarting of a unit has been scheduled
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ Automatic restarting of the unit pve-cluster.service has been scheduled, as the result for
░░ the configured Restart= setting for the unit.
Jun 29 08:36:40 debclone3 systemd[1]: Stopped pve-cluster.service - The Proxmox VE cluster filesystem.
░░ Subject: A stop job for unit pve-cluster.service has finished
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A stop job for unit pve-cluster.service has finished.
░░
░░ The job identifier is 4204 and the job result is done.
Jun 29 08:36:40 debclone3 systemd[1]: pve-cluster.service: Start request repeated too quickly.
Jun 29 08:36:40 debclone3 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ The unit pve-cluster.service has entered the 'failed' state with result 'exit-code'.
Jun 29 08:36:40 debclone3 systemd[1]: Failed to start pve-cluster.service - The Proxmox VE cluster filesystem.
░░ Subject: A start job for unit pve-cluster.service has failed
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A start job for unit pve-cluster.service has finished with a failure.
░░
░░ The job identifier is 4204 and the job result is failed.
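
The decisive line in that journal output is pmxcfs reporting "Unable to get local IP address". As far as I understand it, pmxcfs refuses to start when the node's hostname does not resolve to a regular (non-loopback) IP address, which can be checked like this:

Code:
# the hostname should resolve to the host's real IP, not to 127.0.1.1
hostname
getent hosts "$(hostname)"
hostname --ip-address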


Code:
root@debclone3:~# cat /etc/hostname
debclone3
root@debclone3:~# cat /etc/hosts
# Your system has configured 'manage_etc_hosts' as True.
# As a result, if you wish for changes to this file to persist
# then you will need to either
# a.) make changes to the master file in /etc/cloud/templates/hosts.debian.tmpl
# b.) change or remove the value of 'manage_etc_hosts' in
#     /etc/cloud/cloud.cfg or cloud-config from user-data
#
127.0.1.1 debclone3 debclone3
127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
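
For reference, my understanding of what the /etc/hosts of a Proxmox node should look like (the IP and domain below are placeholders for my setup, not literal values):

Code:
127.0.0.1 localhost.localdomain localhost
# the node's real IP must map to its hostname (placeholder values)
192.168.1.10 proxmox-1.localdomain proxmox-1

# IPv6 defaults
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters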



(ovh hosting ?)
What do you mean with "(ovh hosting ?)"?

It is running on a 1L HP system.
 
So I think I figured out what happened here. It is a lab system, and some additional Debian packages got installed over time.
I would assume that
ii cloud-init 22.4.2-1 all initialization system for infrastructure cloud instances

might not be part of the original Proxmox install.

cloud-init caused the hostname to be overwritten and the hosts file to be auto-generated, which caused the services to fail.
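
(A way to confirm that cloud-init is the one managing the hostname and /etc/hosts — manage_etc_hosts and preserve_hostname are standard cloud-init settings, but the exact config file layout may differ per system:)

Code:
# manage_etc_hosts regenerates /etc/hosts, preserve_hostname controls hostname handling
grep -rE 'manage_etc_hosts|preserve_hostname' /etc/cloud/cloud.cfg /etc/cloud/cloud.cfg.d/ 2>/dev/null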

doing a:
Code:
sudo touch /etc/cloud/cloud-init.disabled
and re-setting the hostname fixed that part.
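
(Spelled out, the hostname part of the fix amounted to roughly the following — hostnamectl is just one way to set it, and proxmox-1 is my node name:)

Code:
# keep cloud-init from rewriting the hostname and /etc/hosts on the next boot
touch /etc/cloud/cloud-init.disabled
# set the hostname back (this also updates /etc/hostname)
hostnamectl set-hostname proxmox-1
# after fixing /etc/hosts so the real IP maps to the hostname, restart the PVE stack
systemctl restart pve-cluster pvedaemon pveproxy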

Then one service still failed to start, complaining about a FUSE mountpoint not being empty.
That was due to me copying the files from my backup into /etc/pve, which it turns out is a mountpoint that Proxmox populates itself.
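
(What fixed that part was, roughly, moving the stray files out of the way while pmxcfs is stopped — the target path below is just an example:)

Code:
# stop the cluster filesystem so /etc/pve is a plain directory again
systemctl stop pve-cluster
# move the copied-in files somewhere safe (hidden files may need moving too)
mkdir -p /root/pve-copied-back
mv /etc/pve/* /root/pve-copied-back/ 2>/dev/null
# with the mountpoint empty, pmxcfs can mount /etc/pve again
systemctl start pve-cluster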


Not sure if anyone working on Proxmox reads this, but it might be handy for the pve7to8 tool (or some additional health checker)
to warn when non-standard packages are installed, and also to point out configuration issues (like the hosts file) or that /etc/pve is
not mounted but contains files. The logs could also be a bit more self-explanatory.

Anyhow, @spirit helped me look in the right direction, and I found other info in the forum as well.
 
