k081xoes

Jan 13, 2023
Hi, I'm new to Proxmox, so sorry for the noob questions.
I've bought a small PC (AMD Ryzen 7 5825U, 32 GB RAM, 1 TB NVMe SSD) to experiment with virtualization.

Problem:

I installed Proxmox VE 7.3 with the ISO installer from the official site.
Everything worked fine for the first ~10 minutes. After that, I lost the connection to the host over both SSH and the Web UI.

Web UI says: "PR_CONNECT_RESET_ERROR"
SSH says: "kex_exchange_identification: read: Connection reset by peer"

The Web UI and SSH recover only after a reboot (by pressing the power button).

Analysis:
- There is nothing in journalctl (only connections);
- Error log is empty;
- Ping works fine;
- nmap still shows ssh and wpl-analytics (pveproxy on port 8006, I suppose) as open while SSH and the Web UI are unavailable;

What I've tried:
- updated all software (apt update && apt dist-upgrade);
- updated the UEFI firmware with fwupdmgr;
- reinstalled openssh-server and pveproxy (apt install --reinstall);
- installed Ubuntu Server on the same host, and it had no problems with SSH or networking at all;

Can anyone help me solve this problem?
Thank you in advance,
Roman

pveversion -v:
proxmox-ve: 7.3-1 (running kernel: 5.15.83-1-pve)
pve-manager: 7.3-4 (running version: 7.3-4/d69b70d4)
pve-kernel-5.15: 7.3-1
pve-kernel-helper: 7.3-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
pve-kernel-5.15.74-1-pve: 5.15.74-1
ceph-fuse: 15.2.17-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.3
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.3-1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-1
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-5
libpve-storage-perl: 7.3-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.3.2-1
proxmox-backup-file-restore: 2.3.2-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.3
pve-cluster: 7.3-1
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-7
pve-firmware: 3.6-2
pve-ha-manager: 3.5.1
pve-i18n: 2.8-1
pve-qemu-kvm: 7.1.0-4
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-2
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+2
vncterm: 1.7-1
zfsutils-linux: 2.1.7-pve2
 
Hi,
can you rule out a RAM issue? Try running a memory test. Also, can you log in to the machine physically when the SSH connection and the Web UI fail? If yes, can you check disk usage, memory usage, dmesg -w, the journal...
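The checks suggested above could be collected into one helper to run from the local console once SSH and the Web UI stop responding. This is only a rough sketch: the function name health_snapshot is made up for illustration, and the 50-line limit is an arbitrary choice.

```shell
# Sketch: one-shot snapshot of disk, memory and kernel messages.
# Run as root on the local console of the affected host.
health_snapshot() {
    df -h                                  # disk usage per mounted filesystem
    free -h                                # RAM and swap usage
    dmesg --level=err,warn | tail -n 50    # most recent kernel errors and warnings
}

# Usage (on the host):  health_snapshot
# To keep watching new kernel messages afterwards:  dmesg -w
```

A full filesystem or a stream of disk or network driver errors in the kernel log would be the kind of finding worth posting back here.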
 
Chris, thank you for answering!
I ran memtest86 before the installation and it reported that everything works fine.
I can log in to the machine, but it will be physically disconnected from all Ethernet ports. Is that a good option, or should I check everything while it is connected to the network, when the problem appears?
Also, what do you mean by checking journal?
 
I would check your system while it is connected... there is no need to disconnect it. By journal I mean the systemd journal. You can check it with journalctl -b -r to get all entries since the last boot in reverse order, or with journalctl -f if you would like continuously updated output (can be nice for real-time testing).
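As a rough sketch of how those journal queries might be combined, assuming the two affected units are named ssh and pveproxy as on a default Proxmox VE 7 install (the function name collect_journal and the 200-line limit are illustrative only):

```shell
# Sketch: journal queries for the lost SSH / Web UI sessions.
# $1 selects the boot: 0 = current boot (default), -1 = previous boot.
collect_journal() (
    boot=${1:-0}
    journalctl --list-boots --no-pager                     # known boots and their IDs
    journalctl -b "$boot" -r --no-pager | head -n 200      # newest entries first
    journalctl -b "$boot" -u ssh -u pveproxy --no-pager    # only the two affected services
)

# Usage (on the host):
#   collect_journal        # current boot
#   collect_journal -1     # boot before the last reset
```

For live observation while reproducing the problem, journalctl -f in a separate console session does the same job interactively.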
 
I've discovered a very strange thing.
After a reboot I opened multiple SSH terminals for different checks. One of them was running htop.
After about 10 minutes all terminals died (same as the Web UI), but the one with htop kept working, even when I quit htop and ran another command in it.

I checked CPU and RAM with htop after the break, but saw nothing strange: about 4% of RAM was used (1.5 GB), and the CPU was practically idle (some cores had a very low load).

Here is the information from journalctl for the previous boot (the problem occurred there too):

Bash:
root@proxmox:~# journalctl -b 99dfbfa114bf41c3a5092b94e2412606 -u ssh
-- Journal begins at Thu 2023-01-12 15:18:30 MSK, ends at Fri 2023-01-13 19:36:08 MSK. --
Jan 13 19:13:13 proxmox systemd[1]: Starting OpenBSD Secure Shell server...
Jan 13 19:13:13 proxmox sshd[1309]: Server listening on 0.0.0.0 port 22.
Jan 13 19:13:13 proxmox sshd[1309]: Server listening on :: port 22.
Jan 13 19:13:13 proxmox systemd[1]: Started OpenBSD Secure Shell server.
Jan 13 19:13:20 proxmox sshd[1703]: Accepted publickey for root from 10.10.1.4 port 57572 ssh2: RSA SHA256:DDF7GGX5JG0gMmMtdDNrd3rwUiEOy+e6sW7JHnlPe3k
Jan 13 19:13:20 proxmox sshd[1703]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
Jan 13 19:14:23 proxmox sshd[2642]: Accepted publickey for root from 10.10.1.4 port 57581 ssh2: RSA SHA256:DDF7GGX5JG0gMmMtdDNrd3rwUiEOy+e6sW7JHnlPe3k
Jan 13 19:14:23 proxmox sshd[2642]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
Jan 13 19:14:49 proxmox sshd[2853]: Accepted publickey for root from 10.10.1.4 port 57587 ssh2: RSA SHA256:DDF7GGX5JG0gMmMtdDNrd3rwUiEOy+e6sW7JHnlPe3k
Jan 13 19:14:49 proxmox sshd[2853]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
Jan 13 19:35:08 proxmox systemd[1]: Stopping OpenBSD Secure Shell server...
Jan 13 19:35:08 proxmox sshd[1309]: Received signal 15; terminating.
Jan 13 19:35:08 proxmox systemd[1]: ssh.service: Succeeded.
Jan 13 19:35:08 proxmox systemd[1]: Stopped OpenBSD Secure Shell server.

Bash:
root@proxmox:~# journalctl -b 99dfbfa114bf41c3a5092b94e2412606 -u pveproxy
-- Journal begins at Thu 2023-01-12 15:18:30 MSK, ends at Fri 2023-01-13 19:36:08 MSK. --
Jan 13 19:13:14 proxmox systemd[1]: Starting PVE API Proxy Server...
Jan 13 19:13:15 proxmox pveproxy[1685]: starting server
Jan 13 19:13:15 proxmox pveproxy[1685]: starting 3 worker(s)
Jan 13 19:13:15 proxmox pveproxy[1685]: worker 1686 started
Jan 13 19:13:15 proxmox pveproxy[1685]: worker 1687 started
Jan 13 19:13:15 proxmox pveproxy[1685]: worker 1688 started
Jan 13 19:13:15 proxmox systemd[1]: Started PVE API Proxy Server.
Jan 13 19:27:56 proxmox pveproxy[1685]: received signal TERM
Jan 13 19:27:56 proxmox pveproxy[1685]: server closing
Jan 13 19:27:56 proxmox pveproxy[1687]: worker exit
Jan 13 19:27:56 proxmox pveproxy[1688]: worker exit
Jan 13 19:27:56 proxmox pveproxy[1686]: worker exit
Jan 13 19:27:56 proxmox pveproxy[1685]: worker 1686 finished
Jan 13 19:27:56 proxmox pveproxy[1685]: worker 1687 finished
Jan 13 19:27:56 proxmox pveproxy[1685]: worker 1688 finished
Jan 13 19:27:56 proxmox pveproxy[1685]: server stopped
Jan 13 19:27:57 proxmox systemd[1]: pveproxy.service: Succeeded.
Jan 13 19:27:57 proxmox systemd[1]: pveproxy.service: Consumed 1.463s CPU time.
Jan 13 19:29:11 proxmox systemd[1]: Starting PVE API Proxy Server...
Jan 13 19:29:12 proxmox pveproxy[9083]: starting server
Jan 13 19:29:12 proxmox pveproxy[9083]: starting 3 worker(s)
Jan 13 19:29:12 proxmox pveproxy[9083]: worker 9084 started
Jan 13 19:29:12 proxmox pveproxy[9083]: worker 9085 started
Jan 13 19:29:12 proxmox pveproxy[9083]: worker 9086 started
Jan 13 19:29:12 proxmox systemd[1]: Started PVE API Proxy Server.
Jan 13 19:35:06 proxmox systemd[1]: Stopping PVE API Proxy Server...
Jan 13 19:35:07 proxmox pveproxy[9083]: received signal TERM
Jan 13 19:35:07 proxmox pveproxy[9083]: server closing
Jan 13 19:35:07 proxmox pveproxy[9084]: worker exit
Jan 13 19:35:07 proxmox pveproxy[9086]: worker exit
Jan 13 19:35:07 proxmox pveproxy[9085]: worker exit
Jan 13 19:35:07 proxmox pveproxy[9083]: worker 9085 finished
Jan 13 19:35:07 proxmox pveproxy[9083]: worker 9086 finished
Jan 13 19:35:07 proxmox pveproxy[9083]: worker 9084 finished
Jan 13 19:35:07 proxmox pveproxy[9083]: server stopped
Jan 13 19:35:08 proxmox systemd[1]: pveproxy.service: Succeeded.
Jan 13 19:35:08 proxmox systemd[1]: Stopped PVE API Proxy Server.
Jan 13 19:35:08 proxmox systemd[1]: pveproxy.service: Consumed 1.350s CPU time.

Also, what should I look for with dmesg -w?
 
The idea was to look for possible disk errors, or similar...

Jan 13 19:27:56 proxmox pveproxy[1685]: received signal TERM
Jan 13 19:35:08 proxmox sshd[1309]: Received signal 15; terminating.
Both sshd and pveproxy receive a TERM signal and stop because of it... So the question is, why? Did you do anything other than wait for the services to stop working? Any cron job or other tool which might stop the services?
 
Both sshd and pveproxy receive a TERM signal and stop because of it... So the question is, why? Did you do anything other than wait for the services to stop? Any cron job or other tool which might send the TERM signal?
Definitely. When I lost access to the device, I pressed the physical on/off button to turn it off. Then I turned it on again and collected the logs of the previous boot.

I suppose there must be a lower-level problem, because both processes stop answering at the same time.
Is there a common process they both depend on?
 
I pressed the physical on/off button to turn it off
Well, if you pressed the power button, that explains why systemd shut down the services, and the TERM signal in your logs. But as I understand it, this happened after you had already lost the connection to the services. What you describe sounds more like you lose Ethernet connectivity at some point. Have you checked the logs in detail? Please provide the full output of the journal.
 
Ethernet connectivity wasn't lost, because ping works and nmap can see both ssh and wpl-analytics.
Logs attached. If needed, I can get logs for other boots.
 

Attachments

  • journalctl_export.txt (150.9 KB)
Also, it seems as if something like a firewall wakes up about 10-20 minutes after boot and breaks all the connections, and then we get "connection reset by peer".
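One way to test the firewall suspicion from the local console is sketched below. It assumes the default ports (22 for sshd, 8006 for pveproxy) and that any filtering is done through the Proxmox VE firewall or plain iptables; the function name check_listeners_and_firewall is made up for illustration.

```shell
# Sketch: are the daemons still up and listening, and is anything filtering?
# Run as root on the local console of the affected host.
check_listeners_and_firewall() {
    systemctl is-active ssh pveproxy   # prints one state per unit, ideally "active"
    ss -tlnp                           # listening TCP sockets; look for :22 and :8006
    pve-firewall status                # state of the Proxmox VE firewall service
    iptables -S                        # rules currently loaded through iptables
}

# Usage (on the host):  check_listeners_and_firewall
```

If both services are active and listening while no filtering rules are loaded, the resets are more likely coming from somewhere else on the network path than from the host's own firewall.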
 
Code:
Jan 13 19:27:56 proxmox pveproxy[1685]: received signal TERM
This already happened way before you pressed the power button, which happened at
Jan 13 19:35:03 proxmox systemd-logind[1110]: Power key pressed.
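The ordering of those two events can be checked mechanically. The sketch below uses only the two journal lines quoted above; the helper name first_match_time is made up, and the comparison assumes both events fall on the same day.

```shell
# Print the time field (HH:MM:SS) of the first journal line containing a string.
first_match_time() {
    awk -v pat="$1" 'index($0, pat) { print $3; exit }'
}

# The two lines quoted from the journal in this thread.
log='Jan 13 19:27:56 proxmox pveproxy[1685]: received signal TERM
Jan 13 19:35:03 proxmox systemd-logind[1110]: Power key pressed.'

term_time=$(printf '%s\n' "$log" | first_match_time 'received signal TERM')
power_time=$(printf '%s\n' "$log" | first_match_time 'Power key pressed')

# Zero-padded HH:MM:SS strings sort chronologically within one day.
earliest=$(printf '%s\n%s\n' "$term_time" "$power_time" | sort | head -n 1)
if [ "$earliest" = "$term_time" ] && [ "$term_time" != "$power_time" ]; then
    echo "TERM at $term_time precedes power key at $power_time"
fi
```

On the host itself the same helper could be fed from the journal of the affected boot, for example journalctl -b -1 --no-pager | first_match_time 'Power key pressed'.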

I see the logins to the machine... what commands did you execute exactly within your shell?
 
I didn't make any changes; I was just watching the logs and htop.
I can make a "clean boot" where I'll only log into ssh and web without running any commands if needed.
 
An interesting thing: I can't connect to the device any more, even after a reboot.
Neither SSH nor the Web UI is answering.
 
On a hunch - this might be the result of having a duplicate IP (another device in your network with the same IP as your PVE node)?
 
I've reserved the IP on the router, so there shouldn't be any collisions.

Except that I haven't done anything about the subnet.
The IP is 10.10.1.64, the subnet is 10.10.1.64/27. The router can hand out IPs in the range 64-96, but right now no IPs are reserved there.

Also, I can ping the machine and see it with the right IP in the router's admin panel.
 
I've got new logs.
journalctl_export2.txt - in this case the machine was inaccessible right from the start.
journalctl_export3.txt - in this case the machine became inaccessible after a short time (I only used journalctl and scp from my local machine).
I hope they will be useful.
 

Attachments

  • journalctl_export3.txt (146.7 KB)
  • journalctl_export2.txt (142.7 KB)
I've reserved the IP on the router, so there shouldn't be any collisions.

Except that I haven't done anything about the subnet.
The IP is 10.10.1.64, the subnet is 10.10.1.64/27. The router can hand out IPs in the range 64-96, but right now no IPs are reserved there.

Also, I can ping the machine and see it with the right IP in the router's admin panel.
Try using 10.10.1.65 as the IP address. Using the first or last IP address of a range is usually not a good idea, since those are normally the network and broadcast addresses of the subnet.
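To see why 10.10.1.64 is a problematic host address in 10.10.1.64/27, here is a small sketch that computes the network address, broadcast address and usable host range with plain shell arithmetic. The function names are made up for illustration, and it assumes octets without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1    # split the address on dots into $1..$4
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Convert a 32-bit integer back to dotted-quad notation.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Print network, broadcast and usable host range for ADDRESS/PREFIX.
subnet_info() (
    addr=${1%/*}
    prefix=${1#*/}
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    net=$(( $(ip_to_int "$addr") & mask ))
    bcast=$(( net | (~mask & 0xFFFFFFFF) ))
    echo "network=$(int_to_ip "$net")"
    echo "broadcast=$(int_to_ip "$bcast")"
    echo "first_host=$(int_to_ip $(( net + 1 )))"
    echo "last_host=$(int_to_ip $(( bcast - 1 )))"
)

subnet_info 10.10.1.64/27
```

For this subnet the network address is 10.10.1.64 and the broadcast address is 10.10.1.95, so the usable host addresses run from 10.10.1.65 to 10.10.1.94.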
 