This server was running for almost 2 years without real issues. Last week I removed the cluster configuration from my server and updated everything (now running 6.3-2).
With all these changes I also rebooted the server, and at first sight everything was working.
But after approx. 2 days I noticed slow reaction times. When I boot a VM now, I see the BIOS booting from hard disk, followed by a black screen. After stopping almost all my VMs (one of them is my firewall/routing/...), the problem is gone and the VM that was failing earlier starts normally.
I then booted the 5 VMs that are really important for daily use (home automation, data, ...). But now, after 24 hours, the problem is back: I can't start a 'new' VM.
The strange thing is that the machines that are currently running have no problems; only fresh boots fail.
Any tips on how I can find out what the real problem is? :$
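For what it's worth, this is what I was planning to run the next time a fresh boot fails, to capture more detail (not sure these are the right places to look, so corrections welcome):
Code:
# start the failing VM from the CLI instead of the GUI, to see errors directly
qm start 613

# follow the system journal in a second shell while the VM starts
journalctl -f

# afterwards, scan the kernel log for OOM kills or hardware errors
dmesg -T | grep -iE 'oom|mce|error|fail' | tail -n 50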
Some pv info:
Code:
pveversion -v
proxmox-ve: 6.3-1 (running kernel: 5.4.73-1-pve)
pve-manager: 6.3-2 (running version: 6.3-2/22f57405)
pve-kernel-5.4: 6.3-1
pve-kernel-helper: 6.3-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
pve-kernel-5.4.55-1-pve: 5.4.55-1
pve-kernel-5.4.44-2-pve: 5.4.44-2
pve-kernel-4.15: 5.4-19
pve-kernel-4.15.18-30-pve: 4.15.18-58
pve-kernel-4.15.18-11-pve: 4.15.18-34
pve-kernel-4.13.13-2-pve: 4.13.13-33
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.5
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-6
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.3-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.5-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-1
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-7
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-1
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1
----------------------------------
pvestatd status
running
root@stampertj:~# systemctl status pvestatd
● pvestatd.service - PVE Status Daemon
Loaded: loaded (/lib/systemd/system/pvestatd.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2020-11-27 21:13:25 CET; 23h ago
Process: 16645 ExecStart=/usr/bin/pvestatd start (code=exited, status=0/SUCCESS)
Main PID: 16658 (pvestatd)
Tasks: 1 (limit: 7372)
Memory: 94.6M
CGroup: /system.slice/pvestatd.service
└─16658 pvestatd
nov 28 09:28:53 stampertj pvestatd[16658]: status update time (5.547 seconds)
nov 28 09:53:48 stampertj pvestatd[16658]: status update time (9.622 seconds)
nov 28 11:26:14 stampertj pvestatd[16658]: status update time (5.356 seconds)
nov 28 11:26:37 stampertj pvestatd[16658]: status update time (8.147 seconds)
nov 28 11:26:56 stampertj pvestatd[16658]: status update time (18.621 seconds)
nov 28 12:09:57 stampertj pvestatd[16658]: status update time (11.773 seconds)
nov 28 12:41:33 stampertj pvestatd[16658]: status update time (16.426 seconds)
nov 28 13:06:39 stampertj pvestatd[16658]: status update time (5.308 seconds)
nov 28 17:49:24 stampertj pvestatd[16658]: auth key pair too old, rotating..
-----------------
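# Note: the "status update time" warnings above make me think pvestatd is
# blocking on slow storage. Next time it happens I want to watch I/O latency,
# roughly like this:
#   iostat -x 2   (from the sysstat package, if it is installed)
#   vmstat 2      (the 'wa' column shows time spent waiting on I/O)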
systemctl status pvedaemon
● pvedaemon.service - PVE API Daemon
Loaded: loaded (/lib/systemd/system/pvedaemon.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2020-11-27 10:47:23 CET; 1 day 9h ago
Process: 3246 ExecStart=/usr/bin/pvedaemon start (code=exited, status=0/SUCCESS)
Main PID: 3320 (pvedaemon)
Tasks: 6 (limit: 7372)
Memory: 291.2M
CGroup: /system.slice/pvedaemon.service
├─ 3320 pvedaemon
├─14811 pvedaemon worker
├─25716 pvedaemon worker
├─29663 pvedaemon worker
├─47269 task UPID:stampertj:0000B8A5:00B862B4:5FC2A30F:vncproxy:613:root@pam:
└─47271 /usr/bin/perl /usr/sbin/qm vncproxy 613
nov 28 20:20:42 stampertj pvedaemon[25716]: <root@pam> starting task UPID:stampertj:0000B85B:00B86104:5FC2A30A:vncproxy:613:root@pam:
nov 28 20:20:43 stampertj qm[47197]: VM 613 qmp command failed - VM 613 not running
nov 28 20:20:43 stampertj pvedaemon[47195]: Failed to run vncproxy.
nov 28 20:20:43 stampertj pvedaemon[25716]: <root@pam> end task UPID:stampertj:0000B85B:00B86104:5FC2A30A:vncproxy:613:root@pam: Failed to run vncproxy.
nov 28 20:20:44 stampertj pvedaemon[47219]: start VM 613: UPID:stampertj:0000B873:00B861D5:5FC2A30C:qmstart:613:root@pam:
nov 28 20:20:44 stampertj pvedaemon[25716]: <root@pam> starting task UPID:stampertj:0000B873:00B861D5:5FC2A30C:qmstart:613:root@pam:
nov 28 20:20:46 stampertj pvedaemon[25716]: <root@pam> end task UPID:stampertj:0000B873:00B861D5:5FC2A30C:qmstart:613:root@pam: OK
nov 28 20:20:47 stampertj pvedaemon[47269]: starting vnc proxy UPID:stampertj:0000B8A5:00B862B4:5FC2A30F:vncproxy:613:root@pam:
nov 28 20:20:47 stampertj pvedaemon[14811]: <root@pam> starting task UPID:stampertj:0000B8A5:00B862B4:5FC2A30F:vncproxy:613:root@pam:
nov 28 20:27:10 stampertj pvedaemon[14811]: <root@pam> successful auth for user 'root@pam'
-----------------
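# Note: the failed vncproxy at 20:20:43 looks like a consequence of VM 613 not
# actually being up yet at that point. To see why a start dies silently, I was
# going to dump the generated KVM command and try running it by hand:
#   qm showcmd 613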
systemctl status pveproxy
● pveproxy.service - PVE API Proxy Server
Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2020-11-27 10:47:26 CET; 1 day 9h ago
Process: 3324 ExecStartPre=/usr/bin/pvecm updatecerts --silent (code=exited, status=0/SUCCESS)
Process: 3327 ExecStart=/usr/bin/pveproxy start (code=exited, status=0/SUCCESS)
Process: 40778 ExecReload=/usr/bin/pveproxy restart (code=exited, status=0/SUCCESS)
Main PID: 3329 (pveproxy)
Tasks: 4 (limit: 7372)
Memory: 195.8M
CGroup: /system.slice/pveproxy.service
├─ 547 pveproxy worker
├─ 3329 pveproxy
├─40785 pveproxy worker
└─40787 pveproxy worker
nov 28 00:00:32 stampertj pveproxy[29214]: worker exit
nov 28 00:00:32 stampertj pveproxy[31445]: worker exit
nov 28 00:00:32 stampertj pveproxy[28008]: worker exit
nov 28 00:00:32 stampertj pveproxy[3329]: worker 31445 finished
nov 28 00:00:32 stampertj pveproxy[3329]: worker 28008 finished
nov 28 00:00:32 stampertj pveproxy[3329]: worker 29214 finished
nov 28 20:25:10 stampertj pveproxy[40786]: worker exit
nov 28 20:25:10 stampertj pveproxy[3329]: worker 40786 finished
nov 28 20:25:10 stampertj pveproxy[3329]: starting 1 worker(s)
nov 28 20:25:10 stampertj pveproxy[3329]: worker 547 started
---------------------------------------
pvesm status
Name             Type     Status           Total            Used       Available        %
VD3               lvm     active      1171517440       729808896       441708544   62.30%
VD4               lvm     active      1171517440               0      1171517440    0.00%
data              lvm     active      4883214336      2394947584      2488266752   49.04%
local             dir     active        98559220        16879832        76629840   17.13%
local-lvm     lvmthin     active      1826881536       404106195      1422775340   22.12%
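Since all of the storages above sit on local disks, I also want to rule out a failing drive. smartmontools is installed anyway, so I'll probably do something like this (assuming the disks show up as /dev/sd*):
Code:
# quick SMART health verdict for every /dev/sd? disk
for d in /dev/sd?; do
    echo "== $d =="
    smartctl -H "$d"
done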
Some hardware info:
Code:
lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 24
On-line CPU(s) list: 0-23
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 45
Model name: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz
Stepping: 7
CPU MHz: 2238.334
CPU max MHz: 2500,0000
CPU min MHz: 1200,0000
BogoMIPS: 4000.02
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 15360K
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm pti tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts
------
free
              total        used        free      shared  buff/cache   available
Mem:      396212660    81392340   310449960      319432     4370360   312120068
Swap:       8388604           0     8388604
------
df -h
Filesystem            Size  Used Avail Use% Mounted on
udev                  189G     0  189G   0% /dev
tmpfs                  38G  267M   38G   1% /run
/dev/mapper/pve-root   94G   17G   74G  19% /
tmpfs                 189G   46M  189G   1% /dev/shm
tmpfs                 5,0M     0  5,0M   0% /run/lock
tmpfs                 189G     0  189G   0% /sys/fs/cgroup
/dev/fuse              30M   28K   30M   1% /etc/pve
tmpfs                  38G     0   38G   0% /run/user/0
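One more thing I still want to check is whether the memory assigned to the running VMs (plus what KSM is sharing) adds up sensibly against the host RAM shown above. I was going to eyeball it like this (I believe the KSM daemon from ksm-control-daemon runs as the 'ksmtuned' service, but I'm not 100% sure of the name):
Code:
# memory configured per VM
qm list

# how many 4K pages KSM is currently sharing
cat /sys/kernel/mm/ksm/pages_sharing

# is the KSM tuning daemon running?
systemctl status ksmtuned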