PVE 5.1: KVM broken on old CPUs

profpolymath

New Member
Oct 24, 2017
3
2
3
Toronto
VMs fail to launch after upgrading from PVE 5.0 to 5.1 on an old server:
Code:
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
The kvm_intel module refuses to load:
Code:
# lsmod | grep kv
kvm                   581632  0
irqbypass              16384  1 kvm
# modprobe kvm-intel
modprobe: ERROR: could not insert 'kvm_intel': Input/output error
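The state shown above (kvm loaded, kvm_intel missing) can be checked in a script as well. This is only a minimal sketch: the helper name `module_loaded` is hypothetical and not part of PVE. It reads `lsmod`-formatted text from stdin, so it also works on saved output.

```shell
#!/bin/sh
# module_loaded NAME: succeed if NAME appears in the first column of
# lsmod-formatted text read from stdin.
# (Hypothetical helper for illustration; not a PVE tool.)
module_loaded() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Usage on a live host:
#   lsmod | module_loaded kvm_intel || echo "kvm_intel is not loaded"
```

Matching on the first column avoids false positives from the "Used by" column, where kvm also appears as a dependency of irqbypass.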
The following thread suggests this is due to upstream changes in KVM deprecating older platforms:
hXXps://bbs.archlinux.org/viewtopic.php?pid=1727757#p1727757

The machine is a PowerEdge 2950 III with two dual-core, Hyper-Threaded Xeon 'Dempsey' 5050 CPUs (8 logical CPUs in total, per the lscpu output below). It is about ten years old. If we want to continue running PVE on this hardware, it sounds as though it will have to be pegged at 5.0.
Code:
# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             2
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            15
Model:                 6
Model name:            Intel(R) Xeon(TM) CPU 3.00GHz
Stepping:              4
CPU MHz:               2992.595
CPU max MHz:           3000.0000
CPU min MHz:           2000.0000
BogoMIPS:              5985.19
Virtualization:        VT-x
L1d cache:             16K
L2 cache:              2048K
NUMA node0 CPU(s):     0-7
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc pebs bts nopl cpuid pni dtes64 monitor ds_cpl vmx est cid cx16 xtpr pdcm lahf_lm tpr_shadow
Code:
# uname -a
Linux pve2950 4.13.4-1-pve #1 SMP PVE 4.13.4-25 (Fri, 13 Oct 2017 08:59:53 +0200) x86_64 GNU/Linux
# pveversion -v
proxmox-ve: 5.1-25 (running kernel: 4.13.4-1-pve)
pve-manager: 5.1-35 (running version: 5.1-35/722cc488)
pve-kernel-4.13.4-1-pve: 4.13.4-25
pve-kernel-4.10.17-3-pve: 4.10.17-23
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-15
qemu-server: 5.0-17
pve-firmware: 2.0-3
libpve-common-perl: 5.0-20
libpve-guest-common-perl: 2.0-13
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-16
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.1-12
pve-qemu-kvm: 2.9.1-2
pve-container: 2.0-17
pve-firewall: 3.0-3
pve-ha-manager: 2.0-3
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-2
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.7.2-pve1~bpo90
 

fabian

Proxmox Staff Member
Staff member
Jan 7, 2016
3,399
529
113
thanks for reporting this, we'll see about speeding up the (already planned) revert..
 

macleod

New Member
Aug 3, 2017
22
0
1
41
Same problem on an HP ProLiant DL380 G5 server. Reverting to the 4.10 kernel solved the problem.

Code:
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    2
Socket(s):             2
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 15
Model name:            Intel(R) Xeon(R) CPU            5130  @ 2.00GHz
Stepping:              6
CPU MHz:               2000.002
BogoMIPS:              4000.00
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              4096K
NUMA node0 CPU(s):     0-3
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx tm2 ssse3 cx16 xtpr pdcm dca lahf_lm tpr_shadow dtherm
 

fabian

Proxmox Staff Member
Staff member
Jan 7, 2016
3,399
529
113
test kernel is available on http://download.proxmox.com/temp/pve-kernel-4.13.4-1-pve_4.13.4-26~vmxtest1_amd64.deb , hash sums are:
Code:
SHA512: bf1abdaef81afbd3e06340bc9e01594acc175d6dde225e9311e7fbae53b9256aa62397c74ccb384560404b0de5240b3bbbe152782a77eeb00c6fb07dbc84874b
SHA256: b07b79306318341ae359752fecf1baa9faac09202286634e03b0e11caffd759c 
MD5: 109945d8e7929678df61f927e59b904b
please provide feedback, we don't have any affected machines (anymore) in our test lab, and neither do the upstream KVM kernel developers..
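Before installing the test kernel, the download can be compared against the posted SHA256 value. A minimal sketch, assuming `sha256sum` from coreutils is available; the helper name `verify_sha256` is hypothetical.

```shell
#!/bin/sh
# verify_sha256 FILE EXPECTED_HEX: succeed only if the SHA256 digest of
# FILE equals EXPECTED_HEX. (Hypothetical helper for illustration.)
verify_sha256() {
    actual=$(sha256sum "$1" 2>/dev/null | awk '{ print $1 }')
    # An unreadable file yields an empty digest, which must never match.
    [ -n "$actual" ] && [ "$actual" = "$2" ]
}

# Usage, with the file name and SHA256 value from the post above:
#   verify_sha256 pve-kernel-4.13.4-1-pve_4.13.4-26~vmxtest1_amd64.deb \
#       b07b79306318341ae359752fecf1baa9faac09202286634e03b0e11caffd759c \
#       || echo "checksum mismatch, do not install"
```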
 

sandqst

New Member
Oct 24, 2017
3
0
1
41
After rolling back the 5.1 update and staying on PVE 5.0, I tested the kernel posted above and it seems to work. However, I will be staying with the 4.10.17 kernel for now (and PVE 5.0).

This is running on an HP ProLiant MicroServer Gen8 with an Intel Xeon E3-1256L V2 processor. Well, I guess it is beginning to show its age. :-/
 

cybermcm

Member
Aug 20, 2017
94
10
8
HP DL380 G5 (really old ;-)) with 5.1 and new kernel -> working!!
Thanks to the Proxmox team for this quick solution
 

tburger

Member
Oct 13, 2017
33
3
8
36
I am just using the standard (free) repo. Patch level is as of yesterday, 18:00 German time.
I haven't installed the test kernel from this thread.
Would you like me to check something in particular in the logs?
edit/
Something just came to mind: I would not expect it to make a difference, but I chose a custom partition layout, so I installed Debian first and then applied the Proxmox kernel.
/edit
 
Oct 26, 2017
8
0
1
24
I installed Proxmox on a Dell D620 yesterday and brought it fully up to date with dist-upgrade.
Containers worked just fine, but I couldn't start VMs.
After installing the test kernel and rebooting, the VMs work, but the containers won't start anymore.

The D620 has a T5600 CPU.

Edit: For some reason the CIFS share wasn't mounted on boot, which is why the containers wouldn't start. Fixed now.

Here are some logs:

systemctl status pve-container@607.service

pve-container@607.service - PVE LXC Container: 607
Loaded: loaded (/lib/systemd/system/pve-container@.service; static; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2017-10-26 13:07:59 CEST; 5min ago
Docs: man:lxc-start
man:lxc
man:pct
Process: 2149 ExecStart=/usr/bin/lxc-start -n 607 (code=exited, status=1/FAILURE)

Oct 26 13:07:58 hypervisor3 systemd[1]: Starting PVE LXC Container: 607...
Oct 26 13:07:59 hypervisor3 lxc-start[2149]: lxc-start: 607: lxccontainer.c: wait_on_daemonized_start: 751 No such file or directory - Failed to receive the container state
Oct 26 13:07:59 hypervisor3 lxc-start[2149]: lxc-start: 607: tools/lxc_start.c: main: 368 The container failed to start.
Oct 26 13:07:59 hypervisor3 lxc-start[2149]: lxc-start: 607: tools/lxc_start.c: main: 370 To get more details, run the container in foreground mode.
Oct 26 13:07:59 hypervisor3 lxc-start[2149]: lxc-start: 607: tools/lxc_start.c: main: 372 Additional information can be obtained by setting the --logfile and --logpriority optio
Oct 26 13:07:59 hypervisor3 systemd[1]: pve-container@607.service: Control process exited, code=exited status=1
Oct 26 13:07:59 hypervisor3 systemd[1]: Failed to start PVE LXC Container: 607.
Oct 26 13:07:59 hypervisor3 systemd[1]: pve-container@607.service: Unit entered failed state.
Oct 26 13:07:59 hypervisor3 systemd[1]: pve-container@607.service: Failed with result 'exit-code'.

journalctl -xe

-- Unit user@0.service has finished starting up.
--
-- The start-up result is done.
Oct 26 13:07:27 hypervisor3 kernel: perf: interrupt took too long (3145 > 3131), lowering kernel.perf_event_max_sample
Oct 26 13:07:58 hypervisor3 pvedaemon[2147]: starting CT 607: UPID:hypervisor3:00000863:00008392:59F1C20E:vzstart:607:
Oct 26 13:07:58 hypervisor3 pvedaemon[1194]: <root@pam> starting task UPID:hypervisor3:00000863:00008392:59F1C20E:vzst
Oct 26 13:07:58 hypervisor3 systemd[1]: Starting PVE LXC Container: 607...
-- Subject: Unit pve-container@607.service has begun start-up
-- Defined-By: systemd
--
-- Unit pve-container@607.service has begun starting up.
Oct 26 13:07:59 hypervisor3 lxc-start[2149]: lxc-start: 607: lxccontainer.c: wait_on_daemonized_start: 751 No such fil
Oct 26 13:07:59 hypervisor3 lxc-start[2149]: lxc-start: 607: tools/lxc_start.c: main: 368 The container failed to star
Oct 26 13:07:59 hypervisor3 lxc-start[2149]: lxc-start: 607: tools/lxc_start.c: main: 370 To get more details, run the
Oct 26 13:07:59 hypervisor3 lxc-start[2149]: lxc-start: 607: tools/lxc_start.c: main: 372 Additional information can b
Oct 26 13:07:59 hypervisor3 pvedaemon[1192]: unable to get PID for CT 607 (not running?)
Oct 26 13:07:59 hypervisor3 systemd[1]: pve-container@607.service: Control process exited, code=exited status=1
Oct 26 13:07:59 hypervisor3 systemd[1]: Failed to start PVE LXC Container: 607.
-- Subject: Unit pve-container@607.service has failed
-- Defined-By: systemd
--
-- Unit pve-container@607.service has failed.
--
-- The result is failed.
Oct 26 13:07:59 hypervisor3 systemd[1]: pve-container@607.service: Unit entered failed state.
Oct 26 13:07:59 hypervisor3 systemd[1]: pve-container@607.service: Failed with result 'exit-code'.
Oct 26 13:07:59 hypervisor3 pvedaemon[2147]: command 'systemctl start pve-container@607' failed: exit code 1
Oct 26 13:07:59 hypervisor3 pvedaemon[1194]: <root@pam> end task UPID:hypervisor3:00000863:00008392:59F1C20E:vzstart:6
Oct 26 13:08:00 hypervisor3 systemd[1]: Starting Proxmox VE replication runner...
-- Subject: Unit pvesr.service has begun start-up
-- Defined-By: systemd
--
-- Unit pvesr.service has begun starting up.
Oct 26 13:08:01 hypervisor3 systemd[1]: Started Proxmox VE replication runner.
-- Subject: Unit pvesr.service has finished start-up
-- Defined-By: systemd
--
-- Unit pvesr.service has finished starting up.
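Since the container failure above turned out to be a CIFS share that was not mounted at boot, a mount check before starting containers can save some log reading. A minimal sketch: the helper name `is_mounted` and the path in the usage comment are placeholders, not taken from the post.

```shell
#!/bin/sh
# is_mounted DIR [MOUNTS_FILE]: succeed if DIR is listed as a mount point
# (second field) in MOUNTS_FILE, which defaults to /proc/mounts.
# Note: /proc/mounts escapes spaces in paths as \040, so this simple
# comparison only suits paths without spaces.
is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Usage; /mnt/pve/cifs-share is a placeholder path:
#   is_mounted /mnt/pve/cifs-share || echo "share not mounted"
```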
 

dendi

Member
Nov 17, 2011
105
7
18
@tburger: Yes, if I understood correctly, the problem affects CPUs without virtual NMI support.
Can you check with this command:
cat /proc/cpuinfo | grep nmi

You should get no output if your CPU doesn't support virtual NMI.

Just for curiosity.

Thank you
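The check suggested above can also be written as a whole-word match, so that it does not fire on unrelated flags that merely contain "nmi". A minimal sketch, assuming the kernel reports virtual NMI support as the `vnmi` flag; the helper name `has_cpu_flag` is hypothetical.

```shell
#!/bin/sh
# has_cpu_flag FLAG [CPUINFO_FILE]: succeed if FLAG appears as a whole
# word on a "flags" line of CPUINFO_FILE (default: /proc/cpuinfo).
# (Hypothetical helper for illustration.)
has_cpu_flag() {
    awk -v flag="$1" '
        /^flags[ \t]*:/ { for (i = 1; i <= NF; i++) if ($i == flag) found = 1 }
        END { exit !found }
    ' "${2:-/proc/cpuinfo}"
}

# Usage on a live host:
#   has_cpu_flag vmx  && echo "VT-x present"
#   has_cpu_flag vnmi || echo "no virtual NMI support"
```

On the machines reported in this thread, the first check would be expected to pass and the second to fail, since their flag lists include vmx but not vnmi.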
 

superbit

New Member
Apr 17, 2012
20
0
1
I had problems too with an HP ProLiant ML110 with an Intel Xeon 3040 (no virtual NMI support).
I installed the test kernel and everything is OK with the VMs.

My server still shows this error at boot:
/sbin/modprobe failed: 1
Can't process LV pve/data: thin-pool target support missing from kernel?
Can't process LV pve/vm-901-disk-1: thin-pool target support missing from kernel?

I ran lvm and lvscan, and LVM seems to be fine:
Code:
  ACTIVE            '/dev/pve/swap' [4.00 GiB] inherit
  ACTIVE            '/dev/pve/root' [116.25 GiB] inherit
  ACTIVE            '/dev/pve/data' [329.26 GiB] inherit
  ACTIVE            '/dev/pve/vm-901-disk-1' [100.00 GiB] inherit
 
Oct 26, 2017
8
0
1
24
I get a similar error message on boot: it says the kernel apparently lacks LVM-thin pool support, just as superbit mentioned.
I haven't tested LVM-thin yet, but that old D620 is still capable of running KVM VMs and containers.
 

tburger

Member
Oct 13, 2017
33
3
8
36
...you should get no output if your CPU doesn't support virtual nmi...
As chOzcA75vE0poCY0F6XC states, I don't get anything back from that either (Opteron 6276).
I don't get the LVM message, but that makes sense as I am not using LVM...
 

fabian

Proxmox Staff Member
Staff member
Jan 7, 2016
3,399
529
113
thanks for the feedback, forwarded upstream. the LVM-thin message on boot can be safely ignored, it is not relevant at that stage of the boot process.
 

superbit

New Member
Apr 17, 2012
20
0
1
Thanks, Fabian. I'd prefer not to see an error on boot, but I'm fine with it if it can be safely ignored. Thanks again. One question out of curiosity: what causes this error?
 

neiion

New Member
Apr 3, 2017
4
3
3
40
I had an issue mounting shares too; this is how I fixed it: https://forum.proxmox.com/threads/pve-5-1-cifs-share-issue-mount-error-112-host-is-down.37788/

hope it helps :)
 
