Virtual Machine's status: Internal-error

RomanB

New Member
Apr 4, 2016
Sorry for my bad English.
I have a big problem with Proxmox 4.1 on an AMD CPU + motherboard.
The virtual machine hangs hard when I install a guest OS (Ubuntu, for example).
The Proxmox interface shows "status: internal-error", and in dmesg on the host I see this: http://pastebin.com/f43s8QJH
I tried to google it and found this: https://bugzilla.redhat.com/show_bug.cgi?id=1266659
The problem described there is very similar to mine: when Ubuntu tries to install the system packages, the virtual machine hangs.
I tried several kernel versions (pve-kernel-4.2.2-1-pve, pve-kernel-4.2.8-1-pve) and upgraded the BIOS; it doesn't help.

proxmox-ve: 4.1-39 (running kernel: 4.2.8-1-pve)
pve-manager: 4.1-22 (running version: 4.1-22/aca130cf)
pve-kernel-4.2.8-1-pve: 4.2.8-39
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 1.0-1
pve-cluster: 4.0-36
qemu-server: 4.0-64
pve-firmware: 1.1-7
libpve-common-perl: 4.0-54
libpve-access-control: 4.0-13
libpve-storage-perl: 4.0-45
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.5-9
pve-container: 1.0-52
pve-firewall: 2.0-22
pve-ha-manager: 1.0-25
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-7
lxcfs: 2.0.0-pve2
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve7~jessie

CPU: AMD FX(tm)-8350 Eight-Core Processor
Motherboard: ASUS M5A78L-M/USB3, bios v2101.

On a host with an Intel CPU, Proxmox 4.1 works fine. Also, Proxmox 3.4 doesn't have this problem on AMD hosts.
 
Hi,
have you tried disabling nested virtualization?
If yes, does it help?
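In case it helps others reading along, a minimal sketch of how to check and disable nested virtualization on an AMD host (the modprobe.d file name below is just an example):

Code:
# check whether nested virtualization is enabled for kvm_amd (1 = on, 0 = off)
cat /sys/module/kvm_amd/parameters/nested

# disable it persistently, then reload the module (no VMs may be running)
echo "options kvm_amd nested=0" > /etc/modprobe.d/kvm-amd.conf
rmmod kvm_amd && modprobe kvm_amd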
 
It doesn't help :(
> cat /sys/module/kvm_amd/parameters/nested
> 0
When the virtual machine starts, I see these messages in dmesg:
[ 344.118746] kvm [4788]: vcpu0 unhandled rdmsr: 0xc001100d
[ 344.284324] kvm [4788]: vcpu1 unhandled rdmsr: 0xc001100d
And then the VM hangs, with this in dmesg: http://pastebin.com/ga4AqkVT
 
Hi,
can you send the config of the VM?

/etc/pve/qemu-server/<VMID>.conf
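(If it is easier, qm config should print the same information; the VMID 100 below is just an example.)

Code:
qm config 100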
 
bootdisk: virtio0
cores: 2
ide2: none,media=cdrom
memory: 512
name: ceph-test
net0: e1000=66:63:34:64:31:66,bridge=vmbr0
numa: 0
ostype: l26
smbios1: uuid=d75c86aa-3ad5-4731-b3fd-102da2038af8
sockets: 1
virtio0: ceph-rbd-pool:vm-100-disk-1,size=50G
 
I installed the 4.4.0-0.bpo.1-amd64 kernel from the backports repository, and I got this error again when Ubuntu installs the base packages (Ubuntu 14.04 Server was being installed via PXE).
dmesg: http://pastebin.com/LEk0u1qf
 
By default KVM runs with the options "-cpu kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce" (ps aux | grep kvm).
I manually ran kvm with different "-cpu" options:

# NO CPU OPTIONS (only -cpu kvm64) - EVERYTHING WORKS, no error messages in dmesg
/usr/bin/kvm -id 100 -chardev socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait -mon chardev=qmp,mode=control -pidfile /var/run/qemu-server/100.pid -daemonize -smbios type=1,uuid=d75c86aa-3ad5-4731-b3fd-102da2038af8 -name ceph-test -smp 2,sockets=1,cores=2,maxcpus=2 -nodefaults -boot menu=on,strict=on,reboot-timeout=1000 -vga cirrus -vnc unix:/var/run/qemu-server/100.vnc,x509,password -cpu kvm64 -m 512 -k en-us -device pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e -device pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f -device piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=tablet,bus=uhci.0,port=1 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -iscsi initiator-name=iqn.1993-08.org.debian:01:5fd447b64e1 -drive if=none,id=drive-ide2,media=cdrom,aio=threads -device ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200 -drive file=rbd:pve_data/vm-100-disk-2:mon_host=10.10.10.3;10.10.10.4;10.10.10.5:id=admin:auth_supported=cephx:keyring=/etc/pve/priv/ceph/my-ceph-pool.keyring,if=none,id=drive-virtio1,format=raw,cache=none,aio=native,detect-zeroes=on -device virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb -netdev type=tap,id=net0,ifname=tap100i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown -device e1000,mac=66:63:34:64:31:66,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=100

# PROXMOX 3.4 CPU OPTIONS - everything works fine, in dmesg only this: kvm [18840]: vcpu0 unhandled rdmsr: 0xc001100d ; kvm [18840]: vcpu1 unhandled rdmsr: 0xc001100d
/usr/bin/kvm -id 100 -chardev socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait -mon chardev=qmp,mode=control -pidfile /var/run/qemu-server/100.pid -daemonize -smbios type=1,uuid=d75c86aa-3ad5-4731-b3fd-102da2038af8 -name ceph-test -smp 2,sockets=1,cores=2,maxcpus=2 -nodefaults -boot menu=on,strict=on,reboot-timeout=1000 -vga cirrus -vnc unix:/var/run/qemu-server/100.vnc,x509,password -cpu kvm64,+lahf_lm,+x2apic,+sep -m 512 -k en-us -device pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e -device pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f -device piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=tablet,bus=uhci.0,port=1 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -iscsi initiator-name=iqn.1993-08.org.debian:01:5fd447b64e1 -drive if=none,id=drive-ide2,media=cdrom,aio=threads -device ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200 -drive file=rbd:pve_data/vm-100-disk-2:mon_host="10.10.10.3;10.10.10.4;10.10.10.5:id=admin:auth_supported=cephx:keyring=/etc/pve/priv/ceph/my-ceph-pool.keyring",if=none,id=drive-virtio1,format=raw,cache=none,aio=native,detect-zeroes=on -device virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb -netdev type=tap,id=net0,ifname=tap100i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown -device e1000,mac=66:63:34:64:31:66,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=100

# DEFAULT 4.1 CPU OPTIONS - FAIL
/usr/bin/kvm -id 100 -chardev socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait -mon chardev=qmp,mode=control -pidfile /var/run/qemu-server/100.pid -daemonize -smbios type=1,uuid=d75c86aa-3ad5-4731-b3fd-102da2038af8 -name ceph-test -smp 2,sockets=1,cores=2,maxcpus=2 -nodefaults -boot menu=on,strict=on,reboot-timeout=1000 -vga cirrus -vnc unix:/var/run/qemu-server/100.vnc,x509,password -cpu kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce -m 512 -k en-us -device pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e -device pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f -device piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=tablet,bus=uhci.0,port=1 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -iscsi initiator-name=iqn.1993-08.org.debian:01:5fd447b64e1 -drive if=none,id=drive-ide2,media=cdrom,aio=threads -device ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200 -drive file=rbd:pve_data/vm-100-disk-2:mon_host="10.10.10.3;10.10.10.4;10.10.10.5:id=admin:auth_supported=cephx:keyring=/etc/pve/priv/ceph/my-ceph-pool.keyring",if=none,id=drive-virtio1,format=raw,cache=none,aio=native,detect-zeroes=on -device virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb -netdev type=tap,id=net0,ifname=tap100i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown -device e1000,mac=66:63:34:64:31:66,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=100

# DEFAULT 4.1 CPU OPTIONS WITHOUT "+kvm_pv_eoi"- FAIL
/usr/bin/kvm -id 100 -chardev socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait -mon chardev=qmp,mode=control -pidfile /var/run/qemu-server/100.pid -daemonize -smbios type=1,uuid=d75c86aa-3ad5-4731-b3fd-102da2038af8 -name ceph-test -smp 2,sockets=1,cores=2,maxcpus=2 -nodefaults -boot menu=on,strict=on,reboot-timeout=1000 -vga cirrus -vnc unix:/var/run/qemu-server/100.vnc,x509,password -cpu kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,enforce -m 512 -k en-us -device pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e -device pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f -device piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=tablet,bus=uhci.0,port=1 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -iscsi initiator-name=iqn.1993-08.org.debian:01:5fd447b64e1 -drive if=none,id=drive-ide2,media=cdrom,aio=threads -device ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200 -drive file=rbd:pve_data/vm-100-disk-2:mon_host="10.10.10.3;10.10.10.4;10.10.10.5:id=admin:auth_supported=cephx:keyring=/etc/pve/priv/ceph/my-ceph-pool.keyring",if=none,id=drive-virtio1,format=raw,cache=none,aio=native,detect-zeroes=on -device virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb -netdev type=tap,id=net0,ifname=tap100i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown -device e1000,mac=66:63:34:64:31:66,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=100

# DEFAULT 4.1 CPU OPTIONS WITHOUT "+kvm_pv_unhalt", ALL WORKS GOOD, in dmesg: kvm [18267]: vcpu0 unhandled rdmsr: 0xc001100d ; kvm [18267]: vcpu1 unhandled rdmsr: 0xc001100d
/usr/bin/kvm -id 100 -chardev socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait -mon chardev=qmp,mode=control -pidfile /var/run/qemu-server/100.pid -daemonize -smbios type=1,uuid=d75c86aa-3ad5-4731-b3fd-102da2038af8 -name ceph-test -smp 2,sockets=1,cores=2,maxcpus=2 -nodefaults -boot menu=on,strict=on,reboot-timeout=1000 -vga cirrus -vnc unix:/var/run/qemu-server/100.vnc,x509,password -cpu kvm64,+lahf_lm,+sep,+kvm_pv_eoi,enforce -m 512 -k en-us -device pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e -device pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f -device piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=tablet,bus=uhci.0,port=1 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -iscsi initiator-name=iqn.1993-08.org.debian:01:5fd447b64e1 -drive if=none,id=drive-ide2,media=cdrom,aio=threads -device ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200 -drive file=rbd:pve_data/vm-100-disk-2:mon_host="10.10.10.3;10.10.10.4;10.10.10.5:id=admin:auth_supported=cephx:keyring=/etc/pve/priv/ceph/my-ceph-pool.keyring",if=none,id=drive-virtio1,format=raw,cache=none,aio=native,detect-zeroes=on -device virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb -netdev type=tap,id=net0,ifname=tap100i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown -device e1000,mac=66:63:34:64:31:66,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=100

# ONLY "+kvm_pv_unhalt" OPTION - FAIL
/usr/bin/kvm -id 100 -chardev socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait -mon chardev=qmp,mode=control -pidfile /var/run/qemu-server/100.pid -daemonize -smbios type=1,uuid=d75c86aa-3ad5-4731-b3fd-102da2038af8 -name ceph-test -smp 2,sockets=1,cores=2,maxcpus=2 -nodefaults -boot menu=on,strict=on,reboot-timeout=1000 -vga cirrus -vnc unix:/var/run/qemu-server/100.vnc,x509,password -cpu kvm64,+kvm_pv_unhalt -m 512 -k en-us -device pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e -device pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f -device piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=tablet,bus=uhci.0,port=1 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -iscsi initiator-name=iqn.1993-08.org.debian:01:5fd447b64e1 -drive if=none,id=drive-ide2,media=cdrom,aio=threads -device ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200 -drive file=rbd:pve_data/vm-100-disk-2:mon_host="10.10.10.3;10.10.10.4;10.10.10.5:id=admin:auth_supported=cephx:keyring=/etc/pve/priv/ceph/my-ceph-pool.keyring",if=none,id=drive-virtio1,format=raw,cache=none,aio=native,detect-zeroes=on -device virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb -netdev type=tap,id=net0,ifname=tap100i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown -device e1000,mac=66:63:34:64:31:66,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=100

So, the conclusion is that the "kvm_pv_unhalt" option is what breaks it?
 
I commented out the line "push @$cpuFlags , '+kvm_pv_unhalt' if !$nokvm;" in /usr/share/perl5/PVE/QemuServer.pm, then restarted pvedaemon, and now all VMs start without the "+kvm_pv_unhalt" option.
It works, but it is a "dirty hack".
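For reference, the workaround looks roughly like this (the exact location of the line in QemuServer.pm varies between qemu-server versions, and the edit will be lost on the next package update):

Code:
# in /usr/share/perl5/PVE/QemuServer.pm, comment out the flag:
#push @$cpuFlags , '+kvm_pv_unhalt' if !$nokvm;

# restart the daemon so newly started VMs pick up the change
systemctl restart pvedaemon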
 
The patch was already applied to the 4.4.0-0.bpo.1-amd64 kernel. I tried that kernel; no difference compared to the pve kernel.

linux-4.4.6/arch/x86/kvm/svm.c:
Code:
    /*
     * svm_set_cr0() sets PG and WP and clears NW and CD on save->cr0.
     * It also updates the guest-visible cr0 value.
     */
    svm_set_cr0(&svm->vcpu, X86_CR0_NW | X86_CR0_CD | X86_CR0_ET);
    kvm_mmu_reset_context(&svm->vcpu);

    save->cr4 = X86_CR4_PAE;
    /* rdx = ?? */
 
I am getting an identical error on Proxmox boot, but my VM seems to run without any problem. However, the VM is still being configured and is not under any kind of load. The errors appear on the console immediately after Proxmox boots:
kvm[9764]: vcpu0 unhandled rdmsr : 0x570
kvm[9764]: vcpu1 unhandled rdmsr : 0x570
kvm[9764]: vcpu2 unhandled rdmsr : 0x570
kvm[9764]: vcpu3 unhandled rdmsr : 0x570
I am not sure whether they appear before or after the VM starts.


pveversion reports: pve-manager/4.1-22/aca130cf (running kernel: 4.2.8-1-pve)
Server hardware: Dell PowerEdge T630
 
The 4.4.6-pve kernel doesn't fix this issue.
proxmox-ve: 4.2-48 (running kernel: 4.4.6-1-pve)
pve-manager: 4.2-2 (running version: 4.2-2/725d76f0)
pve-kernel-4.4.6-1-pve: 4.4.6-48
pve-kernel-4.2.6-1-pve: 4.2.6-36
pve-kernel-4.2.8-1-pve: 4.2.8-41
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 1.0-1
pve-cluster: 4.0-39
qemu-server: 4.0-72
pve-firmware: 1.1-8
libpve-common-perl: 4.0-59
libpve-access-control: 4.0-16
libpve-storage-perl: 4.0-50
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.5-14
pve-container: 1.0-62
pve-firewall: 2.0-25
pve-ha-manager: 1.0-28
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-7
lxcfs: 2.0.0-pve2
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve9~jessie
 
Hi,
Today I encountered this issue on an HP ProLiant DL380 G6 (AMD CPU).

I migrated a VM from a Dell R710 host to an HP ProLiant DL380 G6 host. After the migration completed, the VM was in the "internal-error" state.

I see something similar to yours in /var/log/syslog:
Code:
....
WARNING: CPU: 4 PID: 8774 at arch/x86/kvm/emulate.c:5410 x86_emulate_insn+0xbb2/0xe30 [kvm]()
...
WARNING: CPU: 4 PID: 8774 at arch/x86/kvm/x86.c:345 exception_type+0x49/0x50 [kvm]()

proxmox-ve: 4.2-51 (running kernel: 4.4.8-1-pve)
pve-manager: 4.2-5 (running version: 4.2-5/7cf09667)
pve-kernel-4.4.6-1-pve: 4.4.6-48
pve-kernel-4.2.6-1-pve: 4.2.6-36
pve-kernel-4.4.8-1-pve: 4.4.8-51
pve-kernel-4.2.8-1-pve: 4.2.8-41
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 1.0-1
pve-cluster: 4.0-39
qemu-server: 4.0-75
pve-firmware: 1.1-8
libpve-common-perl: 4.0-62
libpve-access-control: 4.0-16
libpve-storage-perl: 4.0-50
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.5-17
pve-container: 1.0-64
pve-firewall: 2.0-27
pve-ha-manager: 1.0-31
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-7
lxcfs: 2.0.0-pve2
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve9~jessie
 
Hi,
Today I encountered this issue on an HP ProLiant DL380 G6 (AMD CPU).

I migrated a VM from a Dell R710 host to an HP ProLiant DL380 G6 host. After the migration completed, the VM was in the "internal-error" state.
This is due to migrating between hosts with different CPUs (Intel -> AMD).
 
Hi,
I have migrated virtual machines between these host machines before. This is the first time I have seen this problem.
Even if mixing AMD and Intel hosts is not recommended, it shouldn't be a problem.

https://forum.proxmox.com/threads/amd-and-intel-in-one-cluster.24004/#post-120666
yes, HA is no problem.

live migration between AMD and Intel "should" work, as long as you use only CPU features which are available on both.

but yes, using the same CPUs in all nodes is a better approach.

I have exactly the same error blocks as the OP, but for me the error appears randomly.
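On the quoted advice about using only CPU features available on both vendors: one way to do that (just a sketch; VMID 100 is an example) is to pin the VM to a baseline CPU model in its config:

Code:
# /etc/pve/qemu-server/100.conf (excerpt)
cpu: kvm64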
 
Hi,

I understand that internal-error can be triggered by bad hardware. If a VM running under KVM gets unexpectedly paused, take a look at /var/log/syslog on the host. If you see something like this, try replacing your RAM or CPU. In rare situations it could be your mainboard.

Oct 8 02:30:01 slash QEMU[3038]: KVM internal error. Suberror: 3
Oct 8 02:30:01 slash QEMU[3038]: extra data[0]: 0x0000000080000b0e
Oct 8 02:30:01 slash QEMU[3038]: extra data[1]: 0x0000000000000031
Oct 8 02:30:01 slash QEMU[3038]: extra data[2]: 0x0000000000000083
Oct 8 02:30:01 slash QEMU[3038]: extra data[3]: 0x0000000812968fe0
Oct 8 02:30:01 slash QEMU[3038]: extra data[4]: 0x0000000000000002
Oct 8 02:30:01 slash QEMU[3038]: RAX=0000000812968008 RBX=fffffe0010473090 RCX=00000000c0000101 RDX=00000000ffffffff

To confirm you're seeing this issue, make sure that Suberror is 3 (which means KVM_INTERNAL_ERROR_DELIVERY_EV) and extra data[1] is 0x31 (indicating that the VM exit reason was EXIT_REASON_EPT_MISCONFIG). The rest of the fields may vary, but those two must match these values.
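A quick way to pull those lines out of the log (just a sketch; adjust the context length as needed):

Code:
# show each KVM internal error plus the extra data / register dump that follows it
grep -A 8 "KVM internal error" /var/log/syslog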

Best,

Joe
 
