Proxmox 4: exception trying to migrate online

iwasinnamuknow

New Member
Oct 29, 2015
Just installed a cluster (no HA) and it seems to be working well with 3 nodes running identical versions:

Code:
proxmox-ve: 4.0-16 (running kernel: 4.2.2-1-pve)
pve-manager: 4.0-48 (running version: 4.0-48/0d8559d0)
pve-kernel-4.2.2-1-pve: 4.2.2-16
lvm2: 2.02.116-pve1
corosync-pve: 2.3.5-1
libqb0: 0.17.2-1
pve-cluster: 4.0-22
qemu-server: 4.0-30
pve-firmware: 1.1-7
libpve-common-perl: 4.0-29
libpve-access-control: 4.0-9
libpve-storage-perl: 4.0-25
pve-libspice-server1: 0.12.5-1
vncterm: 1.2-1
pve-qemu-kvm: 2.4-9
pve-container: 1.0-6
pve-firewall: 2.0-12
pve-ha-manager: 1.0-9
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.3-1
lxcfs: 0.9-pve2
cgmanager: 0.37-pve2
criu: 1.6.0-1
zfsutils: 0.6.5-pve4~jessie

VM storage is on NFS shared between all nodes. When trying to migrate a running KVM guest via the web interface, the task appears to complete successfully, but the VM is left in an "internal-error" state on the new node. Syslog from the target node follows:

Code:
Oct 28 23:55:22 bagel kernel: WARNING: CPU: 0 PID: 17315 at arch/x86/kvm/emulate.c:5387 x86_emulate_insn+0xbb2/0xe30 [kvm]()
Oct 28 23:55:22 bagel kernel: Modules linked in: openvswitch libcrc32c  xt_NFLOG ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6  ipt_REJECT nf_reject_ipv4 xt_physdev xt_comment nf_conntrack_ipv4  nf_defrag_ipv4 xt_tcpudp xt_mark xt_set xt_addrtype xt_multiport  xt_conntrack nf_conntrack nfsv3 ip_set_hash_net ip_set ip6table_filter  ip6_tables iptable_filter ip_tables x_tables nfsd auth_rpcgss nfs_acl  nfs lockd grace fscache sunrpc ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad  ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi  nfnetlink_log nfnetlink zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO)  spl(O) zavl(PO) ipmi_ssif amdkfd amd_iommu_v2 iTCO_wdt  iTCO_vendor_support gpio_ich radeon ttm coretemp drm_kms_helper  kvm_intel drm kvm input_leds i2c_algo_bit snd_pcm lpc_ich snd_timer snd  i5000_edac psmouse
Oct 28 23:55:22 bagel kernel:  edac_core soundcore  pcspkr serio_raw hpilo ipmi_si hpwdt ipmi_msghandler i5k_amb shpchp  8250_fintek mac_hid vhost_net vhost macvtap macvlan autofs4 hid_generic  usbmouse usbkbd usbhid hid hpsa pata_acpi bnx2 cciss
Oct 28 23:55:22 bagel kernel: CPU: 0 PID: 17315 Comm: kvm Tainted: P           O    4.2.2-1-pve #1
Oct 28 23:55:22 bagel kernel: Hardware name: HP ProLiant DL380 G5, BIOS P56 11/01/2008
Oct 28 23:55:22 bagel kernel:  ffffffffc02b57de ffff8800b7467b68 ffffffff817c92f3 0000000000000007
Oct 28 23:55:22 bagel kernel:  0000000000000000 ffff8800b7467ba8 ffffffff8107776a ffff8802218ca5e0
Oct 28 23:55:22 bagel kernel:  ffff8802218ca5e0 0000000000000006 ffffffffc02ab800 0000000000000000
Oct 28 23:55:22 bagel kernel: Call Trace:
Oct 28 23:55:22 bagel kernel:  [<ffffffff817c92f3>] dump_stack+0x45/0x57
Oct 28 23:55:22 bagel kernel:  [<ffffffff8107776a>] warn_slowpath_common+0x8a/0xc0
Oct 28 23:55:22 bagel kernel:  [<ffffffff8107785a>] warn_slowpath_null+0x1a/0x20
Oct 28 23:55:22 bagel kernel:  [<ffffffffc029c172>] x86_emulate_insn+0xbb2/0xe30 [kvm]
Oct 28 23:55:22 bagel kernel:  [<ffffffffc028090d>] x86_emulate_instruction+0x1bd/0x730 [kvm]
Oct 28 23:55:22 bagel kernel:  [<ffffffffc027e700>] ? kvm_arch_vcpu_load+0x130/0x1e0 [kvm]
Oct 28 23:55:22 bagel kernel:  [<ffffffffc04f9ea0>] handle_exception+0x150/0x380 [kvm_intel]
Oct 28 23:55:22 bagel kernel:  [<ffffffffc04fd55a>] vmx_handle_exit+0xca/0x12c0 [kvm_intel]
Oct 28 23:55:22 bagel kernel:  [<ffffffffc04ff9f4>] ? vmx_vcpu_run+0x4a4/0x6e0 [kvm_intel]
Oct 28 23:55:22 bagel kernel:  [<ffffffffc04f617d>] ? vmx_save_host_state+0x16d/0x1c0 [kvm_intel]
Oct 28 23:55:22 bagel kernel:  [<ffffffffc04f3340>] ? vmx_invpcid_supported+0x30/0x30 [kvm_intel]
Oct 28 23:55:22 bagel kernel:  [<ffffffffc02848c7>] kvm_arch_vcpu_ioctl_run+0x367/0x11e0 [kvm]
Oct 28 23:55:22 bagel kernel:  [<ffffffffc027e72f>] ? kvm_arch_vcpu_load+0x15f/0x1e0 [kvm]
Oct 28 23:55:22 bagel kernel:  [<ffffffffc026e46d>] kvm_vcpu_ioctl+0x2fd/0x570 [kvm]
Oct 28 23:55:22 bagel kernel:  [<ffffffff810b70c8>] ? __wake_up_locked_key+0x18/0x20
Oct 28 23:55:22 bagel kernel:  [<ffffffff81237cb1>] ? eventfd_write+0xc1/0x260
Oct 28 23:55:22 bagel kernel:  [<ffffffff8120172a>] do_vfs_ioctl+0x2ba/0x490
Oct 28 23:55:22 bagel kernel:  [<ffffffff811ee5b9>] ? vfs_write+0x149/0x190
Oct 28 23:55:22 bagel kernel:  [<ffffffff81201979>] SyS_ioctl+0x79/0x90
Oct 28 23:55:22 bagel kernel:  [<ffffffff81173ddf>] ? fire_user_return_notifiers+0x3f/0x50
Oct 28 23:55:22 bagel kernel:  [<ffffffff817cfd72>] entry_SYSCALL_64_fastpath+0x16/0x75
Oct 28 23:55:22 bagel kernel: ---[ end trace cdaa45724a0458d7 ]---

Does anyone have any idea what this could be? I'm a bit lost on that exception and haven't managed to find anything useful online so far.

Thanks in advance
 
Hi,
what do you have set as the CPU type in the VM config?
Does your cluster have different physical CPUs?
If so, you have to set the oldest CPU model present in your cluster, or, if you mix AMD and Intel, you have to choose kvm64.
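In case it helps anyone reading later: assuming a hypothetical VMID of 100, you can check and pin the CPU type from the CLI with `qm` (the web UI's Hardware → Processors dialog does the same thing):

```shell
# Show the VM's configured CPU type; no "cpu:" line means the default is in use
qm config 100 | grep -i '^cpu'

# Explicitly pin the lowest-common-denominator model for a mixed AMD/Intel cluster
qm set 100 --cpu kvm64
```

A sketch only; substitute your real VMID, and note the change takes effect on the next VM start or migration.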
 
Thanks for the quick reply.

We have two nodes with identical Xeons and one with Opterons. All VMs are currently set to kvm64, I believe (they haven't been changed from the default). Just to be on the safe side, I changed a machine from "Default (kvm64)" to "kvm64", and it now migrates successfully with < 100 ms downtime.

I don't mind overriding the default, although I'm not sure why it's necessary. It's working well enough for now, thanks for the suggestions :)
 
