> shutdown + start VM required to use new version.

I did do a shut/start of the VMs after reading the first part of this thread, but still ran into the same issue after.
This sounds like there are some issues with the source or target storage. It could still be a regression from QEMU 9, but IMO that's a bit less likely.
Can you check the kernel/system logs for possibly related messages happening around the time the live migration fails? Please also post the VM config (qm config VMID) and the type of the underlying source and target storage.
qm config 105
boot: order=sata0;ide2;net0
cores: 4
cpu: host
ide2: none,media=cdrom
memory: 4096
name: fr-dev-qa-gen5-balanced
net0: virtio=0A:1E:76:93:E3:58,bridge=vmbr1,firewall=1
numa: 0
onboot: 1
ostype: l26
sata0: vm-storage-01:vm-105-disk-0,size=100G
scsihw: virtio-scsi-pci
smbios1: uuid=42f30e80-d619-4133-b1c9-e40a6dcdfddc
sockets: 1
vmgenid: 0f8eb6a8-4904-4047-8200-b4ab077816a1
Nov 05 09:47:38 pve50 QEMU[298688]: kvm: Failed to put registers after init: Invalid argument
Nov 05 09:47:38 pve50 kernel: tap105i0: left allmulticast mode
Nov 05 09:47:38 pve50 kernel: fwbr105i0: port 2(tap105i0) entered disabled state
Nov 05 09:47:38 pve50 kernel: fwbr105i0: port 1(fwln105i0) entered disabled state
Nov 05 09:47:38 pve50 kernel: vmbr1: port 3(fwpr105p0) entered disabled state
Nov 05 09:47:38 pve50 kernel: fwln105i0 (unregistering): left allmulticast mode
Nov 05 09:47:38 pve50 kernel: fwln105i0 (unregistering): left promiscuous mode
Nov 05 09:47:38 pve50 kernel: fwbr105i0: port 1(fwln105i0) entered disabled state
Nov 05 09:47:38 pve50 kernel: fwpr105p0 (unregistering): left allmulticast mode
Nov 05 09:47:38 pve50 kernel: fwpr105p0 (unregistering): left promiscuous mode
Nov 05 09:47:38 pve50 kernel: vmbr1: port 3(fwpr105p0) entered disabled state
Nov 05 09:47:38 pve50 kernel: zd96: p1 p2 p3
Nov 05 09:47:38 pve50 lvm[299538]: /dev/zd96p3 excluded: device is rejected by filter config.
Nov 05 09:47:39 pve50 systemd[1]: 105.scope: Deactivated successfully.
Nov 05 09:47:39 pve50 systemd[1]: 105.scope: Consumed 1min 37.896s CPU time.
Nov 05 09:47:39 pve50 sshd[299547]: Accepted publickey for root from 10.10.20.40 port 33550 ssh2: RSA ....
Nov 05 09:47:39 pve50 sshd[299547]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
Nov 05 09:47:39 pve50 systemd-logind[2170]: New session 74 of user root.
Nov 05 09:47:39 pve50 systemd[1]: Started session-74.scope - Session 74 of User root.
Nov 05 09:47:39 pve50 sshd[299547]: pam_env(sshd:session): deprecated reading of user environment enabled
Nov 05 09:47:40 pve50 pvestatd[2741]: no such logical volume pve/data
Nov 05 09:47:40 pve50 qm[299553]: <root@pam> starting task UPID:pve50:0004923B:00487BF4:672A5A3C:qmstop:105:root@pam:
Nov 05 09:47:40 pve50 qm[299579]: stop VM 105: UPID:pve50:0004923B:00487BF4:672A5A3C:qmstop:105:root@pam:
Nov 05 09:47:40 pve50 qm[299553]: <root@pam> end task UPID:pve50:0004923B:00487BF4:672A5A3C:qmstop:105:root@pam: OK
Nov 05 09:47:40 pve50 sshd[299547]: Received disconnect from 10.10.20.40 port 33550:11: disconnected by user
Nov 05 09:47:40 pve50 sshd[299547]: Disconnected from user root 10.10.20.40 port 33550
Nov 05 09:47:40 pve50 sshd[299547]: pam_unix(sshd:session): session closed for user root
@twhidden is the failure always for a SATA type drive, or for others too? Does it help if you downgrade with apt install pve-qemu-kvm=8.2.2-1, or further with apt install pve-qemu-kvm=8.1.5-6?

Hi @fiona
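For reference, the suggested downgrade might look like this on the affected node (a sketch only; the package versions are the ones named above, and the VM ID is the one from this thread - a running VM keeps its old QEMU process, so it needs a full stop/start afterwards to actually run the older binary):

```shell
# Pin pve-qemu-kvm back to an older build (versions from this thread)
apt install pve-qemu-kvm=8.2.2-1
# ...or, if the problem persists, one step further back:
# apt install pve-qemu-kvm=8.1.5-6

# Full stop/start so the VM switches to the newly installed binary
# (a reboot from inside the guest is not enough)
qm stop 105 && qm start 105
```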
Nov 05 09:59:38 pve51 QEMU[262939]: kvm: warning: TSC frequency mismatch between VM (2194842 kHz) and host (2599997 kHz), and TSC scaling unavailable
Nov 05 09:59:38 pve51 QEMU[262939]: kvm: Failed to put registers after init: Invalid argument
Nov 05 10:06:16 pve51 QEMU[265298]: kvm: warning: TSC frequency mismatch between VM (2194842 kHz) and host (2599997 kHz), and TSC scaling unavailable
Nov 05 10:06:16 pve51 kernel: clearing PKRU xfeature bit as vCPU from PID 265490 reports no PKRU support - migration from fpu-leaky kernel?
Nov 05 10:06:16 pve51 kernel: clearing PKRU xfeature bit as vCPU from PID 265491 reports no PKRU support - migration from fpu-leaky kernel?
Nov 05 10:06:16 pve51 kernel: clearing PKRU xfeature bit as vCPU from PID 265492 reports no PKRU support - migration from fpu-leaky kernel?
Nov 05 10:06:16 pve51 kernel: clearing PKRU xfeature bit as vCPU from PID 265493 reports no PKRU support - migration from fpu-leaky kernel?
Nov 05 10:06:16 pve51 kernel: clearing PKRU xfeature bit as vCPU from PID 265494 reports no PKRU support - migration from fpu-leaky kernel?
Nov 05 10:06:16 pve51 kernel: clearing PKRU xfeature bit as vCPU from PID 265495 reports no PKRU support - migration from fpu-leaky kernel?
Nov 05 10:06:16 pve51 kernel: clearing PKRU xfeature bit as vCPU from PID 265496 reports no PKRU support - migration from fpu-leaky kernel?
Nov 05 10:06:16 pve51 QEMU[265298]: kvm: warning: TSC frequency mismatch between VM (2194842 kHz) and host (2599997 kHz), and TSC scaling unavailable
Nov 05 10:06:16 pve51 QEMU[265298]: kvm: warning: TSC frequency mismatch between VM (2194842 kHz) and host (2599997 kHz), and TSC scaling unavailable
Nov 05 10:06:16 pve51 QEMU[265298]: kvm: warning: TSC frequency mismatch between VM (2194842 kHz) and host (2599997 kHz), and TSC scaling unavailable
Nov 05 10:06:16 pve51 QEMU[265298]: kvm: warning: TSC frequency mismatch between VM (2194842 kHz) and host (2599997 kHz), and TSC scaling unavailable
Nov 05 10:06:16 pve51 QEMU[265298]: kvm: warning: TSC frequency mismatch between VM (2194842 kHz) and host (2599997 kHz), and TSC scaling unavailable
Nov 05 10:06:16 pve51 QEMU[265298]: kvm: warning: TSC frequency mismatch between VM (2194842 kHz) and host (2599997 kHz), and TSC scaling unavailable
Nov 05 10:06:16 pve51 QEMU[265298]: kvm: warning: TSC frequency mismatch between VM (2194842 kHz) and host (2599997 kHz), and TSC scaling unavailable
> Here is what happens in the system logs at the time of that error - most notably, the "kvm: Failed to put registers after init: Invalid argument" which was in red.

I'd guess this is the actual cause of the issue, and the failing drive-mirror is just a later consequence. What is the CPU model of the hosts, i.e. the migration source node and the migration target node? What kernels are they running?

> I'd guess this is the actual cause of the issue, and the failing drive-mirror is just a later consequence.

Not following on "failing drive-mirror"... but here is the info on the hosts:
pve1 (source)
CPU(s) 40 x Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz (2 Sockets)
Kernel Version Linux 6.8.12-3-pve (2024-10-23T11:41Z)
Boot Mode EFI
Manager Version pve-manager/8.2.7

pve50 or pve51 (destination)
CPU(s) 56 x Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz (2 Sockets)
Kernel Version Linux 6.8.12-3-pve (2024-10-23T11:41Z)
Boot Mode Legacy BIOS
Manager Version pve-manager/8.2.7
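As an aside, a quick way to see which guest-visible CPU flags differ between the two CPU generations is to diff the flag lists of both nodes (a hypothetical sketch, using the node names from this thread and assuming SSH access between them):

```shell
# Compare the CPU flag sets of the source and target node; any flag
# present on only one side is a feature a 'host'-type guest could rely
# on and then lose when live-migrating between the two machines.
diff <(ssh pve1  "grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' | sort") \
     <(ssh pve50 "grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' | sort")
```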
> not following on "failing drive-mirror"... but here is the info on the hosts:
> pve1 (source): CPU(s) 40 x Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz (2 Sockets), Kernel Version Linux 6.8.12-3-pve (2024-10-23T11:41Z), Boot Mode EFI, Manager Version pve-manager/8.2.7
> pve50 or pve51 (destination): CPU(s) 56 x Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz (2 Sockets), Kernel Version Linux 6.8.12-3-pve (2024-10-23T11:41Z), Boot Mode Legacy BIOS, Manager Version pve-manager/8.2.7

That was the original error message you posted.

> cpu: host

You should not use the host CPU type when you don't have the same CPU model on source and target. Live migration cannot be guaranteed to work then, see: https://pve.proxmox.com/pve-docs/chapter-qm.html#_cpu_type
> You should not use the host CPU type when you don't have the same CPU model on source and target. Live migration cannot be guaranteed to work then, see: https://pve.proxmox.com/pve-docs/chapter-qm.html#_cpu_type

Gotcha -- but that isn't the reason for the migration failing, is it? The "Invalid Argument"?

Side note: I had to use host for some dev software we were testing, as it required certain CPU flags that were not available under the default type. I believe it was related to AVX2. But that is good to know. The new hardware is identical, so it should be better.

> Gotcha -- but that isn't the reason for the migration failing, is it? The "Invalid Argument"?

It most likely is. "Failed to put registers" refers to CPU registers: https://gitlab.com/qemu-project/qemu/-/blob/master/accel/kvm/kvm-all.c?ref_type=heads#L2902

Thanks for the hint about downgrading to 8.1.5-6 - that got things working again. Hope you can reproduce the 9.x issue with what we learned here.

> Hope you can reproduce the 9.x issue with what we learned here.

I don't have exactly those CPU models, and there is no need to reproduce - see my previous reply.

> Side note: I had to use host for some dev software we were testing, as it required certain CPU flags that were not available under the default type. I believe it was related to AVX2.

Sorry if it is off-topic, but @twhidden, for the AVX flags you can use the x86-64-v3 CPU type, which supports AVX2 and corresponds to an Intel Haswell (2013, Xeon v3 or newer) or an AMD Excavator (2015).
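A sketch of how that switch might look on the CLI (VM ID taken from this thread; the new CPU type only takes effect after a full stop/start of the VM, not a guest-internal reboot):

```shell
# Hypothetical sketch: switch VM 105 from 'host' to the portable
# x86-64-v3 CPU type, which still exposes AVX2 but stays identical
# across the nodes, so live migration is no longer model-dependent.
qm set 105 --cpu x86-64-v3

# Full stop/start so the guest actually boots with the new CPU model
qm stop 105 && qm start 105
```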