QEMU 10.1 available on pve-test and pve-no-subscription as of now

Hi,
I’m currently updating some of my VMs to use newer machine types, and I noticed that Proxmox offers both pc-q35-10.0+pve1 and pc-q35-10.1.
Before applying this change in production, I’d like to ask:
Which one is recommended for a stable production environment?
  • Is it safer to keep using the Proxmox-patched version (pc-q35-10.0+pve1)?
  • Or is pc-q35-10.1 (without the PVE suffix) already considered stable and fully supported for production workloads?
Machine versions are incremental, so changes that apply to 10.0+pve1 also apply to 10.1; in particular, the change for which the +pve1 bump was made. In general, it's recommended to use the latest available version.
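For example, switching an existing VM to the latest q35 machine version could look like this (a minimal sketch; VMID 100 is just a placeholder, and a new machine version only takes effect after a fresh stop/start of the guest):
Code:
# show the currently configured machine type
qm config 100 | grep machine
# pin the VM to a specific machine version ...
qm set 100 --machine pc-q35-10.1
# ... or use the unversioned alias, which generally picks the newest available version at start
qm set 100 --machine q35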

See also:
https://pve.proxmox.com/pve-docs/chapter-qm.html#qm_machine_type
https://pve.proxmox.com/wiki/QEMU_Machine_Version_Upgrade
 
I have a problem with cloning templates. I have an LVM storage on which I keep templates. When I try to clone a template, it fails with this error:
Code:
create full clone of drive ide0 (lvm-nvme:vm-9922-cloudinit)
  Logical volume "vm-219-cloudinit" created.
create full clone of drive virtio0 (lvm-nvme:vm-9922-disk-0)
  Logical volume "vm-219-disk-1" created.
transferred 0.0 B of 5.0 GiB (0.00%)
transferred 52.2 MiB of 5.0 GiB (1.02%)
transferred 103.9 MiB of 5.0 GiB (2.03%)
transferred 156.2 MiB of 5.0 GiB (3.05%)
transferred 207.9 MiB of 5.0 GiB (4.06%)
transferred 260.1 MiB of 5.0 GiB (5.08%)
transferred 311.8 MiB of 5.0 GiB (6.09%)
....
transferred 1.9 GiB of 5.0 GiB (38.59%)
transferred 2.0 GiB of 5.0 GiB (39.61%)
qemu-img: error while writing at byte 2145386496: Invalid argument
  Logical volume "vm-219-cloudinit" successfully removed.
  Logical volume "vm-219-disk-1" successfully removed.
TASK ERROR: clone failed: copy failed: command '/usr/bin/qemu-img convert -p -n -f raw -O raw /dev/vg_data/vm-9922-disk-0 /dev/vg_data/vm-219-disk-1' failed: exit code 1
I have everything up to date; I updated today just to be sure. pve-qemu-kvm is at version 10.1.2-4.
If I downgrade to version 10.0.2-4 (which I did), it works fine. Am I the only one who has this problem?
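If someone else needs to downgrade as well, something along these lines should work (a sketch; it assumes the 10.0.2-4 package is still available from the configured Proxmox repository or the local apt cache, and running VMs only pick up the old binary after a full stop/start):
Code:
apt install pve-qemu-kvm=10.0.2-4
# optional: keep apt from pulling the newer version back in on the next upgrade
apt-mark hold pve-qemu-kvm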

Is there a bug report for this? I'm getting the exact same error at the exact same byte on my NVMe-over-TCP setup via a MikroTik ROSE storage server. I am in production (qemu-server 9.1.3).

I assumed it was the MikroTik, so I've lodged a support ticket with them, but maybe it's Proxmox (or an upstream QEMU bug)?

Some details I posted on the MikroTik forum: https://forum.mikrotik.com/t/rose-nvme-over-tcp-to-proxomx-failing-on-4k-lba-drives/267214/5

edit: FYI, I previously had my MikroTik working perfectly fine: https://forum.mikrotik.com/t/rose-d...orld-storage-server-with-basic-stats/265759/5 (then I decided to reformat the NVMe drives from 512-byte to 4K LBA, and that's when I started experiencing issues; it could just be timing and it broke in some other way during that window).

So, to the others running into this: are you running 4K LBA or 512? It might not be related, but it's worth checking.
 
Is there a bug report for this? I'm getting the exact same error at the exact same byte on my NVMe-over-TCP setup on a MikroTik ROSE storage server. I am in production (qemu-server 9.1.3), so I don't know if this bug is related?

I assumed it was the MikroTik, so I've lodged a support ticket with them, but maybe it's Proxmox?

ref https://forum.mikrotik.com/t/rose-nvme-over-tcp-to-proxomx-failing-on-4k-lba-drives/267214/5
https://bugzilla.proxmox.com/show_bug.cgi?id=7197
 
Code:
cat /etc/pve/storage.cfg
zfspool: SSD
        pool SSD
        content images,rootdir
        mountpoint /SSD
        nodes unit0

lvm: vg-rose-raidPM983a
        vgname vg-rose-raidPM983a
        content rootdir,images
        nodes unit1,unit0
        saferemove 0
        shared 1

Full trace attached via `qemu-img --trace '*' convert -p -n -f raw -O raw /dev/zvol/SSD/vm-129-disk-2 /dev/vg-rose-raidPM983a/dummy 2>&1 | tail -n 1000 > /tmp/qemu-img-trace.txt`

Code:
qemu_aio_coroutine_enter ctx 0x6083fa7f0080 from 0x6083fa769d00 to 0x6083fa80a180 opaque 0x7ffc458f47c0
blk_co_pwritev blk 0x6083fa802a50 bs 0x6083fa804e60 offset 2145386496 bytes 2096640 flags 0x6
bdrv_co_pwritev_part bs 0x6083fa804e60 offset 2145386496 bytes 2096640 flags 0x6
qemu_mutex_lock waiting on mutex 0x6083fa8090b8 (../block/io.c:627)
qemu_mutex_locked taken mutex 0x6083fa8090b8 (../block/io.c:627)
qemu_mutex_unlock released mutex 0x6083fa8090b8 (../block/io.c:629)
bdrv_co_pwrite_zeroes bs 0x6083fa80c1c0 offset 2145386496 bytes 2096640 flags 0x4
bdrv_co_pwritev_part bs 0x6083fa80c1c0 offset 2145386496 bytes 2096640 flags 0x6
qemu_mutex_lock waiting on mutex 0x6083fa810418 (../block/io.c:627)
qemu_mutex_locked taken mutex 0x6083fa810418 (../block/io.c:627)
qemu_mutex_unlock released mutex 0x6083fa810418 (../block/io.c:629)
thread_pool_submit_aio pool 0x6083fa7a6ee0 req 0x6083fa802f60 opaque 0x7d38c8f936a0
qemu_mutex_lock waiting on mutex 0x6083fa7a6ef0 (../util/thread-pool.c:261)
qemu_mutex_locked taken mutex 0x6083fa7a6ef0 (../util/thread-pool.c:261)
qemu_mutex_unlock released mutex 0x6083fa7a6ef0 (../util/thread-pool.c:266)
qemu_coroutine_yield from 0x6083fa80a180 to 0x6083fa769d00
qemu_co_mutex_lock_uncontended mutex 0x7ffc458f4960 self 0x6083fa769d00
qemu_co_mutex_unlock_entry mutex 0x7ffc458f4960 self 0x6083fa769d00
blk_co_preadv blk 0x6083fa8021c0 bs 0x6083fa7f59c0 offset 2160066048 bytes 2097152 flags 0x0
bdrv_co_preadv_part bs 0x6083fa7f59c0 offset 2160066048 bytes 2097152 flags 0x0
qemu_mutex_lock waiting on mutex 0x6083fa7f9c18 (../block/io.c:627)
qemu_mutex_locked taken mutex 0x6083fa7f9c18 (../block/io.c:627)
qemu_mutex_unlock released mutex 0x6083fa7f9c18 (../block/io.c:629)
bdrv_co_preadv_part bs 0x6083fa7fcd20 offset 2160066048 bytes 2097152 flags 0x0
qemu_mutex_lock waiting on mutex 0x6083fa800f78 (../block/io.c:627)
qemu_mutex_locked taken mutex 0x6083fa800f78 (../block/io.c:627)
qemu_mutex_unlock released mutex 0x6083fa800f78 (../block/io.c:629)
thread_pool_submit_aio pool 0x6083fa7a6ee0 req 0x6083fa802e80 opaque 0x7d38c9296790
qemu_mutex_lock waiting on mutex 0x6083fa7a6ef0 (../util/thread-pool.c:261)
qemu_mutex_locked taken mutex 0x6083fa7a6ef0 (../util/thread-pool.c:261)
qemu_mutex_unlock released mutex 0x6083fa7a6ef0 (../util/thread-pool.c:266)
qemu_coroutine_yield from 0x6083fa769d00 to 0x7d38cae523c8
qemu_mutex_locked taken mutex 0x6083fa7a6ef0 (../util/thread-pool.c:91)
qemu_mutex_unlock released mutex 0x6083fa7a6ef0 (../util/thread-pool.c:109)
qemu_mutex_locked taken mutex 0x6083fa7a6ef0 (../util/thread-pool.c:91)
qemu_mutex_unlock released mutex 0x6083fa7a6ef0 (../util/thread-pool.c:109)
qemu_mutex_lock waiting on mutex 0x6083fa7a6ef0 (../util/thread-pool.c:119)
qemu_mutex_locked taken mutex 0x6083fa7a6ef0 (../util/thread-pool.c:119)
qemu_mutex_unlock released mutex 0x6083fa7a6ef0 (../util/thread-pool.c:91)
lockcnt_fast_path_attempt lockcnt 0x6083fa7efcdc fast path 0->4
lockcnt_fast_path_success lockcnt 0x6083fa7efcdc fast path 0->4 succeeded
lockcnt_fast_path_attempt lockcnt 0x6083fa7f012c fast path 0->4
lockcnt_fast_path_success lockcnt 0x6083fa7f012c fast path 0->4 succeeded
thread_pool_complete_aio pool 0x6083fa7a6ee0 req 0x6083fa802e80 opaque 0x7d38c9296730 ret 0
qemu_aio_coroutine_enter ctx 0x6083fa7f0080 from 0x7d38cae523c8 to 0x6083fa769d00 opaque 0x7ffc458f47c0
qemu_mutex_lock waiting on mutex 0x6083fa800f78 (../block/io.c:591)
qemu_mutex_locked taken mutex 0x6083fa800f78 (../block/io.c:591)
qemu_mutex_unlock released mutex 0x6083fa800f78 (../block/io.c:593)
qemu_mutex_lock waiting on mutex 0x6083fa7f9c18 (../block/io.c:591)
qemu_mutex_locked taken mutex 0x6083fa7f9c18 (../block/io.c:591)
qemu_mutex_unlock released mutex 0x6083fa7f9c18 (../block/io.c:593)
qemu_coroutine_yield from 0x6083fa769d00 to 0x7d38cae523c8
qemu_mutex_lock waiting on mutex 0x6083fa7a6ef0 (../util/thread-pool.c:119)
qemu_mutex_locked taken mutex 0x6083fa7a6ef0 (../util/thread-pool.c:119)
qemu_mutex_unlock released mutex 0x6083fa7a6ef0 (../util/thread-pool.c:91)
lockcnt_fast_path_attempt lockcnt 0x6083fa7efcdc fast path 0->4
lockcnt_fast_path_success lockcnt 0x6083fa7efcdc fast path 0->4 succeeded
lockcnt_fast_path_attempt lockcnt 0x6083fa7f012c fast path 0->4
lockcnt_fast_path_success lockcnt 0x6083fa7f012c fast path 0->4 succeeded
thread_pool_complete_aio pool 0x6083fa7a6ee0 req 0x6083fa802f60 opaque 0x7d38c8f93650 ret 0
qemu_aio_coroutine_enter ctx 0x6083fa7f0080 from 0x7d38cae523c8 to 0x6083fa80a180 opaque 0x7ffc458f47c0
thread_pool_submit_aio pool 0x6083fa7a6ee0 req 0x6083fa802e80 opaque 0x7d38c8f936a0
qemu_mutex_lock waiting on mutex 0x6083fa7a6ef0 (../util/thread-pool.c:261)
qemu_mutex_locked taken mutex 0x6083fa7a6ef0 (../util/thread-pool.c:261)
qemu_mutex_unlock released mutex 0x6083fa7a6ef0 (../util/thread-pool.c:266)
qemu_coroutine_yield from 0x6083fa80a180 to 0x7d38cae523c8
qemu_mutex_locked taken mutex 0x6083fa7a6ef0 (../util/thread-pool.c:91)
qemu_mutex_unlock released mutex 0x6083fa7a6ef0 (../util/thread-pool.c:109)
qemu_mutex_lock waiting on mutex 0x6083fa7a6ef0 (../util/thread-pool.c:119)
qemu_mutex_locked taken mutex 0x6083fa7a6ef0 (../util/thread-pool.c:119)
qemu_mutex_unlock released mutex 0x6083fa7a6ef0 (../util/thread-pool.c:91)
lockcnt_fast_path_attempt lockcnt 0x6083fa7efcdc fast path 0->4
lockcnt_fast_path_success lockcnt 0x6083fa7efcdc fast path 0->4 succeeded
lockcnt_fast_path_attempt lockcnt 0x6083fa7f012c fast path 0->4
lockcnt_fast_path_success lockcnt 0x6083fa7f012c fast path 0->4 succeeded
thread_pool_complete_aio pool 0x6083fa7a6ee0 req 0x6083fa802e80 opaque 0x7d38c8f93650 ret -22
qemu_aio_coroutine_enter ctx 0x6083fa7f0080 from 0x7d38cae523c8 to 0x6083fa80a180 opaque 0x7ffc458f47c0
qemu_vfree ptr (nil)
qemu_mutex_lock waiting on mutex 0x6083fa810418 (../block/io.c:591)
qemu_mutex_locked taken mutex 0x6083fa810418 (../block/io.c:591)
qemu_mutex_unlock released mutex 0x6083fa810418 (../block/io.c:593)
qemu_vfree ptr (nil)
qemu_mutex_lock waiting on mutex 0x6083fa8090b8 (../block/io.c:591)
qemu_mutex_locked taken mutex 0x6083fa8090b8 (../block/io.c:591)
qemu_mutex_unlock released mutex 0x6083fa8090b8 (../block/io.c:593)
qemu-img: error while writing at byte 2145386496: Invalid argument
qemu_aio_coroutine_enter ctx 0x6083fa7f0080 from 0x6083fa80a180 to 0x6083fa78f770 opaque 0x7ffc458f47c0
qemu_aio_coroutine_enter ctx 0x6083fa7f0080 from 0x6083fa78f770 to 0x6083fa7e8110 opaque 0x7ffc458f47c0
qemu_aio_coroutine_enter ctx 0x6083fa7f0080 from 0x6083fa7e8110 to 0x6083fa7e5640 opaque 0x7ffc458f47c0
qemu_aio_coroutine_enter ctx 0x6083fa7f0080 from 0x6083fa7e5640 to 0x6083fa7e0c30 opaque 0x7ffc458f47c0
qemu_aio_coroutine_enter ctx 0x6083fa7f0080 from 0x6083fa7e0c30 to 0x6083fa796360 opaque 0x7ffc458f47c0
qemu_aio_coroutine_enter ctx 0x6083fa7f0080 from 0x6083fa796360 to 0x6083fa75fa30 opaque 0x7ffc458f47c0
qemu_aio_coroutine_enter ctx 0x6083fa7f0080 from 0x6083fa75fa30 to 0x6083fa769d00 opaque 0x7ffc458f47c0

cc @fiona
 

Code:
root@unit0:~# lvdisplay /dev/vg-rose-raidPM983a/dummy
  --- Logical volume ---
  LV Path                /dev/vg-rose-raidPM983a/dummy
  LV Name                dummy
  VG Name                vg-rose-raidPM983a
  LV UUID                m0sk0O-uped-yeML-cuR5-D0Nk-n06G-U31uLV
  LV Write Access        read/write
  LV Creation host, time unit0, 2025-12-31 12:31:53 +1100
  LV Status              available
  # open                 0
  LV Size                50.00 GiB
  Current LE             12800
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     28672
  Block device           252:3

blockdev --getsize64 /dev/zvol/SSD/vm-129-disk-2
53687091200
blockdev --getsize64 /dev/vg-rose-raidPM983a/dummy
53687091200
 
Code:
# sector size
blockdev --getss /dev/zvol/SSD/vm-129-disk-2
512
blockdev --getss /dev/vg-rose-raidPM983a/dummy
4096

# physical block size
blockdev --getpbsz /dev/zvol/SSD/vm-129-disk-2
16384
blockdev --getpbsz /dev/vg-rose-raidPM983a/dummy
4096

This is just for debugging, but it looks to confirm the previous details.

I might be able to run `lvcreate -L 50G -n dummy vg-rose-raidPM983a --setphysicalblocksize 512` and see if that solves the issue (but I'd rather keep 4K blocks end to end).
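It might also be worth checking what the device-mapper device behind the LV actually advertises for zeroing/discard (dm-3, matching the `Block device 252:3` line in the lvdisplay output above); a quick sketch, assuming the usual sysfs layout:
Code:
cat /sys/block/dm-3/queue/logical_block_size
cat /sys/block/dm-3/queue/physical_block_size
cat /sys/block/dm-3/queue/write_zeroes_max_bytes
cat /sys/block/dm-3/queue/discard_granularity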

A block size of 16k for ZFS is weird to see, but I believe that's correct.
Proxmox's default volblocksize for ZFS pools meant to store VM disks is 16k.
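That can be checked on the zvol directly (dataset name taken from the convert command above):
Code:
zfs get volblocksize SSD/vm-129-disk-2
# for newly created volumes, this is controlled by the 'blocksize' option of the
# zfspool storage in /etc/pve/storage.cfg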

This is looking more and more like the issue might be on the QEMU and Proxmox side, but it's still very annoying: migrating from 512-byte to 4K LBA disks really shouldn't be a major problem.

Must be a regression?

Plus, I think I've caught at least a small bug on the MikroTik ROSE side, but it likely wasn't the root cause, so I'll let MikroTik explore that.
 
Regarding the qemu-img convert issue: in all three reports, the problematic offset is 2145386496, which is exactly 2 MiB below 2 GiB.
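For reference, the arithmetic:
Code:
echo $((2 * 1024 * 1024 * 1024 - 2145386496))   # 2097152 bytes = 2 MiB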
 
@drmartins @vmwaretoprox @Shlee while having the problematic version installed, could you also capture a trace with strace of the failing qemu-img command? I.e.:
Code:
apt install strace
# allocate the dummy target LV and run the lvchange commands to activate it, as before
strace qemu-img convert -n -f raw -O raw /path/to/source/image /path/to/target/image  2>&1 | tail -n 1000 > /tmp/qemu-img-strace.txt

Looking at the changes between QEMU 10.0 and 10.1, this one might be a candidate:
file-posix: allow BLKZEROOUT with -t writeback

I.e. the default -t writeback did not change, but the zeroout operation was previously not used. And maybe that operation causes issues with some storages. The strace output might help to confirm this.
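Once captured, filtering the strace output for the zeroing-related calls should be enough to confirm or rule this out, e.g.:
Code:
grep -E 'fallocate|BLKZEROOUT' /tmp/qemu-img-strace.txt | tail -n 20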
 
Update: https://gitlab.com/qemu-project/qemu/-/issues/3257

Code:
1767451263.183051 fallocate(8</dev/dm-3<block 252:3>>, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2105540608, 2097152) = 0 <0.004543>
1767451263.187637 write(7<{eventfd-count=0, eventfd-id=803, eventfd-semaphore=0}>, "\1\0\0\0\0\0\0\0", 8) = 8 <0.000020>
1767451263.204195 fallocate(8</dev/dm-3<block 252:3>>, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2116026368, 2097152) = 0 <0.003597>
1767451263.207834 write(7<{eventfd-count=0, eventfd-id=803, eventfd-semaphore=0}>, "\1\0\0\0\0\0\0\0", 8) = 8 <0.000054>
1767451263.225862 fallocate(8</dev/dm-3<block 252:3>>, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2126512128, 2097152) = 0 <0.002620>
1767451263.228756 write(7<{eventfd-count=0, eventfd-id=803, eventfd-semaphore=0}>, "\1\0\0\0\0\0\0\0", 8) = 8 <0.000027>
1767451263.245782 fallocate(8</dev/dm-3<block 252:3>>, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2136997888, 2097152) = 0 <0.003657>
1767451263.249490 write(7<{eventfd-count=0, eventfd-id=803, eventfd-semaphore=0}>, "\1\0\0\0\0\0\0\0", 8) = 8 <0.000031>
1767451263.264920 fallocate(8</dev/dm-3<block 252:3>>, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2147479552, 3584) = -1 EINVAL (Invalid argument) <0.000015>
1767451263.264986 ioctl(8</dev/dm-3<block 252:3>>, BLKZEROOUT, [2147479552, 3584]) = -1 EINVAL (Invalid argument) <0.000014>
1767451263.265051 write(7<{eventfd-count=0, eventfd-id=803, eventfd-semaphore=0}>, "\1\0\0\0\0\0\0\0", 8) = 8 <0.000011>
1767451263.570552 +++ exited with 1 +++
1. Feels like this is the root cause...
2. I haven't confirmed BLKZEROOUT support on my MikroTik ROSE, but I'd imagine it's supported.
3. It looks like the write is happening after the failure? So it's not checking for a successful fallocate before writing? Sounds like a second problem?

Warning: the analysis below is LLM output, edited by myself.

Code:
Successful FALLOC_FL_PUNCH_HOLE calls
The offsets (2105540608, 2116026368, 2126512128, 2136997888) and the length (2097152) are all 4K-aligned.

Failing FALLOC_FL_PUNCH_HOLE and BLKZEROOUT call
The offset 2147479552 is 4K-aligned.
The length 3584 is not a multiple of 4K (it is 7 × 512, so it is only 512-byte aligned).

qemu-img appears to try hole punching first, then falls back to BLKZEROOUT. On your device, both fail due to the unaligned length.
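
# A quick sanity check of the alignment of the values from the trace
# (plain shell arithmetic, added for illustration):
echo $((2147479552 % 4096))   # 0    -> offset is 4K-aligned
echo $((3584 % 4096))         # 3584 -> length is NOT 4K-aligned
echo $((3584 % 512))          # 0    -> but it is 512-byte aligned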
 
