[SOLVED] Unable to create a Linux VM with a VirtIO drive since update

JOduMonT

Since I did this upgrade (https://forum.proxmox.com/threads/pci-passthrough-fail-when-booting-with-5-11-22-7-pve.99551), I'm unable to create Linux VMs (at least; I didn't try Windows) such as Ubuntu, Univention, ParrotOS, Kali, ... with a VirtIO drive, but it works well with SCSI.

It systematically fails at GRUB installation.

My config:
- Linux pve 5.11.22-5-pve #1 SMP PVE 5.11.22-10 (Tue, 28 Sep 2021 08:15:41 +0200) x86_64 GNU/Linux
- pve-qemu-kvm/stable,now 6.1.0-1
 
I'm unable to create Linux VMs (at least; I didn't try Windows) such as Ubuntu, Univention, ParrotOS, Kali, ... with a VirtIO drive, but it works well with SCSI.
I mean, SCSI with VirtIO SCSI as the controller is often better anyway.

It systematically fails at GRUB installation.
What's the exact error, and which guest OS versions did you try?
 
The SCSI controller is VirtIO SCSI; I don't change that parameter. When I'm talking about VirtIO vs. SCSI, it's at the Bus/Device level.
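To make the distinction concrete, here is a minimal sketch of the two variants in a VM config (the VM ID, storage name, and size are made up):

Code:
# /etc/pve/qemu-server/100.conf (hypothetical excerpt)
# The SCSI *controller* model -- this stays VirtIO SCSI either way:
scsihw: virtio-scsi-pci
# Disk attached with VirtIO Block as the Bus/Device (what fails for me):
virtio0: local-zfs:vm-100-disk-0,discard=on,size=32G
# ...versus the same disk attached as SCSI Bus/Device (what works):
#scsi0: local-zfs:vm-100-disk-0,discard=on,size=32G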

What's the exact error, and which guest OS versions did you try?
The day before the update, I created a VM with these parameters for the hard drive, which I have used forever. There might be something wrong with my settings, but I have used them forever (I have been using Linux since 2008 and Proxmox since the early 5.x versions).

Maybe it's not even related to the update, but the update is the only major change in my setup.
I re-downloaded the ubuntu-20.04.3-live-server-amd64.iso ISO and had the same issue; I also tried pointing to a different repo, with the same result. I also tried with the cache at its default and with Discard disabled, without any success. But if I choose SCSI as the Bus/Device, it works well.

[screenshot: VM hard disk settings]

Here's part of the crash log:

Code:
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu27.18
Architecture: amd64
CasperMD5CheckResult: pass
CasperVersion: 1.445.1
CrashDB: {'impl': 'launchpad', 'project': 'subiquity'}
...
 [   90.027587] Buffer I/O error on device vda2, logical block 6328700
 [   90.027588] Buffer I/O error on device vda2, logical block 6328701
 [   90.027588] Buffer I/O error on device vda2, logical block 6328702
 [   90.027589] Buffer I/O error on device vda2, logical block 6328703
 [   90.027589] Buffer I/O error on device vda2, logical block 6328704
 [   90.027590] Buffer I/O error on device vda2, logical block 6328705
 [   90.027590] Buffer I/O error on device vda2, logical block 6328706
 [   90.027591] Buffer I/O error on device vda2, logical block 6328707
 [   90.027591] Buffer I/O error on device vda2, logical block 6328708
 [   90.027592] Buffer I/O error on device vda2, logical block 6328709
...
 [  100.051446] buffer_io_error: 3281 callbacks suppressed
 [  100.051447] Buffer I/O error on device vda2, logical block 71680
 [  100.051450] Buffer I/O error on device vda2, logical block 71681
 [  100.051450] Buffer I/O error on device vda2, logical block 71682
 [  100.051451] Buffer I/O error on device vda2, logical block 71683
 [  100.051451] Buffer I/O error on device vda2, logical block 71684
 [  100.051452] Buffer I/O error on device vda2, logical block 71685
 [  100.051452] Buffer I/O error on device vda2, logical block 71686
 [  100.051453] Buffer I/O error on device vda2, logical block 71687
 [  100.051453] Buffer I/O error on device vda2, logical block 71688
 [  100.051454] Buffer I/O error on device vda2, logical block 71689
...
 [  112.542475] buffer_io_error: 3574 callbacks suppressed
 [  112.542476] Buffer I/O error on device vda2, logical block 95746
 [  112.542478] Buffer I/O error on device vda2, logical block 95747
 [  112.542479] Buffer I/O error on device vda2, logical block 95748
 [  112.542480] Buffer I/O error on device vda2, logical block 95749
 [  112.542480] Buffer I/O error on device vda2, logical block 95750
 [  112.542481] Buffer I/O error on device vda2, logical block 95751
 [  112.542481] Buffer I/O error on device vda2, logical block 95752
 [  112.542482] Buffer I/O error on device vda2, logical block 95753
 [  112.542483] Buffer I/O error on device vda2, logical block 95754
 [  112.542490] Buffer I/O error on device vda2, logical block 95755
...
 [  124.734771] JBD2: Detected IO errors while flushing file data on vda2-8
CurtinConfig:
 # Autogenerated by Subiquity: 2021-11-16 02:22:39.372259 UTC
 apt:
   preserve_sources_list: false
   primary:
   - arches: [amd64, i386]
     uri: http://th.archive.ubuntu.com/ubuntu
   - arches: [default]
     uri: http://ports.ubuntu.com/ubuntu-ports
 curthooks_commands:
   001-configure-apt: [/snap/subiquity/2651/bin/subiquity-configure-apt, /snap/subiquity/2651/usr/bin/python3,
     'true']
 debconf_selections: {subiquity: ''}
 grub: {probe_additional_os: true, terminal: unmodified}
 install: {save_install_config: /var/log/installer/curtin-install-cfg.yaml, save_install_log: /var/log/installer/curtin-install.log,
   target: /target, unmount: disabled}
 kernel: {package: linux-generic}
 pollinate:
   user_agent: {subiquity: 21.08.2_2651}
 reporting:
   subiquity: {identifier: curtin_event.1862, type: journald}
 sources: {ubuntu00: 'cp:///media/filesystem'}
 stages: [early, partitioning, extract, curthooks, hook, late]
 storage:
   config:
   - {ptable: gpt, path: /dev/vda, wipe: superblock-recursive, preserve: false, name: '',
     grub_device: true, type: disk, id: disk-vda}
   - {device: disk-vda, size: 1048576, flag: bios_grub, number: 1, preserve: false,
     grub_device: false, type: partition, id: partition-0}
   - {device: disk-vda, size: 34356592640, wipe: superblock, flag: '', number: 2, preserve: false,
     grub_device: false, type: partition, id: partition-1}
   - {fstype: ext4, volume: partition-1, preserve: false, type: format, id: format-0}
   - {path: /, device: format-0, type: mount, id: mount-0}
   version: 1
 verbosity: 3
 write_files:
   etc_default_keyboard: {content: '# KEYBOARD CONFIGURATION FILE
 
 
       # Consult the keyboard(5) manual page.
 
 
       XKBMODEL="pc105"
 
       XKBLAYOUT="us"
 
       XKBVARIANT=""
 
       XKBOPTIONS=""
 
 
       BACKSPACE="guess"
 
       ', path: etc/default/keyboard, permissions: 420}
   etc_machine_id: {content: '81353fe4a9d14f0b8c12ba202ea7daff
 
       ', path: etc/machine-id, permissions: 292}
   etc_netplan_installer: {content: "# This is the network config written by 'subiquity'\n\
       network:\n  ethernets:\n    ens18:\n      dhcp4: true\n  version: 2\n", path: etc/netplan/00-installer-config.yaml}
   md5check: {content: "{\n  \"checksum_missmatch\": [\n],\n  \"result\": \"pass\"\n\
       }\n", path: var/log/installer/casper-md5check.json, permissions: 420}
   media_info: {content: Ubuntu-Server 20.04.3 LTS "Focal Fossa" - Release amd64 (20210824),
     path: var/log/installer/media-info, permissions: 420}
   nonet: {content: 'network: {config: disabled}
 
       ', path: etc/cloud/cloud.cfg.d/subiquity-disable-cloudinit-networking.cfg}
CurtinErrors:
 curtin-logs-2021-11-16-02-23/
 
The SCSI controller is VirtIO SCSI; I don't change that parameter. When I'm talking about VirtIO vs. SCSI, it's at the Bus/Device level.
Sure, but you're using VirtIO Block for the disks, so the SCSI controller does not matter at all currently. What I meant was trying to use (VirtIO) SCSI, but from the error log you posted, that does not seem to matter...

Buffer I/O error on device vda2, logical block 95746
This seems more like a dying disk (or some other part of the I/O stack)...
Check the Proxmox VE host syslog for I/O errors and also check the disks (e.g., their S.M.A.R.T. data); an (in-progress/partial) HW failure seems much more likely to me, given the logs and the fact that this config worked just fine until recently.
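For example (a sketch only; device names and pool layout will differ on your system):

Code:
# Kernel log on the PVE host, filtered for I/O trouble:
journalctl -k | grep -iE 'i/o error|nvme|ata[0-9]'
# S.M.A.R.T. health of the physical disks (needs smartmontools):
smartctl -a /dev/nvme0n1
smartctl -a /dev/sda
# If the VM disks live on ZFS, also check pool health and error counters:
zpool status -v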
 
I wrote a topic about this specific behavior yesterday, but it is stuck in verification, so nobody has looked at it yet.

But this specific behavior is, at least on my side, proven with Windows machines as well.
If I install Windows 10 with full VirtIO, my installation gets stuck/freezes while installing updates, while if I'm using IDE/SATA, everything is fine.

Unfortunately, there is no specific logging available that shows any particular reason for this. My guess is that something broke in the qemu/kvm package, because this behavior has only existed since I installed the latest Proxmox updates, version 7.0-14+1.

Okay, I guess I figured it out. VirtIO Block is dead; VirtIO SCSI is working like a charm. I will run a 24h test, but it seems like VirtIO Block under Windows is finally broken.
 
I wrote a topic about this specific behavior yesterday, but it is stuck in verification, so nobody has looked at it yet.

But this specific behavior is, at least on my side, proven with Windows machines as well.
If I install Windows 10 with full VirtIO, my installation gets stuck/freezes while installing updates, while if I'm using IDE/SATA, everything is fine.

Unfortunately, there is no specific logging available that shows any particular reason for this. My guess is that something broke in the qemu/kvm package, because this behavior has only existed since I installed the latest Proxmox updates, version 7.0-14+1.

Okay, I guess I figured it out. VirtIO Block is dead; VirtIO SCSI is working like a charm. I will run a 24h test, but it seems like VirtIO Block under Windows is finally broken.
Note that the author here specifically stated that they're only having issues with Linux-based OSes and did not try Windows at all:
I'm unable to create Linux VMs (at least; I didn't try Windows) such as Ubuntu, Univention, ParrotOS, Kali, ...

So I'm not too sure how related this thread is; did you also see any I/O errors on the disks like the author here does?
I would still rather blame failing HW than any update. I mean, there's naturally a (small) chance that I'm wrong and it's really something that an update broke. But that would then only affect very specific setups, as installing a Fedora 35 and a Windows 11 worked just fine on VirtIO Block here (just re-tested)...
 
I can reproduce it multiple times: Windows with VirtIO Block does not work properly and ends up frozen. If I use VirtIO SCSI, everything is absolutely fine. The Windows machines freeze, but there is no specific error noticeable; I just see that Proxmox tries to reach the VM through QEMU, but that fails because of the freeze. Since I moved to VirtIO SCSI, everything is fine again and even more performant. Linux VMs started to fail when I tried to use PBS; then the disk died in Linux very quickly and a restart was required.
 
I would still rather blame failing HW than any update. I mean, there's naturally a (small) chance that I'm wrong and it's really something that an update broke. But that would then only affect very specific setups, as installing a Fedora 35 and a Windows 11 worked just fine on VirtIO Block here (just re-tested)...
Re-testing: before opening the issue, I tried 2-5 times per day for at least a week, including re-downloading ISOs of different OSes, though of course always Debian/Ubuntu.

Today my UCS, which was installed before the update with VirtIO as the Bus/Device, started to have I/O corruption.

But on the bright side, my Windows VM, which was also installed before the update with the same Bus/Device on the same NVMe storage, works perfectly well.

Which gives me an idea: I'll try to reinstall my Ubuntu/Debian (whatever Linux) with the machine chipset q35 instead of i440fx.
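Something like this should do it, if I'm not mistaken (VM ID 100 is just an example):

Code:
# Switch the machine type from the default i440fx to q35:
qm set 100 --machine q35
# Verify the change:
qm config 100 | grep machine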

@t.lamprecht I initially believed it was an issue on the Ubuntu side and opened a bug report there first, but nothing concrete has come of it.
 
@m4k5ym I know how frustrating it can be, but if you need professional support, Proxmox has a very affordable option for that.

For your Windows VM, did you follow the best-practice guide? It really gives great insight, and yes, if you keep the default configuration of your VM, Windows will freeze during the installation.
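Roughly, the disk part of such a setup could look like this (a sketch only; VM ID, storage, and ISO path are placeholders, and the virtio-win driver ISO must be attached so the Windows installer can see the disk at all):

Code:
# Disk on (VirtIO) SCSI, as the best-practice guide recommends:
qm set 100 --scsihw virtio-scsi-pci
qm set 100 --scsi0 local-zfs:vm-100-disk-0,discard=on
# Attach the virtio-win driver ISO for the Windows installer:
qm set 100 --ide3 local:iso/virtio-win.iso,media=cdrom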
 
Yeah, as said, check your drives/storage -> they're likely failing...

I also just tried to install a Debian 11 on the latest Proxmox VE with VirtIO Block as the disk driver: no issues whatsoever.
 
I have this problem too. I am regularly rebuilding VMs with Packer/Terraform. For the last two days I (sometimes) get the same errors as above when using virtio as the disk type and virtio-scsi-pci as the controller. Everything works fine with scsi as the disk type.
No errors on other VMs (yet) or on the host (NVMe SSD with a ZFS pool as the underlying storage).
Sometimes the live installer switches to a read-only filesystem; sometimes the error appears on the first boot after installation.
I have currently only tried Ubuntu 20.04 as the guest.

Linux pve 5.13.19-1-pve #1 SMP PVE 5.13.19-2 (Tue, 09 Nov 2021 12:59:38 +0100) x86_64 GNU/Linux
pve-qemu-kvm 6.1.0-1
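To double-check which bus a rebuilt VM actually ended up with, I look at the config on the host (the VM ID is an example):

Code:
# Show the controller and disk lines of the VM config:
qm config 123 | grep -E '^(scsihw|virtio|scsi|ide|sata)'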
 
Everything works fine with scsi as the disk type.
For anybody where that is the case, I'd suggest switching to SCSI with the SCSI controller set to VirtIO SCSI, as that has more features than VirtIO Block anyway.
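For an existing VM, the switch could look roughly like this (a sketch; VM ID, storage, and volume name are placeholders, and you should double-check the boot order afterwards):

Code:
# Detach the disk from the VirtIO Block bus (it shows up as an "unused" disk):
qm set 100 --delete virtio0
# Re-attach the same volume on the SCSI bus with the VirtIO SCSI controller:
qm set 100 --scsihw virtio-scsi-pci
qm set 100 --scsi0 local-zfs:vm-100-disk-0,discard=on
# Make sure the VM still boots from the re-attached disk:
qm set 100 --boot order=scsi0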

I have currently only tried Ubuntu 20.04 as the guest.
OK, I can test that too. If it causes problems, I'd suspect older guest kernels having an issue with the newer host QEMU and/or kernel.
 
We were hit by something very similar in the last couple of days. We are also using Debian (11.1 Bullseye) as the PVE host and guest OS, and starting on Nov 10th, one of our CI builders started to behave similarly to what is described here.
We are using Docker-based build recipes there, and whenever we did ANY Docker activity, we saw identical ext4 FS errors in dmesg on the guest:


Code:
[Thu Nov 11 17:55:29 2021] EXT4-fs warning (device vda1): ext4_end_bio:349: I/O error 10 writing to inode 3540141 starting block 12492800)
[Thu Nov 11 17:55:29 2021] EXT4-fs warning (device vda1): ext4_end_bio:349: I/O error 10 writing to inode 3540141 starting block 12491520)
[Thu Nov 11 17:55:29 2021] EXT4-fs warning (device vda1): ext4_end_bio:349: I/O error 10 writing to inode 3540141 starting block 12493056)
[Thu Nov 11 17:55:29 2021] buffer_io_error: 9207 callbacks suppressed
[Thu Nov 11 17:55:29 2021] Buffer I/O error on device vda1, logical block 12492544
[Thu Nov 11 17:55:29 2021] Buffer I/O error on device vda1, logical block 12492545
[Thu Nov 11 17:55:29 2021] Buffer I/O error on device vda1, logical block 12492546
[Thu Nov 11 17:55:29 2021] Buffer I/O error on device vda1, logical block 12492547
[Thu Nov 11 17:55:29 2021] EXT4-fs warning (device vda1): ext4_end_bio:349: I/O error 10 writing to inode 3540141 starting block 12501760)
[Thu Nov 11 17:55:29 2021] Buffer I/O error on device vda1, logical block 12492548
[Thu Nov 11 17:55:29 2021] EXT4-fs warning (device vda1): ext4_end_bio:349: I/O error 10 writing to inode 3540141 starting block 12502016)
[Thu Nov 11 17:55:29 2021] Buffer I/O error on device vda1, logical block 12492549
[Thu Nov 11 17:55:29 2021] EXT4-fs warning (device vda1): ext4_end_bio:349: I/O error 10 writing to inode 3540141 starting block 12502272)
[Thu Nov 11 17:55:29 2021] Buffer I/O error on device vda1, logical block 12492550
[Thu Nov 11 17:55:29 2021] Buffer I/O error on device vda1, logical block 12492551
[Thu Nov 11 17:55:29 2021] Buffer I/O error on device vda1, logical block 12492552
[Thu Nov 11 17:55:29 2021] Buffer I/O error on device vda1, logical block 12492553

At the same time, nothing of importance was to be seen in the host's logs.
Sometimes this issue seemed "bad enough" for the guest kernel to trigger the panic action and mount the FS read-only, sometimes not. Sometimes fsck found a corruption on the FS, sometimes not... If this were a bare-metal node, I would have tried exchanging the data cable of the hard drive...
Just pulling a new Docker image with `docker pull nginx` was already enough to trigger this, while we could more or less do whatever we wanted without triggering it, apart from Docker.
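For reference, this is roughly how we watched it happen inside the guest (the grep pattern is just an example):

Code:
# Terminal 1: follow the guest kernel log for I/O errors:
dmesg -wT | grep -iE 'buffer i/o error|ext4-fs'
# Terminal 2: pulling any image was enough to trigger it for us:
docker pull nginx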

We have now switched to VirtIO SCSI for the bus and are observing the behaviour; no problems so far.
 
Hmm, OK. With the number of reports, I think I retract my initial suspicion about HW going bad as bogus and would rather think it's a regression in the VirtIO Block subsystem of the new QEMU 6.1, which has been available on pve-no-subscription since 2021-11-09.
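Until that is tracked down, downgrading pve-qemu-kvm to the previous 6.0 build would be one possible workaround (a sketch; take the exact version string from your own apt output, the one below is only an example):

Code:
# List the versions apt knows about:
apt list -a pve-qemu-kvm
# Downgrade to a pre-6.1 build (example version string, check yours):
apt install pve-qemu-kvm=6.0.0-4
# Note: already-running VMs keep their old QEMU binary until stopped/started.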
 
@t.lamprecht FYI, I'm an update addict, and while it might be a different subject, I'm still stuck at kernel 5.11.22-5-pve (now two kernels behind), because with a newer kernel my Windows VM doesn't boot; it seems to be related to my VGA PCI passthrough.

More importantly, I saw new updates for kvm, pve-server, and libvirt storage, so I'm updating, rebooting, and trying again.
 
With the number of reports, I think I retract my initial suspicion about HW going bad as bogus
Sadly, I feel a little insulted by that, since, on my side at least, I tried many OSes on two different storages:
1st: one is a ZFS RAID on NVMe, the other is a RAID on SSDs; I have used both for over 3 years now and don't see any errors (oh yes, I had some before, but that was with my ZFS RAID on SATA disks, so not these ones).
2nd: it also works well, and I changed only one parameter.

Also, I would like to stress again that when I change VirtIO to SCSI, it is at the hard drive Bus/Device level, not at the SCSI Controller level.
[screenshot: hard disk Bus/Device setting]

In any case, thanks for your support :)
 
