Linux Cloud Images Freeze with Default SCSI Controller

Jan 6, 2022
I am creating new VMs using the command line interface. They default to using the LSI 53C895A SCSI Controller.

Common Linux cloud images boot, but don't make it all the way to a login screen. I've tried CentOS and Ubuntu.

I am not posting to ask for help, because I discovered that changing to the VirtIO SCSI Controller resolves the problem.
  • I am posting so that the Proxmox Team and Community are aware of the issue, because it took me a long time to troubleshoot the root cause.
  • I am also hoping that something can be done, so that using all defaults will work, instead of causing mysterious VM freezes.

Steps to reproduce (assuming the Proxmox node name is pve3 and the target storage is local):
Bash:
#download the cloud image of your choice (run only one of the two curl commands; both write to the same file)
curl -o ~/cloud.img https://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud.qcow2
curl -o ~/cloud.img https://cloud-images.ubuntu.com/releases/focal/release/ubuntu-20.04-server-cloudimg-amd64.img

#create VM and attach cloud disk image
pvesh create /nodes/pve3/qemu --vmid 2929
qm importdisk 2929 ~/cloud.img local
qm set 2929 --scsi0 local:2929/vm-2929-disk-0.raw
qm set 2929 --boot order=scsi0
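
For reference, the workaround mentioned above (switching to the VirtIO SCSI Controller) can be applied to the same VM before its first boot. This is only a minimal sketch: `scsihw` is the `qm` option for the SCSI controller type, while the helper name `set_virtio_scsi` and the `QM` override variable are illustrative and not part of Proxmox.

```shell
# Hypothetical helper: switch a VM's SCSI controller to VirtIO SCSI.
# QM can be overridden (for example QM=echo) to preview the command
# instead of running it against a real node.
set_virtio_scsi() {
    vmid="$1"
    "${QM:-qm}" set "$vmid" --scsihw virtio-scsi-pci
}

# Usage on a Proxmox node, with VM 2929 from the steps above:
#   set_virtio_scsi 2929
#   qm start 2929
```

Running it with `QM=echo` first shows the exact `qm set` call without changing anything.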

VMs will boot, but freeze during the startup process:

Bash:
#ubuntu stopping point
Btrfs loaded, crc32c=crc32c-generic
random: fast init done #...after about 5 minutes
random: crng init done #..after about 10 minutes

#centos stopping point
ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xe160 irq14
ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xe168 irq15
tsc: Refined TSC clocksource calibration: 3191.957 MHz

I saw another post related to VM IO errors and the LSI 53C895A controller on the forum.
  • However, the solution proposed doesn't seem to work for me.
  • I have the version of pve-qemu-kvm that was recommended to fix the issue.
Bash:
#using Proxmox 7.1-8 with pve-no-subscription repo
dpkg -s pve-qemu-kvm | grep Version
Version: 6.1.0-3
uname -rv
5.13.19-2-pve #1 SMP PVE 5.13.19-4 (Mon, 29 Nov 2021 12:10:09 +0100)

Does anyone have any thoughts on how to avoid such a difficult-to-identify bug for (what seems to me) a common use case when using all defaults?
 
I have exactly the same experience but on a slightly different setup.

I'm using the Modules Garden's WHMCS module, but I'm not sure it's the module's fault.

Here is my setup:

- HV1 has its own Cloud-Init for Ubuntu
- HV7 has its own Cloud-Init for Ubuntu

HV1 + HV7 are part of a cluster. There is no shared storage.

On HV1, Cloud-Init worked properly before it was part of a cluster and deployed perfectly every time. Since HV7 joined the cluster, I haven't used HV1 anymore.

Now on HV7 I have the same problem as you describe.

Below is a screenshot of the image before cloning. In my opinion the SCSI controller is correct and should stay at VirtIO SCSI.

1655014991443.png

However, after cloning, the machine starts with its SCSI controller set back to the default, namely LSI 53C895A, and it won't finish booting.
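
One way to confirm which controller a clone ended up with is to read the `scsihw` line from its config file, which Proxmox keeps under `/etc/pve/qemu-server/<vmid>.conf`. The sketch below is a rough check rather than an official tool: the function name `scsihw_of` is made up, and it assumes that a missing `scsihw` line means the `lsi` default (the LSI 53C895A behaviour described in this thread).

```shell
# Hypothetical helper: print the SCSI controller type from a VM config file.
# Only the main section is read; parsing stops at the first "[snapshot]" header.
# Prints "lsi" when no scsihw line is present (assumed default).
scsihw_of() {
    conf="$1"
    hw=$(awk '/^\[/ { exit }
              /^scsihw:/ { sub(/^scsihw:[ \t]*/, ""); print; exit }' "$conf")
    echo "${hw:-lsi}"
}

# Usage on a Proxmox node (VMID 101 is a placeholder for the clone):
#   scsihw_of /etc/pve/qemu-server/101.conf
```

If the template prints `virtio-scsi-pci` and the clone prints `lsi`, the setting was lost somewhere in the cloning step rather than inside the guest.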

> I saw another post related to VM IO errors and the LSI 53C895A controller on the forum...

I'm not so sure if this is a technical issue with errors. I'd rather say it's a configuration issue with Cloud-Init.

Although I can manually change it so that the VM boots, we can no longer commission VMs automatically using the Modules Garden WHMCS module. What's also interesting is that when we use the module to upgrade a VM, the controller again reverts from VirtIO to the default LSI 53C895A.

I have approached Modules Garden who kindly offered to help but I just still need to make time and get a test setup going for them. I'll keep you posted, but please do let me know if you have solved this in the meanwhile.

> I am not posting to ask for help...

Lol, I am posting to ask for help. I'm sure it's something small but I don't know where to start...
 
> I'm using the Modules Garden's WHMCS module, but I'm not sure it's the module's fault.

Most likely not, since the module uses the exact same settings that appear in the machine's configuration. However, just to be sure, you can handle this as you would most such tricky problems: check whether it also occurs when you do the same thing in the Proxmox panel. If it does not, compare the configuration used in the module with the one in Proxmox. Even one small difference in configuration options can lead to unexpected problems.
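
The comparison suggested here could be done roughly like this. It is only a sketch: `qm config <vmid>` prints a VM's configuration, but the helper name `diff_vm_conf`, the VMIDs, and the file paths in the usage comment are placeholders.

```shell
# Hypothetical helper: show settings that differ between two VM config dumps.
# Lines are sorted first so that ordering differences do not show up as changes.
# Returns diff's exit status: 0 if identical, 1 if they differ.
diff_vm_conf() {
    a=$(mktemp) && b=$(mktemp) || return 2
    sort "$1" > "$a"
    sort "$2" > "$b"
    status=0
    diff "$a" "$b" || status=$?
    rm -f "$a" "$b"
    return "$status"
}

# Usage on a Proxmox node (100 = VM created in the panel, 101 = VM created by the module):
#   qm config 100 > /tmp/panel.conf
#   qm config 101 > /tmp/module.conf
#   diff_vm_conf /tmp/panel.conf /tmp/module.conf
```

A `scsihw` line that appears on only one side would point straight at the controller difference discussed in this thread.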

> I have approached Modules Garden who kindly offered to help but I just still need to make time and get a test setup going for them. I'll keep you posted, but please do let me know if you have solved this in the meanwhile.

Take your time Eugene, we're ready when you are!
 
@ModulesGarden you just made my day :) Your service is so ridiculously good and has been for all these years. Thank you so much. Will get in touch soon. Our business is growing at breakneck speed and I can hardly keep up.
 
It took me a year to isolate this problem, but by logging a ticket and double-checking everything I finally noticed this rookie mistake. The default for Hard Drive was set wrong! It's working now.

1689498902732.png
 
