Ubuntu VM won't start when cloned to LVM iSCSI shared storage; local storage is OK

lost_avocado

Hi
I'm having issues deploying VMs from a template to LVM shared storage (iSCSI); I'm fairly sure this worked fine a few weeks ago. Cloning the same template to local storage gives no problems. This is a 4-node cluster set up as a POC, so far on a no-subscription license. Hopefully this can be resolved, but it was a bit of a setback to say the least :)

Other VMs on the same storage are fine; they boot and migrate without problems.
I have tried a few things, like changing the disk bus from SCSI to IDE and VirtIO, and also changing the machine parameter, but so far nothing works. Has anybody stumbled across similar issues lately?
Any tips would be greatly appreciated.

The VM starts, then stops and drops into a shell ...
Gave up waiting for root file system device. Common problems:
- Boot args (cat /proc/cmdline)
- Check rootdelay= (did the system wait long enough?)

- Missing modules (cat /proc/modules; ls /dev)
ALERT! LABEL=cloudimg-rootfs does not exist. Dropping to a shell!

BusyBox v1.30.1 (Ubuntu 1:1.30.1-7ubuntu3) built-in shell (ash)
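For reference, from that BusyBox shell one can at least check whether the label is visible to the initramfs at all (a minimal sketch; which tools are available varies between initramfs builds):
Code:
(initramfs) cat /proc/cmdline
(initramfs) blkid
(initramfs) ls /dev/disk/by-label/
If blkid shows the disk but without the cloudimg-rootfs label, the data on the volume itself is suspect rather than the storage connection.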


VM config:
:~# qm config 6001
agent: 1
boot: order=ide2;scsi0;ide0;net0
cipassword: **********
ciuser: admin
cores: 2
cpu: kvm64
ide0: SAS-1:vm-6001-cloudinit,media=cdrom,size=4M
ide2: none,media=cdrom
ipconfig0: ip=dhcp
memory: 2048
meta: creation-qemu=8.1.5,ctime=1707403862
name: 6001-iscsi-slettes
nameserver: 1.1.1.1
net0: virtio=BC:24:11:78:C6:4E,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: SAS-1:vm-6001-disk-0,size=10G
scsihw: virtio-scsi-pci
searchdomain:
serial0: socket
smbios1: uuid=56fc8a0e-a43c-47b1-b172-a3ee98e97f0d
sockets: 1
vga: serial0
vmgenid: 9c4ffbb0-78ae-4311-ac48-75da6739919d

pveversion -v
proxmox-ve: 8.1.0 (running kernel: 6.5.11-8-pve)
pve-manager: 8.1.4 (running version: 8.1.4/ec5affc9e41f1d79)
proxmox-kernel-helper: 8.1.0
pve-kernel-6.2: 8.0.5
proxmox-kernel-6.5: 6.5.11-8
proxmox-kernel-6.5.11-8-pve-signed: 6.5.11-8
proxmox-kernel-6.5.11-7-pve-signed: 6.5.11-7
proxmox-kernel-6.2.16-20-pve: 6.2.16-20
proxmox-kernel-6.2: 6.2.16-20
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph-fuse: 17.2.7-pve2
corosync: 3.1.7-pve3
criu: 3.17.1-2
dnsmasq: 2.89-1
frr-pythontools: 8.5.2-1+pve1
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx8
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.0.7
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.1.0
libpve-guest-common-perl: 5.0.6
libpve-http-server-perl: 5.0.5
libpve-network-perl: 0.9.5
libpve-rs-perl: 0.8.8
libpve-storage-perl: 8.0.5
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve4
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.1.4-1
proxmox-backup-file-restore: 3.1.4-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.1.3
pve-cluster: 8.0.5
pve-container: 5.0.8
pve-docs: 8.1.3
pve-edk2-firmware: 4.2023.08-3
pve-firewall: 5.0.3
pve-firmware: 3.9-1
pve-ha-manager: 4.0.3
pve-i18n: 3.2.0
pve-qemu-kvm: 8.1.5-2
pve-xtermjs: 5.3.0-3
qemu-server: 8.0.10
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.2-pve1
 
While narrowing down the issue: NFS and Ceph are working fine.
Cloning the Ubuntu template from the NFS datastore to the same datastore works OK... and live-cloning that VM to the iSCSI datastore works OK as well.
But cloning the same template directly to the iSCSI datastore fails:

Gave up waiting for root file system device. Common problems:
- Boot args (cat /proc/cmdline)
- Check rootdelay= (did the system wait long enough?)

- Missing modules (cat /proc/modules; ls /dev)
ALERT! LABEL=cloudimg-rootfs does not exist. Dropping to a shell!


It seems like the clone process messes up the disk somehow, while the iSCSI datastore itself is fine. Is there a way to troubleshoot that process?
What's the difference between cloning from a template and cloning from a running VM?

Any tips would be greatly appreciated :)
 
When doing a qemu-img compare between the original template disk and the newly created VM disk, we get 'Content mismatch at offset 4096!'.
We see this error both with qm clone and with qm importdisk to the iSCSI datastore.

The only way to get a VM up and running on the iSCSI datastore is to create the same VM on either Ceph or NFS and use good old cp to replace the troubled imported disk. So it seems to be related to the qemu-img side of things.
 
Procedure to reproduce.
Original disk file:
/dev/SAS-1/vm-150-disk-0

qm clone 150 4202 --full --storage SAN-A --target pve4 --name Ubuntu-iscsi-1
qm clone 150 4201 --full --storage TRUENAS --target pve4 --name Ubuntu-NFS-1

NFS VM disk:
/mnt/pve/TRUENAS/images/4201/vm-4201-disk-0.raw
qemu-img compare /dev/SAS-1/vm-150-disk-0 /mnt/pve/TRUENAS/images/4201/vm-4201-disk-0.raw
Images are identical.

iSCSI VM disk:
/dev/SAS-1/vm-4202-disk-0
qemu-img compare /dev/SAS-1/vm-150-disk-0 /dev/SAS-1/vm-4202-disk-0
Content mismatch at offset 4096!

dd if=/mnt/pve/TRUENAS/images/4201/vm-4201-disk-0.raw of=/dev/SAS-1/vm-4202-disk-0 status=progress
2358632960 bytes (2.4 GB, 2.2 GiB) copied, 296 s, 8.0 MB/s
4612096+0 records in
4612096+0 records out
2361393152 bytes (2.4 GB, 2.2 GiB) copied, 296.656 s, 8.0 MB/s
qemu-img compare /dev/SAS-1/vm-150-disk-0 /dev/SAS-1/vm-4202-disk-0
Images are identical.
(the transfer speed is limited to 100 Mbit/s, just in case higher speed causes any corruption)

Is it possible to add some verification parameter to qm clone, or to use another version of QEMU?

qemu-system-x86_64 --version
QEMU emulator version 8.1.5 (pve-qemu-kvm_8.1.5-3)
Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers
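
As far as I can tell qm clone has no verify flag, but the comparison can at least be scripted after each clone (a sketch using the IDs from above; the lvchange is an assumption, in case the new LV is not activated on this node):
Code:
qm clone 150 4203 --full --storage SAN-A --name Ubuntu-iscsi-verify
lvchange -ay SAS-1/vm-4203-disk-0
qemu-img compare -p /dev/SAS-1/vm-150-disk-0 /dev/SAS-1/vm-4203-disk-0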
 
Could this be related to the physical/logical sector sizes? NFS reports logical and physical sector sizes of 512 bytes, vs. 512/4096 bytes (512e) for iSCSI. I cannot find an option to set the QEMU sector size anywhere, just an old bug report:
https://bugzilla.proxmox.com/show_bug.cgi?id=3282

NFS:
fdisk -l /mnt/pve/TRUENAS/images/4202/vm-4202-disk-0.raw
Disk /mnt/pve/TRUENAS/images/4202/vm-4202-disk-0.raw: 2.2 GiB, 2361393152 bytes, 4612096 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt

iSCSI:
fdisk -l /dev/mapper/SAS--1-vm--4201--disk--0
Disk /dev/mapper/SAS--1-vm--4201--disk--0: 2.2 GiB, 2361393152 bytes, 4612096 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 1048576 bytes
Disklabel type: gpt
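
The host-side values can also be read directly with blockdev. QEMU itself has logical_block_size/physical_block_size properties on scsi-hd devices; qm does not expose them, so the args override below is an untested assumption (the device id scsi0 is assumed from the generated command line):
Code:
# logical and physical sector size as the host sees them
blockdev --getss --getpbsz /dev/mapper/SAS--1-vm--4201--disk--0
# hypothetical: pin the guest-visible block size via QEMU's -set
qm set 4202 --args '-set device.scsi0.physical_block_size=4096'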
 
1. Could you also provide your storage.cfg?
2. What is the iSCSI target?
3. Could you also post your lvm.conf, please?
4. Could you also post the clone task log, and the journalctl output covering the time period of the clone?

thanks!
 
I had to redo all of this, as the previous VMs were deleted. I'm still getting disk corruption when cloning to the iSCSI datastore, but now at a different offset: "Content mismatch at offset 117497856!". This one doesn't actually stop Ubuntu from booting, but I'm pretty sure a Windows VM would not boot.

1. storage.cfg
Code:
root@pve4:~# cat /etc/pve/storage.cfg
iscsi: Lenovo4200-1
        portal 10.1.x.x
        target iqn.2002-09.com.lenovo:01.array.00c0ff3b2e24
        content none

iscsi: Lenovo4200-2
        portal 10.2.x.x
        target iqn.2002-09.com.lenovo:01.array.00c0ff3b2e24
        content none


lvm: SAN-A
        vgname SAS-1
        content images
        shared 1

2. The iSCSI target is a Lenovo DS4200 in a multipath config.

3. lvm.conf (as far as I can see everything was commented out, except for the last lines of the file)
Code:
devices {
     # added by pve-manager to avoid scanning ZFS zvols and Ceph rbds
     # global_filter=["r|/dev/zd.*|"]
     global_filter=["r|/dev/zd.*|","r|/dev/rbd.*|"]
 }

4. Journal and clone task log
Code:
root@pve4:~# journalctl --since 09:04
Mar 12 09:04:02 pve4 pmxcfs[2649]: [status] notice: received log
Mar 12 09:04:02 pve4 pmxcfs[2649]: [status] notice: received log
Mar 12 09:04:02 pve4 pmxcfs[2649]: [status] notice: received log
Mar 12 09:04:03 pve4 qm[2119485]: <root@pam> starting task UPID:pve4:002057D6:0374CF95:65F00C73:qmclone:105:root@pam:
Mar 12 09:04:06 pve4 kernel: nfs: server x.x.x.x not responding, timed out
Mar 12 09:04:07 pve4 qm[2119485]: <root@pam> end task UPID:pve4:002057D6:0374CF95:65F00C73:qmclone:105:root@pam: OK
Mar 12 09:04:12 pve4 kernel: sd 6:0:0:0: alua: supports implicit TPGS
Mar 12 09:04:12 pve4 kernel: sd 6:0:0:0: alua: device naa.600c0ff0003b1e68e23f5b6501000000 port group 0 rel port 1
Mar 12 09:04:12 pve4 kernel: sd 7:0:0:0: alua: supports implicit TPGS
Mar 12 09:04:12 pve4 kernel: sd 7:0:0:0: alua: device naa.600c0ff0003b1e68e23f5b6501000000 port group 1 rel port 7
Mar 12 09:04:12 pve4 kernel: sd 8:0:0:0: alua: supports implicit TPGS
Mar 12 09:04:12 pve4 kernel: sd 8:0:0:0: alua: device naa.600c0ff0003b1e68e23f5b6501000000 port group 0 rel port 3
Mar 12 09:04:12 pve4 kernel: sd 9:0:0:0: alua: supports implicit TPGS
Mar 12 09:04:12 pve4 kernel: sd 9:0:0:0: alua: device naa.600c0ff0003b1e68e23f5b6501000000 port group 1 rel port 5

create full clone of drive ide0 (TRUENAS:105/vm-105-cloudinit.raw)
  Logical volume "vm-4201-cloudinit" created.
create full clone of drive scsi0 (TRUENAS:105/base-105-disk-0.raw)
  Wiping gpt signature on /dev/SAS-1/vm-4201-disk-0.
  Wiping gpt signature on /dev/SAS-1/vm-4201-disk-0.
  Wiping PMBR signature on /dev/SAS-1/vm-4201-disk-0.
  Logical volume "vm-4201-disk-0" created.
transferred 0.0 B of 2.2 GiB (0.00%)
transferred 24.8 MiB of 2.2 GiB (1.10%)
.....
transferred 2.2 GiB of 2.2 GiB (100.00%)
transferred 2.2 GiB of 2.2 GiB (100.00%)
TASK OK
 
What about lvmconfig --typeconfig full devices/issue_discards?

Could you try doing a test clone of a VM with a very small disk and comparing the source and clone LVs? And then maybe repeat that experiment, but fully write the source volume before the clone?

My guess is your storage box might be misbehaving with regard to either holes or discards.
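
To spell the two experiments out (a sketch; VMID 9000 stands for a throwaway test VM whose small disk lives on SAN-A, and the dd destroys that disk's contents):
Code:
# 1) clone the small test VM and compare source vs. clone
qm clone 9000 9001 --full --storage SAN-A
qemu-img compare /dev/SAS-1/vm-9000-disk-0 /dev/SAS-1/vm-9001-disk-0
# 2) fully write the source volume, then clone and compare again
dd if=/dev/urandom of=/dev/SAS-1/vm-9000-disk-0 bs=1M oflag=direct status=progress
qm clone 9000 9002 --full --storage SAN-A
qemu-img compare /dev/SAS-1/vm-9000-disk-0 /dev/SAS-1/vm-9002-disk-0
If only the first compare shows a mismatch, zero/hole handling is the prime suspect, since a fully written source leaves no holes to mishandle.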
 
Can you find out WHAT the actual mismatch is?
Thank you for your reply, LnxBil.
With the last clone I got the mismatch at a different offset.
If I do a hexdump at that offset I get the result below, if that makes any sense; this is, to be fair, over my head.
Offset 117497856:
0700e000: ffff ffff ffff ffff ffff ffff ffff ffff ................
What about lvmconfig --typeconfig full devices/issue_discards?

Could you try doing a test clone of a VM with a very small disk and comparing the source and clone LVs? And then maybe repeat that experiment, but fully write the source volume before the clone?

My guess is your storage box might be misbehaving with regard to either holes or discards.
Thank you for looking into this, Fabian.
We will try those tests and see if we can find anything. I'll report back later.
 
Thank you for your reply, LnxBil.
With the last clone I got the mismatch at a different offset.
If I do a hexdump at that offset I get the result below, if that makes any sense; this is, to be fair, over my head.
Offset 117497856:
0700e000: ffff ffff ffff ffff ffff ffff ffff ffff ................
Fabian's questions and mine go in the same direction: I would like to know the data at the mismatch on both disks. The output you provided is "all ones"; the same data from the other storage would also be nice to see.
 
Fabian's questions and mine go in the same direction: I would like to know the data at the mismatch on both disks. The output you provided is "all ones"; the same data from the other storage would also be nice to see.
Thank you for your response. Here is another try; the offset is different, as you can see.

Content mismatch at offset 114294784!
Code:
xxd -s 114294784 -l 16 /mnt/pve/TRUENAS/images/105/base-105-disk-0.raw
06d00000: 0000 0000 0000 0000 0000 0000 0000 0000  ................

xxd -s 114294784 -l 16 /dev/mapper/SAS--1-vm--4205--disk--0
06d00000: 0a11 0603 0000 0001 2000 0000 0000 0000  ........ .......

The TrueNAS disk is on NFS and is the template.
 
If I use dd to copy the same disk file to the newly created VM, there is no content mismatch; that has been consistent every time I've done it.
Based on that, I assumed it could have something to do with the clone process. Doesn't that make sense?

Code:
dd if=/mnt/pve/TRUENAS/images/105/base-105-disk-0.raw of=/dev/mapper/SAS--1-vm--4201--disk--0
4612096+0 records in
4612096+0 records out
2361393152 bytes (2.4 GB, 2.2 GiB) copied, 310.285 s, 7.6 MB/s
root@pve4:~# qemu-img compare /mnt/pve/TRUENAS/images/105/base-105-disk-0.raw /dev/mapper/SAS--1-vm--4201--disk--0
Images are identical.
 
Given that the failure seems to be related to iSCSI, you may want to enable iSCSI Digest for both headers and data (if your storage supports it).

The other option is to find the smallest image size that causes the issue and capture network traces source/client and client/destination. That way you may at least be able to point to the originator of the "bad" data.
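
With open-iscsi on the PVE nodes, enabling the digests would look roughly like this (a sketch; the change needs a session logout/login to take effect, and the DS4200 must accept digests):
Code:
iscsiadm -m node -T iqn.2002-09.com.lenovo:01.array.00c0ff3b2e24 \
  -o update -n node.conn[0].iscsi.HeaderDigest -v CRC32C
iscsiadm -m node -T iqn.2002-09.com.lenovo:01.array.00c0ff3b2e24 \
  -o update -n node.conn[0].iscsi.DataDigest -v CRC32C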


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
I still think this is a problem in LVM (as @fabian already stated): that it does not overwrite the data with zeros but just ignores it. Maybe some "optimization"?
 
I still think this is a problem in LVM (as @fabian already stated): that it does not overwrite the data with zeros but just ignores it. Maybe some "optimization"?
It's possible. Or it's at the storage layer, where zeroes can also be optimized (we do that). Perhaps this particular storage does not do it properly.

One way to take PVE out of the mix is to create an LVM slice of sufficient size and dd the raw image to it; does it compare properly?
Another way is to skip LVM and dd the image directly to the LUN; does the data (up to the size of the image) compare properly?

One could also use fio write-verify with a well-known pattern, i.e. one that includes zeroes; see the sketch below.
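
For the fio variant, something along these lines against a scratch LV (destructive; the LV name is a placeholder):
Code:
lvcreate -L 1G -n fio-test SAS-1
fio --name=verify --filename=/dev/SAS-1/fio-test --rw=write --bs=1M \
  --direct=1 --verify=crc32c --verify_fatal=1
lvremove -y SAS-1/fio-test
Switching to --verify=pattern with a zero --verify_pattern would target the zero-handling case specifically.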

@LnxBil, reading @fabian's message I think he is pointing at the backend storage, not the LVM layer. But I could be misreading.


PS: upon checking a bit further into this, LVM does not do any data optimization, so it would be an unlikely culprit here.


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Yes, I suspect the storage is doing some "wrong" optimization w.r.t. holes or discards; we've seen that in the past with some proprietary storage boxes. A plain dd might not trigger it, since it doesn't do sparse copying by default.
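
That difference is easy to exercise by hand (a sketch reusing the paths from the earlier posts; qemu-img convert is close to what an offline clone does, and -n skips target creation since the LV already exists):
Code:
qemu-img convert -n -O raw /mnt/pve/TRUENAS/images/105/base-105-disk-0.raw /dev/SAS-1/vm-4202-disk-0
qemu-img compare /mnt/pve/TRUENAS/images/105/base-105-disk-0.raw /dev/SAS-1/vm-4202-disk-0
If this reproduces the mismatch while a plain dd does not, the zero/discard path to the array is the culprit.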
 
@LnxBil, reading @fabian's message I think he is pointing at the backend storage, not the LVM layer. But I could be misreading.
Yes, maybe I misread the comment about the issue_discards option. I thought this was an LVM option that does the discarding before sparse-copying the disk image; therefore I suspected LVM. The underlying problem is of course the storage, which does not honor this: if the discard is not interpreted correctly and the blocks are not actually zeroed, one gets exactly the problem described here.
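
One way to probe that directly on the array (a destructive sketch against a scratch LV only; whether discarded blocks are guaranteed to read back as zero depends on the LUN):
Code:
lvcreate -L 1G -n discard-test SAS-1
dd if=/dev/urandom of=/dev/SAS-1/discard-test bs=1M count=1024 oflag=direct
blkdiscard /dev/SAS-1/discard-test
# a clean run only reports EOF on the LV; any "differ" line means stale data survived the discard
cmp /dev/SAS-1/discard-test /dev/zero
lvremove -y SAS-1/discard-test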
 
This is a Lenovo DS4200, which isn't that rare, I guess. It's been in production for some years connected to other systems and we have not had any issues with it.
If I do a clone (getting the content mismatch), then fill the disk with zeros, delete the VM, and clone the exact same template again, the images are identical. Is this a backend storage issue? Shouldn't the clone process handle this?

One more thing:
I tried editing the lvm.conf file and uncommented the parameter "wipe_signatures_when_zeroing_new_lvs = 1", but that didn't solve anything.
Is it possible to somehow force qm clone to do an extra pass of zeroing out the disks?
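
A possible manual workaround in the meantime (a sketch, untested: create the LV yourself and import with sparse detection disabled, since -S 0 makes qemu-img write zero ranges explicitly instead of skipping them):
Code:
lvcreate -L 2361393152b -n vm-4202-disk-0 SAS-1
qemu-img convert -n -S 0 -O raw /mnt/pve/TRUENAS/images/105/base-105-disk-0.raw /dev/SAS-1/vm-4202-disk-0
qemu-img compare /mnt/pve/TRUENAS/images/105/base-105-disk-0.raw /dev/SAS-1/vm-4202-disk-0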
 
