Disk resize problem

Verified, size seems correct after clone:
Code:
Entering Qemu Monitor for VM 112 - type 'help' for help
qm> info block -v drive-scsi1

drive-scsi1 (#block318): /mnt/pve/NFS01/images/112/vm-112-disk-0.qcow2 (qcow2)
    Attached to:      scsi1
    Cache mode:       writeback, direct
    Detect zeroes:    unmap

Images:
image: /mnt/pve/NFS01/images/112/vm-112-disk-0.qcow2
file format: qcow2
virtual size: 172G (184683593728 bytes)
disk size: 132G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
qm>

And this is after online resize:
Code:
qm> info block -v drive-scsi1

drive-scsi1 (#block318): /mnt/pve/NFS01/images/112/vm-112-disk-0.qcow2 (qcow2)
    Attached to:      scsi1
    Cache mode:       writeback, direct
    Detect zeroes:    unmap

Images:
image: /mnt/pve/NFS01/images/112/vm-112-disk-0.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 132G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
qm>

The error in the management web interface is always:
VM 112 qmp command 'block_resize' failed - got timeout (500)
 
OK, I'm not able to reproduce this. Can you describe in detail all the steps you did, from exporting -> converting -> importing the .xva, including the exact version of Xen and the tools you used? I don't think that this is a general problem; it's more likely some specific issue with that VM.

What's interesting is that when you tried to clone the VM 104:
Code:
create full clone of drive scsi1 (NFS01:104/vm-104-disk-1.qcow2)
Formatting '/mnt/pve/NFS01/images/114/vm-114-disk-1.qcow2', fmt=qcow2 size=0 cluster_size=65536 preallocation=metadata lazy_refcounts=off refcount_bits=16

The size here was zero but should have been 172G, which indicates that there is some problem with the imported VM in the first place and/or with the storage the size is read from.
 
The host from which I exported is XCP-ng 7.5, a fork of Citrix XenServer.
'XCP-ng Center' is version 7.6.1.

Export went the following way:
- I created a snapshot of the live VM with XCP-ng Center and exported it to an .xva.
- extracted the content of the .xva with 'tar -xf file.xva'
- the resulting files were processed with 'xva-img' (https://github.com/eriklax/xva-img) into a raw image;
  an example is: ./xva-img -p disk-export "Ref:137" xvda.img
- then from Proxmox I mounted the NFS share where the raw image was created and imported it with the following command:
qm importdisk 114 xvda.img NFS01 --format=qcow2
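Put together, the whole pipeline for one disk looks like this (the "Ref:137" directory name, the image file name and the VM ID are just the values from this example and differ per disk):
Code:
# unpack the .xva archive exported from XCP-ng Center
tar -xf file.xva
# rebuild a raw disk image from one of the chunked "Ref:NNN" directories inside it
./xva-img -p disk-export "Ref:137" xvda.img
# import the raw image into the Proxmox storage NFS01 as qcow2
qm importdisk 114 xvda.img NFS01 --format=qcow2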

I did this with other VMs from this host and none of them showed the strange behaviour of this VM 104.
 
And as this is a CentOS 7 guest, on XenServer/XCP-ng it has no virtio drivers in its initramfs. So I first have to attach the disks as IDE, add virtio_scsi, virtio_net etc. to the initramfs, shut down the VM, detach the disks and reattach them as SCSI.
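What I run inside the guest for that is roughly the following dracut rebuild (just a sketch; the exact module list may vary):
Code:
# regenerate the initramfs of the running kernel and force-include the virtio modules
dracut --force --add-drivers "virtio_pci virtio_blk virtio_scsi virtio_net" \
    /boot/initramfs-$(uname -r).img $(uname -r)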
 
I redid the conversion from the .xva with another utility: https://gist.github.com/miebach/0433947bcf053de23159
I reimported the disks and attached them first as IDE, then as SCSI after adding the virtio drivers to the initramfs.
Successfully resized one disk from 150 to 170 GBytes on the original VM!

I then cloned the VM; resizing on the cloned VM destroyed the disk, but on the management screen I still see the 170 GB size.
I tried another resize on the original VM and this time it destroyed that disk as well, with the management screen now showing a size of 2 GB, the amount I tried to increase the disk by (see attached screenshot). It seems to me that something happens to the size information of the disks during the clone?

Here is just the output of the import of the converted images:
Code:
root@pve5:~# qm importdisk 104 /tmp/nfs/vs29/xvda-py.img NFS01 --format=qcow2
Formatting '/mnt/pve/NFS01/images/104/vm-104-disk-0.qcow2', fmt=qcow2 size=15032385536 cluster_size=65536 preallocation=metadata lazy_refcounts=off refcount_bits=16
    (100.00/100%)
root@pve5:~# qm importdisk 104 /tmp/nfs/vs29/xvdb-py.img NFS01 --format=qcow2
Formatting '/mnt/pve/NFS01/images/104/vm-104-disk-1.qcow2', fmt=qcow2 size=161061273600 cluster_size=65536 preallocation=metadata lazy_refcounts=off refcount_bits=16
    (100.00/100%)
root@pve5:~# qm importdisk 104 /tmp/nfs/vs29/xvdc-py.img NFS01 --format=qcow2
Formatting '/mnt/pve/NFS01/images/104/vm-104-disk-2.qcow2', fmt=qcow2 size=21474836480 cluster_size=65536 preallocation=metadata lazy_refcounts=off refcount_bits=16
    (100.00/100%)

Here is the log of the cloning of the VM:
Code:
create full clone of drive scsi2 (NFS01:104/vm-104-disk-2.qcow2)
Formatting '/mnt/pve/NFS01/images/114/vm-114-disk-0.qcow2', fmt=qcow2 size=21474836480 cluster_size=65536 preallocation=metadata lazy_refcounts=off refcount_bits=16
transferred: 0 bytes remaining: 21474836480 bytes total: 21474836480 bytes progression: 0.00 %
transferred: 216895848 bytes remaining: 21257940632 bytes total: 21474836480 bytes progression: 1.01 %
transferred: 431644213 bytes remaining: 21043192267 bytes total: 21474836480 bytes progression: 2.01 %
transferred: 648540061 bytes remaining: 20826296419 bytes total: 21474836480 bytes progression: 3.02 %
...
transferred: 21474836480 bytes remaining: 0 bytes total: 21474836480 bytes progression: 100.00 %
create full clone of drive scsi0 (NFS01:104/vm-104-disk-0.qcow2)
Formatting '/mnt/pve/NFS01/images/114/vm-114-disk-1.qcow2', fmt=qcow2 size=15032385536 cluster_size=65536 preallocation=metadata lazy_refcounts=off refcount_bits=16
transferred: 0 bytes remaining: 15032385536 bytes total: 15032385536 bytes progression: 0.00 %
transferred: 150323855 bytes remaining: 14882061681 bytes total: 15032385536 bytes progression: 1.00 %
...
transferred: 15032385536 bytes remaining: 0 bytes total: 15032385536 bytes progression: 100.00 %
create full clone of drive scsi1 (NFS01:104/vm-104-disk-1.qcow2)
Formatting '/mnt/pve/NFS01/images/114/vm-114-disk-2.qcow2', fmt=qcow2 size=182536110080 cluster_size=65536 preallocation=metadata lazy_refcounts=off refcount_bits=16
transferred: 0 bytes remaining: 182536110080 bytes total: 182536110080 bytes progression: 0.00 %
transferred: 1825361100 bytes remaining: 180710748980 bytes total: 182536110080 bytes progression: 1.00 %
...
transferred: 180948045922 bytes remaining: 1588064158 bytes total: 182536110080 bytes progression: 99.13 %
transferred: 182536110080 bytes remaining: 0 bytes total: 182536110080 bytes progression: 100.00 %
transferred: 182536110080 bytes remaining: 0 bytes total: 182536110080 bytes progression: 100.00 %
TASK OK

The sizes of the disk images look alright, don't they?

I am in the process of exporting the VM again from the XCP-ng server and redoing the import; maybe something went wrong during the export to the .xva file. But it seems strange, as the VM works fine before the clone operation.

Kind Regards,
Thomas
 

Attachments

  • 2019-08-30_15h39_01.png (13.7 KB)
I did reimport the disks, but there is no change in behaviour. Proxmox still destroys my disks after cloning the VM and doing a resize.

One thing I did notice: why is the cloned disk bigger than the original disk?
Code:
root@pve5:~# ls -ls /mnt/pve/NFS01/images/104/
total 146086221
  7849241 -rw-r----- 1 nobody nogroup  15034941440 Sep  2 11:14 vm-104-disk-0.qcow2
138236981 -rw-r----- 1 nobody nogroup 182234319872 Sep  2 11:14 vm-104-disk-1.qcow2
root@pve5:~# ls -ls /mnt/pve/NFS01/images/114/
total 146087345
138238089 -rw-r----- 1 nobody nogroup 193303281664 Sep  2 11:23 vm-114-disk-0.qcow2
  7849257 -rw-r----- 1 nobody nogroup  15034941440 Sep  2 11:24 vm-114-disk-1.qcow2
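It would probably be more telling to compare the two files with qemu-img info instead of ls -ls (which mixes apparent file size and allocated blocks), e.g.:
Code:
qemu-img info /mnt/pve/NFS01/images/104/vm-104-disk-1.qcow2
qemu-img info /mnt/pve/NFS01/images/114/vm-114-disk-0.qcow2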
 
I did what you described with a reference Xen VM and couldn't see any of your issues happening. Is it possible that your share isn't used exclusively and that there is another cluster accessing the same image files and messing around with them?
Do you have snapshots of the VMs?
 
Hi,
thanks for your efforts.

No, this is the only Proxmox cluster we have. The NFS storage is on its own network; no other devices can attach to it.
What do you mean by snapshots? Snapshots made on Proxmox or XenServer snapshots?
 
I sorted out a problem with the switch firmware that caused some RX errors on the NFS server side. But it seems that was not the problem.
It still eats my disks when cloning them.
 
I did another bit of testing. I added a 10 Gbit card to a test server, created an NFS share and connected our cluster to it.
The NFS server is now a CentOS 6 machine with a hardware RAID controller for the disks; NFS is now version 4, whereas on the Open-E cluster it is 3.

I imported the same disks as before into a VM on Proxmox, cloned it, and it behaves strangely. I now see the size of the bigger disk as '0T' (see attached screenshot). I also attached the log of the clone process; there is nothing unusual in there as far as I can see.

Files on the NFS storage look good to me:
Code:
root@pve5:~# ls -ls /mnt/pve/PVETEST/images/117/
total 189317436
 20593900 -rw-r----- 1 nobody 4294967294  21478375424 Sep  5 09:41 vm-117-disk-0.qcow2
154586580 -rw-r----- 1 nobody 4294967294 161086111744 Sep  5 09:42 vm-117-disk-1.qcow2
 14136956 -rw-r----- 1 nobody 4294967294  15034941440 Sep  5 09:41 vm-117-disk-2.qcow2

By the way, can I tell Proxmox to rescan the files on the NFS share, to possibly correct the '0T' size it sees?
I did not try to resize it; as it already shows 0, that would destroy the file for sure.

Thanks,
Thomas
 

Attachments

  • 2019-09-05_09h38_23.png (6.2 KB)
  • CloneLog.txt (32.5 KB)
I did one last test. I reimported the disks as usual, configured the VM, etc.
I started the cloned VM and tried to resize the smaller disks; this always worked. Then, before resizing the 170 GB disk, I edited its properties, just setting the 'no backup' option, saved it, and after that I could resize the disk without a problem. I did the same thing on the original VM and it also worked.
So some information must be getting lost during the clone operation...

I could use the steps above as a workaround, but that is not very comfortable and it is error-prone.

By the way: if we cannot resolve this here in the forum, is there a commercial support option? Maybe we could do a remote session?
 
The 'no backup' option really has nothing to do with this. AFAIK it isn't even evaluated when cloning or resizing.
You can force a rescan of the storage with:
qm rescan --vmid <integer>
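For example, for the test VM from your last post that would be:

qm rescan --vmid 117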

Sure, here are our offerings: https://www.proxmox.com/en/proxmox-ve/pricing
 
Hello,

I did not want to say that it is directly connected to the clone; in fact, I modified the setting after the cloning. Couldn't it be that Proxmox looks at the disks again when I modify such an option?

I will try the qm rescan tomorrow.
 
Yes, that would make sense, but in the case of the resize command the value in the config isn't used.
Can you do one more test and try to resize the disk while the VM isn't running?
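If it is easier to do on the command line, the equivalent of the GUI resize would be something along the lines of (VM ID and disk name taken from your earlier posts):

qm resize 117 scsi1 +2G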
 
Hi,

offline resizing works correctly. After the successful resize I started the VM and did another resize, and it destroyed the VM.
 
OK, great, we are getting closer to the actual problem.
Let's do some manual resizing to see what actually causes this.

First connect to the problematic VM with ssh.

tail -f /var/log/syslog

Keep that running and monitor what's happening while we do the resize.

Now on the host:

qemu-img info <path-to-qcow2>

Take note of the size in bytes.

qm monitor <vmid>

In the new qm shell, first list the block devices, then issue the resize manually:

info block -v
block_resize <nameofdrive> <size>

The name of the drive is the first word of the info output, e.g. drive-scsi0, and the size has to be given in absolute form, so if the disk had 170G, write 180G.
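A concrete example, assuming the affected drive turns out to be drive-scsi1 and you grow it from 170G to 180G:
Code:
qm> info block -v
qm> block_resize drive-scsi1 180G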

Then take a look at the syslog of the VM to see what happened.
 
I did what you told me, but there were no errors this time:

/var/log/messages said the right thing, I think:
Code:
Sep  6 13:32:13 vs29 kernel: sd 2:0:0:1: Capacity data has changed
Sep  6 13:32:13 vs29 kernel: sd 2:0:0:1: [sdc] 356515840 512-byte logical blocks: (182 GB/170 GiB)
Sep  6 13:32:13 vs29 kernel: sdc: detected capacity change from 161061273600 to 182536110080
Sep  6 13:32:13 vs29 kernel: sdc: unknown partition table

Proxmox Management console does not reflect the change though.
 
Ok, looks good.

Proxmox Management console does not reflect the change though.
Yeah, that's expected, as we didn't tell Proxmox anything about it.

The next step would be to add some debug printing to the actual code to see what's happening when the resize is executed via the management interface.

Open the file /usr/share/perl5/PVE/QemuServer.pm and locate "sub qemu_block_resize"; it's around line 4760.
Add the following line of code:

warn("DEBUG RESIZE:".$size);

Example:
Perl:
sub qemu_block_resize {
    my ($vmid, $deviceid, $storecfg, $volid, $size) = @_;

    my $running = check_running($vmid);

    $size = 0 if !PVE::Storage::volume_resize($storecfg, $volid, $size, $running);

    return if !$running;
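    # debug output added only for this test; remove it again afterwards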
    warn("DEBUG RESIZE:".$size);
    vm_mon_cmd($vmid, "block_resize", device => $deviceid, size => int($size));

}

After you have saved your changes, restart the pvedaemon:

systemctl restart pvedaemon.service

Now do the resize via the management console and monitor the host syslog for the debug warning. Let's see what size is reported here; as you can see, directly after our debug statement the actual resize command is sent for execution, which did work in the manual test you did before.
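To spot it more easily you can filter the syslog for the debug string, e.g.:

tail -f /var/log/syslog | grep 'DEBUG RESIZE'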

Make sure to remove the change after we finish testing; otherwise it will simply be overwritten with the next update anyway.
 
Hello

here is the output of syslog of the host:
Code:
Sep  6 15:01:14 pve5 pvedaemon[12142]: <admtpl@pve> end task UPID:pve5:00002F94:05AF51A2:5D725898:vncproxy:117:admtpl@pve: OK
Sep  6 15:01:28 pve5 pvedaemon[12142]: <admtpl@pve> update VM 117: resize --disk scsi1 --size +20G
Sep  6 15:01:28 pve5 pvedaemon[12142]: DEBUG RESIZE:21474836480 at /usr/share/perl5/PVE/QemuServer.pm line 4768.

The output of the VM is in the attached screenshot.
 

Attachments

  • 2019-09-06_15h03_06.png (72.4 KB)
