Disk resize problem

Thomas Plant

Member
Mar 28, 2018
Hello all,

I had a very strange and scary problem with one of our VMs when resizing a disk. The disks of the VM were imported through 'qm importdisk....' from a XenServer xva file. I cloned the VM to do a test and tried the resize on the clone. The VM had 3 disks (scsi0 -> 2). I tried to resize disk 1, which originally had 150GB, by adding 20GB, but instead the disk was truncated to 20GB! All data was gone.
I did the same thing on the original VM. There I got a lock error and the management window still showed 150GB, but it also killed the disk: the disk was damaged and unusable. Other virtual machines I tried resized correctly.
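
For illustration only, here is a minimal Python sketch of the arithmetic behind the symptom: a relative request of '+20G' on a 150GB disk should give 170GB, while treating the same number as an absolute size gives 20GB, which is a shrink. The helper names (parse_size, target_size) and the rule that a bare number means bytes are assumptions made for this example. It is not Proxmox code and makes no claim about where the actual bug is.

```python
import re

# Binary units, as used for the sizes shown in this thread (assumption).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

def parse_size(text):
    """Convert a size such as '20G' or '512M' to bytes (no suffix = bytes)."""
    m = re.fullmatch(r"(\d+)([KMGT]?)", text.strip().upper())
    if not m:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = m.groups()
    return int(number) * UNITS.get(unit, 1)

def target_size(current_bytes, requested):
    """Return the new size in bytes for a relative ('+20G') or absolute ('170G') request.

    Raises ValueError if the result would be smaller than the current size.
    """
    requested = requested.strip()
    if requested.startswith("+"):
        new_size = current_bytes + parse_size(requested[1:])
    else:
        new_size = parse_size(requested)
    if new_size < current_bytes:
        raise ValueError("refusing to shrink the disk")
    return new_size
```

With a guard like this, the '+20G' request yields 170GB, and a request that would end up at 20GB is rejected instead of truncating the image.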

So, stupid as I am (destroying all the evidence), I deleted the VM and reimported it from XenServer the same way I did previously. But now resizing works correctly.

Proxmox is Version 6 with subscription and updated, but the VM originally was imported through Proxmox 5. Storage is NFS on an Open-E Jovian Cluster.

Has anybody experienced such a problem? It makes me a little nervous about putting this Proxmox cluster into production.

Kind Regards,
Thomas
 

Attachments

  • pve-version.txt
    1.2 KB · Views: 6
I was able to reproduce the problem. I imported the VM, cloned it and tried to do an online resize of the disk... the data is gone.
I made a video of the steps I took, except for the clone of the disk. The VM I cloned was stopped/shut down.

Here is the video: Proxmox eats my data

I also added a screenshot of the NFS mount, where we can see that the disk I resized was in fact reduced to 12G.
The .conf file of the VM still shows a size of 172G:

root@pve5:~# cat /etc/pve/qemu-server/112.conf
boot: cdn
bootdisk: scsi0
cores: 4
cpu: Broadwell
ide2: none,media=cdrom
memory: 8192
name: test
net0: virtio=FE:BD:B3:60:79:40,bridge=vmbr0,link_down=1
net1: virtio=8A:5E:35:6C:AD:3A,bridge=vmbr1,link_down=1
numa: 1
ostype: l26
scsi0: NFS01:112/vm-112-disk-0.qcow2,discard=on,size=14G
scsi1: NFS01:112/vm-112-disk-1.qcow2,discard=on,size=172G
scsihw: virtio-scsi-pci
smbios1: uuid=a606156a-fc4e-4681-99a2-0ec478e108e0
sockets: 2
vmgenid: 1771e21d-635e-4476-ad36-cdd74243897c
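
A rough Python sketch of how the mismatch described above could be checked: it reads the size= values from the VM config text and compares one of them with the "virtual-size" field that `qemu-img info --output=json` prints for the image. The function names are made up for this example, and it assumes the config always uses a K/M/G/T suffix as in the file above.

```python
import json
import re

UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

def config_disk_sizes(conf_text):
    """Map each disk key (e.g. 'scsi1') to its size= value in bytes."""
    sizes = {}
    for line in conf_text.splitlines():
        # Only lines such as 'scsi1: ...' or 'virtio0: ...' describe disks.
        m = re.match(r"^((?:scsi|virtio|sata|ide)\d+):\s*(\S+)", line)
        if not m:
            continue
        size = re.search(r"(?:^|,)size=(\d+)([KMGT])", m.group(2))
        if size:  # entries like 'ide2: none,media=cdrom' have no size
            sizes[m.group(1)] = int(size.group(1)) * UNITS[size.group(2)]
    return sizes

def size_mismatch(config_bytes, qemu_img_json):
    """True if the image's virtual size differs from the configured size."""
    info = json.loads(qemu_img_json)
    return info["virtual-size"] != config_bytes
```

The JSON text would come from running `qemu-img info --output=json` on the image file. The sketch only compares numbers and does not touch the disk.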
 

Attachments

  • 2019-08-27_10h25_19.png
    4.4 KB · Views: 12
Hi,

Can you share what's in the syslog and task log of the host around the time when you cloned and resized the disk, and additionally your storage configuration (/etc/pve/storage.cfg)?
 
Ok, added the syslog and tar of /var/log/pve/tasks.

The operations took place from 09:55 to approx. 10:15 CEST.

Edit: added storage.cfg file to the zip
 

Attachments

  • logs.zip
    105.9 KB · Views: 8
Ok, seems there is some sort of timeout with the connection.
Try the following:

# qm monitor 112

In the new qm shell type:

qm> info block -v drive-scsi1

Please post the output or any errors you might get.
 
Here is the output:

Code:
qm> info block -v drive-scsi1

drive-scsi1 (#block301): /mnt/pve/NFS01/images/112/vm-112-disk-0.qcow2 (qcow2)
    Attached to:      scsi1
    Cache mode:       writeback, direct
    Detect zeroes:    unmap

Images:
image: /mnt/pve/NFS01/images/112/vm-112-disk-0.qcow2
file format: qcow2
virtual size: 172G (184683593728 bytes)
disk size: 132G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
qm> info block -v drive-scsi1

drive-scsi1 (#block301): /mnt/pve/NFS01/images/112/vm-112-disk-0.qcow2 (qcow2)
    Attached to:      scsi1
    Cache mode:       writeback, direct
    Detect zeroes:    unmap

Images:
image: /mnt/pve/NFS01/images/112/vm-112-disk-0.qcow2
file format: qcow2
virtual size: 10G (10737418240 bytes)
disk size: 7.5G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false


Code:
qm> info block -v drive-scsi0

drive-scsi0 (#block113): /mnt/pve/NFS01/images/112/vm-112-disk-1.qcow2 (qcow2)
    Attached to:      scsi0
    Cache mode:       writeback, direct
    Detect zeroes:    unmap

Images:
image: /mnt/pve/NFS01/images/112/vm-112-disk-1.qcow2
file format: qcow2
virtual size: 14G (15032385536 bytes)
disk size: 7.5G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

Could it be that the clone process rotated the disk assignment? In the 112.conf the disks are now configured this way:
scsi0: NFS01:112/vm-112-disk-1.qcow2,discard=on,size=14G
scsi1: NFS01:112/vm-112-disk-0.qcow2,discard=on,size=172G

And on the original VM they are ordered the right way around:
scsi0: NFS01:104/vm-104-disk-0.qcow2,discard=on,size=14G
scsi1: NFS01:104/vm-104-disk-1.qcow2,discard=on,size=172G
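
To make the comparison less error-prone than reading the lines by eye, here is a small Python sketch that extracts which disk-N image each bus slot points to and lists the slots that differ between two configs. The function names are hypothetical and the volume naming pattern vm-<id>-disk-<n> is taken from the lines above. A differing index only shows that the naming changed during the clone; on its own it does not prove that anything is wrong.

```python
import re

def disk_image_map(conf_text):
    """Map each disk key (e.g. 'scsi0') to the N of its 'vm-<id>-disk-N' volume."""
    mapping = {}
    for line in conf_text.splitlines():
        m = re.match(r"^(\w+\d+):\s*\S*?vm-\d+-disk-(\d+)", line)
        if m:
            mapping[m.group(1)] = int(m.group(2))
    return mapping

def changed_slots(original_conf, clone_conf):
    """Return the disk keys whose image index differs between the two configs."""
    original = disk_image_map(original_conf)
    clone = disk_image_map(clone_conf)
    return sorted(k for k in original if k in clone and original[k] != clone[k])
```

Feeding it the two pairs of lines quoted above reports both scsi0 and scsi1 as changed.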
 
In the first code block, you did the "info block -v drive-scsi1" twice, with different results. Did you change anything in between or did it report 2 different images for the same drive?
 
Edit: I am a little confused... In the first code block I ran the first command before the resize and the second one after the resize.
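
For anyone repeating the test, the before/after comparison can be scripted. The Python sketch below pulls the byte counts out of the "virtual size" lines of saved `info block -v` output and flags a shrink. The function names are invented for this example, and it assumes the output format shown in the code block above.

```python
import re

def virtual_sizes(monitor_output):
    """Return every 'virtual size' in bytes, in the order it appears."""
    return [
        int(b)
        for b in re.findall(r"virtual size: \S+ \((\d+) bytes\)", monitor_output)
    ]

def shrank(before_bytes, after_bytes):
    """True if the disk got smaller between the two measurements."""
    return after_bytes < before_bytes
```

Applied to the two outputs for drive-scsi1 above, it would compare 184683593728 bytes against 10737418240 bytes and report a shrink.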
 
Tried to clone the 104 VM a second time and now even this errors out:

Code:
transferred: 14646053227 bytes remaining: 386332309 bytes total: 15032385536 bytes progression: 97.43 %
transferred: 14797880321 bytes remaining: 234505215 bytes total: 15032385536 bytes progression: 98.44 %
transferred: 14948204176 bytes remaining: 84181360 bytes total: 15032385536 bytes progression: 99.44 %
transferred: 15032385536 bytes remaining: 0 bytes total: 15032385536 bytes progression: 100.00 %
transferred: 15032385536 bytes remaining: 0 bytes total: 15032385536 bytes progression: 100.00 %
create full clone of drive scsi1 (NFS01:104/vm-104-disk-1.qcow2)
Formatting '/mnt/pve/NFS01/images/114/vm-114-disk-1.qcow2', fmt=qcow2 size=0 cluster_size=65536 preallocation=metadata lazy_refcounts=off refcount_bits=16
transferred: 0 bytes remaining: 0 bytes total: 0 bytes progression: 0.00 %
qemu-img: output file is smaller than input file
TASK ERROR: clone failed: copy failed: command '/usr/bin/qemu-img convert -p -n -f qcow2 -O qcow2 /mnt/pve/NFS01/images/104/vm-104-disk-1.qcow2 zeroinit:/mnt/pve/NFS01/images/114/vm-114-disk-1.qcow2' failed: exit code 1
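
The telling detail in this log is the "size=0" on the Formatting line: the target image was created with zero size, so the copy could not fit. A short Python sketch that scans a saved task log for such lines follows. The function names are hypothetical and the pattern is based only on the log format shown above.

```python
import re

def formatted_sizes(task_log):
    """Map each target image path to the size= value from its 'Formatting' line."""
    # \bsize= skips 'cluster_size=' because '_' is a word character.
    pattern = r"^Formatting '([^']+)',.*?\bsize=(\d+)"
    return {
        path: int(size)
        for path, size in re.findall(pattern, task_log, flags=re.MULTILINE)
    }

def zero_sized_targets(task_log):
    """Return the target images that were created with size=0."""
    return [path for path, size in formatted_sizes(task_log).items() if size == 0]
```

It only reads the log text, so it can be run on a copy of the files under /var/log/pve/tasks without touching the VM.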
 
Sure you can, but I would prefer to go on here just to not have to look at two different places for answers.
Do you have local storage where you can test this as well, just to rule out the NFS share and make things a little bit easier?

I haven't been able to reproduce it yet, but I'm on it.
 
Hi,

no problem, if you're working on it, I'll wait. Sorry, there is not enough local storage to test, the VM is too big... I will see if the same problem exists with a smaller one and test it on the NFS and then on local storage.
 
On local or nfs storage?
 
Interestingly, after the disk was destroyed during the resize, resizing the now useless disk works.
 
Did another test... Resizing with the VM turned off after cloning works too. Only when I do an online resize does it kill my disk.
I did the following: resized the disk with the VM shut down, which went OK. Then I started the VM and did another resize, which reduced the disk to the size of the increment instead of adding it, destroying the disk.
 
Next test: I cloned the VM again, started it up and resized the disk, but entered the final size I wanted the disk to have in the resize window, and voila, it was expanded to the right size and the VM still works. And now the 'normal' way of increasing the disk size works too; I tried twice and everything worked as it should.
 
My first guess was, that qemu doesn't know about the correct size, that's why I asked for "info block -v drive-scsi1", but from the output you gave above, the size seems to be correct. Can you verify that "info block -v drive-scsi1" is the true size before the first time resizing?
 
