qcow2 disk image corruption

cspiegel

New Member
May 17, 2009
Hi,

I'm currently deploying Proxmox VE 1.2 as a possible replacement for certain VMware products, and I'm very satisfied with the Proxmox VE concept so far. Great work!

There's just one thing which was driving me crazy last week. I set up a brand new kvm-based machine and installed Windows 2003 Server. No problems up to then. By default, Proxmox VE sets up new machines with qcow2 disk images. I would also prefer these, since they allow the creation of snapshots even while the machine is running, which can be quite helpful.

During installation of several software products on the new Windows server, everything went quite smoothly at first. But then, all of a sudden, Windows told me "error while writing to disk, data loss occurred". The message came up every now and then (together with an "attention" tray icon) and I started wondering. Still, Windows worked as expected. I shut down the machine (again, the "error while writing to disk" message popped up several times) and afterwards I was unable to start it up again. The emulated BIOS told me something like "no bootable media found". Obviously, the qcow2 disk image had become completely corrupted in the meantime.

After some searching in the QEMU forums, I found that there have been at least three bugfixes regarding corrupted qcow2 disk images since QEMU version 0.10.0, which is the version in the qemu-server package of Proxmox VE 1.2 (cf. QEMU changelog). Other users obviously had the same problem. Trying to use qemu-img to convert the qcow2 disk to e.g. a raw image results in either "error while reading" or "segmentation fault". There seems to be no way to fix broken qcow2 images. Weird things also happened when snapshots had previously been created on one of the affected images: sometimes they just vanished, and sometimes I was unable to remove them from the image.
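
A minimal sketch, not part of the original report: one rough sanity check is whether the file still begins with the qcow2 magic bytes ("QFI" followed by 0xfb). The function name and the image path argument are hypothetical, and an intact magic number says nothing about damage deeper in the image.

```shell
#!/bin/sh
# Return 0 if the file starts with the qcow2 magic bytes "QFI\xfb".
# This only looks at the first four bytes of the header; it does not
# inspect the cluster tables or the guest data.
has_qcow2_magic() {
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "514649fb" ]
}

# Optional usage: pass a (hypothetical) image path as the first argument.
if [ -n "${1:-}" ]; then
    if has_qcow2_magic "$1"; then
        echo "$1: qcow2 magic present"
    else
        echo "$1: qcow2 magic missing"
    fi
fi
```

If the magic is missing, the header itself has been overwritten; if it is present, the damage (if any) lies elsewhere in the image.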

As a workaround, I have now converted my qcow2 files to raw files right after disk creation and used the Proxmox VE web interface to delete the qcow2 file and switch to the raw image. No problems so far. However, there's no way to create snapshots of raw disk images.
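
As a rough sketch of such a conversion (the file names and the RUN switch are assumptions for illustration, and the guest should be shut down before its image is converted):

```shell
#!/bin/sh
# Hypothetical image paths; adjust to the actual VM directory.
SRC="${SRC:-vm-101-disk.qcow2}"
DST="${DST:-vm-101-disk.raw}"

# Build the conversion command: -f names the source format,
# -O names the output format.
convert_cmd() {
    printf 'qemu-img convert -f qcow2 -O raw %s %s\n' "$1" "$2"
}

# Print the command by default; set RUN=1 to actually execute it.
if [ "${RUN:-0}" = "1" ]; then
    qemu-img convert -f qcow2 -O raw "$SRC" "$DST"
else
    convert_cmd "$SRC" "$DST"
fi
```

The script only prints the command unless RUN=1 is set, so it can be reviewed before anything touches the image.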

However, it would be great to have an update of the qemu-server package. The current version available at QEMU is 0.10.4. Question for the Proxmox VE package maintainers: how much trouble would it be to provide an update? It would be highly appreciated.

Best wishes,
Christoph
 
However, it would be great to have an update of the qemu-server package. The current version available at QEMU is 0.10.4.

We use kvm (which is a bit different from qemu). We will release an update after kvm people release an update (kvm-86).

- Dietmar
 
We use kvm (which is a bit different from qemu). We will release an update after kvm people release an update (kvm-86).

Hi Dietmar,

thanks for the quick response.

Do I understand correctly that in the Proxmox VE 1.2 distribution, there are (among others) these two packages:

pve-kvm/lenny uptodate 85-1
qemu-server/lenny uptodate 1.0-12

The first one contains the kvm components, the second one provides QEMU server tools like qemu-img.

Does that mean that qemu-server provides the QEMU server tools but the QEMU core components (such as disk image core components) are included in the pve-kvm package?

Thanks for clarification.

Regards
Christoph
 
I never had this. Please describe your host hardware and your guest setup (post your /etc/qemu-server/VMID.conf of the guest).
 
I never had this. Please describe your host hardware and your guest setup (post your /etc/qemu-server/VMID.conf of the guest).

Hi Tom,

if I browse the KVM or QEMU bug trackers, I can see that multiple users have the same problem. It occurs mostly (if not exclusively) with Windows guests, and the image becomes corrupted at the very moment Windows finishes shutting down.

This is my machine config:
Code:
name: somename
ide2: none,media=cdrom
smp: 4
bootdisk: ide0
ostype: w2k3
memory: 2048
vlan0: virtio=somemacaddress
I'm running Proxmox VE 1.2 on a dual-socket, eight-core Intel machine with a 3ware RAID controller. I have two LVM PVs; I installed PVE on the first one and added the second one to /dev/pve/data.

For me, too, the problem only occurred with Windows guests and only if the underlying disk image was in qcow2 format. I converted to raw as a workaround and have had no problems so far.

I'm pretty certain that this issue is related to the KVM bug linked in my previous post. My impression is that the code causing the trouble was carried over from the QEMU project into KVM. The bug is still unresolved in KVM.

Regards
Christoph
 
thanks for info. can you try with just one CPU and with e1000 instead of virtio? not that I am sure that this helps but if possible, give it a try.
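
A sketch of what that test configuration might look like, applied to a scratch copy of the config posted above. It assumes that the `vlan0: model=macaddress` syntax accepts `e1000` as a model name in this Proxmox VE version; the temporary file stands in for /etc/qemu-server/VMID.conf.

```shell
#!/bin/sh
# Work on a scratch copy; the real file would be /etc/qemu-server/VMID.conf.
conf=$(mktemp)
cat > "$conf" <<'EOF'
name: somename
ide2: none,media=cdrom
smp: 4
bootdisk: ide0
ostype: w2k3
memory: 2048
vlan0: virtio=somemacaddress
EOF

# Drop to one virtual CPU and swap the virtio NIC for an emulated e1000
# (assumed model name, see the note above).
new_conf=$(sed -e 's/^smp: .*/smp: 1/' \
               -e 's/^vlan0: virtio=/vlan0: e1000=/' "$conf")
printf '%s\n' "$new_conf"
rm -f "$conf"
```

Changing one setting at a time (CPU count first, or the network card first) makes it easier to tell which of the two actually matters.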
 
Does that mean that qemu-server provides the QEMU server tools but the QEMU core components (such as disk image core components) are included in the pve-kvm package?

No. Please use 'dpkg' if you want to examine package contents:

Code:
dpkg -L pve-kvm
dpkg -L qemu-server
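
The reverse lookup may also help here: `dpkg -S` reports which installed package owns a given file. A small sketch, where the helper name `owner_of` is made up for illustration and the binary names `qemu-img` and `kvm` are assumptions about what is on the PATH:

```shell
#!/bin/sh
# Report which installed package owns a given binary.
owner_of() {
    path=$(command -v "$1") || { echo "$1: not found in PATH"; return 1; }
    dpkg -S "$path"
}

# Binary names are assumptions; a failed lookup is not treated as fatal.
owner_of qemu-img || true
owner_of kvm || true
```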

- Dietmar
 
No. Please use 'dpkg' if you want to examine package contents

Hi Dietmar,

thanks for the hint. I assume that you at Proxmox maintain the pve-kvm package. Since you told me that you'll go for an update as soon as kvm provides one, I conclude that the affected QEMU components will only be updated in Proxmox VE if the guys at KVM update them. Is that true? In that case, I will monitor the KVM bug lists in the next weeks.

Regards
Christoph
 
thanks for info. can you try with just one CPU and with e1000 instead of virtio? not that I am sure that this helps but if possible, give it a try.

Hi Tom,

I'm going to set up a new guest and give it a try. Since I have to provide at least four cores, I will first try only dropping the virtio network card.

Regards,
Christoph
 
Since you told me that you'll go for an update as soon as kvm provides one, I conclude that the affected QEMU components will only be updated in Proxmox VE if the guys at KVM update them. Is that true? In that case, I will monitor the KVM bug lists in the next weeks.

I just uploaded new versions to the pvetest repository:

ftp://pve.proxmox.com/debian/dists/lenny/pvetest/binary-amd64/

Please consider that unstable - we have not done any testing yet.

- Dietmar
 
can you try with just one CPU and with e1000 instead of virtio? not that I am sure that this helps but if possible, give it a try.

I've tested without virtio networking for one week now, no problems with the disk images so far. I have two Windows 2003 servers and three Linux machines in the test setup.

There might be a relation between the virtio networking drivers and the qcow2 disk image corruptions observed under Windows operating systems only.

I did not yet test it with kvm-86 provided in the pve-1.3 prerelease.
 
