error before or during data restore, some or all disks were not completely restored. VM 106 state is NOT cleaned up.

Matejkek

New Member
Jul 30, 2023
Hi,

Today I reinstalled my server. Before doing so, I backed up all my VMs. Once Proxmox was freshly installed, I went straight to restoring them. All went well except for the VM that matters most to me, which is 1.7TB in size.

Server config:
-old Dell R710
-WD Purple 4TB in RAID 1, plus other drives
-128GB RAM
-Xeon CPU E5645 @ 2.40GHz (2 Sockets)

I have the following error:


Task viewer: VM 106 - Restore
restore vma archive: zstd -q -d -c /mnt/wdpurple/dump/vzdump-qemu-117-2023_07_30-09_06_49.vma.zst | vma extract -v -r /var/tmp/vzdumptmp4122.fifo - /var/tmp/vzdumptmp4122
CFG: size: 635 name: qemu-server.conf
DEV: dev_id=1 size: 1825361100800 devname: drive-scsi0
CTIME: Sun Jul 30 09:06:56 2023
Formatting '/mnt/wdpurple/images/106/vm-106-disk-0.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off preallocation=metadata compression_type=zlib size=1825361100800 lazy_refcounts=off refcount_bits=16
no lock found trying to remove 'create' lock
error before or during data restore, some or all disks were not completely restored. VM 106 state is NOT cleaned up.
TASK ERROR: command 'set -o pipefail && zstd -q -d -c /mnt/wdpurple/dump/vzdump-qemu-117-2023_07_30-09_06_49.vma.zst | vma extract -v -r /var/tmp/vzdumptmp4122.fifo - /var/tmp/vzdumptmp4122' failed: command '/usr/bin/qemu-img create -o 'preallocation=metadata' -f qcow2 /mnt/wdpurple/images/106/vm-106-disk-0.qcow2 1782579200K' failed: got timeout


If anybody could help me, I would appreciate it.
 
Is your target storage slow? You can try setting the preallocation mode to off:

Code:
pvesm set STORAGENAME --preallocation off

and then retry the restore.
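
To get a feeling for why metadata preallocation can be slow on a large image, here is a rough back-of-the-envelope sketch. It only uses the values printed in the "Formatting ..." line of the log above (size, cluster_size, extended_l2=off, refcount_bits=16) and assumes the usual qcow2 layout of one 8-byte L2 entry and one refcount entry per cluster. It ignores the smaller L1 and refcount tables, so treat the result as an approximation rather than an exact figure.

```shell
#!/bin/sh
# Rough estimate of the qcow2 metadata touched by preallocation=metadata.
# All inputs are copied from the "Formatting ..." line in the restore log.
size_bytes=1825361100800   # size=
cluster_size=65536         # cluster_size=
l2_entry_bytes=8           # assumption: 8-byte L2 entries (extended_l2=off)
refcount_bytes=2           # refcount_bits=16 -> 2 bytes per cluster

# Number of clusters, rounded up.
clusters=$(( (size_bytes + cluster_size - 1) / cluster_size ))

# One L2 entry and one refcount entry per cluster.
l2_total=$(( clusters * l2_entry_bytes ))
refcount_total=$(( clusters * refcount_bytes ))
metadata_mib=$(( (l2_total + refcount_total) / 1024 / 1024 ))

echo "clusters: $clusters"
echo "approx. metadata: ${metadata_mib} MiB"
```

A few hundred MiB is not much data, but it is written as many small updates, which a slow or busy target may not finish before the timeout.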
 
Hi,
Hi... Sorry to bring this topic up again, but why does this issue occur?
I have the same behavior.
It's just a simple VM with 2 disks in NFS storage.
How large are the disks? When does the failure occur? Please share more details including the full log.

Originally, there was a 10 minute timeout for allocating the disks, and for qcow2 this includes metadata by default. Therefore, it was recommended to turn that preallocation off for slow storage. Nowadays, in particular since qemu-server >= 8.0.8, the timeout is not there anymore, so if you can wait, you don't have to turn preallocation off. If you are on a newer version, your issue is likely different.
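
To see which side of that version boundary a host is on, something like the following sketch could be used. The helper name version_ge is made up for illustration, the comparison relies on GNU `sort -V`, and the parsing assumes that `pveversion -v` prints a line of the form "qemu-server: <version>".

```shell
#!/bin/sh
# version_ge A B: succeeds when version A >= version B (hypothetical helper,
# relies on GNU sort's -V version ordering).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

threshold="8.0.8"   # qemu-server version named in the post above

if command -v pveversion >/dev/null 2>&1; then
    # Assumption: the package list contains a line like "qemu-server: 8.x.y".
    installed=$(pveversion -v | awk -F': ' '$1 == "qemu-server" {print $2}')
    if [ -z "$installed" ]; then
        echo "could not determine the qemu-server version"
    elif version_ge "$installed" "$threshold"; then
        echo "qemu-server $installed: the allocation timeout should no longer apply"
    else
        echo "qemu-server $installed: older than $threshold, the timeout may still apply"
    fi
else
    echo "pveversion not found; run this on the Proxmox VE host"
fi
```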
 
Hi

Well... There is no timeout whatsoever. The error message appears right when I try to do a restore! It took about 2 seconds!



This is the log:
Code:
error before or during data restore, some or all disks were not completely restored. VM 1155900 state is NOT cleaned up.
TASK ERROR: unable to create image: got signal 7

Code:
pveversion
pve-manager/8.4.1/2a5fa54a8503f96d (running kernel: 6.14.5-1-bpo12-pve)

The storage:

Code:
nfs: sto-nvme
        export /mnt/nvme_239/folder
        path /mnt/pve/sto-nvme
        server 10.60.53.21
        content images
        options vers=4.2
        preallocation off
        prune-backups keep-all=1

The original VM:

Code:
qm config 1026
agent: 1
balloon: 1024
boot: order=scsi0;ide2;net0
cores: 32
cpu: host
hotplug: disk,network,usb,memory,cpu
ide2: none,media=cdrom
machine: q35
memory: 8192
meta: creation-qemu=7.2.0,ctime=1708004660
name: vm-gilberto
net0: virtio=BC:24:11:1F:AA:86,bridge=vmbr0,firewall=1,tag=3008
numa: 1
onboot: 1
ostype: l26
scsi0: sto-nvme:1026/vm-1026-disk-0.qcow2,discard=on,iothread=1,size=100G,ssd=1
scsi1: sto-nvme:1026/vm-1026-disk-1.qcow2,iothread=1,size=200G
scsihw: virtio-scsi-single
serial0: socket
smbios1: uuid=315cd243-8a12-4d9d-9f6c-771f41ace92c
sockets: 2
tags:
vcpus: 2
vga: std
vmgenid: aecfd0bc-de84-4878-893a-9bd9034be2f3
 
So the qemu-img create process gets terminated by signal 7 (SIGBUS), which could be related to a kernel/memory/hardware issue.

Do you see anything in the system logs/journal around the time of the issue?

Does it work with a different directory-based target storage? Does it work with kernel 6.8? I'd also run a memory test.
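
For reference, the signal number in the task error can be mapped to its name from any shell. This is a small sketch that assumes a Linux host, where signal 7 is SIGBUS; signal numbers differ on some other platforms.

```shell
#!/bin/sh
# Map the signal number from "got signal 7" to its name.
sig=7
name=$(kill -l "$sig")
echo "signal $sig = SIG$name"

# A process killed by signal N is usually reported by the shell
# with exit status 128 + N.
status=$((128 + sig))
echo "shell exit status for a process killed by signal $sig: $status"
```

To look for related kernel messages around the time of the failure, `journalctl -k --since "15 minutes ago"` or `journalctl -b -p err` on the host are reasonable starting points.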