timeout issue with qmrestore and large volume on zfs?

udo

Hi,
I tried to restore a VM with a big (2 TB) second disk to a (slow, idle) local ZFS pool, and the restore job fails:
Code:
# qmrestore /var/lib/vz/dump/vzdump-qemu-102-2018_03_02-05_58_22.vma.lzo 102 --storage local-zfs
restore vma archive: lzop -d -c /var/lib/vz/dump/vzdump-qemu-102-2018_03_02-05_58_22.vma.lzo|vma extract -v -r /var/tmp/vzdumptmp20589.fifo - /var/tmp/vzdumptmp20589
CFG: size: 417 name: qemu-server.conf
DEV: dev_id=1 size: 53687091200 devname: drive-scsi0
DEV: dev_id=2 size: 2199023255552 devname: drive-scsi1
CTIME: Fri Mar  2 05:58:23 2018
new volume ID is 'local-zfs:vm-102-disk-1'
map 'drive-scsi0' to '/dev/zvol/rpool/data/vm-102-disk-1' (write zeros = 0)
new volume ID is 'local-zfs:vm-102-disk-2'
map 'drive-scsi1' to '/dev/zvol/rpool/data/vm-102-disk-2' (write zeros = 0)

** (process:20596): ERROR **: can't open file /dev/zvol/rpool/data/vm-102-disk-2 - Could not open '/dev/zvol/rpool/data/vm-102-disk-2': No such file or directory
/bin/bash: line 1: 20595 Broken pipe             lzop -d -c /var/lib/vz/dump/vzdump-qemu-102-2018_03_02-05_58_22.vma.lzo
     20596 Trace/breakpoint trap   | vma extract -v -r /var/tmp/vzdumptmp20589.fifo - /var/tmp/vzdumptmp20589
temporary volume 'local-zfs:vm-102-disk-1' sucessfuly removed
temporary volume 'local-zfs:vm-102-disk-2' sucessfuly removed
command 'lzop -d -c /var/lib/vz/dump/vzdump-qemu-102-2018_03_02-05_58_22.vma.lzo|vma extract -v -r /var/tmp/vzdumptmp20589.fifo - /var/tmp/vzdumptmp20589' failed: exit code 133

A few seconds later I ran exactly the same command and the restore started.
Perhaps qmrestore should wait a little bit longer for the created volumes?
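Just to illustrate what I mean, a minimal sketch of such a wait (the device path is taken from the failing log above; the retry logic itself is only my assumption, qmrestore does not do this today):
Code:
#!/bin/bash
# Hypothetical workaround sketch (NOT part of qmrestore): wait until the
# zvol device node from the failing log actually exists before writing.
DEV=/dev/zvol/rpool/data/vm-102-disk-2

for i in $(seq 1 60); do
    if [ -b "$DEV" ]; then
        echo "device node appeared after ~${i}s"
        break
    fi
    sleep 1
done

# also let udev finish creating any pending device nodes
udevadm settle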

Udo
 
Were the disks in standby on the first try? I guess that would explain why the second restore worked.

Code:
hdparm -C /dev/sdX
man page:
-C     Check the current IDE power mode status, which will always be one of unknown (drive does not support this command), active/idle (normal operation), standby (low power mode, drive has spun down), or sleeping (lowest power mode, drive is completely shut down). The -S, -y, -Y, and -Z options can be used to manipulate the IDE power modes.
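To check all disks in one go, a small loop works (the /dev/sd? glob is only an example; adjust it to the actual rpool members):
Code:
# power state of every disk; adjust the glob to the real rpool members
for d in /dev/sd?; do
    echo -n "$d: "
    hdparm -C "$d" | grep 'drive state'
done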
 
Hi Alwin,
I assume the disks weren't in standby, because all disks are part of rpool and, due to PVE's logging, they shouldn't spin down.
It's a standard installation. The only special thing is the HP BIOS setting "Balanced Power and Performance", but when I check the disks now, they are all "active/idle".
And there are SMART entries in the syslog for temperature changes during times with no activity (e.g. at 23:32, 28 min before the next 6-hour sync)...
(I don't know whether that happens with standby disks; I think not.)

So I assume it simply takes too long to create a big volume on such a ZFS RAID made of 5400 rpm disks (5 stripes of 3-disk RAIDZ1).
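One way to check that would be to create a sparse test zvol of the same size on the pool and time how long its device node needs to appear (the testvol name is made up; -s keeps it from reserving the 2T):
Code:
# rough timing test on the slow pool; 'testvol' is just a made-up name
time zfs create -s -V 2T rpool/data/testvol
time bash -c 'until [ -b /dev/zvol/rpool/data/testvol ]; do sleep 0.1; done'
zfs destroy rpool/data/testvol   # clean up afterwards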

Udo
 
AFAIK, ZFS doesn't care whether the disk is there or not, as long as you don't write (or read more than the metadata / what is in cache). The pvestatd daemon may not wake up the disks. If qmrestore always needed to wait longer for the volume to be created, it would fail every time, as it creates the new image prior to the restore.
 
Please open a bug report and note that you are using 5400 rpm spinning disks. Then I will check and see how we can handle these delays.
 
Thanks a lot.
I will check it.
 
