Snapshot rollback of a ZFS over iSCSI based snapshot runs into a timeout

Robert Schuster

I'm evaluating Proxmox VE on behalf of a customer.
One point of interest is the (hopefully seamless) integration into the existing ZFS storage environment.
There are several Sun/Oracle hardware based storage clusters (currently running OmniOS) in place.

Most of the test catalog works very well. So far I'm able to:

  • connect Proxmox VE systems to the ZFS storage via iSCSI
  • create, run and shut down systems
  • create snapshots (manually and automatically, based on a qm CLI script)
  • delete snapshots (manually and automatically, based on a qm CLI script)

The only thing that fails at the moment is the rollback of a snapshot.
Whatever I do (running or stopped VM, most recent snapshot or an older one, VM with many snapshots or just one), everything fails with the same error:

command '/usr/bin/ssh -o 'BatchMode=yes' -i /etc/pve/priv/zfs/192.168.100.3_id_rsa root@192.168.100.3 zfs list -t snapshot -o name -s creation' failed: got timeout

The command:
ssh -i /etc/pve/priv/zfs/192.168.100.3_id_rsa root@192.168.100.3 zfs list -t snapshot -o name -s creation
takes 6.09 seconds to display a result.

The vm.conf (with one snapshot) is:
balloon: 512
bootdisk: virtio0
cores: 1
ide2: none,media=cdrom
memory: 1024
name: test.testbed.org
net0: virtio=E2:15:18:FE:E5:B2,bridge=vmbr0
numa: 0
ostype: l26
parent: test01
smbios1: uuid=0a86b194-781f-43df-bb3c-24d611c918e2
sockets: 1
virtio0: storage03-store01:vm-150-disk-1,size=8G

[test01]
#test01
balloon: 512
bootdisk: virtio0
cores: 1
ide2: none,media=cdrom
memory: 1024
name: test.testbed.org
net0: virtio=E2:15:18:FE:E5:B2,bridge=vmbr0
numa: 0
ostype: l26
smbios1: uuid=0a86b194-781f-43df-bb3c-24d611c918e2
snaptime: 1453669910
sockets: 1
virtio0: storage03-store01:vm-150-disk-1,size=8G


The current test system is based on pve-manager/4.1-5 (testing repository).

Any ideas?

regards
Robert
 
Hi,

You get a timeout because the default is set to 5 seconds.
What happens if you run this command in a terminal?
And if you get the list, please measure how long it takes.
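
To make that check repeatable, here is a minimal sketch that runs any command under a time limit and reports whether it finished in time. It assumes GNU coreutils `timeout` is installed; `check_within` is a hypothetical helper name and not part of Proxmox VE.

```shell
#!/bin/sh
# Hypothetical helper (not part of Proxmox VE): run a command under a time
# limit and report whether it finished in time.
# Assumption: GNU coreutils `timeout` is available on the host.
check_within() {
    limit="$1"   # time limit in seconds
    shift        # the remaining arguments are the command to run

    if timeout "$limit" "$@" >/dev/null 2>&1; then
        echo "finished within ${limit}s"
    else
        status=$?
        # `timeout` exits with status 124 when it had to stop the command.
        if [ "$status" -eq 124 ]; then
            echo "exceeded ${limit}s"
        else
            echo "failed with exit status $status"
        fi
    fi
}
```

Called as `check_within 5 /usr/bin/ssh -o BatchMode=yes -i /etc/pve/priv/zfs/192.168.100.3_id_rsa root@192.168.100.3 zfs list -t snapshot -o name -s creation`, it would print "exceeded 5s" whenever the listing needs longer than the 5-second default.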
 
Hi Wolfgang,

Which command do you mean?
/usr/bin/ssh -o 'BatchMode=yes' -i /etc/pve/priv/zfs/192.168.100.3_id_rsa root@192.168.100.3 zfs list -t snapshot -o name -s creation
shows the list in ~6 seconds (between 5.85 and 6.20), as described in my initial post.

The qm rollback <vmid> <snapname> command doesn't work either (same error on the CLI).

Is there a way to increase the timeout? The production target system has thousands of snapshots on its store, so generating the list could take a bit longer :-)

kind regards
Robert
 
Just two additional pieces of information:

time qm rollback 200 test

real 0m12.254s
user 0m0.652s
sys 0m0.116s


(The time for this command varies from 5 to 12 seconds.)


time /usr/bin/ssh -o 'BatchMode=yes' -i /etc/pve/priv/zfs/192.168.100.3_id_rsa root@192.168.100.3 zfs list -t snapshot -o name -s creation
...
list of snapshots
...
real 0m5.380s
user 0m0.016s
sys 0m0.000s


(The time for this command is always ~5 seconds.)
 
hm...
That means wait and see.

I've found a 5-second timeout in the two ZFS related scripts in /usr/share/perl5/PVE/Storage.
Increasing it to 10 doesn't help at all, but maybe you know another place to change it.

Increasing the timeout value in ZFSPlugin.pm to more than 10 (I have 60 now) makes the restore via the CLI command (qm rollback etc.) successful.
A rollback via the GUI still fails.
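
Before editing a packaged file like this, it may be worth keeping a copy and looking at the lines around the value to be changed. The following is only a sketch: `inspect_line` is a hypothetical helper, and the position of the timeout (line 52 in the version discussed in this thread) may differ in other versions. A package upgrade will most likely replace the file, so the change would have to be applied again afterwards.

```shell
#!/bin/sh
# Hypothetical helper (not part of Proxmox VE): keep a one-time backup of a
# file, then print the lines around a given line number for inspection.
inspect_line() {
    file="$1"          # file to inspect
    line="$2"          # line number of interest
    context="${3:-3}"  # lines of context on each side (default: 3)

    # Create the backup only once, so later runs never overwrite the
    # pristine copy with an already modified file.
    if [ ! -f "$file.orig" ]; then
        cp -p "$file" "$file.orig"
    fi

    start=$((line - context))
    if [ "$start" -lt 1 ]; then
        start=1
    fi
    end=$((line + context))

    # Print only the selected window of lines.
    sed -n "${start},${end}p" "$file"
}
```

For the file from this thread the call would be `inspect_line /usr/share/perl5/PVE/Storage/ZFSPlugin.pm 52`; the edit itself is then done by hand in an editor.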
 
You changed 5 to 60 in line 52 of /usr/share/perl5/PVE/Storage/ZFSPlugin.pm,
and it still does not work via the GUI?
 
Yes!
/usr/share/perl5/PVE/Storage/ZFSPlugin.pm, line 52. Increasing it to 10 didn't help, then I tried 60 and it worked for the CLI.
The CLI via qm rollback <vmid> <snapname> works perfectly, even with RAM; the GUI does not.
Same error as before.
 
This is the GUI timeout, which we can't increase.
 
Hi Wolfgang,

thanks for your help, even if it didn't solve the problem completely :-)

One hint for your further development:
Delivering the snapshot list normally takes much more than 5 seconds.
Even on high-performance storage systems, under high load and with thousands of snapshots on the ZFS pool, it can take up to 15 seconds.
(Tested yesterday on a production Sun/Oracle X4540 with 96 TB on 48 disks and ~1,200 snapshots on the pool. It always took ~12 seconds to display the snapshot list.)

For other users who may be interested:

A CLI restore of the most recent snapshot is possible with this workaround
(increasing the timeout in /usr/share/perl5/PVE/Storage/ZFSPlugin.pm, line 52, from 5 to 60).

Restoring a ZFS snapshot that is not the most recent one can be done as follows
(be careful, for experienced users only):
  1. Create a clone of the chosen snapshot, e.g. zfs clone store01/vm-103-disk-1@daily_1453503301 store01/vm-103-disk-2
  2. Import the cloned snapshot to a LUN
  3. Create a view for this LUN (steps 1 to 3 on the storage system)
  4. Copy the vm.conf of the source system to a new one (on the Proxmox VE host)
  5. Change the virtual disk in that vm.conf from vm-103-disk-1 to vm-103-disk-2 (in my example above)
  6. Shut down the original VM
  7. Start the newly created VM with the cloned snapshot
  8. Don't forget: the VMs are almost identical (same MAC address etc.), so never start both!
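
The storage-side steps 1 to 3 can be sketched as a dry run that only prints the commands instead of executing them. This rests on assumptions not stated in the thread: that the OmniOS target uses COMSTAR with `stmfadm`, and that `import-lu` (following the wording of step 2) is the right subcommand for the clone; depending on the setup, `create-lu` may be needed instead. `print_clone_steps` is a hypothetical helper name, and the GUID placeholder has to be replaced with the value shown by `stmfadm list-lu`.

```shell
#!/bin/sh
# Dry run: prints the storage-side commands for steps 1-3, executes nothing.
# Assumption: the OmniOS iSCSI target is COMSTAR (stmfadm); adapt the
# commands for any other iSCSI provider.
print_clone_steps() {
    snapshot="$1"   # e.g. store01/vm-103-disk-1@daily_1453503301
    clone="$2"      # e.g. store01/vm-103-disk-2

    # Step 1: clone the chosen snapshot into a new volume.
    echo "zfs clone $snapshot $clone"
    # Step 2: register the cloned volume as a logical unit.
    echo "stmfadm import-lu /dev/zvol/rdsk/$clone"
    # Step 3: make the logical unit visible to initiators.
    echo "stmfadm add-view <GUID of the new LU>"
}

print_clone_steps store01/vm-103-disk-1@daily_1453503301 store01/vm-103-disk-2
```

Printing the commands first makes it possible to review them before anything is run on the storage system.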
 
Please restart pveproxy and pvedaemon (the running daemons only pick up the changed plugin after a restart); then the rollback should also work in the GUI.
 
Hi Wolfgang,

I can confirm: after increasing the timeout in /usr/share/perl5/PVE/Storage/ZFSPlugin.pm (line 52, from 5 to 60) and restarting the Proxmox VE host, a rollback of the most recent snapshot is successful via both CLI and GUI!

kind regards
Robert
 