Any ideas?
Nothing interesting in the zvol properties; they are identical except for dates and IDs.
Could you also share the log of the successful migration for comparison? I haven't been able to reproduce the issue locally yet.
smartctl -a /dev/XYZ
But maybe the IO error is network-related. I'm not aware of any other user reporting this issue and wasn't able to reproduce it myself.
It shouldn't, but it's better to follow every lead.
I did originally have it on local ZFS, then moved it to NFS, then back to local ZFS.
How would this have affected it?
It's not clear why the IO error happens. It's unfortunate that the logs don't contain any hint about that either.
Is there a way to fix it?
Code:
root@pve1:~# zfs get all rpool/data/vm-101-disk-0
NAME                      PROPERTY              VALUE                     SOURCE
rpool/data/vm-101-disk-0  type                  volume                    -
rpool/data/vm-101-disk-0  creation              Sat Jul 20 17:04 2024     -
rpool/data/vm-101-disk-0  used                  56K                       -
rpool/data/vm-101-disk-0  available             874G                      -
rpool/data/vm-101-disk-0  referenced            56K                       -
rpool/data/vm-101-disk-0  compressratio         1.00x                     -
rpool/data/vm-101-disk-0  reservation           none                      default
rpool/data/vm-101-disk-0  volsize               1M                        local
rpool/data/vm-101-disk-0  volblocksize          16K                       default
rpool/data/vm-101-disk-0  checksum              on                        default
rpool/data/vm-101-disk-0  compression           on                        inherited from rpool
rpool/data/vm-101-disk-0  readonly              off                       default
rpool/data/vm-101-disk-0  createtxg             327                       -
rpool/data/vm-101-disk-0  copies                1                         default
rpool/data/vm-101-disk-0  refreservation        none                      default
rpool/data/vm-101-disk-0  guid                  12720012326769879953      -
rpool/data/vm-101-disk-0  primarycache          all                       default
rpool/data/vm-101-disk-0  secondarycache        all                       default
rpool/data/vm-101-disk-0  usedbysnapshots       0B                        -
rpool/data/vm-101-disk-0  usedbydataset         56K                       -
rpool/data/vm-101-disk-0  usedbychildren        0B                        -
rpool/data/vm-101-disk-0  usedbyrefreservation  0B                        -
rpool/data/vm-101-disk-0  logbias               latency                   default
rpool/data/vm-101-disk-0  objsetid              141                       -
rpool/data/vm-101-disk-0  dedup                 off                       default
rpool/data/vm-101-disk-0  mlslabel              none                      default
rpool/data/vm-101-disk-0  sync                  standard                  inherited from rpool
rpool/data/vm-101-disk-0  refcompressratio      1.00x                     -
rpool/data/vm-101-disk-0  written               0                         -
rpool/data/vm-101-disk-0  logicalused           28K                       -
rpool/data/vm-101-disk-0  logicalreferenced     28K                       -
rpool/data/vm-101-disk-0  volmode               default                   default
rpool/data/vm-101-disk-0  snapshot_limit        none                      default
rpool/data/vm-101-disk-0  snapshot_count        none                      default
rpool/data/vm-101-disk-0  snapdev               hidden                    default
rpool/data/vm-101-disk-0  context               none                      default
rpool/data/vm-101-disk-0  fscontext             none                      default
rpool/data/vm-101-disk-0  defcontext            none                      default
rpool/data/vm-101-disk-0  rootcontext           none                      default
rpool/data/vm-101-disk-0  redundant_metadata    all                       default
rpool/data/vm-101-disk-0  encryption            off                       default
rpool/data/vm-101-disk-0  keylocation           none                      default
rpool/data/vm-101-disk-0  keyformat             none                      default
rpool/data/vm-101-disk-0  pbkdf2iters           0                         default
rpool/data/vm-101-disk-0  snapshots_changed     Thu Jul 25 10:30:10 2024  -
rpool/data/vm-101-disk-0  prefetch              all                       default
Code:
root@pve2:~# zfs get all rpool/data/vm-101-disk-0
NAME                      PROPERTY              VALUE                     SOURCE
rpool/data/vm-101-disk-0  type                  volume                    -
rpool/data/vm-101-disk-0  creation              Sat Jul 20 19:54 2024     -
rpool/data/vm-101-disk-0  used                  56K                       -
rpool/data/vm-101-disk-0  available             878G                      -
rpool/data/vm-101-disk-0  referenced            56K                       -
rpool/data/vm-101-disk-0  compressratio         1.00x                     -
rpool/data/vm-101-disk-0  reservation           none                      default
rpool/data/vm-101-disk-0  volsize               1M                        local
rpool/data/vm-101-disk-0  volblocksize          16K                       default
rpool/data/vm-101-disk-0  checksum              on                        default
rpool/data/vm-101-disk-0  compression           on                        inherited from rpool
rpool/data/vm-101-disk-0  readonly              off                       default
rpool/data/vm-101-disk-0  createtxg             1555                      -
rpool/data/vm-101-disk-0  copies                1                         default
rpool/data/vm-101-disk-0  refreservation        none                      default
rpool/data/vm-101-disk-0  guid                  9213793783124796618       -
rpool/data/vm-101-disk-0  primarycache          all                       default
rpool/data/vm-101-disk-0  secondarycache        all                       default
rpool/data/vm-101-disk-0  usedbysnapshots       0B                        -
rpool/data/vm-101-disk-0  usedbydataset         56K                       -
rpool/data/vm-101-disk-0  usedbychildren        0B                        -
rpool/data/vm-101-disk-0  usedbyrefreservation  0B                        -
rpool/data/vm-101-disk-0  logbias               latency                   default
rpool/data/vm-101-disk-0  objsetid              369                       -
rpool/data/vm-101-disk-0  dedup                 off                       default
rpool/data/vm-101-disk-0  mlslabel              none                      default
rpool/data/vm-101-disk-0  sync                  standard                  inherited from rpool
rpool/data/vm-101-disk-0  refcompressratio      1.00x                     -
rpool/data/vm-101-disk-0  written               0                         -
rpool/data/vm-101-disk-0  logicalused           28K                       -
rpool/data/vm-101-disk-0  logicalreferenced     28K                       -
rpool/data/vm-101-disk-0  volmode               default                   default
rpool/data/vm-101-disk-0  snapshot_limit        none                      default
rpool/data/vm-101-disk-0  snapshot_count        none                      default
rpool/data/vm-101-disk-0  snapdev               hidden                    default
rpool/data/vm-101-disk-0  context               none                      default
rpool/data/vm-101-disk-0  fscontext             none                      default
rpool/data/vm-101-disk-0  defcontext            none                      default
rpool/data/vm-101-disk-0  rootcontext           none                      default
rpool/data/vm-101-disk-0  redundant_metadata    all                       default
rpool/data/vm-101-disk-0  encryption            off                       default
rpool/data/vm-101-disk-0  keylocation           none                      default
rpool/data/vm-101-disk-0  keyformat             none                      default
rpool/data/vm-101-disk-0  pbkdf2iters           0                         default
rpool/data/vm-101-disk-0  snapshots_changed     Thu Jul 25 10:30:11 2024  -
rpool/data/vm-101-disk-0  prefetch              all                       default
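For a side-by-side comparison (just a sketch, assuming root SSH access between the nodes and a bash shell for the process substitution), the scripted output of zfs get can also be diffed directly; creation, createtxg, guid, objsetid and snapshots_changed will of course always differ:
Code:
diff <(ssh root@pve1 zfs get -H -o property,value,source all rpool/data/vm-101-disk-0) \
     <(ssh root@pve2 zfs get -H -o property,value,source all rpool/data/vm-101-disk-0)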
A bit of a shot in the dark, but can you try:
zfs set refreservation=10M rpool/data/vm-101-disk-0
This seemed to help: I pulled the plug on pve1 and let it fail over, then brought it back online and it failed back. It has done this before successfully (rarely), so I am not 100% sure, but it looks like it might have worked. I wish I had checked what the setting was before I executed it. Thanks!
Will keep an eye on it and see if it continues to work.
I wish I had checked what the setting was before I executed it.
Code:
rpool/data/vm-101-disk-0  refreservation  none  default
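For future reference, the active value can be read back at any time (on either node) with a plain zfs get, e.g.:
Code:
zfs get refreservation,volsize rpool/data/vm-101-disk-0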
@fiona I would not want to pretend I know all the intricacies of PVE, especially with migrations. Any idea how the OP could have ended up with a sparse efidisk just by moving it around within the PVE environment (on and off NFS)? The default for zfs create -V is thick unless -s is specified; I don't think the PVE code does that anywhere?
The storage configuration for ZFS has a sparse setting. I could not reproduce the issue with that either though.
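For illustration, this is roughly what that option looks like in /etc/pve/storage.cfg (an example entry, names assumed, not taken from this thread); with sparse set, PVE creates its zvols thin-provisioned, i.e. without a refreservation:
Code:
zfspool: local-zfs
        pool rpool/data
        sparse 1
        content images,rootdir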
I don't think sparse zvols are a good idea in general.
These effects can also occur when the volume size is changed while it is in use (particularly when shrinking the size). Extreme care should be used when adjusting the volume size.
So that I'm not saying this with nothing backing me up whatsoever:
https://openzfs.github.io/openzfs-docs/man/master/7/zfsprops.7.html#volsize
Without the reservation, the volume could run out of space, resulting in undefined behavior or data corruption, depending on how the volume is used. These effects can also occur when the volume size is changed while it is in use (particularly when shrinking the size). Extreme care should be used when adjusting the volume size.
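As a quick illustration of what that reservation means in practice (a sketch with made-up test dataset names): a zvol created without -s gets a refreservation roughly matching its volsize, while one created with -s gets none:
Code:
zfs create -V 4M rpool/data/thick-test
zfs create -s -V 4M rpool/data/thin-test
zfs get -o name,property,value refreservation rpool/data/thick-test rpool/data/thin-test
zfs destroy rpool/data/thick-test rpool/data/thin-test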
Of course, when you run out of space, thin provisioning has issues. That's the big downside of thin provisioning in general.
Code:
Without the reservation, the volume could run out of space, resulting in undefined behavior or data corruption, depending on how the volume is used. These effects can also occur when the volume size is changed while it is in use (particularly when shrinking the size). Extreme care should be used when adjusting the volume size.
@ballybob I suppose you did not run out of space, or?
EDIT: and for completeness, shrinking volumes via the PVE UI/API/CLI is not even allowed, exactly because it's very dangerous.
No, it was a tiny fraction of the available space.
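For anyone wanting to double-check the same thing on their own system, the remaining pool space can be read with, for example:
Code:
zfs list -o space rpool
zpool list rpool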
I noticed that your EFI disk has volsize 1M instead of the 4M that the script creates it with. Did you move the EFI disk to another storage and back at some point? Did you originally create the VM on ZFS?
How could that happen in PVE?
The EFI disk is special, because it has a fixed amount of data. Depending on the storage, the volume that data is stored on can have a different size. When moving the storage, the volume is allocated and then the data is copied over. That is not the same as reducing the volsize in ZFS. If you move it away from ZFS and back, it will be a new ZFS volume.
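If it helps to compare what actually got allocated for the EFI disk on each node (assuming VM 101 and the dataset name from this thread), something like the following can be run on both hosts:
Code:
qm config 101 | grep efidisk
zfs get volsize,refreservation,volblocksize rpool/data/vm-101-disk-0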
@ballybob Do you mind also posting arc_summary from your system?
https://pastebin.com/HcJKiAiY
Code:
------------------------------------------------------------------------
ZFS Subsystem Report                            Wed Aug 07 11:08:34 2024
Linux 6.8.8-4-pve                                              2.2.4-pve1
Machine: pve1 (x86_64)                                         2.2.4-pve1
...
Tunables:
...
        zvol_blk_mq_blocks_per_thread                                  8
        zvol_blk_mq_queue_depth                                      128
        zvol_enforce_quotas                                            1
        zvol_inhibit_dev                                               0
        zvol_major                                                   230
        zvol_max_discard_blocks                                    16384
        zvol_num_taskqs                                                0
        zvol_open_timeout_ms                                        1000
        zvol_prefetch_bytes                                       131072
        zvol_request_sync                                              0
        zvol_threads                                                   0
        zvol_use_blk_mq                                                0
        zvol_volmode                                                   1
...