[SOLVED] Problem during Migration with gluster filesystem.

Dark26
Hello,

Everything had been working fine for almost two years, but recently (since the last update?) I have a problem with my Gluster storage.

The other day I tried to update a VM (dist-upgrade inside the VM), and while it was writing files the VM shut down.

The same thing happened after a restore from a PBS backup.

I decided to move the VM disk to local storage (not Gluster), ran the upgrade, and everything was fine.

But when I want to move it back to the Gluster storage, it fails every time:

Code:
create full clone of drive scsi1 (P1_SSDinterne:170/vm-170-disk-1.qcow2)
Formatting 'gluster://10.10.5.92/SSDinterne/images/170/vm-170-disk-0.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off preallocation=metadata compression_type=zlib size=5368709120 lazy_refcounts=off refcount_bits=16
[2022-04-30 18:09:33.815373 +0000] I [io-stats.c:3706:ios_sample_buf_size_configure] 0-SSDinterne: Configure ios_sample_buf size is 1024 because ios_sample_interval is 0
[2022-04-30 18:09:33.943578 +0000] E [MSGID: 108006] [afr-common.c:6140:__afr_handle_child_down_event] 0-SSDinterne-replicate-0: All subvolumes are down. Going offline until at least one of them comes back up.
[2022-04-30 18:09:43.821021 +0000] I [io-stats.c:4038:fini] 0-SSDinterne: io-stats translator unloaded
[2022-04-30 18:09:44.835240 +0000] I [io-stats.c:3706:ios_sample_buf_size_configure] 0-SSDinterne: Configure ios_sample_buf size is 1024 because ios_sample_interval is 0
[2022-04-30 18:09:45.505523 +0000] E [MSGID: 108006] [afr-common.c:6140:__afr_handle_child_down_event] 0-SSDinterne-replicate-0: All subvolumes are down. Going offline until at least one of them comes back up.
[2022-04-30 18:09:54.841835 +0000] I [io-stats.c:4038:fini] 0-SSDinterne: io-stats translator unloaded
transferred 0.0 B of 5.0 GiB (0.00%)
[2022-04-30 18:09:55.997707 +0000] I [io-stats.c:3706:ios_sample_buf_size_configure] 0-SSDinterne: Configure ios_sample_buf size is 1024 because ios_sample_interval is 0
qemu-img: ../block/io.c:3118: bdrv_co_pdiscard: Assertion `max_pdiscard >= bs->bl.request_alignment' failed.
TASK ERROR: storage migration failed: copy failed: command '/usr/bin/qemu-img convert -p -n -f qcow2 -O qcow2 /Data/SSDinterne/P1_SSDinterne/images/170/vm-170-disk-1.qcow2 zeroinit:gluster://10.10.5.92/SSDinterne/images/170/vm-170-disk-0.qcow2' failed: got signal 6

After some investigation, I found the same error (bl.request_alignment' failed) in the log from when I tried to upgrade the VM directly on the GlusterFS storage.


I suspect a problem with the latest update of the QEMU package.

Any ideas?

Kernel Version: Linux 5.13.19-6-pve #1 SMP PVE 5.13.19-15 (Tue, 29 Mar 2022 15:59:50 +0200)
PVE Manager Version: pve-manager/7.1-12/b3c09de3


Have a nice day.

Dark26
 
Code:
root@p1:~# pveversion
pve-manager/7.1-12/b3c09de3 (running kernel: 5.13.19-6-pve)
root@p1:~# pveversion -v
proxmox-ve: 7.1-2 (running kernel: 5.13.19-6-pve)
pve-manager: 7.1-12 (running version: 7.1-12/b3c09de3)
pve-kernel-helper: 7.2-2
pve-kernel-5.13: 7.1-9
pve-kernel-5.13.19-6-pve: 5.13.19-15
ceph-fuse: 15.2.16-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksmtuned: 4.20150326
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-7
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-5
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.1-2
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.12-1
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.1.6-1
proxmox-backup-file-restore: 2.1.6-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-10
pve-cluster: 7.1-3
pve-container: 4.1-5
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.4-1
pve-ha-manager: 3.3-4
pve-i18n: 2.6-3
pve-qemu-kvm: 6.2.0-5
pve-xtermjs: 4.16.0-1
qemu-server: 7.1-5
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.4-pve1
 
Hi,
which version of QEMU were you using previously? One of the logs in /var/log/apt/ should contain that information. Does it work if you downgrade to that version again? (The VM has to be shut down and started, rebooted from outside, or migrated to a different node, to start using the newly installed version.)
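For example, a quick way to search the apt logs for the previous version (a sketch; exact file names depend on log rotation):

Code:
# search current and rotated apt history logs for pve-qemu-kvm upgrades
grep pve-qemu-kvm /var/log/apt/history.log
zgrep pve-qemu-kvm /var/log/apt/history.log.*.gz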
 
This is the last update:

Code:
Start-Date: 2022-04-29 09:40:19
Commandline: apt-get dist-upgrade
Upgrade: proxmox-widget-toolkit:amd64 (3.4-9, 3.4-10), pve-firmware:amd64 (3.3-6, 3.4-1), pve-qemu-kvm:amd64 (6.2.0-3, 6.2.0-5), libproxmox-acme-perl:amd64 (1.4.1, 1.4.2), pve-ha-manager:amd64 (3.3-3, 3.3-4), libpve-guest-common-perl:amd64 (4.1-1, 4.1-2), libpostproc55:amd64 (7:4.3.3-0+deb11u1, 7:4.3.4-0+deb11u1), proxmox-ve:amd64 (7.1-1, 7.1-2), novnc-pve:amd64 (1.3.0-2, 1.3.0-3), libavcodec58:amd64 (7:4.3.3-0+deb11u1, 7:4.3.4-0+deb11u1), qemu-server:amd64 (7.1-4, 7.1-5), pve-container:amd64 (4.1-4, 4.1-5), libproxmox-acme-plugins:amd64 (1.4.1, 1.4.2), pve-i18n:amd64 (2.6-2, 2.6-3), libavutil56:amd64 (7:4.3.3-0+deb11u1, 7:4.3.4-0+deb11u1), libswscale5:amd64 (7:4.3.3-0+deb11u1, 7:4.3.4-0+deb11u1), smartmontools:amd64 (7.2-pve2, 7.2-pve3), libswresample3:amd64 (7:4.3.3-0+deb11u1, 7:4.3.4-0+deb11u1), libavformat58:amd64 (7:4.3.3-0+deb11u1, 7:4.3.4-0+deb11u1), pve-kernel-helper:amd64 (7.1-14, 7.2-2), libavfilter7:amd64 (7:4.3.3-0+deb11u1, 7:4.3.4-0+deb11u1)
End-Date: 2022-04-29 09:42:04

So for QEMU: pve-qemu-kvm:amd64 (6.2.0-3, 6.2.0-5) and qemu-server:amd64 (7.1-4, 7.1-5).
 
Could you try downgrading to pve-qemu-kvm=6.2.0-3, shutdown+start the VM (creating a clone for testing might be a good idea) and see if the issue persists? If it does, could you try downgrading further to pve-qemu-kvm=6.1.1-2?
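For reference, that test could look roughly like this (a sketch; it assumes the 6.2.0-3 package is still available to apt, and uses VM 170 as the example):

Code:
apt install pve-qemu-kvm=6.2.0-3   # downgrade the QEMU package
qm shutdown 170 && qm start 170    # full stop/start so the VM uses the downgraded binary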
 

I have a similar problem on my workstation (Proxmox inside too), which uses the same GlusterFS storage. Randomly, the VM crashes, perhaps on high disk-write activity (it did while I was updating the kernel).

To be sure that it's a storage problem, I have moved the disk to a local SSD.

Code:
create full clone of drive scsi0 (SSDinterne:110/vm-110-disk-0.qcow2)
Formatting '/Data/images/110/vm-110-disk-0.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off preallocation=metadata compression_type=zlib size=15032385536 lazy_refcounts=off refcount_bits=16
drive mirror is starting for drive-scsi0
drive-scsi0: transferred 773.8 MiB of 14.0 GiB (5.40%) in 1s
drive-scsi0: transferred 1.0 GiB of 14.0 GiB (7.35%) in 2s
drive-scsi0: transferred 14.1 GiB of 14.1 GiB (100.00%) in 7m 48s
drive-scsi0: transferred 14.1 GiB of 14.1 GiB (100.00%) in 7m 49s, ready
all 'mirror' jobs are ready
drive-scsi0: Completing block job_id...
drive-scsi0: Completed successfully.
drive-scsi0: mirror-job finished
TASK OK

If I have no problems for a while, that will confirm it is a storage problem.

Downgrading QEMU will be difficult at this time, because VM 170 is my TGT server for my boot-on-SAN workstation.

I did a manual copy of the disk of VM 170 to the GlusterFS storage and modified the VM's config file.
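Roughly what that manual copy can look like (a sketch only; the GlusterFS mount point /mnt/pve/SSDinterne and the target file name are assumptions, only the source path is from the task log above):

Code:
# copy the image onto the gluster storage via its PVE mount point (target path assumed)
qemu-img convert -p -f qcow2 -O qcow2 \
    /Data/SSDinterne/P1_SSDinterne/images/170/vm-170-disk-1.qcow2 \
    /mnt/pve/SSDinterne/images/170/vm-170-disk-1.qcow2
# then edit /etc/pve/qemu-server/170.conf so the disk entry points at the new volume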

Wait and see. Perhaps an upgrade of GlusterFS would help (the provided version, 9.2, is outdated).
 
We were able to reproduce the issue now (thanks to @mira) and it seems to be a regression between our QEMU 6.1.1 and 6.2.0 packages. The commit you mentioned is included in both; could you explain how you found it / why you think it's relevant?

EDIT: Forgot to add that our reproducer uses SSD emulation and discard set on the VM disk. Do you have that as well (disabling it might be a workaround then)?
 
I have:
cache: write back
discard: yes
but SSD emulation: no.

I believe qemu-img creates a sparse qcow2 file with size 0 on the new storage (GlusterFS in our case). When the migration starts, the program tries to recreate the filesystem structure inside the qcow2 file, and there is an alignment problem: it can't detect the structure inside the file because the qcow2 is all zeroes, and the GlusterFS server doesn't return the right information.

I hope that was clear.

Perhaps with a raw drive this problem doesn't occur.
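If someone wants to test that idea, a rough, untested sketch (the VM ID, disk slot and storage name are taken from my logs; the raw idea itself is just a guess):

Code:
# move the disk to the gluster storage as raw instead of qcow2 (hypothetical test)
qm move_disk 170 scsi1 SSDinterne --format raw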

Dark26
 
Disabling discard should work around the issue until the fix is out. I sent a patch.
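From the CLI that could look roughly like this (a sketch; the drive line shown is hypothetical, check qm config for the real one):

Code:
qm config 170 | grep scsi           # shows e.g. scsi1: SSDinterne:170/vm-170-disk-0.qcow2,discard=on,size=5G
qm set 170 --scsi1 SSDinterne:170/vm-170-disk-0.qcow2,size=5G   # same line without discard=on
# shut the VM down and start it again so the change takes effect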
 

Well, without the discard option I was able to update the VM without any crash.

With the discard option on, I had this in the log:

Code:
May 06 21:06:30 p3 pvestatd[1312]: status update time (11.371 seconds)
May 06 21:06:42 p3 pvestatd[1312]: status update time (11.521 seconds)
May 06 21:06:53 p3 QEMU[3028148]: kvm: ../block/io.c:3118: bdrv_co_pdiscard: Assertion `max_pdiscard >= bs->bl.request_alignment' failed.
May 06 21:06:53 p3 kernel: fwbr170i0: port 2(tap170i0) entered disabled state
May 06 21:06:53 p3 kernel: fwbr170i0: port 2(tap170i0) entered disabled state
May 06 21:06:53 p3 systemd[1]: 170.scope: Succeeded.
May 06 21:06:53 p3 systemd[1]: 170.scope: Consumed 1h 13min 23.814s CPU time.
May 06 21:06:53 p3 pvestatd[1312]: status update time (11.234 seconds)
May 06 21:06:55 p3 qmeventd[239816]: Starting cleanup for 170
May 06 21:06:55 p3 kernel: fwbr170i0: port 1(fwln170i0) entered disabled state
May 06 21:06:55 p3 kernel: vmbr1: port 4(fwpr170p0) entered disabled state
May 06 21:06:55 p3 kernel: device fwln170i0 left promiscuous mode
May 06 21:06:55 p3 kernel: fwbr170i0: port 1(fwln170i0) entered disabled state
May 06 21:06:55 p3 kernel: device fwpr170p0 left promiscuous mode
May 06 21:06:55 p3 kernel: vmbr1: port 4(fwpr170p0) entered disabled state
May 06 21:06:55 p3 qmeventd[239816]: Finished cleanup for 170
May 06 21:07:04 p3 pvestatd[1312]: status update time (11.043 seconds)
May 06 21:07:15 p3 pvestatd[1312]: status update time (10.753 seconds)

We see the crash of VM 170 right after: May 06 21:06:53 p3 QEMU[3028148]: kvm: ../block/io.c:3118: bdrv_co_pdiscard: Assertion `max_pdiscard >= bs->bl.request_alignment' failed.

Thanks for the tips.

I hope we can find a solution for this.

Thank you,

Dark26
 
Well, it's not stable.

The discard option is off.

I was trying to remove the snapshot before the update:

Code:
May 06 23:00:16 p3 pvestatd[1312]: status update time (11.471 seconds)
May 06 23:00:21 p3 pvedaemon[253908]: <root@pam> starting task UPID:p3:0004610A:04207A01:62758C65:qmdelsnapshot:170:root@pam:
May 06 23:00:21 p3 pvedaemon[286986]: <root@pam> delete snapshot VM 170: avant_maj
May 06 23:00:21 p3 QEMU[277316]: kvm: ../block/io.c:3118: bdrv_co_pdiscard: Assertion `max_pdiscard >= bs->bl.request_alignment' failed.
May 06 23:00:21 p3 pvedaemon[286986]: VM 170 qmp command failed - VM 170 qmp command 'blockdev-snapshot-delete-internal-sync' failed - client closed connection
May 06 23:00:21 p3 pvedaemon[286986]: VM 170 qmp command 'blockdev-snapshot-delete-internal-sync' failed - client closed connection
May 06 23:00:21 p3 pvedaemon[253908]: <root@pam> end task UPID:p3:0004610A:04207A01:62758C65:qmdelsnapshot:170:root@pam: VM 170 qmp command 'blockdev-snapshot-delete-internal-sync' failed - client closed connection
May 06 23:00:21 p3 kernel: fwbr170i0: port 2(tap170i0) entered disabled state
May 06 23:00:21 p3 kernel: fwbr170i0: port 2(tap170i0) entered disabled state
May 06 23:00:21 p3 systemd[1]: 170.scope: Succeeded.
May 06 23:00:21 p3 systemd[1]: 170.scope: Consumed 3min 30.117s CPU time.
May 06 23:00:23 p3 qmeventd[286992]: Starting cleanup for 170
May 06 23:00:23 p3 kernel: fwbr170i0: port 1(fwln170i0) entered disabled state
May 06 23:00:23 p3 kernel: vmbr1: port 4(fwpr170p0) entered disabled state
May 06 23:00:23 p3 kernel: device fwln170i0 left promiscuous mode
May 06 23:00:23 p3 kernel: fwbr170i0: port 1(fwln170i0) entered disabled state
May 06 23:00:23 p3 kernel: device fwpr170p0 left promiscuous mode
May 06 23:00:23 p3 kernel: vmbr1: port 4(fwpr170p0) entered disabled state
May 06 23:00:23 p3 qmeventd[286992]: Finished cleanup for 170
May 06 23:00:27 p3 pvestatd[1312]: status update time (10.937 seconds)
May 06 23:00:30 p3 pmxcfs[1114]: [status] notice: received log

and still the same error, which takes my VM 170 down.
 
The patch I mentioned should fix the failing assertion, but it's not applied/packaged yet. That will take a bit of time.

Hmm, not sure why it triggers during snapshot deletion. Maybe QEMU internally uses discard when operating with qcow2 snapshots.
 
FYI, the package pve-qemu-kvm=6.2.0-6, which includes a fix for the issue, is now available in the pvetest repository. If you'd like to try it, add the pvetest repository (see here; this can also be done via the UI), run apt update and apt install pve-qemu-kvm, and then disable the repository again.
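As a sketch, on a PVE 7.x (Debian Bullseye) node that could look like this:

Code:
echo "deb http://download.proxmox.com/debian/pve bullseye pvetest" \
    > /etc/apt/sources.list.d/pvetest.list
apt update
apt install pve-qemu-kvm
rm /etc/apt/sources.list.d/pvetest.list    # disable the test repository again
apt update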
 
I installed the latest PVE updates and the QEMU package from the test repository.


Moving a disk from GlusterFS storage to GlusterFS storage:

Code:
create full clone of drive scsi2 (SSDinterne:170/vm-170-disk-2.qcow2)
Formatting 'gluster://10.10.5.92/GlusterSSD/images/170/vm-170-disk-0.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off preallocation=metadata compression_type=zlib size=5368709120 lazy_refcounts=off refcount_bits=16
[2022-05-12 19:57:00.724871 +0000] I [io-stats.c:3706:ios_sample_buf_size_configure] 0-GlusterSSD: Configure ios_sample_buf  size is 1024 because ios_sample_interval is 0
[2022-05-12 19:57:00.890975 +0000] E [MSGID: 108006] [afr-common.c:6140:__afr_handle_child_down_event] 0-GlusterSSD-replicate-0: All subvolumes are down. Going offline until at least one of them comes back up.
[2022-05-12 19:57:10.731695 +0000] I [io-stats.c:4038:fini] 0-GlusterSSD: io-stats translator unloaded
[2022-05-12 19:57:11.746462 +0000] I [io-stats.c:3706:ios_sample_buf_size_configure] 0-GlusterSSD: Configure ios_sample_buf  size is 1024 because ios_sample_interval is 0
[2022-05-12 19:57:12.405974 +0000] E [MSGID: 108006] [afr-common.c:6140:__afr_handle_child_down_event] 0-GlusterSSD-replicate-0: All subvolumes are down. Going offline until at least one of them comes back up.
[2022-05-12 19:57:21.754057 +0000] I [io-stats.c:4038:fini] 0-GlusterSSD: io-stats translator unloaded
drive mirror is starting for drive-scsi2
drive-scsi2: transferred 0.0 B of 5.0 GiB (0.00%) in 0s
drive-scsi2: transferred 109.0 MiB of 5.0 GiB (2.13%) in 1s
drive-scsi2: transferred 231.0 MiB of 5.0 GiB (4.51%) in 2s

No errors.

Tomorrow, when I can shut down the VM, I will try to reactivate the discard option and see whether it works with that as well.

So far so good. Thanks.
 
I tried now with the discard option on, and it's working again.

Thanks for the patch.
 
Glad to hear it's working :) Please mark the thread as [SOLVED] by editing the thread/first post.
 
