Just a shot in the dark: could it also be that the node is swapping heavily during live migration (RAM sync)?
Thank you for your answer.
This was one of my first thoughts.
The server has plenty of RAM, so I tried setting swappiness=0 and disabling the swap volume.
Same results.
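For anyone who wants to rule out swapping independently, here is a rough sketch of one way to check it. It assumes the `pswpin` and `pswpout` counters in `/proc/vmstat`; the function names are made up for illustration. The idea is to take one snapshot of the file before the live migration and one after, then compare them.

```python
def swap_counters(vmstat_text):
    """Return (pswpin, pswpout) parsed from the text of /proc/vmstat."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        # Each line of /proc/vmstat has the form "<name> <value>".
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    # Counters that are missing from the text are treated as zero.
    return counters.get("pswpin", 0), counters.get("pswpout", 0)


def swapped_between(before_text, after_text):
    """Pages swapped in and out between two /proc/vmstat snapshots."""
    in_before, out_before = swap_counters(before_text)
    in_after, out_after = swap_counters(after_text)
    return in_after - in_before, out_after - out_before
```

If both differences stay at zero across a migration, the node did not swap during that window.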
This is the stack trace:
May 14 08:23:07 pve-LAB2 kernel: [ 1702.482562] PGD 175a40e067 P4D 175a40e067 PUD 175a410067 PMD 1ff7edc067 PTE 1ff790c061
May 14 08:23:07 pve-LAB2 kernel: [ 1702.482592] Oops: 0003 [#1] SMP PTI
May 14 08:23:07 pve-LAB2 kernel: [ 1702.482605] Modules linked in: drbd_transport_tcp(O) drbd(O) libcrc32c ip_set ip6table_filter ip6_tables iptable_filter softdog nfnetlink_log nfnetlink ipmi_ssif intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ast pcbc ttm aesni_intel aes_x86_64 crypto_simd mxm_wmi glue_helper drm_kms_helper cryptd drm intel_cstate intel_rapl_perf fb_sys_fops syscopyarea sysfillrect sysimgblt snd_pcm snd_timer snd soundcore pcspkr mei_me joydev input_leds lpc_ich mei shpchp ioatdma ipmi_si ipmi_devintf ipmi_msghandler wmi acpi_power_meter acpi_pad mac_hid vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core sunrpc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zavl(PO)
May 14 08:23:07 pve-LAB2 kernel: [ 1702.482866] icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs xor zstd_compress raid6_pq hid_generic usbkbd usbmouse usbhid hid i2c_i801 ahci libahci ixgbe igb mdio i2c_algo_bit dca ptp pps_core mpt3sas raid_class scsi_transport_sas
May 14 08:23:07 pve-LAB2 kernel: [ 1702.482963] CPU: 0 PID: 11425 Comm: drbd_r_Test1 Tainted: P O 4.15.17-1-pve #1
May 14 08:23:07 pve-LAB2 kernel: [ 1702.482990] Hardware name: Supermicro X10DRH LN4/X10DRH-CLN4, BIOS 2.0 01/30/2016
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483015] RIP: 0010:avl_insert+0x4b/0xd0 [zavl]
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483032] RSP: 0018:ffffae4ab5517c20 EFLAGS: 00010282
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483049] RAX: 0000000000000000 RBX: ffff967e27f25300 RCX: ffffffffc03e7fcf
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483072] RDX: 0000000000000000 RSI: ffff967e27f25308 RDI: ffff967ef1a85d60
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483095] RBP: ffffae4ab5517c70 R08: ffffffffc03e7fce R09: ffff967efec07180
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483118] R10: ffff967e27f25300 R11: 0000000000000000 R12: ffff967ef1a85d30
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483140] R13: ffff967dfd201400 R14: 0000000000000000 R15: 0000000000000000
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483164] FS: 0000000000000000(0000) GS:ffff967eff200000(0000) knlGS:0000000000000000
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483189] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483208] CR2: ffffffffc03e7fce CR3: 000000175a40a003 CR4: 00000000003606f0
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483231] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483254] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483277] Call Trace:
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483335] ? zfs_range_lock+0x4bf/0x5c0 [zfs]
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483357] ? spl_kmem_alloc+0xae/0x190 [spl]
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483403] zvol_request+0x16e/0x300 [zfs]
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483422] generic_make_request+0x123/0x2f0
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483439] submit_bio+0x73/0x150
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483452] ? submit_bio+0x73/0x150
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483473] ? drbd_flush_after_epoch+0x11c/0x360 [drbd]
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483496] drbd_flush_after_epoch+0x1b6/0x360 [drbd]
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483518] ? conn_wait_ee_cond+0x29/0x60 [drbd]
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483538] ? w_flush+0x50/0x50 [drbd]
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483557] receive_Barrier+0x17d/0x260 [drbd]
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483579] ? w_flush+0x50/0x50 [drbd]
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483599] drbd_receiver+0x45b/0x6a0 [drbd]
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483622] drbd_thread_setup+0x76/0x180 [drbd]
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483641] kthread+0x105/0x140
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483660] ? __drbd_next_peer_device_ref+0x170/0x170 [drbd]
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483682] ? kthread_create_worker_on_cpu+0x70/0x70
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483700] ? kthread_create_worker_on_cpu+0x70/0x70
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483719] ret_from_fork+0x35/0x40
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483733] Code: 89 c1 83 e0 04 48 83 c9 01 48 09 c8 4d 85 c0 48 c7 06 00 00 00 00 48 c7 46 08 00 00 00 00 48 89 46 10 0f 84 84 00 00 00 48 63 c2 <49> 89 34 c0 49 8b 50 10 8b 04 85 70 71 32 c0 89 d1 83 e1 03 83
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483841] CR2: ffffffffc03e7fce
May 14 08:23:07 pve-LAB2 kernel: [ 1702.483854] ---[ end trace 32d19839ea2084e4 ]---
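As a reading aid for the trace above, here is a small sketch that decodes the number after "Oops:" and the PTE value from the first line. It assumes the standard x86 page-fault error-code bits (bit 0 = page present, bit 1 = write, bit 2 = user mode) and the x86 PTE read/write flag (bit 1). The helper names are hypothetical, and this only interprets the numbers; it does not identify the cause.

```python
def decode_page_fault(error_code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(error_code & 0x1),  # 1 = protection fault, 0 = page not present
        "write": bool(error_code & 0x2),         # 1 = write access, 0 = read access
        "user_mode": bool(error_code & 0x4),     # 1 = fault raised from user mode
    }


def pte_is_writable(pte):
    """Check the read/write flag (bit 1) of an x86 page-table entry."""
    return bool(pte & 0x2)


# Values copied from the trace: "Oops: 0003" and "PTE 1ff790c061".
fault = decode_page_fault(0x0003)
writable = pte_is_writable(0x1FF790C061)
```

Under those assumptions, the values in this trace decode to a kernel-mode write to a page that is present but mapped read-only, at the address shown in CR2.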