IPoIB interface stops working when restoring a VM dump from USB

shaunieb

Hi guys,

I've been struggling with a problem for a few days now and I can't find a solution.

When restoring a VM dump from a USB drive, after a few minutes of the restore process the InfiniBand interface on the IPoIB layer becomes unresponsive. I can bring it back with a
Code:
service networking restart
. I'm not sure whether this issue can only be triggered in this scenario, as I have had an IB port react the same way when putting an average load onto the Ceph storage cluster.
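In case it helps anyone else debugging this: before restarting all of networking, it is worth confirming what state the IPoIB interface is actually in. A minimal sketch, assuming the interface is named ib0 (adjust to your setup) and that the infiniband-diags package is installed for ibstat:

```shell
# Hypothetical interface name ib0 -- adjust to your setup.
# Link state as seen by the kernel:
ip link show ib0
cat /sys/class/net/ib0/operstate

# Physical/port state as seen by the HCA (needs infiniband-diags):
ibstat

# Cycling only the affected interface is less disruptive than
# restarting all of networking:
ifdown ib0 && ifup ib0

# Fall back to the full restart if that is not enough:
service networking restart
```

Comparing the kernel's view (`operstate`) with the HCA's port state can show whether the problem is on the IPoIB layer or on the physical link.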

Here is the kern.log from the time I mount the USB drive until the IB interface stops working.


Code:
Nov 13 15:03:14 jhb-tc-pve-a kernel: [ 1714.145101] EXT4-fs (sdh1): recovery complete
Nov 13 15:03:14 jhb-tc-pve-a kernel: [ 1714.145281] EXT4-fs (sdh1): mounted filesystem with ordered data mode. Opts: (null)
Nov 13 15:03:26 jhb-tc-pve-a pvedaemon[2325]: <root@pam> starting task UPID:jhb-tc-pve-a:000037FD:0002A255:5645DF9E:qmrestore:104:root@pam:
Nov 13 15:03:30 jhb-tc-pve-a kernel: [ 1729.707159] Key type ceph registered
Nov 13 15:03:30 jhb-tc-pve-a kernel: [ 1729.707470] libceph: loaded (mon/osd proto 15/24)
Nov 13 15:03:30 jhb-tc-pve-a kernel: [ 1729.709242] rbd: loaded (major 251)
Nov 13 15:03:30 jhb-tc-pve-a kernel: [ 1729.712515] libceph: client744804 fsid acaae5c1-7e7d-4482-8993-bcbbb04e7870
Nov 13 15:03:30 jhb-tc-pve-a kernel: [ 1729.713698] libceph: mon0 10.10.10.10:6789 session established
Nov 13 15:03:30 jhb-tc-pve-a kernel: [ 1729.741190] rbd: rbd0: added with size 0x3200000000
Nov 13 15:03:30 jhb-tc-pve-a kernel: [ 1730.329258] rbd: rbd1: added with size 0x7d00000000
Nov 13 15:04:06 jhb-tc-pve-a kernel: [ 1766.569633] libceph: mon0 10.10.10.10:6789 socket closed (con state OPEN)
Nov 13 15:04:06 jhb-tc-pve-a kernel: [ 1766.569660] libceph: mon0 10.10.10.10:6789 session lost, hunting for new mon
Nov 13 15:04:28 jhb-tc-pve-a kernel: [ 1787.948248] libceph: mon1 10.10.10.11:6789 socket closed (con state CONNECTING)
Nov 13 15:04:36 jhb-tc-pve-a kernel: [ 1796.574077] libceph: mon0 10.10.10.10:6789 socket closed (con state OPEN)
Nov 13 15:04:42 jhb-tc-pve-a kernel: [ 1802.553924] libceph: mon2 10.10.10.12:6789 socket closed (con state CONNECTING)
Nov 13 15:04:52 jhb-tc-pve-a kernel: [ 1812.519390] libceph: mon1 10.10.10.11:6789 socket closed (con state CONNECTING)
Nov 13 15:05:01 jhb-tc-pve-a kernel: [ 1821.520016] libceph: mon1 10.10.10.11:6789 socket closed (con state CONNECTING)
Nov 13 15:05:18 jhb-tc-pve-a kernel: [ 1837.774144] libceph: mon1 10.10.10.11:6789 socket closed (con state CONNECTING)
Nov 13 15:05:22 jhb-tc-pve-a kernel: [ 1841.762128] libceph: mon2 10.10.10.12:6789 socket closed (con state CONNECTING)
Nov 13 15:05:36 jhb-tc-pve-a kernel: [ 1856.582368] libceph: mon0 10.10.10.10:6789 socket closed (con state OPEN)
Nov 13 15:05:48 jhb-tc-pve-a kernel: [ 1867.824753] libceph: mon2 10.10.10.12:6789 socket closed (con state CONNECTING)
Nov 13 15:05:51 jhb-tc-pve-a kernel: [ 1871.536953] libceph: mon1 10.10.10.11:6789 socket closed (con state CONNECTING)
Nov 13 15:06:16 jhb-tc-pve-a kernel: [ 1896.071937] libceph: mon2 10.10.10.12:6789 socket closed (con state CONNECTING)
Nov 13 15:06:26 jhb-tc-pve-a kernel: [ 1906.589293] libceph: mon0 10.10.10.10:6789 socket closed (con state OPEN)
Nov 13 15:06:31 jhb-tc-pve-a kernel: [ 1911.073091] libceph: mon2 10.10.10.12:6789 socket closed (con state CONNECTING)
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.074129] libceph: mon2 10.10.10.12:6789 socket closed (con state CONNECTING)
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.529877] INFO: task kworker/u26:2:291 blocked for more than 120 seconds.
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.529921]       Tainted: P           O    4.2.2-1-pve #1
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.529945] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.529980] kworker/u26:2   D 0000000000000006     0   291      2 0x00000000
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.529992] Workqueue: writeback wb_workfn (flush-251:0)
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.529995]  ffff880869e274f8 0000000000000046 ffff88086b6aee00 ffff88086a44ee00
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.529998]  0000000000000000 ffff880869e28000 ffff88087fc16a00 7fffffffffffffff
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530000]  ffff880869e27718 ffffe8ffffc0b900 ffff880869e27518 ffffffff817cc077
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530003] Call Trace:
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530014]  [<ffffffff817cc077>] schedule+0x37/0x80
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530018]  [<ffffffff817ceeb1>] schedule_timeout+0x201/0x2a0
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530023]  [<ffffffff81380337>] ? blk_flush_plug_list+0xc7/0x220
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530030]  [<ffffffff8101cc99>] ? read_tsc+0x9/0x10
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530033]  [<ffffffff817cb66b>] io_schedule_timeout+0xbb/0x140
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530037]  [<ffffffff8138ae69>] bt_get+0x129/0x1b0
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530044]  [<ffffffff810b76e0>] ? wait_woken+0x90/0x90
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530047]  [<ffffffff8138b247>] blk_mq_get_tag+0x97/0xc0
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530050]  [<ffffffff81384cb2>] ? ll_back_merge_fn+0x132/0x190
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530053]  [<ffffffff81386d1b>] __blk_mq_alloc_request+0x1b/0x1f0
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530055]  [<ffffffff81388a4b>] blk_mq_map_request+0x17b/0x1c0
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530061]  [<ffffffff811790f5>] ? mempool_alloc_slab+0x15/0x20
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530063]  [<ffffffff81389953>] blk_sq_make_request+0x73/0x330
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530067]  [<ffffffff8137bb0c>] ? generic_make_request_checks+0x1dc/0x3a0
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530070]  [<ffffffff8137bd9c>] generic_make_request+0xcc/0x110
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530073]  [<ffffffff8137be56>] submit_bio+0x76/0x180
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530079]  [<ffffffff813d39c6>] ? __percpu_counter_add+0x56/0x70
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530083]  [<ffffffff81223b4c>] submit_bh_wbc+0x14c/0x180
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530088]  [<ffffffff81225bbd>] __block_write_full_page.constprop.37+0x11d/0x3c0
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530091]  [<ffffffff81388602>] ? __blk_mq_run_hw_queue+0x1d2/0x360
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530093]  [<ffffffff812261d0>] ? I_BDEV+0x20/0x20
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530096]  [<ffffffff812261d0>] ? I_BDEV+0x20/0x20
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530098]  [<ffffffff81225fab>] block_write_full_page+0x14b/0x170
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530102]  [<ffffffff81226b78>] blkdev_writepage+0x18/0x20
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530104]  [<ffffffff811815a7>] __writepage+0x17/0x40
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530107]  [<ffffffff811837a5>] write_cache_pages+0x215/0x480
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530109]  [<ffffffff81181590>] ? wb_position_ratio+0x1f0/0x1f0
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530112]  [<ffffffff81183a50>] generic_writepages+0x40/0x60
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530114]  [<ffffffff8118478e>] do_writepages+0x1e/0x30
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530117]  [<ffffffff8121a645>] __writeback_single_inode+0x45/0x290
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530120]  [<ffffffff8121ad68>] writeback_sb_inodes+0x228/0x480
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530123]  [<ffffffff8121b049>] __writeback_inodes_wb+0x89/0xc0
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530126]  [<ffffffff8121b268>] wb_writeback+0x1e8/0x280
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530128]  [<ffffffff8121badd>] wb_workfn+0x2fd/0x470
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530135]  [<ffffffff8108f807>] process_one_work+0x157/0x3f0
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530138]  [<ffffffff81090229>] worker_thread+0x69/0x480
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530141]  [<ffffffff810901c0>] ? rescuer_thread+0x310/0x310
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530145]  [<ffffffff810957db>] kthread+0xdb/0x100
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530147]  [<ffffffff81095700>] ? kthread_create_on_node+0x1c0/0x1c0
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530151]  [<ffffffff817d019f>] ret_from_fork+0x3f/0x70
Nov 13 15:06:40 jhb-tc-pve-a kernel: [ 1920.530153]  [<ffffffff81095700>] ? kthread_create_on_node+0x1c0/0x1c0
Nov 13 15:06:56 jhb-tc-pve-a kernel: [ 1936.593369] libceph: mon0 10.10.10.10:6789 socket closed (con state OPEN)

Any help will be greatly appreciated, even a nudge in the right direction.

Thanks
Shaun
 
I've made some progress, but I don't feel I have resolved the issue. I've just gone around it ;)

I was using the KRBD connector to connect to the Ceph (Hammer) cluster. When I added a second connector using librbd and restored the backup, I had no issues, although the restore time increased dramatically.
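For reference, the two connectors can coexist as two storage entries pointing at the same pool, differing only in the krbd flag. A sketch of /etc/pve/storage.cfg under assumptions: the storage IDs are made up, the mon addresses come from the log above, and the pool name and username are placeholders:

```
rbd: ceph-krbd
        monhost 10.10.10.10 10.10.10.11 10.10.10.12
        pool rbd
        content images
        username admin
        krbd 1

rbd: ceph-librbd
        monhost 10.10.10.10 10.10.10.11 10.10.10.12
        pool rbd
        content images
        username admin
        krbd 0
```

With both entries defined, a VM disk can be moved between the kernel RBD path and the librbd path just by choosing the target storage.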

What did happen when I checked this morning is that, yet again, an IPoIB interface on one of the cluster nodes stopped working.

Code:
Nov 17 15:53:54 jhb-tc-pve-b pvedaemon[233200]: <root@pam> starting task UPID:jhb-tc-pve-b:00039222:0215893A:564B3172:qmstart:103:root@pam:
Nov 17 15:53:55 jhb-tc-pve-b kernel: [349993.303295] device tap103i0 entered promiscuous mode
Nov 17 15:53:58 jhb-tc-pve-b kernel: [349997.007016] kvm: zapping shadow pages for mmio generation wraparound
Nov 17 15:53:58 jhb-tc-pve-b kernel: [349997.012085] kvm: zapping shadow pages for mmio generation wraparound
Nov 17 16:02:52 jhb-tc-pve-b kernel: [350530.893999] libceph: osd0 up
Nov 17 16:02:52 jhb-tc-pve-b kernel: [350530.894004] libceph: osd0 weight 0x10000 (in)
Nov 17 16:03:59 jhb-tc-pve-b kernel: [350597.806424] libceph: osd0 10.10.10.10:6808 socket closed (con state OPEN)
Nov 17 16:04:00 jhb-tc-pve-b kernel: [350598.773720] libceph: osd0 down
Nov 17 16:04:53 jhb-tc-pve-b kernel: [350651.838124] libceph: osd5 down
Nov 17 16:05:01 jhb-tc-pve-b kernel: [350660.103955] libceph: osd4 down
Nov 17 16:05:10 jhb-tc-pve-b kernel: [350668.841703] libceph: osd3 down
Nov 17 16:05:38 jhb-tc-pve-b kernel: [350696.952594] libceph: osd1 down
Nov 17 20:36:16 jhb-tc-pve-b kernel: [366951.175449] INFO: task kmmpd-rbd0:106162 blocked for more than 120 seconds.
Nov 17 20:36:16 jhb-tc-pve-b kernel: [366951.175474]       Tainted: P           O    4.2.2-1-pve #1
Nov 17 20:36:16 jhb-tc-pve-b kernel: [366951.175493] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 17 20:36:16 jhb-tc-pve-b kernel: [366951.175523] kmmpd-rbd0      D 0000000000000002     0 106162      2 0x00000000
Nov 17 20:36:16 jhb-tc-pve-b kernel: [366951.175527]  ffff880c3b113bb8 0000000000000046 ffff880fb6172940 ffff880fdd15d280
Nov 17 20:36:16 jhb-tc-pve-b kernel: [366951.175530]  ffff880c3b113b98 ffff880c3b114000 ffff880ffe096a00 7fffffffffffffff
Nov 17 20:36:16 jhb-tc-pve-b kernel: [366951.175531]  ffffffff817cc860 ffff880c3b113d48 ffff880c3b113bd8 ffffffff817cc077
Nov 17 20:36:16 jhb-tc-pve-b kernel: [366951.175533] Call Trace:
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290188] INFO: task kmmpd-rbd0:106162 blocked for more than 120 seconds.
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290246]       Tainted: P           O    4.2.2-1-pve #1
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290376] kmmpd-rbd0      D 0000000000000002     0 106162      2 0x00000000
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290381]  ffff880c3b113bb8 0000000000000046 ffff880fb6172940 ffff880fdd15d280
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290383]  ffff880c3b113b98 ffff880c3b114000 ffff880ffe096a00 7fffffffffffffff
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290385]  ffffffff817cc860 ffff880c3b113d48 ffff880c3b113bd8 ffffffff817cc077
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290387] Call Trace:
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290396]  [<ffffffff817cc860>] ? bit_wait_timeout+0x70/0x70
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290398]  [<ffffffff817cc077>] schedule+0x37/0x80
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290400]  [<ffffffff817ceeb1>] schedule_timeout+0x201/0x2a0
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290405]  [<ffffffff8138841a>] ? blk_mq_run_hw_queue+0x8a/0xa0
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290409]  [<ffffffff8101cc99>] ? read_tsc+0x9/0x10
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290411]  [<ffffffff817cc860>] ? bit_wait_timeout+0x70/0x70
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290413]  [<ffffffff817cb66b>] io_schedule_timeout+0xbb/0x140
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290417]  [<ffffffff810b73e7>] ? prepare_to_wait+0x57/0x80
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290419]  [<ffffffff817cc89e>] bit_wait_io+0x3e/0x60
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290421]  [<ffffffff817cc43f>] __wait_on_bit+0x5f/0x90
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290423]  [<ffffffff817cc860>] ? bit_wait_timeout+0x70/0x70
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290425]  [<ffffffff817cc4e2>] out_of_line_wait_on_bit+0x72/0x80
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290428]  [<ffffffff810b7720>] ? autoremove_wake_function+0x40/0x40
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290431]  [<ffffffff812227c6>] __wait_on_buffer+0x36/0x40
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290435]  [<ffffffff812afdd1>] write_mmp_block+0x101/0x130
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290437]  [<ffffffff812b01e2>] kmmpd+0x192/0x3d0
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290439]  [<ffffffff812b0050>] ? __dump_mmp_msg+0x70/0x70
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290442]  [<ffffffff810957db>] kthread+0xdb/0x100
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290444]  [<ffffffff81095700>] ? kthread_create_on_node+0x1c0/0x1c0
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290446]  [<ffffffff817d019f>] ret_from_fork+0x3f/0x70
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290447]  [<ffffffff81095700>] ? kthread_create_on_node+0x1c0/0x1c0
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290454] INFO: task mysqld:108969 blocked for more than 120 seconds.
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290506]       Tainted: P           O    4.2.2-1-pve #1
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290553] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290635] mysqld          D 0000000000000006     0 108969 108559 0x00000100
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290637]  ffff8811127bfb88 0000000000000082 ffff881f6f13e040 ffff881fe52ec4c0
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290639]  0000000000000000 ffff8811127c0000 ffff881ffd816a00 7fffffffffffffff
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290641]  ffffffff817cc860 ffff8811127bfd18 ffff8811127bfba8 ffffffff817cc077
Nov 17 20:38:16 jhb-tc-pve-b kernel: [367071.290643] Call Trace:

The VM that was running was still set to use the KRBD connector, so I'll change that to librbd now and see if that makes a difference.
 
Hello All,

Could you please tell me how you resolved the problem?
I have almost the same issue. Currently I am using Proxmox 4.3 on a 3-node cluster with Ceph and an RBD pool. Everything is running on the InfiniBand network.
When I try to restore a backup from the HDD directly onto the Ceph pool with KRBD ON, the restore hangs and I see the output below.
When KRBD is OFF I see no issues.
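One way to test both paths without creating a second storage entry is to toggle the krbd flag on the existing RBD storage before the restore. A sketch; the storage ID "ceph-rbd" is a placeholder for your own:

```shell
# Switch the RBD storage to librbd-backed access (krbd off):
pvesm set ceph-rbd --krbd 0

# ...run the restore, then switch back to kernel RBD if desired:
pvesm set ceph-rbd --krbd 1
```

Note that the flag applies to the whole storage entry, so running VMs on that storage keep their current mapping until they are restarted.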

I have updated the kernel on 4.3. Unfortunately, it did not help.

Please help.

Code:
[ 8640.354047] INFO: task kworker/u66:2:12817 blocked for more than 120 seconds.
[ 8640.354067] Tainted: P IO 4.4.24-1-pve #1
[ 8640.354079] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 8640.354096] kworker/u66:2 D ffff8800362ff568 0 12817 2 0x00000000
[ 8640.354107] Workqueue: writeback wb_workfn (flush-250:0)
[ 8640.354110] ffff8800362ff568 ffff88086b570800 ffff880469bdf080 ffff88046c591900
[ 8640.354114] ffff880036300000 ffff88046fa97180 7fffffffffffffff ffff8800362ff740
[ 8640.354117] ffffe8fbefe88880 ffff8800362ff580 ffffffff81854cd5 0000000000000000
[ 8640.354120] Call Trace:
[ 8640.354127] [<ffffffff81854cd5>] schedule+0x35/0x80
[ 8640.354130] [<ffffffff81857f05>] schedule_timeout+0x235/0x2d0
[ 8640.354134] [<ffffffff813c9dd6>] ? blk_flush_plug_list+0xd6/0x240
[ 8640.354140] [<ffffffff810f5abc>] ? ktime_get+0x3c/0xb0
[ 8640.354143] [<ffffffff818541cb>] io_schedule_timeout+0xbb/0x140
[ 8640.354147] [<ffffffff813d59db>] bt_get+0x14b/0x1d0
[ 8640.354153] [<ffffffff810c41c0>] ? wait_woken+0x90/0x90
[ 8640.354156] [<ffffffff813d5d55>] blk_mq_get_tag+0x45/0xe0
[ 8640.354159] [<ffffffff813d14eb>] __blk_mq_alloc_request+0x1b/0x1f0
[ 8640.354163] [<ffffffff813d33c9>] blk_mq_map_request+0x199/0x210
[ 8640.354166] [<ffffffff813d4424>] blk_sq_make_request+0xa4/0x3e0
[ 8640.354171] [<ffffffff813c7bf3>] ? generic_make_request_checks+0x243/0x4f0
[ 8640.354175] [<ffffffff813c8430>] generic_make_request+0x110/0x1f0
[ 8640.354178] [<ffffffff813c8586>] submit_bio+0x76/0x180
[ 8640.354182] [<ffffffff812464ef>] submit_bh_wbc+0x12f/0x160
[ 8640.354186] [<ffffffff812485cd>] __block_write_full_page.constprop.39+0x11d/0x3c0
[ 8640.354189] [<ffffffff81248be0>] ? I_BDEV+0x20/0x20
[ 8640.354191] [<ffffffff81248be0>] ? I_BDEV+0x20/0x20
[ 8640.354194] [<ffffffff812489bb>] block_write_full_page+0x14b/0x170
[ 8640.354198] [<ffffffff812493e8>] blkdev_writepage+0x18/0x20
[ 8640.354203] [<ffffffff81198f03>] __writepage+0x13/0x30
[ 8640.354207] [<ffffffff8119b178>] write_cache_pages+0x228/0x4e0
[ 8640.354211] [<ffffffff810ac769>] ? try_to_wake_up+0x49/0x400
[ 8640.354214] [<ffffffff81198ef0>] ? wb_position_ratio+0x1f0/0x1f0
[ 8640.354218] [<ffffffff8119b481>] generic_writepages+0x51/0x80
[ 8640.354221] [<ffffffff8119c27e>] do_writepages+0x1e/0x30
[ 8640.354225] [<ffffffff8123cbf5>] __writeback_single_inode+0x45/0x340
[ 8640.354228] [<ffffffff8123d41c>] writeback_sb_inodes+0x27c/0x580
[ 8640.354232] [<ffffffff8123d7a9>] __writeback_inodes_wb+0x89/0xc0
[ 8640.354235] [<ffffffff8123db22>] wb_writeback+0x272/0x300
[ 8640.354239] [<ffffffff81229eb1>] ? get_nr_dirty_inodes+0x51/0x80
[ 8640.354242] [<ffffffff8123e36c>] wb_workfn+0x2ec/0x400
[ 8640.354246] [<ffffffff8109b0b8>] process_one_work+0x158/0x420
[ 8640.354249] [<ffffffff8109bb99>] worker_thread+0x69/0x480
[ 8640.354253] [<ffffffff8109bb30>] ? rescuer_thread+0x330/0x330
[ 8640.354256] [<ffffffff810a107a>] kthread+0xea/0x100
[ 8640.354259] [<ffffffff810a0f90>] ? kthread_park+0x60/0x60
[ 8640.354262] [<ffffffff8185918f>] ret_from_fork+0x3f/0x70
[ 8640.354265] [<ffffffff810a0f90>] ? kthread_park+0x60/0x60
 
OK, I have found the solution myself. I have created two separate pools: one for the VMs and one for the LXC containers.
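For anyone wanting to reproduce this split, a sketch of the setup. Pool names, PG counts, and storage IDs below are all placeholders; as far as I know, in PVE 4.x container storage on Ceph requires krbd, while VM disks can use librbd:

```shell
# Create two Ceph pools, one per workload type (128 PGs is only an example;
# size the PG count for your own OSD count):
ceph osd pool create vm-pool 128
ceph osd pool create ct-pool 128

# Register them in Proxmox: librbd for VM images, krbd for container rootfs:
pvesm add rbd vm-rbd --pool vm-pool --monhost "10.10.10.10 10.10.10.11 10.10.10.12" --content images --krbd 0
pvesm add rbd ct-rbd --pool ct-pool --monhost "10.10.10.10 10.10.10.11 10.10.10.12" --content rootdir --krbd 1
```

This keeps the kernel RBD client (and its interaction with IPoIB under load) confined to the container pool.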

Best regards,
 
Nice find! I haven't had that issue in the later versions of PVE 4, but that cluster has not been heavily used. I'll update you if I run into any issues, to keep you in the loop :)
 
