Hi,
I have a reproduceable Issue with the qemu-guest-agent and the NFS server running inside a guest VM. Each night, when snapshot backups are run, and thus fsfreeze is issues to the Guest VM, I have a nasty Kernel debug stuff in my logs:
The VM is running Ubuntu 22.04 Kernel 5.15.0-89-generic
I suspect that somehow after fsfreezing the nfs server is still able to write stuff. Has anyone else experienced this and how could I mitigate this?
regards
I have a reproduceable Issue with the qemu-guest-agent and the NFS server running inside a guest VM. Each night, when snapshot backups are run, and thus fsfreeze is issues to the Guest VM, I have a nasty Kernel debug stuff in my logs:
Code:
Mar 26 01:30:00 publikore-data qemu-ga: info: guest-ping called
Mar 26 01:30:00 publikore-data qemu-ga: info: guest-fsfreeze called
Mar 26 01:30:01 publikore-data kernel: [10844965.818729] ------------[ cut here ]------------
Mar 26 01:30:01 publikore-data kernel: [10844965.818732] WARNING: CPU: 1 PID: 811 at fs/ext4/ext4_jbd2.c:75 ext4_journal_check_start+0x68/0xb0
Mar 26 01:30:01 publikore-data kernel: [10844965.818739] Modules linked in: nfnetlink tls binfmt_misc nls_iso8859_1 joydev input_leds serio_raw sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua drm nfsd auth_rpcgss efi_pstore nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear virtio_net net_failover psmouse failover virtio_scsi floppy
Mar 26 01:30:01 publikore-data kernel: [10844965.818766] CPU: 1 PID: 811 Comm: nfsd Tainted: G W 5.15.0-89-generic #99-Ubuntu
Mar 26 01:30:01 publikore-data kernel: [10844965.818768] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
Mar 26 01:30:01 publikore-data kernel: [10844965.818770] RIP: 0010:ext4_journal_check_start+0x68/0xb0
Mar 26 01:30:01 publikore-data kernel: [10844965.818772] Code: 02 00 00 31 c0 48 85 d2 74 07 8b 02 83 e0 02 75 24 4c 8b 65 f8 c9 c3 cc cc cc cc 4c 8b 65 f8 b8 fb ff ff ff c9 c3 cc cc cc cc <0f> 0b eb ce b8 e2 ff ff ff eb dc 44 8b 42 10 68 08 50 c1 91 45 31
Mar 26 01:30:01 publikore-data kernel: [10844965.818773] RSP: 0018:ffffad69c0abbad8 EFLAGS: 00010246
Mar 26 01:30:01 publikore-data kernel: [10844965.818775] RAX: ffff8ab5c8a38000 RBX: 0000000000000001 RCX: 0000000000000002
Mar 26 01:30:01 publikore-data kernel: [10844965.818777] RDX: 0000000000000000 RSI: 0000000000001766 RDI: ffff8ab5c8b29000
Mar 26 01:30:01 publikore-data kernel: [10844965.818778] RBP: ffffad69c0abbae0 R08: 0000000000000000 R09: ffffffff90a7e73a
Mar 26 01:30:01 publikore-data kernel: [10844965.818779] R10: ffff8ab5c434ad68 R11: 0000000000000000 R12: ffff8ab5c8b29000
Mar 26 01:30:01 publikore-data kernel: [10844965.818780] R13: 0000000000000000 R14: 0000000000000008 R15: ffff8ab5c8b29000
Mar 26 01:30:01 publikore-data kernel: [10844965.818783] FS: 0000000000000000(0000) GS:ffff8ab6f7d00000(0000) knlGS:0000000000000000
Mar 26 01:30:01 publikore-data kernel: [10844965.818785] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 26 01:30:01 publikore-data kernel: [10844965.818786] CR2: 00007f5d91b721a0 CR3: 0000000103500000 CR4: 00000000000006e0
Mar 26 01:30:01 publikore-data kernel: [10844965.818790] Call Trace:
Mar 26 01:30:01 publikore-data kernel: [10844965.818792] <TASK>
Mar 26 01:30:01 publikore-data kernel: [10844965.818795] ? show_trace_log_lvl+0x1d6/0x2ea
Mar 26 01:30:01 publikore-data kernel: [10844965.818800] ? show_trace_log_lvl+0x1d6/0x2ea
Mar 26 01:30:01 publikore-data kernel: [10844965.818803] ? __ext4_journal_start_sb+0x35/0x130
Mar 26 01:30:01 publikore-data kernel: [10844965.818805] ? show_regs.part.0+0x23/0x29
Mar 26 01:30:01 publikore-data kernel: [10844965.818807] ? show_regs.cold+0x8/0xd
Mar 26 01:30:01 publikore-data kernel: [10844965.818809] ? ext4_journal_check_start+0x68/0xb0
Mar 26 01:30:01 publikore-data kernel: [10844965.818811] ? __warn+0x8c/0x100
Mar 26 01:30:01 publikore-data kernel: [10844965.818814] ? ext4_journal_check_start+0x68/0xb0
Mar 26 01:30:01 publikore-data kernel: [10844965.818816] ? report_bug+0xa4/0xd0
Mar 26 01:30:01 publikore-data kernel: [10844965.818821] ? handle_bug+0x39/0x90
Mar 26 01:30:01 publikore-data kernel: [10844965.818825] ? exc_invalid_op+0x19/0x70
Mar 26 01:30:01 publikore-data kernel: [10844965.818827] ? asm_exc_invalid_op+0x1b/0x20
Mar 26 01:30:01 publikore-data kernel: [10844965.818830] ? ext4_dirty_inode+0x3a/0x80
Mar 26 01:30:01 publikore-data kernel: [10844965.818833] ? ext4_journal_check_start+0x68/0xb0
Mar 26 01:30:01 publikore-data kernel: [10844965.818835] ? ext4_journal_check_start+0x13/0xb0
Mar 26 01:30:01 publikore-data kernel: [10844965.818836] __ext4_journal_start_sb+0x35/0x130
Mar 26 01:30:01 publikore-data kernel: [10844965.818838] ext4_dirty_inode+0x3a/0x80
Mar 26 01:30:01 publikore-data kernel: [10844965.818840] __mark_inode_dirty+0x5e/0x330
Mar 26 01:30:01 publikore-data kernel: [10844965.818843] generic_update_time+0x6c/0xd0
Mar 26 01:30:01 publikore-data kernel: [10844965.818845] file_update_time+0x127/0x140
Mar 26 01:30:01 publikore-data kernel: [10844965.818848] ? mk_fsid+0x110/0x110 [nfsd]
Mar 26 01:30:01 publikore-data kernel: [10844965.818869] file_modified+0x27/0x40
Mar 26 01:30:01 publikore-data kernel: [10844965.818870] ext4_buffered_write_iter+0x57/0x170
Mar 26 01:30:01 publikore-data kernel: [10844965.818873] ext4_file_write_iter+0x43/0x60
Mar 26 01:30:01 publikore-data kernel: [10844965.818875] do_iter_readv_writev+0x14a/0x1b0
Mar 26 01:30:01 publikore-data kernel: [10844965.818878] do_iter_write+0x8c/0x160
Mar 26 01:30:01 publikore-data kernel: [10844965.818881] vfs_iter_write+0x19/0x30
Mar 26 01:30:01 publikore-data kernel: [10844965.818883] nfsd_vfs_write+0x149/0x610 [nfsd]
Mar 26 01:30:01 publikore-data kernel: [10844965.818901] ? nfs4_put_stid+0xa4/0xc0 [nfsd]
Mar 26 01:30:01 publikore-data kernel: [10844965.818923] nfsd4_write+0x130/0x1b0 [nfsd]
Mar 26 01:30:01 publikore-data kernel: [10844965.818941] nfsd4_proc_compound+0x41c/0x770 [nfsd]
Mar 26 01:30:01 publikore-data kernel: [10844965.818958] ? nfsd_cache_lookup+0x3b7/0x4a0 [nfsd]
Mar 26 01:30:01 publikore-data kernel: [10844965.818977] nfsd_dispatch+0x16c/0x270 [nfsd]
Mar 26 01:30:01 publikore-data kernel: [10844965.818994] svc_process_common+0x3da/0x720 [sunrpc]
Mar 26 01:30:01 publikore-data kernel: [10844965.819021] ? nfsd_svc+0x190/0x190 [nfsd]
Mar 26 01:30:01 publikore-data kernel: [10844965.819035] svc_process+0xbc/0x100 [sunrpc]
Mar 26 01:30:01 publikore-data kernel: [10844965.819057] nfsd+0xed/0x150 [nfsd]
Mar 26 01:30:01 publikore-data kernel: [10844965.819071] ? nfsd_shutdown_threads+0x90/0x90 [nfsd]
Mar 26 01:30:01 publikore-data kernel: [10844965.819085] kthread+0x12a/0x150
Mar 26 01:30:01 publikore-data kernel: [10844965.819089] ? set_kthread_struct+0x50/0x50
Mar 26 01:30:01 publikore-data kernel: [10844965.819091] ret_from_fork+0x22/0x30
Mar 26 01:30:01 publikore-data kernel: [10844965.819096] </TASK>
Mar 26 01:30:01 publikore-data kernel: [10844965.819096] ---[ end trace a8672e992bf3892a ]---
Mar 26 01:30:01 publikore-data kernel: [10844965.819115] ------------[ cut here ]------------
The VM is running Ubuntu 22.04 Kernel 5.15.0-89-generic
I suspect that somehow after fsfreezing the nfs server is still able to write stuff. Has anyone else experienced this and how could I mitigate this?
regards