PVE node rebooted suddenly.

Alexis Cambriegis

New Member
Oct 1, 2017
Hi guys! I have the following problem: one of my Proxmox nodes reboots at random times during the night.

Hardware: 2 SSDs in raidz1, an Intel Xeon W5580, and one HDD in a separate zpool. There is only one VM on the server, running Windows Server 2016 with SQL Server 2012. The reboots happen during the nightly backups...

/var/log/messages shows something like the log below.

How should I deal with this?

Oct 2 22:55:24 pve-node2 kernel: [195624.474749] Call Trace:

Oct 2 22:55:24 pve-node2 kernel: [195624.474758] __schedule+0x233/0x6f0

Oct 2 22:55:24 pve-node2 kernel: [195624.474759] schedule+0x36/0x80

Oct 2 22:55:24 pve-node2 kernel: [195624.474760] rwsem_down_write_failed+0x21d/0x3a0

Oct 2 22:55:24 pve-node2 kernel: [195624.474765] call_rwsem_down_write_failed+0x17/0x30

Oct 2 22:55:24 pve-node2 kernel: [195624.474766] down_write+0x2d/0x40

Oct 2 22:55:24 pve-node2 kernel: [195624.474768] filename_create+0x7e/0x160

Oct 2 22:55:24 pve-node2 kernel: [195624.474769] SyS_mkdir+0x51/0x100

Oct 2 22:55:24 pve-node2 kernel: [195624.474770] entry_SYSCALL_64_fastpath+0x1e/0xad

Oct 2 22:55:24 pve-node2 kernel: [195624.474772] RIP: 0033:0x7f65decca477

Oct 2 22:55:24 pve-node2 kernel: [195624.474772] RSP: 002b:00007ffc400c5538 EFLAGS: 00000246 ORIG_RAX: 0000000000000053

Oct 2 22:55:24 pve-node2 kernel: [195624.474773] RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f65decca477

Oct 2 22:55:24 pve-node2 kernel: [195624.474774] RDX: 00005578317d7364 RSI: 00000000000001ff RDI: 0000557836dec9d0

Oct 2 22:55:24 pve-node2 kernel: [195624.474775] RBP: 0000557833929010 R08: 0000000000000200 R09: 0000557833929028

Oct 2 22:55:24 pve-node2 kernel: [195624.474775] R10: 0000000000000000 R11: 0000000000000246 R12: 0000557836e3cc90

Oct 2 22:55:24 pve-node2 kernel: [195624.474776] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

Oct 2 22:57:25 pve-node2 kernel: [195745.304438] pvesr D 0 12406 1 0x00000000

Oct 2 22:57:25 pve-node2 kernel: [195745.304440] Call Trace:

Oct 2 22:57:25 pve-node2 kernel: [195745.304449] __schedule+0x233/0x6f0

Oct 2 22:57:25 pve-node2 kernel: [195745.304450] schedule+0x36/0x80
Oct 2 22:57:25 pve-node2 kernel: [195745.304452] rwsem_down_write_failed+0x21d/0x3a0
Oct 2 22:57:25 pve-node2 kernel: [195745.304456] call_rwsem_down_write_failed+0x17/0x30
Oct 2 22:57:25 pve-node2 kernel: [195745.304457] down_write+0x2d/0x40
Oct 2 22:57:25 pve-node2 kernel: [195745.304459] filename_create+0x7e/0x160
Oct 2 22:57:25 pve-node2 kernel: [195745.304460] SyS_mkdir+0x51/0x100
Oct 2 22:57:25 pve-node2 kernel: [195745.304461] entry_SYSCALL_64_fastpath+0x1e/0xad
Oct 2 22:57:25 pve-node2 kernel: [195745.304463] RIP: 0033:0x7f65decca477
Oct 2 22:57:25 pve-node2 kernel: [195745.304463] RSP: 002b:00007ffc400c5538 EFLAGS: 00000246 ORIG_RAX: 0000000000000053
Oct 2 22:57:25 pve-node2 kernel: [195745.304464] RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f65decca477
Oct 2 22:57:25 pve-node2 kernel: [195745.304465] RDX: 00005578317d7364 RSI: 00000000000001ff RDI: 0000557836dec9d0
Oct 2 22:57:25 pve-node2 kernel: [195745.304466] RBP: 0000557833929010 R08: 0000000000000200 R09: 0000557833929028
Oct 2 22:57:25 pve-node2 kernel: [195745.304466] R10: 0000000000000000 R11: 0000000000000246 R12: 0000557836e3cc90
Oct 2 22:57:25 pve-node2 kernel: [195745.304467] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Oct 2 22:59:26 pve-node2 kernel: [195866.134138] pvesr D 0 12406 1 0x00000000
Oct 2 22:59:26 pve-node2 kernel: [195866.134141] Call Trace:
Oct 2 22:59:26 pve-node2 kernel: [195866.134149] __schedule+0x233/0x6f0
Oct 2 22:59:26 pve-node2 kernel: [195866.134151] schedule+0x36/0x80
Oct 2 22:59:26 pve-node2 kernel: [195866.134152] rwsem_down_write_failed+0x21d/0x3a0
Oct 2 22:59:26 pve-node2 kernel: [195866.134157] call_rwsem_down_write_failed+0x17/0x30
Oct 2 22:59:26 pve-node2 kernel: [195866.134157] down_write+0x2d/0x40
Oct 2 22:59:26 pve-node2 kernel: [195866.134159] filename_create+0x7e/0x160
Oct 2 22:59:26 pve-node2 kernel: [195866.134161] SyS_mkdir+0x51/0x100
Oct 2 22:59:26 pve-node2 kernel: [195866.134163] entry_SYSCALL_64_fastpath+0x1e/0xad
Oct 2 22:59:26 pve-node2 kernel: [195866.134164] RIP: 0033:0x7f65decca477






Oct 4 10:08:03 pve-node2 pvedaemon[6255]: <root@pam> end task UPID:pve-node2:000032EA:0004F0BC:59D488D2:qmstart:101:root@pam: OK

Oct 4 10:14:58 pve-node2 kernel: [ 3654.231683] perf: interrupt took too long (3127 > 3126), lowering kernel.perf_event_max_sample_rate to 63750

Oct 4 10:21:57 pve-node2 kernel: [ 4072.883895] perf: interrupt took too long (3909 > 3908), lowering kernel.perf_event_max_sample_rate to 51000

Oct 4 10:35:03 pve-node2 kernel: [ 4858.390220] perf: interrupt took too long (5075 > 4886), lowering kernel.perf_event_max_sample_rate to 39250

Oct 4 21:13:22 pve-node2 pvedaemon[4112]: <root@pam> update VM 101: -delete shares -balloon 0 -memory 23552

Oct 5 02:21:23 pve-node2 kernel: [61626.052075] ksmd D 0 112 2 0x00000000

Oct 5 02:21:23 pve-node2 kernel: [61626.052077] Call Trace:

Oct 5 02:21:23 pve-node2 kernel: [61626.052086] __schedule+0x233/0x6f0

Oct 5 02:21:23 pve-node2 kernel: [61626.052088] schedule+0x36/0x80

Oct 5 02:21:23 pve-node2 kernel: [61626.052089] rwsem_down_read_failed+0xfa/0x150

Oct 5 02:21:23 pve-node2 kernel: [61626.052094] call_rwsem_down_read_failed+0x18/0x30

Oct 5 02:21:23 pve-node2 kernel: [61626.052095] down_read+0x20/0x40

Oct 5 02:21:23 pve-node2 kernel: [61626.052097] ksm_scan_thread+0x1f5/0x1350

Oct 5 02:21:23 pve-node2 kernel: [61626.052101] ? wake_atomic_t_function+0x60/0x60

Oct 5 02:21:23 pve-node2 kernel: [61626.052104] kthread+0x109/0x140

Oct 5 02:21:23 pve-node2 kernel: [61626.052105] ? try_to_merge_with_ksm_page+0xa0/0xa0

Oct 5 02:21:23 pve-node2 kernel: [61626.052106] ? kthread_create_on_node+0x60/0x60

Oct 5 02:21:23 pve-node2 kernel: [61626.052109] ret_from_fork+0x2c/0x40

Oct 5 02:21:23 pve-node2 kernel: [61626.052203] khugepaged D 0 113 2 0x00000000

Oct 5 02:21:23 pve-node2 kernel: [61626.052204] Call Trace:

Oct 5 02:21:23 pve-node2 kernel: [61626.052206] __schedule+0x233/0x6f0

Oct 5 02:21:23 pve-node2 kernel: [61626.052208] schedule+0x36/0x80

Oct 5 02:21:23 pve-node2 kernel: [61626.052209] rwsem_down_read_failed+0xfa/0x150

Oct 5 02:21:23 pve-node2 kernel: [61626.052211] call_rwsem_down_read_failed+0x18/0x30

Oct 5 02:21:23 pve-node2 kernel: [61626.052212] down_read+0x20/0x40

Oct 5 02:21:23 pve-node2 kernel: [61626.052215] khugepaged+0x45c/0x2030

Oct 5 02:21:23 pve-node2 kernel: [61626.052217] ? wake_atomic_t_function+0x60/0x60

Oct 5 02:21:23 pve-node2 kernel: [61626.052219] kthread+0x109/0x140

Oct 5 02:21:23 pve-node2 kernel: [61626.052220] ? collapse_shmem+0xc20/0xc20

Oct 5 02:21:23 pve-node2 kernel: [61626.052221] ? kthread_create_on_node+0x60/0x60

Oct 5 02:21:23 pve-node2 kernel: [61626.052223] ret_from_fork+0x2c/0x40

Oct 5 02:21:23 pve-node2 kernel: [61626.052317] kswapd0 D 0 141 2 0x00000000






Oct 5 02:21:23 pve-node2 kernel: [61626.052318] Call Trace:

Oct 5 02:21:23 pve-node2 kernel: [61626.052320] __schedule+0x233/0x6f0

Oct 5 02:21:23 pve-node2 kernel: [61626.052322] schedule+0x36/0x80

Oct 5 02:21:23 pve-node2 kernel: [61626.052323] schedule_timeout+0x22a/0x3f0

Oct 5 02:21:23 pve-node2 kernel: [61626.052367] ? zio_taskq_member.isra.8.constprop.17+0x70/0x70 [zfs]

Oct 5 02:21:23 pve-node2 kernel: [61626.052370] ? ktime_get+0x41/0xb0

Oct 5 02:21:23 pve-node2 kernel: [61626.052372] io_schedule_timeout+0xa4/0x110

Oct 5 02:21:23 pve-node2 kernel: [61626.052380] cv_wait_common+0xbc/0x140 [spl]

Oct 5 02:21:23 pve-node2 kernel: [61626.052382] ? wake_atomic_t_function+0x60/0x60

Oct 5 02:21:23 pve-node2 kernel: [61626.052386] __cv_wait_io+0x18/0x20 [spl]

Oct 5 02:21:23 pve-node2 kernel: [61626.052415] zio_wait+0xc4/0x160 [zfs]

Oct 5 02:21:23 pve-node2 kernel: [61626.052443] zil_commit.part.13+0x3f8/0x7f0 [zfs]

Oct 5 02:21:23 pve-node2 kernel: [61626.052472] zil_commit+0x17/0x20 [zfs]

Oct 5 02:21:23 pve-node2 kernel: [61626.052502] zvol_request+0x37c/0x680 [zfs]

Oct 5 02:21:23 pve-node2 kernel: [61626.052506] ? SyS_madvise+0x8c0/0x8c0

Oct 5 02:21:23 pve-node2 kernel: [61626.052508] generic_make_request+0x110/0x2d0

Oct 5 02:21:23 pve-node2 kernel: [61626.052510] submit_bio+0x73/0x150

Oct 5 02:21:23 pve-node2 kernel: [61626.052511] ? SyS_madvise+0x8c0/0x8c0

Oct 5 02:21:23 pve-node2 kernel: [61626.052512] ? map_swap_page+0x12/0x20

Oct 5 02:21:23 pve-node2 kernel: [61626.052514] __swap_writepage+0x2be/0x310

Oct 5 02:21:23 pve-node2 kernel: [61626.052515] ? __frontswap_store+0x6d/0xf0

Oct 5 02:21:23 pve-node2 kernel: [61626.052516] swap_writepage+0x34/0x90

Oct 5 02:21:23 pve-node2 kernel: [61626.052520] pageout.isra.43+0x189/0x2a0

Oct 5 02:21:23 pve-node2 kernel: [61626.052521] shrink_page_list+0x833/0xa00

Oct 5 02:21:23 pve-node2 kernel: [61626.052523] shrink_inactive_list+0x231/0x560

Oct 5 02:21:23 pve-node2 kernel: [61626.052525] shrink_node_memcg+0x5fb/0x7d0

Oct 5 02:21:23 pve-node2 kernel: [61626.052529] ? css_next_descendant_pre+0x4c/0x60

Oct 5 02:21:23 pve-node2 kernel: [61626.052531] ? css_next_descendant_pre+0x4c/0x60

Oct 5 02:21:23 pve-node2 kernel: [61626.052532] shrink_node+0xe1/0x320

Oct 5 02:21:23 pve-node2 kernel: [61626.052534] kswapd+0x2f6/0x710

Oct 5 02:21:23 pve-node2 kernel: [61626.052535] kthread+0x109/0x140

Oct 5 02:21:23 pve-node2 kernel: [61626.052537] ? mem_cgroup_shrink_node+0x170/0x170

Oct 5 02:21:23 pve-node2 kernel: [61626.052538] ? kthread_create_on_node+0x60/0x60

Oct 5 02:21:23 pve-node2 kernel: [61626.052540] ret_from_fork+0x2c/0x40

Oct 5 02:21:23 pve-node2 kernel: [61626.052632] kswapd1 D 0 142 2 0x00000000



Oct 5 02:21:23 pve-node2 kernel: [61626.054362] RIP: 0033:0x7fbdb9f1fe07

Oct 5 02:21:23 pve-node2 kernel: [61626.054363] RSP: 002b:00007fbdaabfc538 EFLAGS: 00000246 ORIG_RAX: 0000000000000010

Oct 5 02:21:23 pve-node2 kernel: [61626.054364] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007fbdb9f1fe07

Oct 5 02:21:23 pve-node2 kernel: [61626.054365] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000001a

Oct 5 02:21:23 pve-node2 kernel: [61626.054366] RBP: 00007fbdae5ee000 R08: 00005608670a2c50 R09: 00000000000000ff

Oct 5 02:21:23 pve-node2 kernel: [61626.054367] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000

Oct 5 02:21:23 pve-node2 kernel: [61626.054367] R13: 00007fbdd21ea000 R14: 0000000000000000 R15: 00007fbdae5ee000

Oct 5 02:21:23 pve-node2 kernel: [61626.057084] kvm D 0 13154 1 0x00000000

Oct 5 02:21:23 pve-node2 kernel: [61626.057086] Call Trace:

Oct 5 02:21:23 pve-node2 kernel: [61626.057088] __schedule+0x233/0x6f0

Oct 5 02:21:23 pve-node2 kernel: [61626.057101] ? kvm_sched_in+0x39/0x40 [kvm]

Oct 5 02:21:23 pve-node2 kernel: [61626.057103] schedule+0x36/0x80

Oct 5 02:21:23 pve-node2 kernel: [61626.057104] rwsem_down_read_failed+0xfa/0x150

Oct 5 02:21:23 pve-node2 kernel: [61626.057105] ? schedule+0x36/0x80

Oct 5 02:21:23 pve-node2 kernel: [61626.057107] call_rwsem_down_read_failed+0x18/0x30

Oct 5 02:21:23 pve-node2 kernel: [61626.057108] down_read+0x20/0x40

Oct 5 02:21:23 pve-node2 kernel: [61626.057119] kvm_host_page_size+0x4d/0xa0 [kvm]

Oct 5 02:21:23 pve-node2 kernel: [61626.057133] mapping_level+0x5f/0x110 [kvm]

Oct 5 02:21:23 pve-node2 kernel: [61626.057147] tdp_page_fault+0xb0/0x280 [kvm]

Oct 5 02:21:23 pve-node2 kernel: [61626.057161] kvm_mmu_page_fault+0x60/0x120 [kvm]

Oct 5 02:21:23 pve-node2 kernel: [61626.057164] handle_ept_violation+0x91/0x170 [kvm_intel]

Oct 5 02:21:23 pve-node2 kernel: [61626.057167] vmx_handle_exit+0x1ca/0x1400 [kvm_intel]

Oct 5 02:21:23 pve-node2 kernel: [61626.057171] ? atomic_switch_perf_msrs+0x6f/0xa0 [kvm_intel]

Oct 5 02:21:23 pve-node2 kernel: [61626.057174] ? vmx_vcpu_run+0x2d1/0x490 [kvm_intel]

Oct 5 02:21:23 pve-node2 kernel: [61626.057187] kvm_arch_vcpu_ioctl_run+0x7bd/0x15d0 [kvm]

Oct 5 02:21:23 pve-node2 kernel: [61626.057189] ? update_load_avg+0x6b/0x510

Oct 5 02:21:23 pve-node2 kernel: [61626.057200] kvm_vcpu_ioctl+0x339/0x620 [kvm]

Oct 5 02:21:23 pve-node2 kernel: [61626.057202] ? pick_next_task_fair+0x47a/0x4b0

Oct 5 02:21:23 pve-node2 kernel: [61626.057203] ? __switch_to+0x23c/0x520

Oct 5 02:21:23 pve-node2 kernel: [61626.057204] do_vfs_ioctl+0xa3/0x610

Oct 5 02:21:23 pve-node2 kernel: [61626.057206] ? __schedule+0x23b/0x6f0

Oct 5 02:21:23 pve-node2 kernel: [61626.057207] SyS_ioctl+0x79/0x90

Oct 5 02:21:23 pve-node2 kernel: [61626.057209] entry_SYSCALL_64_fastpath+0x1e/0xad

Oct 5 02:21:23 pve-node2 kernel: [61626.057210] RIP: 0033:0x7fbdb9f1fe07

Oct 5 02:21:23 pve-node2 kernel: [61626.057210] RSP: 002b:00007fbda9bfc538 EFLAGS: 00000246 ORIG_RAX: 0000000000000010

Oct 5 02:21:23 pve-node2 kernel: [61626.057212] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fbdb9f1fe07

Oct 5 02:21:23 pve-node2 kernel: [61626.057213] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000001b
 
Please use the spoiler feature (under the small "+" in the text field) when posting log entries in the forum. That way the post stays readable.
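A long trace like the one above can also be condensed before posting. The following is only a rough sketch, assuming the syslog line layout seen in this thread (timestamp, host, `kernel:`, uptime in brackets, then task name, state `D`, stack, PID, parent PID, flags); the function name `blocked_tasks` and the regular expression are made up for illustration and may need adjusting for other kernel versions.

```python
import re
from collections import Counter

# Matches the header the kernel prints for a blocked task, e.g.
# "... kernel: [61626.052075] ksmd D 0 112 2 0x00000000"
TASK_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]+)\s+\S+\s+kernel:\s+\[\s*[\d.]+\]\s+"
    r"(?P<comm>\S+)\s+D\s+\d+\s+(?P<pid>\d+)\s+\d+\s+0x[0-9a-fA-F]+\s*$"
)

def blocked_tasks(lines):
    """Return (timestamp, command, pid) for every task reported in state D."""
    found = []
    for line in lines:
        m = TASK_RE.match(line.strip())
        if m:
            found.append((m.group("stamp"), m.group("comm"), int(m.group("pid"))))
    return found

# A few lines copied from the log in this thread.
sample = [
    "Oct 2 22:57:25 pve-node2 kernel: [195745.304438] pvesr D 0 12406 1 0x00000000",
    "Oct 2 22:57:25 pve-node2 kernel: [195745.304440] Call Trace:",
    "Oct 5 02:21:23 pve-node2 kernel: [61626.052075] ksmd D 0 112 2 0x00000000",
    "Oct 5 02:21:23 pve-node2 kernel: [61626.052317] kswapd0 D 0 141 2 0x00000000",
]

tasks = blocked_tasks(sample)
counts = Counter(comm for _, comm, _ in tasks)
for stamp, comm, pid in tasks:
    print(f"{stamp}  {comm} (pid {pid}) blocked")
```

In practice the lines would come from the log file instead of the `sample` list, for example `blocked_tasks(open("/var/log/messages"))`. The resulting list of blocked task names plus one full trace is usually enough for a first post.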

What does the performance of the PVE server during the backup look like? Where do you write the backup to?
 
How much memory (total) does the server have? And how much memory is allocated to the VM?
 
How much disk space do you use currently? As a rule of thumb, ZFS needs about 1 GB of RAM for every 1 TB of storage, and some more if L2ARC is set up. Also don't forget that it needs some RAM for its cache too.
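The rule of thumb above can be turned into a quick memory budget. This is only a back-of-the-envelope sketch: the VM memory of 23552 MB is taken from the "update VM 101" line in the posted log, while the 32 GB of host RAM and the 2 TB pool size are placeholder assumptions, since neither value has been given in this thread. The function names are hypothetical, and the estimate deliberately ignores cache and L2ARC overhead.

```python
def zfs_ram_rule_of_thumb_gb(pool_tb):
    """Rule of thumb quoted above: about 1 GB of RAM per 1 TB of pool."""
    return pool_tb * 1.0

def host_headroom_gb(total_ram_gb, vm_ram_gb, pool_tb):
    """RAM left for the host after the VM and the ZFS rule-of-thumb share."""
    return total_ram_gb - vm_ram_gb - zfs_ram_rule_of_thumb_gb(pool_tb)

# From the log: "update VM 101: -delete shares -balloon 0 -memory 23552" (MB).
vm_ram_gb = 23552 / 1024

# ASSUMPTIONS: total host RAM and pool size are placeholders, not from the thread.
headroom = host_headroom_gb(total_ram_gb=32, vm_ram_gb=vm_ram_gb, pool_tb=2)
print(f"VM: {vm_ram_gb:.1f} GB, headroom for host and cache: {headroom:.1f} GB")
```

If the headroom comes out small or negative with the real numbers, the host is likely to start swapping under backup load, which would fit the kswapd traces in the log.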
 
