Proxmox disconnects from network daily; rebooting necessary for a couple of hours

rene415

Member
Jun 29, 2021
Hello,

I am having some issues with my server. It happened to me a while ago with another build that I had, but this time I really haven't done what I did to the last one.

The problem is that my server keeps disconnecting from the network, and I notice that the blinking HDD light on my case goes out. When I try to access my server, it keeps disconnecting. I usually have to restart the server, but I don't know why. It usually happens late at night when I go to sleep, or sometimes just an hour after booting back up. I know that it is bad to restart the server from time to time like that, but I have no other way of accessing the server.

One thing that I noticed is that when I restart the machine and check for updates, there is a new update. I install it and it lasts for a few days until I have to restart and then install new updates again.

I am not sure what it could be. I checked the syslogs and I do see several things that pop up in the last minutes of the connection.


Such as:
"entered blocking state"
"kernel: RIP: 0010:lock_page_memcg+0x23/0xa0"
"kernel: RIP: 0033:0x7f9226d95699"

I just don't know how to deal with it.


Any suggestions or ideas would be greatly appreciated.
 

Attachments

  • Proxmox.png
Such as:
"entered blocking state"
"kernel: RIP: 0010:lock_page_memcg+0x23/0xa0"
"kernel: RIP: 0033:0x7f9226d95699"
please post the complete trace and a few lines around it - then we might have a better idea where the issue is
(please post as text inside code tags)
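
One way to pull the complete trace plus a few surrounding lines out of a large syslog is a small filter like the sketch below. It is only an illustration: the marker list, the function name extract_context, and the default window sizes are assumptions and not something PVE ships.

```python
import re

# Strings that usually open or belong to a kernel problem report.
# This list is an assumption; extend it to match what your log shows.
MARKERS = re.compile(r"kernel: (BUG:|Oops|Call Trace:|RIP:)")


def extract_context(lines, before=5, after=40):
    """Return every line within `before`/`after` lines of a marker match.

    Overlapping windows are merged and the original order is kept,
    so the result can be pasted as one block.
    """
    keep = set()
    for i, line in enumerate(lines):
        if MARKERS.search(line):
            start = max(0, i - before)
            end = min(len(lines), i + after + 1)
            keep.update(range(start, end))
    return [lines[i] for i in sorted(keep)]
```

Reading the syslog file into a list of lines (on many installs that is /var/log/syslog, which is an assumption about this setup), passing it to extract_context, and pasting the result inside code tags should give roughly the context asked for here.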
 
Here are the logs that I keep getting. I'm not sure where it starts; I kept seeing kernel issues, so I think it starts from there.



Code:
May 28 08:51:33 prometheus kernel: BUG: Bad page state in process swapper/2  pfn:18aa29
May 28 08:51:33 prometheus kernel: page:00000000492de447 refcount:2 mapcount:1 mapping:00000000e9f6a074 index:0x9 pfn:0x18aa29
May 28 08:51:33 prometheus kernel: memcg:ffff98eca0ece000
May 28 08:51:33 prometheus kernel: aops:ext4_da_aops ino:6090b dentry name:"netstandard.dll"
May 28 08:51:33 prometheus kernel: flags: 0x17ffffc0020016(referenced|uptodate|lru|mappedtodisk|node=0|zone=2|lastcpupid=0x1fffff)
May 28 08:51:33 prometheus kernel: raw: 0017ffffc0020016 dead000000000100 dead000000000122 ffff98ecc0926308
May 28 08:51:33 prometheus kernel: raw: 0000000000000009 0000000000000000 0000000200000000 ffff98eca0ece000
May 28 08:51:33 prometheus kernel: page dumped because: page still charged to cgroup
May 28 08:51:33 prometheus kernel: Modules linked in: tcp_diag inet_diag nft_limit xt_LOG nf_log_syslog xt_limit xt_comment xt_tcpudp nft_chain_nat xt_MASQUERADE nf_nat xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nft_counter binfmt_misc veth nls_utf8 cifs cifs_arc4 cifs_md4 fscache netfs nf_tables bonding tls softdog nfnetlink_log nfnetlink snd_hda_codec_hdmi intel_rapl_msr intel_rapl_common intel_tcc_cooling x86_pkg_temp_thermal snd_hda_codec_realtek intel_powerclamp snd_hda_codec_generic coretemp ledtrig_audio mei_hdcp kvm_intel kvm irqbypass crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi rapl intel_cstate i915 pcspkr snd_hda_codec intel_wmi_thunderbolt efi_pstore snd_hda_core snd_hwdep ttm snd_pcm mxm_wmi snd_timer drm_kms_helper snd soundcore ee1004 cec rc_core i2c_algo_bit mei_me fb_sys_fops syscopyarea sysfillrect sysimgblt mei intel_pch_thermal mac_hid zfs(PO) acpi_pad zunicode(PO) zzstd(O) zlua(O)
May 28 08:51:33 prometheus kernel:  zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi drm sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic xor zstd_compress ses enclosure scsi_transport_sas raid6_pq libcrc32c simplefb uas usb_storage crc32_pclmul i2c_i801 i2c_smbus xhci_pci ahci alx xhci_pci_renesas mdio xhci_hcd libahci wmi video
May 28 08:51:33 prometheus kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: P           O      5.15.35-1-pve #1
May 28 08:51:33 prometheus kernel: Hardware name: MSI MS-7977/Z170A-G45 GAMING (MS-7977), BIOS 2.D0 07/02/2018
May 28 08:51:33 prometheus kernel: Call Trace:
May 28 08:51:33 prometheus kernel:  <IRQ>
May 28 08:51:33 prometheus kernel:  dump_stack_lvl+0x4a/0x5f
May 28 08:51:33 prometheus kernel:  dump_stack+0x10/0x12
May 28 08:51:33 prometheus kernel:  bad_page.cold+0x63/0x94
May 28 08:51:33 prometheus kernel:  check_free_page_bad+0x66/0x70
May 28 08:51:33 prometheus kernel:  free_pcppages_bulk+0x1c3/0x390
May 28 08:51:33 prometheus kernel:  free_unref_page_commit.constprop.0+0x12b/0x170
May 28 08:51:33 prometheus kernel:  free_unref_page+0xdf/0x180
May 28 08:51:33 prometheus kernel:  __put_page+0x70/0xd0
May 28 08:51:33 prometheus kernel:  skb_release_data+0x109/0x170
May 28 08:51:33 prometheus kernel:  consume_skb+0x3b/0xb0
May 28 08:51:33 prometheus kernel:  validate_xmit_skb+0x1ea/0x360
May 28 08:51:33 prometheus kernel:  validate_xmit_skb_list+0x4d/0x70
May 28 08:51:33 prometheus kernel:  sch_direct_xmit+0x145/0x390
May 28 08:51:33 prometheus kernel:  __qdisc_run+0x15d/0x5b0
May 28 08:51:33 prometheus kernel:  net_tx_action+0x11a/0x290
May 28 08:51:33 prometheus kernel:  __do_softirq+0xd9/0x2e6
May 28 08:51:33 prometheus kernel:  irq_exit_rcu+0x8c/0xb0
May 28 08:51:33 prometheus kernel:  common_interrupt+0x8a/0xa0
May 28 08:51:33 prometheus kernel:  </IRQ>
May 28 08:51:33 prometheus kernel:  <TASK>
May 28 08:51:33 prometheus kernel:  asm_common_interrupt+0x1e/0x40
May 28 08:51:33 prometheus kernel: RIP: 0010:cpuidle_enter_state+0xd9/0x620
May 28 08:51:33 prometheus kernel: Code: 3d 24 54 83 75 e8 d7 8c 71 ff 49 89 c7 0f 1f 44 00 00 31 ff e8 38 99 71 ff 80 7d d0 00 0f 85 5a 01 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 66 01 00 00 4d 63 ee 49 83 fd 09 0f 87 e1 03 00 00
May 28 08:51:33 prometheus kernel: RSP: 0018:ffffbf4f800e7e38 EFLAGS: 00000246
May 28 08:51:33 prometheus kernel: RAX: ffff98f196d30b40 RBX: ffffdf4f7fd00000 RCX: 0000000000000000
May 28 08:51:33 prometheus kernel: RDX: 0000000000000315 RSI: 00000000248799d1 RDI: 0000000000000000
May 28 08:51:33 prometheus kernel: RBP: ffffbf4f800e7e88 R08: 00000624c5190d79 R09: 0000000000030d40
May 28 08:51:33 prometheus kernel: R10: 0000000000000007 R11: 071c71c71c71c71c R12: ffffffff8bcd3a80
May 28 08:51:33 prometheus kernel: R13: 0000000000000003 R14: 0000000000000003 R15: 00000624c5190d79
May 28 08:51:33 prometheus kernel:  ? cpuidle_enter_state+0xc8/0x620
May 28 08:51:33 prometheus kernel:  cpuidle_enter+0x2e/0x40
May 28 08:51:33 prometheus kernel:  do_idle+0x209/0x2b0
May 28 08:51:33 prometheus kernel:  cpu_startup_entry+0x20/0x30
May 28 08:51:33 prometheus kernel:  start_secondary+0x12a/0x180
May 28 08:51:33 prometheus kernel:  secondary_startup_64_no_verify+0xc2/0xcb
May 28 08:51:33 prometheus kernel:  </TASK>
May 28 08:51:33 prometheus kernel: BUG: Bad page state in process swapper/2  pfn:18aa2a
May 28 08:51:33 prometheus kernel: page:00000000d0fb933c refcount:2 mapcount:1 mapping:00000000e9f6a074 index:0xa pfn:0x18aa2a
May 28 08:51:33 prometheus systemd-journald[371]: Missed 402 kernel messages
May 28 08:51:33 prometheus kernel: flags: 0x17ffffc0020016(referenced|uptodate|lru|mappedtodisk|node=0|zone=2|lastcpupid=0x1fffff)
May 28 08:51:33 prometheus kernel: raw: 0017ffffc0020016 dead000000000100 dead000000000122 ffff98ecc0926308
May 28 08:51:33 prometheus kernel: raw: 0000000000000012 0000000000000000 0000000200000000 ffff98eca0ece000
May 28 08:51:33 prometheus kernel: page dumped because: page still charged to cgroup
May 28 08:51:33 prometheus kernel: Modules linked in: tcp_diag inet_diag nft_limit xt_LOG nf_log_syslog xt_limit xt_comment xt_tcpudp nft_chain_nat xt_MASQUERADE nf_nat xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nft_counter binfmt_misc veth nls_utf8 cifs cifs_arc4 cifs_md4 fscache netfs nf_tables bonding tls softdog nfnetlink_log nfnetlink snd_hda_codec_hdmi intel_rapl_msr intel_rapl_common intel_tcc_cooling x86_pkg_temp_thermal snd_hda_codec_realtek intel_powerclamp snd_hda_codec_generic coretemp ledtrig_audio mei_hdcp kvm_intel kvm irqbypass crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi rapl intel_cstate i915 pcspkr snd_hda_codec intel_wmi_thunderbolt efi_pstore snd_hda_core snd_hwdep ttm snd_pcm mxm_wmi snd_timer drm_kms_helper snd soundcore ee1004 cec rc_core i2c_algo_bit mei_me fb_sys_fops syscopyarea sysfillrect sysimgblt mei intel_pch_thermal mac_hid zfs(PO) acpi_pad zunicode(PO) zzstd(O) zlua(O)
May 28 08:51:33 prometheus kernel:  zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi drm sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic xor zstd_compress ses enclosure scsi_transport_sas raid6_pq libcrc32c simplefb uas usb_storage crc32_pclmul i2c_i801 i2c_smbus xhci_pci ahci alx xhci_pci_renesas mdio xhci_hcd libahci wmi video
May 28 08:51:33 prometheus kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: P    B      O      5.15.35-1-pve #1
May 28 08:51:33 prometheus kernel: Hardware name: MSI MS-7977/Z170A-G45 GAMING (MS-7977), BIOS 2.D0 07/02/2018
May 28 08:51:33 prometheus kernel: Call Trace:
May 28 08:51:33 prometheus kernel:  <IRQ>
May 28 08:51:33 prometheus kernel:  dump_stack_lvl+0x4a/0x5f
May 28 08:51:33 prometheus kernel:  dump_stack+0x10/0x12
May 28 08:51:33 prometheus kernel:  bad_page.cold+0x63/0x94
May 28 08:51:33 prometheus kernel:  check_free_page_bad+0x66/0x70
May 28 08:51:33 prometheus kernel:  free_pcppages_bulk+0x1c3/0x390
May 28 08:51:33 prometheus kernel:  free_unref_page_commit.constprop.0+0x12b/0x170
May 28 08:51:33 prometheus kernel:  free_unref_page+0xdf/0x180
May 28 08:51:33 prometheus kernel:  __put_page+0x70/0xd0
May 28 08:51:33 prometheus kernel:  skb_release_data+0x109/0x170
May 28 08:51:33 prometheus kernel:  consume_skb+0x3b/0xb0
May 28 08:51:33 prometheus kernel:  validate_xmit_skb+0x1ea/0x360
May 28 08:51:33 prometheus kernel:  validate_xmit_skb_list+0x4d/0x70
May 28 08:51:33 prometheus kernel:  sch_direct_xmit+0x145/0x390
May 28 08:51:33 prometheus kernel:  __qdisc_run+0x15d/0x5b0
May 28 08:51:33 prometheus kernel:  net_tx_action+0x11a/0x290
May 28 08:51:33 prometheus kernel:  __do_softirq+0xd9/0x2e6
May 28 08:51:33 prometheus kernel:  irq_exit_rcu+0x8c/0xb0
May 28 08:51:33 prometheus kernel:  common_interrupt+0x8a/0xa0
May 28 08:51:33 prometheus kernel:  </IRQ>
May 28 08:51:33 prometheus kernel:  <TASK>
May 28 08:51:33 prometheus kernel:  asm_common_interrupt+0x1e/0x40
May 28 08:51:33 prometheus kernel: RIP: 0010:cpuidle_enter_state+0xd9/0x620
May 28 08:51:33 prometheus kernel: Code: 3d 24 54 83 75 e8 d7 8c 71 ff 49 89 c7 0f 1f 44 00 00 31 ff e8 38 99 71 ff 80 7d d0 00 0f 85 5a 01 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 66 01 00 00 4d 63 ee 49 83 fd 09 0f 87 e1 03 00 00
May 28 08:51:33 prometheus kernel: RSP: 0018:ffffbf4f800e7e38 EFLAGS: 00000246
May 28 08:51:33 prometheus kernel: RAX: ffff98f196d30b40 RBX: ffffdf4f7fd00000 RCX: 0000000000000000
May 28 08:51:33 prometheus kernel: RDX: 0000000000000315 RSI: 00000000248799d1 RDI: 0000000000000000
May 28 08:51:33 prometheus kernel: RBP: ffffbf4f800e7e88 R08: 00000624c5190d79 R09: 0000000000030d40
May 28 08:51:33 prometheus kernel: R10: 0000000000000007 R11: 071c71c71c71c71c R12: ffffffff8bcd3a80
May 28 08:51:33 prometheus kernel: R13: 0000000000000003 R14: 0000000000000003 R15: 00000624c5190d79
May 28 08:51:33 prometheus kernel:  ? cpuidle_enter_state+0xc8/0x620
May 28 08:51:33 prometheus kernel:  cpuidle_enter+0x2e/0x40
May 28 08:51:33 prometheus kernel:  do_idle+0x209/0x2b0
May 28 08:51:33 prometheus kernel:  cpu_startup_entry+0x20/0x30
May 28 08:51:33 prometheus kernel:  start_secondary+0x12a/0x180
May 28 08:51:33 prometheus kernel:  secondary_startup_64_no_verify+0xc2/0xcb
May 28 08:51:33 prometheus kernel:  </TASK>
May 28 08:51:33 prometheus kernel: BUG: Bad page state in process swapper/2  pfn:18aa33
May 28 08:51:33 prometheus kernel: page:000000005c6b5a23 refcount:2 mapcount:1 mapping:00000000e9f6a074 index:0x13 pfn:0x18aa33
May 28 08:51:33 prometheus kernel: memcg:ffff98eca0ece000
May 28 08:51:33 prometheus kernel: aops:ext4_da_aops ino:6090b dentry name:"netstandard.dll"
May 28 08:51:33 prometheus kernel: flags: 0x17ffffc0020016(referenced|uptodate|lru|mappedtodisk|node=0|zone=2|lastcpupid=0x1fffff)
May 28 08:51:33 prometheus kernel: raw: 0017ffffc0020016 dead000000000100 dead000000000122 ffff98ecc0926308
May 28 08:51:33 prometheus kernel: raw: 0000000000000013 0000000000000000 0000000200000000 ffff98eca0ece000
May 28 08:51:33 prometheus kernel: page dumped because: page still charged to cgroup
May 28 08:51:33 prometheus kernel: Modules linked in: tcp_diag inet_diag nft_limit xt_LOG nf_log_syslog xt_limit xt_comment xt_tcpudp nft_chain_nat xt_MASQUERADE nf_nat xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nft_counter binfmt_misc veth nls_utf8 cifs cifs_arc4 cifs_md4 fscache netfs nf_tables bonding tls softdog nfnetlink_log nfnetlink snd_hda_codec_hdmi intel_rapl_msr intel_rapl_common intel_tcc_cooling x86_pkg_temp_thermal snd_hda_codec_realtek intel_powerclamp snd_hda_codec_generic coretemp ledtrig_audio mei_hdcp kvm_intel kvm irqbypass crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi rapl intel_cstate i915 pcspkr snd_hda_codec intel_wmi_thunderbolt efi_pstore snd_hda_core snd_hwdep ttm snd_pcm mxm_wmi snd_timer drm_kms_helper snd soundcore ee1004 cec rc_core i2c_algo_bit mei_me fb_sys_fops syscopyarea sysfillrect sysimgblt mei intel_pch_thermal mac_hid zfs(PO) acpi_pad zunicode(PO) zzstd(O) zlua(O)
May 28 08:51:33 prometheus kernel:  zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi drm sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic xor zstd_compress ses enclosure scsi_transport_sas raid6_pq libcrc32c simplefb uas usb_storage crc32_pclmul i2c_i801 i2c_smbus xhci_pci ahci alx xhci_pci_renesas mdio xhci_hcd libahci wmi video
May 28 08:51:33 prometheus kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: P    B      O      5.15.35-1-pve #1
May 28 08:51:33 prometheus kernel: Hardware name: MSI MS-7977/Z170A-G45 GAMING (MS-7977), BIOS 2.D0 07/02/2018
May 28 08:51:33 prometheus kernel: Call Trace:
May 28 08:51:33 prometheus kernel:  <IRQ>
 
Hmm - looks unfamiliar ...

* do you have containers (pct/lxc) running on that machine - if yes - do you have any non-default configs enabled for those containers?
* have you made any modifications to the configuration - compared to a plain PVE system?

else:
* the BIOS is from 2018 - I'd recommend to upgrade it - since these things can be caused by an outdated BIOS (+new kernel)
* last but not least - I'd suggest to run a memtest for an extended period - maybe it's a broken memory stick

I hope this helps!
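
Before running a long memtest, it may help to tally the "Bad page state" reports across the whole log. In the posted excerpt every report gives the same reason ("page still charged to cgroup") and, where a file is named, the same file ("netstandard.dll"); the sketch below checks whether that holds for the full log. The function name summarize_bad_pages and the returned dictionary layout are made up for illustration, and the regular expressions only assume the line format visible in the excerpt.

```python
import re
from collections import Counter

# Patterns modelled on the lines in the posted log excerpt.
BUG_RE = re.compile(r"BUG: Bad page state in process (\S+)\s+pfn:([0-9a-f]+)")
REASON_RE = re.compile(r"page dumped because: (.+)$")
NAME_RE = re.compile(r'dentry name:"([^"]+)"')


def summarize_bad_pages(lines):
    """Count bad-page reports and group them by reason and file name."""
    pfns = []
    reasons = Counter()
    files = Counter()
    for line in lines:
        if m := BUG_RE.search(line):
            pfns.append(int(m.group(2), 16))  # pfn is printed in hex
        elif m := REASON_RE.search(line):
            reasons[m.group(1).strip()] += 1
        elif m := NAME_RE.search(line):
            files[m.group(1)] += 1
    return {
        "count": len(pfns),
        "pfn_min": min(pfns, default=None),
        "pfn_max": max(pfns, default=None),
        "reasons": dict(reasons),
        "files": dict(files),
    }
```

The summary does not identify a cause by itself. It only shows whether the reports stay confined to one file and one reason or spread across unrelated pages, which is useful context to post alongside the memtest result.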
 