BUG: Bad page state in process

hvisage

Renowned Member
May 21, 2013
Good day,

During a period of sustained high network activity (a 1GbE port kept fully utilized for about 10-20 minutes), combined with heavy reads and writes to the disks (all network and disk activity was to/from the VM running a torrent client), the system became "unreachable". The dmesg output shows:

Code:
root@prsurf:/var/log# dmesg | grep -i BUG
[1154898.042101] BUG: Bad page state in process kvm  pfn:532ffd
[1154898.073382] BUG: Bad page state in process kvm  pfn:c04f85
[1154906.191530] BUG: Bad page state in process kworker/u16:3  pfn:fb4254
[1154906.411013] BUG: Bad page state in process txg_sync  pfn:31a892
[1154919.614854] BUG: Bad page state in process z_wr_iss  pfn:344db8
[1154954.569705] BUG: Bad page state in process kvm  pfn:bbafc7
[1154955.844705] BUG: Bad page state in process z_wr_iss  pfn:c68439
[1154976.975976] BUG: Bad page state in process spl_kmem_cache  pfn:cc1d41
root@prsurf:/var/log#
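
To see at a glance which processes were hit, the BUG lines can be tallied per process name. This is only a minimal sketch: the helper name `summarize_bad_pages` is hypothetical, and it assumes the message format shown in the output above (the process name is the word right after "process").

```shell
# Tally "Bad page state" reports per process name.
# Reads dmesg-style text on stdin, so it works on live output or a saved file.
# NOTE: summarize_bad_pages is an illustrative helper, not an existing tool.
summarize_bad_pages() {
    awk '/BUG: Bad page state in process/ {
        # The process name is the field right after the word "process".
        for (i = 1; i <= NF; i++) {
            if ($i == "process") { count[$(i + 1)]++; break }
        }
    }
    END { for (p in count) printf "%d %s\n", count[p], p }' |
        sort -k1,1nr -k2,2   # highest count first, then by name
}

# Usage:
#   dmesg | summarize_bad_pages
#   summarize_bad_pages < dmesg.txt
```

A spread over several unrelated processes (kvm, ZFS threads, kworker), as in the output above, suggests the problem is not specific to one subsystem.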

Attached dmesg & pvereport output
 

Attachments

  • dmesg.txt (260.8 KB)
  • pvereport.txt (42 KB)
but the problems already started earlier (and presumably unrelated to ZFS), when the kernel failed to allocate pages. IMHO this sounds like a memory pressure / memory fragmentation issue.
 
Hmmm...
How much earlier, or can't you tell?
Looking at my RRD graphs (attached: Screenshot 2017-09-06 12.19.17.png, Screenshot 2017-09-06 12.18.32.png), might that have been caused by the backup schedules (those 02:00 peaks)?

The question then: is it a bug, or "normal", and what could be done to alleviate/prevent it in future?
 
The first entries of your posted log (which are ~70s before the first BUG entry; I do not know what happened before those first entries ;)).

Posting a full log from 2017-09-06, 09:00 to 10:30, might shed some light on the issue:
Code:
journalctl --since "2017-09-06 09:00" --until "2017-09-06 10:30"
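
To narrow that window down to the memory-related kernel messages, the output could be filtered before reading it in full. A rough sketch only: the helper name `filter_mem_events` is hypothetical, and the "page allocation failure" pattern is an assumption about how the failed allocations were logged, so adjust it to whatever the log actually contains.

```shell
# Keep only lines that look like allocation failures or bad-page reports.
# NOTE: filter_mem_events is an illustrative helper; the first pattern is an
# assumption about the kernel's wording for the failed page allocations.
filter_mem_events() {
    grep -E 'page allocation failure|BUG: Bad page state'
}

# Usage (-k limits the output to kernel messages of the current boot):
#   journalctl -k --since "2017-09-06 09:00" --until "2017-09-06 10:30" | filter_mem_events
```

The timestamp of the first match shows how long before the first BUG entry the allocation problems began.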
 

See attached.
 

Attachments

  • journalctl-0900-1030.txt (495.5 KB)
