BUG: Bad page state in process

hvisage

Good day,

During high network activity, with a 1GbE port kept fully utilized for about 10-20 minutes and heavy reads and writes to the disks as well (all network and disk activity going to/from the VM running a torrent client), the system became unreachable. Looking at the dmesg output:

Code:
root@prsurf:/var/log# dmesg | grep -i BUG
[1154898.042101] BUG: Bad page state in process kvm  pfn:532ffd
[1154898.073382] BUG: Bad page state in process kvm  pfn:c04f85
[1154906.191530] BUG: Bad page state in process kworker/u16:3  pfn:fb4254
[1154906.411013] BUG: Bad page state in process txg_sync  pfn:31a892
[1154919.614854] BUG: Bad page state in process z_wr_iss  pfn:344db8
[1154954.569705] BUG: Bad page state in process kvm  pfn:bbafc7
[1154955.844705] BUG: Bad page state in process z_wr_iss  pfn:c68439
[1154976.975976] BUG: Bad page state in process spl_kmem_cache  pfn:cc1d41
root@prsurf:/var/log#
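
To see at a glance which processes were hit, the grep output above can be grouped by process name. This is only a rough sketch: the function name `summarize_bad_pages` is made up for illustration, and it assumes the process name is the single word following "process" in each line (a name containing spaces would be cut short).

```shell
# Count "Bad page state" reports per process from dmesg-style text on stdin.
# Hypothetical helper, not part of any existing tool.
summarize_bad_pages() {
    awk '/BUG: Bad page state in process/ {
             # The process name is assumed to be the word right after "process".
             for (i = 1; i <= NF; i++)
                 if ($i == "process") { count[$(i + 1)]++; break }
         }
         END { for (p in count) print count[p], p }' |
        LC_ALL=C sort -k1,1nr -k2,2   # highest count first, then by name
}

# Usage: dmesg | summarize_bad_pages
```

For the eight lines above this would report kvm three times, z_wr_iss twice, and the remaining processes once each, which fits the picture of the problem not being tied to a single process.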

Attached dmesg & pvereport output
 


But the problems had already started earlier (and are presumably unrelated to ZFS), when the kernel failed to allocate pages. IMHO this sounds like a memory pressure/memory fragmentation issue.
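
One way to get a rough feel for fragmentation is /proc/buddyinfo, which lists, per memory zone, the number of free blocks of each order (block size 2^order pages). The sketch below sums the free blocks at or above a given order. The function name `high_order_free_blocks` and the default order of 4 are arbitrary choices for illustration, not a recommended threshold.

```shell
# Print "node zone blocks": free blocks of order >= MIN_ORDER per zone.
# Reads /proc/buddyinfo-format text on stdin. Hypothetical helper.
high_order_free_blocks() {
    awk -v min="${1:-4}" '
        $1 == "Node" && $3 == "zone" {
            node = $2
            sub(/,/, "", node)      # "0," -> "0"
            blocks = 0
            # Field 5 holds the order-0 count, field 6 order 1, and so on.
            for (i = 5 + min; i <= NF; i++) blocks += $i
            print node, $4, blocks
        }'
}

# Usage: high_order_free_blocks 4 < /proc/buddyinfo
```

If the counts for the higher orders are at or near zero while plenty of low-order blocks are free, that would be consistent with fragmentation rather than a plain lack of memory.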
 
Hmmm...

How much earlier, or can't you tell?
Looking at my RRD graphs (Screenshot 2017-09-06 12.19.17.png, Screenshot 2017-09-06 12.18.32.png), might the backup schedules have caused that? (See the 02:00 peaks.)

The question then: is it a bug, or "normal", and what could be done to alleviate/prevent it in future?
 
Judging by the first entries of your posted log, about 70s before the first BUG entry, but I do not know what happened before those first entries ;).

Posting a full log from 2017-09-06 09:00 to 10:30 might shed some light on the issue:
Code:
journalctl --since "2017-09-06 09:00" --until "2017-09-06 10:30"