Strange Error Messages in Log Files

Jan 9, 2012
Hi,

I have been using Proxmox for several months now, and so far it has been very stable.

But yesterday, on Sunday morning, I noticed that none of my VMs were running. I checked the Proxmox logs, and it seems the host simply rebooted in the early morning. The logs contain a few odd entries that I have never seen before:

....
Jul 15 05:34:47 proxmox kernel: gran_size: 64K chunk_size: 64K num_reg: 10 lose cover RAM: 486M
....
Jul 15 05:34:47 proxmox kernel: PM: Registered nosave memory: 000000000009a000 - 000000000009b000
....

For the full logs, see the attachments.

What is going on there?


Alex
 


Yesterday afternoon the error showed up again!
The system just rebooted, and the log shows these messages again:

...
Jul 25 15:08:56 proxmox kernel: gran_size: 64K chunk_size: 4M num_reg: 10 lose cover RAM: 486M
Jul 25 15:08:56 proxmox kernel: gran_size: 64K chunk_size: 8M num_reg: 10 lose cover RAM: 486M
Jul 25 15:08:56 proxmox kernel: gran_size: 64K chunk_size: 16M num_reg: 10 lose cover RAM: 486M
Jul 25 15:08:56 proxmox kernel: *BAD*gran_size: 64K chunk_size: 32M num_reg: 10 lose cover RAM: -26M
Jul 25 15:08:56 proxmox kernel: gran_size: 64K chunk_size: 64M num_reg: 10 lose cover RAM: 0G
Jul 25 15:08:56 proxmox kernel: gran_size: 64K chunk_size: 128M num_reg: 10 lose cover RAM: 0G
....
Jul 25 15:08:56 proxmox kernel: PM: Registered nosave memory: 00000000000a0000 - 00000000000e0000
Jul 25 15:08:56 proxmox kernel: PM: Registered nosave memory: 00000000000e0000 - 0000000000100000
Jul 25 15:08:56 proxmox kernel: PM: Registered nosave memory: 0000000020000000 - 0000000020200000
....
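For anyone comparing these lines across reboots, here is a minimal Python sketch that pulls the gran_size/chunk_size candidates out of a syslog excerpt and lists those that lose no RAM. It assumes the line layout shown above and that a "lose cover RAM" value of 0 means the combination leaves no RAM uncovered; the function names and the SAMPLE list are only illustrative.

```python
import re

# Matches the candidate lines shown above, e.g.
# "... kernel: *BAD*gran_size: 64K chunk_size: 32M num_reg: 10 lose cover RAM: -26M"
LINE_RE = re.compile(
    r"(?P<bad>\*BAD\*)?gran_size:\s*(?P<gran>\d+[KMG])\s+"
    r"chunk_size:\s*(?P<chunk>\d+[KMG])\s+"
    r"num_reg:\s*(?P<num_reg>\d+)\s+"
    r"lose cover RAM:\s*(?P<lost>-?\d+[KMG])"
)


def parse_mtrr_lines(lines):
    """Return one dict per candidate line; other lines are skipped."""
    results = []
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue
        results.append({
            "gran_size": m.group("gran"),
            "chunk_size": m.group("chunk"),
            "num_reg": int(m.group("num_reg")),
            "lost": m.group("lost"),
            "bad": m.group("bad") is not None,
        })
    return results


def lossless(candidates):
    """Keep candidates not flagged *BAD* whose reported loss is zero."""
    return [c for c in candidates
            if not c["bad"] and int(c["lost"][:-1]) == 0]


# Lines copied from the log excerpt above.
SAMPLE = [
    "Jul 25 15:08:56 proxmox kernel: gran_size: 64K chunk_size: 4M num_reg: 10 lose cover RAM: 486M",
    "Jul 25 15:08:56 proxmox kernel: gran_size: 64K chunk_size: 8M num_reg: 10 lose cover RAM: 486M",
    "Jul 25 15:08:56 proxmox kernel: gran_size: 64K chunk_size: 16M num_reg: 10 lose cover RAM: 486M",
    "Jul 25 15:08:56 proxmox kernel: *BAD*gran_size: 64K chunk_size: 32M num_reg: 10 lose cover RAM: -26M",
    "Jul 25 15:08:56 proxmox kernel: gran_size: 64K chunk_size: 64M num_reg: 10 lose cover RAM: 0G",
    "Jul 25 15:08:56 proxmox kernel: gran_size: 64K chunk_size: 128M num_reg: 10 lose cover RAM: 0G",
]

if __name__ == "__main__":
    for c in lossless(parse_mtrr_lines(SAMPLE)):
        print(c["gran_size"], c["chunk_size"], c["num_reg"])
```

To run it against a real log, pass the lines of the syslog file to parse_mtrr_lines instead of SAMPLE.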

Is this a hardware problem?


Alex
View attachment syslog.zip
 

Hello Alex,

Today I have the same problem, and the server and all the VMs are not accessible,
not even on the local machine, not even Proxmox itself.
All I could do was reset the entire machine.

What have you done so far?

My version, running on a 3ware 9750 RAID 10 with 4x 2 TB WD SATA HDDs:
root@proxmox01:~# pveversion -v
pve-manager: 2.1-1 (pve-manager/2.1/f9b0f63a)
running kernel: 2.6.32-12-pve
proxmox-ve-2.6.32: 2.1-68
pve-kernel-2.6.32-11-pve: 2.6.32-66
pve-kernel-2.6.32-12-pve: 2.6.32-68
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.3-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.8-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.7-2
pve-cluster: 1.0-26
qemu-server: 2.0-39
pve-firmware: 1.0-16
libpve-common-perl: 1.0-27
libpve-access-control: 1.0-21
libpve-storage-perl: 2.0-18
vncterm: 1.0-2
vzctl: 3.0.30-2pve5
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.0-9
ksm-control-daemon: 1.1-1


regards maxprox
 
Hi maxprox,

So far I have not done anything. The problem persists.
At irregular intervals the PC suddenly reboots, and afterwards these messages are in the log file.
Next, I will have to check my hardware.


Alex
 

Thank you.

I found this:
http://my-fuzzy-logic.de/blog/index.php?/archives/41-Solving-linux-MTRR-problems.html
First I will try the latest kernel update from Proxmox, and maybe a BIOS update.
If the problem persists, I will try the kernel options described in the link above.

EDIT: The kernel update to pve-kernel-2.6.32-14-pve: 2.6.32-74 did not help.
The kernel options "enable_mtrr_cleanup mtrr_spare_reg_nr=1" (without "mtrr_gran_size=32M mtrr_chunk_size=128M") did not help either.
I will test the BIOS update and the other two kernel options in the next few days.
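When testing options like these, it can help to confirm that they actually reached the running kernel. A minimal Python sketch follows, assuming the standard /proc/cmdline file on Linux. The helper names are made up for illustration, and the root= value in the example string is hypothetical, not taken from this system.

```python
# The four MTRR-related kernel options mentioned in this thread.
MTRR_OPTIONS = (
    "enable_mtrr_cleanup",
    "mtrr_spare_reg_nr",
    "mtrr_gran_size",
    "mtrr_chunk_size",
)


def mtrr_options(cmdline):
    """Return the MTRR options found in a kernel command line string.

    Options given as key=value map to the value string; bare flags
    such as enable_mtrr_cleanup map to True.
    """
    found = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key in MTRR_OPTIONS:
            found[key] = value if sep else True
    return found


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as fh:
        return fh.read()


if __name__ == "__main__":
    # Hypothetical command line, for illustration only.
    example = "root=/dev/mapper/pve-root ro quiet enable_mtrr_cleanup mtrr_spare_reg_nr=1"
    print(mtrr_options(example))
```

On the host itself, calling mtrr_options(read_cmdline()) after a reboot shows which of the four options are active. If the result is empty, the bootloader configuration was probably not regenerated.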


maxprox
 