[SOLVED] kern.log, syslog and messages growing too big

Fathi

Member
May 13, 2016
38
0
6
47
Tunis, Tunisia
Hi,
At first I thought the pve-no-subscription channel had some kernel debugging enabled, since my root device filled up in less than two days of intermittent use. But even dmesg lists errors instead of the usual boot and peripheral information. My logs are bloated with repeated messages, and dmesg returns many entries similar to the following:
[34256.095328] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
[34256.095333] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e0(Receiver ID)
[34256.095337] pcieport 0000:00:1c.0: device [8086:a293] error status/mask=00000001/00002000
[34256.095354] pcieport 0000:00:1c.0: [ 0] Receiver Error (First)
[34256.097803] pcieport 0000:00:1c.0: AER: Multiple Corrected error received: id=00e0
[34256.097917] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
[34256.097921] pcieport 0000:00:1c.0: device [8086:a293] error status/mask=00001100/00002000
[34256.097923] pcieport 0000:00:1c.0: [ 8] RELAY_NUM Rollover
[34256.097925] pcieport 0000:00:1c.0: [12] Replay Timer Timeout
[34256.097929] pcieport 0000:00:1c.0: AER: Multiple Corrected error received: id=00e0
[34256.098090] pcieport 0000:00:1c.0: can't find device of ID00e0
[34256.098092] pcieport 0000:00:1c.0: AER: Multiple Corrected error received: id=00e0
[34256.098151] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
[34256.098155] pcieport 0000:00:1c.0: device [8086:a293] error status/mask=00003100/00002000
[34256.098159] pcieport 0000:00:1c.0: [ 8] RELAY_NUM Rollover
[34256.098162] pcieport 0000:00:1c.0: [12] Replay Timer Timeout

I am setting up this server for an unmanaged PoC embedded on a train, which should be running before this weekend. Once the root partition filled up, no VM or container could start.

Could someone please help me debug this?

P.S.: This is on a new Dell OptiPlex 5050 with one onboard Intel NIC and one added RTL8111 NIC.
TIA.
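
To see which port and error type dominate a flood like the one quoted above, the saved dmesg text can be tallied. The following is only a minimal sketch: the function name `tally_aer_errors` and the `SAMPLE` variable are illustrative, and the pattern assumes the "PCIe Bus Error" line format shown in this post, which may differ between kernel versions.

```python
import re
from collections import Counter

# Matches the "PCIe Bus Error" summary line of an AER report, e.g.
# "pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, ..."
# Assumption: this layout is taken from the log excerpt in this thread.
AER_LINE = re.compile(
    r"pcieport (?P<port>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"PCIe Bus Error: severity=(?P<severity>[^,]+), type=(?P<layer>[^,]+),"
)


def tally_aer_errors(log_text):
    """Count AER summary lines per (port, severity, layer)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = AER_LINE.search(line)
        if match:
            key = (match["port"], match["severity"], match["layer"])
            counts[key] += 1
    return counts


# A few lines copied from the dmesg output quoted in this post.
SAMPLE = """\
[34256.095328] pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
[34256.095333] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e0(Receiver ID)
[34256.097917] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
[34256.098151] pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
"""

if __name__ == "__main__":
    for (port, severity, layer), n in tally_aer_errors(SAMPLE).most_common():
        print(f"{port} {severity} {layer}: {n}")
```

In practice the input would be the text of `dmesg` or a saved log file rather than `SAMPLE`.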
 

Stoiko Ivanov

Proxmox Staff Member
Staff member
May 2, 2018
1,847
178
63
This seems like a problem with a PCI device. Look at the output of `lspci -v` and see which device is behind `0000:00:1c.0`.
 

Fathi

I was nearly certain that the problem came from the second NIC, but `lspci -v` returned:

00:1c.0 PCI bridge: Intel Corporation Device a293 (rev f0) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 122
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
I/O behind bridge: 0000e000-0000efff
Memory behind bridge: f7000000-f70fffff
Prefetchable memory behind bridge: 00000000f0000000-00000000f00fffff
Capabilities: [40] Express Root Port (Slot+), MSI 00
Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
Capabilities: [90] Subsystem: Dell Device 07a2
Capabilities: [a0] Power Management version 3
Capabilities: [100] Advanced Error Reporting
Capabilities: [140] Access Control Services
Capabilities: [220] #19
Kernel driver in use: pcieport
Kernel modules: shpchp

The reseller added a second NIC and 8 GB of RAM to this desktop. What could be the reason? I know this is probably not Proxmox's fault, so forgive me for asking here.
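
The `Bus:` line in the lspci output above shows which bus sits behind the bridge: `secondary` is the bus directly behind it and `subordinate` is the highest bus number reachable through it. Here both are `01`, so the device to look at is on bus 01, which `lspci -s 01:` should list. As a rough sketch (the helper name `buses_behind_bridge` is made up for illustration), the range can be extracted like this:

```python
import re

# Matches the bridge's bus assignment line from `lspci -v`, e.g.
# "Bus: primary=00, secondary=01, subordinate=01, sec-latency=0"
BUS_LINE = re.compile(
    r"Bus: primary=([0-9a-f]{2}), secondary=([0-9a-f]{2}), subordinate=([0-9a-f]{2})"
)


def buses_behind_bridge(lspci_text):
    """Return the bus numbers (two-digit hex strings) behind the first bridge found."""
    match = BUS_LINE.search(lspci_text)
    if match is None:
        return []
    secondary = int(match.group(2), 16)
    subordinate = int(match.group(3), 16)
    return [format(bus, "02x") for bus in range(secondary, subordinate + 1)]


# Two lines copied from the lspci output in this post.
SAMPLE = """\
00:1c.0 PCI bridge: Intel Corporation Device a293 (rev f0) (prog-if 00 [Normal decode])
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
"""

if __name__ == "__main__":
    for bus in buses_behind_bridge(SAMPLE):
        # Print the command that lists every device on that bus.
        print(f"lspci -v -s {bus}:")
```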
 

Stoiko Ivanov


Fathi

Hi,
In the end we bought a new, branded NIC and replaced the one added by the reseller (an RTL8169 chip on an unbranded NIC), and all the problems disappeared.
It was a hardware problem. I would never have suspected a certified Dell reseller of adding an unbranded NIC to an original Dell desktop.
Thank you all.
 

Stoiko Ivanov

Glad to hear your problem is resolved. Please mark the thread as solved, since this helps other users with similar problems!
 

Stoiko Ivanov

At the top of the thread, next to the subject, there should be a "Thread Tools" menu: Thread Tools -> Edit thread -> set the Prefix to "Solved".
 
