Patch x570/Ryzen EDAC support into pve 6.1?

I was trying to monitor for ECC errors, and when I ran edac-util I got:

Code:
# edac-util -v
edac-util: Error: No memory controller data found.

Then looking at dmesg I see:
Code:
# dmesg | grep EDAC
[    0.795765] EDAC MC: Ver: 3.0.0
[   16.382549] EDAC amd64: Node 0: DRAM ECC enabled.
[   16.382553] EDAC amd64: F17h detected (node 0).
[   16.382563] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[   16.382564] EDAC amd64: Error: Error probing instance: 0
[   16.451951] EDAC amd64: Node 0: DRAM ECC enabled.
... repeated last 5 lines 21 times ...
[   17.734011] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[   17.734688] EDAC amd64: Error: Error probing instance: 0
[   17.776688] EDAC amd64: Node 0: DRAM ECC enabled.
[   17.777413] EDAC amd64: F17h detected (node 0).
[   17.778102] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[   17.778786] EDAC amd64: Error: Error probing instance: 0
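
With this many repeated probe attempts the log is hard to read. As a minimal sketch (the function name `edac_collapse` is made up for illustration), the timestamps can be stripped and identical messages counted:

```shell
# Collapse repeated kernel log lines: keep only EDAC messages, strip the
# "[ timestamp]" prefix, then count how often each distinct message occurs.
edac_collapse() {
    grep 'EDAC' \
        | sed -E 's/^\[ *[0-9]+\.[0-9]+\] *//' \
        | sort | uniq -c | sort -rn
}
```

It reads the log on stdin, so it would be called as `dmesg | edac_collapse`. The most frequent message ends up on the first line.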


Here is what is reported about the memory from dmidecode:
Code:
# dmidecode -t memory|grep -iE "ecc|width"
	Error Correction Type: Multi-bit ECC
	Total Width: 128 bits
	Data Width: 64 bits
	Total Width: 128 bits
	Data Width: 64 bits
	Total Width: 128 bits
	Data Width: 64 bits
	Total Width: 128 bits
	Data Width: 64 bits


Hardware is:
- Pro-WS-X570-ACE
- Ryzen 3900X
- SAMSUNG M391A2K43BB1-CRC 16GB (1X16GB) 2400MHZ PC4-19200 CL17 ECC
 
Not specific to X570, but I think there is an issue with the latest PVE kernels.

Code:
pveversion
pve-manager/6.1-5/9bf06119 (running kernel: 5.0.21-5-pve)

edac-util -v
mc0: 0 Uncorrected Errors with no DIMM info
mc0: 0 Corrected Errors with no DIMM info
edac-util: No errors to report.

Code:
pveversion
pve-manager/6.1-5/9bf06119 (running kernel: 5.3.10-1-pve)
edac-util -v
edac-util: Error: No memory controller data found.

Code:
pveversion
pve-manager/6.1-5/9bf06119 (running kernel: 5.3.13-1-pve)
edac-util -v
edac-util: Error: No memory controller data found.


ASUS P10S-i
Xeon 1230 v6
32 GB Kingston (KVR24E17D8/16)
 
Hrm, EDAC works for me on another server:

Code:
$ pveversion
pve-manager/6.1-5/9bf06119 (running kernel: 5.3.13-1-pve)

$ edac-util -v
mc0: 0 Uncorrected Errors with no DIMM info
mc0: 0 Corrected Errors with no DIMM info
mc1: 0 Uncorrected Errors with no DIMM info
mc1: 0 Corrected Errors with no DIMM info
edac-util: No errors to report.

Also /sys/devices/system/edac/mc/mc0 exists on that machine and does not on the affected machine.

- Intel S2600CP2J
- 2x Xeon E5-2670
- 128GB Micron DDR3 ECC 36JSF1G72PZ-1G6M1x


pveversion from the affected machine is the same:
pve-manager/6.1-5/9bf06119 (running kernel: 5.3.13-1-pve)
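
For comparing a working and an affected machine without edac-util, a rough sketch that reads the counters straight from sysfs might look like this. The function name `edac_mc_status` is hypothetical; the `ce_count` and `ue_count` files are the ones that show up in an `mc0` directory when a driver has registered a controller.

```shell
# Print CE/UE counters for every memory controller EDAC has registered.
# $1 is the sysfs directory to scan; the default is the standard location.
edac_mc_status() (
    root=${1:-/sys/devices/system/edac/mc}
    found=0
    for mc in "$root"/mc[0-9]*; do
        [ -d "$mc" ] || continue
        found=1
        printf '%s: ce_count=%s ue_count=%s\n' \
            "$(basename "$mc")" "$(cat "$mc/ce_count")" "$(cat "$mc/ue_count")"
    done
    if [ "$found" -eq 0 ]; then
        echo "no memory controllers registered under $root"
    fi
    # Exit status: 0 if at least one controller was found, 1 otherwise.
    [ "$found" -eq 1 ]
)
```

Running `edac_mc_status` with no argument checks the live system. A non-zero exit status corresponds to the "No memory controller data found" case.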
 
On my machine with 5.0 kernel:

Code:
pveversion
pve-manager/6.1-5/9bf06119 (running kernel: 5.0.21-5-pve)

Code:
dmesg | grep EDAC
[    0.353422] EDAC MC: Ver: 3.0.0
[    3.078947] EDAC MC0: Giving out device to module ie31200_edac controller IE31200: DEV 0000:00:00.0 (POLLED)

Code:
/sys/devices/system/edac/mc/mc0# ls
ce_count         max_location  power  rank1  rank3           seconds_since_reset  ue_count         uevent
ce_noinfo_count  mc_name       rank0  rank2  reset_counters  size_mb              ue_noinfo_count


And with 5.3:

Code:
pveversion
pve-manager/6.1-5/9bf06119 (running kernel: 5.3.13-1-pve)

Code:
 dmesg | grep EDAC
[    0.274603] EDAC MC: Ver: 3.0.0

Code:
cd /sys/devices/system/edac/mc/mc0
bash: cd: /sys/devices/system/edac/mc/mc0: No such file or directory

With the same kernel my hardware is affected :/
 
Could this be the source of the problem?

with 5.0:

Code:
lspci -v
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers (rev 05)
        Subsystem: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers
        Flags: bus master, fast devsel, latency 0
        Capabilities: [e0] Vendor Specific Information: Len=10 <?>
        Kernel driver in use: ie31200_edac
        Kernel modules: ie31200_edac

and with 5.3:

Code:
 lspci -v
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers (rev 05)
        Subsystem: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers
        Flags: bus master, fast devsel, latency 0
        Capabilities: [e0] Vendor Specific Information: Len=10 <?>
        Kernel driver in use: skl_uncore
        Kernel modules: ie31200_edac
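
The same "Kernel driver in use" information can be read from sysfs, which is handy for scripting the comparison between kernels. This is only a sketch; the function name `pci_bound_driver` is made up, and the default address is the host bridge from the lspci output above.

```shell
# Print the driver currently bound to a PCI device, or "(none)".
# $1: PCI address, e.g. 0000:00:00.0; $2: sysfs PCI device directory.
pci_bound_driver() (
    dev=${1:-0000:00:00.0}
    root=${2:-/sys/bus/pci/devices}
    if [ -L "$root/$dev/driver" ]; then
        # The "driver" entry is a symlink into .../drivers/<name>.
        basename "$(readlink "$root/$dev/driver")"
    else
        echo "(none)"
    fi
)
```

Calling `pci_bound_driver 0000:00:00.0` on each kernel should print the same name lspci reports.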
 

Okay, take the following with a grain of salt, since I couldn't find much on the 'skl_uncore' driver (seems to just be for statistics about uncore events?), but since it wasn't present in 5.0 and everything ran fine, you should be able to simply unbind it and load your edac driver instead.

E.g.:
Code:
echo '0000:00:00.0' > /sys/bus/pci/drivers/skl_uncore/unbind
modprobe -v ie31200_edac

Then check dmesg or edac-util -v again.

If that works, you could blacklist the 'skl_uncore' driver by creating a file /etc/modprobe.d/uncore.conf containing 'blacklist skl_uncore' and then rebuild your initramfs.
 
Code:
echo '0000:00:00.0' > /sys/bus/pci/drivers/skl_uncore/unbind
modprobe -v ie31200_edac

I tried that, but it did not work: skl_uncore is still in use.

I also tried blacklisting it and updating the initramfs, but it is still in use.

Code:
lspci -v
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers (rev 05)
        Subsystem: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers
        Flags: bus master, fast devsel, latency 0
        Capabilities: [e0] Vendor Specific Information: Len=10 <?>
        Kernel driver in use: skl_uncore
        Kernel modules: ie31200_edac

Code:
cat /etc/modprobe.d/blacklist.conf
blacklist amdgpu
blacklist radeon
blacklist nouveau
blacklist nvidia
blacklist snd_hda_intel
blacklist skl_uncore
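
One thing that may be worth checking: the `blacklist` directive matches kernel module names, while lspci shows the name of the PCI driver, and the two are not always the same. The sketch below (hypothetical function name `pci_driver_module`) looks up which module, if any, sysfs lists for a driver. Whether skl_uncore is provided by a differently named module or is built into this kernel is an assumption I have not verified.

```shell
# Print the kernel module behind a PCI driver name, or a note when
# sysfs exposes no module link for it.
# $1: driver name as shown by lspci; $2: sysfs PCI driver directory.
pci_driver_module() (
    drv=$1
    root=${2:-/sys/bus/pci/drivers}
    if [ -L "$root/$drv/module" ]; then
        # The "module" entry is a symlink into /sys/module/<name>.
        basename "$(readlink "$root/$drv/module")"
    else
        echo "(built-in or unknown)"
    fi
)
```

`pci_driver_module skl_uncore` would then show the name to put after `blacklist`. If no module link exists, the driver may be compiled into the kernel, in which case a blacklist entry cannot affect it.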
 
@EricD Alright, I took another look at our kernel and made sure we really include all necessary patches - and we do. At [0] you can see the PCI device ID 0x1460, which is what your dmesg mentions as well, i.e. the kernel tries to access the correct EDAC ports for AMD platform 17h/70h (Zen 2 aka. Ryzen 3000 series).

Have you tried updating your BIOS? Maybe also try booting a live system from a more mainline Distro, e.g. Arch or Manjaro, and check what those say about EDAC - it could of course also be a hardware error; your board, CPU or RAM might be defective.

AFAICT our software stack should work just fine with EDAC on X570/Ryzen 3000.

[0] https://git.proxmox.com/?p=mirror_u...c75f99c1d22e15fbe63ca1eea23256dd;hb=HEAD#l117
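
Since the error message names device 0x1460, it may help to compare that against the PCI IDs the machine actually exposes. A minimal sketch, assuming the AMD data fabric functions sit at 00:18.x as they do on the AMD systems I know of (the function name `df_function_ids` is made up):

```shell
# Read `lspci -n` output on stdin and print the slot and vendor:device ID
# of every function at 00:18.x (with or without a leading PCI domain).
df_function_ids() {
    awk '$1 ~ /^(0000:)?00:18\./ { print $1, $3 }'
}
```

Usage would be `lspci -n | df_function_ids`. If none of the listed IDs is 1022:1460, one possible explanation is that the driver in the running kernel is probing for an ID this CPU model does not use, but that is a hypothesis, not something confirmed here.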
 
I have the same issue:
CPU: AMD Ryzen 9 3950x
Mobo: ASRock X570 Creator (latest BIOS 2.10, ECC enabled in settings)
Memory: 4 x Kingston KSM26ED8/16ME (ECC) (64 GB total)

What worries me is that the Total Width of the memory modules is reported as 128 bits rather than 144. As far as I know, ECC needs extra bits on top of the data width.
root@pve:~# dmidecode -t memory
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 3.2.1 present.
# SMBIOS implementations newer than version 3.2.0 are not
# fully supported by this version of dmidecode.

Handle 0x000C, DMI type 16, 23 bytes
Physical Memory Array
Location: System Board Or Motherboard
Use: System Memory
Error Correction Type: Multi-bit ECC
Maximum Capacity: 128 GB
Error Information Handle: 0x000B
Number Of Devices: 4

Handle 0x0014, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x000C
Error Information Handle: 0x0013
Total Width: 128 bits
Data Width: 64 bits
Size: 16384 MB
Form Factor: DIMM
Set: None
Locator: DIMM 0
Bank Locator: P0 CHANNEL A
Type: DDR4
Type Detail: Synchronous Unbuffered (Unregistered)
Speed: 2667 MT/s
Manufacturer: Kingston
Serial Number: 9761035D
Asset Tag: Not Specified
Part Number: 9965745-002.A00G
Rank: 2
Configured Memory Speed: 2667 MT/s
Minimum Voltage: 1.2 V
Maximum Voltage: 1.2 V
Configured Voltage: 1.2 V

omitted the 3 other memory modules as they are similar
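
To make the width comparison less error-prone across all modules, here is a small sketch that pulls both values out of the dmidecode output and prints the difference (the function name `dimm_extra_bits` is hypothetical). For reference, a standard ECC DIMM carries 8 check bits per 64 data bits; what the 128 reported here actually represents is unclear to me.

```shell
# Read `dmidecode -t memory` output on stdin and print, per memory device,
# the total width, the data width, and the difference between the two.
dimm_extra_bits() {
    awk -F': *' '
        /^[[:space:]]*Total Width:/ { total = $2 + 0 }
        /^[[:space:]]*Data Width:/  {
            data = $2 + 0
            printf "total=%d data=%d extra=%d\n", total, data, total - data
        }
    '
}
```

It would be run as `dmidecode -t memory | dimm_extra_bits`. An `extra` value of 0 would suggest the firmware reports no additional bits beyond the data width.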

root@pve:~# pveversion
pve-manager/6.1-3/37248ce6 (running kernel: 5.3.10-1-pve)
root@pve:~# dmesg | grep EDAC
[ 0.240158] EDAC MC: Ver: 3.0.0
[ 12.260770] EDAC amd64: Node 0: DRAM ECC enabled.
[ 12.260771] EDAC amd64: F17h detected (node 0).
[ 12.260780] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 12.260812] EDAC amd64: Error: Error probing instance: 0
[ 12.316586] EDAC amd64: Node 0: DRAM ECC enabled.
[ 12.316587] EDAC amd64: F17h detected (node 0).
[ 12.316594] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 12.316626] EDAC amd64: Error: Error probing instance: 0
[ 12.372743] EDAC amd64: Node 0: DRAM ECC enabled.
[ 12.372744] EDAC amd64: F17h detected (node 0).
[ 12.372748] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 12.372777] EDAC amd64: Error: Error probing instance: 0
[ 12.412583] EDAC amd64: Node 0: DRAM ECC enabled.
[ 12.412584] EDAC amd64: F17h detected (node 0).
[ 12.412592] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 12.412623] EDAC amd64: Error: Error probing instance: 0
[ 12.452627] EDAC amd64: Node 0: DRAM ECC enabled.
[ 12.452629] EDAC amd64: F17h detected (node 0).
[ 12.452636] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 12.452665] EDAC amd64: Error: Error probing instance: 0
[ 12.492515] EDAC amd64: Node 0: DRAM ECC enabled.
[ 12.492516] EDAC amd64: F17h detected (node 0).
[ 12.492523] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 12.492552] EDAC amd64: Error: Error probing instance: 0
[ 12.540608] EDAC amd64: Node 0: DRAM ECC enabled.
[ 12.540609] EDAC amd64: F17h detected (node 0).
[ 12.540619] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 12.540649] EDAC amd64: Error: Error probing instance: 0
[ 12.588582] EDAC amd64: Node 0: DRAM ECC enabled.
[ 12.588583] EDAC amd64: F17h detected (node 0).
[ 12.588590] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 12.588619] EDAC amd64: Error: Error probing instance: 0
[ 12.628538] EDAC amd64: Node 0: DRAM ECC enabled.
[ 12.628539] EDAC amd64: F17h detected (node 0).
[ 12.628546] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 12.628575] EDAC amd64: Error: Error probing instance: 0
[ 12.683117] EDAC amd64: Node 0: DRAM ECC enabled.
[ 12.683118] EDAC amd64: F17h detected (node 0).
[ 12.683122] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 12.683150] EDAC amd64: Error: Error probing instance: 0
[ 13.124951] EDAC amd64: Node 0: DRAM ECC enabled.
[ 13.124952] EDAC amd64: F17h detected (node 0).
[ 13.124963] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 13.124968] EDAC amd64: Error: Error probing instance: 0
[ 13.717145] EDAC amd64: Node 0: DRAM ECC enabled.
[ 13.717147] EDAC amd64: F17h detected (node 0).
[ 13.717158] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 13.717163] EDAC amd64: Error: Error probing instance: 0
[ 14.020782] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.020784] EDAC amd64: F17h detected (node 0).
[ 14.020796] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.020830] EDAC amd64: Error: Error probing instance: 0
[ 14.064584] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.064585] EDAC amd64: F17h detected (node 0).
[ 14.064594] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.064616] EDAC amd64: Error: Error probing instance: 0
[ 14.100500] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.100501] EDAC amd64: F17h detected (node 0).
[ 14.100505] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.100526] EDAC amd64: Error: Error probing instance: 0
[ 14.148570] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.148571] EDAC amd64: F17h detected (node 0).
[ 14.148580] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.148602] EDAC amd64: Error: Error probing instance: 0
[ 14.188453] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.188454] EDAC amd64: F17h detected (node 0).
[ 14.188457] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.188479] EDAC amd64: Error: Error probing instance: 0
[ 14.228478] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.228480] EDAC amd64: F17h detected (node 0).
[ 14.228487] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.228509] EDAC amd64: Error: Error probing instance: 0
[ 14.272474] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.272475] EDAC amd64: F17h detected (node 0).
[ 14.272478] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.272500] EDAC amd64: Error: Error probing instance: 0
[ 14.308501] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.308502] EDAC amd64: F17h detected (node 0).
[ 14.308509] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.308530] EDAC amd64: Error: Error probing instance: 0
[ 14.348565] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.348567] EDAC amd64: F17h detected (node 0).
[ 14.348576] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.348599] EDAC amd64: Error: Error probing instance: 0
[ 14.384505] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.384506] EDAC amd64: F17h detected (node 0).
[ 14.384513] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.384534] EDAC amd64: Error: Error probing instance: 0
[ 14.428500] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.428501] EDAC amd64: F17h detected (node 0).
[ 14.428504] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.428525] EDAC amd64: Error: Error probing instance: 0
[ 14.464744] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.464745] EDAC amd64: F17h detected (node 0).
[ 14.464748] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.464769] EDAC amd64: Error: Error probing instance: 0
[ 14.504819] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.504820] EDAC amd64: F17h detected (node 0).
[ 14.504830] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.504853] EDAC amd64: Error: Error probing instance: 0
[ 14.540775] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.540777] EDAC amd64: F17h detected (node 0).
[ 14.540784] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.540806] EDAC amd64: Error: Error probing instance: 0
[ 14.572751] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.572752] EDAC amd64: F17h detected (node 0).
[ 14.572759] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.572780] EDAC amd64: Error: Error probing instance: 0
[ 14.616766] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.616767] EDAC amd64: F17h detected (node 0).
[ 14.616773] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.616794] EDAC amd64: Error: Error probing instance: 0
[ 14.660750] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.660751] EDAC amd64: F17h detected (node 0).
[ 14.660755] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.660775] EDAC amd64: Error: Error probing instance: 0
[ 14.709337] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.709339] EDAC amd64: F17h detected (node 0).
[ 14.709351] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.709375] EDAC amd64: Error: Error probing instance: 0
[ 14.753319] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.753320] EDAC amd64: F17h detected (node 0).
[ 14.753331] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.753357] EDAC amd64: Error: Error probing instance: 0
[ 14.793327] EDAC amd64: Node 0: DRAM ECC enabled.
[ 14.793328] EDAC amd64: F17h detected (node 0).
[ 14.793339] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 14.793367] EDAC amd64: Error: Error probing instance: 0
 
Updating to the latest version did not help.

root@pve:~# pveversion
pve-manager/6.1-7/13e58d5e (running kernel: 5.3.18-2-pve)
root@pve:~# dmesg | grep EDAC
[ 140.822837] EDAC MC: Ver: 3.0.0
[ 161.085921] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.085922] EDAC amd64: F17h detected (node 0).
[ 161.085933] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.085965] EDAC amd64: Error: Error probing instance: 0
[ 161.119196] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.119197] EDAC amd64: F17h detected (node 0).
[ 161.119206] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.119238] EDAC amd64: Error: Error probing instance: 0
[ 161.175240] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.175241] EDAC amd64: F17h detected (node 0).
[ 161.175244] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.175273] EDAC amd64: Error: Error probing instance: 0
[ 161.223207] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.223208] EDAC amd64: F17h detected (node 0).
[ 161.223212] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.223240] EDAC amd64: Error: Error probing instance: 0
[ 161.263241] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.263242] EDAC amd64: F17h detected (node 0).
[ 161.263252] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.263281] EDAC amd64: Error: Error probing instance: 0
[ 161.299178] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.299179] EDAC amd64: F17h detected (node 0).
[ 161.299185] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.299214] EDAC amd64: Error: Error probing instance: 0
[ 161.339230] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.339231] EDAC amd64: F17h detected (node 0).
[ 161.339235] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.339263] EDAC amd64: Error: Error probing instance: 0
[ 161.383178] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.383179] EDAC amd64: F17h detected (node 0).
[ 161.383189] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.383219] EDAC amd64: Error: Error probing instance: 0
[ 161.419212] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.419213] EDAC amd64: F17h detected (node 0).
[ 161.419217] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.419245] EDAC amd64: Error: Error probing instance: 0
[ 161.479210] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.479211] EDAC amd64: F17h detected (node 0).
[ 161.479218] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.479248] EDAC amd64: Error: Error probing instance: 0
[ 161.515143] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.515145] EDAC amd64: F17h detected (node 0).
[ 161.515152] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.515181] EDAC amd64: Error: Error probing instance: 0
[ 161.563236] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.563237] EDAC amd64: F17h detected (node 0).
[ 161.563240] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.563269] EDAC amd64: Error: Error probing instance: 0
[ 161.607274] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.607276] EDAC amd64: F17h detected (node 0).
[ 161.607282] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.607312] EDAC amd64: Error: Error probing instance: 0
[ 161.651218] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.651219] EDAC amd64: F17h detected (node 0).
[ 161.651225] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.651254] EDAC amd64: Error: Error probing instance: 0
[ 161.687399] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.687400] EDAC amd64: F17h detected (node 0).
[ 161.687407] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.687436] EDAC amd64: Error: Error probing instance: 0
[ 161.731214] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.731215] EDAC amd64: F17h detected (node 0).
[ 161.731218] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.731246] EDAC amd64: Error: Error probing instance: 0
[ 161.799209] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.799211] EDAC amd64: F17h detected (node 0).
[ 161.799217] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.799246] EDAC amd64: Error: Error probing instance: 0
[ 161.859169] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.859170] EDAC amd64: F17h detected (node 0).
[ 161.859176] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.859206] EDAC amd64: Error: Error probing instance: 0
[ 161.900694] EDAC amd64: Node 0: DRAM ECC enabled.
[ 161.900695] EDAC amd64: F17h detected (node 0).
[ 161.900706] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 161.900735] EDAC amd64: Error: Error probing instance: 0
[ 162.175634] EDAC amd64: Node 0: DRAM ECC enabled.
[ 162.175635] EDAC amd64: F17h detected (node 0).
[ 162.175646] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 162.175650] EDAC amd64: Error: Error probing instance: 0
[ 162.811195] EDAC amd64: Node 0: DRAM ECC enabled.
[ 162.811196] EDAC amd64: F17h detected (node 0).
[ 162.811202] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 162.811206] EDAC amd64: Error: Error probing instance: 0
[ 163.235511] EDAC amd64: Node 0: DRAM ECC enabled.
[ 163.235512] EDAC amd64: F17h detected (node 0).
[ 163.235523] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 163.235550] EDAC amd64: Error: Error probing instance: 0
[ 163.275288] EDAC amd64: Node 0: DRAM ECC enabled.
[ 163.275289] EDAC amd64: F17h detected (node 0).
[ 163.275299] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 163.275321] EDAC amd64: Error: Error probing instance: 0
[ 163.331192] EDAC amd64: Node 0: DRAM ECC enabled.
[ 163.331193] EDAC amd64: F17h detected (node 0).
[ 163.331196] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 163.331218] EDAC amd64: Error: Error probing instance: 0
[ 163.367160] EDAC amd64: Node 0: DRAM ECC enabled.
[ 163.367161] EDAC amd64: F17h detected (node 0).
[ 163.367164] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 163.367185] EDAC amd64: Error: Error probing instance: 0
[ 163.407290] EDAC amd64: Node 0: DRAM ECC enabled.
[ 163.407291] EDAC amd64: F17h detected (node 0).
[ 163.407301] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 163.407327] EDAC amd64: Error: Error probing instance: 0
[ 163.443249] EDAC amd64: Node 0: DRAM ECC enabled.
[ 163.443251] EDAC amd64: F17h detected (node 0).
[ 163.443260] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 163.443282] EDAC amd64: Error: Error probing instance: 0
[ 163.479205] EDAC amd64: Node 0: DRAM ECC enabled.
[ 163.479206] EDAC amd64: F17h detected (node 0).
[ 163.479213] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 163.479234] EDAC amd64: Error: Error probing instance: 0
[ 163.535184] EDAC amd64: Node 0: DRAM ECC enabled.
[ 163.535185] EDAC amd64: F17h detected (node 0).
[ 163.535189] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 163.535209] EDAC amd64: Error: Error probing instance: 0
[ 163.567188] EDAC amd64: Node 0: DRAM ECC enabled.
[ 163.567190] EDAC amd64: F17h detected (node 0).
[ 163.567193] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 163.567213] EDAC amd64: Error: Error probing instance: 0
[ 163.607183] EDAC amd64: Node 0: DRAM ECC enabled.
[ 163.607185] EDAC amd64: F17h detected (node 0).
[ 163.607191] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 163.607214] EDAC amd64: Error: Error probing instance: 0
[ 163.651212] EDAC amd64: Node 0: DRAM ECC enabled.
[ 163.651213] EDAC amd64: F17h detected (node 0).
[ 163.651216] EDAC amd64: Error: F0 not found, device 0x1460 (broken BIOS?)
[ 163.651237] EDAC amd64: Error: Error probing instance: 0
 
@aaron: Actually, hearsay might not be the best source.

I have gotten an official statement from AMD themselves (rather than from some site that may or may not be truthful, with ASRock Rack possibly deploying a smokescreen) stating that AM4 should support both ECC and ECC error reporting.
That includes my setup.

Here is the conversation with the AMD tech support email account, newest message first (tech*dot?*amdsupport*at?*customercare).
I asked for and received their permission to publish the conversation online.

------------------------------
Kyle (AMD)

6 Mar, 13:20 CET

Thank you for contacting AMD.

I'll be happy to answer to clear your doubts,

We can confirm both ECC error reporting are supported by the Ryzen platform and the EDAC command should work just fine on your setup:

{some text omitted}

****************
My Question in between
Can I interpret your answer as follows:?

That the AMD AM4 socket supports both ECC and ECC error reporting.
And that the EDAC command should work just fine on my setup:
Mobo: ASrock x570 creator which has an AMD AM4 socket
CPU: AMD Ryzen 9 3950X
Memory: 4 x 16 GB ECC ((Kingston KSM26ED8/16ME) from the ASrock x570 creator QVL)
*****************

Isabelle (AMD)

6 Mar, 10:46 CET

{some text omitted}

Correctable and un-correctable ECC errors are both reported in the Windows Event Log under the WHEA category or in Linux through the EDAC command. So indeed, both are supported.
 
I haven't had time for more testing yet. The next step would be finding a Linux distro that should support ECC on Ryzen and booting from that. I'll see if I can give it a go this weekend.
 
