Proxmox 8 Ceph Quincy monitor no longer working on AMD Opteron 2427

jdancer

Well-Known Member
May 26, 2019
Yeah, yeah, I know. EOL CPU.

Ceph was working fine under Proxmox 7 using the same CPU.

I did a pve7to8 upgrade and also a clean install of Proxmox 8.

Both resulted in a 'Caught signal (illegal instruction)' error when attempting to start a Ceph monitor.

This points to either a bad binary or the need for a recompile.

I've already posted my findings in the Proxmox VE 8.0 released! thread.

Has anyone done a clean install of Proxmox 8 Ceph on any other CPU?
 
As mentioned in the other thread, your CPU is over 14 years old, and quite probably neither the compiler projects, the library developers, Ceph, nor we test on HW that old, so glitches there can go unnoticed. I'd recommend installing the newest BIOS/firmware and CPU microcode available for that platform; with a bit of luck this makes the CPU compatible enough.

If that doesn't help, I'd recommend trying to debug this on your own to get more detail on which instruction causes this, e.g. by checking the kernel log and maybe using gdb; maybe you can find something to relay to the compiler devs to get this fixed.
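For example, a rough sketch of that debugging approach (assuming systemd-coredump is available and the monitor crashes reproducibly on start; the service name is just an illustration):
Code:
# the kernel usually logs an invalid-opcode trap for the crashing process
dmesg | grep -i 'invalid opcode'

# capture a core dump of the crash and inspect the faulting instruction
apt install systemd-coredump gdb
systemctl restart ceph-mon@<mon-id>.service   # replace <mon-id> with your monitor's ID
coredumpctl dump ceph-mon --output=/tmp/ceph-mon.core
gdb /usr/bin/ceph-mon /tmp/ceph-mon.core -batch -ex 'bt' -ex 'x/i $pc'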
Otherwise, you can still run Proxmox VE 7 on that system if it worked there; it's supported for roughly another year.
And after that, the security state of the OS probably won't matter much in this case: with the age and EOL state of that HW, it's a leaky bucket anyway.
 
I hope there will be some improvements here, as it seems to be the same issue with the AMD N40L/N54L:
AMD Turion(tm) II Neo N40L Dual-Core Processor and AMD Turion(tm) II Neo N54L Dual-Core Processor

No new firmware is available anymore.
 
FWIW, the error seems to come from the GF-Complete library's initialization, namely:
https://github.com/ceph/gf-complete/blob/a6862d10c9db467148f20eef2c6445ac9afd94d8/src/gf.c#L474

This does a lot of lower-level things, including some inline assembler, but the C(++) code could also be compiled down differently by the newer compiler from Debian Bookworm.

Currently, we have no high-level enterprise support request for this, so we cannot allocate any time to investigating this more closely, I'm afraid. But if the community finds something, maybe even a fix, we'll gladly incorporate it.
 
Looking a little deeper, it appears that modern GCC may over-optimise gf-complete and emit instructions that may not be present on older AMD64 CPUs: Debian bug https://bugs.debian.org/1012935.

The fix adopted by Debian is apparently to ensure that GCC is invoked with -O1: https://salsa.debian.org/openstack-...mmit/7751c075f868bf95873c6739d0d942f2a668c58f

The trick will be to inject the option into the Ceph CMake build system in the right place. By inspection of https://github.com/ceph/ceph/blob/4.../src/erasure-code/jerasure/CMakeLists.txt#L70, it looks like you might be able to achieve this by patching CMakeLists.txt to inject the value "-O1" into the list of compiler flags?

(Of course, this might have performance implications, although reportedly the gf-complete library selects optimisations dynamically at runtime?)

The Ceph developers are busy doing the final go/no-go release process for Reef, but perhaps once that's settled down, they might be interested in accepting a patch upstream…
 
Same issue with Intel(R) Xeon(R) CPU E5320 @ 1.86GHz
That CPU is from 2006, and we do not have any test HW that old around. Our oldest test HW is "already" based on the Nehalem microarchitecture, which already has SSE4.2 support and thus should not be affected. We'll see if we get a bit of spare time to hook it up and boot it with the newest Proxmox VE release to verify whether it works there.
Is there a way to test if my CPU is compatible?
As a rough heuristic: if it was released in the last decade (i.e., 2010 or later) it should be OK.
Slightly more specifically, if it supports SSE 4.1 it should be fine; FWICT, for Intel that would be the Penryn microarchitecture (~2007) and for AMD the Bulldozer microarchitecture (~2011).
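As a quick check (just a heuristic sketch, not an exhaustive compatibility test), you can grep the CPU flags:
Code:
# no output here means the CPU lacks SSE 4.1/4.2 and the stock Ceph build will likely crash
grep -o -w -e sse4_1 -e sse4_2 /proc/cpuinfo | sort -u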

I currently only know of one possible workaround: reducing the optimization level on compilation to -O1, which would mean quite a high performance impact for the majority of our users, something we'd rather avoid. Ideally this would be fixed (or the fix backported) in GCC 12.
 
Hello.
Just registered to report that this happened to me with an "AMD Phenom II X6 1090T Black Edition" too (2010/2011). Stars microarchitecture, right before Bulldozer.

Unfortunately, it happened on a single-node Ceph setup. I am aware this is not the best scenario, but I am OK with that since the failure domain is equivalent to a simple NAS.
Fortunately, this is still a test machine with no critical data, mostly duplicated from my old main NAS, and I do have backups.

Despite all the crash tests I've run, I've never managed to lose any data; Ceph is so resilient and rock solid.
But I never thought a major update would break it this badly.

To continue my test as if it were a real-life scenario, I will try to recover this node, at least partially. A downgrade seems impossible or way too risky, and I don't want to worsen my test case right away.

I currently only know of one possible workaround: reducing the optimization level on compilation to -O1, which would mean quite a high performance impact for the majority of our users, something we'd rather avoid. Ideally this would be fixed (or the fix backported) in GCC 12.
I agree this should be avoided. Maybe you can at least indicate a way to rebuild Ceph with -O1 for this minority of users? Is the GitHub readme enough? Especially regarding the settings needed to run the self-compiled version in place of the stock version and use the existing node configuration.
I would like to be able to run Ceph, at least temporarily, to prove that the data is recoverable even in such a doomed case.

Best regards.
 
Maybe you can at least indicate a way to rebuild Ceph with -O1 for this minority of users? Is the GitHub readme enough?
Our main source mirror is on git.proxmox.com; GitHub is at best a read-only mirror w.r.t. our projects.

So you'd need to clone the current branch for Ceph 17.2 Quincy for Proxmox VE 8:
Code:
git clone -b quincy-stable-8 git://git.proxmox.com/git/ceph.git

Then add something like this change:
https://salsa.debian.org/openstack-...mmit/03e0314af5e814a7ef74dcf4f9416d60c6322e51

Then install all the build dependencies; this can be automated via:
Code:
apt install devscripts
mk-build-deps -ir

Start the build with:
Code:
# Disable building the dbgsym packages (they're huge)
export DEB_BUILD_OPTIONS=noautodbgsym
make deb

A few tips and disclaimers:

  • I did not build with the above linked d/rules change; it should be enough, but I cannot guarantee it, I'm afraid.
  • Ceph builds are huge and use up lots of resources. Memory usage in particular scales with the parallelism level, i.e., you need almost 2 GB per thread for build and linkage.
  • Depending on your HW a build can take quite a bit of time; here it takes somewhere between 30 and 45 minutes with 56 cores assigned from a dual EPYC 7351 system.

Oh, and also worth a try: adding -O1 as a compile flag to just the ceph/src/erasure-code/jerasure/CMakeLists.txt file before hitting make deb. That would limit the worse optimization level to just erasure coding, but off the top of my head I'm not 100% sure there won't be any other fallout.
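A minimal, untested sketch of that idea (assuming a directory-level add_compile_options lands after the global -O2 on the compiler command line, so the later -O1 wins; not verified against the actual CMakeLists.txt layout):
Code:
# prepend an add_compile_options(-O1) line so only the jerasure/gf-complete code is de-optimized
sed -i '1i add_compile_options(-O1)' ceph/src/erasure-code/jerasure/CMakeLists.txt
make clean && make deb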
 
Thank you @t.lamprecht for this guideline. I am currently trying to follow it (embedded SW dev speaking, not used to such high-level builds).
Just to add a new limitation: I wanted to prepare the build environment in a Debian 12 VM on the node itself (using local storage, as Ceph is down), setting the CPU type to "host" to be sure GCC picks up the actual CPU type. But at some point I deleted and recreated the VM, forgot to set "host", and left "x86-64-v2-AES", which led me to a new error regarding the SSE instruction set:
[screenshot of the SSE-related build error]

Just wanted to let you know.
 
I forgot to set "host" and left "x86-64-v2-AES", which led me to a new error regarding the SSE instruction set:
Yes, this is known. When we updated the default CPU type selected in the web UI's VM creation wizard from the very old, and now deprecated, kvm64 to something newer with the Proxmox VE 8.0 release two months ago, we had to make a call with trade-offs about which x86 psABI level to use. After quite some discussion we decided on the v2 level, as all but relatively ancient CPUs support it. While it's a trade-off, VMs now get AES and SSE support enabled by default, if the host CPU supports it, which increases performance a lot for a broad range of applications and was something that (especially newer) users overlooked when starting out.
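For reference, switching an existing VM back to full host CPU passthrough can be done in the web UI (Hardware -> Processors) or on the CLI, e.g. for VM ID 100:
Code:
# expose all host CPU flags (SSE 4.x, AES, ...) to the guest
qm set 100 --cpu host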
 
Hello,
you need almost 2 GB per thread for build and linkage.
Depending on your HW a build can take quite a bit of time; here it takes somewhere between 30 and 45 minutes with 56 cores assigned from a dual EPYC 7351 system.
I was unable to finish the build in my VM, even with only one thread and 8 GB allocated to the VM on this 6-core / 16 GB node... I don't know exactly why, but the OOM killer terminated my QEMU process each time, even though nothing else is running but Proxmox itself.

Then I decided to stop playing and take a more serious path. I used a fraction of an EPYC 7742 I had around, with 32 cores allocated to a VMware guest and 4 GB per thread. The build is a lot faster for sure, but most importantly, it finished! Then I prepared several variants:
  • a default-settings build, just to be sure the build itself is OK...
  • inserting -O1 into CMakeLists.txt, for CFLAGS only
  • inserting -O1 into debian/rules, for CFLAGS only
  • inserting -O1 into debian/rules, for CFLAGS, CXXFLAGS, and CPPFLAGS
I am almost sure I saw -O1 among the parameters on the output console, at least for the last variant.

Now I may need a bit more advice on how to install the resulting .deb packages properly, and especially on how to be sure that my own build is the one being used, because I suspect apt re-uses its cache or even re-downloads the packages. I also tried installing with dpkg -i, and even tried manually unpacking the ceph-mon binary alone.
In each case the behavior stays the same: ceph-mon gets an illegal instruction signal.

It was fun and I am so close, please help with the very last step. :)
 
Hmm, can you please post the full diff of your changes here in [code][/code] tags? Otherwise it is hard to tell if there might be an error on your side or mine.

What you can also try is adding the following to ceph/debian/rules:
Code:
export DEB_CFLAGS_MAINT_APPEND = -O1
export DEB_CXXFLAGS_MAINT_APPEND = -O1

Note also that the ceph directory is only copied the first time; after that you need to run make clean before retrying a build with another change.
 
you need to run make clean before retrying a build with another change
Restarted with your last proposal, from scratch: make clean, git reset --hard, git clean, and this time I used make deb 2>&1 | tee ~/make.log. I think make clean between trials was the missing part.

After analyzing the log, it appears that the optimization flag is taken into account: I can see the usual *FLAGS with -O2 followed by the appended -O1. Normally the last one is used.
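If you want to double-check that assumption for the compiler in use, a quick (bash-only) sketch is to compare what GCC reports for both flag orders; empty diff output means the trailing -O1 fully overrides the earlier -O2:
Code:
diff <(gcc -O1 -Q --help=optimizers) <(gcc -O2 -O1 -Q --help=optimizers) && echo '-O1 wins'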

Then I tried apt reinstall ./ceph-mon_17.2.6-pve1+3_amd64.deb, which confirmed the right path was used. I don't remember exactly, but it didn't work as-is.
Finally I just brute-forced the thing with rm ./*-dbg*.deb and apt reinstall ./*.deb, which failed at some point because of a non-existing /home/cephadm/.ssh, so I ran mkdir -p /home/cephadm and reinstalled the packages again.

And... that's it. ceph-mon is alive. BTW I had this issue too, "solved" by disabling the restful module. After letting the missed scrub job finish its pass, Ceph is now up and healthy. I can access the data like nothing happened.
Pretty dirty, but as it was only for fun, I am happy and had a lot of fun. :) Now I can tell that Ceph is really, really bulletproof.
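(For anyone hitting the same manager issue: the module can presumably be toggled with the standard mgr commands.)
Code:
# disable the problematic restful mgr module; re-enable it later once things are fixed
ceph mgr module disable restful
ceph mgr module enable restful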

Next I have to find a newer test machine (maybe my FX8350? pretty old too...) as replacement, and find what to do with the older one.

Oh, and it would be nice if pve7to8 could catch this problem BEFORE people pull the upgrade trigger.

Last thing, if needed, I can do more tests with this machine, even reinstall pve8 from scratch or do a pve7 to 8 upgrade and so on...

Thank you very much @t.lamprecht for the follow-up, it was very appreciable and instructive.
 
And... that's it. ceph-mon is alive. BTW I had this issue too, "solved" by disabling the restful module. After letting the missed scrub job finish its pass, Ceph is now up and healthy. I can access the data like nothing happened.
Great to hear! So what way did you set the flag in the end? For all of ceph or just the erasure-coding/gf-complete parts?

Then I tried apt reinstall ./ceph-mon_17.2.6-pve1+3_amd64.deb, which confirmed the right path was used. I don't remember exactly, but it didn't work as-is.
Finally I just brute-forced the thing with rm ./*-dbg*.deb and apt reinstall ./*.deb, which failed at some point because of a non-existing /home/cephadm/.ssh, so I ran mkdir -p /home/cephadm and reinstalled the packages again.

If you need to do such a thing in the future, here's how you can make a local repo which you can use for a normal upgrade (a concrete sketch follows the list):
  1. bump the version in the changelog. Normally that's in debian/changelog, but here for Ceph we take over most of the upstream packaging, so the changelog is located in changelog.Debian. The quickest way is adding a +1 to the end of the version in the first line at the top.
  2. do a clean build
  3. copy the packages to the host where they should be installed
  4. cd into that directory and run dpkg-scanpackages . >Packages
  5. Add a repo entry for this to e.g. /etc/apt/sources.list like:
    deb [trusted=yes] file:///path/to/pkg-archive ./
  6. do a standard apt update followed by apt full-upgrade
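Put together, a sketch of that sequence (assuming the rebuilt .debs were copied to /root/ceph-local, a made-up example path, and dpkg-dev is installed for dpkg-scanpackages):
Code:
cd /root/ceph-local
dpkg-scanpackages . > Packages
# a dedicated sources.list.d snippet works the same as an entry in sources.list
echo 'deb [trusted=yes] file:///root/ceph-local ./' > /etc/apt/sources.list.d/ceph-local.list
apt update
apt full-upgrade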

Oh, and it would be nice if pve7to8 could catch this problem BEFORE people pull the upgrade trigger.
Yes, it definitely would be good to do that until we can fix this more cleanly.
Would you mind opening an enhancement request for this over at our Bugzilla so we can keep track of it:
https://bugzilla.proxmox.com/
 
So what way did you set the flag in the end? For all of ceph or just the erasure-coding/gf-complete parts?
The last one, adding this to ceph/debian/rules:
Code:
export DEB_CFLAGS_MAINT_APPEND = -O1
export DEB_CXXFLAGS_MAINT_APPEND = -O1
BUT, for the sake of completeness, and now that I have more experience with local package management thanks to your valuable help, I will retry by restoring the official packages, limiting the flag to ceph/src/erasure-code/jerasure/CMakeLists.txt, bumping the version, and so on.

Would you mind opening an enhancement request for this over at our Bugzilla so we can keep track of it:
https://bugzilla.proxmox.com/
Sure, I am checking this out right now!
EDIT: #4953
 
So what way did you set the flag in the end? For all of ceph or just the erasure-coding/gf-complete parts?
Finally did the test by adding -O1 to the COMPILE_FLAGS in ceph/src/erasure-code/jerasure/CMakeLists.txt only. I made sure to have a clean build and deployment this time. Obviously I restored the upstream packages first and confirmed the issue came back before this test.

It seems to be enough; Ceph is working with this tiny fix.

I did play a little bit by adding a seventh OSD, letting it rebalance, and reading and writing to CephFS and Ceph RBD; so far so good. However, I do not use many advanced features.
Can't tell about the performance decrease; I don't really care, and the system was already relatively slow before.
 
I hope there will be some improvements here, as it seems to be the same issue with the AMD N40L/N54L:
AMD Turion(tm) II Neo N40L Dual-Core Processor and AMD Turion(tm) II Neo N54L Dual-Core Processor

No new firmware is available anymore.
The issue on the HP N54L is resolved with Ceph 17.2.7.
 
