[SOLVED] Error when creating pool after ceph reinstallation

fartboner

New Member
Aug 3, 2023
I posted in a previous thread about an issue with my Ceph configuration that led me to purge Ceph. I've managed to reinstall it and have the mons/mgrs set up, but when I try to create a pool, I receive this message:

Code:
pool pool01: applying size = 3
got timeout
pool pool01: applying application = rbd
syswrite() on closed filehandle $child at /usr/share/perl5/PVE/RADOS.pm line 34.
write data failed - Bad file descriptor
pool pool01: applying crush_rule = replicated_rule
syswrite() on closed filehandle $child at /usr/share/perl5/PVE/RADOS.pm line 34.
write data failed - Bad file descriptor
pool pool01: applying min_size = 2
syswrite() on closed filehandle $child at /usr/share/perl5/PVE/RADOS.pm line 34.
write data failed - Bad file descriptor
pool pool01: applying pg_autoscale_mode = on
syswrite() on closed filehandle $child at /usr/share/perl5/PVE/RADOS.pm line 34.
write data failed - Bad file descriptor
pool pool01: applying pg_num = 128
syswrite() on closed filehandle $child at /usr/share/perl5/PVE/RADOS.pm line 34.
write data failed - Bad file descriptor
pool pool01: applying target_size_bytes = 0
syswrite() on closed filehandle $child at /usr/share/perl5/PVE/RADOS.pm line 34.
write data failed - Bad file descriptor
TASK ERROR: Could not set: application, crush_rule, min_size, pg_autoscale_mode, pg_num, size, target_size_bytes

I also intermittently get got timeout (500) errors. Based on journalctl -u ceph-mon@host, I think they come from clock skew, so I've adjusted my chrony.conf to (hopefully) remedy that.
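Not from the original post, but as a quick sanity check: Ceph reports clock skew directly in its health output once the drift between mons exceeds mon_clock_drift_allowed (0.05s by default), and chronyc can confirm whether each node is actually synchronized. The time source in the commented config lines is just an example.

Code:
# Does Ceph itself see clock skew between the monitors?
ceph health detail | grep -i clock

# Is chrony actually synchronized on this node?
chronyc tracking
chronyc sources -v

# Example /etc/chrony/chrony.conf entries (the pool is an assumption, use your own source)
# pool 2.debian.pool.ntp.org iburst
# makestep 1.0 3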

Any ideas?

EDIT:

In case anyone else finds this thread: the root cause was that my host OS disk was too slow. The slow storage caused both the timeouts and the pool creation failures.

I didn't properly read the requirements and recommendations and was running this on the Dell IDSDM SD cards built into my M620 and M630 servers. Their read/write performance is too slow and they have a limited lifespan. I opted instead to install Proxmox on one of the two SSDs in each host.
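If anyone wants to verify whether their boot/monitor disk is up to the job before blaming Ceph, a short synchronous-write fio run is a reasonable sketch of the kind of I/O the mon store does constantly. The target directory, size, and runtime below are just example values.

Code:
# Small 4k writes with an fsync after each one, against the filesystem holding the mon store
fio --name=fsync-test --directory=/var/lib/ceph --rw=write --bs=4k --size=256m \
    --ioengine=sync --fdatasync=1 --runtime=60 --time_based
# SD cards typically manage only a handful of fsync'd IOPS; a decent SSD manages thousands.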

After installation on capable hardware, everything just works!
 
Interestingly, I just reloaded three hosts, created an entirely new cluster, and set up Ceph with the same topology, and I received the same error as originally posted.
  1. Installed Proxmox on host1
  2. Configured networking on host1
  3. Set up Ceph on host1
  4. Created the cluster
  5. Joined host2
  6. Configured networking on host2
  7. Installed Ceph on host2 (inherited configuration)
  8. Repeated for host3
  9. Configured OSDs
    1. Couldn't wipe the disks from the GUI; needed to run lsblk -> dmsetup remove <id> to make them available (see the sketch after this list)
    2. Wiped via the GUI
    3. Set up the OSDs as usual
  10. Attempted to create a pool; received the same error
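A minimal sketch of the wipe workaround from step 9, assuming the disks still carry device-mapper entries left over from the old Ceph install. The mapping name and device below are placeholders for whatever lsblk actually shows.

Code:
# Find leftover device-mapper/LVM entries under the old OSD disks
lsblk

# Remove the stale mapping so the disk shows up as usable again
# (the name is a placeholder; copy the real one from lsblk or dmsetup ls)
dmsetup remove ceph--xxxx--osd--block--xxxx

# Alternatively, zap the whole disk from the CLI (destructive, double-check the device)
# ceph-volume lvm zap /dev/sdX --destroy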
Did I do something wrong?
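Not part of the original post, but one way to separate a GUI/API issue from a cluster issue would be to try the same pool creation directly with the Ceph CLI on a mon node. The pool name and values below mirror the settings from the failed task.

Code:
# Confirm the mons have quorum and the OSDs are up/in first
ceph -s

# Create and configure the pool with the same settings the task tried to apply
ceph osd pool create pool01 128 128 replicated
ceph osd pool set pool01 size 3
ceph osd pool set pool01 min_size 2
ceph osd pool application enable pool01 rbd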
 
