Search results

  1. [SOLVED] Changed root pw on proxmox backup server & proxmox host - can't re-add backup server to proxmox

    So last night I changed the root password on both my Proxmox host server & my Proxmox Backup Server. The new password works fine for logging into SSH and the web GUI on both systems. I deleted all historical backups, then removed the Proxmox Backup Server storage from my Proxmox host (starting fresh)...
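    A minimal sketch of re-adding the PBS datastore from the PVE CLI after a password change (the storage ID, address, datastore name and fingerprint below are placeholders, not values from the thread):

      # remove the stale storage entry, then add it back with the new credentials
      pvesm remove pbs-backup
      pvesm add pbs pbs-backup \
          --server 192.0.2.10 \
          --datastore store1 \
          --username root@pam \
          --password 'new-root-password' \
          --fingerprint '<pbs-cert-fingerprint>'   # shown by "proxmox-backup-manager cert info" on the PBS host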
  2. iGPU Passthrough on PVE 6.3-3

    Hello everyone, I've been wrestling with this all day, hoping someone can help. I have my onboard iGPU passed through to a VM running Windows 10 build 2004. The QEMU utils are installed on this system. It's using the OVMF BIOS (UEFI). I used the instructions in the documentation to set this up...
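    For orientation, a setup like this usually boils down to a handful of host-side commands (VMID 100 and PCI address 00:02.0 are assumptions, not taken from the post):

      # assumes IOMMU is already enabled on the host (e.g. intel_iommu=on on the kernel command line)
      qm set 100 --bios ovmf --machine q35          # UEFI firmware on the q35 machine type
      qm set 100 --hostpci0 00:02.0,pcie=1,x-vga=1  # pass the iGPU through as the primary GPU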
  3. Ceph Cluster - loss of one host caused storage to go offline

    Hi Alwin, as below

    # ceph versions
    {
        "mon": {
            "ceph version 12.2.12 (39cfebf25a7011204a9876d2950e4b28aba66d11) luminous (stable)": 1,
            "ceph version 12.2.13 (98af9a6b9a46b2d562a0de4b09263d70aeb1c9dd) luminous (stable)": 2
        },
        "mgr": {
            "ceph version 12.2.12...
  4. Ceph Cluster - loss of one host caused storage to go offline

    That is not my problem. I have size 3 min 2.

    pool 9 'ceph_hdd' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 512 pgp_num 512 last_change 10523 flags hashpspool stripe_width 0 application rbd
        removed_snaps [1~3,9~4,f~114,124~14]
    pool 20 'ceph_pci_SSD' replicated...
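    For reference, those per-pool values can be checked directly (pool name taken from the output above):

      ceph osd pool get ceph_hdd size        # replica count
      ceph osd pool get ceph_hdd min_size    # replicas required before I/O is blocked
      ceph osd pool ls detail                # the full per-pool settings quoted above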
  5. Ceph Cluster - loss of one host caused storage to go offline

    Output of commands below. The ceph osd crush rule dump shows 3 rules total, but the first (the default one that was "included" when Ceph was set up) isn't in use anymore.

    # ceph osd df tree
    ID CLASS WEIGHT   REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME
    -1       86.25343        -...
  6. Ceph Cluster - loss of one host caused storage to go offline

    Hi there, we've got a 5-node Ceph cluster with 10 Gbps networking. All hosts have 3 x HDDs in them for "slow" storage, and all hosts except host 3 have SSDs in them for fast storage. Hosts 1,2,3,4,5 all have 3 x HDDs in them - we used this for anything that needs "slow" storage. Hosts 1,2,4,5 all have...
  7. Adding SSDs to improve Ceph performance

    Hi Alwin - I see what you mean now. I updated the pool's rule by running ceph osd pool set ceph_ssd crush_rule replicated-ssd, as per the documentation. The output of ceph osd dump now shows that the SSD pool is using rule id 2 - I was expecting this to trigger a rebalance of some sort, but it...
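    A small sketch of that rule switch plus the checks that show whether data movement actually started (pool and rule names follow the thread; the rest is an assumption):

      ceph osd pool set ceph_ssd crush_rule replicated-ssd   # as in the post
      ceph osd pool get ceph_ssd crush_rule                  # confirm the pool now references the new rule
      ceph -s                                                # remapped/backfilling PGs indicate a rebalance
      ceph osd pool stats ceph_ssd                           # per-pool recovery and client I/O rates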
  8. Adding SSDs to improve Ceph performance

    @Alwin - I just ran ceph pg ls-by-pool <my pool name> and checked the up, primary and acting columns - these all reference the OSDs that host the pool, and they're all our OSDs. Am I missing something? Looks correct to me.
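    One way to cross-check that those up/acting OSDs really are the SSD-class ones (a sketch; <my pool name> stays a placeholder):

      ceph pg ls-by-pool <my pool name>   # up/acting columns list the OSD ids backing each PG
      ceph osd crush class ls-osd ssd     # the OSD ids that carry the "ssd" device class
      ceph osd tree                       # the same information grouped by host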
  9. Adding SSDs to improve Ceph performance

    Hi Alwin, could you please point out how you arrived at that conclusion? As per post #5, I created the CRUSH rule with this command: ceph osd crush rule create-replicated replicated-ssd default host ssd, then added the SSDs with pveceph osd create /dev/<devicename>, then created a pool...
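    To verify that freshly created OSDs actually ended up in that rule's device class, something like this can be used (a sketch; the device name stays a placeholder):

      pveceph osd create /dev/<devicename>      # as in the post
      ceph osd tree                             # new OSDs should appear with device class "ssd"
      ceph osd crush rule dump replicated-ssd   # the rule should select class "ssd" under root "default"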
  10. Adding SSDs to improve Ceph performance

    And my ceph topology:

    # ceph osd dump
    epoch 3675
    fsid 5524ca13-287b-46aa-a302-9b1853a5fb25
    created 2018-03-17 17:03:08.615625
    modified 2020-06-08 15:53:13.325398
    flags sortbitwise,recovery_deletes,purged_snapdirs
    crush_version 145
    full_ratio 0.95
    backfillfull_ratio 0.9
    nearfull_ratio 0.85...
  11. Adding SSDs to improve Ceph performance

    And iperf tests on the cluster:

    Server listening on TCP port 5001
    Binding to local address 10.10.10.5
    TCP window size: 128 KByte (default)
    ------------------------------------------------------------
    [  4] local 10.10.10.5 port 5001 connected with 10.10.10.1 port 40912
    [ ID] Interval...
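    For completeness, output in that format typically comes from an iperf2 run along these lines (addresses taken from the output; the flags are assumptions):

      # on the receiving node
      iperf -s -B 10.10.10.5
      # on the sending node
      iperf -c 10.10.10.5 -t 30 -P 4   # 30 seconds, 4 parallel streams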
  12. Adding SSDs to improve Ceph performance

    Hi, so I finally got around to doing this. Unfortunately, performance is not as good as I'd expect it to be, especially in synthetic benchmarks: This is the new SSD pool (it comprises 4 x 1.2 TiB FusionIO devices that in my testing can easily deliver tens of thousands of IOPS and 500+...
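    A quick way to benchmark the pool itself, independent of any VM (a sketch; the pool name and runtimes are assumptions):

      rados bench -p ceph_ssd 60 write -b 4M -t 16 --no-cleanup   # 60 s sequential-write test
      rados bench -p ceph_ssd 60 rand -t 16                       # random reads against the objects just written
      rados -p ceph_ssd cleanup                                   # remove the benchmark objects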
  13. Adding SSDs to improve Ceph performance

    Ok - I think I've got it. Just tested it in my lab. If this looks OK, @Alwin, could you mark it as the solution? First, I created a new Ceph replication rule (the default one is simply called "replicated_rule") and classified it for HDDs only: ceph osd crush rule create-replicated replicated-hdd default...
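    Putting the whole recipe from this thread together (rule and pool names follow the thread; pg_num 128 is an assumption):

      # one rule per device class
      ceph osd crush rule create-replicated replicated-hdd default host hdd
      ceph osd crush rule create-replicated replicated-ssd default host ssd
      # pin the existing pool to the HDD-only rule
      ceph osd pool set ceph_hdd crush_rule replicated-hdd
      # create the new SSD-only pool against the SSD rule
      ceph osd pool create ceph_ssd 128 128 replicated replicated-ssd
      ceph osd pool application enable ceph_ssd rbd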
  14. Adding SSDs to improve Ceph performance

    Hi @Alwin, yes, that looks exactly like what I need. I just went over that doc. The doc says... Perfect - a new SSD-only pool is exactly what I'm after. So to make this work I need to create a new rule, like so: ceph osd crush rule create-replicated replicated-ssd default-ssd host ssd. That will...
  15. Adding SSDs to improve Ceph performance

    Hi all. We've got a 3-node Ceph cluster containing 4 x 6 TB SATA drives that is experiencing very poor I/O write speeds. I do have WAL and block.db on fast SSDs, but they are too small for our workload. So I've installed one enterprise SSD (1.2 TB) in each host. First off, I want to create a...
  16. Ceph OSD Down & Out - can't bring back up - *** Caught signal (Segmentation fault) **

    After struggling for a while with this, it appears that this was caused by one of the SSDs failing. Since all OSDs use the one SSD cache, it caused the segfault on OSD start... I wish this were handled slightly better in Ceph (like an error stating that there was an issue writing to disk rather...
  17. Ceph OSD Down & Out - can't bring back up - *** Caught signal (Segmentation fault) **

    Anyone have any idea on this? I still can't get these OSDs up. I could destroy and re-create them, but without knowing what caused it, it could happen again - and if it happens on more hosts than my CRUSH map allows for, I could really be in trouble.
  18. Ceph OSD Down & Out - can't bring back up - *** Caught signal (Segmentation fault) **

    Yes they are - no SMART errors. I just tested them using dd - looks like they are definitely working OK!
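    The sort of checks being described, roughly (device names are placeholders):

      smartctl -a /dev/sdX | grep -i -e health -e reallocated                       # SMART health and reallocated-sector count
      dd if=/dev/sdX of=/dev/null bs=1M count=4096 iflag=direct status=progress     # raw sequential read test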
  19. Ceph OSD Down & Out - can't bring back up - *** Caught signal (Segmentation fault) **

    Hi, I noticed that in my 3-node, 12-OSD cluster (3 OSDs per node), one node has all 3 of its OSDs marked "Down" and "Out". I tried to bring them back "In" and "Up", but this is what the log shows: ... My setup has WAL and block.db on SSD, but the OSDs are SATA HDDs. Each server has 2 SSDs, each SSD...
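    When OSDs sit down/out like this, the usual first look is at the service and its log on the affected node (OSD id 3 is a placeholder):

      ceph osd tree                                     # which OSDs are down/out, and on which host
      systemctl status ceph-osd@3                       # service state on the affected node
      journalctl -u ceph-osd@3 -n 100                   # the segfault backtrace should show up here
      ceph osd in 3 && systemctl restart ceph-osd@3     # mark it "in" again and retry the start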