Search results

  1.

    VZDump backup failed

    I've got the same issue, ironically on VMID 102 as well... I've got daily backups scheduled, and it started failing daily about a week and a half ago. The strange thing is that when I start the backup manually, it does not fail. The VM is a KVM living on a local SSD. It is being backed up to a local...
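
    The failing job can usually be reproduced from the CLI and compared against the scheduled run; a minimal sketch, assuming VMID 102 and a hypothetical storage ID "local":

      vzdump 102 --storage local --mode snapshot --compress lzo

    Comparing the task log of a manual run like this with the log of the scheduled job is one way to spot what differs between the two.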
  2.

    Is there a process to add monitors to existing ceph pools?

    Thanks MonXP. Let me re-ask the question. Is the only thing I have to do to delete the existing RBD storage device and re-add it with the new monitor IPs? If I delete the CephFS's RBD storage device (under Datacenter/Storage), the pools should be unaffected, correct?
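
    For reference, the monitor list for an RBD storage lives on the monhost line in /etc/pve/storage.cfg, so it can be edited in place (or the storage removed and re-added) without touching the pools themselves; a sketch with placeholder storage ID and monitor IPs:

      rbd: ceph-vm
              monhost 172.16.0.11 172.16.0.12 172.16.0.13
              pool rbd
              content images
              username admin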
  3.

    Ceph VM backup and restore on PVE 4.1 very slow

    While I've not tested other versions, I am in a similar situation speed-wise running 4.1. Ceph is riding on a bonded 10G network w/ 2x10G on each host node and 4x10G on each Ceph node. The NFS share is a RAID 6 riding on the same network hardware but using a different IP range. VM...
  4.

    Is there a process to add monitors to existing ceph pools?

    I'm curious if there is an actual process to add monitors to existing Ceph pools. So far, the only thing I can see to do is to create a new pool with the desired monitors and then migrate the content from the old pool to the new one. Thanks in advance for your feedback... :)
  5.

    A little networking advice. Public/Cluster/Ceph

    I think I'm making some progress. While I'm not getting blazing fast speeds, I am getting 300-400 MB/s with Ceph, which is an improvement. I noticed this presentation where he suggested 1 journal SSD per 3 OSDs. So I'm going to try 2 journal SSDs per 4 OSDs...
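
    On PVE 4.x with FileStore, the journal can be pointed at the shared SSD when each OSD is created; a sketch, assuming hypothetical device names and the pveceph syntax of that era:

      # two spinners sharing one journal SSD (repeat per journal SSD)
      pveceph createosd /dev/sdc -journal_dev /dev/sda
      pveceph createosd /dev/sdd -journal_dev /dev/sda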
  6.

    A little networking advice. Public/Cluster/Ceph

    Yes. 2 OSD nodes (2 JBOD servers w/ 8 spinners each for Ceph - capacity for 12 drives) w/ 4x10G each (overkill); 3 Compute Nodes (1x 256GB SSD boot/swap, 1x 1TB SSD local storage, 1x 3TB local backup - web servers); 3 Monitor Nodes (old spare-parts servers brought back in). I read somewhere that the...
  7.

    A little networking advice. Public/Cluster/Ceph

    Just to confirm: when you speak of 3 Ceph nodes, you are referring to 3 Ceph nodes holding OSDs, yes? Currently I have 3 low-power monitoring nodes, 3 2x8-core E5 front-end web servers, and 2x 12-drive JBOD nodes for Ceph with 8 spinners each. So I have more than 3 nodes - however I only have 2...
  8.

    A little networking advice. Public/Cluster/Ceph

    Thanks Wolfgang... I've read in the Ceph docs that the minimum recommended OSD count is 11, and that performance should increase from there. How many OSDs would you recommend as a minimum? I can add another JBOD hardware node and fill up the bays with drives, but I want to make sure that I'll see...
  9.

    A little networking advice. Public/Cluster/Ceph

    From inside a VM with the drive on Ceph: hdparm -t /dev/vda1 - 72 MB/sec. Moving the VM to a local spinner: hdparm -t /dev/vda1 - 126 MB/sec. Moving the VM to a local SSD: hdparm -t /dev/vda1 - 336 MB/sec. Doing an hdparm -t directly on the local SSD /dev/sdb1 on the host node I'm getting 497...
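
    hdparm inside the guest measures the whole virtio + RBD path at once; benchmarking the pool directly from a node can help separate Ceph performance from VM overhead. A sketch, assuming the pool is named rbd:

      rados bench -p rbd 60 write --no-cleanup   # raw write throughput to the pool
      rados bench -p rbd 60 seq                  # sequential reads of the objects just written
      rados -p rbd cleanup                       # remove the benchmark objects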
  10.

    A little networking advice. Public/Cluster/Ceph

    OK... Thanks again Wolf. Rather than bonding 2x10G on each host node, would you recommend that I dedicate 1x10G to the cluster and 1x10G to Ceph? I'm not running any Ceph OSDs on the host nodes. I ask because the speed is quite painful: I'm at 22% after an hour with a 450G VM using the backup...
  11.

    A little networking advice. Public/Cluster/Ceph

    Thanks for the feedback Wolf... With my configuration, is the copy happening over the "cluster" network and not the "ceph/rbd" network? I'm curious, as it would explain the lag.
  12.

    trying to aquire lock...TASK ERROR: can't lock file '/var/lock/qemu-server/...

    I'm with Adrian. Is there a process to do this without resetting the nodes? Resetting any machine other than for a patch that requires it should always be a last resort.
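
    For a lock left behind by a dead task, the lock can usually be cleared without rebooting anything, once you've confirmed no backup or migration task is still running against that VM; a minimal sketch, with the VMID as a placeholder:

      qm unlock <vmid>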
  13.

    A little networking advice. Public/Cluster/Ceph

    I'm currently running 3 networks. I've got public: 1x1G, xxx.xxx.xxx.xxx; Private/Cluster: 1x1G, 10.100.10.x; Ceph: 2x10G on each host node and 4x10G on each Ceph node (all bonded), 172.16.0.x. Everything is going great with Ceph so far as I can see - and everything is working great with the...
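
    One way to keep Ceph traffic pinned to the 10G links (and off the 1G cluster network) is to declare the networks explicitly in ceph.conf; a sketch using the subnets above, with the /24 masks as assumptions:

      [global]
          public network = 172.16.0.0/24
          # optional: a separate subnet for OSD replication traffic, if one is added later
          # cluster network = 172.16.1.0/24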
  14.

    [SOLVED] Ceph Help...

    Thank you udo! It was the default rbd that was killing me. All healthy now. How can I thank you for your time and efforts?
  15.

    [SOLVED] Ceph Help...

    While diving into Ceph I ran all kinds of configs and tests. I did previously have a pool with the same name and a replica 3. But that was before the last reinstall of everything.
  16.

    [SOLVED] Ceph Help...

    Thanks Udo... While browsing around I did find an orphaned disk image called vm-103-disk-1, which is probably the culprit. Any idea how I can remove it or kill it? I found this image under the Storage view and then the Content tab.
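
    If no VM config still references the image, it can be inspected and removed with the rbd tool from any node; a sketch, assuming the pool is named rbd:

      rbd ls -p rbd              # confirm vm-103-disk-1 is listed and nothing else claims it
      rbd rm rbd/vm-103-disk-1   # remove the orphaned image (irreversible)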
  17.

    [SOLVED] Ceph Help...

    I set up a test environment and started over a few times. What's strange is each time I restart the Ceph network, even after writing zeros to all the OSDs to make sure things were cleared out, I end up with: HEALTH_WARN 1 pgs degraded; 1 pgs stuck degraded; 64 pgs stuck unclean; 1 pgs stuck...
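
    A few status commands usually show why PGs stay degraded or unclean after a rebuild (a replica size larger than the number of OSDs, or OSDs that never came back in, are common causes); a sketch, with the pool name as a placeholder:

      ceph -s                          # overall cluster state
      ceph health detail               # which PGs are stuck, and why
      ceph osd tree                    # confirm all OSDs are up and in
      ceph osd pool get rbd size       # compare replica count to available OSDs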
  18.

    Understanding Ceph

    Thanks for your continued advice. I appreciate it. While I'd love to put PM on every server for continuity, it's possible but unlikely that I will on the monitoring and/or OSD nodes. Licensing costs are a concern and I'm not 100% certain that support for Ceph via PM is fully available at this...