Search results

  1. [SOLVED] How to get HA managed VM out of error state without a restart

    Thank you. That fits very nicely in the "why didn't I think of that" category. Worked as expected.
  2. HA max_restart and max_relocate best practices

    I have a 13 node cluster using HA. What are the best practices for setting the max_restart and max_relocate values? As it stands right now, for VMs that can run on any node, I've simply picked a max_restart value of 4 and a max_relocate of 10 (see the HA resource sketch after this list). My thinking is that the HA service will try to...
  3. [SOLVED] How to get HA managed VM out of error state without a restart

    So I have a 13 node cluster running about 40 VMs. There are a number of VMs scattered across the nodes that are in the HA error state, but the VM itself is actually running. After moving the first VM from error->disabled->started, I realized that it was shutting down and restarting the VMs...
  4. Ceph Performance

    Everything is stable right now, but it still doesn't feel as fast as it should. All the benchmarks show that the speed is there, but in practice the transfers as I move my images to the ceph storage still don't feel as fast as they should. One of the problems I had was playing around with MTU...
  5. Ceph Performance

    Ah. No, I don't have that set. Not using storage for containers.
  6. Ceph Performance

    Moving the disk is also tied to CPU usage, since it appears to be using the qemu-img command to convert the image to the ceph storage (see the qemu-img sketch after this list). That also seems to limit the speed when moving a disk from local to ceph. I don't know how to answer your first question. I've installed ceph from the...
  7. Ceph Performance

    Yes, that is interesting. Even though iperf shows 10G throughput in both directions, scp transfers the 30 GB file at about the same 35 MB/s rate. I'll look into what may be happening and report back.
  8. Ceph Performance

    3. I'm copying a 30G image right now. It's already been a few minutes and the GUI shows 35MB/s.
  9. Ceph Performance

    It does seem faster, but not 10G faster.
  10. Ceph Performance

    Yes, all servers are synchronized to the proper time. I did find that when I deleted all my monitors except on the three storage nodes, the errors went away. I think there is something not happening (or happening incorrectly) when I created the extra monitors on other nodes.
  11. Ceph Performance

    Followup... When trying to do some rbd commands, I get this sometimes:

        7f19cf41d700 0 -- 192.168.201.236:0/4227972101 >> 192.168.201.245:6800/11670 conn(0x7f19b8017eb0 :-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 l=1).handle_connect_reply connect got BADAUTHORIZER

    If I abort the...
  12. Ceph Performance

    I am running PVE 5.0 with Ceph Luminous from the test repository. I have a 13 node Ceph & PVE cluster with three storage nodes; each storage node has four 2 TB OSDs with Bluestore backing and an SSD drive for the journal (block.db) on partitions (see the OSD sketch after this list). The nodes are communicating over a 10Gb network...
  13. PVE5 Ceph Storage

    Yes, there is a RAID controller on all three hosts, which is why the SSD shows up as a regular HD. The RAID controller says it is optimizing things for the SSD behind the scenes. There is an SSD on sdb, and the OSDs are sdc, sdd, sde, and sdf. When I configured an OSD, I selected the data drive...
  14. PVE5 Ceph Storage

    Thanks for all the help. Two more questions now that I have the third storage node up and added to the cluster and healthy: When I created the OSDs with Bluestore as the backing, it only created 1GB block.db partitions on my SSD, probably because the system didn't correctly detect the SSD and...
  15. PVE5 Ceph Storage

    Perhaps I wasn't as specific as I could be. The ceph cluster is using the same hardware as my 13 node PVE 5 cluster. I currently have only two storage nodes (which are also PVE nodes), but I will be adding new hard drives to one of the PVE nodes to create a third ceph storage node. The...
  16. remove/disable local storage?

    The answer to your first question is yes. You can disable local storage in the storage configuration and it will no longer show in the GUI on the left side in any of the views (see the pvesm sketch after this list). Since I use local storage currently, I can't confirm that. In the worst case, you can remove all of the object types...
  17. PVE5 Ceph Storage

    I added another monitor per your suggestion...but since I only have two nodes operational right now, I don't see the point of having three replicas yet. You say it's risky, but when I used a NAS server, that was a single point of failure...so what I asked is whether or not with the two replicas...
  18. PVE5 Ceph Storage

    After a year of using FreeNAS with LVM over iSCSI with 30 nodes and almost 100 VMs, that solution finally revealed its shortcomings. As a stopgap to keep the VMs running, I moved the VM images back onto each node's local hard drives. I'm now looking to set up a three node ceph cluster, running...
  19. Virtuals in LVM ata errors

    Good job. I've been having a similar problem. I'm now moving to ceph storage because I didn't know exactly what the source of the problem was.
  20. Clear HA error status from the GUI

    My understanding is that you move the "requested state" to "disabled" for the VM in HA resources, then back to "started". That should clear the error (see the ha-manager sketch after this list).
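
The error-state procedure quoted in results 3 and 20 can also be requested from the CLI instead of the GUI. This is a minimal sketch assuming a hypothetical HA resource vm:100; substitute the real resource ID.

    # Sketch: cycle a stuck HA resource out of the "error" state.
    # vm:100 is a placeholder resource ID.
    ha-manager status                        # confirm the resource reports "error"
    ha-manager set vm:100 --state disabled   # drop the resource out of error
    ha-manager set vm:100 --state started    # ask the HA manager to run it again

As result 3 observes, cycling through disabled and started this way may shut down and restart the guest, so plan for a brief outage.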
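
For the max_restart and max_relocate limits discussed in result 2, here is a hedged sketch of how such values are attached to an HA resource. The VM ID is hypothetical and the numbers are just the ones quoted in the thread.

    # Sketch: register a VM as an HA resource with explicit retry limits.
    # vm:101, 4 and 10 are placeholders.
    ha-manager add vm:101 --state started --max_restart 4 --max_relocate 10

    # The same settings land in /etc/pve/ha/resources.cfg, roughly:
    #   vm: 101
    #           max_restart 4
    #           max_relocate 10
    #           state started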
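
Result 6 points out that moving a disk to Ceph goes through qemu-img, which is where the CPU time is spent. A rough sketch of the equivalent manual conversion, assuming a hypothetical qcow2 source, a pool named ceph-pool, and a qemu build with rbd support:

    # Sketch: convert a local qcow2 image directly into an RBD image.
    # Source path, VM ID and pool name are placeholders.
    qemu-img convert -p -f qcow2 -O raw \
        /var/lib/vz/images/100/vm-100-disk-1.qcow2 \
        rbd:ceph-pool/vm-100-disk-1

The -p flag only prints progress; the format conversion itself is what consumes the CPU, which matches the behaviour described in the thread.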
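
For the Bluestore layout in results 12-14 (OSD data on the 2 TB disks, block.db on SSD partitions), here is a hedged sketch using ceph-volume from Luminous. Device names are placeholders, and the PVE tooling of that era (pveceph / ceph-disk) can produce the same layout.

    # Sketch: create one Bluestore OSD with data on a spinner and its DB on
    # an SSD partition. /dev/sdc and /dev/sdb1 are placeholders.
    ceph-volume lvm create --bluestore --data /dev/sdc --block.db /dev/sdb1

When the tooling creates the DB partition itself (as ceph-disk did), its size is taken from bluestore_block_db_size in ceph.conf (a value in bytes), which is the likely explanation for the small 1GB block.db partitions mentioned in result 14.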
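
Result 16 describes disabling local storage from the storage configuration. A small sketch of the same from the shell, assuming the default storage ID "local"; as the quoted poster notes, whether it then disappears from every view is unconfirmed.

    # Sketch: mark the default "local" storage as disabled (and re-enable it).
    # This toggles the "disable" flag on the entry in /etc/pve/storage.cfg.
    pvesm set local --disable 1
    pvesm set local --disable 0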
