Hey all, I just upgraded my 6-node cluster from Luminous to Nautilus.
Overall it went well except on one node.
Looks like the /etc/ceph/osd/ directory is totally gone, which seems to be the main issue. Any ideas where it could have gone during the upgrade process?
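If the OSDs on that node were adopted with ceph-volume's "simple" mode during the upgrade, the JSON files under /etc/ceph/osd/ can usually be regenerated by rescanning the OSD data directories. A rough sketch, assuming ceph-volume is present on the node and the OSD data is still mounted (paths are examples):

```shell
# Rescan all OSDs on this host and rewrite the
# /etc/ceph/osd/<id>-<fsid>.json metadata files
ceph-volume simple scan

# Or scan one OSD's data directory explicitly
ceph-volume simple scan /var/lib/ceph/osd/ceph-0

# Re-enable the systemd units that consume that metadata
ceph-volume simple activate --all
```

This is only a sketch; check what's actually mounted under /var/lib/ceph/osd/ before running it.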
I am thinking my only...
Actually, I take that back; I was just made aware that it locked up again during the backup.
When you say detailed logs, what do you mean? Windows side or Proxmox side?
The monitor is a VM that is part of a small cluster, which we have the ability to run at either site. In the case of a burn-down at our main site, we simply bring it up at the second site. It works without any issues.
We run a stretched ceph cluster over two datacenters. We have 40Gb dark fiber with about 0.7 ms latency.
Overall it runs well. We avoid split brains by having 1 extra monitor at the primary site.
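To make the split-brain point concrete, here's the quorum arithmetic for a hypothetical layout (monitor counts are made up, the real cluster may differ):

```shell
# Hypothetical two-site stretch layout: 3 mons at the primary
# site, 2 at the secondary (the "1 extra" lives at the primary).
primary=3
secondary=2
total=$(( primary + secondary ))
quorum=$(( total / 2 + 1 ))   # majority needed to form quorum

echo "total=$total quorum=$quorum"
# If the inter-site link is cut, the primary site alone still has
# 3 >= quorum monitors and stays up; the secondary (2) cannot form
# quorum, so no split brain.
```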
Rbd-mirror with journaling has some serious write overhead. Personally, I wouldn't even consider...
We use ceph snapshots and so does benji. To stay efficient, benji uses the rbd diff command to determine which blocks have changed between snapshots. This way the entire rbd image doesn't need to be read each night.
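For reference, the incremental mechanism looks roughly like this (pool, image, and snapshot names below are made up, not what benji actually configures):

```shell
# Take tonight's snapshot (example names)
rbd snap create rbd/vm-100-disk-0@backup-2019-10-02

# List only the extents that changed since last night's snapshot;
# a backup tool can then read just those offsets instead of
# streaming the whole image
rbd diff --from-snap backup-2019-10-01 \
    rbd/vm-100-disk-0@backup-2019-10-02 --format json
```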
I agree, the node isn't really "dying" if it's able to live-migrate VMs to another node. Most real failures cause a node to go down hard.
This would be a good read for you.
https://pve.proxmox.com/wiki/High_Availability
I am using Benji as a standalone backup solution.
It's set up inside a VM which has connectivity to my ceph public network. It is a dedicated node because Benji can use quite a bit of CPU. Our target storage is some simple ZFS arrays which I present over NFS to the Benji VM.
We put benji...
Hey guys, we recently started moving our production clusters to Proxmox 6. Everything has been going really well except for when we back up our one and only Windows Server 2012 VM. The backup causes this VM to lock up and we have to reset it via the Proxmox GUI.
All of our Linux VMs and Windows 10 VMs...
I am in the process of trying to automate the Proxmox 5 to Proxmox 6 upgrade as much as possible for some simple one-node setups out in the field.
I am able to automate most of the apt-get side using something like this:
DEBIAN_FRONTEND=noninteractive \
apt-get \
-o Dpkg::Options::=--force-confnew \...
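For context, a fuller (hypothetical) version of that kind of invocation for an unattended upgrade might look like the following; the flags shown are standard apt/dpkg options, but adapt them to your own setup:

```shell
# Non-interactive dist-upgrade: never prompt, and take the package
# maintainer's version of any changed config files (--force-confnew)
DEBIAN_FRONTEND=noninteractive \
apt-get \
  -o Dpkg::Options::="--force-confnew" \
  -y -q \
  dist-upgrade
```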
One of our clients' systems (3-node Proxmox 5.1) gracefully rebooted yesterday midday and we aren't quite sure why. It's an HP DL380 Gen10 that has been in place for a bit over 6 months.
Am I correct in thinking that the watchdog would never do a graceful restart? In the logs we saw the VM...
Interested to know if you figured this one out; I am looking at an SSD ceph cluster for small random reads/writes as well. It seems like you should be getting much better numbers.