Hi
We have been running Proxmox VE for a while, since before Proxmox Backup Server was released.
After switching to Proxmox Backup Server we have noticed that while a VM backup is running, the Disk IO graph spikes to somewhere between 2G and 5G, which means that if...
Hi
We are looking into hardening our PVE setup. Currently access to the web UI is fairly locked down with restrictive inbound firewall rules and 2FA for all users, including root@pam.
We do however plan to update the SSH server configuration to disable password-based authentication entirely, so...
ceph.log has no entries since the described failure.
The ceph-mon.xxx.log files contain only these types of lines:
2022-07-15T11:19:58.832+0200 7fbf7e16d700 1 mon.x1-pve-srv1@0(electing) e3 handle_auth_request failed to assign global_id
ceph-mgr.xxx.log is empty
osd logs and volume logs are...
Hi
We recently deployed a small 3-node PVE cluster on-premise. All three nodes run Ceph OSD, Ceph Monitor, Ceph Manager and Ceph MDS.
We run the setup with 3 switches and each server has 4 NICs.
The first two NICs are configured as a Linux bond in failover mode and have a cable to each of the...
Hi
We have been doing some lab testing on creating prebuilt Ubuntu images for various application types, with things like qemu-guest-agent preinstalled.
These images are based on the official Ubuntu cloud images and we therefore use cloud-init inside Proxmox to do the...
Hi
We are looking into optimizing parts of our virtual infrastructure hosted on PVE. For this we have looked at the official Proxmox benchmarks for both Ceph and local ZFS pools and have noticed a very specific configuration, where OVMF BIOS was used with the q35 machine type.
Are there any...
The Ceph storage is not locally available on the node, but enabling it in the cluster storage configuration for the node does allow it to show up.
However, since the current host only has a "slow" 2Gbps link, Ceph will perform poorly, assuming that it will even migrate to the Ceph cluster when...
Backup & Restore does not really solve it, as it increases service downtime a lot. This method would require us to take the servers out of service, so that data does not change, then make the backup, delete the VM and then restore it on the target node. If the backup is then corrupt, all data is lost...
Hi
We have some larger virtual servers running on a ZFS mirror of NVMe drives. These virtual servers run various applications, but mostly they run Microsoft Windows Server with Microsoft SQL Server on top.
In terms of resources, they have 500-900GB of disk space on a single virtual...
So although the storage on the PVE host is fast enough that it should not starve the VM of IO, it will seem so until the backup is done?
Is there any way around it?
The data on the PBS host is currently being migrated to another host with faster and larger disks. The sync job eats disk IO as if it was candy, so data is being transferred as fast as possible, but backups still have to run to the "old" PBS host in the meantime. Once migration is done, performance will...
Hi
We have experienced an issue for some time now where a VM runs extremely slowly and access to disk IO is almost impossible during the backup.
Usually it is not a problem as backups to PBS finish within a few hours. It is however a problem when backups take longer than usual and run over...
Hi
We rent physical hardware in a DC for storing the PVE guest backups off-site. On this server we have installed PBS.
Our PBS server is now starting to reach 80% capacity and, as per the dashboard, it is expected to be full within the next 30-40 days. All backups are running with encryption enabled.
So...
Hi
We have now started the process of replacing some nodes in our cluster. The cluster runs Ceph + HA.
What we have done so far is mark each OSD "out" and let the cluster rebalance after each one. Then we stopped the OSD and deleted the MGR, MON and MDS roles from the node being removed.
Then we rebooted the...
Hi
I have found some documentation on the case that I am facing, but wanted to run it by some of the experts here first :)
We currently have a 5-node PVE cluster where 4 nodes are running a Ceph cluster and with Proxmox HA configured for all VMs on these nodes. Each node has two NVMe drives each...
Hi
We are in the process of migrating services from servers running a local ZFS mirror on 4TB NVMe SSD drives. We are migrating to new servers with the same specifications and hardware as what is currently running, but instead of spreading services over two nodes (a 3rd node runs some internal...
Hi
We are running PBS 1.x in production with a Let's Encrypt cert and I can confirm that each time the cert is renewed, the fingerprint seems to change and backups stop running until we manually update the fingerprint for the storage in the PVE cluster.
We have a lab setup where PBS is still...
Hi
We have been doing some testing on HA with PVE and Ceph, which has given very positive results in terms of our goals with exploring the setup.
From our testing, we have found two issues that we had not really expected:
1. When a host has failed and comes back online, VMs are not migrated...
Thank you for the replies
We will be looking into how we can use Ceph. Some initial testing shows that it gives the necessary HA, so we just need to figure out the monitoring of disks.
Thank you for the reply
Are there any recommended tools that can help monitor disk health when Ceph has control over the disks?
With zfs-zed, our system administrators get a notification when a disk failure is detected.