Search results

  1. Switching existing Ceph OSD from unencrypted to encrypted

    Hi, I am sure that this has been done before, but I have not been able to find out whether it is possible. We currently run a fairly large 3-node cluster on PVE with PVE-managed Ceph for the storage. Due to NIS2 and other local regulations we are being forced into having "encryption at rest" for all virtual...
  2. ACME client cannot perform DNS validation with API Data

    Hi, we recently found that our DNS provider is supported in the PVE ACME client, and we therefore configured it according to the documentation. Since the provider does not yet have specific form fields configured, we simply added the environment variables (for use with acme.sh) as key-value...
  3. Live migration between Intel and AMD

    Hi, we are looking into rotating our fleet of servers to gain more performance in less physical space by going from Intel Xeon Gold 6226R to AMD EPYC Genoa based systems instead. Until now we have primarily been using Intel Xeon-based nodes in our cluster, but for our workloads we get...
  4. Disk IO stats - spike during backups

    Hi, we have been running Proxmox VE for a while now, since before Proxmox Backup Server was released. After switching to Proxmox Backup Server we have noticed that while a backup of a VM is running, the Disk IO graph spikes to somewhere between 2G and 5G, which means that if...
  5. Disable Password Auth

    Hi, we are looking into hardening our PVE setup. Currently, access to the web UI is fairly locked down with restrictive inbound firewall rules and 2FA for all users, including root@pam. We do, however, plan to update the SSH server configuration to disable password-based authentication entirely, so...
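    Disabling password-based SSH logins usually comes down to a couple of sshd directives. A minimal sketch; the drop-in path is an assumption, so verify it against your distribution's defaults:

    ```
    # /etc/ssh/sshd_config.d/hardening.conf  (assumed drop-in location)
    PasswordAuthentication no          # reject password logins
    KbdInteractiveAuthentication no    # reject keyboard-interactive (PAM) logins
    PermitRootLogin prohibit-password  # root may only log in with a key
    ```

    After editing, `sshd -t` validates the configuration and `systemctl reload ssh` (or `sshd`, depending on the distribution) applies it; confirm a working key-based login exists before reloading.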
  6. Ceph cluster timeout

    ceph.log has no entries since the described failure. The ceph-mon.xxx.log files contain only lines of this type: 2022-07-15T11:19:58.832+0200 7fbf7e16d700 1 mon.x1-pve-srv1@0(electing) e3 handle_auth_request failed to assign global_id. ceph-mgr.xxx.log is empty; OSD logs and volume logs are...
  7. Ceph cluster timeout

    Hi, we recently deployed a small 3-node PVE cluster on-premises. All three nodes run Ceph OSD, Ceph Monitor, Ceph Manager and Ceph MDS. We run the setup with 3 switches, and each server has 4 NICs. The first two NICs are configured as a Linux bond in failover mode and have a cable to each of the...
  8. Move cloud-init drive

    Hi, we have been doing some lab testing on creating a prebuilt Ubuntu image for various application types, where things like the qemu-guest-agent are preinstalled etc. These images are based on the official Ubuntu cloud images, and we therefore use cloud-init inside Proxmox to do the...
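    Moving a cloud-init drive is typically a delete-and-recreate, since PVE regenerates the drive's contents on the fly. A sketch, with VMID 100 and the target storage name as placeholders:

    ```
    # Detach and delete the existing cloud-init drive (often attached as ide2)
    qm set 100 --delete ide2
    # Recreate it on the target storage; the special "cloudinit" volume name
    # tells PVE to generate the drive there
    qm set 100 --ide2 target-storage:cloudinit
    ```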
  9. OVMF vs SeaBIOS

    Hi, we are looking into optimization of parts of our virtual infrastructure hosted on PVE. For this we have looked at the official Proxmox benchmarks for both Ceph and local ZFS pools and have noticed a very specific configuration, where the OVMF BIOS was used on the q35 machine type. Are there any...
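    For comparison testing, recreating that benchmarked configuration on an existing VM can be sketched as follows; VMID 100 and the storage name are placeholders, and note that OVMF additionally needs an EFI vars disk:

    ```
    # Switch to the q35 machine type with OVMF (UEFI) firmware
    qm set 100 --machine q35 --bios ovmf
    # Allocate the EFI vars disk that OVMF stores its settings on
    qm set 100 --efidisk0 target-storage:1,efitype=4m,pre-enrolled-keys=1
    ```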
  10. Cannot migrate between ZFS and Ceph

    The Ceph storage is not locally available on the node, but enabling it in the cluster storage configuration for the node does allow it to show up. However, since the current host only has a "slow" 2Gbps link, Ceph will perform poorly, assuming that it will even migrate to the Ceph cluster when...
  11. Cannot migrate between ZFS and Ceph

    Backup & Restore does not really solve it, as it increases service downtime a lot. This method would require us to take the servers out of service so that data does not change, then make the backup, delete the VM and then restore it on the target node. If the backup is then corrupt, all data is lost...
  12. Cannot migrate between ZFS and Ceph

    Hi, we have some larger virtual servers running on a ZFS mirror of NVMe drives. These virtual servers run various applications, but mostly Microsoft Windows Server with Microsoft SQL Server on top. In terms of resources, they have 500-900GB of disk space on a single virtual...
  13. Almost 0 disk IO during backup

    So although the storage on the PVE host is fast enough that it should not starve the VM of IO, it will seem so until the backup is done? Is there any way around it?
  14. Almost 0 disk IO during backup

    The data on the PBS host is currently being migrated to another host with faster and larger disks. The sync job eats disk IO as if it were candy, so data is being transferred as fast as possible, but backups still have to run to the "old" PBS host. Once migration is done, performance will...
  15. Almost 0 disk IO during backup

    Hi, we have experienced an issue for some time now where a VM runs extremely slowly and disk IO is almost impossible during backup. Usually this is not a problem, as backups to PBS finish within a few hours. It is, however, a problem when backups take longer than usual and run over...
  16. PBS Migration to new host

    Hi, we rent physical hardware in a DC for storing the PVE guest backups off-site, and on this server we have installed PBS. Our PBS server is now reaching 80% capacity, and per the dashboard it is expected to be full within the next 30-40 days. All backups run with encryption enabled. So...
  17. Node does not completely disappear from cluster

    Hi, we have now started the process of replacing some nodes in our cluster. The cluster runs Ceph + HA. What we have done so far is to "out" each OSD and let the cluster rebalance after each one, then stop the OSD and delete the MGR, MON and MDS roles from the node to be removed. Then we rebooted the...
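    The removal sequence described above roughly maps to these commands; the OSD id 3 and the node name are placeholders:

    ```
    ceph osd out osd.3               # mark the OSD out, then wait for rebalance
    ceph -s                          # repeat until the cluster is back to HEALTH_OK
    systemctl stop ceph-osd@3        # stop the OSD daemon
    pveceph osd destroy 3 --cleanup  # remove the OSD and wipe its disk
    pveceph mon destroy nodename     # then remove MON, MGR and MDS from the node
    pveceph mgr destroy nodename
    pveceph mds destroy nodename
    ```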
  18. Ceph - replacing a node

    Hi, I have found some documentation for the case I am facing, but wanted to run it by some of the experts here first :) We currently have a 5-node PVE cluster where 4 nodes run a Ceph cluster, with Proxmox HA configured for all VMs on these nodes. Each node has two NVMe drives...
  19. VM Disk cache config for Ceph

    Hi, we are in the process of migrating services from servers running a local ZFS mirror on 4TB NVMe SSD drives. We are migrating to new servers with the same specifications and hardware as what is currently running, but instead of spreading services over two nodes (a 3rd node runs some internal...
  20. PBS: error fetching datastores - fingerprint 'xx' not verified, abort! (500)

    Hi, we are running PBS 1.x in production with a Let's Encrypt cert, and I can confirm that each time the cert is renewed, the fingerprint changes and backups stop running until we manually update the fingerprint for the storage in the PVE cluster. We have a lab setup where PBS is still...
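    A possible workaround sketch, assuming the PVE storage is called `pbs-backup` (a placeholder): either refresh the pinned fingerprint after each renewal, or, since Let's Encrypt certificates chain to a trusted CA, drop the pinned fingerprint entirely so PVE validates the certificate chain instead.

    ```
    # On the PBS host: read the current certificate fingerprint
    proxmox-backup-manager cert info

    # On a PVE node: update the pinned fingerprint for the PBS storage ...
    pvesm set pbs-backup --fingerprint <sha256-fingerprint>
    # ... or remove the pin and rely on normal CA validation
    pvesm set pbs-backup --delete fingerprint
    ```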