Search results

  1. L

    Node went down - unclear why - log attached

    We had a node go down two days ago and I'm at a loss figuring out why. I attached the log. This happened at 12:30. The other nodes simply show that the OSDs went down and feverishly started rebalancing the cluster. Is there any indication as to why? Sep 8 12:29:56 FT1-NodeA...
  2. L

    NVMe OSD generates crc error. Failing drive?

    I have a relatively new Samsung Enterprise NVMe in a node that is generating the following error: ... 2025-08-26T15:56:43.870+0200 7fe8ac968700 0 bad crc in data 3326000616 != exp 1246001655 from v1:192.168.131.4:0/1799093090 2025-08-26T16:03:54.757+0200 7fe8ad96a700 0 bad crc in data...
  3. L

    Perplexing: When a node is turned off, the whole cluster loses its network

    On a 4 cluster Proxmox installation, when one node is shut down, access to the network on the others goes away somehow. Here is the configuration: Each node is set up similarly, but with the LAN, corosync and other addresses changed with each node. The enlan2.25 and enlan2.35 are legacy setups...
  4. L

    enabling ceph image replication: how to set up host addresses?

    I'm attempting to do a test to replicate a ceph image to a remote cluster by following this HOWTO. However, what I'm missing is the detail of how or where to specify where "site-a" is in the examples given in terms of ip address. When I follow the instructions, I see this in the status logs...
  5. L

    [SOLVED] 2 stuck OSD's in ceph database

    I tried to remove all OSDs from a cluster and recreate them, but 2 of them are still stuck in the ceph configuration database. I have done all the standard commands to remove them, but the reference stays. # ceph osd crush remove osd.1 removed item id 1 name 'osd.1' from crush map # ceph osd...
  6. L

    New install pve 8.2 on Debian 12 certificate blocks GUI

    I have done a fresh install on a Debian 12 cloud host and all went well, I thought, except that port 8006 is not responding. (I followed the documentation here.) In the logs I find this: Jun 04 17:52:23 pmx1 pveproxy[12734]: /etc/pve/local/pve-ssl.pem: failed to use local certificate chain...
  7. L

    Windows Server 2022 reports disk errors on ceph volume

    We installed a new Windows Server 2022 on a cluster that uses an SSD-based ceph volume. All seemed to be going well, when suddenly the Windows event log reported: "An error was detected on device \Device\Harddisk0\DR0 during a paging operation" It's Windows error # 51 There are other Windows...
  8. L

    [SOLVED] Remote server doesn't deduplicate

    I have set up a remote server in a different city to which I ship all backups using a sync job. The remote PBS datastore however doesn't seem to be doing deduplication. The local PBS. Usage : 91.02% (3.97 TB of 4.37 TB) Backup Count CT : 32 Groups, 380 Snapshots Host : 0 Groups, 0...
  9. L

    pvesh and how to list API endpoints

    I have seen a couple of blogs out there that claim one can simply use the pvesh command without any parameters and it will drop into an interactive mode where one can show the calls that can be done at a particular level. It doesn't work like that for me though and the documentation is really...
  10. L

    Remote PBS log shows error, but all processes look completed

    Can anyone see what causes this error? 2023-12-18T13:00:07+02:00: percentage done: 98.18% (54/55 groups) 2023-12-18T13:00:07+02:00: sync group vm/199 2023-12-18T13:00:07+02:00: re-sync snapshot vm/199/2023-11-20T08:36:28Z 2023-12-18T13:00:07+02:00: no data changes 2023-12-18T13:00:07+02:00...
  11. L

    Use API to get storage location for VM's

    I need to extract which storage is assigned to each VM and LXC in our cluster. I can retrieve the total allocation for the boot disk, but can't see an obvious way to get the detail for each storage volume allocated. Some of our VMs have a boot disk on a ceph SSD pool and a logging disk on...
  12. L

    Strange disk behaviour

    We're experiencing a problem with a FreeBSD KVM guest that works 100% on installation, but after a while starts complaining that it can't write to the disk anymore. What we have done so far: Moved the disk image off ceph to an lvm-thin volume Changed the disk from Virtio-SCSI to SATA and also...
  13. L

    [SOLVED] Ballooning memory: How to retrieve the max ram allowed from the guest OS?

    Scenario: CentOS guest OS with 8GB/24GB RAM as min/max allocated. The machine typically uses between 10GB and 12GB of the allowed RAM due to ballooning, but here's a problem: Using free -h shows only 14GB in total available. Can't find anything else that shows the 24GB max allowed. There are...
  14. L

    proxmox-backup-proxy rrd EINVAL error

    I'm getting the error below after something happened (it was not happening before) and I'm not sure that I changed anything deliberately. It prevents the status graphs (rrd, right?) from being displayed on the PBS administration section. Oct 11 22:07:17 pbs3 systemd[1]: Starting...
  15. L

    Can one set PBS priority lower to prevent guest slowdowns?

    I have run into an issue a couple of times in that guest OSes slow down dramatically if the PBS server doesn't perform for whatever reason. Previously I had a network issue, which prevented backups from being written at a reasonable speed and it caused the guest machines being backed up to...
  16. L

    [SOLVED] SDN broken after underlying network change

    We ran into a very nasty issue a few days ago. Background: Systemd generates ridiculously long interface names (see https://manpages.debian.org/bookworm/udev/systemd.link.5.en.html and referenced here https://wiki.debian.org/NetworkInterfaceNames#CUSTOM_SCHEMES_USING_.LINK_FILES) like...
  17. L

    Mobile app noVNC cannot be shifted.

    When viewing a QEMU machine console with noVNC, the options are to either scale the screen locally, or not. When scaled locally, the text is so small that it's not practically usable. Disabling local scaling fixes that, but then the view screen cannot be shifted left / right or up / down, so...
  18. L

    What happens during VM migration?

    I have a FreeBSD 12.3 guest running a poller node and when it gets installed everything runs just fine. We can stop and start the guest too, no problem. The guest uses VirtIO SCSI and uses a ceph RBD image of 120GB. The FreeBSD qemu-guest-agent is installed. If for some reason the VM is...
  19. L

    Change WAL and DB location for running (slow) OSD's

    I need to do something about the horrible performance I get from the HDD pool on a production cluster. (I get around 500KB/s benchmark speeds!). As the disk usage has been increasing, so the performance has been dropping. I'm not sure why this is, since I have a test cluster, which higher...
  20. L

    [SOLVED] How to remove old mds from ceph? (actually slow mds message)

    I had a failed node, which I replaced, but the MDS (for cephfs) that was on that node is still reported in the GUI as slow. How can I remove that? It's not in ceph.conf or storage.conf. MDS_SLOW_METADATA_IO 1 MDSs report slow metadata IOs mdssm1(mds.0): 6 slow metadata IOs are blocked > 30 secs...