I don't see any follow-up to this, but we're having precisely the same problem.
This is a brand new PBS installation - all updates are done - and the zpool layout is as follows:
root@pbs-01:~# zpool status
scan: scrub repaired 0B in 00:17:24 with 0...
We have an issue with OVS native VLAN tagging.
For a subset of VMs that should be natively tagged to VLAN 6, we're unable to get native VLAN tagging to work.
When we hard-set the network interface to VLAN 6 for the affected VM, everything works fine, but for some reason, if we leave...
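For reference, here's a sketch of the two configurations we're comparing at the OVS level (the port name `tap100i0` is a placeholder for the affected VM's tap interface):

```shell
# Native VLAN membership - untagged ingress traffic lands in VLAN 6 and
# VLAN 6 egress leaves untagged. This is the mode that isn't working for us:
ovs-vsctl set port tap100i0 tag=6 vlan_mode=native-untagged

# The "hard-set" access-port equivalent, which DOES work for us:
ovs-vsctl set port tap100i0 tag=6 vlan_mode=access

# Check what the port is actually configured with:
ovs-vsctl list port tap100i0 | grep -E 'tag|vlan_mode'
```

This is only to illustrate the difference in port modes; on a PVE host the tap ports are normally managed by the bridge config rather than set by hand.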
To add to what Mira is saying..
You don't require shared storage in order to do either a live migration or a migration of shut-down VMs - the only difference is that with a live migration you will be able to select the destination filesystem/dataset you would like the VM and its disks to be...
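As a sketch of the corresponding CLI calls (the VM ID, node name and storage name below are placeholders):

```shell
# Live migration without shared storage; --targetstorage selects the
# destination storage/dataset for the VM's disks:
qm migrate 100 pve-02 --online --targetstorage local-zfs

# Offline migration of a shut-down VM:
qm migrate 100 pve-02
```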
We have occasional issues across a 20-node cluster that has an NFS share configured as part of the cluster storage (i.e. in /etc/pve/storage.cfg).
Basically every so often the NFS server becomes unresponsive - we use it for backups and not for running VMs - but this causes hosts to...
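One mitigation worth mentioning (a sketch only - server address, export path and storage name are placeholders): mounting backup-only NFS storage with soft/timeout options in `/etc/pve/storage.cfg`, so a dead NFS server times out instead of hanging processes on the hosts indefinitely:

```
nfs: backup-nfs
        server 192.0.2.10
        export /export/pve-backups
        path /mnt/pve/backup-nfs
        content backup
        options vers=3,soft,timeo=100,retrans=3
```

Note that soft mounts trade hangs for possible I/O errors on in-flight backups, so this is a judgment call for backup traffic only.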
I work with @Chaosekie so I'm adding my bit on this.
The zpool status is from the PVE host - whilst your comments are valid, we don't have performance issues on the PVE hosts using RAIDZ2/3 pools - and data resilience is critical for us. We limit default VM disk IO to 80Mbytes / 5000...
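For context, a per-disk limit like ours can be set per VM along these lines (VM ID, disk and volume names are placeholders; the IOPS unit in my post above is truncated, so treat the 5000 here as illustrative):

```shell
# Cap an existing disk at 80 MB/s and 5000 IOPS (illustrative values):
qm set 100 --scsi0 local-zfs:vm-100-disk-0,mbps=80,iops=5000
```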
We run a 20-node production cluster with just over 1100 VMs. Hosts are typically Intel Xeon Scalable processors, with between 768GB and 1TB of RAM, and between 1 and 2 RAIDZ3 SSD pools per host - with roughly 10TB usable per pool.
We do not have separately installed OS pools i.e. the primary/first...
We're having a strange issue with inter-VLAN comms/networking between VMs on different VLANs on the same host.
We have experienced the same issue on hosts set up with either Linux or OVS bridging - and because we segment customer VMs into customer-specific VLANs, we manage VMs...
We are currently having the same sorts of issues - to the point that a number of VM CPU loads hit 100% utilisation during backups (or close to that..)
As regards the diagnostics on this, I get what @dietmar is saying about computing data checksums and compression, but surely this wouldn't...
I can confirm the process works just fine.. :)
Just a very small suggestion to add to the notes (it's not immediately obvious): you have to upgrade to 6.4 first before running proxmox-boot-tool, and make sure NOT to run 'zpool upgrade' yet (that's the last step).
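Roughly, the sequence that worked for us looks like this (the partition path is an example - check your own layout, and follow the official howto for the details):

```shell
# 1) Upgrade the node to PVE 6.4 first (apt update && apt dist-upgrade),
#    THEN switch the boot mechanism.

# 2) Format and initialise the ESP partition(s) - example device path:
proxmox-boot-tool format /dev/sdX2
proxmox-boot-tool init /dev/sdX2
proxmox-boot-tool refresh

# 3) Only once everything boots via proxmox-boot-tool, as the LAST step:
zpool upgrade rpool
```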
@fabian thanks for the info above. Just to confirm if my understanding is correct:
1. New 6.4 installations with ZFS will no longer have issues with zpool upgrades breaking GRUB on reboot - by default?
2. I'm assuming the howto on switching to this new boot mode will indicate...
Our issue is not updating each node :) - we just want to try and keep all hosts at the same update/'release' level.
By the time we've migrated VMs between hosts to do updates (which can take a couple of weeks), we're already in a place where updates have been released in the...
We (currently) have a 15-node cluster and are adding a number of additional nodes as we migrate from SmartOS.
Each host has between 768GB-1.5TB RAM and either 1 or 2 RAIDZ3 pools (with about 13TB of usable storage per pool) - hosting between 50-100 VMs each, so we have fairly dense hosts...
Great news and congrats on the release :)
We're excited to try this out..
Quick question regarding datastore options: I'm assuming it's possible to attach NFS mounts, CephFS, RBD etc. as datastores for PBS? Or must it only be local direct-attached storage?
We'll start playing with...
We are experiencing issues with IP fragmentation to and from VMs on Proxmox hosts.
The issue is impacting ONLY the VMs (across all our Proxmox hosts), all of which have network interfaces tagged to various VLANs.
Note that we have NO issues when pinging from any Proxmox HOST servers to servers...
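For anyone wanting to reproduce this, the quickest check we know of is to ping with the DF (don't-fragment) bit set at exactly the path MTU, then one byte over it - from a VM and from the host (the destination IP is a placeholder):

```shell
# 1472 bytes ICMP payload + 8 bytes ICMP header + 20 bytes IP header = 1500
# (a typical Ethernet MTU). With -M do the DF bit is set, so this packet
# must go through WITHOUT fragmentation:
ping -c 3 -M do -s 1472 192.0.2.1

# One byte over the MTU - should fail locally with "message too long",
# or elicit an ICMP "fragmentation needed" from the path:
ping -c 3 -M do -s 1473 192.0.2.1

# Without DF, the oversized packet should be fragmented and still succeed.
# If THIS one fails from a VM but works from the host, fragments are
# being dropped somewhere in the virtual networking path:
ping -c 3 -s 1473 192.0.2.1
```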
We seem to be having some regular and predictable issues with ZFS on our Proxmox hosts - especially under load.
Although the following are 2 separate scenarios, we believe that in both cases the zpool is under load, which causes the same negative outcome.
In summary, we run a cluster of PVE...
We have an issue with IP fragmentation not working.
We're not exactly sure where the problem lies but it definitely seems to be related to Proxmox (not affecting VMs on SmartOS at all - connected to the same switches etc).
Basically, our setup is as follows:
1. Running PVE 6.1 with all...