Search results

  1. hepo

    Proxmox scalability - max clusters in a datacenter

    I thought VMware had a limit of about 32 nodes in a cluster... Multi-cluster management would be really nice!
  2. hepo

    Proxmox VE 8.0 released!

    Moving to version 8 is inevitable, but I would love to see the issue resolved first; appreciate any updates you can/will provide!
  3. hepo

    Proxmox VE 8.0 released!

    "happy" to see more people reporting this issue as well as the issue being recognised and work being done to remediate... @apollo13 we have reinstalled the cluster back to version 7 since this issue was not resolved for more than a month. Happy to sit on a call to discuss setups and potential...
  4. hepo

    Kernel panic, machine stuck, task khugepaged:796 blocked for more than 120 seconds

    I have spewed tons of posts on this in the PVE8 thread; last time I checked, the issue continued. Ceph Quincy here...
  5. hepo

    Proxmox VE 8.0 released!

    Reviewing the backup jobs, just noticed one VM that had an error:
    INFO: Starting Backup of VM 4138 (qemu)
    INFO: Backup started at 2023-11-21 13:20:11
    INFO: status = running
    INFO: VM Name: prod-lws138-dbcl33
    INFO: include disk 'scsi0' 'ceph:vm-4138-disk-0' 32G
    INFO: include disk 'scsi1'...
  6. hepo

    Proxmox VE 8.0 released!

    Thanks for engaging! Some details on the backup infra:
    PBS server is a VM on the PVE cluster
    TrueNAS server has 128GB RAM (plenty of ARC)
    ZFS pool is a striped mirror of HDDs
    The VM for the example will be 4142; VM config:
    root@pvelw11:~# cat /etc/pve/qemu-server/4142.conf
    agent...
  7. hepo

    Proxmox VE 8.0 released!

    I need to come back to this... Did additional validation and testing as follows:
    OSD bench is consistent, no issues to report
    rados bench shows slightly better results compared to the test records we kept from 2 years ago
    Did fio testing in the VM and compared to previous results we have - no...
  8. hepo

    Proxmox VE 8.0 released!

    Thanks for the detailed write-up! We will also evaluate Ceph monitoring via Zabbix - https://www.zabbix.com/integrations/ceph
  9. hepo

    Proxmox VE 8.0 released!

    Thanks for the response... I would love to understand what monitoring you have implemented; it sounds really good. We only collect standard Proxmox metrics -> InfluxDB -> Grafana... This cluster is really, really quiet; we use it as a hot standby to our production environment, and also for testing new... [see the metrics-config sketch after this list]
  10. hepo

    Proxmox VE 8.0 released!

    virtio-scsi-single with iothreads was deemed better for our database servers a long time ago when we did performance testing... can definitely give it a try, but I need to understand how to reproduce the problem (e.g. target a particular VM). Can you please expand on what you mean by this? [see the iothread config sketch after this list]
  11. hepo

    Proxmox VE 8.0 released!

    Random VMs; it also looks like this is happening after backup (early morning), which I need to confirm once again. All VMs are configured in a similar way; this VM was hanging this morning:
    agent: 1,fstrim_cloned_disks=1
    boot: order=scsi0;net0
    cores: 32
    cpu: x86-64-v2-AES
    memory: 65536
    name...
  12. hepo

    Kernel panic, machine stuck, task khugepaged:796 blocked for more than 120 seconds

    Was this ever solved, and how? We have been observing this since a recent upgrade to PVE 8 and Ceph Quincy. All VM disks are on Ceph.
    agent: 1,fstrim_cloned_disks=1
    boot: order=scsi0;net0
    cores: 32
    cpu: x86-64-v2-AES
    memory: 65536
    name: prod-lws141-dbcl41
    net0...
  13. hepo

    Proxmox VE 8.0 released!

    The issue continues (randomly)... We have noticed that migrating the VM to a different node fixes the problem, i.e. the VM is responsive again immediately (yesterday's issue was resolved by the migration, not by rebooting the cluster nodes or by patches). We have implemented a detection mechanism to understand... [see the migration sketch after this list]
  14. hepo

    Proxmox VE 8.0 released!

    Just noticed a bunch of patches released in the non-sub repo, which I rushed to deploy. The quickest/lamest way to detect non-responsive systems for me was to check whether an IP address is detected (as the QEMU agent is not running). After patching, all VMs appear to be OK. Not sure if rebooting the... [see the detection sketch after this list]
  15. hepo

    Proxmox VE 8.0 released!

    Another issue we observe: VMs are becoming non-responsive (we cannot SSH to them), and the following messages are displayed on the console. I cannot reboot the VM cleanly as it has lost its connection to the storage...
  16. hepo

    Proxmox VE 8.0 released!

    Hi team, we just finished upgrading to version 8... We are running a 3-node cluster with Ceph, and we are using the no-subscription repo on this cluster. Syslog on all nodes has tons of the following:
    Nov 18 19:38:40 pvelw11 ceph-crash[2163]: WARNING:ceph-crash:post...
  17. hepo

    Disabling Write Cache on SSDs with Ceph

    Sadly no response... did you disable the write cache?
  18. hepo

    [SOLVED] Question about Backups

    Does anyone know if the feature/enhancement request was actually made? We are running a dedicated PBS with HDDs in mirrored vdevs (ZFS); 7 nodes running backups at the same time is choking the server/disks. A slow PBS server causes issues on the VMs while the backup is running. To mitigate this at...
  19. hepo

    Pve cluster with ceph - random VMs reboots with node reboot

    Wonder if the norecover flag is creating the problem here... All docs refer to the noout and norebalance flags only. Anyone? [see the flag-handling sketch after this list]
  20. hepo

    Pve cluster with ceph - random VMs reboots with node reboot

    Thanks for the comment; we have random VM reboots... the host reboot is controlled (RAM pre-failure warning).
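
The metrics-config sketch (result 9): PVE ships its standard metrics to InfluxDB via an external metric server entry in /etc/pve/status.cfg, and Grafana then reads from that database. A minimal sketch, assuming a hypothetical InfluxDB host at 192.168.1.10 accepting the UDP line protocol on port 8089 (the ID and address are placeholders, not taken from the posts):

    influxdb: metrics
        server 192.168.1.10
        port 8089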
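
The iothread config sketch (result 10): in a qemu-server config, virtio-scsi-single gives every disk its own SCSI controller, and adding iothread=1 to a disk then gives it a dedicated I/O thread. A minimal sketch with a hypothetical VM disk (ID, storage name, and size are illustrative):

    scsihw: virtio-scsi-single
    scsi0: ceph:vm-100-disk-0,iothread=1,size=32G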
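
The migration sketch (result 13): the workaround described there, live-migrating an affected VM to another node, corresponds to the qm migrate command. A minimal sketch with a hypothetical VM ID and target node:

    qm migrate 4142 pvelw12 --online

The --online flag keeps the VM running during the migration.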
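
The detection sketch (result 14): the post spots non-responsive VMs by checking whether the guest agent still reports an IP address. A minimal sketch of that idea using the qm CLI (the loop and message are illustrative, not the author's actual script):

    # flag running VMs whose guest agent no longer answers
    for vmid in $(qm list | awk 'NR>1 && $3=="running" {print $1}'); do
        if ! qm agent "$vmid" network-get-interfaces >/dev/null 2>&1; then
            echo "VM $vmid: no guest agent response (possibly hung)"
        fi
    done

Note that a VM which simply never runs the QEMU guest agent looks the same to this check as a hung one.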
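
The flag-handling sketch (result 19): the usual maintenance pattern in the Ceph and Proxmox docs sets only noout (and often norebalance) around a planned node reboot, for example:

    ceph osd set noout
    ceph osd set norebalance
    # reboot the node, wait for its OSDs to rejoin
    ceph osd unset norebalance
    ceph osd unset noout

norecover additionally stops degraded placement groups from recovering, so leaving it set after the node returns can indeed keep the cluster from healing.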
