Search results

  1. fstrankowski

    Proxmox 7.3.3 / Ceph 17.2.5 - OSDs crashing while rebooting

    So we can at least say that our problem is unrelated to the kernel version: you're running 6.1 while we're on 5.15, and we see the same issue on both systems.
  2. fstrankowski

    Proxmox 7.3.3 / Ceph 17.2.5 - OSDs crashing while rebooting

    We recently (yesterday) updated our test cluster to the latest PVE version. While rebooting the system (the upgrade itself finished without any incidents), all OSDs on each system crashed: ** File Read Latency Histogram By Level [default] ** 2023-01-30T10:21:52.827+0100 7f5f16fd1700 -1 received...
  3. fstrankowski

    Ceph 17.2 Quincy Available as Stable Release

    You're indeed correct. So in the long run it could be an idea to develop a ceph-ansible/cephadm-inspired, Proxmox-specific approach to automatically calculate and adjust osd_memory_target values. What do you think? That's why I've been referring to values between 0.1 and 0.2 ;-)
  4. fstrankowski

    Proxmox 7.3 (LXC 5.0) using veth with multiple tagged and untagged vlans

    Since Proxmox 7.3 introduced LXC 5.0, I'm wondering when it will be possible to make use of LXC's feature of tagging/untagging VLANs on veth devices (veth.vlan.id, veth.vlan.tagged.id). IMHO it might also be useful to add the rx/tx queue settings (veth.n_rxqueues, veth.n_txqueues) to the Advanced tab.
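    For illustration only, a sketch of how the raw LXC 5.0 keys could be tried today by appending them to a container config; the CT ID 101, the net0 index and the VLAN IDs are placeholders, and whether Proxmox's config handling accepts these raw veth keys is an assumption on my part:

      # appended to /etc/pve/lxc/101.conf -- placeholder container ID
      # untagged (PVID) VLAN for the container's veth device
      lxc.net.0.veth.vlan.id: 10
      # additional tagged VLANs carried on the same veth device
      lxc.net.0.veth.vlan.tagged.id: 20
      lxc.net.0.veth.vlan.tagged.id: 30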
  5. fstrankowski

    Ceph 17.2 Quincy Available as Stable Release

    Wondering why Proxmox does not enable osd_memory_target_autotune by default. It's advised if you're running Quincy >17.2.0, and for hyperconverged setups like Proxmox we can scale the ratio to between 0.1 and 0.2 to be on the safe side. Do you see any way to take advantage of autoscaling in the future...
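    A minimal sketch of the manual calculation hinted at above, assuming the midpoint of that 0.1-0.2 range (15% of RAM) split evenly across the OSDs on one node; the ratio, the per-OSD ceph config set scope and the /var/lib/ceph/osd path glob are my assumptions, not anything Proxmox ships:

      # total RAM in bytes on this node
      total_ram=$(awk '/MemTotal/ {print $2 * 1024}' /proc/meminfo)
      # count OSD data directories hosted on this node
      osd_count=$(ls -d /var/lib/ceph/osd/ceph-* | wc -l)
      # reserve ~15% of RAM for all local OSDs combined, split evenly
      target=$(( total_ram * 15 / 100 / osd_count ))
      for osd in /var/lib/ceph/osd/ceph-*; do
          ceph config set osd.${osd##*-} osd_memory_target ${target}
      done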
  6. fstrankowski

    High disk usage in LVM-Thin

    One-liner to trim all containers: pct list | awk '/^[0-9]/ {print $1}' | while read ct; do pct fstrim ${ct}; done
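    A slightly more defensive variant (my own sketch, not from the thread) that skips containers which are not currently running, on the assumption that trimming only makes sense for running containers:

      pct list | awk '/^[0-9]/ {print $1}' | while read ct; do
          # skip containers that are not currently running
          if pct status "${ct}" | grep -q running; then
              pct fstrim "${ct}"
          fi
      done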
  7. fstrankowski

    windows server to VM

    Why is efidisk0 still pointing to your old volume?
  8. fstrankowski

    High disk usage in LVM-Thin

    You have to enable discard in the VM's disk options (or run fstrim inside the VM) to reclaim unused disk space.
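    As an illustration, both options side by side; the VM ID and volume name are placeholders, and discard=on only helps once the guest actually issues TRIM (e.g. via fstrim or the discard mount option):

      # host: enable discard on the virtual disk (placeholder VM ID / volume)
      qm set 100 --scsi0 local-lvm:vm-100-disk-0,discard=on
      # guest: trim all mounted filesystems once ...
      fstrim -av
      # ... or enable the periodic systemd timer instead
      systemctl enable --now fstrim.timer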
  9. fstrankowski

    [SOLVED] fs-freeze breaks Debian 11 System

    Did you try an older kernel? We had a similar experience with the 5.15.x kernel tree (Proxmox 7.2 upwards) and went back to 5.13. See here: https://forum.proxmox.com/threads/proxmox-7-1-12-7-2-7-upgrade-from-ceph-16-2-7-to-ceph-16-2-9-snapshot-problems.114361/
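    If anyone wants to reproduce that downgrade, a hedged sketch; the package and ABI version strings are examples only, and proxmox-boot-tool kernel pin is only available on reasonably recent Proxmox 7.x installs:

      # install an older kernel series alongside the current one
      apt install pve-kernel-5.13
      # list the installed kernels, then pin the older one for future boots
      proxmox-boot-tool kernel list
      proxmox-boot-tool kernel pin 5.13.19-6-pve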
  10. fstrankowski

    Timeout while starting VM / Snapshotting disk etc...

    Did you try an older kernel? We had a similar experience with the 5.15.x kernel tree (Proxmox 7.2 upwards) and went back to 5.13. See here: https://forum.proxmox.com/threads/proxmox-7-1-12-7-2-7-upgrade-from-ceph-16-2-7-to-ceph-16-2-9-snapshot-problems.114361/
  11. fstrankowski

    Proxmox 7.1-12 > 7.2-7 Upgrade from Ceph 16.2.7 to Ceph 16.2.9 Snapshot Problems

    We'd like to share our findings regarding the described problem with backups and LXC snapshots in the latest 7.2-X version of Proxmox in combination with Ceph. After countless hours invested in debugging, we're confident we've found the root cause of our reported issue. Summary: We've had...
  12. fstrankowski

    Proxmox 7.1-12 > 7.2-7 Upgrade from Ceph 16.2.7 to Ceph 16.2.9 Snapshot Problems

    Good morning everyone! Background: we had been running without errors for weeks prior to yesterday's upgrade to 7.2-7. Since the upgrade from 7.1-12 to 7.2-7, including the upgrade of Ceph to 16.2.9, we are no longer able to snapshot our LXC containers while they are running. This is...
  13. fstrankowski

    CEPH Outage "active+clean+laggy" resulted in task kmmpd-rbd*:7998 blocked

    Hello, tonight we had quite the outage. The cluster had been healthy and not overloaded, and the NVMe/SSD disks are all fine (2-4% wearout). It all started with: 2022-06-22T01:35:34.335404+0200 mgr.PXMGMT-AAA-N01 (mgr.172269982) 2351345 : cluster [DBG] pgmap v2353839: 513 pgs: 1 active+clean+laggy...
  14. fstrankowski

    Proxmox 7 - LXC SSH Root login not working

    I have to resurrect this thread because I ran into the same problem today after setting up a brand-new Proxmox 7 hypervisor with a Debian 11 LXC container (official image) on it. I always change the SSHD port, which is why I hit the same problem (I think the OP did the same, otherwise this error would not...
  15. fstrankowski

    Kernel Panic with our new Cluster

    Hey guys, we've bought some new hardware and are getting this kernel panic on several machines (although they all have the same HW, only some panic): [ 20.699939] ------------[ cut here ]------------ [ 20.700608] kernel BUG at mm/slub.c:306! [ 20.701277] invalid opcode: 0000 [#1] SMP NOPTI [...
  16. fstrankowski

    Crash of an entire Proxmox cluster with 12 nodes / Segfault cfs_loop

    We believe we have found the origin; this is the only server that, despite the segfault, was not restarted: Feb 24 07:25:58 PX20-WW-N07 pmxcfs[11213]: [status] notice: received log Feb 24 07:25:59 PX20-WW-N07 pmxcfs[11213]: [dcdb] notice: members: 2/11238, 3/12062, 4/11417...
  17. fstrankowski

    Crash of an entire Proxmox cluster with 12 nodes / Segfault cfs_loop

    Answers inline: 3 physically separated sites. Distances as follows: RZ1 <--- 600m ---> RZ2 <--- 3 km ---> RZ3. Multicast is correct. Roughly 1 week; we use HA on all nodes. (That didn't help here, since the entire cluster was offline. Fortunately we operate a 2...
  18. fstrankowski

    Crash of an entire Proxmox cluster with 12 nodes / Segfault cfs_loop

    Sure. We operate 3 clusters, each spread across 3 data centers. One cluster (across 3 data centers) failed completely due to the error mentioned above, all at exactly the same moment (to the second). Monitoring worked until the very end. It then simply dropped out, since the whole cluster...
  19. fstrankowski

    Crash of an entire Proxmox cluster with 12 nodes / Segfault cfs_loop

    Hello, over the weekend we experienced a massive crash of one of our Proxmox clusters. Out of the blue an entire cluster went down, all at the same time. Here is the excerpt from the messages log: Feb 24 07:25:59 PX20-WW-SN06 kernel: [1448261.497103] cfs_loop[12091]: segfault at 7fbb0bd266ac ip...