Search results

  1. PVE 6 cluster nodes randomly hang (10GbE network down)

    There was no suspicious activity on that node at the time of the hang.
  2. PVE 6 cluster nodes randomly hang (10GbE network down)

    root@pve-node3:~# dmesg -T | grep Intel
    [Sun Sep 8 04:22:18 2019] Intel GenuineIntel
    [Sun Sep 8 04:22:19 2019] smpboot: CPU0: Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz (family: 0x6, model: 0x2d, stepping: 0x7)
    [Sun Sep 8 04:22:19 2019] Performance Events: PEBS fmt1+, SandyBridge events...
  3. PVE 6 cluster nodes randomly hang (10GbE network down)

    root@pve-node3:~# lspci
    00:00.0 Host bridge: Intel Corporation Xeon E5/Core i7 DMI2 (rev 07)
    00:01.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 1a (rev 07)
    00:02.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 2a (rev 07)
    00:03.0 PCI bridge...
  4. PVE 6 cluster nodes randomly hang (10GbE network down)

    root@pve-node3:~# uname -a
    Linux pve-node3 5.0.21-1-pve #1 SMP PVE 5.0.21-2 (Wed, 28 Aug 2019 15:12:18 +0200) x86_64 GNU/Linux
    root@pve-node3:~# pveversion -v
    proxmox-ve: 6.0-2 (running kernel: 5.0.21-1-pve)
    pve-manager: 6.0-7 (running version: 6.0-7/28984024)
    pve-kernel-5.0: 6.0-7...
  5. PVE 6 cluster nodes randomly hang (10GbE network down)

    root@pve-node3:~# ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host...
  6. PVE 6 cluster nodes randomly hang (10GbE network down)

    [Sun Sep 8 04:23:20 2019] fwbr143i0: port 2(tap143i0) entered disabled state
    [Sun Sep 8 04:23:20 2019] fwbr143i0: port 2(tap143i0) entered blocking state
    [Sun Sep 8 04:23:20 2019] fwbr143i0: port 2(tap143i0) entered forwarding state
    [Sun Sep 8 07:25:56 2019] perf: interrupt took too long...
  7. PVE 6 cluster nodes randomly hang (10GbE network down)

    I've noticed that after installing a PVE 6.x cluster with a 10Gb net for inter-cluster and storage (NFS) communications, cluster nodes randomly hang - still reachable through the ethernet (1GbE) network but NOT accessible via the main 10GbE one, so neither cluster nor storage are available. Yesterday it happened...
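
    (A minimal first-pass check for a hang like this, assuming an Intel 10GbE NIC; the interface name ens1f0 and the ixgbe driver name are assumptions, not taken from the thread.)

    # identify the driver and firmware of the 10GbE interface (ens1f0 is a placeholder)
    ethtool -i ens1f0
    # check current link state and negotiated speed
    ethtool ens1f0
    # look for NIC resets or link flaps around the time of the hang (ixgbe driver assumed)
    dmesg -T | grep -i -e ixgbe -e 'Link is'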
  8. BlueFS spillover detected on 30 OSD(s)

    I agree with this assumption. One should at least be warned before an upgrade. I'm facing the same issue with 50+ OSDs and have no idea how to sort it out. I don't have another cluster to play with, and found not much info on how to correctly destroy all OSDs on a single node, wipe all disks (as well...
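
    (For reference, a sketch of removing a node's OSDs with stock Ceph tooling, assuming Nautilus; <id> and /dev/sdX are placeholders to fill in per OSD, and the steps should be verified against the Ceph docs before running.)

    # repeat for each OSD id hosted on the node (<id> and /dev/sdX are placeholders)
    ceph osd out <id>
    systemctl stop ceph-osd@<id>
    ceph osd purge <id> --yes-i-really-mean-it
    # wipe the backing disk so it can be re-created as a fresh OSD
    ceph-volume lvm zap /dev/sdX --destroy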
  9. Multipath iSCSI /dev/mapper device is not created (Proxmox 6)

    Check your multipath.conf file. It seems one more “}” bracket is missing at the end.
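
    (For comparison, a minimal multipath.conf skeleton with balanced brackets; the WWID and alias below are placeholders.)

    defaults {
            user_friendly_names yes
    }

    multipaths {
            multipath {
                    # replace with the WWID of your iSCSI LUN
                    wwid  <your_wwid>
                    alias mpath0
            }
    }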
  10. [SOLVED] Warning after successful upgrade to PVE 6.x + Ceph Nautilus

    After a successful upgrade from PVE 5 to PVE 6 with Ceph, the warning message "Legacy BlueStore stats reporting detected on ..." appears on the Ceph monitoring panel. Have I missed something during the upgrade, or is this expected behavior? Thanks in advance
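
    (The fix commonly suggested for this warning on Nautilus is an offline repair of each OSD, sketched below; <id> is a placeholder, and the procedure should be checked against the Nautilus release notes.)

    systemctl stop ceph-osd@<id>
    # rewrites the legacy statfs accounting in the new per-pool format
    ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-<id>
    systemctl start ceph-osd@<id>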
  11. LACP bond without speed increase

    A single connection will always be limited to the speed of a single interface. An LACP bond increases total throughput (read: the sum of all connections).
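
    (To illustrate, a sketch of an 802.3ad bond in /etc/network/interfaces in the style PVE typically writes it; interface names and the address are placeholders. With layer3+4 hashing, separate connections spread across the slaves, while any single connection still rides one link.)

    auto bond0
    iface bond0 inet manual
            bond-slaves eno1 eno2
            bond-miimon 100
            bond-mode 802.3ad
            bond-xmit-hash-policy layer3+4

    auto vmbr0
    iface vmbr0 inet static
            address 192.0.2.10/24
            bridge-ports bond0
            bridge-stp off
            bridge-fd 0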
  12. Nodes unreachable in PVE Cluster

    My configs:
    root@pve2:~# cat /etc/network/interfaces
    # network interface settings; autogenerated
    # Please do NOT modify this file directly, unless you know what
    # you're doing.
    #
    # If you want to manage parts of the network configuration manually,
    # please utilize the 'source' or...
  13. Nodes unreachable in PVE Cluster

    I'm facing almost the same issue with a couple of setups after an upgrade to 5.4. Could you show your network config and lspci output? Perhaps we can find something in common.
  14. Proxmox cluster broke at upgrade

    There is no dedicated net, but the switch is not loaded (according to SNMP stats). And once again: everything was fine before the upgrade.
  15. Proxmox cluster broke at upgrade

    Below is how the omping result looks now:
    root@pve2:~# omping -c 600 -i 1 -q pve2 pve3 pve4A
    pve3 : waiting for response msg
    pve4A : waiting for response msg
    pve4A : joined (S,G) = (*, 232.43.211.234), pinging
    pve3 : joined (S,G) = (*, 232.43.211.234), pinging
    pve3 : given amount of query...
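
    (For comparison, the faster burst variant of this test from the PVE multicast notes, reusing the node names above; it must be started on all nodes at roughly the same time.)

    # ~10 s burst of small-interval multicast probes across all cluster nodes
    omping -c 10000 -i 0.001 -F -q pve2 pve3 pve4A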
  16. Proxmox cluster broke at upgrade

    The omping test now shows a 60% drop. That was not the case with 5.3 (I performed those tests on all cluster setups).
  17. Proxmox cluster broke at upgrade

    This morning I restarted corosync on all the nodes again. The cluster was working for a couple of minutes and then hung.
    May 15 09:40:10 pve1 systemd[1]: Starting Corosync Cluster Engine...
    May 15 09:40:10 pve1 corosync[24728]: [MAIN ] Corosync Cluster Engine ('2.4.4-dirty'): started and ready...
  18. After upgrade to 5.4 redundant corosync ring does not work as expected

    On another cluster I'm facing a different issue, but again after an upgrade to 5.4. Could you please take a look at: https://forum.proxmox.com/threads/proxmox-cluster-broke-at-upgrade.54182/#post-250102 I'm fully confident that my network switches are configured in line with the PVE docs. IGMP snooping...
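
    (A workaround often suggested when IGMP snooping starves corosync of multicast is to enable a querier on the PVE bridge itself; a sketch, assuming the bridge is vmbr0.)

    # enable an IGMP querier on the bridge so snooping switches keep forwarding multicast
    echo 1 > /sys/class/net/vmbr0/bridge/multicast_querier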
