Search results

  1. Intel Nuc 13 Pro Thunderbolt Ring Network Ceph Cluster

    I've got a relatively recent issue. If one of my nodes reboots, the other 2 will lock up and reboot within a few seconds. I'm wondering if this is due to pinning the IRQs to a specific CPU core, as this hasn't happened to me in the past and that's the most recent change I've made outside of...
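
    The IRQ pinning mentioned above is normally done by writing a CPU list (or mask) under /proc/irq/. A minimal sketch of that approach, assuming the Thunderbolt network interface is named en05 and the target core is CPU 2 (both placeholders); values written this way do not persist across reboots, so they have to be reapplied at boot:

        # Pin every IRQ belonging to the en05 interface to CPU core 2.
        IFACE=en05
        CORE=2
        for irq in $(grep "$IFACE" /proc/interrupts | awk -F: '{print $1}' | tr -d ' '); do
            # smp_affinity_list takes a plain CPU list instead of a hex mask
            echo "$CORE" > /proc/irq/$irq/smp_affinity_list
        done
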
  2. Intel Nuc 13 Pro Thunderbolt Ring Network Ceph Cluster

    I've got 3x Minisforum MS-01 with a 13900H. Even with the --bidir flag, you have no problem with iperf hitting 27 Gbps?
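
    For reference, a bidirectional run looks like this (--bidir requires iperf3 3.7 or newer; the peer address is a placeholder):

        # On one node
        iperf3 -s

        # On the other node: --bidir sends and receives simultaneously
        iperf3 -c 10.0.0.82 --bidir -t 30
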
  3. VM soft-locking up with ceph during disk benchmark

    Hmmm, I'm seeing errors in dmesg around the time the system locks up. I'm sure there's a relation. [65039.247828] x86/split lock detection: #AC: CPU 0/KVM/623152 took a split_lock trap at address: 0xfffff8052744bb6d
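
    That trace is the kernel's x86 split lock detection firing inside KVM. Its behavior is controlled by the split_lock_detect= boot parameter; a hedged sketch of checking the current state and, if you decide to, turning it off (assuming a GRUB-booted system):

        # Check whether split lock detection is active
        dmesg | grep -i "split lock"
        cat /proc/cmdline

        # To disable it, append split_lock_detect=off to
        # GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, then:
        update-grub
        reboot
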
  4. VM soft-locking up with ceph during disk benchmark

    I made some progress on this. If I do a less aggressive benchmark using the "default" profile, I don't get a lockup. With SEQ1M Q8T1, SEQ1M Q1T1, RND4K Q32T1, RND4K Q1T1 it runs fine. The more aggressive "ssd" profile is where it has a problem: SEQ1M Q8T1, SEQ128K Q32T1, RND4K Q32T16, RND4K Q1T1...
  5. VM soft-locking up with ceph during disk benchmark

    I changed the I/O scheduler to native and it died again. It seems to keep dying on the random read tests. It seems to be getting through the sequential SEQ1M Q8T1 and the SEQ128K Q32T1, but it keeps dying on either RND4K Q32T16 or RND4K Q1T1. I'm unsure if that has any bearing. It doesn't make...
  6. VM soft-locking up with ceph during disk benchmark

    So I did do this test under ZFS before setting up Ceph; I was curious to see the performance difference for a VM. It worked fine.
  7. VM soft-locking up with ceph during disk benchmark

    I have a Windows VM and I'm running CrystalDiskMark in it, and during the benchmark the VM does a soft lockup. This is repeatable in my environment. I'm running the latest version of PVE, 8.1.11, kernel 6.5.13-5-pve, and Ceph Reef version 18.2.2. What I mean by a soft lockup is: If task...
  8. Opt-in Linux 6.8 Kernel for Proxmox VE 8 available on test & no-subscription

    As of today, OpenZFS doesn't support kernels newer than 6.7, so would it be safe to assume that if you use ZFS, upgrading to 6.8 is at your own risk?
  9. Intel Nuc 13 Pro Thunderbolt Ring Network Ceph Cluster

    So cable pulls worked fine. But I'm having the same problem of frr restarting too early as a post-up command and not surviving a reboot, so I tried your if-up script and I'm having the same issue of only 1 interface coming up on boot.
  10. Intel Nuc 13 Pro Thunderbolt Ring Network Ceph Cluster

    Putting a post-up script by itself is not working on boot, but it does work if I restart networking. Since I'm testing things remotely, I can't test whether it helps with physically plugging cables in. I also tried @dovh's method with an if-up.d script, and this did work on a reboot! Next tonight I will...
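
    A minimal sketch of the if-up.d approach described above: ifupdown hook scripts in /etc/network/if-up.d/ receive the interface name in $IFACE, so the FRR restart can be limited to the Thunderbolt links. The interface names en05/en06 are placeholders:

        #!/bin/sh
        # /etc/network/if-up.d/restart-frr  (must be executable)
        # Restart FRR only when a thunderbolt-net interface comes up.
        case "$IFACE" in
            en05|en06)
                systemctl restart frr
                ;;
        esac
        exit 0
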
  11. Intel Nuc 13 Pro Thunderbolt Ring Network Ceph Cluster

    Pretty close to what I did, but running this on boot only doesn't work. On my system I've seen that the interrupt addresses have changed when Thunderbolt cables were plugged and unplugged, so I'm thinking it might make more sense to make this part of the udev rule that scyto set up to bring up...
  12. Intel Nuc 13 Pro Thunderbolt Ring Network Ceph Cluster

    OK, I've been messing with this for the better part of 2 hours and I figured it out. I noticed that when I was getting a lot of retries, ksoftirqd was eating up nearly 100% of a CPU core, but when I told iperf3 to limit to 19 Gbps and got few retries, ksoftirqd was using hardly any CPU. I found...
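
    Capping the client's target bitrate in iperf3 is done with -b; a quick sketch of the comparison described above (the peer address is a placeholder):

        # Uncapped run: lots of retries, ksoftirqd near 100% of a core
        iperf3 -c 10.0.0.82 -t 30

        # Capped at 19 Gbit/s: few retries, ksoftirqd nearly idle
        iperf3 -c 10.0.0.82 -b 19G -t 30
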
  13. Intel Nuc 13 Pro Thunderbolt Ring Network Ceph Cluster

    The CPU has its own TB4 controller and dedicated lanes for both ports. I've also gotten 26 Gbps at other points of my testing. I had to send a unit back due to a bad RJ45 port, but that was also with high retries. I did also hit a weird wall where if my MTU size exceeded 35,000 or so, Ceph would...
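
    For anyone repeating the MTU experiment, this is typically just an interface-level setting (en05 is a placeholder interface name):

        # Try a given MTU on the thunderbolt link (temporary, until reboot)
        ip link set dev en05 mtu 32768

        # To make it persistent, add "mtu 32768" to the interface stanza in
        # /etc/network/interfaces and reapply with:
        ifreload -a
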
  14. Migrated ESXi VM boots to blank screen

    I never solved it; I just recreated the VMs with Debian.
  15. Intel Nuc 13 Pro Thunderbolt Ring Network Ceph Cluster

    I'm running different hardware than you guys, the Minisforum MS-01. I'm getting a lot of retransmits with iperf3. I started with some no-name Thunderbolt 4 cables ("connbull"). I ordered and tried a Belkin TB4 cable and an OWC TB4 cable; no apparent change. I've followed @scyto's guide over on...
  16. MS-01 HCI HA Ceph Sanity Check

    I considered that, but I couldn't find any A/E NVMe drives in the US. I found one in Europe, and I found a bunch of adapters. The USB storage has been working fine though. I'm probably going to put a Google Coral in the A/E slot eventually, but I'm going to try OpenVINO first with the v-iGPU. I...
  17. MS-01 HCI HA Ceph Sanity Check

    My goal for this project was to decrease power utilization and gain redundancy compared to my single R730xd, which has dual E5-2690v4 CPUs and 256 GB of RAM. My R730xd's power floor, even with power tuning, is around 280 W (that's with storage). Each MS-01 node has 96 GB of RAM and 3 NVMe drives...
  18. How do I reserve/prioritize resources for a critical VM/CT?

    A lot of my VMs and CTs are for fun home hosting stuff, but a few of them are important. My cluster has 3 nodes, and while it has enough resources if all the nodes are up, memory gets a little tight if one of them goes down. There's enough, but it depends on how the VMs wind up being...
  19. MS-01 HCI HA Ceph Sanity Check

    I'm doing the homelab with a trio of MS-01s. Each host has 1x 7.68 TB U.2 and 2x 3.84 TB M.2, booting off of a USB-to-NVMe adapter. The Ceph network is a Thunderbolt mesh with OpenFabric routing, and I'm using the built-in SFP ports LACP'd to my switch for Proxmox VMs, with 1 of the 2.5G ports dedicated for...
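
    A hedged sketch of what the LACP part of such a setup usually looks like in /etc/network/interfaces on Proxmox; the port names, addresses, and bridge below are placeholders, not the poster's actual config:

        auto bond0
        iface bond0 inet manual
            bond-slaves enp2s0f0 enp2s0f1
            bond-mode 802.3ad
            bond-miimon 100
            bond-xmit-hash-policy layer2+3

        auto vmbr0
        iface vmbr0 inet static
            address 192.168.1.10/24
            gateway 192.168.1.1
            bridge-ports bond0
            bridge-stp off
            bridge-fd 0
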
  20. Alternative backup method for containers with big disk images?

    Thank you for the two links! I'm glad to see progress is being made on the issue.
