Search results

  1. Error I40E_AQ_RC_ENOSPC, forcing overflow promiscuous on PF

    Hey ;) Today over half of my test cluster got hit by this bug (4 nodes out of 6) within 5 minutes of each other - the PDU watchdog was doing overtime; however, one node, even though it had this error happening, was still responding to ping. Not great for a ping-based PDU watchdog. As they...
  2. Error I40E_AQ_RC_ENOSPC, forcing overflow promiscuous on PF

    I think (or rather hope) that those things come through the "backports" channel. Also, it may depend on how severe the fix is. We just need to know which mainline kernel it got into; then we can trace it into the current version that the PVE kernel is based on.
  3. Error I40E_AQ_RC_ENOSPC, forcing overflow promiscuous on PF

    Do you know whether it's already in the kernel, or when it will show up?
  4. Error I40E_AQ_RC_ENOSPC, forcing overflow promiscuous on PF

    Yeah, but you know ... this issue has been affecting people since 2020 - I doubt 6.5 was even in the numbering pipeline then. Not saying that 6.5 is somehow magical, but that would require a double tap - a fix going into 6.5 and then a mess-up post 6.5. You know I like coincidences, but double...
  5. Error I40E_AQ_RC_ENOSPC, forcing overflow promiscuous on PF

    And today, like on cue, a second server in the test cluster decided to noop out of the network with exactly the same messages in syslog, exactly the same hardware config, exactly the same interfaces file.
  6. Error I40E_AQ_RC_ENOSPC, forcing overflow promiscuous on PF

    I've had similar problems and I've changed the interfaces file to what you have here, with the only difference being: bridge-vids 2-4094 And today one of the servers decided to spew out plenty of: Dec 24 22:38:52 asdf-3 kernel: i40e 0000:02:00.1: Error I40E_AQ_RC_ENOSPC, forcing overflow...
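    For context, the interfaces file being discussed is a VLAN-aware Linux bridge on the i40e-backed NIC. A minimal sketch of such a config follows; the interface names (eno1, vmbr0) and addresses are assumptions, not taken from the thread - only the bridge-vids 2-4094 line is quoted from the post above.

    ```
    # /etc/network/interfaces — minimal VLAN-aware bridge sketch (ifupdown2)
    auto eno1
    iface eno1 inet manual

    auto vmbr0
    iface vmbr0 inet static
        address 192.168.1.10/24
        gateway 192.168.1.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
    ```

    The wide bridge-vids range is relevant here because each allowed VLAN can consume MAC/VLAN filter entries on the NIC, which is what the ENOSPC ("no space") admin-queue error hints at.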
  7. CEPH - one pool crashing can bring down other pools and derail whole cluster.

    That is interesting! I grant you that maybe my test setup did not replicate the original problem and is simply broken, but this is something I can replicate. So for me, if I pull two disks out of pool_2 I get "ceph error" (not ceph warn) and all VMs go down - which is a bit bizarre to me. I...
  8. CEPH - one pool crashing can bring down other pools and derail whole cluster.

    @bbgeek17 - I've illustrated the problem with the most minimal test cluster setup possible for anyone interested to test; the production cluster is slightly different. @itNGO - as kindly as possible: I've replicated on a test cluster the problem that we noticed in production, and presented it...
  9. CEPH - one pool crashing can bring down other pools and derail whole cluster.

    Hi, since we've been migrating more and more stuff to Ceph under Proxmox, we've found a quirky behavior and I've built a test case for it on my test cluster. Create a small cluster with a minimum of 4 nodes. Create one Ceph pool using one disk per node with 4-way mirroring, with minimum...
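    The setup steps above could be sketched with the pveceph CLI roughly as follows. This is a hedged sketch, not the poster's exact commands: the device path /dev/sdb, the pool name pool_1, and the min_size value are assumptions (the snippet is truncated before the minimum-size value).

    ```
    # On each of the 4 nodes: create one OSD from one spare disk
    pveceph osd create /dev/sdb

    # One pool replicated across all 4 nodes ("4 times mirroring");
    # --min_size controls how many surviving copies keep it writable
    pveceph pool create pool_1 --size 4 --min_size 2
    ```

    With size 4 and min_size 2, losing two of the four OSDs backing a placement group should still leave the pool writable; the thread is about what happens when that boundary is crossed.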
  10. HEALTH_WARN 1 daemons have recently crashed

    An odd idea - maybe this could be an extension to the UI?
  11. Proxmox 4.4 virtio_scsi regression.

    hence it’s more dangerous - you do your preliminary tests for production, everything is cool, you dump your data in - a few months later you hit a strange behaviour and realise all your data and backups are corrupt because it was slowly creeping in ;)
  12. Proxmox 4.4 virtio_scsi regression.

    Yeah, so the bottom line is that good old Fabian created a patch that will change the default option to "not so really a passthrough disk", but the underlying QEMU problem still persists. See? I was right, there are some (covered by thick dust) skeletons in that closet ... and although I'm an idiot...
  13. [SOLVED] Performance comparison between ZFS and LVM

    Yeah, the mitigation was suggested a few months later; I guess you can imagine what the business will say if you suggest "just sit on our hands until somebody graciously gives us a way of fixing that". I'm not knocking the Proxmox chaps - they have their product that is based on open source projects ->...
  14. [SOLVED] Performance comparison between ZFS and LVM

    Well, the story was pretty basic: I was one of the first to actually perform the Proxmox update because it was during Christmas, and about 24 * 8TB drives' worth of data went to hell (as far as I remember the size of the shelf). I was pretty "displeased" (and maybe too harsh in the thread)...
  15. [SOLVED] Performance comparison between ZFS and LVM

    I think you're trying to split hairs here. LVM has more features than RAW but not as many as ZFS. LVM allows you to have several VMs on a single storage device, whereas device passthrough should be one per VM. From my (and a few other admins') experience, LVM seems to be more reliable than RAW disk...
  16. [SOLVED] Performance comparison between ZFS and LVM

    Yeah, passthrough direct IO devices ... magically becoming mdraid. Please read the thread from the beginning, not the tail end where some peeps hijacked it. The bottom line is that there are esoteric bugs, and if people want to use it they need to be aware of the possibility of having all the data...
  17. [SOLVED] Performance comparison between ZFS and LVM

    Ok, so I will throw my knickers in the ring as well ;) In the interest of full disclosure: - I'm a ZFS fanboy (at least since it became available on Linux, fixing some of my btrfs headaches). - I mostly use ZFS & Ceph, then LVM if the need for it arises. ZFS is a filesystem that will allow you to go...