Search results

  1. hepo

    Disabling Write Cache on SSDs with Ceph

    Sadly no response... did you disable the write cache?
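
    For reference, toggling the SSD volatile write cache on Linux is usually done with hdparm or via sysfs; a minimal sketch, assuming a SATA device at the placeholder path /dev/sdX:

      # query the current write-cache state
      hdparm -W /dev/sdX
      # disable the drive's volatile write cache
      hdparm -W 0 /dev/sdX
      # alternative via sysfs on recent kernels; not persistent across reboots
      echo "write through" > /sys/block/sdX/queue/write_cache
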
  2. hepo

    [SOLVED] Question about Backups

    Does anyone know if a feature/enhancement request was actually made? We are running a dedicated PBS with HDDs in mirrored vdevs (ZFS); 7 nodes running backups at the same time is choking the server/disks. A slow PBS server causes issues on the VMs while the backup is running. To mitigate this at...
  3. hepo

    Pve cluster with ceph - random VMs reboots with node reboot

    Wonder if the norecover flag is creating the problem here... All docs refer to the noout and norebalance flags only. Anyone?
  4. hepo

    Pve cluster with ceph - random VMs reboots with node reboot

    Thanks for the comment; we have random VM reboots... the host reboot is controlled (RAM pre-failure warning)
  5. hepo

    Pve cluster with ceph - random VMs reboots with node reboot

    Hi community, Recently we started observing weird behavior as follows: - VMs are migrated out of the cluster node (1/7) - norecover and norebalance OSD flags are set - The node (pve12) is shut down for HW maintenance (RAM and battery replacement) - A random number of VMs are rebooted on another...
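
    For reference, the OSD flags mentioned here are set and cleared with the standard Ceph CLI; a minimal maintenance-window sketch (norecover is the flag the earlier post questions):

      # before shutting the node down
      ceph osd set noout
      ceph osd set norebalance
      ceph osd set norecover      # optional; this is the flag questioned above
      # ... perform the HW maintenance, boot the node back up ...
      ceph osd unset norecover
      ceph osd unset norebalance
      ceph osd unset noout
      ceph -s                     # confirm HEALTH_OK before touching the next node
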
  6. hepo

    MySQL performance issue on Proxmox with Ceph

    I really like the collaboration ;) Some reading and testing is required here... At first look I am comfortable with disabling debug messages and will look into the RBD cache for the clients. @itNGO disclaimer acknowledged! We have a test cluster where we can test config, not performance...
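
    Both tweaks mentioned here (disabling debug output and the client-side RBD cache) live in ceph.conf; a minimal sketch with commonly used options, where the cache sizes are assumptions, not the poster's values:

      [global]
           # silence the most verbose debug subsystems (log level / in-memory level)
           debug_ms = 0/0
           debug_osd = 0/0
      [client]
           # librbd write-back cache used by the VM clients
           rbd_cache = true
           rbd_cache_size = 67108864          # 64 MiB, assumed value
           rbd_cache_max_dirty = 50331648     # 48 MiB, assumed value
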
  7. hepo

    MySQL performance issue on Proxmox with Ceph

    Yes, more testing was done... We were mostly focused on enabling jumbo frames (MTU=9000) and can definitely see more stable and faster speeds. This is migrating a VM disk from one Ceph pool to another, which loads the reads and writes of the Ceph cluster simultaneously: That 1.1GiB/s is the...
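
    For context, MTU 9000 on a PVE node is a per-interface setting in /etc/network/interfaces (ifupdown2); a minimal sketch where the interface name and addresses are placeholders, and every NIC and switch port on the Ceph path must carry the same MTU:

      # /etc/network/interfaces - names and addresses are placeholders
      auto ens1f0
      iface ens1f0 inet static
           address 10.10.10.12/24
           mtu 9000
      # apply with ifupdown2, then verify the 9000-byte path end-to-end
      ifreload -a
      ping -M do -s 8972 10.10.10.13    # 8972 = 9000 - 28 bytes of IP/ICMP headers; must not fragment
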
  8. hepo

    MySQL performance issue on Proxmox with Ceph

    I did more testing today with a separate/standalone server we have (PVE installed but not configured/used). The server specs are 2x Xeon E5-2698 v4, 512GB RAM and 3x Samsung PM9A3 3.84TB NVMe. The tests were done with the same fio command from above: fio -ioengine=libaio -direct=1 -name=test -bs=4k...
  9. hepo

    MySQL performance issue on Proxmox with Ceph

    Looks like I've missed this... Agreed, we have looked at block size misalignment; this issue would be even worse with the 4M block size that Ceph uses!
  10. hepo

    MySQL performance issue on Proxmox with Ceph

    I will give the jumbos a try... they are supported in our infra but so far I was hesitant. I know "Ceph loves jumbo frames" and have seen the PVE team using them while benchmarking. I cannot get rid of the latency; that would mean removing one of the datacenters - no go. From all that was shared...
  11. hepo

    MySQL performance issue on Proxmox with Ceph

    What latency do you have between the DCs? Are you using jumbo frames? Here's mine: 40 packets transmitted, 40 received, 0% packet loss, time 39038ms rtt min/avg/max/mdev = 1.041/1.101/1.152/0.029 ms
  12. hepo

    MySQL performance issue on Proxmox with Ceph

    3 different fio commands - 4M and 4K block sizes and different iodepth (queue depth) fio -ioengine=libaio -direct=1 -name=test -bs=4M -iodepth=16 -rw=randwrite -runtime=60 -filename=/dev/sdc fio -ioengine=libaio -direct=1 -name=test -bs=4k -iodepth=1 -rw=randwrite -runtime=60 -filename=/dev/sdc...
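
    A sketch of the full set of commands as they were likely meant; the third command's iodepth is an assumption since the snippet is truncated, and -filename=/dev/sdc writes to the raw device and destroys its contents:

      fio -ioengine=libaio -direct=1 -name=test -bs=4M -iodepth=16 -rw=randwrite -runtime=60 -filename=/dev/sdc
      fio -ioengine=libaio -direct=1 -name=test -bs=4k -iodepth=1  -rw=randwrite -runtime=60 -filename=/dev/sdc
      fio -ioengine=libaio -direct=1 -name=test -bs=4k -iodepth=64 -rw=randwrite -runtime=60 -filename=/dev/sdc   # third run: higher queue depth (assumed value)
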
  13. hepo

    MySQL performance issue on Proxmox with Ceph

    Yes, Ceph uses synchronous writes and will only ack when the write is complete on all OSDs. We have 3 identical nodes in both DCs - so both redundancy and compute. Re-running the test in a single DC now and will share the results shortly (the crushmap already accommodates this, just need additional...
  14. hepo

    Random host reboot when moving VM disk from ceph to local-zfs

    Nothing interesting on the other nodes: no reply from OSDs and then quorum entries. I did get fencing emails during the event but thought this was normal since the node was down.
  15. hepo

    Random host reboot when moving VM disk from ceph to local-zfs

    Micron 5300 Pro SSDs https://www.micron.com/products/ssd/bus-interfaces/sata-ssds/part-catalog/mtfddak480tds-1aw1zab
  16. hepo

    MySQL performance issue on Proxmox with Ceph

    Ceph is on Micron 7300 Pro NVMe; local ZFS is on Micron 5300 Pro SSDs. This is what we have on each server.
  17. hepo

    MySQL performance issue on Proxmox with Ceph

    Host CPU is (dual socket) Intel Xeon Platinum 8160 - arc What tuning would you suggest for Ceph? If there is room for improvement then I am totally interested ;) Latency between the DCs is about 1.5ms, which is higher but not extremely high. I am still not convinced we are having an IO problem, the...
  18. hepo

    MySQL performance issue on Proxmox with Ceph

    We were on defaults when we started, then started enabling various options and measuring the difference. Just re-ran the tests with cache=none; CentOS is down to the Ubuntu numbers - conclusion: CentOS leverages cache=writeback, Ubuntu does not. Ceph.conf is default, we like it that way unless...
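
    The cache=none vs cache=writeback comparison described here is a per-disk option in the VM config; a minimal sketch using qm, with the VM ID and volume name as placeholders:

      # check the current disk line
      qm config 113 | grep scsi0
      # switch the disk to writeback caching (use cache=none to return to the default)
      qm set 113 --scsi0 ceph-pool:vm-113-disk-0,cache=writeback
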
  19. hepo

    MySQL performance issue on Proxmox with Ceph

    Hi there, hope all is in order ;) Here's the VM config (Ubuntu VM):
      agent: 1,fstrim_cloned_disks=1
      balloon: 0
      boot: order=scsi0;net0
      cores: 32
      memory: 73728
      name: dev-rms113
      net0: virtio=02:AB:5C:C3:42:7E,bridge=vmbr1,firewall=1
      numa: 1
      ostype: l26
      scsi0...