Hi,
I'm trying to figure out which logs to use to try and diagnose the issue I'm having.
I have a lab/learning cluster with 3 nodes and a Ceph cluster using a BlueStore storage pool.
The Ceph cluster uses a 1 GbE interface for the public network and a dedicated 10 GbE interface for the cluster network.
I previously just had 1 GbE all around, but I wanted to test more thoroughly, so I added some 10 GbE NICs.
I cloned an Ubuntu server I had running on the local ZFS disks of the cluster, and put it on Ceph.
Almost immediately I noticed that while I was trying to rsync stuff over to it, the VM would crash (and HA would restart it). This happened continuously.
Eventually I finished configuration and it stopped crashing for a while, but during its backup window it only got partway in before crashing again.
I'm not running out of RAM, disk space is fine, etc. - so I decided to test by moving it back onto the local disk with ZFS, and it has been rock solid.
So the issue must be Ceph. I've probably misconfigured it or done something stupid as I'm only learning - but I don't know which logs to start looking at that can tell me why the VM is suddenly crashing.
I have had a look at ceph.log and ceph-osd.log and there are no obvious errors being spat out that I can see. The Proxmox summary pages for Ceph say all is healthy and happy.
Are there any other Proxmox or QEMU logs or something that might give me a clue as to where to start looking for the issue?
Thanks