Ran MemTest86 and found out it was indeed memory instability from running DDR5 with EXPO enabled.
I'm not running any real load or pushing for maximum performance, so it's no issue to disable EXPO and live with the small performance hit.
Thanks for pointing me in the right direction...
Just updated to Proxmox version 8.2 to see if that would fix anything, but sadly it did not help.
I'm starting to suspect that a hardware/BIOS instability is causing these issues.
During backups to PBS the hypervisor hard-crashes, and it is not consistent at which point it happens.
Sometimes a backup succeeds and sometimes it does not, but after a few backups one will fail and fully crash the hypervisor.
Does anyone have any idea where I can start debugging this...
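So far my only plan is to comb through the logs after the next crash, along these lines (a rough sketch; the -1 boot offset is just an example):
# Journal from the boot before the crash, jumping to the end
journalctl -b -1 -e
# Kernel messages from that boot, in case something shows up right before the crash
journalctl -b -1 -k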
Hey Fiona,
Issuing another migrate_cancel command using `qm monitor 162` does not seem to do anything.
There is no command output, and no syslog entry indicating that it did anything.
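For completeness, this is the exact sequence I ran (162 being the VM in question):
qm monitor 162
qm> migrate_cancel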
I just tried another migration of the VM after issuing the migrate_cancel, and it failed with the same error as...
Currently we are trying to live-migrate a VM to another server within the same cluster.
The first migration attempt successfully migrated all the attached disks but then hung at the "VM-state" migration step.
After 15 minutes of no progress I pressed the "Stop" button to abort the migration.
Now...
As a small update, I have tried editing the Perl files to remove the strict mode that enforces the dependency sanitation.
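My current assumption is that this check actually comes from Perl's taint mode (-T) rather than "use strict", since taint mode is what normally produces this exact wording, e.g.:
# Tainted data (such as a command-line argument) used in an unsafe operation:
perl -T -e 'unlink $ARGV[0]' /tmp/somefile
# -> Insecure dependency in unlink while running with -T switch at -e line 1.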
But this did not seem to have any effect on the execution of the code; it still throws the same errors, so I think the strict mode is enforced by a dependency higher up in...
I'm trying to remote-migrate my LXC containers between 2 separate clusters but it keeps failing. Remote VM migrations do succeed (both online/offline).
At this point I can't seem to find the exact point the migration fails at.
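For context, this is roughly the command I'm running, from memory and with placeholders for the endpoint, token and target names (so the exact option spelling may be slightly off):
pct remote-migrate 104 104 'host=TARGET-HOST,apitoken=PVEAPIToken=root@pam!migrate=SECRET,fingerprint=<FP>' --target-bridge vmbr0 --target-storage local-zfs --restart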
Things I have searched for:
The error "failed: Insecure dependency...
I am creating a new Ceph erasure-coded pool using the following command:
pveceph pool create slow_ceph --erasure-coding k=2,m=1,failure-domain=osd
Using this I would expect a pool with 2 data chunks and 1 coding chunk.
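To double-check what actually got created, I've been looking at the profile behind the pool (the -data suffix is what pveceph appears to use for the EC data pool; adjust as needed):
ceph osd pool get slow_ceph-data erasure_code_profile
ceph osd erasure-code-profile get <profile-name-from-above>
# Usable fraction of an EC pool is k / (k + m); with k=2, m=1 that is 2/3 ≈ 66%.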
So that would mean I get a pool that is able to use 66% of my pool's space as...
I am trying to limit my OSD RAM usage.
Currently my 3 OSDs are using ~70% of my RAM (the RAM is now completely full, which is lagging the host).
Is there a way to limit the RAM usage for each OSD?
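What I'm hoping for is something along these lines in /etc/ceph/ceph.conf (this is just a guess at the right knob; I've seen the BlueStore cache size mentioned, and the value here is only an example):
[osd]
# Cap the BlueStore cache per OSD daemon (1 GiB, in bytes)
bluestore_cache_size = 1073741824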
I'm in the middle of migrating my current OSDs to BlueStore, but the recovery speed is quite low (5600 kB/s, ~10 objects/s). Is there a way to increase the speed?
I currently have no virtual machines running on the cluster, so performance doesn't matter at the moment. Only the recovery is running.
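The runtime knobs I've been considering bumping while nothing else is running are these (values are examples, not recommendations):
# Allow more parallel backfill/recovery work per OSD until the next restart
ceph tell osd.* injectargs '--osd-max-backfills 4 --osd-recovery-max-active 8'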
One of my Samba containers is refusing to start after the latest update from the beta to the release version, and I have no idea what is causing it. I have 2 disks mounted from Ceph in the container.
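The next thing I plan to try is starting it by hand with debug logging, along these lines (101 is a placeholder for the container ID):
lxc-start -n 101 -F -l DEBUG -o /tmp/lxc-101.log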
Config file
lxc.arch = amd64
lxc.include = /usr/share/lxc/config/ubuntu.common.conf
lxc.monitor.unshare = 1...
Since the last update I did from the beta to the release version, I am unable to delete my existing containers. Every time I try to delete a container I get this error message:
2017-07-06 17:29:47.689666 7f4f08021700 0 client.1278217.objecter WARNING: tid 1 reply ops [] != request ops...
In addition to switching to BlueStore, I found that Ceph supports cache tiering.
Would it be possible to add a "cold-storage" pool to the existing "hot-storage" pool (the current pool is on SSDs)?
If so, which of the 2 pools do I have to create the Proxmox RBD storage on?
The "cold-storage" pool...
I am currently running a Proxmox 5.0 beta server with Ceph (Luminous) storage.
I am trying to reduce the size of my Ceph pools as I am running low on space.
Does Ceph have some kind of option to use compression or deduplication to reduce the size of the pool on disk?
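Ideally something along these lines, if it exists at all (this is only a guess at the kind of per-pool option I'm after; the pool name and values are examples):
ceph osd pool set mypool compression_algorithm snappy
ceph osd pool set mypool compression_mode aggressive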
The steps I took for fixing it (commands condensed after the list):
Boot up an Ubuntu (desktop) environment on the server
Open a Terminal
Display the names of all VGs and PVs using "pvdisplay"
Wipe all the VGs using "vgremove VGNAME"
Wipe all the PVs using "pvremove PVNAME"
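Condensed into the actual commands (VGNAME and PVNAME are placeholders for whatever pvdisplay reports):
# List the physical volumes and the volume groups they belong to
pvdisplay
# Remove the volume group(s) first, then the physical volume(s)
vgremove VGNAME
pvremove PVNAME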
Everything is up and running now ^^