Nope, we have since replaced all the production machines this error occurred on, as there seems to be no possible fix for this fairly common problem (they were older AMD machines anyway). My testing machine, however, still runs into this mystery of a problem every few weeks or so if left on long...
After having experienced this issue any number of times, I have started to collect some metrics on the sort of machines it happens on. One thing that immediately stood out to me is that of the roughly ten machines I have actively and extensively run PVE on, this issue has in my case exclusively...
Exact same as original post:
root@pve:~# service pveproxy status
● pveproxy.service - PVE API Proxy Server
Loaded: loaded (/lib/systemd/system/pveproxy.service; enabled)
Active: failed (Result: timeout) since Tue 2016-09-27 14:58:46 CEST; 21h ago
Main PID: 830 (code=exited...
Pardon, I meant blocked. I get the same output as in the original post. In any case:
root@pve:~# ps faxl | grep pveproxy
0 0 3372 3242 20 0 12732 1792 pipe_w S+ pts/3 0:00 \_ grep pveproxy
Stuck with the exact same problem on 3 servers; however, whenever it happens, `pvesm status` still executes fine. This remains quite a problem. I also don't see any unusual I/O wait times in the logs or in Nagios.
Any ideas how to proceed?
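For anyone else stuck here, the checks I've been running when it happens are roughly these (just a sketch, nothing authoritative):

# list tasks stuck in uninterruptible sleep (D state), which would point at hung storage
ps axo pid,stat,wchan:32,cmd | awk '$2 ~ /^D/'

# have the kernel dump the stacks of blocked tasks, then read them back
echo w > /proc/sysrq-trigger
dmesg | tail -n 60

# check whether the cluster filesystem itself is still responsive
time ls /etc/pve

If any of those hang or turn up D-state processes, that would at least narrow it down to storage.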
Unfortunately no. We ended up ditching DRBD9 altogether in favour of a combination of snapshots and other live migration tools. To this day I would like to know what was causing this, but at this rate we might just wait a bit and see whether it gets fixed or becomes a more widely known issue. For...
Thanks @mir and @kobuki for your insights, I will look into it further and play around with various settings.
For now, I've added a 120GB SSD to the servers and split its capacity in half, effectively 55.9GB each for cache and log. This has already resulted in a quite remarkable increase to...
What would you recommend in terms of capacity split between ZIL (the log device) and L2ARC (the cache device)? Split the SSD capacity 50/50 or something, or is there something more optimal, and are there any places where the trade-offs are documented?
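In case it matters for the advice, the way I'd be attaching the SSD is something along these lines (just a sketch; the device name /dev/sdX and the pool name rpool are placeholders from my setup):

# split the SSD roughly in half: one partition for the SLOG, one for L2ARC
sgdisk -n 1:0:+55G -n 2:0:0 /dev/sdX

# add the first partition as the log (ZIL/SLOG) device and the second as cache (L2ARC)
zpool add rpool log /dev/sdX1
zpool add rpool cache /dev/sdX2

# confirm they show up under "logs" and "cache"
zpool status rpool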
For some reason, on 2 of my servers I'm getting incredibly slow IOPS and FSYNCS/SECOND when running pveperf on Proxmox 4, on a ZFS RAIDZ-1 pool with 4x 4TB WD Reds.
root@pve:/# pveperf
CPU BOGOMIPS: 15959.00
REGEX/SECOND: 743734
HD SIZE: 9921.49 GB (rpool/ROOT/pve-1)...
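In case it helps with a diagnosis, these are the pool settings I've been comparing between the fast and slow servers (a sketch, assuming the default rpool layout):

# ashift=9 on 4K-sector disks would explain poor IOPS
zpool get ashift rpool

# sync/compression/atime settings, plus any errors or resilver in progress
zfs get sync,compression,atime rpool
zpool status -v rpool

# rule out a single slow disk dragging the whole raidz vdev down
iostat -x 2 5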
After a system crash I needed to restart a VM, call it ID x, after one of the other nodes in the DRBD9 cluster derped out. Node 1 was the one that went down, and the VM was running on node 2; the issue is that restarting the VM on either node returns a KVM error: Could not open...
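I assume the first thing to check is whether the resource is still Primary/UpToDate anywhere, along these lines (vm-x-disk-1 is just a placeholder for the actual DRBD resource name):

# show the state of all DRBD9 resources on this node
drbdadm status

# if the resource is only Secondary here, try promoting it before starting the VM
drbdadm primary vm-x-disk-1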
By mistake I created a cluster on one empty server (no VMs are running and nothing important is stored on it). Now the easiest way to remove this cluster configuration is obviously reinstalling, but that would require me to physically go down to it, which is difficult ATM, so is...
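What I'm hoping is possible is something along these lines, i.e. tearing the config back out without a reinstall (just guessing at the steps, so please correct me):

# stop the cluster filesystem and restart it in local mode
systemctl stop pve-cluster
pmxcfs -l

# remove the corosync configuration that makes the node think it is clustered
rm /etc/pve/corosync.conf
rm -rf /etc/corosync/*

# stop the local-mode instance and bring pve-cluster back up normally
killall pmxcfs
systemctl start pve-cluster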
Thank you so much, this enabled me to diagnose the problem. It turns out it was caused by something other than LXC itself: the booting container got stuck on apache2 asking for a password for a certificate... Not exactly sure how this caused LXC to crash, while in OpenVZ this was never an issue...
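For anyone hitting the same symptom: one way to stop apache2 from prompting at boot (a sketch; the key path is a guess for a typical Debian setup) is to strip the passphrase from the key:

# keep a copy of the original, passphrase-protected key
cp /etc/apache2/ssl/server.key /etc/apache2/ssl/server.key.orig

# write out an unencrypted copy (asks for the passphrase one last time)
openssl rsa -in /etc/apache2/ssl/server.key.orig -out /etc/apache2/ssl/server.key
chmod 600 /etc/apache2/ssl/server.key

Then restart apache2 inside the container to make sure it still comes up cleanly.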
Looking at the boot log (dmesg) of the LXC container, I seem to be getting some weird errors similar to this:
[ 9741.362095] audit: type=1400 audit(1445199000.045:102): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxc-container-default" name="/" pid=10171 comm="mount"...
After migrating a Debian 7 OpenVZ CT from Proxmox 3 to an LXC container on Proxmox 4, it does not appear to start correctly. Firstly, migration was done by stopping the container on Proxmox 3, backing it up to an external hard disk, then loading that hard disk in Proxmox 4, restoring that backup and setting the...
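For reference, the commands behind that description were roughly as follows (treat this as a sketch rather than an exact transcript; the container ID 101 and the mount point /mnt/usb are placeholders, and the exact dump filename depends on vzdump's compression settings):

# on the Proxmox 3 node: stop the OpenVZ container and dump it to the external disk
vzctl stop 101
vzdump 101 --dumpdir /mnt/usb

# on the Proxmox 4 node: restore the dump as an LXC container
pct restore 101 /mnt/usb/vzdump-openvz-101-*.tar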