You mentioned in another thread that you were able to duplicate this issue on a clean install. Can you please share the details here so Proxmox can attempt to duplicate the environment? Thanks!
There are also Proxmox 5.1 hypervisors (running pve-kernel) where I don't see the issue, and I'm at a loss as to why some hypervisors can reproduce this easily while others appear unaffected at exactly the same patch level. My best guess is that it's an interaction or timing problem caused by a...
The kernels (4.14.20 and 4.14.23 -- both resolve the issue) are built from the source tarballs on kernel.org with no patches applied. I just copied over the Proxmox config from the 4.13.13 kernel (make oldconfig), then used the "deb-pkg" make target to get the kernel and headers packaged so I could...
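For anyone who wants to repeat that build, the steps were roughly the following (kernel version and config filename are examples; adjust to whatever you're building from):
# fetch and unpack the vanilla source from kernel.org
wget https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.14.23.tar.xz
tar xf linux-4.14.23.tar.xz && cd linux-4.14.23
# start from the running Proxmox config and answer the prompts for new options
cp /boot/config-$(uname -r) .config
make oldconfig
# build .deb packages for the kernel image and headers
make -j$(nproc) deb-pkg
The resulting linux-image and linux-headers .deb files can then be installed with dpkg -i.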
This sounds like the issue I posted here:
https://forum.proxmox.com/threads/lxc-container-reboot-fails-lxc-becomes-unusable.41264/#post-201351
The LXC bug report I filed at the same time (tl;dr -- it's a kernel bug):
https://github.com/lxc/lxc/issues/2141
And other users reporting the same...
That's my bug report at LXC for this issue. It doesn't seem difficult for me to duplicate the issue either so it's strange that more people aren't complaining. Anyway, Proxmox is aware that the newer kernels resolve the issue so hopefully an updated pve-kernel in the future will take care of this.
Which virtio version did you use to install the Balloon driver and service? I haven't had any problem with virtio-0.1.126 across several hundred Windows KVMs so you may want to try that version if you continue to have blue screens.
I expected to see a lot of other people running into this issue but there haven't been many "me too" posts. There could be many reasons:
* LXC container reboots may be infrequent in other environments.
* The issue may be specific to containers migrated from OpenVZ.
* Many people may be on...
If you're desperate for an immediate resolution, I have Debian packages built with GCC 7.3 (full generic retpoline) for 4.14.23, along with ZFS 0.7.6 DKMS packages. The issue is definitely resolved in any kernel from 4.14.20 onward (possibly earlier, but that's where I started testing).
Things to...
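One quick sanity check after booting a self-built kernel: on 4.14+ the retpoline status is exposed in sysfs, e.g.:
cat /sys/devices/system/cpu/vulnerabilities/spectre_v2
A kernel built with GCC 7.3 should report something like "Mitigation: Full generic retpoline".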
Sounds just like:
https://forum.proxmox.com/threads/lxc-container-reboot-fails-lxc-becomes-unusable.41264/#post-200409
If that is your problem too, the only option currently is to build your own 4.14.20+ kernel with ZFS modules. The latest pve-kernel at the time of writing is 4.13.13-6...
That's how I handle packages too. I work for a company with ~40 Proxmox licenses. Sometimes I'll enable the no-subscription repo on a licensed server because I need a fix or feature, but not without evaluating it first; the process typically involves an install on my home server, then on...
Any chance there's a firewall in the way?
Try this:
netstat -telnp | grep 8006
It should show something like:
tcp 0 0 0.0.0.0:8006 0.0.0.0:* LISTEN 33 27492 31924/pveproxy
If it doesn't, try:
systemctl restart pveproxy
systemctl status...
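If pveproxy is listening but the GUI still isn't reachable, the next steps would be to test the port from the client side and rule out the node firewall (standard tools; adjust for your network, <your-node-ip> is a placeholder):
# from another machine, confirm the port answers at all
curl -k https://<your-node-ip>:8006
# on the node, check whether the Proxmox firewall is active and whether any rule touches 8006
pve-firewall status
iptables -L -n | grep 8006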
L2ARC is often misunderstood. I'll answer your question but with a few extra details for Google's sake.
Having an L2ARC does eat into the RAM available for primary caching via the ARC. If the hot areas of your ZFS pool(s) would have just fit nicely into the ARC without an L2ARC, you will take a...
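If you want to see how the ARC is doing before adding an L2ARC, the ZFS kstats are a quick check (standard ZFS-on-Linux paths; tool names may vary by version):
# current ARC size, target maximum, and hit/miss counters
grep -E '^(size|c_max|hits|misses) ' /proc/spl/kstat/zfs/arcstats
# or watch it live if arcstat is installed
arcstat 5
Lots of misses with the ARC already pinned at c_max is the situation where an L2ARC is most likely to help rather than hurt.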
ZFS snapshots are always crash consistent. Any application that can't recover from a crash consistent snapshot would be incapable of recovering from anything that caused the server to halt unexpectedly (power failure, kernel panic, hardware fault, etc.).
Databases use sync, fsync, etc...
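As a concrete example (the dataset name is made up), the snapshot itself is a single atomic operation, and restoring it puts the database in exactly the state it would be in after pulling the plug at that instant:
zfs snapshot rpool/data/mysql@pre-upgrade
# roll the dataset back to that point in time if needed
zfs rollback rpool/data/mysql@pre-upgrade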
4.13.13-6 (4.13.13-41) still has the issue. I test for it by triggering a reboot every minute via cron in a CentOS 6 or Ubuntu 14.04 container (a minimal example cron entry is sketched below). The container died within a few minutes with the same symptoms (stuck cloning the network namespace):
Here's the stack for the monitor process:
# cat...
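For reference, a minimal cron entry for that kind of reboot test (the exact file path and schedule are just an example) looks like:
# /etc/cron.d/reboot-test inside the test container
* * * * * root /sbin/reboot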
I didn't realize the pve-kernel versions aren't unique. I evaluated pve-kernel-4.13.13-6 backed by kernel 4.13.13-40 which still had the issue. I see the update you referenced to pve-kernel-4.13.13-6 (4.13.13-41) and will test that shortly. Thanks for clarifying.
While reading kernel commits yesterday, I noticed multiple namespaces fixes in 4.14.16 - 4.14.20 so I built a vanilla 4.14.20 kernel (using the config but no patchsets from 4.13.13-6) and there have been no further namespace cloning issues after almost 1000 reboots.
I can confirm the issue is...
I have been trying different kernels, network changes, LXC container image changes, etc. without solving this problem. However, I regularly see messages like this on the console:
kernel:[ 1987.452238] unregister_netdevice: waiting for lo to become free. Usage count = 1
I've seen that before without it...