Repeatable Kernel Crash on LXC Creation with Proxmox 4.1 beta 1 (4.1 Kernel)

talos

Active Member
Aug 9, 2015
Hi,

I installed Proxmox 4 beta 1 on my server a while ago. It has worked great with both KVM and LXC containers. A few days ago I updated and got the new 4.1 PVE kernel, and now I have trouble creating LXC containers. If I create a new container, for example Ubuntu 15.04, Proxmox starts to create the image and after a few seconds the machine locks up. After a reboot and a retry I got another lockup; this is endlessly repeatable. On the local server console I can see an endless stream of kernel crash/panic messages, scrolling too fast to read. The LXC storage is on a RAID-Z ZFS volume.
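For anyone who wants to try reproducing this: a hedged sketch of what I run, where the VMID, template file name and storage ID are my own and almost certainly differ on your box. Running `dmesg -w` in a second SSH session first captures the crash messages that scroll by too fast on the local console.

```shell
# Hedged repro sketch -- VMID 200, the template path and the "local-zfs"
# storage ID are assumptions; substitute your own values.
repro_lxc_crash() {
    pct create 200 local:vztmpl/ubuntu-15.04-standard_15.04-1_amd64.tar.gz \
        --storage local-zfs \
        --hostname lxc-crashtest
}
# repro_lxc_crash   # run as root on the Proxmox host; expect the lockup
```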

I use this server heavily, with several Windows VMs for development purposes. I have never had any issues with KVM. The Proxmox web interface also crashes sometimes; I have to restart the proxy to get it running again.
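In case it helps anyone with the same hung web UI: a hedged sketch of the restart, assuming the standard PVE 4 systemd units (the UI is served by pveproxy, with pvedaemon behind it).

```shell
# Hedged sketch: restart the PVE web UI services from an SSH session.
# Unit names are the standard PVE 4 ones; verify with `systemctl list-units`.
restart_pve_ui() {
    systemctl restart pveproxy pvedaemon
    systemctl status pveproxy --no-pager
}
# restart_pve_ui   # run as root on the host
```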

Here are some specs of the machine:

# pveversion
pve-manager/4.0-26/5d4a615b (running kernel: 4.1.3-1-pve)

Intel(R) Xeon(R) CPU E3-1245 V2 @ 3.40GHz
32 GB of ECC RAM
8 TB disk, ZFS RAID-Z with cache and log SSD
2 TB disk, ZFS striped mirror with cache and log SSD

Anyone having the same issue?
 

talos

Overnight I found several ZFS errors in my system log. I can't see any impact on my running KVMs. I have no idea whether this is critical/problematic.

[ 3059.283094] Large kmem_alloc(65536, 0x1000), please file an issue at:
https://github.com/zfsonlinux/zfs/issues/new
[ 3059.283100] CPU: 0 PID: 843 Comm: zvol Tainted: P O 4.1.3-1-pve #1
[ 3059.283102] Hardware name: Supermicro X9SAE/X9SAE, BIOS 2.0b 07/10/2013
[ 3059.283104] 0000000000000000 ffff8807f7553bd8 ffffffff818015db ffff88081e2121b8
[ 3059.283107] 000000000000c210 ffff8807f7553c28 ffffffffc0552cc4 0000000000000017
[ 3059.283109] 00000000d878a128 ffff8807f7553c18 ffff8800354029a0 0000000000002000
[ 3059.283111] Call Trace:
[ 3059.283118] [<ffffffff818015db>] dump_stack+0x45/0x57
[ 3059.283129] [<ffffffffc0552cc4>] spl_kmem_zalloc+0x164/0x1e0 [spl]
[ 3059.283149] [<ffffffffc06bd8cb>] dmu_buf_hold_array_by_dnode+0x9b/0x4e0 [zfs]
[ 3059.283164] [<ffffffffc06bdded>] dmu_buf_hold_array+0x5d/0x80 [zfs]
[ 3059.283179] [<ffffffffc06bf17a>] dmu_write_req+0x6a/0x1e0 [zfs]
[ 3059.283209] [<ffffffffc076570b>] zvol_write+0x11b/0x470 [zfs]
[ 3059.283214] [<ffffffffc055653d>] taskq_thread+0x22d/0x480 [spl]
[ 3059.283218] [<ffffffff810a6f80>] ? wake_up_state+0x20/0x20
[ 3059.283223] [<ffffffffc0556310>] ? taskq_cancel_id+0x120/0x120 [spl]
[ 3059.283226] [<ffffffff810996bb>] kthread+0xdb/0x100
[ 3059.283229] [<ffffffff810995e0>] ? kthread_create_on_node+0x1c0/0x1c0
[ 3059.283232] [<ffffffff81809322>] ret_from_fork+0x42/0x70
[ 3059.283234] [<ffffffff810995e0>] ? kthread_create_on_node+0x1c0/0x1c0
[ 3059.302043] Large kmem_alloc(38912, 0x1000), please file an issue at:
https://github.com/zfsonlinux/zfs/issues/new
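Before filing anything upstream I wanted to see how often these warnings fire and at which sizes. A hedged helper, not from the original logs: it tallies the "Large kmem_alloc" warnings by requested size from a saved kernel log (on the live host you can dump `dmesg` to a file first; the file argument here is an assumption).

```shell
# Hedged helper: count "Large kmem_alloc(<size>" warnings by size in a
# saved kernel log file given as $1.
summarize_kmem_warnings() {
    grep -o 'Large kmem_alloc([0-9]*' "$1" |
        cut -d'(' -f2 |      # keep just the requested size
        sort -n |
        uniq -c              # occurrences per size
}
# dmesg > /tmp/kern.log && summarize_kmem_warnings /tmp/kern.log
```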

For now I have downgraded to kernel 3.19.8.
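A hedged sketch of how a one-off boot into the older kernel can be done. The package name and GRUB menu-entry title are assumptions from my box; verify yours with `dpkg -l 'pve-kernel*'` and /boot/grub/grub.cfg, and note that `grub-reboot` only works with `GRUB_DEFAULT=saved` in /etc/default/grub.

```shell
# Hedged downgrade sketch -- package and GRUB entry names are assumptions.
boot_previous_kernel() {
    # make sure the older kernel is (still) installed
    apt-get install -y pve-kernel-3.19.8-1-pve
    # select it for the next boot only (requires GRUB_DEFAULT=saved)
    grub-reboot 'Advanced options for Proxmox VE GNU/Linux>Proxmox VE GNU/Linux, with Linux 3.19.8-1-pve'
    reboot
}
# boot_previous_kernel   # run as root on the host
```

The next-boot-only selection means a failed boot just falls back to the 4.1 kernel instead of leaving the box stuck.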

Thanks for your help.
 

talos

Last night I found several more zvol entries in my system log, but I see no impact on my running KVMs. I don't know whether this is critical/problematic.

[ 3978.247496] CPU: 5 PID: 843 Comm: zvol Tainted: P O 4.1.3-1-pve #1
[ 3978.247498] Hardware name: Supermicro X9SAE/X9SAE, BIOS 2.0b 07/10/2013
[ 3978.247499] 0000000000000000 ffff8807f7553bd8 ffffffff818015db ffff88081e3521b8
[ 3978.247502] 000000000000c210 ffff8807f7553c28 ffffffffc0552cc4 0000000000000017
[ 3978.247505] 00000000d878a128 ffff8807f7553c18 ffff8800354029a0 0000000000001a01
[ 3978.247507] Call Trace:
[ 3978.247514] [<ffffffff818015db>] dump_stack+0x45/0x57
[ 3978.247533] [<ffffffffc0552cc4>] spl_kmem_zalloc+0x164/0x1e0 [spl]
[ 3978.247551] [<ffffffffc06bd8cb>] dmu_buf_hold_array_by_dnode+0x9b/0x4e0 [zfs]
[ 3978.247566] [<ffffffffc06bdded>] dmu_buf_hold_array+0x5d/0x80 [zfs]
[ 3978.247580] [<ffffffffc06bf17a>] dmu_write_req+0x6a/0x1e0 [zfs]
[ 3978.247608] [<ffffffffc076570b>] zvol_write+0x11b/0x470 [zfs]
[ 3978.247613] [<ffffffffc055653d>] taskq_thread+0x22d/0x480 [spl]
[ 3978.247616] [<ffffffff810a6f80>] ? wake_up_state+0x20/0x20
[ 3978.247621] [<ffffffffc0556310>] ? taskq_cancel_id+0x120/0x120 [spl]
[ 3978.247624] [<ffffffff810996bb>] kthread+0xdb/0x100
[ 3978.247627] [<ffffffff810995e0>] ? kthread_create_on_node+0x1c0/0x1c0
[ 3978.247629] [<ffffffff81809322>] ret_from_fork+0x42/0x70
[ 3978.247631] [<ffffffff810995e0>] ? kthread_create_on_node+0x1c0/0x1c0
[ 3978.620215] Large kmem_alloc(53256, 0x1000), please file an issue at:
https://github.com/zfsonlinux/zfs/issues/new
[ 3978.620221] CPU: 1 PID: 15845 Comm: zvol Tainted: P O 4.1.3-1-pve #1
[ 3978.620222] Hardware name: Supermicro X9SAE/X9SAE, BIOS 2.0b 07/10/2013
[ 3978.620224] 0000000000000000 ffff8801b30b3bd8 ffffffff818015db ffff88081e2521b8
[ 3978.620227] 000000000000c210 ffff8801b30b3c28 ffffffffc0552cc4 0000000000000017
[ 3978.620230] 00000000d878a128 ffff8801b30b3c18 ffff8800354029a0 0000000000001a01
[ 3978.620232] Call Trace:
[ 3978.620238] [<ffffffff818015db>] dump_stack+0x45/0x57
[ 3978.620249] [<ffffffffc0552cc4>] spl_kmem_zalloc+0x164/0x1e0 [spl]
[ 3978.620269] [<ffffffffc06bd8cb>] dmu_buf_hold_array_by_dnode+0x9b/0x4e0 [zfs]
[ 3978.620283] [<ffffffffc06bdded>] dmu_buf_hold_array+0x5d/0x80 [zfs]
[ 3978.620297] [<ffffffffc06bf17a>] dmu_write_req+0x6a/0x1e0 [zfs]
[ 3978.620324] [<ffffffffc076570b>] zvol_write+0x11b/0x470 [zfs]
[ 3978.620330] [<ffffffffc055653d>] taskq_thread+0x22d/0x480 [spl]
[ 3978.620333] [<ffffffff810a6f80>] ? wake_up_state+0x20/0x20
[ 3978.620337] [<ffffffffc0556310>] ? taskq_cancel_id+0x120/0x120 [spl]
[ 3978.620341] [<ffffffff810996bb>] kthread+0xdb/0x100
[ 3978.620343] [<ffffffff810995e0>] ? kthread_create_on_node+0x1c0/0x1c0
[ 3978.620346] [<ffffffff81809322>] ret_from_fork+0x42/0x70
[ 3978.620349] [<ffffffff810995e0>] ? kthread_create_on_node+0x1c0/0x1c0
[ 3982.996672] zd128: p1 p2
[ 3997.545643] zd64: p1 p2
[ 4754.304088] device tap109i0 entered promiscuous mode
[ 4757.049820] kvm: zapping shadow pages for mmio generation wraparound
[ 7868.460040] perf interrupt took too long (2511 > 2500), lowering kernel.perf_event_max_sample_rate to 50000


For now I have downgraded to 3.xx.

Thanks for your help.
 
