Hello everyone.
I've set up a two-node Proxmox environment with DRBD-backed storage.
Code:
# pveversion -v
proxmox-ve: 4.1-37 (running kernel: 4.2.8-1-pve)
pve-manager: 4.1-13 (running version: 4.1-13/cfb599fb)
pve-kernel-4.2.6-1-pve: 4.2.6-36
pve-kernel-4.2.8-1-pve: 4.2.8-37
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 1.0-1
pve-cluster: 4.0-32
qemu-server: 4.0-55
pve-firmware: 1.1-7
libpve-common-perl: 4.0-48
libpve-access-control: 4.0-11
libpve-storage-perl: 4.0-40
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.5-5
pve-container: 1.0-44
pve-firewall: 2.0-17
pve-ha-manager: 1.0-21
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-7
lxcfs: 0.13-pve3
cgmanager: 0.39-pve1
criu: 1.6.0-1
zfsutils: 0.6.5-pve7~jessie
drbdmanage: 0.91-1
DRBD is not integrated with Proxmox (it's still considered experimental, as I understand it), so I configured it manually following these docs:
https://pve.proxmox.com/wiki/DRBD
https://pve.proxmox.com/wiki/DRBD9
NOTE: I came from an old Proxmox 1.9 setup and jumped to 4.1 by rebuilding everything from scratch (no DRBD metadata upgrade; everything was recreated). I only used the old config files as a reference, adapting them to the new DRBD9 style (an illustrative before/after follows).
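In practice, "adapting" mostly meant adding the DRBD9-specific bits to the old 8.x-style host sections. The snippet below is only an illustration of the kind of change, not my actual old config:
Code:
# DRBD 8.x style (old Proxmox 1.9 setup)
on A {
    device /dev/drbd0;
    disk /dev/sdb1;
    address 10.0.0.1:7788;
    meta-disk internal;
}

# DRBD9 style: each host additionally gets a node-id
on A {
    node-id 0;
    device /dev/drbd0;
    disk /dev/sdb1;
    address 10.0.0.1:7788;
    meta-disk internal;
}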
Well, it worked great for about three weeks, then one of the nodes crashed with a kernel panic.
After restarting it and rebuilding the DRBD resource (a sketch of the standard sequence follows), I upgraded both the kernel (currently 4.2.8) and the server BIOS... after a week the other node crashed, so it's not hardware- or VM-related.
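The rebuild itself was nothing special; a minimal sketch of the standard sequence, assuming the surviving node held the good data:
Code:
# on the crashed node: recreate the metadata and reattach the resource
drbdadm create-md r0
drbdadm up r0
# wait for the resync from the healthy node to finish, then promote again
drbdadm primary r0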
After some trial and error I found that heavy I/O traffic on the DRBD device quickly leads to a kernel panic. I moved all the VMs to node A, attached a serial console to node B, and triggered the crash with a write-heavy load (a sketch of the kind of load follows).
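Any sustained sequential write against the DRBD-backed storage was enough; the exact command doesn't matter much. As an illustration only (the volume path is hypothetical; point it at a disposable test LV on top of /dev/drbd0):
Code:
# hypothetical test LV on the VG that sits on /dev/drbd0 -- this WILL destroy its contents
dd if=/dev/zero of=/dev/drbdvg/testlv bs=1M count=8192 oflag=direct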
(full log here: http://pastebin.com/8HEvR42w)
The first 4 lines show that DRBD has crashed:
Code:
[ 2678.632647] drbd r0/0 drbd0: LOGIC BUG for enr=98347
[ 2678.637678] drbd r0/0 drbd0: LOGIC BUG for enr=98347
[ 2679.045598] ------------[ cut here ]------------
[ 2679.050273] kernel BUG at /home/dietmar/pve4-devel/pve-kernel/drbd-9.0.0/drbd/lru_cache.c:571!
[ 2679.058981] invalid opcode: 0000 [#1] SMP
[ 2679.063148] Modules linked in: ip_set ip6table_filter ip6_tables drbd_transport_tcp(O) softdog drbd(O) libcrc32c nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_comment xt_conntrack xt_multiport iptable_filter iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables nfnetlink_log nfnetlink zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) spl(O) zavl(PO) ipmi_ssif amdkfd amd_iommu_v2 radeon gpio_ich ttm snd_pcm coretemp hpilo snd_timer drm_kms_helper snd drm kvm_intel soundcore i2c_algo_bit kvm psmouse 8250_fintek input_leds ipmi_si shpchp pcspkr ipmi_msghandler serio_raw acpi_power_meter lpc_ich i7core_edac edac_core mac_hid vhost_net vhost macvtap macvlan autofs4 hid_generic usbmouse usbkbd usbhid hid pata_acpi tg3 e1000e(O) ptp pps_core hpsa
[ 2679.149400] CPU: 0 PID: 2171 Comm: drbd_a_r0 Tainted: P IO 4.2.8-1-pve #1
[ 2679.157313] Hardware name: HP ProLiant ML350 G6, BIOS D22 08/16/2015
[ 2679.163733] task: ffff8800dcb9a580 ti: ffff8801fccd8000 task.ti: ffff8801fccd8000
[ 2679.171295] RIP: 0010:[<ffffffffc0ad25a0>] [<ffffffffc0ad25a0>] lc_put+0x90/0xa0 [drbd]
[ 2679.179499] RSP: 0018:ffff8801fccdbb08 EFLAGS: 00010046
[ 2679.184867] RAX: 0000000000000000 RBX: 000000000001802b RCX: ffff880207770f90
[ 2679.192078] RDX: ffff88020b8d4000 RSI: ffff880207770f90 RDI: ffff8800e20eb680
[ 2679.199288] RBP: ffff8801fccdbb08 R08: 0000000000000270 R09: 0000000000000000
[ 2679.206499] R10: ffff8800b0d8d460 R11: 0000000000000000 R12: ffff8800362a9800
[ 2679.213711] R13: 0000000000000000 R14: 000000000001802b R15: 0000000000000001
[ 2679.220923] FS: 0000000000000000(0000) GS:ffff880217400000(0000) knlGS:0000000000000000
[ 2679.229100] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2679.234906] CR2: 00007f506e540123 CR3: 0000000001e0d000 CR4: 00000000000026f0
[ 2679.242117] Stack:
[ 2679.244147] ffff8801fccdbb58 ffffffffc0acf2ea 0000000000000046 ffff8800362a9ab0
[ 2679.251682] ffff8801fccdbb68 ffff8800b0d8d028 ffff8800362a9800 ffff8800b0d8d038
[ 2679.259216] 0000000000000800 0000000000001000 ffff8801fccdbb68 ffffffffc0acf7f0
[ 2679.266751] Call Trace:
[ 2679.269229] [<ffffffffc0acf2ea>] put_actlog+0x6a/0x120 [drbd]
[ 2679.275131] [<ffffffffc0acf7f0>] drbd_al_complete_io+0x30/0x40 [drbd]
[ 2679.281735] [<ffffffffc0ac9a42>] drbd_req_destroy+0x442/0x880 [drbd]
[ 2679.288246] [<ffffffff81735350>] ? tcp_recvmsg+0x390/0xb90
[ 2679.293881] [<ffffffffc0aca385>] mod_rq_state+0x505/0x7c0 [drbd]
[ 2679.300131] [<ffffffffc0aca934>] __req_mod+0x214/0x8d0 [drbd]
[ 2679.310587] [<ffffffffc0ad441b>] tl_release+0x1db/0x320 [drbd]
[ 2679.321185] [<ffffffffc0ab8232>] got_BarrierAck+0x32/0xc0 [drbd]
[ 2679.331962] [<ffffffffc0ac8670>] drbd_ack_receiver+0x160/0x5c0 [drbd]
[ 2679.343127] [<ffffffffc0ad29a0>] ? w_complete+0x20/0x20 [drbd]
[ 2679.353619] [<ffffffffc0ad2a04>] drbd_thread_setup+0x64/0x120 [drbd]
[ 2679.364599] [<ffffffffc0ad29a0>] ? w_complete+0x20/0x20 [drbd]
[ 2679.375114] [<ffffffff8109b1fa>] kthread+0xea/0x100
[ 2679.384655] [<ffffffff8109b110>] ? kthread_create_on_node+0x1f0/0x1f0
[ 2679.395851] [<ffffffff81809e5f>] ret_from_fork+0x3f/0x70
[ 2679.405802] [<ffffffff8109b110>] ? kthread_create_on_node+0x1f0/0x1f0
[ 2679.416870] Code: 89 42 08 48 89 56 10 48 89 7e 18 48 89 07 83 6f 64 01 f0 80 a7 90 00 00 00 f7 f0 80 a7 90 00 00 00 fe 8b 46 20 5d c3 0f 0b 0f 0b <0f> 0b 0f 0b 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90
[ 2679.446124] RIP [<ffffffffc0ad25a0>] lc_put+0x90/0xa0 [drbd]
[ 2679.456617] RSP <ffff8801fccdbb08>
[ 2679.464764] ---[ end trace b1f10fd6ac931718 ]---
[ 2693.529503] ------------[ cut here ]------------
[ 2693.538650] WARNING: CPU: 7 PID: 0 at kernel/watchdog.c:311 watchdog_overflow_callback+0x84/0xa0()
[ 2693.552118] Watchdog detected hard LOCKUP on cpu 7
[ 2693.556788] Modules linked in: ip_set ip6table_filter ip6_tables drbd_transport_tcp(O) softdog drbd(O) libcrc32c nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_comment xt_conntrack xt_multiport iptable_filter iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables nfnetlink_log nfnetlink zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) spl(O) zavl(PO) ipmi_ssif amdkfd amd_iommu_v2 radeon gpio_ich ttm snd_pcm coretemp hpilo snd_timer
[ 2693.637982] ------------[ cut here ]------------
This is my DRBD config:
Code:
global {
    usage-count yes;
}

common {
    handlers {
    }
    startup {
    }
    options {
    }
    disk {
        resync-rate 50M;
        c-max-rate 50M;
    }
    net {
    }
}
Code:
resource r0 {
    protocol C;

    startup {
        wfc-timeout 0;
        degr-wfc-timeout 60;
        become-primary-on both;
    }

    net {
        cram-hmac-alg sha1;
        shared-secret "myPassword4Drbd";
        allow-two-primaries;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
        sndbuf-size 0;
        rcvbuf-size 0;
        max-buffers 8000;
        max-epoch-size 8000;
    }

    on A {
        node-id 0;
        device /dev/drbd0;
        disk /dev/sdb1;
        address 10.0.0.1:7788;
        meta-disk internal;
    }

    on B {
        node-id 1;
        device /dev/drbd0;
        disk /dev/sdb1;
        address 10.0.0.2:7788;
        meta-disk internal;
    }

    disk {
        # no-disk-barrier and no-disk-flushes should be applied only to systems with
        # non-volatile (battery-backed) controller caches. Follow these links for more information:
        # http://www.drbd.org/users-guide-8.3/s-throughput-tuning.html#s-tune-disable-barriers
        # http://www.drbd.org/users-guide/s-throughput-tuning.html#s-tune-disable-barriers
        no-disk-barrier;
        no-disk-flushes;
    }
}
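Both nodes run as Primary (allow-two-primaries). After bringing the resource up I check its state with the standard DRBD 9 status command, something like:
Code:
drbdadm status r0
# both nodes should report role:Primary and disk:UpToDate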
As a workaround I downgraded host B to kernel 4.2.2-1-pve (roughly as shown below) and started the I/O traffic again.
It has now been running for 30 minutes without a crash (under kernel 4.2.8-1-pve it crashed after 3-4 minutes).
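For reference, the downgrade was just a matter of installing the older PVE kernel package and booting into it (the package name below follows the usual pve-kernel naming; adjust it to what your repository actually carries):
Code:
apt-get install pve-kernel-4.2.2-1-pve
# select the 4.2.2-1-pve entry from the GRUB menu on the next boot
reboot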
Does anyone have an idea of what's wrong?