Hi guys.
Today we hit a strange kernel bug, after which one of the OSDs went down and its process hung.
Stack trace:
Jul 07 09:28:37 node12 kernel: divide error: 0000 [#1] SMP
Jul 07 09:28:37 node12 kernel: Modules linked in: veth rbd libceph nfsv3 rpcsec_gss_krb5 nfsv4 binfmt_misc ip_set ip6table_filter ip6_tables iptable_filter ip_tables x_tables softdog nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi 8021q garp mrp bonding nfnetlink_log nfnetlink xfs libcrc32c ipmi_ssif intel_rapl x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul cryptd ast ttm drm_kms_helper drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect snd_pcm sysimgblt snd_timer snd soundcore sb_edac pcspkr mei_me edac_core joydev input_leds i2c_i801 lpc_ich mei shpchp ioatdma wmi ipmi_si 8250_fintek ipmi_msghandler mac_hid acpi_power_meter vhost_net vhost macvtap macvlan autofs4
Jul 07 09:28:37 node12 kernel: zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) spl(O) zavl(PO) hid_generic usbkbd usbmouse usbhid hid ixgbe(O) dca vxlan ahci mpt3sas ip6_udp_tunnel libahci udp_tunnel raid_class ptp scsi_transport_sas pps_core fjes
Jul 07 09:28:37 node12 kernel: CPU: 10 PID: 31581 Comm: ceph-osd Tainted: P O 4.4.8-1-pve #1
Jul 07 09:28:37 node12 kernel: Hardware name: Supermicro SYS-2028U-TNRT+/X10DRU-i+, BIOS 1.1 07/22/2015
Jul 07 09:28:37 node12 kernel: task: ffff881fee535280 ti: ffff881ef5268000 task.ti: ffff881ef5268000
Jul 07 09:28:37 node12 kernel: RIP: 0010:[<ffffffff810b598c>] [<ffffffff810b598c>] task_numa_find_cpu+0x2cc/0x710
Jul 07 09:28:37 node12 kernel: RSP: 0000:ffff881ef526bbd8 EFLAGS: 00010257
Jul 07 09:28:37 node12 kernel: RAX: 0000000000000000 RBX: ffff881ef526bc80 RCX: 000000000000000e
Jul 07 09:28:37 node12 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff883f8bdfb600
Jul 07 09:28:37 node12 kernel: RBP: ffff881ef526bc48 R08: 0000000000000009 R09: 0000000000000271
Jul 07 09:28:37 node12 kernel: R10: 0000000000000176 R11: 00000000000001b4 R12: ffff881f3fcba940
Jul 07 09:28:37 node12 kernel: R13: ffff883f8bdfb600 R14: 00000000000001b4 R15: 0000000000000011
Jul 07 09:28:37 node12 kernel: FS: 00007f945ec24700(0000) GS:ffff881fffa80000(0000) knlGS:0000000000000000
Jul 07 09:28:37 node12 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 07 09:28:37 node12 kernel: CR2: 00000000174212c0 CR3: 0000003f72832000 CR4: 00000000001426e0
Jul 07 09:28:37 node12 kernel: Stack:
Jul 07 09:28:37 node12 kernel: ffff881ef526bbe8 ffffffff81050b9d 0000000000000009 0000000000017180
Jul 07 09:28:37 node12 kernel: 000000000000000e 0000000000017180 00000000000000b9 fffffffffffffe3f
Jul 07 09:28:37 node12 kernel: 0000000000000009 ffff881fee535280 ffff881ef526bc80 0000000000000197
Jul 07 09:28:37 node12 kernel: Call Trace:
Jul 07 09:28:37 node12 kernel: [<ffffffff81050b9d>] ? native_smp_send_reschedule+0x4d/0x70
Jul 07 09:28:37 node12 kernel: [<ffffffff810b62b6>] task_numa_migrate+0x4e6/0xa00
Jul 07 09:28:37 node12 kernel: [<ffffffff810b6849>] numa_migrate_preferred+0x79/0x80
Jul 07 09:28:37 node12 kernel: [<ffffffff810bb348>] task_numa_fault+0x848/0xd10
Jul 07 09:28:37 node12 kernel: [<ffffffff810ba969>] ? should_numa_migrate_memory+0x59/0x130
Jul 07 09:28:37 node12 kernel: [<ffffffff811c03d4>] handle_mm_fault+0xc64/0x1a20
Jul 07 09:28:37 node12 kernel: [<ffffffff8170c3d4>] ? SYSC_recvfrom+0x144/0x160
Jul 07 09:28:37 node12 kernel: [<ffffffff818441aa>] ? __schedule+0x38a/0xa30
Jul 07 09:28:37 node12 kernel: [<ffffffff8106b4ed>] __do_page_fault+0x19d/0x410
Jul 07 09:28:37 node12 kernel: [<ffffffff8106b782>] do_page_fault+0x22/0x30
Jul 07 09:28:37 node12 kernel: [<ffffffff8184ab38>] page_fault+0x28/0x30
Jul 07 09:28:37 node12 kernel: Code: d0 4c 89 ef e8 26 d0 ff ff 49 8b 85 b0 00 00 00 49 8b 75 78 31 d2 49 0f af 84 24 d8 01 00 00 4c 8b 45 d0 48 8b 4d b0 48 83 c6 01 <48> f7 f6 4c 89 c6 48 89 da 48 8d 3c 01 48 29 c6 e8 9f cd ff ff
Jul 07 09:28:37 node12 kernel: RIP [<ffffffff810b598c>] task_numa_find_cpu+0x2cc/0x710
Jul 07 09:28:37 node12 kernel: RSP <ffff881ef526bbd8>
Jul 07 09:28:37 node12 kernel: ---[ end trace 1b119ce7b8e959c7 ]---
############################
The crash is a divide error in task_numa_find_cpu(), i.e. in the NUMA balancing code of the fair (CFS) scheduler. After some googling I found an active bug against the Ubuntu 4.4-4.6 kernels:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1568729
It also seems to be related to Ceph and the fair scheduler:
http://thread.gmane.org/gmane.comp.file-systems.ceph.user/30793/focus=30987
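If this turns out to be the same issue, the workaround discussed in the Launchpad bug and the ceph-users thread appears to be disabling automatic NUMA balancing until a fixed kernel is available. A minimal sketch, to be applied on every node (the sysctl is the standard kernel.numa_balancing knob; the sysctl.d file name is just an example):

# show current state (1 = automatic NUMA balancing enabled)
cat /proc/sys/kernel/numa_balancing
# disable it at runtime
echo 0 > /proc/sys/kernel/numa_balancing
# make it persistent across reboots
echo "kernel.numa_balancing = 0" > /etc/sysctl.d/90-numa-balancing.conf
sysctl --system

Treat this as a suggestion taken from those threads rather than a confirmed fix.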
We have a 4-node Ceph/Proxmox cluster; all nodes run the same PVE version.
# pvecm status
Quorum information
------------------
Date: Thu Jul 7 17:20:18 2016
Quorum provider: corosync_votequorum
Nodes: 4
Node ID: 0x00000001
Ring ID: 8992
Quorate: Yes
Votequorum information
----------------------
Expected votes: 4
Highest expected: 4
Total votes: 4
Quorum: 3
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 172.16.110.20 (local)
0x00000002 1 172.16.110.30
0x00000003 1 172.16.110.40
0x00000004 1 172.16.110.50
# ceph -s
cluster 505fe7b7-0a8c-4adf-85bb-d82adda98a5a
health HEALTH_OK
monmap e4: 4 mons at {0=172.16.140.20:6789/0,1=172.16.140.30:6789/0,2=172.16.140.40:6789/0,3=172.16.140.50:6789/0}
election epoch 258, quorum 0,1,2,3 0,1,2,3
osdmap e1082: 64 osds: 64 up, 64 in
pgmap v3016545: 2112 pgs, 3 pools, 994 GB data, 252 kobjects
2952 GB used, 68552 GB / 71505 GB avail
2112 active+clean
client io 3021 B/s rd, 643 kB/s wr, 144 op/s
Today we caught this strange kernel panic while running:
# pveversion
pve-manager/4.2-11/2c626aa1 (running kernel: 4.4.8-1-pve)
Has anyone met the same bug?
Is there any chance to upgrade to a patched kernel (once the bug is resolved by Ubuntu)?
P.S.
The bug is very rare; we first hit it on one node after 2 months in production.