[SOLVED] Ceph Error

Hello,
Today all VMs on one of our servers went down. The following error occurred.

Does anyone have an idea?

Code:
Thread::try_create(): pthread_create failed with error 11common/Thread.cc: In function 'void Thread::create(size_t)' thread 7fb7c7b97780 time 2017-03-21 23:41:48.132092
common/Thread.cc: 131: FAILED assert(ret == 0)
ceph version 0.94.10 (b1e0532418e4631af01acbc0cedd426f1905f4af)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x76) [0x5016d6]
2: /usr/bin/rbd() [0x4fe38f]
3: (CephContext::CephContext(unsigned int)+0x149) [0x505a99]
4: (common_preinit(CephInitParameters const&, code_environment_t, int)+0x32) [0x516832]
5: (global_pre_init(std::vector<char const*, std::allocator<char const*> >*, std::vector<char const*, std::allocator<char const*> >&, unsigned int, code_environment_t, int)+0x9a) [0x59908a]
6: (global_init(std::vector<char const*, std::allocator<char const*> >*, std::vector<char const*, std::allocator<char const*> >&, unsigned int, code_environment_t, int)+0x1c) [0x59998c]
7: (main()+0xad) [0x4b9bbd]
8: (__libc_start_main()+0xf5) [0x7fb7c0bfcb45]
9: /usr/bin/rbd() [0x4c2717]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
can't unmap rbd volume vm-111-disk-1: terminate called after throwing an instance of 'ceph::FailedAssertion'
TASK ERROR: start failed: command '/usr/bin/kvm -id 111 -chardev 'socket,id=qmp,path=/var/run/qemu-server/111.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -pidfile /var/run/qemu-server/111.pid -daemonize -smbios 'type=1,uuid=fa4f7f02-7c2d-4c9d-a8d8-956864b10ff9' -name gameserver -smp '4,sockets=1,cores=4,maxcpus=4' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vga std -vnc unix:/var/run/qemu-server/111.vnc,x509,password -no-hpet -cpu 'kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_reset,hv_vpindex,hv_runtime,hv_relaxed,enforce' -m 2048 -k de -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:3681fcbb6821' -drive 'file=/mnt/iso-data/template/iso/virtio-win.iso,if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -drive 'file=/dev/rbd/vm-data/vm-111-disk-1,if=none,id=drive-virtio0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap111i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=5E:E5:0F:50:D7:2F,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' -rtc 'driftfix=slew,base=localtime' -global 'kvm-pit.lost_tick_policy=discard'' failed: open3: fork failed: Die Ressource ist zur Zeit nicht verfügbar at /usr/share/perl5/PVE/Tools.pm line 411.
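On Linux, error 11 from pthread_create is EAGAIN, the same "Resource temporarily unavailable" condition that the fork failure at the end of the log reports in German. Both calls typically fail this way when a thread or process limit has been reached. The following is only a minimal sketch for comparing current usage with the usual limits, assuming a Linux host with /proc mounted; the function name is made up for illustration, and the numbers alone do not prove what caused the failure in this thread.

```shell
#!/bin/sh
# Sketch: compare current thread usage with common Linux limits.
# Assumption: Linux with /proc mounted. Nothing here is Ceph- or Proxmox-specific.
# report_thread_limits is a hypothetical helper name, not an existing tool.

report_thread_limits() {
    # Soft per-user limit on processes/threads (RLIMIT_NPROC) for this shell.
    echo "max user processes (soft): $(awk '/^Max processes/ { print $3 }' /proc/self/limits)"
    # System-wide ceiling on the number of threads.
    echo "kernel.threads-max: $(cat /proc/sys/kernel/threads-max)"
    # Highest PID the kernel hands out; every thread consumes one ID.
    echo "kernel.pid_max: $(cat /proc/sys/kernel/pid_max)"
    # Threads currently alive, counted from the per-task directories in /proc.
    echo "threads in use: $(ls -d /proc/[0-9]*/task/[0-9]* 2>/dev/null | wc -l)"
}

report_thread_limits
```

If "threads in use" is close to one of the limits, that would be worth investigating further before restarting VMs.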
 
Hi,
did you change anything in the Ceph installation? For example, installing updates without restarting the Ceph services afterwards?

Can you post the output of the following commands?
Code:
ceph -s
ceph osd tree
dpkg -l | grep ceph
Udo
 
Hey, sorry, I was tied up for the past few days. Yes, I had run an apt-get upgrade, but it did not tell me to restart anything.

Here is the data:
ceph -s
Code:
    cluster 96f86bf1-a42d-4267-8928-d685eee56605
     health HEALTH_OK
     monmap e3: 3 mons at {0=10.10.1.1:6789/0,1=10.10.1.2:6789/0,2=10.10.1.3:6789/0}
            election epoch 406, quorum 0,1,2 0,1,2
     osdmap e1524: 6 osds: 6 up, 6 in
      pgmap v5832177: 128 pgs, 2 pools, 1188 GB data, 303 kobjects
            2411 GB used, 8761 GB / 11172 GB avail
                 128 active+clean
  client io 40826 B/s wr, 12 op/s
ceph osd tree
Code:
ID WEIGHT   TYPE NAME          UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 10.91995 root default
-2  3.63998     host pve01cp01
 0  1.81999         osd.0           up  1.00000          1.00000
 1  1.81999         osd.1           up  1.00000          1.00000
-3  3.63998     host pve02cp01
 2  1.81999         osd.2           up  1.00000          1.00000
 3  1.81999         osd.3           up  1.00000          1.00000
-4  3.63998     host pve03cp01
 4  1.81999         osd.4           up  1.00000          1.00000
 5  1.81999         osd.5           up  1.00000          1.00000
dpkg -l | grep ceph
Code:
ii  ceph                                 0.94.10-1~bpo80+1              amd64        distributed storage and file system
ii  ceph-common                          0.94.10-1~bpo80+1              amd64        common utilities to mount and interact with a ceph storage cluster
ii  libcephfs1                           0.94.10-1~bpo80+1              amd64        Ceph distributed file system client library
ii  python-ceph                          0.94.10-1~bpo80+1              amd64        Meta-package for python libraries for the Ceph libraries
ii  python-cephfs                        0.94.10-1~bpo80+1              amd64        Python libraries for the Ceph libcephfs library
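A quick way to confirm that all installed Ceph packages are at the same version is to reduce the dpkg listing to its distinct version strings. This is a small sketch; the function name ceph_package_versions is made up for illustration, and the sample lines are abridged from the listing above. It only looks at installed packages: daemons started before the update may still be running the old binaries until they are restarted.

```shell
#!/bin/sh
# Sketch: list the distinct versions of installed Ceph packages.
# Reads "dpkg -l" lines on stdin; "ii" marks a fully installed package.
# ceph_package_versions is a hypothetical helper name.

ceph_package_versions() {
    awk '$1 == "ii" && $2 ~ /ceph/ { print $3 }' | sort -u
}

# Example with two lines abridged from the dpkg output in this thread.
printf '%s\n' \
  'ii  ceph         0.94.10-1~bpo80+1  amd64  distributed storage and file system' \
  'ii  ceph-common  0.94.10-1~bpo80+1  amd64  common utilities' \
  | ceph_package_versions
```

On a live system the input would come from `dpkg -l | ceph_package_versions`. A single line of output means every matching package is at the same version.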
 
Hi,
on Proxmox, always use dist-upgrade! If only "normal" packages are being updated, a plain upgrade is enough, but with PVE packages you can break your system, because new dependencies are not pulled in.

Whenever Ceph packages are updated, you have to restart the Ceph services, ideally following Ceph's upgrade instructions. Usually the monitors are restarted first and the OSDs afterwards.

I could imagine that your error is gone after that.

Udo
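The order described above (dist-upgrade, then monitors, then OSDs) can be sketched as a dry run that only prints the commands. The monitor and OSD IDs are taken from the ceph -s and ceph osd tree output in this thread. The "service ceph restart <daemon>" syntax is an assumption for a sysvinit-style Ceph 0.94 setup and may differ on other installations, and plan_restarts is a made-up helper name. Each restart command would have to be run on the node that hosts the daemon.

```shell
#!/bin/sh
# Dry-run sketch: print an upgrade and restart order, monitors first, then OSDs.
# Assumption (not from the thread): sysvinit-style "service ceph restart <daemon>".
# Nothing is executed; the script only prints the plan.

MON_IDS="0 1 2"          # monitor IDs as shown in the "ceph -s" output
OSD_IDS="0 1 2 3 4 5"    # OSD IDs as shown in the "ceph osd tree" output

plan_restarts() {
    echo "apt-get update"
    echo "apt-get dist-upgrade"
    for id in $MON_IDS; do
        echo "service ceph restart mon.$id"
    done
    for id in $OSD_IDS; do
        echo "service ceph restart osd.$id"
        # Let the cluster settle before touching the next OSD.
        echo "ceph -s   # continue once health is HEALTH_OK again"
    done
}

plan_restarts
```

Restarting one OSD at a time and checking ceph -s in between is a cautious choice for a small cluster like this one; Ceph's own upgrade notes for the installed release take precedence.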
 