Search results

  1. OSD wont start after Ceph upgrade from Hammer to Jewel

    I just updated one of our Ceph nodes using this tutorial, from Hammer to Jewel. Unfortunately, after the upgrade the OSDs won't start. We use Proxmox 4.4.5. The OSDs have their journal mounted on SSD. The error is: root@ceph03:~# systemctl status ceph-osd@2.service ● ceph-osd@2.service - Ceph object...
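    A common cause of this after the Hammer to Jewel upgrade (not confirmed as the cause in this post) is that the Jewel daemons run as the ceph user instead of root, so the OSD data directories need their ownership changed. A minimal sketch, assuming OSD 2 on the default path:

      journalctl -u ceph-osd@2.service -n 50        # see why the unit failed to start
      chown -R ceph:ceph /var/lib/ceph/osd/ceph-2   # Jewel OSDs run as user ceph, not root
      systemctl start ceph-osd@2.service
      # if the journal is a raw SSD partition, its block device must also be accessible
      # to the ceph user (normally handled by the ceph udev rules)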
  2. Understanding Ceph

    In my case we have 12 OSDs (6 nodes, 2 OSDs per node). Using pg_calc with pool name rbd, size 3, 12 OSDs, 100% data, and a target of 100 PGs per OSD, the resulting PG count is 512. At the moment we have 256. Should I change to 512 or jump to 1024? According to the Ceph documentation, the range is: Less than 5 OSDs set pg_num to...
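    For reference, the pgcalc figure quoted above follows from the usual rule of thumb (total PGs = OSDs * target PGs per OSD / replica size, rounded up to a power of two); a sketch of the arithmetic and of checking the current values, assuming the rbd pool named above:

      # (12 OSDs * 100 target PGs per OSD) / size 3 = 400  ->  next power of two = 512
      ceph osd pool get rbd pg_num   # currently 256 in this case
      ceph osd pool get rbd size     # replica count, 3 here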
  3. Understanding Ceph

    Like they say, It's also important to know that the PG count can be increased, but NEVER decreased without destroying / recreating the pool. However, increasing the PG Count of a pool is one of the most impactful events in a Ceph Cluster, and should be avoided for production clusters if...
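    If the increase is needed anyway, it is applied in two steps (a sketch, again assuming the rbd pool; many operators raise pg_num in small increments on a production cluster to limit the rebalancing load):

      ceph osd pool set rbd pg_num 512
      ceph osd pool set rbd pgp_num 512   # pgp_num must follow pg_num before data actually rebalances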
  4. Understanding Ceph

    The situation is the same. Default pool. The only difference is that we have 6 nodes. Use this to set pg_num. :) http://ceph.com/pgcalc/
  5. Understanding Ceph

    Sorry if I hijacked this thread, it was not my intention. I have the same issue: how to make a small Ceph cluster HA. If you think it's better, I'll open another thread. :)
  6. Understanding Ceph

    Unfortunately I can't tell you at this moment what Ceph says when 2 OSDs are down. We have other problems with the VMs running on Ceph: partition corruption even when Ceph health is green. Now we are moving the VMs out of Ceph until things become clear and they run without problems. No problems in the...
  7. Understanding Ceph

    Here it is, # begin crush map tunable choose_local_tries 0 tunable choose_local_fallback_tries 0 tunable choose_total_tries 50 tunable chooseleaf_descend_once 1 tunable straw_calc_version 1 # devices device 0 osd.0 device 1 osd.1 device 2 osd.2 device 3 osd.3 device 4 osd.4 device 5 osd.5...
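    (For reference, a text dump like the one above can be produced by extracting and decompiling the cluster's CRUSH map; a sketch, assuming crushtool is installed on the node:)

      ceph osd getcrushmap -o crushmap.bin        # grab the compiled CRUSH map
      crushtool -d crushmap.bin -o crushmap.txt   # decompile it to readable text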
  8. Understanding Ceph

    We have 6 nodes, each node running 2 OSDs with the journal on an Intel Enterprise SSD. When a node goes down (2 OSDs out of the 12), 18% of the cluster goes out and there are still a lot of faults. The VMs go down, partition corruption... Still searching for a solution.
  9. Proxmox 4.4.5 kernel: Out of memory: Kill process 8543 (kvm) score or sacrifice child

    The problem on our side was with the Ceph nodes. From time to time, OSD daemons were killed and Ceph marked them down. One morning half of our OSDs were down, the cluster was rebuilding, most of the VM partitions were corrupted, some of them impossible to recover, data loss, absolute horror. After...
  10. Intel Skylake video memory purge kill osd process

    Our cluster uses Ceph as storage; this bug caused a lot of partition corruption, some of it impossible to recover, and the result was data loss. A lot of pain... :(
  11. Intel Skylake video memory purge kill osd process

    Hello, Fabian. Thanks for the answer. Today I found that topic and updated the kernel. I hope that it will be OK.
  12. Intel Skylake video memory purge kill osd process

    Is anyone alive on this forum? Does Proxmox have a living community?
  13. Intel Skylake video memory purge kill osd process

    I guess it is a bug. The ceph-osd daemons consume less than 500MB of RAM each. There are 2 OSDs and 16GB of memory; it should be sufficient. root@ceph05:~# ceph tell osd.6 heap stats osd.6 tcmalloc heap stats:------------------------------------------------ MALLOC: 282135536 ( 269.1 MiB) Bytes in use by...
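    (If the tcmalloc stats show a large amount of freed-but-retained memory, it can be handed back to the OS; a sketch of the usual commands, which will not fix an actual leak:)

      ceph tell osd.6 heap stats     # full tcmalloc breakdown, as quoted above
      ceph tell osd.6 heap release   # ask tcmalloc to return free pages to the kernel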
  14. Intel Skylake video memory purge kill osd process

    Unfortunately the problem persists. Not as often as before adding RAM, but from time to time it appears.
  15. Intel Skylake video memory purge kill osd process

    After some hours of hell, I came to a conclusion that could help someone in the same situation. Our Ceph cluster had 6 nodes, each node with 2 OSDs (2TB HDDs). Four of them have 16GB of RAM, two of them only 8GB each. There are no virtual machines running on the Ceph nodes. According...
  16. Intel Skylake video memory purge kill osd process

    Could it be that 8GB of RAM is not enough for a node with 2 OSD drives (HDD)? In the summary area (Proxmox dashboard) it says that 1.74GB of 7.68GB is in use.
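    (A quick way to see what the OSD daemons themselves are holding at a given moment, rather than the dashboard summary; a sketch:)

      free -h                             # overall memory, including buffers/cache
      ps -C ceph-osd -o pid,rss,vsz,cmd   # resident memory per OSD daemon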
  17. Intel Skylake video memory purge kill osd process

    We had just updated to the latest version of Proxmox 4.4.5 when the problem started. Our setup is a Ceph cluster of 6 servers, 3 of them with Intel Skylake CPUs. On those Skylake-based servers we see this: Jan 4 09:32:20 ceph07 kernel: [139775.594411] Purging GPU memory, 0...
  18. CEPH storage corrupting disks when a CEPH node goes down..

    We changed to min_size=2, size=3 and after that we did not see any special HDD activity. After this modification, shouldn't the cluster rebuild some of the data allocation?
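    (A note on what to expect: changing min_size only affects when client I/O is blocked, it does not move any data; only raising size creates additional replicas and triggers backfill. A sketch of verifying the settings and watching for recovery, assuming the rbd pool from the earlier posts:)

      ceph osd pool get rbd size       # replica count (3 after the change)
      ceph osd pool get rbd min_size   # minimum replicas required to serve I/O
      ceph -s                          # shows backfill/recovery only if size was actually raised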