Proxmox node is down, VM is stuck and it is not possible to restart the VM

TMTMTM

New Member
Hello everyone,


I have a problem with my Proxmox Ceph Cluster:


There are 4 machines in a Proxmox 4.4-87 cluster. Each of these machines has 2 Ceph OSDs, so in total we have 8 OSDs.

The Ceph pool config looks like this:

ceph osd dump | grep -i rbd

pool 5 'rbd' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 59469 flags hashpspool stripe_width 0

If one of the Proxmox nodes goes down, all VMs get stuck. For example: I rebooted stor04 and all VMs froze. I stopped a VM on stor01 and tried to start it again, but nothing happened until stor04 was back online...


Do you have any suggestions for me? Normally it should not be a problem if one node is down, especially since I have replicated size 3, so all VM data should be present on 3 OSDs, i.e. on at least 2 physical machines. But the cluster is not usable if one Proxmox node is down. That is not normal, so I think there must be an error!
 

fabian

Proxmox Staff Member
also, running with min_size 1 is asking for trouble IMHO..
 

TMTMTM

New Member
also, running with min_size 1 is asking for trouble IMHO..
Why is this asking for trouble? Sure, it could be very slow, but normally it should be okay...?


I used min_size 1 because I have replica size 3. With this configuration it is possible that VM100 ends up on both OSDs of proxmox4 and on one OSD of proxmox3. With min_size 2 the VM would then not be usable if proxmox4 went down; with min_size 1 it should still be usable.
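
(As a side note on that placement assumption: whether two of the three copies can actually land on the same physical host depends on the failure domain of the CRUSH rule the pool uses. A rough sketch of how to check is to dump the rule and look at the chooseleaf step:

ceph osd crush rule dump
# "type" : "host" in the chooseleaf step means each replica goes to a
# different host; "type" : "osd" would allow two copies on the same machine

With a host-level failure domain and size 3, the three copies are always on three different machines.)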


Or do you have a better idea for me / for my configuration?



Physically I have:


4 nodes, each with 2 SATA disks and 2 SSDs. What I did is:

1 SSD used for the system
1 SSD used for the journal of both OSDs
Both SATA disks as OSD.0 and OSD.1 (and so on)
 

TMTMTM

New Member
Hmm... okay. So the rule of thumb is "Number_of_Nodes - 1" = Number_of_Monitors?
What about the min_size that Fabian mentioned?
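
(For context on the monitor count: Ceph monitors need a strict majority for quorum, so 3 monitors tolerate one failure and 4 monitors still tolerate only one, which is why an odd number is usually recommended. The current quorum can be checked with, for example:

ceph quorum_status
)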
 

TMTMTM

New Member
Found an interesting problem:

rbd: ceph-storage
    monhost 10.2.19.11;10.2.19.12;10.2.19.13
    content images
    krbd 0
    pool rbd

This is my storage.cfg configuration for the Ceph storage. But I configured all 4 nodes (10.2.19.14 too) as monitors in Ceph.

I will try to disable the monitor service on node .14 (this is the machine that was down while nothing was working).
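
For reference, if the monitor on node .14 were kept instead, the monhost list could simply include it as well; a sketch, assuming the rest of the entry stays as above. The monhost entries are only used for the initial contact, so at least one of the listed monitors must be reachable:

rbd: ceph-storage
    monhost 10.2.19.11;10.2.19.12;10.2.19.13;10.2.19.14
    content images
    krbd 0
    pool rbd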
 


fabian

Proxmox Staff Member
What about the min_size that Fabian mentioned?
The problem with running with min_size 1 is that Ceph will still accept writes when only a single copy of the data is available, and if that single copy also fails, the data is gone. min_size is an additional safeguard; the general recommendation is to run with min_size 2 in production and only lower it to 1 temporarily, after careful consideration of the implications, for emergency maintenance or disaster recovery, and only if required.

See this longer thread for some discussion about size and min_size: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-December/014846.html , and especially this mail for a summary of what running with min_size 1 might entail: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-December/014892.html
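
For reference, raising min_size on an existing pool is a single command (sketched here assuming the pool is still called rbd, as in the dump above):

ceph osd pool set rbd min_size 2

The setting takes effect immediately; it can be lowered back to 1 the same way if that is ever really needed during recovery.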
 

TMTMTM

New Member
Right now one machine is down and ALL VMs are frozen... I have no idea. How can I figure out where the problem is?


Here is the output of pveceph status


{
   "health" : {
      "overall_status" : "HEALTH_WARN",
      "summary" : [
         {
            "severity" : "HEALTH_WARN",
            "summary" : "190 pgs stuck unclean"
         },
         {
            "severity" : "HEALTH_WARN",
            "summary" : "7 requests are blocked > 32 sec"
         },
         {
            "summary" : "recovery 68380/502920 objects degraded (13.597%)",
            "severity" : "HEALTH_WARN"
         },
         {
            "severity" : "HEALTH_WARN",
            "summary" : "recovery 124465/502920 objects misplaced (24.748%)"
         },
         {
            "severity" : "HEALTH_WARN",
            "summary" : "2/8 in osds are down"
         },
         {
            "summary" : "noout flag(s) set",
            "severity" : "HEALTH_WARN"
         }
      ],
      "health" : {
         "health_services" : [
            {
               "mons" : [
                  {
                     "store_stats" : {
                        "last_updated" : "0.000000",
                        "bytes_misc" : 103528931,
                        "bytes_total" : 107870981,
                        "bytes_log" : 4342050,
                        "bytes_sst" : 0
                     },
                     "kb_avail" : 18129344,
                     "last_updated" : "2017-05-27 23:14:26.065363",
                     "health" : "HEALTH_OK",
                     "kb_total" : 26704124,
                     "avail_percent" : 67,
                     "kb_used" : 7342064,
                     "name" : "0"
                  },
                  {
                     "avail_percent" : 60,
                     "kb_used" : 3500644,
                     "name" : "1",
                     "store_stats" : {
                        "bytes_total" : 106411715,
                        "bytes_log" : 2146219,
                        "bytes_sst" : 0,
                        "last_updated" : "0.000000",
                        "bytes_misc" : 104265496
                     },
                     "health" : "HEALTH_OK",
                     "kb_total" : 10190136,
                     "last_updated" : "2017-05-27 23:15:11.261335",
                     "kb_avail" : 6148820
                  },
                  {
                     "avail_percent" : 57,
                     "kb_used" : 3796068,
                     "name" : "2",
                     "store_stats" : {
                        "bytes_sst" : 0,
                        "bytes_log" : 449948,
                        "bytes_total" : 94489705,
                        "bytes_misc" : 94039757,
                        "last_updated" : "0.000000"
                     },
                     "last_updated" : "2017-05-27 23:14:38.311855",
                     "kb_total" : 10190136,
                     "health" : "HEALTH_OK",
                     "kb_avail" : 5853396
                  }
               ]
            }
         ]
      },
      "timechecks" : {
         "mons" : [
            {
               "skew" : 0,
               "health" : "HEALTH_OK",
               "latency" : 0,
               "name" : "0"
            },
            {
               "latency" : 0.002798,
               "health" : "HEALTH_OK",
               "skew" : 0,
               "name" : "1"
            },
            {
               "name" : "2",
               "health" : "HEALTH_OK",
               "skew" : -0.000384,
               "latency" : 0.013545
            }
         ],
         "epoch" : 156,
         "round" : 250,
         "round_status" : "finished"
      },
      "detail" : []
   },
   "mdsmap" : {
      "epoch" : 1,
      "up" : 0,
      "max" : 0,
      "by_rank" : [],
      "in" : 0
   },
   "quorum" : [
      0,
      1,
      2
   ],
   "election_epoch" : 156,
   "quorum_names" : [
      "0",
      "1",
      "2"
   ],
   "osdmap" : {
      "osdmap" : {
         "full" : false,
         "num_remapped_pgs" : 190,
         "nearfull" : false,
         "num_in_osds" : 8,
         "num_osds" : 8,
         "epoch" : 79804,
         "num_up_osds" : 6
      }
   },
   "fsid" : "78667f72-e04d-416a-b37d-e86590ae0422",
   "monmap" : {
      "epoch" : 5,
      "modified" : "2017-05-21 12:22:34.679432",
      "created" : "2016-11-13 18:18:12.273042",
      "mons" : [
         {
            "rank" : 0,
            "name" : "0",
            "addr" : "10.2.19.11:6789/0"
         },
         {
            "name" : "1",
            "addr" : "10.2.19.12:6789/0",
            "rank" : 1
         },
         {
            "rank" : 2,
            "addr" : "10.2.19.13:6789/0",
            "name" : "2"
         }
      ],
      "fsid" : "78667f72-e04d-416a-b37d-e86590ae0422"
   },
   "pgmap" : {
      "num_pgs" : 256,
      "misplaced_objects" : 124465,
      "bytes_used" : 2103791534080,
      "degraded_ratio" : 0.135966,
      "bytes_total" : 15995368865792,
      "degraded_total" : 502920,
      "data_bytes" : 696680397124,
      "degraded_objects" : 68380,
      "misplaced_ratio" : 0.247485,
      "write_bytes_sec" : 189638,
      "misplaced_total" : 502920,
      "bytes_avail" : 13891577331712,
      "version" : 5669919,
      "pgs_by_state" : [
         {
            "count" : 190,
            "state_name" : "active+remapped"
         },
         {
            "count" : 66,
            "state_name" : "active+clean"
         }
      ],
      "op_per_sec" : 19
   }
}
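
(To narrow down which OSDs the blocked requests and stuck PGs belong to, the usual starting points are sketched below; the exact wording of the output varies between Ceph versions:

ceph health detail            # names the stuck PGs and the OSDs with blocked requests
ceph osd tree                 # shows which of the 8 OSDs are down and on which host
ceph pg dump_stuck unclean    # lists the PGs that are stuck unclean
)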
 
