PVE HA cluster cannot start VM on failover

mike.admon

New Member
Apr 6, 2020
Hi
My setup includes:

3 PVE hypervisors (pve1, pve2 and pve3), installed as nested VMs in Hyper-V on Windows 10 and all joined to one cluster.
2 of them (pve1 and pve2) each have a 60G Ceph OSD, together forming a 120G pool, and these 2 also form the HA group.
Ceph Nautilus is installed and configured on all 3 members.
Ceph has 3 monitors and 3 managers (1 on each cluster member).
3 MDS as well, in the same manner.
I have 1 VM running on the cluster; it is a member of the HA group and can run on either pve1 or pve2.
pve3 was installed only to provide the 3rd vote for quorum.
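For reference, this is roughly how I verify the cluster and Ceph state from any node (standard Proxmox/Ceph CLI; output depends on the live cluster, so none is shown here):

```shell
# Corosync quorum: all 3 nodes should be listed, each with 1 vote
pvecm status

# Overall Ceph health, OSD layout and pool usage
ceph -s
ceph osd tree
ceph df
```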

I can successfully migrate the VM between pve1 and pve2; it takes seconds, with ~1 ping packet lost.
I also tried a disk move - that works as well.

When I disconnect the PVE server the VM is currently running on, the remaining nodes keep quorum and the HA manager tries to recover the VM on the other HA group member.
It starts the recovery, but after a while it fails with a timeout.

What am I doing wrong?

The error I am getting is :

--------------------------------------------------

task started by HA resource agent
TASK ERROR: start failed: command '/usr/bin/kvm -id 100 -name Centos-ceph -chardev 'socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/100.pid -daemonize -smbios 'type=1,uuid=7504c566-e10f-4cdf-bf50-1217d99fbd2a' -smp '1,sockets=1,cores=1,maxcpus=1' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vnc unix:/var/run/qemu-server/100.vnc,password -cpu qemu64 -m 1024 -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'vmgenid,guid=6f56f77e-235b-4bbf-a43f-3d28ec1595b7' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'VGA,id=vga,bus=pci.0,addr=0x2' -chardev 'socket,path=/var/run/qemu-server/100.qga,server,nowait,id=qga0' -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:241650b23c9' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 'file=rbd:ceph_pool_1/vm-100-disk-0:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/ceph_pool_1.keyring,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap100i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown' -device 'e1000,mac=C6:43:59:82:CA:65,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' -machine 'accel=tcg,type=pc+pve1'' failed: got 
timeout

------------------------------------------------------------------------
 
If you have only two OSDs and one of them goes down, Ceph will probably block writes (depending on your replication settings, which you did not include in your post). You probably want more than two OSDs anyway for Ceph to make any sense ;)
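You can check what your pool is currently set to, e.g. (the pool name `ceph_pool_1` is taken from the kvm command line in your error output; adjust if yours differs):

```shell
# Number of replicas kept for each object
ceph osd pool get ceph_pool_1 size

# Minimum replicas that must be available before I/O is allowed
ceph osd pool get ceph_pool_1 min_size
```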
 
The idea (and the requirement) is to have 2 "strong" PVE servers with disks for Ceph distributed storage, plus a 3rd PVE node acting only as a "witness" for quorum voting.
Given that I have:

* 2 strong servers with 2 partitions each (1 for the PVE OS and the second for VM storage)
* 1 mini server (or even a PC) that serves only as the 3rd cluster member for quorum voting, with no VM storage (no Ceph OSD)

how can I achieve redundancy and failover?
Do I need 3 identical storage nodes in Ceph for a working platform, or can I get by with just 2? The project provided only 2 servers in total, and I am trying to stretch that with an additional PC (I understand I cannot have HA with only 2 nodes, since failover will not start with just 1 vote).
Please advise.


Thanks
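In case it matters, the node placement described above maps to an HA group definition along these lines (a sketch; the group name is made up, the format is that of `/etc/pve/ha/groups.cfg`):

```
group: ha-group-1
        nodes pve1,pve2
        restricted 1
        nofailback 0
```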
 
Here is my /etc/pve/ceph.conf:

[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
cluster_network = 192.168.31.201/24
fsid = 7fdc14bc-837c-4af7-ba92-18e35762165f
mon_allow_pool_delete = true
mon_host = 192.168.31.201 192.168.31.202 192.168.31.203
osd_pool_default_min_size = 1
osd_pool_default_size = 2
public_network = 192.168.31.201/24
mon_pg_warn_max_per_osd = 0

[client]
keyring = /etc/pve/priv/$cluster.$name.keyring

[mds]
keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mds.pve]
host = pve
mds_standby_for_name = pve

[mds.pve2]
host = pve2
mds_standby_for_name = pve

[mds.pve3]
host = pve3
mds_standby_for_name = pve
 

That's still not a good idea, and not what Ceph works well for. Of course you can set it up with 2/1 replication, but then if one node goes down, you have only a single copy of your data! If an hour later the OSD disk on the still-running node fails, at least that hour of changes is gone forever. If your second node never comes back up, everything is gone. Performance will also be bad: with just 2 OSDs you are firmly in "all the negative aspects/overhead, but none of the advantages/scale" territory.
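For comparison, the usual 3/2 replication (the Ceph default) would be set like this on an existing pool (pool name again taken from the post above) - which of course requires a third OSD node, which is exactly the point:

```shell
# Keep 3 copies of each object...
ceph osd pool set ceph_pool_1 size 3
# ...and block client I/O whenever fewer than 2 copies are available
ceph osd pool set ceph_pool_1 min_size 2
```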
 
