VM on HA can't start

Sasonsha

Jul 27, 2023
Hello,
I have the following setup.
On Windows 10 I run VMware Workstation with 3 VMs, and each one has Proxmox installed.
I created a Proxmox cluster from these 3 nodes, and I have a VM on one of the nodes.
I configured Ceph and NTP.
I set KVM hardware virtualization to off in the VM options, because the VM won't start with it enabled.
When I switch off the node with the VM on it (call it pve2), the VM moves to another node but can't start, and I get the error below.
Any help would be much appreciated! (If you need more info, please tell me.)


Code:
task started by HA resource agent
TASK ERROR: start failed: command '/usr/bin/kvm -id 100 -name 'ubuntu20.04,debug-threads=on' -no-shutdown -chardev 'socket,id=qmp,path=/var/run/qemu-server/100.qmp,server=on,wait=off' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/100.pid -daemonize -smbios 'type=1,uuid=9c97de18-9fda-4bf0-acd5-808365732c83' -smp '1,sockets=1,cores=1,maxcpus=1' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vnc 'unix:/var/run/qemu-server/100.vnc,password=on' -cpu qemu64,+aes,+pni,+popcnt,+sse4.1,+sse4.2,+ssse3 -m 1024 -object 'iothread,id=iothread-virtioscsi0' -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'pci-bridge,id=pci.3,chassis_nr=3,bus=pci.0,addr=0x5' -device 'vmgenid,guid=e74e7d02-3820-4b1b-99d2-554fedb7c038' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'VGA,id=vga,bus=pci.0,addr=0x2' -chardev 'socket,path=/var/run/qemu-server/100.qga,server=on,wait=off,id=qga0' -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:862b1cb59b' -drive 'if=none,id=drive-ide2,media=cdrom,aio=io_uring' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=101' -device 'virtio-scsi-pci,id=virtioscsi0,bus=pci.3,addr=0x1,iothread=iothread-virtioscsi0' -drive 'file=rbd:pvepool01/vm-100-disk-0:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/pvepool01.keyring,if=none,id=drive-scsi0,format=raw,cache=none,aio=io_uring,detect-zeroes=on' -device 'scsi-hd,bus=virtioscsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' -netdev 
'type=tap,id=net0,ifname=tap100i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=4A:14:35:BD:2D:FF,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=1024,bootindex=102' -machine 'accel=tcg,type=pc+pve0'' failed: got timeout
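
The start command above ends with accel=tcg, which matches KVM being switched off for the VM. A minimal sketch, assuming a Linux node with /proc/cpuinfo, for checking whether a nested Proxmox node sees hardware virtualization at all. The helper names count_virt_flags and has_hw_virt are made up for illustration and are not Proxmox tools.

```shell
#!/bin/sh
# Sketch: does this (nested) Proxmox node see hardware virtualization?
# count_virt_flags and has_hw_virt are hypothetical helper names.

# Print the number of cpuinfo lines carrying the Intel (vmx) or AMD (svm) flag.
# $1: cpuinfo-style file to inspect (defaults to /proc/cpuinfo).
count_virt_flags() {
    # grep -c exits non-zero when it counts 0 matches, so mask the exit status.
    grep -E -c -w 'vmx|svm' "${1:-/proc/cpuinfo}" || true
}

# Print "yes" if at least one matching line was found, otherwise "no".
has_hw_virt() {
    if [ "$(count_virt_flags "$1")" -gt 0 ]; then
        echo yes
    else
        echo no
    fi
}
```

Running has_hw_virt without arguments on each node inspects /proc/cpuinfo. The current setting of the VM itself can be read with qm config 100, which lists a kvm line when the option has been changed from its default.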

And when the VM is on the new node and I try to start it, I get this error:

Code:
VM 100 qmp command 'set_password' failed - unable to connect to VM 100 qmp socket - timeout after 51 retries
TASK ERROR: Failed to run vncproxy.

So to run it again, I need to go to Datacenter > HA, click on the VM, and disable it.
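
The same step can probably be done from a node's shell. A minimal sketch, assuming that ha-manager set with --state disabled is the CLI counterpart of the GUI action described above. The helpers run and ha_disable and the DRY_RUN switch are made up for illustration; by default the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch: CLI counterpart of disabling the HA resource in the GUI.
# run and ha_disable are hypothetical helper names; DRY_RUN is an assumed switch.

# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Request the "disabled" state for an HA-managed VM, then show the HA status.
# $1: VM ID, for example 100.
ha_disable() {
    run ha-manager set "vm:$1" --state disabled
    run ha-manager status
}
```

Calling ha_disable 100 prints the two commands; DRY_RUN=0 ha_disable 100 on a cluster node would actually run them.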

adding additional info:

ha-manager status
Code:
root@pve1:~# ha-manager status

quorum OK
master pve3 (active, Thu Jul 27 22:56:35 2023)
lrm pve1 (active, Thu Jul 27 22:56:34 2023)
lrm pve2 (idle, Thu Jul 27 22:56:34 2023)
lrm pve3 (idle, Thu Jul 27 22:56:35 2023)
service vm:100 (pve1, disabled)

ceph df
Code:
root@pve1:~# ceph df
--- RAW STORAGE ---
CLASS    SIZE   AVAIL    USED  RAW USED  %RAW USED
hdd    60 GiB  47 GiB  12 GiB    12 GiB      20.82
TOTAL  60 GiB  47 GiB  12 GiB    12 GiB      20.82
 
--- POOLS ---
POOL       ID  PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
.mgr        1    1  449 KiB        2  1.3 MiB      0     15 GiB
pvepool01   2   32  4.1 GiB    1.24k   12 GiB  21.66     15 GiB

pveceph lspools
Code:
root@pve1:~# pveceph lspools
┌───────────┬──────┬──────────┬────────┬─────────────┬────────────────┬───────────────────┬──────────────────────────┬───────────────────────────┬─────────────────┬─────────────────────┬─────────────┐
│ Name      │ Size │ Min Size │ PG Num │ min. PG Num │ Optimal PG Num │ PG Autoscale Mode │ PG Autoscale Target Size │ PG Autoscale Target Ratio │ Crush Rule Name │              %-Used │        Used │
╞═══════════╪══════╪══════════╪════════╪═════════════╪════════════════╪═══════════════════╪══════════════════════════╪═══════════════════════════╪═════════════════╪═════════════════════╪═════════════╡
│ .mgr      │    3 │        2 │      1 │           1 │              1 │ on                │                          │                           │ replicated_rule │ 2.9181532227085e-05 │     1388544 │
├───────────┼──────┼──────────┼────────┼─────────────┼────────────────┼───────────────────┼──────────────────────────┼───────────────────────────┼─────────────────┼─────────────────────┼─────────────┤
│ pvepool01 │    3 │        2 │     32 │             │             32 │ on                │                          │                           │ replicated_rule │   0.216631069779396 │ 13158102408 │
└───────────┴──────┴──────────┴────────┴─────────────┴────────────────┴───────────────────┴──────────────────────────┴───────────────────────────┴─────────────────┴─────────────────────┴─────────────┘
 
additional logs:

pvecm status
Code:
root@pve1:~# pvecm status
Cluster information
-------------------
Name:             pve-cluster
Config Version:   3
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Thu Jul 27 22:56:45 2023
Quorum provider:  corosync_votequorum
Nodes:            3
Node ID:          0x00000001
Ring ID:          1.141
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2 
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 192.168.1.51 (local)
0x00000002          1 192.168.1.52
0x00000003          1 192.168.1.53

ceph osd dump

Code:
root@pve1:~# ceph osd dump
epoch 607
fsid 3a90411b-86b5-4e01-996c-ff9db3e696fd
created 2023-07-25T12:46:52.007948+0300
modified 2023-07-27T21:37:16.498726+0300
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 23
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client luminous
min_compat_client jewel
require_osd_release quincy
stretch_mode_enabled false
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 18 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr
pool 2 'pvepool01' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 470 lfor 0/467/465 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
max_osd 3
osd.0 up   in  weight 1 up_from 591 up_thru 606 down_at 587 last_clean_interval [558,585) [v2:192.168.100.10:6800/1171,v1:192.168.100.10:6801/1171] [v2:192.168.100.10:6802/1171,v1:192.168.100.10:6803/1171] exists,up 73a5818b-f8c1-4b7e-a8b4-5669b123a460
osd.1 up   in  weight 1 up_from 606 up_thru 606 down_at 600 last_clean_interval [596,604) [v2:192.168.100.11:6800/1181,v1:192.168.100.11:6801/1181] [v2:192.168.100.11:6802/1181,v1:192.168.100.11:6803/1181] exists,up 246bc409-98d0-4e95-9eb8-7cff466d4926
osd.2 up   in  weight 1 up_from 557 up_thru 606 down_at 532 last_clean_interval [519,524) [v2:192.168.100.12:6800/1145,v1:192.168.100.12:6801/1145] [v2:192.168.100.12:6802/1145,v1:192.168.100.12:6803/1145] exists,up 1ec533a0-9e90-49ec-be07-50c8768fdeca
blocklist 192.168.100.10:6809/993 expires 2023-07-28T18:08:09.718715+0300
blocklist 192.168.100.10:6808/993 expires 2023-07-28T18:08:09.718715+0300
blocklist 192.168.100.10:0/3734805876 expires 2023-07-28T18:08:09.718715+0300
blocklist 192.168.100.10:6809/1007 expires 2023-07-28T15:05:00.442177+0300
blocklist 192.168.100.10:6809/1004 expires 2023-07-28T14:52:33.243636+0300
blocklist 192.168.100.10:6809/33941 expires 2023-07-28T18:00:40.469004+0300
blocklist 192.168.100.10:6808/13713 expires 2023-07-28T14:37:41.810738+0300
blocklist 192.168.100.10:0/3094598365 expires 2023-07-28T15:46:14.846514+0300
blocklist 192.168.100.10:0/2179849135 expires 2023-07-28T15:05:15.452510+0300
blocklist 192.168.100.10:0/3520911704 expires 2023-07-28T14:23:35.917686+0300
blocklist 192.168.100.10:0/2565806435 expires 2023-07-28T14:23:35.917686+0300
blocklist 192.168.100.10:0/2329186381 expires 2023-07-28T15:05:00.442177+0300
blocklist 192.168.100.10:6809/1977 expires 2023-07-28T15:46:12.872485+0300
blocklist 192.168.100.10:6809/1002 expires 2023-07-28T15:04:58.124607+0300
blocklist 192.168.100.10:0/1086286658 expires 2023-07-28T15:05:00.442177+0300
blocklist 192.168.100.10:0/2845369698 expires 2023-07-28T13:38:09.264442+0300
blocklist 192.168.100.10:6809/13713 expires 2023-07-28T14:37:41.810738+0300
blocklist 192.168.100.10:0/3901187381 expires 2023-07-28T15:05:15.452510+0300
blocklist 192.168.100.10:0/1005308210 expires 2023-07-28T13:38:09.264442+0300
blocklist 192.168.100.10:0/3608177274 expires 2023-07-28T14:37:41.810738+0300
blocklist 192.168.100.10:6809/1817 expires 2023-07-28T17:57:40.402028+0300
blocklist 192.168.100.10:0/107191107 expires 2023-07-28T14:52:33.243636+0300
blocklist 192.168.100.10:6808/1803 expires 2023-07-28T13:38:09.264442+0300
blocklist 192.168.100.10:0/3821586579 expires 2023-07-28T15:46:12.872485+0300
blocklist 192.168.100.10:0/3452072671 expires 2023-07-28T15:46:12.872485+0300
blocklist 192.168.100.10:0/2040624878 expires 2023-07-28T14:37:41.810738+0300
blocklist 192.168.100.10:0/2827715855 expires 2023-07-28T17:57:40.402028+0300
blocklist 192.168.100.10:0/2542647554 expires 2023-07-28T14:23:35.917686+0300
blocklist 192.168.100.10:0/3052927007 expires 2023-07-28T15:04:58.124607+0300
blocklist 192.168.100.10:0/928675757 expires 2023-07-28T14:52:33.243636+0300
blocklist 192.168.100.10:0/4146399762 expires 2023-07-28T15:04:58.124607+0300
blocklist 192.168.100.10:6808/1977 expires 2023-07-28T15:46:12.872485+0300
blocklist 192.168.100.10:0/1774882260 expires 2023-07-28T18:08:09.718715+0300
blocklist 192.168.100.10:0/541245970 expires 2023-07-28T17:57:40.402028+0300
blocklist 192.168.100.10:6809/1811 expires 2023-07-28T15:05:15.452510+0300
blocklist 192.168.100.10:6809/1803 expires 2023-07-28T13:38:09.264442+0300
blocklist 192.168.100.10:6809/1011 expires 2023-07-28T14:23:35.917686+0300
blocklist 192.168.100.10:6808/1811 expires 2023-07-28T15:05:15.452510+0300
blocklist 192.168.100.10:6808/1817 expires 2023-07-28T17:57:40.402028+0300
blocklist 192.168.100.10:0/3407197941 expires 2023-07-28T17:57:40.402028+0300
blocklist 192.168.100.10:0/341049887 expires 2023-07-28T13:38:09.264442+0300
blocklist 192.168.100.10:0/4215985301 expires 2023-07-28T15:04:58.124607+0300
blocklist 192.168.100.10:0/38662766 expires 2023-07-28T15:05:15.452510+0300
blocklist 192.168.100.10:6808/1002 expires 2023-07-28T15:04:58.124607+0300
blocklist 192.168.100.10:0/2195675563 expires 2023-07-28T15:04:58.124607+0300
blocklist 192.168.100.10:0/748188489 expires 2023-07-28T18:00:40.469004+0300
blocklist 192.168.100.10:0/3340765638 expires 2023-07-28T15:46:12.872485+0300
blocklist 192.168.100.10:0/1930373304 expires 2023-07-28T14:23:35.917686+0300
blocklist 192.168.100.10:0/2007179513 expires 2023-07-28T15:05:15.452510+0300
blocklist 192.168.100.10:0/925927048 expires 2023-07-28T14:52:33.243636+0300
blocklist 192.168.100.10:0/4264418263 expires 2023-07-28T17:57:40.402028+0300
blocklist 192.168.100.10:6808/1004 expires 2023-07-28T14:52:33.243636+0300
blocklist 192.168.100.10:6808/1007 expires 2023-07-28T15:05:00.442177+0300
blocklist 192.168.100.10:6808/1011 expires 2023-07-28T14:23:35.917686+0300
blocklist 192.168.100.10:0/3018064986 expires 2023-07-28T18:08:09.718715+0300
blocklist 192.168.100.10:0/1992187892 expires 2023-07-28T15:46:12.872485+0300
blocklist 192.168.100.10:0/2971596168 expires 2023-07-28T14:52:33.243636+0300
blocklist 192.168.100.10:0/2633304891 expires 2023-07-28T18:00:40.469004+0300
blocklist 192.168.100.10:0/3985593172 expires 2023-07-28T18:00:40.469004+0300
blocklist 192.168.100.10:6808/33941 expires 2023-07-28T18:00:40.469004+0300
blocklist 192.168.100.10:0/3663724728 expires 2023-07-28T18:00:40.469004+0300
blocklist 192.168.100.10:0/2612267808 expires 2023-07-28T13:38:09.264442+0300
blocklist 192.168.100.10:0/3403540251 expires 2023-07-28T18:08:09.718715+0300
 
Another thing: when the VM is migrating to the other node
and I try to open the console, I get the error below.
Please help me.

Code:
()
VM 100 qmp command 'set_password' failed - unable to connect to VM 100 qmp socket - timeout after 51 retries
TASK ERROR: Failed to run vncproxy.