Migration - resource busy

Chris Rivera

Guest
Mar 26 08:35:40 starting migration of CT 454 to node 'proxmox9a'
Mar 26 08:35:40 starting rsync phase 1
Mar 26 08:35:40 # /usr/bin/rsync -aHAX --delete --numeric-ids --sparse /var/lib/vz/private/454 root@:/var/lib/vz/private
Mar 26 08:36:23 dump 2nd level quota
Mar 26 08:36:23 copy 2nd level quota to target node
Mar 26 08:36:24 ERROR: Failed to move config to node 'proxmox9a' - rename failed: Device or resource busy
Mar 26 08:36:24 aborting phase 1 - cleanup resources
Mar 26 08:36:24 removing copied files on target node
Mar 26 08:36:24 start final cleanup
Mar 26 08:36:24 ERROR: migration aborted (duration 00:00:44): Failed to move config to node 'proxmox9a' - rename failed: Device or resource busy
TASK ERROR: migration aborted


This was working, but today it has stopped working completely.

I have restarted pve-cluster on all nodes; same issue.

If I do a pve-cluster stop and then a pve-cluster start, I get:


root@proxmox2:~# service pve-cluster start
Starting pve cluster filesystem : pve-cluster unable to open file '/etc/pve/priv/authorized_keys.tmp.293404' - Device or resource busy.

root@proxmox2:~# service pve-cluster start
Starting pve cluster filesystem : pve-cluster apparently already running.


Not sure if I need to reboot the node I'm migrating from, the node I'm migrating to, or the master node.



root@Proxmox10:~# pvecm status
Version: 6.2.0
Config Version: 53
Cluster Name: FL-Cluster
Cluster Id: 6836
Cluster Member: Yes
Cluster Generation: 5128452
Membership state: Cluster-Member
Nodes: 10
Expected votes: 10
Total votes: 10
Node votes: 1
Quorum: 6
Active subsystems: 1
Flags:
Ports Bound: 0
Node name: Proxmox10
Node ID: 10
 
root@proxmox2:~# clustat
Cluster Status for FL-Cluster @ Tue Mar 26 08:48:06 2013
Member Status: Quorate


Member Name ID Status
------ ---- ---- ------
proxmox11 1 Online
proxmox2 2 Online, Local
proxmox3a 3 Online
proxmox4 4 Online
poxmox5 5 Online
proxmox6 6 Online
proxmox7 7 Online
proxmox8 8 Online
proxmox9a 9 Online
Proxmox10 10 Online
 
None that I can see.

I'll give you the syslog from both nodes, and you tell me if you see anything.


Node 10: (HDD is corrupt and all VMs need to be migrated ASAP)

Mar 26 08:34:57 Proxmox10 pmxcfs[43653]: [dcdb] notice: members: 1/627564, 2/103606, 3/156625, 4/801004, 5/855040, 6/673130, 6/789766, 7/742308, 8/220844, 9/3140
Mar 26 08:34:57 Proxmox10 pmxcfs[43653]: [dcdb] notice: we (10/43653) left the process group
Mar 26 08:34:57 Proxmox10 pmxcfs[43653]: [dcdb] crit: leaving CPG group
Mar 26 08:35:07 Proxmox10 pmxcfs[43653]: [dcdb] notice: start cluster connection
Mar 26 08:35:07 Proxmox10 pmxcfs[43653]: [dcdb] crit: internal error - unknown mode 0
Mar 26 08:35:07 Proxmox10 pmxcfs[43653]: [dcdb] crit: leaving CPG group
Mar 26 08:35:17 Proxmox10 pmxcfs[43653]: [dcdb] notice: start cluster connection
Mar 26 08:35:17 Proxmox10 pmxcfs[43653]: [dcdb] crit: internal error - unknown mode 0
Mar 26 08:35:17 Proxmox10 pmxcfs[43653]: [dcdb] crit: leaving CPG group
Mar 26 08:35:27 Proxmox10 pmxcfs[43653]: [dcdb] notice: start cluster connection
Mar 26 08:35:27 Proxmox10 pmxcfs[43653]: [dcdb] notice: members: 1/627564, 2/103606, 3/156625, 4/801004, 5/855040, 6/673130, 6/789766, 7/742308, 8/220844, 9/3140, 10/43653
Mar 26 08:35:27 Proxmox10 pmxcfs[43653]: [dcdb] notice: starting data syncronisation
Mar 26 08:35:27 Proxmox10 pmxcfs[43653]: [dcdb] notice: received sync request (epoch 1/627564/0000001A)
Mar 26 08:35:27 Proxmox10 pmxcfs[43653]: [dcdb] notice: received sync request (epoch 1/627564/0000001B)
Mar 26 08:35:27 Proxmox10 pmxcfs[43653]: [dcdb] notice: received sync request (epoch 1/627564/0000001C)
Mar 26 08:35:27 Proxmox10 pmxcfs[43653]: [dcdb] notice: members: 1/627564, 2/103606, 3/156625, 4/801004, 5/855040, 6/673130, 6/789766, 7/742308, 8/220844, 9/3140
Mar 26 08:35:27 Proxmox10 pmxcfs[43653]: [dcdb] notice: we (10/43653) left the process group
Mar 26 08:35:27 Proxmox10 pmxcfs[43653]: [dcdb] crit: leaving CPG group
Mar 26 08:35:37 Proxmox10 pmxcfs[43653]: [dcdb] notice: start cluster connection
Mar 26 08:35:37 Proxmox10 pmxcfs[43653]: [dcdb] notice: members: 1/627564, 2/103606, 3/156625, 5/855040, 6/673130, 6/789766, 7/742308, 8/220844, 9/3140, 10/43653
Mar 26 08:35:37 Proxmox10 pmxcfs[43653]: [dcdb] notice: starting data syncronisation
Mar 26 08:35:37 Proxmox10 pmxcfs[43653]: [dcdb] notice: members: 1/627564, 2/103606, 3/156625, 4/801004, 5/855040, 6/673130, 6/789766, 7/742308, 8/220844, 9/3140, 10/43653
Mar 26 08:35:37 Proxmox10 pmxcfs[43653]: [dcdb] notice: received sync request (epoch 1/627564/00000020)
Mar 26 08:35:37 Proxmox10 pmxcfs[43653]: [dcdb] notice: received sync request (epoch 1/627564/00000021)
Mar 26 08:35:37 Proxmox10 pmxcfs[43653]: [dcdb] notice: received sync request (epoch 1/627564/00000022)
Mar 26 08:35:37 Proxmox10 pmxcfs[43653]: [dcdb] notice: members: 1/627564, 2/103606, 3/156625, 4/801004, 5/855040, 6/673130, 6/789766, 7/742308, 8/220844, 9/3140
Mar 26 08:35:37 Proxmox10 pmxcfs[43653]: [dcdb] notice: we (10/43653) left the process group
Mar 26 08:35:37 Proxmox10 pmxcfs[43653]: [dcdb] crit: leaving CPG group
Mar 26 08:35:40 Proxmox10 pvedaemon[43359]: <root@pam> starting task UPID:proxmox10:0000AC1F:006F0A63:5151961C:vzmigrate:454:root@pam:
Mar 26 08:35:47 Proxmox10 pmxcfs[43653]: [dcdb] notice: start cluster connection
Mar 26 08:35:47 Proxmox10 pmxcfs[43653]: [dcdb] crit: internal error - unknown mode 0
Mar 26 08:35:47 Proxmox10 pmxcfs[43653]: [dcdb] crit: leaving CPG group
Mar 26 08:35:57 Proxmox10 pmxcfs[43653]: [dcdb] notice: start cluster connection
Mar 26 08:35:57 Proxmox10 pmxcfs[43653]: [dcdb] notice: members: 1/627564, 2/103606, 3/156625, 4/801004, 5/855040, 6/673130, 6/789766, 7/742308, 8/220844, 9/3140, 10/43653
Mar 26 08:35:57 Proxmox10 pmxcfs[43653]: [dcdb] notice: starting data syncronisation
Mar 26 08:35:57 Proxmox10 pmxcfs[43653]: [dcdb] notice: received sync request (epoch 1/627564/0000002C)
Mar 26 08:35:57 Proxmox10 pmxcfs[43653]: [dcdb] notice: received sync request (epoch 1/627564/0000002D)
Mar 26 08:35:57 Proxmox10 pmxcfs[43653]: [dcdb] notice: received sync request (epoch 1/627564/0000002E)
Mar 26 08:35:57 Proxmox10 pmxcfs[43653]: [dcdb] notice: members: 1/627564, 2/103606, 3/156625, 4/801004, 5/855040, 6/673130, 6/789766, 7/742308, 8/220844, 9/3140
Mar 26 08:35:57 Proxmox10 pmxcfs[43653]: [dcdb] notice: we (10/43653) left the process group
Mar 26 08:35:57 Proxmox10 pmxcfs[43653]: [dcdb] crit: leaving CPG group
Mar 26 08:36:07 Proxmox10 pmxcfs[43653]: [dcdb] notice: start cluster connection
Mar 26 08:36:07 Proxmox10 pmxcfs[43653]: [dcdb] crit: internal error - unknown mode 0


Node 9a

Mar 26 08:45:46 proxmox9a pmxcfs[3140]: [dcdb] notice: leader is 1/663005
Mar 26 08:45:46 proxmox9a pmxcfs[3140]: [dcdb] notice: synced members: 1/663005, 3/363854, 5/403319, 6/109830, 7/742308, 8/220844, 9/3140
Mar 26 08:45:46 proxmox9a pmxcfs[3140]: [dcdb] notice: all data is up to date
Mar 26 08:45:47 proxmox9a pmxcfs[3140]: [dcdb] notice: members: 1/663005, 3/363854, 5/403319, 6/109830, 8/220844, 9/3140
Mar 26 08:45:47 proxmox9a pmxcfs[3140]: [dcdb] notice: starting data syncronisation
Mar 26 08:45:47 proxmox9a pmxcfs[3140]: [dcdb] notice: members: 1/663005, 2/316706, 3/363854, 4/812681, 5/403319, 6/109830, 8/220844, 9/3140, 10/44456
Mar 26 08:45:47 proxmox9a pmxcfs[3140]: [dcdb] notice: starting data syncronisation
Mar 26 08:45:47 proxmox9a pmxcfs[3140]: [dcdb] notice: received sync request (epoch 1/663005/00000034)
Mar 26 08:45:47 proxmox9a pmxcfs[3140]: [dcdb] notice: members: 1/663005, 3/363854, 5/403319, 6/109830, 8/220844, 9/3140
Mar 26 08:45:47 proxmox9a pmxcfs[3140]: [dcdb] notice: received sync request (epoch 1/663005/0000000E)
Mar 26 08:45:47 proxmox9a pmxcfs[3140]: [dcdb] notice: members: 1/663005, 2/316706, 3/363854, 4/812681, 5/403319, 6/109830, 8/220844, 9/3140, 10/44456
Mar 26 08:45:47 proxmox9a pmxcfs[3140]: [dcdb] notice: received sync request (epoch 1/663005/00000035)
Mar 26 08:45:47 proxmox9a pmxcfs[3140]: [dcdb] notice: received sync request (epoch 1/663005/0000000F)
Mar 26 08:45:47 proxmox9a pmxcfs[3140]: [dcdb] notice: received all states
Mar 26 08:45:47 proxmox9a pmxcfs[3140]: [dcdb] notice: leader is 1/663005
Mar 26 08:45:47 proxmox9a pmxcfs[3140]: [dcdb] notice: synced members: 1/663005, 3/363854, 5/403319, 6/109830, 8/220844, 9/3140
Mar 26 08:45:47 proxmox9a pmxcfs[3140]: [dcdb] notice: all data is up to date
Mar 26 08:45:47 proxmox9a pmxcfs[3140]: [dcdb] notice: received all states
Mar 26 08:45:47 proxmox9a pmxcfs[3140]: [dcdb] notice: all data is up to date
Mar 26 08:45:47 proxmox9a pmxcfs[3140]: [status] notice: dfsm_deliver_queue: queue length 25
Mar 26 08:45:48 proxmox9a pmxcfs[3140]: [dcdb] notice: members: 1/663005, 3/363854, 5/403319, 6/109830, 7/783852, 8/220844, 9/3140
Mar 26 08:45:48 proxmox9a pmxcfs[3140]: [dcdb] notice: starting data syncronisation
Mar 26 08:45:48 proxmox9a pmxcfs[3140]: [dcdb] notice: received sync request (epoch 1/663005/00000036)
Mar 26 08:45:48 proxmox9a pmxcfs[3140]: [dcdb] notice: members: 1/663005, 2/316706, 3/363854, 4/812681, 5/403319, 6/109830, 7/783852, 8/220844, 9/3140, 10/44456
Mar 26 08:45:48 proxmox9a pmxcfs[3140]: [dcdb] notice: starting data syncronisation
Mar 26 08:45:48 proxmox9a pmxcfs[3140]: [dcdb] notice: received sync request (epoch 1/663005/00000010)
Mar 26 08:45:48 proxmox9a pmxcfs[3140]: [dcdb] notice: received all states
Mar 26 08:45:48 proxmox9a pmxcfs[3140]: [dcdb] notice: leader is 1/663005
Mar 26 08:45:48 proxmox9a pmxcfs[3140]: [dcdb] notice: synced members: 1/663005, 3/363854, 5/403319, 6/109830, 7/783852, 8/220844, 9/3140
Mar 26 08:45:48 proxmox9a pmxcfs[3140]: [dcdb] notice: all data is up to date
Mar 26 08:45:48 proxmox9a pmxcfs[3140]: [dcdb] notice: received all states
Mar 26 08:45:48 proxmox9a pmxcfs[3140]: [dcdb] notice: all data is up to date
Mar 26 08:45:48 proxmox9a pmxcfs[3140]: [status] notice: dfsm_deliver_queue: queue length 2
Mar 26 08:45:48 proxmox9a pmxcfs[3140]: [status] notice: received log
Mar 26 08:45:48 proxmox9a pmxcfs[3140]: [main] notice: ignore duplicate
Mar 26 08:45:57 proxmox9a pmxcfs[3140]: [dcdb] notice: members: 1/663005, 2/316706, 3/363854, 5/403319, 6/109830, 7/783852, 8/220844, 9/3140
Mar 26 08:45:57 proxmox9a pmxcfs[3140]: [dcdb] notice: starting data syncronisation
Mar 26 08:45:57 proxmox9a pmxcfs[3140]: [dcdb] notice: members: 1/663005, 2/316706, 3/363854, 4/812681, 5/403319, 6/109830, 7/783852, 8/220844, 9/3140
Mar 26 08:45:57 proxmox9a pmxcfs[3140]: [dcdb] notice: members: 1/663005, 2/316706, 3/363854, 4/812681, 5/403319, 6/109830, 7/783852, 8/220844, 9/3140, 10/44456
Mar 26 08:45:57 proxmox9a pmxcfs[3140]: [dcdb] notice: received sync request (epoch 1/663005/00000037)
Mar 26 08:45:57 proxmox9a pmxcfs[3140]: [dcdb] notice: received sync request (epoch 1/663005/00000038)
Mar 26 08:45:57 proxmox9a pmxcfs[3140]: [dcdb] notice: received sync request (epoch 1/663005/00000039)
Mar 26 08:45:57 proxmox9a pmxcfs[3140]: [dcdb] notice: members: 1/663005, 2/316706, 3/363854, 4/812681, 5/403319, 6/109830, 7/783852, 8/220844, 9/3140



I thought the corruption on the HDD might be the problem, but migrating from other nodes does not work either.

I've had this problem before, though not frequently, and have posted about the "Device or resource busy" error, which leaves the whole cluster unable to provision or migrate. It specifically affects the pve mount (/etc/pve) that is mounted by the pve-cluster service.

What are some solutions, or things I can do to better troubleshoot this?
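
One rough way to put a number on the flapping visible in the Node 10 syslog above is to count the "leaving CPG group" messages per minute. This is only a sketch: count_cpg_leaves is a made-up helper name, and it assumes the standard syslog line layout shown in the excerpts (month, day, time as the first three fields).

```shell
#!/bin/sh
# Sketch: count pmxcfs "leaving CPG group" events per minute in a syslog file.
# count_cpg_leaves is an illustrative helper name, not a Proxmox tool.
count_cpg_leaves() {
    # -F matches the message as a fixed string, exactly as it appears in the log.
    grep -F 'crit: leaving CPG group' "$1" |
        # Keep month, day and HH:MM so that events group by minute.
        awk '{ print $1, $2, substr($3, 1, 5) }' |
        # The log is already in time order, so identical lines are adjacent.
        uniq -c
}

# Example (path is the usual Debian syslog location):
#   count_cpg_leaves /var/log/syslog
```

A steady count of several events per minute on one node, as in the Node 10 excerpt, points at that node repeatedly dropping out of the cluster filesystem's process group rather than at a one-off hiccup.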





 
[SOLVED]


I ended up using PuTTYCS to run the commands across all nodes:


service cman stop
service pve-cluster stop
service pve-cluster start
service cman start

This only worked when running the commands on all nodes at the same time. When I did this node by node, the problem was still present.
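
For anyone without PuTTYCS, the same "all nodes at once" idea can be sketched as a shell loop over SSH. This is not what was actually run here, just an approximation under assumptions: passwordless root SSH to every node, and the names restart_all, SEQ and RUN are invented for the example. By default it only prints what it would do; set RUN=1 to execute.

```shell
#!/bin/sh
# Sketch: send the cman/pve-cluster restart sequence to all nodes in parallel.
# restart_all, SEQ and RUN are illustrative names, not part of Proxmox.

# The same four commands as above, sent as one string per SSH session.
SEQ='service cman stop; service pve-cluster stop; service pve-cluster start; service cman start'

restart_all() {
    for node in "$@"; do
        if [ "${RUN:-0}" = "1" ]; then
            # -n keeps ssh from reading stdin; & starts every node at once.
            ssh -n "root@$node" "$SEQ" &
        else
            # Dry run: show the command instead of executing it.
            echo "ssh root@$node \"$SEQ\""
        fi
    done
    # Block until every background ssh session has finished.
    wait
}

# Run only when node names are given on the command line.
if [ "$#" -gt 0 ]; then
    restart_all "$@"
fi
```

Saved as, say, restart-cluster.sh, it would be called with the node names as arguments, for example `RUN=1 sh restart-cluster.sh proxmox2 proxmox9a Proxmox10`. The sessions start within moments of each other rather than at exactly the same instant, so whether that is close enough to reproduce the fix is untested.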
 
