Hi,
Today I ran into a problem that took a lot of manual hackery to recover from. Hopefully you can fix it quickly and get the fix into the wild.
High-level steps to reproduce:
1. create node1
2. create several containers and VMs on it
3. create node2
4. join the two nodes into a cluster
5. fail the containers and VMs over to node2
6. restart node2
Result: the config filesystem will not mount on node2. Running pmxcfs manually shows it processing a .conf file that was migrated from the older node and reporting that the parent is not a directory.
The problem appears to be that, when recreating the config filesystem, pmxcfs performs an ordered walk of the tree by inode. The inodes for nodes/ node2/ pve.. qemu.. etc. are all higher in sequence than the inodes for container1, container2, etc., so the directories have not yet been created when the .conf files that live in them are processed. The hack is to manually find some free inodes in the sqlite db and move your directories there, fixing up the parent references as you go. Then copy the db to the other node(s) manually before restarting everything again.
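To illustrate what I think is happening, here is a minimal Python sketch. It is not the pmxcfs code, and the entry layout (inode, parent inode, name, is-directory flag), the inode numbers, and the file names are all made up for illustration. It only shows why a rebuild in ascending inode order breaks when a file has a lower inode than its parent directory, and how placing parents first would avoid that.

```python
# Hypothetical entries: (inode, parent_inode, name, is_dir). Inode 0 is the root.
# The migrated .conf file has a LOWER inode than the directories above it.
ENTRIES = [
    (2, 5, "101.conf", False),
    (3, 0, "nodes", True),
    (4, 3, "node2", True),
    (5, 4, "qemu-server", True),
]


def rebuild_by_inode(entries):
    """Naive rebuild: visit entries strictly in ascending inode order."""
    dirs = {0}  # inodes known to be directories so far
    for inode, parent, name, is_dir in sorted(entries):
        if parent not in dirs:
            raise NotADirectoryError(
                f"{name}: parent inode {parent} is not a directory"
            )
        if is_dir:
            dirs.add(inode)
    return dirs


def rebuild_parents_first(entries):
    """Alternative: place an entry only once its parent directory exists."""
    dirs = {0}
    order = []
    pending = sorted(entries)
    while pending:
        ready = [e for e in pending if e[1] in dirs]
        if not ready:
            raise ValueError("orphaned entries: parent never appears")
        for inode, parent, name, is_dir in ready:
            order.append(name)
            if is_dir:
                dirs.add(inode)
        pending = [e for e in pending if e not in ready]
    return order
```

With these made-up entries, rebuild_by_inode(ENTRIES) raises NotADirectoryError on 101.conf, which matches the "parent is not a directory" symptom, while rebuild_parents_first(ENTRIES) places nodes, node2 and qemu-server before 101.conf.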
# pveversion -v
pve-manager: 2.0-18 (pve-manager/2.0/16283a5a)
running kernel: 2.6.32-6-pve
proxmox-ve-2.6.32: 2.0-55
pve-kernel-2.6.32-6-pve: 2.6.32-55
lvm2: 2.02.88-2pve1
clvm: 2.02.88-2pve1
corosync-pve: 1.4.1-1
openais-pve: 1.1.4-1
libqb: 0.6.0-1
redhat-cluster-pve: 3.1.8-3
pve-cluster: 1.0-17
qemu-server: 2.0-13
pve-firmware: 1.0-14
libpve-common-perl: 1.0-11
libpve-access-control: 1.0-5
libpve-storage-perl: 2.0-9
vncterm: 1.0-2
vzctl: 3.0.29-3pve8
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.0-1
ksm-control-daemon: 1.1-1
Also, I recommend adding the cluster IPs/hostnames to the hosts file on every cluster member, in case you're running DNS on a VM that is itself hosted on the cluster.
Thanks