CT Migration

Antonino89

Hi guys,

I'm trying to migrate a CT from Server3 to Server2. HA is configured correctly.

I get this error:

task started by HA resource agent
2017-08-18 10:49:23 # /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=Server2' root@192.168.100.12 /bin/true
2017-08-18 10:49:23 Host key verification failed.
2017-08-18 10:49:23 ERROR: migration aborted (duration 00:00:02): Can't connect to destination address using public key
TASK ERROR: migration aborted


Any suggestions?

Thanks
 
Okay,

Solved. I opened an SSH connection between the servers manually and accepted the host key.
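
A minimal sketch of that check, reusing the ssh options shown in the migration log above. The function name check_key_login is hypothetical; it only wraps the command the migration task itself ran. With BatchMode=yes ssh never prompts, so an unknown host key fails immediately instead of asking for confirmation.

```shell
#!/bin/sh
# Hypothetical helper: repeat the non-interactive login test from the migration log.
# $1 = node name used as HostKeyAlias, $2 = IP address of the target node.
check_key_login() {
    node=$1
    addr=$2
    if ssh -o BatchMode=yes -o "HostKeyAlias=$node" "root@$addr" /bin/true; then
        return 0
    fi
    # Diagnostics go to stderr so stdout stays clean for callers.
    echo "key-based login to $node ($addr) failed; check known_hosts and authorized_keys" >&2
    return 1
}

# Example call (values taken from the log above):
#   check_key_login Server2 192.168.100.12
```

If the check fails, running the same ssh command once without BatchMode=yes lets you inspect and accept the host key interactively. On a Proxmox VE cluster, pvecm updatecerts may also refresh the cluster-wide known-hosts entries, but treat that as an assumption to verify on your version.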

But something still goes wrong:

Task viewer: HA 101 - Migrate
Requesting HA migration for CT 101 to node Server3
service 'ct:101' in error state, must be disabled and fixed first
TASK ERROR: command 'ha-manager migrate ct:101 Server3' failed: exit code 255
 

Hmm, yes: after running ha-manager set ct:101 --state disabled, I was able to migrate the CT back from Server2 to Server3, which is where it was created.
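
The recovery step described here can be sketched as a small dry-run wrapper. The ha-manager subcommands are the one quoted in this thread plus ha-manager status; the function names and the DRY_RUN switch are hypothetical, and re-enabling with --state started afterwards is an assumption about the desired end state.

```shell
#!/bin/sh
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Take an HA resource out of the error state by disabling it, then show the HA status.
ha_clear_error() {
    sid=$1
    run ha-manager set "$sid" --state disabled
    run ha-manager status
}

# Hand the resource back to HA once the underlying problem is fixed (assumed end state).
ha_reenable() {
    run ha-manager set "$1" --state started
}

# Example: ha_clear_error ct:101   (add DRY_RUN=0 to actually execute)
```

Keeping the default as a dry run makes it easy to review what would be sent to the cluster before touching a resource that is already in an error state.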

If I try again to migrate from Server3 to Server2, I get this:

Task viewer: CT 101 - Start

Job for lxc@101.service failed because the control process exited with error code.
See "systemctl status lxc@101.service" and "journalctl -xe" for details.
TASK ERROR: command 'systemctl start lxc@101' failed: exit code 1

Is something wrong with the configuration on Server2?
 
What does it say?
Also see https://pve.proxmox.com/wiki/Linux_Container,
especially the chapter "Obtaining Debugging Logs".


root@Server2:~# systemctl status lxc@101.service
lxc@101.service - LXC Container: 101
Loaded: loaded (/lib/systemd/system/lxc@.service; disabled; vendor preset: enabled)
Drop-In: /usr/lib/systemd/system/lxc@.service.d
└─pve-reboot.conf
Active: failed (Result: exit-code) since Fri 2017-08-18 15:26:36 CEST; 6min ago
Docs: man:lxc-start
man:lxc
Process: 27259 ExecStart=/usr/bin/lxc-start -n 101 (code=exited, status=1/FAILURE)

Aug 18 15:26:30 Server2 systemd[1]: Starting LXC Container: 101...
Aug 18 15:26:36 Server2 lxc-start[27259]: lxc-start: tools/lxc_start.c: main: 366 The container failed to start.
Aug 18 15:26:36 Server2 lxc-start[27259]: lxc-start: tools/lxc_start.c: main: 368 To get more details, run the container in foreground mod
Aug 18 15:26:36 Server2 lxc-start[27259]: lxc-start: tools/lxc_start.c: main: 370 Additional information can be obtained by setting the --
Aug 18 15:26:36 Server2 systemd[1]: lxc@101.service: Control process exited, code=exited status=1
Aug 18 15:26:36 Server2 systemd[1]: Failed to start LXC Container: 101.
Aug 18 15:26:36 Server2 systemd[1]: lxc@101.service: Unit entered failed state.
Aug 18 15:26:36 Server2 systemd[1]: lxc@101.service: Failed with result 'exit-code'.


From the page you linked:

lxc-start -n 101 -F -l DEBUG -o /tmp/lxc-101.log

can't activate LV '/dev/lvm1/vm-101-disk-1': Failed to find logical volume "lvm1/vm-101-disk-1"
lxc-start: conf.c: run_buffer: 464 Script exited with status 5.
lxc-start: start.c: lxc_init: 450 Failed to run lxc.hook.pre-start for container "101".
lxc-start: start.c: __lxc_start: 1337 Failed to initialize container "101".
lxc-start: tools/lxc_start.c: main: 366 The container failed to start.
lxc-start: tools/lxc_start.c: main: 370 Additional information can be obtained by setting the --logfile and --logpriority options.

Problems with LVM? Again?
 
can't activate LV '/dev/lvm1/vm-101-disk-1': Failed to find logical volume "lvm1/vm-101-disk-1"
Was lvm1/vm-101-disk-1 renamed/moved/deleted?
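
One way to answer that question is to list the logical volumes of the volume group on each node and look for the disk. The sketch below assumes the volume group is named lvm1 as in the error message; the function name lv_exists is hypothetical, while lvs with --noheadings and -o lv_name is the standard LVM reporting command.

```shell
#!/bin/sh
# Return 0 if logical volume $2 exists in volume group $1 on this node, 1 otherwise.
lv_exists() {
    vg=$1
    lv=$2
    # lvs prints one LV name per line (indented); awk compares the first field exactly.
    lvs --noheadings -o lv_name "$vg" 2>/dev/null |
        awk -v lv="$lv" '$1 == lv { found = 1 } END { exit !found }'
}

# Example, to be run on both the source and the target node:
#   lv_exists lvm1 vm-101-disk-1 && echo present || echo missing
```

If the volume is present on the source node but missing on the target, the disk was never copied or shared, which matches the pre-start hook failure in the log.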
 
If the storage is marked as shared, the virtual disk image won't be moved during migration.
Marking a storage as shared on creation does not make Proxmox actively share it; Proxmox simply assumes it is already shared, as an NFS export would be. So if LVM1 is not actually available on all nodes, this may be the problem.
If LVM1 is not available to all nodes you can configure it as non-shared only from the command line via
Code:
# pvesm set LVM1 -shared 1
Afterwards the migration will automatically migrate the virtual disk along with the container and it should run on the new node.
 


I ran the command on all servers, but the problem still exists:

2017-08-20 11:11:30 shutdown CT 101
2017-08-20 11:11:30 # lxc-stop -n 101 --timeout 180
2017-08-20 11:11:32 # lxc-wait -n 101 -t 5 -s STOPPED
2017-08-20 11:11:34 starting migration of CT 101 to node 'Server2' (192.168.100.12)
2017-08-20 11:11:34 volume 'LVM1:vm-101-disk-1' is on shared storage 'LVM1'
2017-08-20 11:11:34 # /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=Server2' root@192.168.100.12 pvesr set-state 101 \''{}'\'
2017-08-20 11:11:43 start final cleanup
2017-08-20 11:11:44 start container on target node
2017-08-20 11:11:44 # /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=Server2' root@192.168.100.12 pct start 101
2017-08-20 11:11:50 command 'systemctl start lxc@101' failed: exit code 1
2017-08-20 11:11:50 Job for lxc@101.service failed because the control process exited with error code.
2017-08-20 11:11:50 See "systemctl status lxc@101.service" and "journalctl -xe" for details.
2017-08-20 11:11:51 ERROR: command '/usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=Server2' root@192.168.100.12 pct start 101' failed: exit code 255
2017-08-20 11:11:51 ERROR: migration finished with problems (duration 00:00:22)
TASK ERROR: migration problems

I noticed that in the Datacenter tab I can mark LVM1 as non-shared.
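
The line "is on shared storage" in the log above is the sign that the migration skipped copying the disk. A minimal sketch for pulling those lines out of a saved task log follows; the function name shared_volumes_in_log is hypothetical, and the pattern assumes the log wording shown in this thread.

```shell
#!/bin/sh
# Read a migration task log on stdin and print "<volume> <storage>" for every
# volume that the migration treated as living on shared storage.
shared_volumes_in_log() {
    sed -n "s/.*volume '\([^']*\)' is on shared storage '\([^']*\)'.*/\1 \2/p"
}

# Example (migrate.log is a hypothetical file holding the task output):
#   shared_volumes_in_log < migrate.log
```

Any volume listed this way must really be reachable from the target node, otherwise the container or VM will fail to start there.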
 
The same problem occurs if I try to migrate a VM.

The migration itself is OK, but the VM can't start:

TASK ERROR: can't activate LV '/dev/lvm0/vm-101-disk-1': Failed to find logical volume "lvm0/vm-101-disk-1"

Any suggestions?
 
2017-08-20 11:11:34 volume 'LVM1:vm-101-disk-1' is on shared storage 'LVM1'
Marking a storage as shared on creation does not result in Proxmox actively sharing the storage, but simply considering it as already shared.
Sorry, I just saw that I had a typo in the command. It should be
Code:
# pvesm set LVM1 -shared 0
of course. Or you can mark the storage as non-shared in the datacenter.
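
To confirm what the flag is currently set to, the storage definition can be read back. This is a rough sketch that assumes the usual storage.cfg layout (a "type: id" header line followed by indented "key value" options, normally in /etc/pve/storage.cfg); the function name storage_shared_flag is hypothetical.

```shell
#!/bin/sh
# Print "shared" or "non-shared" for storage ID $2 in the storage.cfg-style file $1.
# Exits with status 2 if the storage ID is not defined in the file.
storage_shared_flag() {
    cfg=$1
    id=$2
    awk -v id="$id" '
        # Section header, e.g. "lvm: LVM1": remember whether it is the wanted storage.
        /^[a-z]+: / { in_sec = ($2 == id); if (in_sec) found = 1; next }
        # Option line inside the wanted section.
        in_sec && $1 == "shared" { shared = $2 }
        END {
            if (!found) exit 2
            print ((shared + 0 == 1) ? "shared" : "non-shared")
        }
    ' "$cfg"
}

# Example: storage_shared_flag /etc/pve/storage.cfg LVM1
```

After setting the flag to 0, the function should report non-shared for LVM1, and a later migration log should no longer mention shared storage for that volume.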
 
