Guest VMs hang on kexec

uib

Renowned Member
May 3, 2016
Hi,
our company, uib, develops opsi.
opsi is a software and operating system deployment tool.
To deploy Linux operating systems we boot a Linux miniroot. opsi then prepares the hard disks and performs a kexec into the distribution's kernel.
For our tests we use Proxmox virtualization.
For Ubuntu, Red Hat and openSUSE this works fine.
Debian 8, UCS 4.1 and SLES11SP4 hang on the kexec command.
The guest syslog doesn't give any additional information.

The system simply hangs and does not jump into the desired kernel.

Proxmox Virtual Environment 4.2-2/725d76f0

root@node4:~# kvm --version
QEMU emulator version 2.5.1 pve-qemu-kvm_2.5-14, Copyright (c) 2003-2008 Fabrice Bellard

Are there any known bugs?

Cheers
Mathias
 
Hi,
what are the specifications of your miniroot Linux?
This could also be a problem with the distribution kernels you want to jump into. Maybe a stupid question, but does it work on real hardware with those kernels?

If it is really only problematic in VMs, I'd guess this is a shortcoming of KVM/QEMU.
A simple test to reproduce it would be really helpful. Would that boil down to kexec'ing a Debian 8 kernel from a Debian 8 guest?
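
A minimal sketch of what such a Debian-to-Debian reproduction might look like, assuming the kexec-tools package is installed in the guest and the kernel and initrd sit at the usual Debian paths under /boot (both are assumptions, not confirmed in this thread). The variable names and the KEXEC_RUN switch are made up for illustration; by default the script only prints the commands:

```shell
#!/bin/sh
# Sketch: kexec from a running Debian guest into its own installed kernel.
# Assumed: kexec-tools is installed, Debian-style paths under /boot.
KVER="${KVER:-$(uname -r)}"
KERNEL="${KERNEL:-/boot/vmlinuz-$KVER}"
INITRD="${INITRD:-/boot/initrd.img-$KVER}"

# Print the two kexec calls; execute them only when KEXEC_RUN=1 (needs root).
kexec_self() {
    echo "kexec -l $KERNEL --initrd=$INITRD --reuse-cmdline"
    echo "kexec -e"
    if [ "${KEXEC_RUN:-0}" = "1" ]; then
        # -l loads the new kernel, -e jumps into it without a clean shutdown
        kexec -l "$KERNEL" --initrd="$INITRD" --reuse-cmdline && sync && kexec -e
    fi
}

kexec_self
```

Running it with KEXEC_RUN=1 as root inside the VM performs the actual jump. If the guest hangs there as well, that would suggest the miniroot is not needed to trigger the problem.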
 
The miniroot is based on Ubuntu 14.04 and uses kernel 4.1.

It works on real hardware. It also works on vSphere virtualization (v5.5) and on a standard KVM/QEMU installation (QEMU emulator version 2.0.0 (Debian 2.0.0+dfsg-2ubuntu1.22), Copyright (c) 2003-2008 Fabrice Bellard).

I will try to kexec from Debian to Debian in a few days. If you need any more information, let me know.

Cheers
Mathias
 
Well, I've got some new information.

We have now switched to a xenial-based boot image.
This boot image is supposed to boot a xenial kernel, and that fails as well:
it stalls at 'executing new kernel'.
I cannot create a kernel dump, as the system runs from a RAM disk and is no longer accessible after the above message.

Kernel cmdline of the booted boot image:
miniroot.bz2 video=vesa:ywrap,mtrr vga=791 quiet splash --no-log console=tty1 console=ttyS0
Kernel cmdline of the kexec kernel:
vga=788 locale=de_DE keymap=de-latin1-nodeadkeys console-keymaps-at/keymap=de-latin1-nodeadkeys auto=true priority=critical DEBCONF_DEBUG=5
 
How to reproduce the issue:

Download this VM:
http://download.uib.de/opsi4.0/opsidemo4.0.6-3.zip
It is an Ubuntu 14.04 based VM.

Documentation for the VM installation:
http://download.uib.de/opsi4.0/doc/...tml#opsi-getting-started-installation-base-vm

Run through the first boot until you end up with a prepared opsi VM.
Update all packages using apt-get.
Download the debian8 package from here:
http://download.uib.de/opsi4.0/products/opsi-linux/
Add a Proxmox client to opsi and set it to install debian8.
Run the client and watch kexec hang.

If you have any further questions, feel free to ask.
 
Hi,

I am halfway through updating the VM. I am not sure whether I will get time to test it today; otherwise I will try on Monday, as I still have to finish reading your docs.

After that I have to set up the server which then provides the images (I'll do that on a jessie VM), and after that I should be ready?
Or do you mean something else by "proxmox client"?

PS: I got quite a few MySQL errors (from a Python script) during the update. Is that expected, or a problem?
 
> After that I have to set up the server which then provides the images (I'll do that on a jessie VM), and after that I should be ready?
> Or do you mean something else by "proxmox client"?

After you have set up the server, just download the debian8 or ubuntu16-04 packages and install them.
Afterwards you need to add a client through the configed. This client has to be a Proxmox VM.
Then set the client to 'setup' for one of the products named above and boot the client. The client should boot via PXE into our boot image and start our initialization mask.
After everything completes, it performs a kexec into the distribution installer.

> PS: I got quite a few MySQL errors (from a Python script) during the update. Is that expected, or a problem?

Usually not.
Feel free to post some messages, logs etc.

cheers
Mathias
 
OK, so I got it to boot the minimal image over iPXE, though I had some trouble with the getting started guide.
I had to put the server and client in a separate network, as I have no control over the local DHCP, so that I could use the DHCP from the opsi server (changing the IP of the server/interface was not that straightforward).
Anyway, I am stuck at the Samba step, as I have to configure a DNS which resolves the opsi server, so I'll continue with that tomorrow. Sorry, but I have rather limited time here at the moment... :)


I'll try a kexec on a minimal debian in a VM also, normally if it hangs there it should not be a problem going out from kvm...
 
If your client cannot resolve your opsi server, you can change the 'remoteDepotUrl'.

It can be changed in the depot settings through the configed.

Just change it from
smb://DNSNAME/opsi_depot
to
smb://IP/opsi_depot
 
OK, thanks, now I got that working.

After the "initramrd patching" I get another error, "Failed to get file info for '/mnt/opsi/opsi-linux-client-agent/files/opsi/cfg/config.ini'". The configed shows the following in the client's log:

Code:
(1627)    [6] [Jul 07 16:53:49] Copying from '/opsi-linux-client-agent/files/opsi/cfg/config.ini' to '/mnt/hd/tmp/' (Repository.py|522)
(1628)    [2] [Jul 07 16:53:50] Traceback: (Logger.py|765)
(1629)    [2] [Jul 07 16:53:50]      line 1461 in '<module>' in file '/usr/local/bin/master.py' (Logger.py|765)
(1630)    [2] [Jul 07 16:53:50]      line 412 in '<module>' in file '/tmp/debuntu.py' (Logger.py|765)
(1631)    [2] [Jul 07 16:53:50]      line 614 in 'copy' in file '/usr/lib/python2.7/dist-packages/OPSI/Util/Repository.py' (Logger.py|765)
(1632)    [2] [Jul 07 16:53:50]      ==>>> Repository error: Failed to get file info for '/mnt/opsi/opsi-linux-client-agent/files/opsi/cfg/config.ini': File not found (master.py|1509)

I searched a little, but the things I found did not seem directly related to my problem. What stands out is that
Code:
opsi-setup --init-current-config
shows an error that it cannot read the opsi modules file (I found a forum post that said that shouldn't matter, though...).
Could you give me another tip here? :)

Afterwards I tried to poke around in the miniroot a little. What do you roughly use for that, if I may ask? It's not busybox-based AFAICT, and `findmnt` gives me a segfault, which I don't quite like.
I'm just wondering because a Debian to Debian kexec works in general, so it could be something in the miniroot (in combination with other parts).
 
> After the "initramrd patching" I get another error, "Failed to get file info for '/mnt/opsi/opsi-linux-client-agent/files/opsi/cfg/config.ini'" [...]

Do you have the property 'install_linux_client_agent' set to true? If so, set it to false.
Alternatively, install this package
http://download.uib.de/opsi4.0/products/opsi-linux/opsi-linux-client-agent_4.0.6.3-20160225.opsi
with the opsi-package-manager; it should provide the missing config.ini file.

> What stands out is that
> opsi-setup --init-current-config
> shows an error that it cannot read the opsi modules file [...]

This is not a bug. The /etc/opsi/modules file stores information on whether the customer has bought any co-financed modules. The file enables those modules and blocks them if they haven't been officially bought.

> Afterwards I tried to poke around in the miniroot a little. What do you roughly use for that, if I may ask? It's not busybox-based AFAICT, and `findmnt` gives me a segfault, which I don't quite like.
> I'm just wondering because a Debian to Debian kexec works in general, so it could be something in the miniroot (in combination with other parts).

How did you run 'findmnt': in a chroot or on a Proxmox VM?
Running it in a VM gives me correct output.

What we basically do is the following: we partition the disk, load some data such as the initrd.gz and the Linux kernel, patch the initrd.gz and then kexec into that kernel.
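
The last steps described above (patch the initrd, then load and execute the kernel) could be sketched roughly as follows. This is not the actual opsi code: the function names, the /mnt/hd paths in the usage note and the overlay approach are assumptions for illustration, and partitioning is left out. The sketch relies on the kernel unpacking concatenated initramfs archives, so extra files can be appended without unpacking the original initrd:

```shell
#!/bin/sh
# Sketch only: names, paths and the overlay approach are assumptions,
# not the actual opsi implementation.

# Append the files under directory $2 to the gzip-compressed cpio initrd $1.
# The kernel unpacks concatenated initramfs archives in order, so the
# original archive does not have to be unpacked first.
patch_initrd() {
    initrd="$1"
    overlay_dir="$2"
    (cd "$overlay_dir" && find . | cpio -o -H newc 2>/dev/null | gzip -9) >> "$initrd"
}

# Print the kexec load command for the given kernel, initrd and cmdline.
kexec_load_cmd() {
    kernel="$1"
    initrd="$2"
    cmdline="$3"
    printf 'kexec -l %s --initrd=%s --append="%s"\n' "$kernel" "$initrd" "$cmdline"
}
```

For example, `kexec_load_cmd /mnt/hd/linux /mnt/hd/initrd.gz "auto=true priority=critical"` prints the load command for a hypothetical kernel and initrd on the target disk; running that command followed by `kexec -e` as root would perform the jump.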
 