PXE booting VMs not working.

jrosengren

Member
Aug 28, 2014
Somewhere along the way, PXE booting VMs on my Proxmox hosts stopped working. Here's what I'm running:

# pveversion -v
proxmox-ve-2.6.32: 3.2-132 (running kernel: 3.10.0-3-pve)
pve-manager: 3.2-4 (running version: 3.2-4/e24a91c1)
pve-kernel-2.6.32-29-pve: 2.6.32-126
pve-kernel-3.10.0-3-pve: 3.10.0-11
pve-kernel-2.6.32-31-pve: 2.6.32-132
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.5-1
pve-cluster: 3.0-12
qemu-server: 3.1-16
pve-firmware: 1.1-3
libpve-common-perl: 3.0-18
libpve-access-control: 3.0-11
libpve-storage-perl: 3.0-19
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-6
vzctl: 4.0-1pve5
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.7-8
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.2-1

I only recently moved to using the 3.10 kernel over the 2.6.32 kernel - I can't remember if I was ever able to successfully PXE boot a VM on 3.10 or not.

The behavior I'm seeing is that a VM will start booting, will begin to load vmlinuz and will hang halfway through the download. After a couple of minutes, the VM console will start spewing "can't find kernel vmlinux" messages.

Nothing in my support infrastructure has changed recently; I'm using a Fedora 20 host as a TFTP server, from which I've done countless successful PXE boots in the past. I did see an earlier forum thread about a QEMU 1.7 issue related to PXE booting, so I'm wondering if a recent QEMU or kernel update caused a regression.

Let me know if I can send any other helpful information along.

Thanks!
 
# cat 104.conf
balloon: 1024
boot: ndc
bootdisk: virtio0
cores: 1
ide2: none,media=cdrom
machine: pc-i440fx-1.4
memory: 2048
name: tester1
net0: e1000=02:78:8D:4D:E4:CF,bridge=vmbr0
ostype: l26
sockets: 1
vga: qxl
virtio0: proxvol:104/vm-104-disk-1.qcow2,format=qcow2,size=15G

The VM is currently configured to use an e1000 NIC, but it was originally using VirtIO. The behavior is the same for either virtual NIC.
 
Can you try removing this line?
machine: pc-i440fx-1.4

(It forces QEMU 1.4 compatibility.)
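For anyone following along, here is a rough sketch of how the machine line could be dropped. The helper names are made up for illustration, and the VM ID 104 is taken from the config posted above. The first helper only filters text from stdin, the second only prints the qm command, so nothing is changed until you run the printed command yourself.

```shell
# Preview a VM config with the "machine:" line filtered out.
# Reads the config on stdin and writes to stdout; it changes nothing on disk.
without_machine() {
    grep -v '^machine:'
}

# Print (not run) the qm command that removes the machine option from a VM.
# $1 = VM ID (104 in the config posted above).
delete_machine_cmd() {
    echo "qm set $1 -delete machine"
}

# Example:
#   without_machine < /etc/pve/qemu-server/104.conf
#   delete_machine_cmd 104
```

The VM has to be stopped and started again for the change to take effect.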


also, can you try this on the host:

# iptables -A POSTROUTING -t mangle -p udp --dport bootpc -j CHECKSUM --checksum-fill

(I have seen DHCP problems like this before, but inside the guest, not at PXE boot.)
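If you want to test this rule without stacking up duplicates on repeated runs, something like the following sketch may help. The function names are hypothetical; it assumes an iptables version that supports the -C (check) action and must be run as root on the host.

```shell
# Rule match and target, kept in one place so check/add/delete stay in sync.
RULE="-p udp --dport bootpc -j CHECKSUM --checksum-fill"

# Print the full iptables command for a given action flag:
# -C (check), -A (append) or -D (delete).
rule_cmd() {
    echo "iptables -t mangle $1 POSTROUTING $RULE"
}

# Append the rule only if the check does not find it already (needs root).
apply_rule() {
    $(rule_cmd -C) 2>/dev/null || $(rule_cmd -A)
}

# Remove the rule again once the test is done (needs root).
remove_rule() {
    $(rule_cmd -D)
}
```

Call apply_rule before starting the VM and remove_rule afterwards if it makes no difference.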
 
Tried both of those suggestions and the problem still exists, unfortunately. The VM did appear to download a bit more of the PXE boot files than before - it got through the vmlinuz download and this time hung at the initrd.img file. I'm not sure if that had anything to do with the suggested changes, however.
 
As a test, I was able to PXE boot a bare metal server using the exact same PXE boot configuration without any issues. Any ideas?
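One thing that might narrow this down is a packet capture on the host bridge while the VM stalls. The sketch below only prints a tcpdump command; the helper name is made up, and the bridge and MAC address are the ones from the 104.conf posted above. It filters on the VM's MAC address rather than on port 69, because a TFTP server normally sends the data packets from a different port than the one the request went to.

```shell
# Print (not run) a tcpdump command that captures all traffic of one VM NIC.
# $1 = bridge on the Proxmox host, $2 = MAC address from the VM's net0 line.
pxe_capture_cmd() {
    echo "tcpdump -e -n -i $1 ether host $2"
}

# Example with the values from the config posted above:
#   pxe_capture_cmd vmbr0 02:78:8D:4D:E4:CF
```

Running the printed command as root on the host while the VM boots should show whether the TFTP data packets stop arriving or the VM stops acknowledging them.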
 
Still seeing this issue. I've applied all of the available updates:

# pveversion -v
proxmox-ve-2.6.32: 3.2-136 (running kernel: 3.10.0-3-pve)
pve-manager: 3.2-30 (running version: 3.2-30/1d095287)
pve-kernel-2.6.32-32-pve: 2.6.32-136
pve-kernel-3.10.0-3-pve: 3.10.0-11
pve-kernel-2.6.32-31-pve: 2.6.32-132
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-1
pve-cluster: 3.0-14
qemu-server: 3.1-34
pve-firmware: 1.1-3
libpve-common-perl: 3.0-19
libpve-access-control: 3.0-15
libpve-storage-perl: 3.0-22
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.1-5
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1

I've confirmed that my TFTP server is working correctly, as I was able to PXE boot a bare metal server without any issues. I've attached a screenshot of the behavior I'm seeing.

Screen Shot 2014-09-05 at 12.36.37 PM.png

I haven't yet done any A/B testing between the 3.x kernel and the 2.6 kernel. I may have a chance to spin up some test hardware for that this weekend.

Any ideas on what else I can try to troubleshoot/fix this issue?

Thanks much!!
 
Glad to hear somebody else is/was seeing the same issue. I tried changing my VM's disk from virtio to IDE, but that had no effect on the problem. PXE boot is still hanging on downloading the kernel image from the TFTP server.
 
A couple of other things I've tried with no effect on the issue:

1) Deleted the VM and recreated it. I removed the VM completely from the web console and recreated it.
2) Tried the e1000 and Realtek network devices - no change from virtio there, either.
 
Alright, I finally got a chance to revert one of my hosts back to the 2.6.32-32 kernel and that seems to have resolved the issue. PXE booting is now working again. Is there a bug tracker that I can file a bug in for the 3.x kernel?

Thanks!
 
Hello,
we also had this problem.
I would really recommend the 3.10 kernel for KVM guests.
To solve this problem, you need to boot the 3.10.0-1-pve kernel.
VMs are not able to boot via PXE on 3.10.0-2 or higher. I don't know why.

If you need any help, just contact me via PM or reply to this thread.
I can also help you via TeamViewer (it doesn't have to be related to this topic).
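To make it easier to compare notes, here is a small sketch that maps a kernel release string to what has been reported in this thread. The function name is made up, and the mapping reflects only the posts here (including a reply that reports a stall on 3.10.0-1-pve as well). It is not an official compatibility list.

```shell
# Classify a kernel release string using only the reports from this thread.
# $1 = kernel release, e.g. the output of `uname -r`.
pxe_report_for_kernel() {
    case "$1" in
        2.6.32-*)   echo "reported working" ;;
        3.10.0-1-*) echo "mixed reports" ;;
        3.10.0-*)   echo "reported broken" ;;
        *)          echo "no report" ;;
    esac
}

# Example on a Proxmox host:
#   pxe_report_for_kernel "$(uname -r)"
```

For the kernel from the first post, pxe_report_for_kernel 3.10.0-3-pve prints "reported broken".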
 
Can you install the ipxe-qemu .deb package?
 
I'm having a similar issue. I'm currently using 3.10.0-1-pve (from 3.10.0-7-pve). The same PXE settings and server work with non-Proxmox VMs and bare metal. PXE boot gets partway through, then stalls on the initrd.
 
Hey,
for me 3.10.0-1 is working fine with PXE booting VMs.
I haven't had a single issue with it.

Best regards
Henry
 
I did some more testing, and if I use the ipxe.iso from http://ipxe.org/ I can get the node PXE booting without issue. I think this might be something with the iPXE firmware or ROMs? I do have ipxe-qemu installed. Any ideas?
 
I really don't know where to go next. For me, PXE boot stalls during the initrd when using the 'network' boot option. As a workaround, I created a 'pxe' shared storage that holds the ipxe.iso, and I simply use that ISO in the CD-ROM drive to get PXE boot working. Hopefully this can be fixed and I won't have to continue with the kludge.
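In case it helps someone else, the workaround can be sketched roughly as below. The helper name is made up and it only prints the qm commands. It assumes the ISO was uploaded as ipxe.iso to a storage that allows ISO images, that the CD-ROM sits on ide2 as in the config posted earlier, and that the old single-letter boot order is in use (d = CD-ROM, c = disk, n = network).

```shell
# Print (not run) the qm commands for booting a VM from the iPXE ISO.
# $1 = VM ID, $2 = storage that holds ipxe.iso ('pxe' in my setup).
ipxe_iso_cmds() {
    vmid="$1"
    storage="$2"
    # Attach the ISO to the VM's CD-ROM drive on ide2.
    echo "qm set $vmid -ide2 $storage:iso/ipxe.iso,media=cdrom"
    # Try the CD-ROM first, then disk, then the built-in network boot.
    echo "qm set $vmid -boot dcn"
}

# Example:
#   ipxe_iso_cmds 104 pxe
```

With the ISO first in the boot order, the VM loads iPXE from the CD-ROM instead of the NIC's boot ROM and carries on with DHCP and TFTP from there.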
 
I have the same problem.

If I use the iptables command and a Linux bridge, I'm able to PXE boot the VM.
If I use Open vSwitch, I can't use the iptables command. It seems that the two don't play nicely together (as mentioned in the OVS FAQ). Disabling TX offload etc., as recommended on the OVS mailing list, doesn't help either.
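For reference, the kind of offload change meant here looks roughly like the sketch below. The helper name is made up and it only prints the ethtool command. The interface name in the example is an assumption: tap104i0 follows the tap<vmid>i<index> pattern Proxmox uses for VM tap devices, so substitute whatever your host actually shows.

```shell
# Print (not run) the ethtool command that turns off TX checksum offload.
# $1 = interface name on the host (a VM tap device or the physical NIC).
offload_off_cmd() {
    echo "ethtool -K $1 tx off"
}

# Example (hypothetical interface name):
#   offload_off_cmd tap104i0
```

The setting does not survive a VM restart, since the tap device is recreated each time.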

Older kernels work. But then again I'm not able to start OVS, so I cannot reproduce that exact behaviour.

Anyone got this working with OVS?
 
Having the same issue since running a kernel version newer than 3.10.0-1. Using the iPXE ISO is an acceptable workaround, but there's definitely something broken in Proxmox.
 
