pmxcfs

orion

Guest
pmxcfs[10426]: segfault at f70e30 ip 0000000000f70e30 sp 00007fff264e6ce8 error 15

Everything is running, except that clustat shows the other hosts as offline.
 
The issue seems to be in the updated code. I brought up a new box to create a new cluster, and I cannot even create one from scratch without a segfault.

@proxmox4:/etc/pve# pveversion --v
pve-manager: 3.0-23 (pve-manager/3.0/957f0862)
running kernel: 3.2.0-4-amd64
proxmox-ve-2.6.32: 3.0-107
pve-kernel-2.6.32-17-pve: 2.6.32-83
pve-kernel-2.6.32-22-pve: 2.6.32-107
pve-kernel-2.6.32-20-pve: 2.6.32-100
pve-kernel-2.6.32-19-pve: 2.6.32-96
lvm2: 2.02.95-pve3
clvm: 2.02.95-pve3
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-1
pve-cluster: 3.0-4
qemu-server: 3.0-20
pve-firmware: 1.0-23
libpve-common-perl: 3.0-4
libpve-access-control: 3.0-4
libpve-storage-perl: 3.0-8
vncterm: 1.1-4
vzctl: 4.0-1pve3
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-13
ksm-control-daemon: 1.1-1
 
Good catch, that did fix the issue. My question now is why it happened in the first place, when all I did was run apt-get dist-upgrade for updates.
 
Probably an update made the Debian kernel boot ahead of the PVE one.

I'd remove the Debian kernel. The pve-kernel should be all that is needed.
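
Before removing anything, it may help to confirm which kernel is actually running. In the pveversion output above, "running kernel" is 3.2.0-4-amd64, which is not one of the pve kernels listed. The following is only a rough sketch: it assumes PVE kernel releases end in "-pve" (as in 2.6.32-22-pve), and the function name is_pve_kernel is made up for illustration.

```shell
#!/bin/sh
# Returns 0 if the given kernel release string looks like a Proxmox kernel.
# Assumption: PVE kernel releases end in "-pve", e.g. 2.6.32-22-pve.
is_pve_kernel() {
    case "$1" in
        *-pve) return 0 ;;
        *)     return 1 ;;
    esac
}

# uname -r prints the release of the kernel that is currently running.
running="$(uname -r)"
if is_pve_kernel "$running"; then
    echo "OK: running PVE kernel $running"
else
    echo "WARNING: running non-PVE kernel $running"
fi
```

If this prints the warning, the box booted a non-PVE kernel, which matches the situation described in this thread.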

First, find which Debian kernel is installed:
Code:
aptitude search linux-image | grep ^i

Then test removing the kernel:
Code:
aptitude --simulate remove linux-image-2.6-amd64

Post the output here.

See 'man aptitude' for info on --simulate.
 
@proxmox1:~# aptitude --simulate remove linux-image-2.6-amd64
The following packages will be REMOVED:
linux-image-2.6-amd64
0 packages upgraded, 0 newly installed, 1 to remove and 0 not upgraded.
Need to get 0 B of archives. After unpacking 4,096 B will be freed.
Would download/install/remove packages.
 
One more thing: make sure you still have a PVE kernel installed by running this both before and after removing the Debian kernel:
Code:
grep pve /boot/grub/grub.cfg
It must return something like this:
Code:
menuentry 'Proxmox Virtual Environment GNU/Linux, with Linux 2.6.32-22-pve' --class proxmox --class gnu-linux --class gnu --class os {
        echo    'Loading Linux 2.6.32-22-pve ...'
        linux   /vmlinuz-2.6.32-22-pve root=/dev/mapper/pve-root ro  quiet
        initrd  /initrd.img-2.6.32-22-pve
menuentry 'Proxmox Virtual Environment GNU/Linux, with Linux 2.6.32-22-pve (recovery mode)' --class proxmox --class gnu-linux --class gnu --class os {
        echo    'Loading Linux 2.6.32-22-pve ...'
        linux   /vmlinuz-2.6.32-22-pve root=/dev/mapper/pve-root ro single 
        initrd  /initrd.img-2.6.32-22-pve

Be certain to back up your files before proceeding. Bugs, typos, mistakes, and bad advice can leave the system unusable after removing a kernel. Be prepared to reinstall the operating system. Do not send any hack attacks this way if your system will not reboot! Seriously, be careful ;)

After backing up your files:
Code:
aptitude remove linux-image-2.6-amd64

Run 'grep pve /boot/grub/grub.cfg' again. Only reboot if you still have a PVE kernel installed.
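
The grep check before rebooting could also be wrapped in a small guard. This is just a sketch, not an official Proxmox tool: the helper names count_pve_entries and safe_to_reboot are made up, and the pattern assumes the menuentry lines look like the ones shown earlier in this thread. It only confirms that a PVE entry exists, not that it is the default boot entry.

```shell
#!/bin/sh
# Counts GRUB "menuentry" lines that name a -pve kernel.
# The default path is the grub.cfg location used in this thread.
count_pve_entries() {
    cfg="${1:-/boot/grub/grub.cfg}"
    grep -c 'menuentry .*-pve' "$cfg" 2>/dev/null
}

# Hypothetical helper: succeeds only if at least one PVE entry exists.
# A missing or unreadable file counts as zero entries.
safe_to_reboot() {
    n="$(count_pve_entries "$1")"
    [ "${n:-0}" -gt 0 ]
}

if safe_to_reboot /boot/grub/grub.cfg; then
    echo "PVE kernel entry found in grub.cfg."
else
    echo "No PVE kernel entry found; do NOT reboot yet."
fi
```

Run it before and after the aptitude remove step; if it ever reports no entry, stop and fix that before rebooting.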
 
