pmxcfs

  • Thread starter: orion

pmxcfs[10426]: segfault at f70e30 ip 0000000000f70e30 sp 00007fff264e6ce8 error 15

Everything is running, except that clustat shows the other hosts as offline.
 
The issue seems to be in the updated code. I brought up a new box to build a new cluster, and I cannot even create one from scratch without the segfault.

@proxmox4:/etc/pve# pveversion --v
pve-manager: 3.0-23 (pve-manager/3.0/957f0862)
running kernel: 3.2.0-4-amd64
proxmox-ve-2.6.32: 3.0-107
pve-kernel-2.6.32-17-pve: 2.6.32-83
pve-kernel-2.6.32-22-pve: 2.6.32-107
pve-kernel-2.6.32-20-pve: 2.6.32-100
pve-kernel-2.6.32-19-pve: 2.6.32-96
lvm2: 2.02.95-pve3
clvm: 2.02.95-pve3
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-1
pve-cluster: 3.0-4
qemu-server: 3.0-20
pve-firmware: 1.0-23
libpve-common-perl: 3.0-4
libpve-access-control: 3.0-4
libpve-storage-perl: 3.0-8
vncterm: 1.1-4
vzctl: 4.0-1pve3
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-13
ksm-control-daemon: 1.1-1
 
Good catch, that did fix the issue. My question now is why it happened in the first place, when all I did was run apt-get dist-upgrade for updates.
 
Probably an update made it so the Debian kernel boots ahead of the pve one.

I'd remove the Debian kernel. The pve-kernel should be all that is needed.
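
A quick way to confirm which kernel a node is actually running (the pveversion output earlier in the thread shows 3.2.0-4-amd64, which is not a pve kernel). This is only a sketch: the function name is_pve_kernel is made up here, and it assumes pve kernel releases always end in "-pve", as the ones listed in this thread do.

```shell
#!/bin/sh
# Succeeds (exit 0) when the given kernel release ends in "-pve".
# With no argument it checks the running kernel as reported by uname -r.
# is_pve_kernel is a hypothetical helper name, not a Proxmox tool.
is_pve_kernel() {
    release="${1:-$(uname -r)}"
    case "$release" in
        *-pve) return 0 ;;
        *)     return 1 ;;
    esac
}

if is_pve_kernel; then
    echo "running a pve kernel: $(uname -r)"
else
    echo "NOT running a pve kernel: $(uname -r)"
fi
```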

First, find which Debian kernel is installed:
Code:
aptitude search linux-image | grep ^i

Then test removing the kernel:
Code:
aptitude --simulate  remove linux-image-2.6-amd64

Post the output.

See ' man aptitude ' for info on --simulate.
 
@proxmox1:~# aptitude --simulate remove linux-image-2.6-amd64
The following packages will be REMOVED:
linux-image-2.6-amd64
0 packages upgraded, 0 newly installed, 1 to remove and 0 not upgraded.
Need to get 0 B of archives. After unpacking 4,096 B will be freed.
Would download/install/remove packages.
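
If you would rather not eyeball the simulated output, something like the following could flag a pve package that is about to be removed. It is a rough sketch: removal_keeps_pve is a made-up name, and it assumes the output looks like the listing above (a "will be REMOVED:" header, package names, then the "N packages upgraded" summary line).

```shell
#!/bin/sh
# Reads aptitude --simulate output on stdin.
# Exits 1 and prints the offending lines if any package listed under
# "The following packages will be REMOVED:" has "pve" in its name.
# removal_keeps_pve is a hypothetical helper name.
removal_keeps_pve() {
    awk '
        /^The following/          { in_list = ($0 ~ /will be REMOVED:/); next }
        /^[0-9]+ packages upgraded/ { in_list = 0 }
        in_list && /pve/          { bad = 1; print "would remove: " $0 }
        END                       { exit (bad ? 1 : 0) }
    '
}

# Example use on a real node:
#   aptitude --simulate remove linux-image-2.6-amd64 | removal_keeps_pve \
#       && echo "no pve packages affected"
```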
 
One more thing: make sure you still have a pve kernel installed by running this before and after removing the Debian kernel:
Code:
grep pve /boot/grub/grub.cfg
It must return something like this:
Code:
menuentry 'Proxmox Virtual Environment GNU/Linux, with Linux 2.6.32-22-pve' --class proxmox --class gnu-linux --class gnu --class os {
        echo    'Loading Linux 2.6.32-22-pve ...'
        linux   /vmlinuz-2.6.32-22-pve root=/dev/mapper/pve-root ro  quiet
        initrd  /initrd.img-2.6.32-22-pve
menuentry 'Proxmox Virtual Environment GNU/Linux, with Linux 2.6.32-22-pve (recovery mode)' --class proxmox --class gnu-linux --class gnu --class os {
        echo    'Loading Linux 2.6.32-22-pve ...'
        linux   /vmlinuz-2.6.32-22-pve root=/dev/mapper/pve-root ro single 
        initrd  /initrd.img-2.6.32-22-pve
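
The grep above prints every line that mentions pve, including root=/dev/mapper/pve-root, so a match alone does not prove a pve kernel is there. A narrower check, as a sketch, is to pull only the kernel image names from the "linux" lines. list_pve_kernels is a made-up name, and the code assumes the grub.cfg layout shown above.

```shell
#!/bin/sh
# Prints each distinct pve kernel image named on a "linux" line of a grub.cfg.
# The path argument defaults to /boot/grub/grub.cfg.
# list_pve_kernels is a hypothetical helper name.
list_pve_kernels() {
    cfg="${1:-/boot/grub/grub.cfg}"
    grep -E '^[[:space:]]*linux[[:space:]]' "$cfg" \
        | grep -o 'vmlinuz-[^[:space:]]*-pve' \
        | sort -u
}

# Example use on a real node:
#   [ -n "$(list_pve_kernels)" ] || echo "no pve kernel in grub.cfg, do not reboot"
```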

Be certain to back up your files before proceeding. There may be bugs, typos, mistakes and bad advice here that can leave the system unusable after you try to remove a kernel. Be prepared to reinstall the operating system. Do not send any hack attacks this way if your system will not reboot! Seriously, be careful ;)

After backing up your files:
Code:
aptitude remove linux-image-2.6-amd64

Run ' grep pve /boot/grub/grub.cfg ' again. Only reboot if you still have a pve kernel installed.
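
Since the original problem was the Debian kernel booting ahead of the pve one, it may also be worth checking which entry comes first in grub.cfg before rebooting. The sketch below assumes GRUB boots the first top-level menuentry (the usual GRUB_DEFAULT=0; check /etc/default/grub if unsure) and that titles are single-quoted as in the listing above. first_menuentry and default_is_pve are made-up names.

```shell
#!/bin/sh
# Prints the title of the first top-level menuentry in a grub.cfg.
# The path argument defaults to /boot/grub/grub.cfg.
# Both function names here are hypothetical helpers.
first_menuentry() {
    cfg="${1:-/boot/grub/grub.cfg}"
    sed -n "s/^menuentry '\([^']*\)'.*/\1/p" "$cfg" | head -n 1
}

# Succeeds when that first entry names a pve kernel.
default_is_pve() {
    first_menuentry "$1" | grep -q -- '-pve'
}

# Example use on a real node:
#   default_is_pve && echo "first boot entry is a pve kernel"
```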