Segmentation Faults

Mr.Embedded

Jul 8, 2008
Hi All,

I am having an issue with some containerized VMs.

<hostname>:~# pveversion -v
pve-manager: 1.1-3 (pve-manager/1.1/3718)
qemu-server: 1.0-10
pve-kernel: 2.6.24-5
pve-kvm: 83-1
pve-firmware: 1
vncterm: 0.9-1
vzctl: 3.0.23-1pve1
vzdump: 1.1-1
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1dso1

Basically what is happening is: I install a Debian 5.0 container on this system, it goes fine and runs for a few days, and then when I try to add packages there are all kinds of segmentation faults as well as other issues. The host seems fine, and other (non-5.0) containers seem fine as well. I made a 2nd 5.0 container from scratch, and that also worked well for a few days before the same thing started happening: all kinds of segmentation faults.

I will try a new Debian 4.0 container and see if this issue continues to happen. Has anyone ever seen this? Is this an issue with the 5.0 container? In the past I have installed a 4.0 container and then upgraded it to 5.0 myself without issues.

Any help would be appreciated.
 
runs for a few days, and then when I try to add packages there are all kinds of segmentation faults as well as other issues.

Probably the first thing you could try:

Download the SystemRescueCd image from http://www.sysresccd.org/
and run memtest86+ overnight.

I just assembled new hardware and got bitten by a bad RAM module.
If you have overclocked your machine you could see similar results.
Do other guests or the host have similar problems?
Do you have screenshots/logs of the errors?
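If nothing made it into a screenshot, kernel segfault reports usually land in the logs; something like this pulls them out (the log path is the standard Debian location and is an assumption about your setup):

```shell
#!/bin/sh
# Print kernel segfault lines from a log file, with a fallback message
# when the file is missing or contains no matches.
find_segfaults() {
    grep -i "segfault" "$1" 2>/dev/null || echo "no segfaults logged in $1"
}

find_segfaults /var/log/syslog
```

Worth running both on the host and inside the affected container, since their logs are separate.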
 
Here is something:

<hostname>:/# apt-get upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages have been kept back:
bind9-host dnsutils libbind9-40 libisccc40 libisccfg40 liblwres40
0 upgraded, 0 newly installed, 0 to remove and 6 not upgraded.
1 not fully installed or removed.
After this operation, 0B of additional disk space will be used.
Do you want to continue [Y/n]? y
Setting up ca-certificates (20080809) ...
/var/lib/dpkg/info/ca-certificates.postinst: line 39: 4703 Segmentation fault rm -f $(cat /usr/share/doc/ca-certificates/oldpemfiles)
/var/lib/dpkg/info/ca-certificates.postinst: line 39: 5868 Segmentation fault rm -f $(cat /usr/share/doc/ca-certificates/oldpemfiles)
Updating certificates in /etc/ssl/certs..../usr/sbin/update-ca-certificates: line 86: 5891 Segmentation fault ln -sf "$CERTSDIR/$crt" "$pem"
dpkg: error processing ca-certificates (--configure):
subprocess post-installation script returned error exit status 139
Errors were encountered while processing:
ca-certificates
E: Sub-process /usr/bin/dpkg returned an error code (1)
<hostname>:/#
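For what it's worth, the "error exit status 139" from dpkg's subprocess is 128 + 11, i.e. the postinst script was killed by signal 11 (SIGSEGV), which matches the Segmentation fault lines above. A tiny helper (just a sketch, not part of dpkg) to decode such statuses:

```shell
#!/bin/sh
# Shells report "killed by signal N" as exit status 128 + N, so 139
# decodes to signal 11 (SIGSEGV). Print the signal number, or 0 for a
# normal exit code.
signal_from_status() {
    if [ "$1" -gt 128 ]; then
        echo $(( $1 - 128 ))
    else
        echo 0
    fi
}

signal_from_status 139   # prints 11, i.e. SIGSEGV
```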

I have a Windows 2003 guest there as well as one Debian 4.0, which seems OK, and then two Debian 5.0 containers which both give the faults. I also notice that the hostname defaults to the Proxmox host's name when the segmentation faults start and you restart the container. The second Debian 5.0 container was made as a replacement for the first after it went bad, and now the 2nd is behaving the same way.

I just did an apt-get update/upgrade on the Debian 4.0 container and it did a 70 MB upgrade with no segmentation faults. I am wondering if the issue is with the Debian 5.0 container.

Even a simple ps gives the segmentation fault in the 5.0 container.

I will try a memory test when I can physically get to the box, but I do not believe it's that, as this box has been up for several months.
 
/proc/user_beancounters does not seem to show any error entries on the host or the container itself.
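For reference, the last column of /proc/user_beancounters is the failcnt; anything nonzero there means the container bumped into a resource limit. A rough filter (a sketch, assuming the usual layout with a version line and a header line before the data rows):

```shell
#!/bin/sh
# Print /proc/user_beancounters rows whose failcnt (last column) is
# nonzero, skipping the version and header lines.
show_failcnt() {
    awk 'NR > 2 && $NF != 0' "$1"
}

if [ -r /proc/user_beancounters ]; then
    show_failcnt /proc/user_beancounters
else
    echo "no /proc/user_beancounters on this machine"
fi
```

Empty output means what the post above says: no limit was ever hit, so the segfaults are probably not a beancounter problem.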
 
Do you use a standard install (i.e. no xfs and no SW RAID)?

What's the output of:

# mount

on the host.

- Dietmar
 
It's a standard install and all mounts are local:

<hostname>:~# mount
/dev/pve/root on / type ext3 (rw,errors=remount-ro)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
procbususb on /proc/bus/usb type usbfs (rw)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
/dev/mapper/pve-data on /var/lib/vz type ext3 (rw)
/dev/sda1 on /boot type ext3 (rw)
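As a side note, the two things Dietmar asked about can be checked mechanically; a rough filter over /proc/mounts (assuming software RAID would show up as /dev/md* devices):

```shell
#!/bin/sh
# Flag xfs filesystems or software-RAID (md) devices in a mount table.
# Neither appears in the mount output above, matching a standard install.
check_nonstandard() {
    grep -E ' xfs |^/dev/md' "$1" || echo "no xfs or md mounts in $1"
}

check_nonstandard /proc/mounts
```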
 
None. Syslog only shows my numerous container reboot attempts and some network reconfigurations.

I have a Debian 4.0 container set up to replace the corrupted 5.0 container, so I am watching it like a hawk to see if the same thing happens. It's my immediate feeling that something is funky with the 5.0 container.
 
I basically downloaded and installed the 5.0 container through the web interface, then installed the following from the repositories:

apache2
mysql-server
php5
php5-mysql
php-pear

Then we put our small web application in there and configured MySQL for it. After a few days things go wonky. I rebuilt a 2nd container because I thought it was an issue caused by a developer, and the same thing happened a 2nd time on that container. They were both Debian 5.0 containers, hence the 4.0 container test at the moment.
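For anyone retracing the setup, the steps above amount to something like the following. The container ID and template name are hypothetical placeholders, and the commands are only printed here (a dry run) rather than executed:

```shell
#!/bin/sh
# Dry-run sketch of the container build described above. CTID and
# TEMPLATE are made-up values -- substitute your own before running
# any of the printed commands for real.
CTID=101
TEMPLATE=debian-5.0-standard
PKGS="apache2 mysql-server php5 php5-mysql php-pear"

rebuild_plan() {
    echo "vzctl create $CTID --ostemplate $TEMPLATE"
    echo "vzctl start $CTID"
    echo "vzctl exec $CTID apt-get update"
    echo "vzctl exec $CTID apt-get install -y $PKGS"
}

rebuild_plan
```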