Hi,
I restored an old Ubuntu "9.10 Server" backup to a new VM in PVE 2.1.
(I know it's old; I'm just testing the stability of an old 1.x VM on the new 2.1, and I will upgrade it in production.)
The original VM had 1 socket / 1 core (the old PVE still uses kernel 2.6.18).
I started the restored clone (after changing MAC, IP, hostname, SSH keys, etc.) with the same original hardware config, and everything went smoothly.
The VM is doing nothing at the moment; it only has MySQL installed, for testing connections.
Then I tried stopping it and adding more RAM and more sockets/cores. After a while (it may be minutes or hours), the load jumps to 100%: both over SSH and on the PVE console I can barely connect, then it stops responding at all. Switching to other terminals (Alt-F2) works, but they stop responding within seconds. Right after startup I can just about launch "top".
I could see the 100% load in the PVE graph. Two nights ago it jumped from 1% to 100% during the night, when nobody was interacting with it, and it happened again and again as I tried:
- set 2 sockets, 2 cores
- set 1 socket, 2 cores
- set 2 sockets, 1 core
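For reference, each of the attempts above just changes the sockets/cores lines in the VM config on the host (here assuming VMID 300, as in the details further down); the same should be doable with something like "qm set 300 -sockets N -cores N" from the CLI. The first attempt, for example, corresponds to:

```
# /etc/pve/nodes/pve2/qemu-server/300.conf -- 2 sockets, 2 cores = 4 vCPUs
sockets: 2
cores: 2
```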
Before everything locked up and I could no longer run "top", it looked like the ksoftirqd processes had the highest load: I saw two of them summing to around 50%, while from the PVE point of view the VM was at 100%, and it soon became unresponsive.
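In case it helps anyone reproduce this, here is a minimal sketch of a logger that could run inside the guest to capture the softirq counters before it locks up (the log path, sample count and interval are arbitrary example values, nothing PVE-specific; /proc/softirqs should exist on the 9.10 kernel, 2.6.31):

```shell
#!/bin/sh
# Append the load average and per-CPU softirq counters to a log at a
# fixed interval, so the last samples survive the hang and can be read
# after a reset. LOG, N and INTERVAL are example values.
LOG="${LOG:-softirq-watch.log}"
N="${N:-3}"                # number of samples; use a large N for real monitoring
INTERVAL="${INTERVAL:-1}"  # seconds between samples
i=0
while [ "$i" -lt "$N" ]; do
    date >> "$LOG"
    cat /proc/loadavg  >> "$LOG"   # 1/5/15 minute load averages
    cat /proc/softirqs >> "$LOG"   # per-CPU counts: TIMER, NET_RX, NET_TX, ...
    sleep "$INTERVAL"
    i=$((i + 1))
done
```

If the NET_RX/NET_TX or TIMER counters explode on one vCPU right before the hang, that would point at the NIC or timer path rather than MySQL.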
Yesterday I left it running all night with 1 socket, 1 core.
This morning it's perfect: responsive, and the graph (thanks a million, PVE 2.x!) shows minimal load during the night, 1-3%.
Now, after this, I'm fairly sure something in the VM is not reacting well to CPU changes (before, I had suspected the NIC, or something else).
What do you suggest? Would this also happen with other VMs (Windows / newer Linuxes)?
- Could it be an issue specific to the old OS, so... upgrade?
- Could it be a guest kernel issue, so update/recompile the kernel?
- Reinstall the OS from scratch on a multi-socket/core machine and then keep that config forever?
- Other?
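One cheap check before reinstalling anything (just a guess on my part, not a confirmed diagnosis): old guest kernels sometimes misbehave on SMP under KVM depending on the timekeeping source, so it may be worth looking at which clocksource the guest picked once it has more than one vCPU:

```shell
# Inside the guest: kernel version and timekeeping source.
uname -r
# Active clocksource (under KVM you would usually expect kvm-clock):
cat /sys/devices/system/clocksource/clocksource0/current_clocksource 2>/dev/null
# Alternatives the kernel could fall back to (tsc, acpi_pm, ...):
cat /sys/devices/system/clocksource/clocksource0/available_clocksource 2>/dev/null || true
```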
Thanks to anyone who can give hints; details follow.
Marco
The host is an IBM x3650 M2; storage is LVM over iSCSI on 1 Gb Ethernet.
Here are the host details
--------------------
root@pve2:~# pveversion -v
pve-manager: 2.1-1 (pve-manager/2.1/f9b0f63a)
running kernel: 2.6.32-11-pve
proxmox-ve-2.6.32: 2.0-66
pve-kernel-2.6.32-11-pve: 2.6.32-66
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.3-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.8-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.7-2
pve-cluster: 1.0-26
qemu-server: 2.0-39
pve-firmware: 1.0-16
libpve-common-perl: 1.0-27
libpve-access-control: 1.0-21
libpve-storage-perl: 2.0-18
vncterm: 1.0-2
vzctl: 3.0.30-2pve5
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.0-9
ksm-control-daemon: 1.1-1
Here is the host performance
------------------------
root@pve2:~# pveperf
CPU BOGOMIPS: 72523.88
REGEX/SECOND: 899376
HD SIZE: 16.49 GB (/dev/mapper/pve-root)
BUFFERED READS: 145.31 MB/sec
AVERAGE SEEK TIME: 3.78 ms
FSYNCS/SECOND: 3040.08
DNS EXT: 62.94 ms
DNS INT: 0.84 ms
Here are the VM details
------------------
root@pve2:~# cat /etc/pve/nodes/pve2/qemu-server/300.conf
acpi: 1
boot: dca
bootdisk: ide0
cores: 1
cpuunits: 1000
freeze: 0
ide0: vm_disks:vm-300-disk-1
ide1: vm_disks:vm-300-disk-2
ide2: cdrom,media=cdrom
kvm: 1
memory: 2048
name: MySQL-test-V
net0: e1000=BE:30:12:AA:58:1A,bridge=vmbr0
onboot: 1
ostype: l26
sockets: 1