Backup fails with an error message

Hi.
I occasionally get messages like this for various VMs whose backup failed, but the cause isn't clear to me...
What is the problem?

Code:
Win10-64-Bit-1709    FAILED    00:00:03    start failed: command '/usr/bin/kvm -id 102 -name Win10-64-Bit-1709 -chardev 'socket,id=qmp,path=/var/run/qemu-server/102.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/102.pid -daemonize -smbios 'type=1,uuid=6c7cb5de-82f6-4750-b3f7-db0b00f8ecaf' -smp '4,sockets=1,cores=4,maxcpus=4' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vnc unix:/var/run/qemu-server/102.vnc,x509,password -no-hpet -cpu 'kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_reset,hv_vpindex,hv_runtime,hv_relaxed,hv_synic,hv_stimer,enforce' -m 8192 -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'vmware-svga,id=vga,bus=pci.0,addr=0x2' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:e1bc7a1e367' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=300' -device 'ahci,id=ahci0,multifunction=on,bus=pci.0,addr=0x7' -drive 'file=/dev/zvol/rpool/vm-102-disk-0,if=none,id=drive-sata0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'ide-hd,bus=ahci0.0,drive=drive-sata0,id=sata0,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap102i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown' -device 'e1000,mac=62:8E:A7:5E:61:54,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=200' -rtc 'driftfix=slew,base=localtime' -machine 'type=pc' -global 'kvm-pit.lost_tick_policy=discard' -S' failed: exit code 1
 
Please post the complete log of a failed backup, plus the output of 'pveversion -v'.
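For example (vzdump writes a per-VM log, as the path below shows):

Code:
cat /var/log/vzdump/qemu-102.log
pveversion -v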
 
Code:
pveversion -v
proxmox-ve: 5.4-1 (running kernel: 4.15.18-14-pve)
pve-manager: 5.4-6 (running version: 5.4-6/aa7856c5)
pve-kernel-4.15: 5.4-2
pve-kernel-4.15.18-14-pve: 4.15.18-39
pve-kernel-4.15.18-13-pve: 4.15.18-37
pve-kernel-4.15.18-12-pve: 4.15.18-36
pve-kernel-4.15.18-11-pve: 4.15.18-34
pve-kernel-4.15.18-10-pve: 4.15.18-32
pve-kernel-4.15.18-9-pve: 4.15.18-30
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-10
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-52
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-13
libpve-storage-perl: 5.0-43
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-3
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
proxmox-widget-toolkit: 1.0-28
pve-cluster: 5.0-37
pve-container: 2.0-39
pve-docs: 5.4-2
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-22
pve-firmware: 2.0-6
pve-ha-manager: 2.0-9
pve-i18n: 1.1-4
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 3.0.1-2
pve-xtermjs: 3.12.0-1
pve-zsync: 1.7-4
qemu-server: 5.0-51
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo2

The message appeared again last night for this VM -- the backup is written to a Synology DiskStation via NFS. None of the other VMs have this problem.
Here is the complete log file:

Code:
cat /var/log/vzdump/qemu-102.log
2019-05-30 16:21:59 INFO: Starting Backup of VM 102 (qemu)
2019-05-30 16:21:59 INFO: status = stopped
2019-05-30 16:22:00 INFO: update VM 102: -lock backup
2019-05-30 16:22:00 INFO: backup mode: stop
2019-05-30 16:22:00 INFO: ionice priority: 7
2019-05-30 16:22:00 INFO: VM Name: Win10-64-Bit-1709
2019-05-30 16:22:00 INFO: include disk 'sata0' 'rpool:vm-102-disk-0' 128G
2019-05-30 16:22:00 INFO: skip unused drive 'RAID10-local:102/vm-102-disk-1.qcow2' (not included into backup)
2019-05-30 16:22:00 INFO: creating archive '/mnt/pve/DSSchule/dump/vzdump-qemu-102-2019_05_30-16_21_59.vma.lzo'
2019-05-30 16:22:00 INFO: starting kvm to execute backup task
2019-05-30 16:22:01 ERROR: Backup of VM 102 failed - start failed: command '/usr/bin/kvm -id 102 -name Win10-64-Bit-1709 -chardev 'socket,id=qmp,path=/var/run/qemu-server/102.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/102.pid -daemonize -smbios 'type=1,uuid=6c7cb5de-82f6-4750-b3f7-db0b00f8ecaf' -smp '4,sockets=1,cores=4,maxcpus=4' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vnc unix:/var/run/qemu-server/102.vnc,x509,password -no-hpet -cpu 'kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_reset,hv_vpindex,hv_runtime,hv_relaxed,hv_synic,hv_stimer,enforce' -m 8192 -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'vmware-svga,id=vga,bus=pci.0,addr=0x2' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:e1bc7a1e367' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=300' -device 'ahci,id=ahci0,multifunction=on,bus=pci.0,addr=0x7' -drive 'file=/dev/zvol/rpool/vm-102-disk-0,if=none,id=drive-sata0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'ide-hd,bus=ahci0.0,drive=drive-sata0,id=sata0,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap102i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown' -device 'e1000,mac=62:8E:A7:5E:61:54,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=200' -rtc 'driftfix=slew,base=localtime' -machine 'type=pc' -global 'kvm-pit.lost_tick_policy=discard' -S' failed: exit code 1
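For context: the NFS target ('DSSchule', mounted at /mnt/pve/DSSchule per the log above) is defined in /etc/pve/storage.cfg along these lines -- a sketch only, the export path and server address here are placeholders:

Code:
nfs: DSSchule
        export /volume1/backup
        path /mnt/pve/DSSchule
        server <diskstation-address>
        content backup
        options vers=3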

Oh, right -- I recently moved the disk from a RAID 10 to a ZFS pool via the web GUI (but kept the original)...
 
Starting the VM does work, just not in the context of backups?
 
Hm -- that's new ... now starting doesn't work anymore either

Code:
kvm: -drive file=/dev/zvol/rpool/vm-102-disk-0,if=none,id=drive-sata0,format=raw,cache=none,aio=native,detect-zeroes=on: Could not open '/dev/zvol/rpool/vm-102-disk-0': No such file or directory
TASK ERROR: start failed: command '/usr/bin/kvm -id 102 -name Win10-64-Bit-1709 -chardev 'socket,id=qmp,path=/var/run/qemu-server/102.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/102.pid -daemonize -smbios 'type=1,uuid=6c7cb5de-82f6-4750-b3f7-db0b00f8ecaf' -smp '4,sockets=1,cores=4,maxcpus=4' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vnc unix:/var/run/qemu-server/102.vnc,x509,password -no-hpet -cpu 'kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_reset,hv_vpindex,hv_runtime,hv_relaxed,hv_synic,hv_stimer,enforce' -m 8192 -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'vmware-svga,id=vga,bus=pci.0,addr=0x2' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:e1bc7a1e367' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=300' -device 'ahci,id=ahci0,multifunction=on,bus=pci.0,addr=0x7' -drive 'file=/dev/zvol/rpool/vm-102-disk-0,if=none,id=drive-sata0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'ide-hd,bus=ahci0.0,drive=drive-sata0,id=sata0,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap102i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown' -device 'e1000,mac=62:8E:A7:5E:61:54,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=200' -rtc 'driftfix=slew,base=localtime' -machine 'type=pc' -global 'kvm-pit.lost_tick_policy=discard'' failed: exit code 1
zfs list gives me:
Code:
NAME                 USED  AVAIL  REFER  MOUNTPOINT
rpool                168G  2.89T   140K  /rpool
rpool/vm-102-disk-0  132G  2.94T  80.6G  -
 
Sorry for automatically switching between German and English. Please post the output of 'zfs list'. Is 'vm-102-disk-0' listed in it?
 
For a few days now the error message has been appearing more frequently -- and for quite different VMs. I assume it is related to moving the virtual HDDs from a RAID to the ZFS pool, because the messages only started appearing after that.
First, here is "zfs list":
Code:
zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
rpool                    1.64T  1.41T   140K  /rpool
rpool/CONTAINERS         2.28G  1.41T  2.28G  /rpool/CONTAINERS
rpool/subvol-106-disk-0  3.55G  4.45G  3.55G  /rpool/subvol-106-disk-0
rpool/vm-100-disk-0      49.9G  1.44T  13.8G  -
rpool/vm-101-disk-0       132G  1.49T  51.9G  -
rpool/vm-102-disk-0       132G  1.46T  80.6G  -
rpool/vm-104-disk-0      33.0G  1.43T  8.23G  -
rpool/vm-108-disk-0      41.3G  1.44T  5.16G  -
rpool/vm-111-disk-0      33.0G  1.44T  81.4K  -
rpool/vm-199-disk-0      33.0G  1.43T  13.1G  -
rpool/vm-200-disk-0       132G  1.48T  60.2G  -
rpool/vm-400-disk-0      51.6G  1.44T  15.6G  -
rpool/vm-502-disk-0      5.16G  1.41T  4.13G  -
rpool/vm-503-disk-0      24.3G  1.41T  24.3G  -
rpool/vm-503-disk-1      51.6G  1.42T  42.0G  -
rpool/vm-504-disk-0      51.6G  1.41T  48.0G  -
rpool/vm-504-disk-1       258G  1.61T  49.7G  -
rpool/vm-506-disk-0      22.7G  1.43T  4.76G  -
rpool/vm-507-disk-0       258G  1.62T  36.1G  -
rpool/vm-600-disk-0      33.0G  1.42T  22.0G  -
rpool/vm-699-disk-0      8.25G  1.41T  2.47G  -
rpool/vm-699-disk-1      6.19G  1.41T  81.4K  -
rpool/vm-700-disk-0       309G  1.56T   150G  -
rpool/vm-902-disk-0      10.3G  1.42T  2.58G  -

Here again is the message from last night (3 VMs were affected by it).
The backup is created on a Synology DiskStation via NFS.

Code:
400    Ubuntu-Server-16-04    FAILED    00:00:02    start failed: command '/usr/bin/kvm -id 400 -name Ubuntu-Server-16-04 -chardev 'socket,id=qmp,path=/var/run/qemu-server/400.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/400.pid -daemonize -smbios 'type=1,uuid=4a8ebb8c-df2c-468c-a1e2-e96932f74850' -smp '4,sockets=1,cores=4,maxcpus=4' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vnc unix:/var/run/qemu-server/400.vnc,x509,password -cpu kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce -m 8192 -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'VGA,id=vga,bus=pci. 0,addr=0x2' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:e1bc7a1e367' -drive 'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 'file=/dev/zvol/rpool/vm-400-disk-0,if=none,id=drive-scsi0,cache=writethrough,format=raw,aio=threads,detect-zeroes=on' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap400i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=32:35:36:30:62:38,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' -machine 'type=pc' -S' failed: exit code 1

What could this be?
 
When this error occurs, what is the output of 'ls -l /dev/zvol/rpool'?
 
Hi. I can hardly check at the very moment it happens, because it happens at night ... here is the output right now -- everything looks fine there, doesn't it?!
Code:
ls -l /dev/zvol/rpool
total 0
lrwxrwxrwx 1 root root  9 May 23 14:48 vm-100-disk-0 -> ../../zd0
lrwxrwxrwx 1 root root 11 May 23 14:48 vm-100-disk-0-part1 -> ../../zd0p1
lrwxrwxrwx 1 root root 11 May 23 14:48 vm-100-disk-0-part2 -> ../../zd0p2
lrwxrwxrwx 1 root root 10 Jun 12 16:51 vm-101-disk-0 -> ../../zd80
lrwxrwxrwx 1 root root 10 Jun 12 17:25 vm-104-disk-0 -> ../../zd96
lrwxrwxrwx 1 root root 11 Jun 12 17:36 vm-108-disk-0 -> ../../zd112
lrwxrwxrwx 1 root root 11 Jun 12 17:37 vm-111-disk-0 -> ../../zd128
lrwxrwxrwx 1 root root 10 Jun 12 16:40 vm-199-disk-0 -> ../../zd64
lrwxrwxrwx 1 root root 11 Jun 12 17:38 vm-200-disk-0 -> ../../zd144
lrwxrwxrwx 1 root root 11 Jun 12 17:39 vm-400-disk-0 -> ../../zd160
lrwxrwxrwx 1 root root 11 Jun 12 18:17 vm-502-disk-0 -> ../../zd272
lrwxrwxrwx 1 root root 11 Jun 12 18:04 vm-503-disk-0 -> ../../zd240
lrwxrwxrwx 1 root root 11 Jun 12 18:08 vm-503-disk-1 -> ../../zd256
lrwxrwxrwx 1 root root 11 Jun 13 21:32 vm-504-disk-0 -> ../../zd304
lrwxrwxrwx 1 root root 11 Jun 13 21:40 vm-504-disk-1 -> ../../zd320
lrwxrwxrwx 1 root root 11 Jun 12 18:03 vm-506-disk-0 -> ../../zd224
lrwxrwxrwx 1 root root 11 Jun 12 18:02 vm-507-disk-0 -> ../../zd176
lrwxrwxrwx 1 root root 11 Jun 12 22:38 vm-600-disk-0 -> ../../zd288
lrwxrwxrwx 1 root root 10 Jun  2 20:18 vm-699-disk-0 -> ../../zd16
lrwxrwxrwx 1 root root 10 Jun  2 20:19 vm-699-disk-1 -> ../../zd32
lrwxrwxrwx 1 root root 11 Jun 13 21:41 vm-700-disk-0 -> ../../zd336
lrwxrwxrwx 1 root root 10 Jun 12 15:40 vm-902-disk-0 -> ../../zd48

This still looks like a timeout problem to me, since the DiskStation goes to sleep at night and always needs a moment before the backup can get going. Incidentally, another backup to internal disks (mounted directly) does not show this behavior (but it is not NFS either). Could it be related to that?
 
That shouldn't be related. At that moment the disk is not found locally.
One possibility would be to add a hookscript that runs 'ls -l' and prints its output when the backup starts. Unfortunately this produces a lot of output, because it has to be added to every VM (where the problem occurs) and is therefore executed on every start (not only during backups).
Code:
#!/usr/bin/perl

use strict;
use warnings;

print "GUEST HOOK: " . join(' ', @ARGV) . "\n";

# First argument is the vmid
my $vmid = shift;

# Second argument is the phase
my $phase = shift;

if ($phase eq 'pre-start') {

    # The first phase, 'pre-start', is executed before the guest
    # is started. Exiting with a code != 0 will abort the start.

    print "$vmid is starting, doing preparations.\n";
    system('ls', '-l', '/dev/zvol/rpool');

    # print "preparations failed, aborting.\n";
    # exit(1);
}
It is a simplified script based on the example in /usr/share/pve-docs/examples/guest-example-hookscript.pl
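To attach it, the script has to live on a storage with 'snippets' content enabled; roughly like this (file name and storage are examples):

Code:
# put the script on the 'local' storage (snippets directory) and make it executable
cp check-zvol.pl /var/lib/vz/snippets/
chmod +x /var/lib/vz/snippets/check-zvol.pl

# register it for each affected VM, e.g. VM 102
qm set 102 --hookscript local:snippets/check-zvol.pl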
 
OK, I just saw that the problem is heading in a completely different direction. As mentioned, the VMs are not running. But when I tried to start them, I got this error:
Code:
ioctl(KVM_CREATE_VM) failed: 12 Cannot allocate memory
kvm: failed to initialize KVM: Cannot allocate memory
I've already searched the forum for this -- but no solution in sight so far. Strangely, though, the backup doesn't always fail -- it isn't even consistent.

By the way, the server has 256 GB RAM -- according to the GUI:
RAM usage 33.32% (83.94 GiB of 251.88 GiB)

On the other hand, "free -m" reports only:
Code:
              total        used        free      shared  buff/cache   available
Mem:         257924       81654        1132          82      175137      174373
Swap:         20479        4109       16370
Apparently the problem lies somewhere in here ... but why?
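One thing I could check the next time it happens (just a guess that memory fragmentation, rather than an outright shortage, is involved): whether higher-order contiguous pages are still available, and whether manual compaction helps:

Code:
# free page counts per allocation order (columns = order 0..10);
# mostly-zero right-hand columns indicate fragmented memory
cat /proc/buddyinfo

# ask the kernel to compact memory
echo 1 > /proc/sys/vm/compact_memory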

And here, once again:
Code:
pveversion -v
proxmox-ve: 5.4-1 (running kernel: 4.15.18-14-pve)
pve-manager: 5.4-6 (running version: 5.4-6/aa7856c5)
pve-kernel-4.15: 5.4-4
pve-kernel-4.15.18-16-pve: 4.15.18-41
pve-kernel-4.15.18-15-pve: 4.15.18-40
pve-kernel-4.15.18-14-pve: 4.15.18-39
pve-kernel-4.15.18-13-pve: 4.15.18-37
pve-kernel-4.15.18-12-pve: 4.15.18-36
pve-kernel-4.15.18-11-pve: 4.15.18-34
pve-kernel-4.15.18-10-pve: 4.15.18-32
pve-kernel-4.15.18-9-pve: 4.15.18-30
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-10
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-52
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-13
libpve-storage-perl: 5.0-43
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-3
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
proxmox-widget-toolkit: 1.0-28
pve-cluster: 5.0-37
pve-container: 2.0-39
pve-docs: 5.4-2
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-22
pve-firmware: 2.0-6
pve-ha-manager: 2.0-9
pve-i18n: 1.1-4
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 3.0.1-2
pve-xtermjs: 3.12.0-1
pve-zsync: 1.7-4
qemu-server: 5.0-52
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo2
 
Please post the VM config ('qm config <vmid>'). Is 'numa' enabled?
 
Hello. As I said, these are quite different VMs ... with partly different configs. Here is one of them anyway:
Code:
#Ubuntu 16.04 LTS Server (64 Bit)
#with LAMP and SSHD
boot: cdn
bootdisk: scsi0
cores: 2
ide2: none,media=cdrom
memory: 8192
name: Ubuntu-Server-16-04
net0: virtio=32:35:36:30:62:38,bridge=vmbr0
numa: 0
ostype: l26
scsi0: RAID10-local:400/vm-400-disk-0.qcow2,cache=writethrough,size=50G
scsihw: virtio-scsi-pci
smbios1: uuid=4a8ebb8c-df2c-468c-a1e2-e96932f74850
sockets: 1
startup: order=5
So numa: 0 .... but is that really the cause?
By the way, the server was rebooted today. Hopefully that also takes care of the memory problem mentioned above, at least for now. Let's wait and see...
 
Quick update: the problem persists -- last night as many as 4 VMs were affected. The message is always the same as above, "start failed..." ... I checked all of the VMs mentioned again: they start without problems, though none of these VMs is running at the time of the backup.

Here again is the output of "free --giga" -- though it is still not clear to me why only "3 GB" are shown as free:

Code:
# free --giga
              total        used        free      shared  buff/cache   available
Mem:            264          80           3           0         180         181
Swap:            20           0          20


Could the problem perhaps be related to this?
https://forum.proxmox.com/threads/memory-allocation-failure.41441/
 
When the 'memory allocation failed' error occurs, what do 'arcstat' and 'arc_summary' output?
 
Well, I can't check at the exact moment of the error ... but I had already reduced the ARC cache once:
Code:
arcstat
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c 
12:35:11     0     0      0     0    0     0    0     0    0   6.5G  6.5G

Code:
arc_summary

------------------------------------------------------------------------
ZFS Subsystem Report                            Tue Jul 16 12:36:12 2019
ARC Summary: (HEALTHY)
        Memory Throttle Count:                  0

ARC Misc:
        Deleted:                                2.62G
        Mutex Misses:                           702.93k
        Evict Skips:                            518.91M

ARC Size:                               10.16%  6.50    GiB
        Target Size: (Adaptive)         10.21%  6.54    GiB
        Min Size (Hard Limit):          6.25%   4.00    GiB
        Max Size (High Water):          16:1    64.00   GiB

ARC Size Breakdown:
        Recently Used Cache Size:       13.62%  464.27  MiB
        Frequently Used Cache Size:     86.38%  2.87    GiB

ARC Hash Breakdown:
        Elements Max:                           37.47M
        Elements Current:               88.15%  33.03M
        Collisions:                             2.42G
        Chain Max:                              13
        Chains:                                 8.67M

ARC Total accesses:                                     10.77G
        Cache Hit Ratio:                74.52%  8.02G
        Cache Miss Ratio:               25.48%  2.74G
        Actual Hit Ratio:               73.83%  7.95G

        Data Demand Efficiency:         77.47%  2.69G
        Data Prefetch Efficiency:       5.10%   2.15G

        CACHE HITS BY CACHE LIST:
          Anonymously Used:             0.79%   63.29M
          Most Recently Used:           21.19%  1.70G
          Most Frequently Used:         77.88%  6.25G
          Most Recently Used Ghost:     0.13%   10.09M
          Most Frequently Used Ghost:   0.02%   1.60M

        CACHE HITS BY DATA TYPE:
          Demand Data:                  26.00%  2.09G
          Prefetch Data:                1.36%   109.51M
          Demand Metadata:              72.59%  5.83G
          Prefetch Metadata:            0.05%   3.67M

        CACHE MISSES BY DATA TYPE:
          Demand Data:                  22.11%  606.55M
          Prefetch Data:                74.20%  2.04G
          Demand Metadata:              3.56%   97.63M
          Prefetch Metadata:            0.13%   3.45M

L2 ARC Summary: (HEALTHY)
        Low Memory Aborts:                      3.55k
        Free on Write:                          589.47k
        R/W Clashes:                            0
        Bad Checksums:                          0
        IO Errors:                              0

L2 ARC Size: (Adaptive)                         255.17  GiB
        Compressed:                     85.21%  217.43  GiB
        Header Size:                    1.12%   2.87    GiB

L2 ARC Evicts:
        Lock Retries:                           10.21k
        Upon Reading:                           4

L2 ARC Breakdown:                               2.74G
        Hit Ratio:                      31.14%  854.28M
        Miss Ratio:                     68.86%  1.89G
        Feeds:                                  1.49M

L2 ARC Writes:
        Writes Sent:                    100.00% 1.29M

DMU Prefetch Efficiency:                                        855.65M
        Hit Ratio:                      34.38%  294.22M
        Miss Ratio:                     65.62%  561.44M



ZFS Tunables:
        dbuf_cache_hiwater_pct                            10
        dbuf_cache_lowater_pct                            10
        dbuf_cache_max_bytes                              104857600
        dbuf_cache_max_shift                              5
        dmu_object_alloc_chunk_shift                      7
        ignore_hole_birth                                 1
        l2arc_feed_again                                  1
        l2arc_feed_min_ms                                 200
        l2arc_feed_secs                                   1
        l2arc_headroom                                    2
        l2arc_headroom_boost                              200
        l2arc_noprefetch                                  0
        l2arc_norw                                        0
        l2arc_write_boost                                 8388608
        l2arc_write_max                                   8388608
        metaslab_aliquot                                  524288
        metaslab_bias_enabled                             1
        metaslab_debug_load                               0
        metaslab_debug_unload                             0
        metaslab_fragmentation_factor_enabled             1
        metaslab_lba_weighting_enabled                    1
        metaslab_preload_enabled                          1
        metaslabs_per_vdev                                200
        send_holes_without_birth_time                     1
        spa_asize_inflation                               24
        spa_config_path                                   /etc/zfs/zpool.cache
        spa_load_verify_data                              1
        spa_load_verify_maxinflight                       10000
        spa_load_verify_metadata                          1
        spa_slop_shift                                    5
        zfetch_array_rd_sz                                1048576
        zfetch_max_distance                               8388608
        zfetch_max_streams                                8
        zfetch_min_sec_reap                               2
        zfs_abd_scatter_enabled                           1
        zfs_abd_scatter_max_order                         10
        zfs_admin_snapshot                                1
        zfs_arc_average_blocksize                         8192
        zfs_arc_dnode_limit                               0
        zfs_arc_dnode_limit_percent                       10
        zfs_arc_dnode_reduce_percent                      10
        zfs_arc_grow_retry                                0
        zfs_arc_lotsfree_percent                          10
        zfs_arc_max                                       68719476736
        zfs_arc_meta_adjust_restarts                      4096
        zfs_arc_meta_limit                                0
        zfs_arc_meta_limit_percent                        75
        zfs_arc_meta_min                                  0
        zfs_arc_meta_prune                                10000
        zfs_arc_meta_strategy                             1
        zfs_arc_min                                       4294967296
        zfs_arc_min_prefetch_lifespan                     0
        zfs_arc_p_dampener_disable                        1
        zfs_arc_p_min_shift                               0
        zfs_arc_pc_percent                                0
        zfs_arc_shrink_shift                              0
        zfs_arc_sys_free                                  0
        zfs_autoimport_disable                            1
        zfs_checksums_per_second                          20
        zfs_compressed_arc_enabled                        1
        zfs_dbgmsg_enable                                 0
        zfs_dbgmsg_maxsize                                4194304
        zfs_dbuf_state_index                              0
        zfs_deadman_checktime_ms                          5000
        zfs_deadman_enabled                               1
        zfs_deadman_synctime_ms                           1000000
        zfs_dedup_prefetch                                0
        zfs_delay_min_dirty_percent                       60
        zfs_delay_scale                                   500000
        zfs_delays_per_second                             20
        zfs_delete_blocks                                 20480
        zfs_dirty_data_max                                4294967296
        zfs_dirty_data_max_max                            4294967296
        zfs_dirty_data_max_max_percent                    25
        zfs_dirty_data_max_percent                        10
        zfs_dirty_data_sync                               67108864
        zfs_dmu_offset_next_sync                          0
        zfs_expire_snapshot                               300
        zfs_flags                                         0
        zfs_free_bpobj_enabled                            1
        zfs_free_leak_on_eio                              0
        zfs_free_max_blocks                               100000
        zfs_free_min_time_ms                              1000
        zfs_immediate_write_sz                            32768
        zfs_max_recordsize                                1048576
        zfs_mdcomp_disable                                0
        zfs_metaslab_fragmentation_threshold              70
        zfs_metaslab_segment_weight_enabled               1
        zfs_metaslab_switch_threshold                     2
        zfs_mg_fragmentation_threshold                    85
        zfs_mg_noalloc_threshold                          0
        zfs_multihost_fail_intervals                      5
        zfs_multihost_history                             0
        zfs_multihost_import_intervals                    10
        zfs_multihost_interval                            1000
        zfs_multilist_num_sublists                        0
        zfs_no_scrub_io                                   0
        zfs_no_scrub_prefetch                             0
        zfs_nocacheflush                                  0
        zfs_nopwrite_enabled                              1
        zfs_object_mutex_size                             64
        zfs_pd_bytes_max                                  52428800
        zfs_per_txg_dirty_frees_percent                   30
        zfs_prefetch_disable                              0
        zfs_read_chunk_size                               1048576
        zfs_read_history                                  0
        zfs_read_history_hits                             0
        zfs_recover                                       0
        zfs_recv_queue_length                             16777216
        zfs_resilver_delay                                2
        zfs_resilver_min_time_ms                          3000
        zfs_scan_idle                                     50
        zfs_scan_ignore_errors                            0
        zfs_scan_min_time_ms                              1000
        zfs_scrub_delay                                   4
        zfs_send_corrupt_data                             0
        zfs_send_queue_length                             16777216
        zfs_sync_pass_deferred_free                       2
        zfs_sync_pass_dont_compress                       5
        zfs_sync_pass_rewrite                             2
        zfs_sync_taskq_batch_pct                          75
        zfs_top_maxinflight                               32
        zfs_txg_history                                   0
        zfs_txg_timeout                                   5
        zfs_vdev_aggregation_limit                        131072
        zfs_vdev_async_read_max_active                    3
        zfs_vdev_async_read_min_active                    1
        zfs_vdev_async_write_active_max_dirty_percent     60
        zfs_vdev_async_write_active_min_dirty_percent     30
        zfs_vdev_async_write_max_active                   10
        zfs_vdev_async_write_min_active                   2
        zfs_vdev_cache_bshift                             16
        zfs_vdev_cache_max                                16384
        zfs_vdev_cache_size                               0
        zfs_vdev_max_active                               1000
        zfs_vdev_mirror_non_rotating_inc                  0
        zfs_vdev_mirror_non_rotating_seek_inc             1
        zfs_vdev_mirror_rotating_inc                      0
        zfs_vdev_mirror_rotating_seek_inc                 5
        zfs_vdev_mirror_rotating_seek_offset              1048576
        zfs_vdev_queue_depth_pct                          1000
        zfs_vdev_raidz_impl                               [fastest] original scalar sse2 ssse3
        zfs_vdev_read_gap_limit                           32768
        zfs_vdev_scheduler                                noop
        zfs_vdev_scrub_max_active                         2
        zfs_vdev_scrub_min_active                         1
        zfs_vdev_sync_read_max_active                     10
        zfs_vdev_sync_read_min_active                     10
        zfs_vdev_sync_write_max_active                    10
        zfs_vdev_sync_write_min_active                    10
        zfs_vdev_write_gap_limit                          4096
        zfs_zevent_cols                                   80
        zfs_zevent_console                                0
        zfs_zevent_len_max                                640
        zil_replay_disable                                0
        zil_slog_bulk                                     786432
        zio_delay_max                                     30000
        zio_dva_throttle_enabled                          1
        zio_requeue_io_start_cut_in_line                  1
        zio_taskq_batch_pct                               75
        zvol_inhibit_dev                                  0
        zvol_major                                        230
        zvol_max_discard_blocks                           16384
        zvol_prefetch_bytes                               131072
        zvol_request_sync                                 0
        zvol_threads                                      32
        zvol_volmode                                      1
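For reference, the limits shown above (zfs_arc_max = 64 GiB, zfs_arc_min = 4 GiB) are the kind of values that are typically set via a ZFS module option -- a sketch of how that is usually done, not necessarily how it was configured on this box:

Code:
# /etc/modprobe.d/zfs.conf
options zfs zfs_arc_max=68719476736 zfs_arc_min=4294967296

# make the setting take effect at the next boot
update-initramfs -u -k all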
 
