[SOLVED] VM stops (crashes?) while cloning

oknet

Member
Oct 13, 2022
I clone a running VM "on the fly" with a script called by crontab, like this:

Bash:
qm destroy 101                              ### remove the old clone
qm clone 100 101 --full --name VM100clone   ### 100 is the running VM
qm set 101 --onboot 0                       ### prevent the clone from starting at boot
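For reference, here is a minimal sketch of a slightly hardened version of that script. The variable names and the abort-on-error handling are my additions, not necessarily what the original backup.sh does:

Bash:
#!/bin/bash
# Hypothetical hardened clone script (VM IDs and naming taken from the post above;
# the error handling is an assumption, not part of the original script).
set -euo pipefail

SRC=100    # running VM to clone
DST=101    # ID used for the throw-away clone

# Remove the old clone; don't abort if it does not exist yet.
qm destroy "$DST" --purge || echo "no previous clone $DST to destroy"

# Full clone of the running VM; with set -e the script stops here if the clone fails,
# so it never continues with a half-created clone.
qm clone "$SRC" "$DST" --full --name "VM${SRC}clone"

# Prevent the clone from starting at boot.
qm set "$DST" --onboot 0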


Sometimes it ends with an error like this:

Code:
create full clone of drive scsi0 (local-zfs:vm-100-disk-0)
drive mirror is starting for drive-scsi0
drive-scsi0: transferred 0.0 B of 50.0 GiB (0.00%) in 0s
drive-scsi0: transferred 1.7 GiB of 50.0 GiB (3.30%) in 1s
drive-scsi0: transferred 2.0 GiB of 50.0 GiB (4.08%) in 2s
drive-scsi0: transferred 2.5 GiB of 50.0 GiB (4.97%) in 3s
drive-scsi0: transferred 2.9 GiB of 50.0 GiB (5.72%) in 4s
[...]
drive-scsi0: transferred 14.0 GiB of 50.0 GiB (27.98%) in 1m 58s
drive-scsi0: transferred 14.0 GiB of 50.0 GiB (28.06%) in 1m 59s
drive-scsi0: transferred 14.1 GiB of 50.0 GiB (28.12%) in 2m
drive-scsi0: Cancelling block job
drive-scsi0: Cancelling block job
zfs error: cannot destroy 'rpool/data/vm-101-disk-0': dataset is busy

TASK ERROR: clone failed: block job (mirror) error: VM 100 not running



Why does this happen?
Why does it end with VM 100, which was running, no longer running??
Thanks
 
Hi,
can you please share the system logs/journal from around the time the issue happened? Please provide the output of pveversion -v and qm config 100.
 
Well...
Now I have to investigate why the VM suddenly stops.

This is /var/log/messages of VM 100 (a PBX):

Code:
Dec  9 12:28:03 mypbx systemd: Reloaded The Apache HTTP Server.
Dec  9 12:28:03 mypbx systemd: Starting Fail2Ban Service...
Dec  9 12:28:04 mypbx fail2ban-client: 2023-12-09 12:28:04,040 fail2ban.server [27315]: INFO    Starting Fail2ban v0.8.14
Dec  9 12:28:04 mypbx fail2ban-client: 2023-12-09 12:28:04,040 fail2ban.server [27315]: INFO    Starting in daemon mode
Dec  9 12:28:04 mypbx systemd: Started Fail2Ban Service.
Dec  9 12:28:04 mypbx yum[27320]: Installed: socat-1.7.3.2-2.el7.x86_64
Dec  9 17:36:02 mypbx rsyslogd: imjournal: journal reloaded... [v8.24.0-52.el7_8.2 try http://www.rsyslog.com/e/0 ]
Dec  9 17:36:02 mypbx rsyslogd: imjournal: journal reloaded... [v8.24.0-52.el7_8.2 try http://www.rsyslog.com/e/0 ]
Dec  9 20:15:43 mypbx systemd: Starting Cleanup of Temporary Directories...
Dec  9 20:15:43 mypbx systemd: Started Cleanup of Temporary Directories.
Dec 11 07:46:53 mypbx journal: Runtime journal is using 8.0M (max allowed 391.0M, trying to leave 586.5M free of 3.8G available → current limit 391.0M).
Dec 11 07:46:53 mypbx kernel: Initializing cgroup subsys cpuset
Dec 11 07:46:53 mypbx kernel: Initializing cgroup subsys cpu
Dec 11 07:46:53 mypbx kernel: Initializing cgroup subsys cpuacct
Dec 11 07:46:53 mypbx kernel: Linux version 3.10.0-1127.19.1.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-39) (GCC) ) #1 SMP Tue Aug 25 17:23:54 UTC 2020
Dec 11 07:46:53 mypbx kernel: Command line: BOOT_IMAGE=/vmlinuz-3.10.0-1127.19.1.el7.x86_64 root=/dev/mapper/SangomaVG-root ro crashkernel=auto rd.lvm.lv=SangomaVG/root rd.lvm.lv=SangomaVG/swaplv1 biosdevname=0 net.ifnames=0 rhgb quiet
Dec 11 07:46:53 mypbx kernel: e820: BIOS-provided physical RAM map:

Clearly, there's a black hole from Saturday Dec 9 20:15:43 to Monday Dec 11 07:46:53 (UTC), when I started the VM manually, so something happened in between...
Looking at the Proxmox journal, I found this at Dec 10 02:00 (when crontab starts cloning VM 100):

Code:
Dec 10 02:00:01 pve CRON[3656585]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Dec 10 02:00:01 pve CRON[3656586]: (root) CMD (   /home/backup.sh)
Dec 10 02:00:01 pve qm[3656590]: <root@pam> starting task UPID:pve:0037CB91:08DCB044:65750D91:qmdestroy:102:root@pam:
Dec 10 02:00:01 pve qm[3656593]: destroy VM 102: UPID:pve:0037CB91:08DCB044:65750D91:qmdestroy:102:root@pam:
Dec 10 02:00:02 pve qm[3656590]: <root@pam> end task UPID:pve:0037CB91:08DCB044:65750D91:qmdestroy:102:root@pam: OK
Dec 10 02:00:02 pve qm[3656677]: <root@pam> starting task UPID:pve:0037CBE6:08DCB0B0:65750D92:qmclone:100:root@pam:
Dec 10 02:00:22 pve pvestatd[1968]: status update time (9.078 seconds)
Dec 10 02:01:06 pve pve-firewall[1966]: firewall update time (5.350 seconds)
Dec 10 02:02:05 pve kernel: task UPID:pve:0 invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
Dec 10 02:02:07 pve kernel: CPU: 3 PID: 3656678 Comm: task UPID:pve:0 Tainted: P           O       6.2.16-3-pve #1
Dec 10 02:02:07 pve kernel: Hardware name: HPE ProLiant MicroServer Gen10 Plus v2/ProLiant MicroServer Gen10 Plus v2, BIOS U64 06/30/2022
Dec 10 02:02:07 pve kernel: Call Trace:
Dec 10 02:02:07 pve kernel:  <TASK>
Dec 10 02:02:07 pve kernel:  dump_stack_lvl+0x48/0x70
Dec 10 02:02:07 pve kernel:  dump_stack+0x10/0x20
Dec 10 02:02:07 pve kernel:  dump_header+0x50/0x290
Dec 10 02:02:07 pve kernel:  oom_kill_process+0x10d/0x1c0
Dec 10 02:02:07 pve kernel:  out_of_memory+0x23c/0x570
Dec 10 02:02:08 pve kernel:  __alloc_pages+0x1180/0x13a0
Dec 10 02:02:08 pve kernel:  alloc_pages+0x90/0x1a0
Dec 10 02:02:08 pve kernel:  folio_alloc+0x1d/0x60
Dec 10 02:02:08 pve kernel:  filemap_alloc_folio+0xfd/0x110
Dec 10 02:02:08 pve kernel:  __filemap_get_folio+0x1d4/0x3c0
Dec 10 02:02:08 pve kernel:  ? psi_group_change+0x219/0x530
Dec 10 02:02:08 pve kernel:  filemap_fault+0x14a/0x940
Dec 10 02:02:08 pve kernel:  ? filemap_map_pages+0x14b/0x6f0
Dec 10 02:02:08 pve kernel:  __do_fault+0x36/0x150
Dec 10 02:02:08 pve kernel:  do_fault+0x1c7/0x430
Dec 10 02:02:08 pve kernel:  __handle_mm_fault+0x6d9/0x1070
Dec 10 02:02:08 pve kernel:  handle_mm_fault+0x119/0x330
Dec 10 02:02:08 pve kernel:  do_user_addr_fault+0x1c1/0x720
Dec 10 02:02:08 pve kernel:  exc_page_fault+0x80/0x1b0
Dec 10 02:02:08 pve kernel:  asm_exc_page_fault+0x27/0x30
Dec 10 02:02:08 pve kernel: RIP: 0033:0x7fc891f7c303
Dec 10 02:02:08 pve kernel: Code: Unable to access opcode bytes at 0x7fc891f7c2d9.
Dec 10 02:02:08 pve kernel: RSP: 002b:00007fffec059cf8 EFLAGS: 00010202
Dec 10 02:02:08 pve kernel: RAX: 0000000000000000 RBX: ffffffffffffff78 RCX: 00007fc891f7c303
Dec 10 02:02:08 pve kernel: RDX: 00007fffec059d10 RSI: 0000000000000000 RDI: 0000000000000000
Dec 10 02:02:08 pve kernel: RBP: 000000000000001d R08: 0000000000000000 R09: 000000000000010f
Dec 10 02:02:08 pve kernel: R10: 00007fffec059d10 R11: 0000000000000202 R12: 00005578018dcf78
Dec 10 02:02:08 pve kernel: R13: 00005577fd19ec88 R14: 00005577fc426030 R15: 00007fc8921a8020
Dec 10 02:02:08 pve kernel:  </TASK>
Dec 10 02:02:08 pve kernel: Mem-Info:
Dec 10 02:02:08 pve kernel: active_anon:1117405 inactive_anon:490517 isolated_anon:0
Dec 10 02:02:08 pve kernel: Node 0 active_anon:4469620kB inactive_anon:1962068kB active_file:168kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:37732kB dirty:40kB writeback:0kB shmem:41424kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 1263616kB writeback_tmp:0kB kernel_stack:4420kB pagetables:21552kB sec_pagetables:9260kB all_unreclaimable? no
Dec 10 02:02:08 pve kernel: Node 0 DMA free:13312kB boost:0kB min:64kB low:80kB high:96kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15976kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Dec 10 02:02:08 pve kernel: lowmem_reserve[]: 0 1826 15820 15820 15820
Dec 10 02:02:08 pve kernel: Node 0 DMA32 free:59884kB boost:0kB min:7796kB low:9744kB high:11692kB reserved_highatomic:0KB active_anon:1393616kB inactive_anon:469764kB active_file:56kB inactive_file:64kB unevictable:0kB writepending:0kB present:1997392kB managed:1930660kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Dec 10 02:02:08 pve kernel: lowmem_reserve[]: 0 0 13994 13994 13994
Dec 10 02:02:08 pve kernel: Node 0 Normal free:218780kB boost:109788kB min:169508kB low:184436kB high:199364kB reserved_highatomic:0KB active_anon:3076004kB inactive_anon:1492304kB active_file:468kB inactive_file:404kB unevictable:0kB writepending:40kB present:14680064kB managed:14338460kB mlocked:0kB bounce:0kB free_pcp:1112kB local_pcp:0kB free_cma:0kB
Dec 10 02:02:08 pve kernel: lowmem_reserve[]: 0 0 0 0 0
Dec 10 02:02:08 pve kernel: Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 13312kB
Dec 10 02:02:08 pve kernel: Node 0 DMA32: 1723*4kB (M) 699*8kB (UM) 268*16kB (UM) 155*32kB (UM) 86*64kB (UM) 26*128kB (UM) 17*256kB (M) 11*512kB (M) 5*1024kB (M) 5*2048kB (UM) 1*4096kB (M) = 60004kB
Dec 10 02:02:08 pve kernel: Node 0 Normal: 42679*4kB (UME) 4960*8kB (UME) 525*16kB (UME) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 218796kB
Dec 10 02:02:08 pve kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Dec 10 02:02:08 pve kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Dec 10 02:02:08 pve kernel: 10468 total pagecache pages
Dec 10 02:02:08 pve kernel: 0 pages in swap cache
Dec 10 02:02:08 pve kernel: Free swap  = 0kB
Dec 10 02:02:08 pve kernel: Total swap = 0kB
Dec 10 02:02:08 pve kernel: 4173358 pages RAM
Dec 10 02:02:08 pve kernel: 0 pages HighMem/MovableOnly
Dec 10 02:02:08 pve kernel: 102238 pages reserved
Dec 10 02:02:08 pve kernel: 0 pages hwpoisoned
Dec 10 02:02:08 pve kernel: Tasks state (memory values in pages):
Dec 10 02:02:08 pve kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Dec 10 02:02:08 pve kernel: [    885]     0   885    10308      288   110592        0          -250 systemd-journal
Dec 10 02:02:08 pve kernel: [    988]     0   988     6686      503    86016        0         -1000 systemd-udevd
Dec 10 02:02:08 pve kernel: [   1337]   103  1337     1968      160    53248        0             0 rpcbind
Dec 10 02:02:08 pve kernel: [   1348]   101  1348     2293      192    53248        0          -900 dbus-daemon
Dec 10 02:02:08 pve kernel: [   1355]     0  1355    38186      128    61440        0         -1000 lxcfs
Dec 10 02:02:08 pve kernel: [   1356]     0  1356    69538      160    81920        0             0 pve-lxc-syscall
Dec 10 02:02:08 pve kernel: [   1358]     0  1358     2970      480    61440        0             0 smartd
Dec 10 02:02:08 pve kernel: [   1359]     0  1359     1328       64    49152        0             0 qmeventd
Dec 10 02:02:08 pve kernel: [   1361]     0  1361     4157      288    73728        0             0 systemd-logind
Dec 10 02:02:08 pve kernel: [   1362]     0  1362      583       64    40960        0         -1000 watchdog-mux
Dec 10 02:02:08 pve kernel: [   1365]     0  1365    60163      320    94208        0             0 zed
Dec 10 02:02:08 pve kernel: [   1379]     0  1379     1766      115    53248        0             0 ksmtuned
Dec 10 02:02:08 pve kernel: [   1542]     0  1542     1256       96    49152        0             0 lxc-monitord
Dec 10 02:02:08 pve kernel: [   1557]     0  1557     1468       64    49152        0             0 agetty
Dec 10 02:02:08 pve kernel: [   1571]     0  1571     3850      384    69632        0         -1000 sshd
Dec 10 02:02:08 pve kernel: [   1578]   100  1578     4715      202    57344        0             0 chronyd
Dec 10 02:02:08 pve kernel: [   1584]   100  1584     2633      211    53248        0             0 chronyd
Dec 10 02:02:08 pve kernel: [   1794]     0  1794   126542      291   147456        0             0 rrdcached
Dec 10 02:02:08 pve kernel: [   1835]     0  1835   200788    11517   356352        0             0 pmxcfs
Dec 10 02:02:08 pve kernel: [   1930]     0  1930    10664      165    73728        0             0 master
Dec 10 02:02:08 pve kernel: [   1932]   104  1932    10810      192    69632        0             0 qmgr
Dec 10 02:02:08 pve kernel: [   1941]     0  1941     1652      128    57344        0             0 cron
Dec 10 02:02:08 pve kernel: [   1966]     0  1966    37371    22668   282624        0             0 pve-firewall
Dec 10 02:02:08 pve kernel: [   1968]     0  1968    36087    21364   299008        0             0 pvestatd
Dec 10 02:02:08 pve kernel: [   1971]     0  1971      615       96    40960        0             0 bpfilter_umh
Dec 10 02:02:08 pve kernel: [   1996]     0  1996    55825    31563   389120        0             0 pvedaemon
Dec 10 02:02:08 pve kernel: [   1997]     0  1997    55956    31660   397312        0             0 pvedaemon worke
Dec 10 02:02:08 pve kernel: [   1998]     0  1998    55956    31628   397312        0             0 pvedaemon worke
Dec 10 02:02:08 pve kernel: [   2001]     0  2001    55957    31660   397312        0             0 pvedaemon worke
Dec 10 02:02:08 pve kernel: [   2013]     0  2013    52837    26037   356352        0             0 pve-ha-crm
Dec 10 02:02:08 pve kernel: [   2014]    33  2014    56203    31961   409600        0             0 pveproxy
Dec 10 02:02:08 pve kernel: [   2015]    33  2015    56269    31993   413696        0             0 pveproxy worker
Dec 10 02:02:08 pve kernel: [   2016]    33  2016    56269    31993   413696        0             0 pveproxy worker
Dec 10 02:02:09 pve kernel: [   2017]    33  2017    56269    32025   413696        0             0 pveproxy worker
Dec 10 02:02:09 pve kernel: [   2021]    33  2021    19668    12554   176128        0             0 spiceproxy
Dec 10 02:02:09 pve kernel: [   2022]    33  2022    19725    12651   176128        0             0 spiceproxy work
Dec 10 02:02:09 pve kernel: [   2023]     0  2023    52691    25943   352256        0             0 pve-ha-lrm
Dec 10 02:02:09 pve kernel: [   2061]     0  2061  2417585  1543653 13611008        0             0 kvm
Dec 10 02:02:09 pve kernel: [   2298]     0  2298    51612    26201   352256        0             0 pvescheduler
Dec 10 02:02:09 pve kernel: [3245162]     0 3245162    19796       96    53248        0             0 pvefw-logger
Dec 10 02:02:09 pve kernel: [3656585]     0 3656585     2124      129    57344        0             0 cron
Dec 10 02:02:09 pve kernel: [3656586]     0 3656586      644       64    40960        0             0 sh
Dec 10 02:02:09 pve kernel: [3656587]     0 3656587     1733       96    49152        0             0 backup.sh
Dec 10 02:02:09 pve kernel: [3656677]     0 3656677    51763    26013   401408        0             0 qm
Dec 10 02:02:09 pve kernel: [3656678]     0 3656678    53538    26101   368640        0             0 task UPID:pve:0
Dec 10 02:02:09 pve kernel: [3708720]   104 3708720    10764      224    69632        0             0 pickup
Dec 10 02:02:09 pve kernel: [3748914]     0 3748914     1366       64    45056        0             0 sleep
Dec 10 02:02:09 pve kernel: [3774475]     0 3774475    53420    26202   360448        0             0 pvescheduler
Dec 10 02:02:09 pve kernel: [3774477]     0 3774477    51612    26170   352256        0             0 pvescheduler
Dec 10 02:02:09 pve kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=cron.service,mems_allowed=0,global_oom,task_memcg=/qemu.slice/100.scope,task=kvm,pid=2061,uid=0
Dec 10 02:02:09 pve kernel: Out of memory: Killed process 2061 (kvm) total-vm:9670340kB, anon-rss:6174228kB, file-rss:384kB, shmem-rss:0kB, UID:0 pgtables:13292kB oom_score_adj:0
Dec 10 02:02:09 pve kernel: fwbr100i0: port 2(tap100i0) entered disabled state
Dec 10 02:02:09 pve kernel: fwbr100i0: port 2(tap100i0) entered disabled state
Dec 10 02:02:05 pve systemd[1]: 100.scope: A process of this unit has been killed by the OOM killer.
Dec 10 02:02:05 pve qm[3656678]: VM 100 qmp command failed - VM 100 not running
Dec 10 02:02:09 pve qmeventd[3776821]: Starting cleanup for 100
Dec 10 02:02:09 pve qmeventd[3776821]: trying to acquire lock...
Dec 10 02:02:05 pve systemd[1]: 100.scope: Failed with result 'oom-kill'.
Dec 10 02:02:05 pve qm[3656678]: VM 100 qmp command failed - VM 100 not running
Dec 10 02:02:05 pve systemd[1]: 100.scope: Consumed 21h 44min 2.890s CPU time.
Dec 10 02:02:05 pve qm[3656678]: VM 100 qmp command failed - VM 100 not running
Dec 10 02:02:05 pve systemd[1]: qemu.slice: A process of this unit has been killed by the OOM killer.
Dec 10 02:02:05 pve qm[3656678]: VM 100 qmp command failed - VM 100 not running
Dec 10 02:02:05 pve qm[3656678]: VM 100 qmp command failed - VM 100 not running
Dec 10 02:02:12 pve qmeventd[3776821]:  OK
Dec 10 02:02:12 pve qm[3656678]: clone failed: block job (mirror) error: VM 100 not running
Dec 10 02:02:12 pve qm[3656677]: <root@pam> end task UPID:pve:0037CBE6:08DCB0B0:65750D92:qmclone:100:root@pam: clone failed: block job (mirror) error: VM 100 not running
Dec 10 02:02:12 pve kernel: fwbr100i0: port 1(fwln100i0) entered disabled state
Dec 10 02:02:12 pve kernel: vmbr0: port 2(fwpr100p0) entered disabled state
Dec 10 02:02:12 pve kernel: device fwln100i0 left promiscuous mode
Dec 10 02:02:12 pve kernel: fwbr100i0: port 1(fwln100i0) entered disabled state
Dec 10 02:02:12 pve kernel: device fwpr100p0 left promiscuous mode
Dec 10 02:02:12 pve kernel: vmbr0: port 2(fwpr100p0) entered disabled state
Dec 10 02:02:12 pve qmeventd[3776821]: Finished cleanup for 100
Dec 10 02:02:13 pve CRON[3656585]: pam_unix(cron:session): session closed for user root

where, in my ignorance, I can read "out of memory", "OOM killer" and "VM 100 not running"...

Do you know what happened?

@leesteken: this happened on the host running the WD Red HDDs; I don't know if the two issues are related...
 
Last edited:
And sorry... pveversion -v:

Code:
proxmox-ve: 8.0.1 (running kernel: 6.2.16-3-pve)
pve-manager: 8.0.3 (running version: 8.0.3/bbf3993334bfa916)
pve-kernel-6.2: 8.0.2
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph-fuse: 17.2.6-pve1+3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx2
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-3
libknet1: 1.25-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.0
libpve-access-control: 8.0.3
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.5
libpve-guest-common-perl: 5.0.3
libpve-http-server-perl: 5.0.3
libpve-rs-perl: 0.8.3
libpve-storage-perl: 8.0.1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 2.99.0-1
proxmox-backup-file-restore: 2.99.0-1
proxmox-kernel-helper: 8.0.2
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.0.5
pve-cluster: 8.0.1
pve-container: 5.0.3
pve-docs: 8.0.3
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.2
pve-firmware: 3.7-1
pve-ha-manager: 4.0.2
pve-i18n: 3.0.4
pve-qemu-kvm: 8.0.2-3
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.6
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.12-pve1

and qm config 100:

Code:
boot: order=scsi0;ide2;net0
cores: 4
cpu: host
ide2: local:iso/SNG7-PBX16-64bit-2302-1.iso,media=cdrom,size=2375M
memory: 8192
meta: creation-qemu=8.0.2,ctime=1689245019
name: Freepbx
net0: virtio=1A:E1:F2:CF:E8:1C,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: l26
scsi0: local-zfs:vm-100-disk-0,discard=on,iothread=1,size=50G
scsihw: virtio-scsi-single
smbios1: uuid=7791891e-cbb9-407d-9fd9-b91431702134
sockets: 1
vmgenid: 7c92a789-d6f0-4d2b-9969-52291beb1efc

The host machine (HP ProLiant MicroServer Gen10 Plus) is equipped with 16 GB of RAM.
 
Last edited:
Now I have to investigate why the VM suddenly stops

Bash:
Dec 10 02:02:09 pve kernel: Out of memory: Killed process 2061 (kvm) total-vm:9670340kB, anon-rss:6174228kB, file-rss:384kB, shmem-rss:0kB, UID:0 pgtables:13292kB oom_score_adj:0
[...]
Dec 10 02:02:05 pve systemd[1]: 100.scope: A process of this unit has been killed by the OOM killer.

PS: Instead of writing all outputs in bold, it would be much better to put them in code-tags...
 
where, in my ignorance, I can read "out of memory", "OOM killer" and "VM 100 not running"...

Do you know what happened?

memory: 8192
[...]
scsi0: local-zfs:vm-100-disk-0,discard=on,iothread=1,size=50G
The host machine (HP ProLiant MicroServer Gen10 Plus) is equipped with 16 GB of RAM.

Installations done from an ISO older than the PVE 8.1 one defaulted to allowing up to 50% of the host RAM for the ZFS ARC.
See here on how to limit it:
https://pve.proxmox.com/wiki/ZFS_on_Linux#sysadmin_zfs_limit_memory_usage
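For context: with 16 GB of host RAM that default caps the ARC at roughly 8 GB, which, together with the 8 GB assigned to VM 100, the full-clone worker and the PVE services, can exhaust the host and wake the OOM killer. As an illustration of what the wiki page describes, a cap of 4 GiB would look like this (the exact value is a choice, not a requirement):

Code:
# /etc/modprobe.d/zfs.conf -- cap the ZFS ARC at 4 GiB (4 * 1024^3 bytes)
options zfs zfs_arc_max=4294967296

Since the root file system here is on ZFS (rpool), the initramfs has to be refreshed afterwards with update-initramfs -u -k all, and the new limit takes effect after a reboot.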
 
Thanks.
I set it to 4 GB: since the minimum recommendation seems to be 2 GB as a base plus 1 GB per TB of storage, and I have 500 GB of storage (so roughly 2.5 GB), I think 4 GB should be fine.
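For anyone following along, a sketch of how a cap like that can also be applied and verified without a reboot (the sysfs/procfs paths are the standard ZFS-on-Linux ones; the 4 GiB figure just matches the value chosen above):

Bash:
# apply the 4 GiB cap immediately (does not persist across reboots on its own)
echo $((4 * 1024*1024*1024)) > /sys/module/zfs/parameters/zfs_arc_max

# c_max should now report the new cap and size should stay below it (values in bytes)
awk '$1 == "c_max" || $1 == "size" {print $1, $3}' /proc/spl/kstat/zfs/arcstats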
 
However, not being an IT engineer or a computing expert, from an outsider's point of view it seems at least "weird" to ship an environment that by default says "128 GB of RAM? Well, 64 are mine", with no regard for the virtual machines I host (the systems I actually need), as if they were expendable and could be killed...
 
Last edited:
Not being an IT engineer or a computing expert, from an outsider's point of view it seems at least "weird" to ship an environment that by default says "128 GB of RAM? Well, 64 are mine", with no regard for the virtual machines I host (the systems I actually need), as if they were expendable and could be killed...
PVE is designed for enterprise usage and it's all in the manual. The choice to use ZFS is made by the administrator, and ZFS takes up to 50% of RAM by default; this is not Proxmox-specific.
 
However, not being an IT engineer or a computing expert, from an outsider's point of view it seems at least "weird" to ship an environment that by default says "128 GB of RAM? Well, 64 are mine", with no regard for the virtual machines I host (the systems I actually need), as if they were expendable and could be killed...

Like @leesteken already said, 50% is simply the (current) default from ZFS on Linux; but Proxmox has now changed it, starting with the PVE 8.1 ISO:
Installation ISO
  • The arc_max parameter for installations on ZFS can now be set in the Advanced Options. If not explicitly set by the user, it is set to a value targeting 10% of system memory instead of 50%, which is a better fit for a virtualization workload (issue 4829).
https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_8.1
 