RAM consumption (Cannot allocate memory)

borisko · Jul 30, 2019

Hi everyone,

First off, I am well aware of the famous "Linux ate my RAM" page, and I am posting here because my issue makes no sense.

I am also aware of the usual RAM requirement of 1GB per TB of storage for ZFS, but again, that seems a bit odd in my situation.

I have 5 VMs on my server, of which only ONE is started (in bold, I named them below for simpler explanation). All others are off.

2 MacOS VMs:
(off) Template Macos, 8GB Ram, 60GB disc
(on) Work Macos, 32GB Ram, 900GB disc

3 Windows VMs:
(off) Template Win10, 8GB RAM, 60GB disc
(off) Test Win10, 2GB RAM, 60GB disc
(off) Gaming Win10, 32GB RAM, 60GB disc

When trying to start the "Test" one of 2GB, I get the classic "Cannot allocate memory"

Code:

kvm: cannot set up guest memory 'pc.ram': Cannot allocate memory

Complete error at the bottom of this post.

I have a total of 64GB of RAM on this system, with a Raidz1 pool of 3x1TB hard drives. Most of the storage of my Mac VM is on an NVMe drive that I passed through, so there is not much storage on the node's pool so far: 400GB allocated in total across the node.

How is all my RAM used, and why (or how?) can't I recover some to open a tiny 2GB vm on the go? Is there something I am missing here?

3GB for storage
8GB for Hypervisor
32GB for running VM
----------
We're at 43GB max, I should have at least 10 to 15GB left.

Should I just throw in a spare consumer 120GB drive and add it as an L2ARC/ZIL drive? From what I understood, I shouldn't really need one.

Or should I assign only very small amounts of RAM to the other VMs even when they're off, and bump up their RAM before starting them when I need it? Is that why all the RAM is considered used in this situation? I might have completely overlooked that parameter, but I don't understand how I could run out of RAM or why the system would consider it used.

I understand the principles of overcommitting though, so if that is normal behavior, fine by me: I can adjust the RAM sizes of my unused VMs and bump them up before starting them whenever needed.

Sorry in advance if that's a stupid question.

Complete Task error below:

Code:

TASK ERROR: start failed: command '/usr/bin/kvm -id 101 -name testwin -chardev 'socket,id=qmp,path=/var/run/qemu-server/101.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/101.pid -daemonize -smbios 'type=1,uuid=2c337003-cde9-4508-9326-38c660d9e99d' -drive 'if=pflash,unit=0,format=raw,readonly,file=/usr/share/pve-edk2-firmware//OVMF_CODE.fd' -drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,file=/dev/zvol/rpool/data/vm-101-disk-0' -smp '2,sockets=1,cores=2,maxcpus=2' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vnc unix:/var/run/qemu-server/101.vnc,x509,password -no-hpet -cpu 'host,+kvm_pv_unhalt,+kvm_pv_eoi,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_reset,hv_vpindex,hv_runtime,hv_relaxed,hv_synic,hv_stimer' -m 2048 -device 'vmgenid,guid=c262f99d-01d7-4312-bdd2-0ab82ec0f654' -readconfig /usr/share/qemu-server/pve-q35.cfg -device 'usb-host,vendorid=0x1b1c,productid=0x0c19,id=usb0' -device 'usb-host,vendorid=0x1b1c,productid=0x0c0b,id=usb1' -device 'qxl-vga,id=vga,bus=pcie.0,addr=0x1' -chardev 'socket,path=/var/run/qemu-server/101.qga,server,nowait,id=qga0' -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' -spice 'tls-port=61000,addr=127.0.0.1,tls-ciphers=HIGH,seamless-migration=on' -device 'virtio-serial,id=spice,bus=pci.0,addr=0x9' -chardev 'spicevmc,id=vdagent,name=vdagent' -device 'virtserialport,chardev=vdagent,name=com.redhat.spice.0' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:8b9e5279db' -drive 'file=/dev/zvol/rpool/data/vm-101-disk-1,if=none,id=drive-ide0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'ide-hd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,rotation_rate=1,bootindex=100' -drive 
'file=/rpool/iso/template/iso/Win10_1903_V1_EnglishInternational_x64.iso,if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -netdev 'type=tap,id=net0,ifname=tap101i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown' -device 'e1000,mac=FA:78:25:78:9F:B6,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' -rtc 'driftfix=slew,base=localtime' -machine 'type=q35' -global 'kvm-pit.lost_tick_policy=discard'' failed: exit code 1

LnxBil · Jul 30, 2019

That sounds really strange. Could you please post the output of free -g in CODE tags?

spirit · Jul 31, 2019

Have you tuned ZFS so that it does not eat half of your memory, as it does by default?

borisko · Aug 4, 2019

Hi, sorry for my late reply, I couldn't replicate the issue until now. Seems fine when I restart the server, but after a day or 2 of usage without turning it off, the problem appears again.

Here is the output of free -g

Code:

root@serv0:~# free -g
              total        used        free      shared  buff/cache   available
Mem:             62          61           1           0           0           0
Swap:             0           0           0

@spirit: No, I didn't do any particular ZFS configuration. Is there a detailed guide on how to do this that I could follow without risking my current data and setup?

Let me know if I should run more tests on this, and thanks for your help as always!

LnxBil · Aug 4, 2019

borisko said:
Here is the output of free -g

Code:

root@serv0:~# free -g
              total        used        free      shared  buff/cache   available
Mem:             62          61           1           0           0           0
Swap:             0           0           0

We now know that you actually run out of memory. Normally, your machine should reclaim ZFS ARC space if it needs it so that it does not run out of memory. Could you please also post the output of arcstat?
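If arcstat is not at hand, the same numbers can usually be read from the kstat file that ZFS on Linux exposes. The following is a minimal sketch, assuming the file lives at `/proc/spl/kstat/zfs/arcstats` and uses `name type data` rows; the helper names `parse_arcstats` and `to_gib` are made up for this illustration.

```python
from pathlib import Path

# Assumed location of the ARC statistics on ZFS on Linux.
ARCSTATS = Path("/proc/spl/kstat/zfs/arcstats")


def parse_arcstats(text: str) -> dict:
    """Parse 'name type data' rows; header rows are skipped."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly three fields and a numeric last field.
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def to_gib(n: int) -> float:
    """Convert a byte count to GiB."""
    return n / 1024 ** 3


if __name__ == "__main__" and ARCSTATS.exists():
    s = parse_arcstats(ARCSTATS.read_text())
    print(f"ARC size {to_gib(s['size']):.2f} GiB, "
          f"target {to_gib(s['c']):.2f} GiB, "
          f"max {to_gib(s['c_max']):.2f} GiB")
```

On a host without ZFS loaded the script prints nothing, since the file does not exist.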

borisko said:
@spirit: No, I didn't do any particular ZFS configuration. Is there a detailed guide on how to do this that I could follow without risking my current data and setup?

Just follow the guide:

https://pve.proxmox.com/wiki/ZFS_on_Linux#_limit_zfs_memory_usage
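The limit described in that guide is given as a byte count. Here is a small sketch of the conversion, assuming the 8 GiB cap floated in this thread and the `zfs_arc_max` module option; the function names are hypothetical, and the linked page should be checked for your version before applying anything.

```python
GIB = 1024 ** 3  # bytes per GiB


def arc_max_bytes(gib: int) -> int:
    """Convert an ARC cap given in GiB to the byte value zfs_arc_max expects."""
    if gib <= 0:
        raise ValueError("ARC cap must be a positive number of GiB")
    return gib * GIB


def modprobe_line(gib: int) -> str:
    """Build the module option line for the chosen cap."""
    return f"options zfs zfs_arc_max={arc_max_bytes(gib)}"


if __name__ == "__main__":
    # 8 GiB is only the figure discussed in this thread, not a recommendation.
    print(modprobe_line(8))
```

The printed line is what would go into the modprobe configuration file named in the guide (typically `/etc/modprobe.d/zfs.conf`). On a system booting from ZFS, the guide also asks for an initramfs refresh and a reboot before the limit takes effect.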

borisko · Aug 4, 2019

Thanks for your help @LnxBil .

Here is the output:

Code:

root@serv0:~# arcstat
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c 
20:46:28    34     0      0     0    0     0    0     0    0    26G   26G

I will study the wiki you linked to and see if I can sort it out that way. Any other suggestion is welcome though!
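Putting the reported numbers together shows where the memory went. This is a rough sketch that uses only figures posted in this thread and assumes the running VM has actually touched its full 32 GiB allocation, which is not confirmed here.

```python
# Figures reported in this thread, all in GiB.
TOTAL_RAM = 62.0   # "total" column of free -g
ARC_SIZE = 26.0    # "arcsz" column of arcstat
RUNNING_VM = 32.0  # RAM assigned to the one running VM (assumed fully used)
NEW_VM = 2.0       # RAM assigned to the VM that fails to start


def headroom(total, arc, running_vm):
    """RAM left for the host and any new guest once ARC and the VM are counted."""
    return total - arc - running_vm


left = headroom(TOTAL_RAM, ARC_SIZE, RUNNING_VM)
print(f"left for host + new VMs: {left:.2f} GiB")
print(f"left after starting the 2 GiB VM: {left - NEW_VM:.2f} GiB")
```

Under these assumptions only about 4 GiB remain for the hypervisor itself and everything else, which is in line with free reporting roughly 1 GiB free.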

borisko · Aug 4, 2019

On a side note, as long as I have roughly 2-3TB of disc space max on the system, giving ZFS a max of 8GB should be plenty, correct? I don't have enough experience with ZFS to know what is acceptable or recommended; the only info I found online is the rule of 1GB of memory per 1TB of disc space. Thanks!

EDIT: Full arc summary if needed below:

Code:

root@serv0:~# arc_summary

------------------------------------------------------------------------
ZFS Subsystem Report                Sun Aug 04 20:55:07 2019
ARC Summary: (HEALTHY)
    Memory Throttle Count:            0

ARC Misc:
    Deleted:                1.12M
    Mutex Misses:                2
    Evict Skips:                5

ARC Size:                85.02%    26.71    GiB
    Target Size: (Adaptive)        85.05%    26.72    GiB
    Min Size (Hard Limit):        6.25%    1.96    GiB
    Max Size (High Water):        16:1    31.42    GiB

ARC Size Breakdown:
    Recently Used Cache Size:    91.18%    23.09    GiB
    Frequently Used Cache Size:    8.82%    2.23    GiB

ARC Hash Breakdown:
    Elements Max:                4.44M
    Elements Current:        99.73%    4.43M
    Collisions:                4.35M
    Chain Max:                7
    Chains:                    841.46k

ARC Total accesses:                    38.77M
    Cache Hit Ratio:        94.89%    36.79M
    Cache Miss Ratio:        5.11%    1.98M
    Actual Hit Ratio:        94.62%    36.69M

    Data Demand Efficiency:        93.05%    16.80M
    Data Prefetch Efficiency:    60.08%    185.08k

    CACHE HITS BY CACHE LIST:
      Anonymously Used:        0.13%    48.71k
      Most Recently Used:        25.27%    9.30M
      Most Frequently Used:        74.44%    27.39M
      Most Recently Used Ghost:    0.02%    7.37k
      Most Frequently Used Ghost:    0.14%    49.71k

    CACHE HITS BY DATA TYPE:
      Demand Data:            42.48%    15.63M
      Prefetch Data:        0.30%    111.19k
      Demand Metadata:        57.13%    21.02M
      Prefetch Metadata:        0.08%    31.17k

    CACHE MISSES BY DATA TYPE:
      Demand Data:            58.88%    1.17M
      Prefetch Data:        3.73%    73.89k
      Demand Metadata:        37.12%    735.60k
      Prefetch Metadata:        0.28%    5.46k


DMU Prefetch Efficiency:                    10.20M
    Hit Ratio:            2.61%    265.77k
    Miss Ratio:            97.39%    9.93M



ZFS Tunables:
    dbuf_cache_hiwater_pct                            10
    dbuf_cache_lowater_pct                            10
    dbuf_cache_max_bytes                              104857600
    dbuf_cache_max_shift                              5
    dmu_object_alloc_chunk_shift                      7
    ignore_hole_birth                                 1
    l2arc_feed_again                                  1
    l2arc_feed_min_ms                                 200
    l2arc_feed_secs                                   1
    l2arc_headroom                                    2
    l2arc_headroom_boost                              200
    l2arc_noprefetch                                  1
    l2arc_norw                                        0
    l2arc_write_boost                                 8388608
    l2arc_write_max                                   8388608
    metaslab_aliquot                                  524288
    metaslab_bias_enabled                             1
    metaslab_debug_load                               0
    metaslab_debug_unload                             0
    metaslab_fragmentation_factor_enabled             1
    metaslab_lba_weighting_enabled                    1
    metaslab_preload_enabled                          1
    metaslabs_per_vdev                                200
    send_holes_without_birth_time                     1
    spa_asize_inflation                               24
    spa_config_path                                   /etc/zfs/zpool.cache
    spa_load_verify_data                              1
    spa_load_verify_maxinflight                       10000
    spa_load_verify_metadata                          1
    spa_slop_shift                                    5
    zfetch_array_rd_sz                                1048576
    zfetch_max_distance                               8388608
    zfetch_max_streams                                8
    zfetch_min_sec_reap                               2
    zfs_abd_scatter_enabled                           1
    zfs_abd_scatter_max_order                         10
    zfs_admin_snapshot                                1
    zfs_arc_average_blocksize                         8192
    zfs_arc_dnode_limit                               0
    zfs_arc_dnode_limit_percent                       10
    zfs_arc_dnode_reduce_percent                      10
    zfs_arc_grow_retry                                0
    zfs_arc_lotsfree_percent                          10
    zfs_arc_max                                       0
    zfs_arc_meta_adjust_restarts                      4096
    zfs_arc_meta_limit                                0
    zfs_arc_meta_limit_percent                        75
    zfs_arc_meta_min                                  0
    zfs_arc_meta_prune                                10000
    zfs_arc_meta_strategy                             1
    zfs_arc_min                                       0
    zfs_arc_min_prefetch_lifespan                     0
    zfs_arc_p_dampener_disable                        1
    zfs_arc_p_min_shift                               0
    zfs_arc_pc_percent                                0
    zfs_arc_shrink_shift                              0
    zfs_arc_sys_free                                  0
    zfs_autoimport_disable                            1
    zfs_checksums_per_second                          20
    zfs_compressed_arc_enabled                        1
    zfs_dbgmsg_enable                                 0
    zfs_dbgmsg_maxsize                                4194304
    zfs_dbuf_state_index                              0
    zfs_deadman_checktime_ms                          5000
    zfs_deadman_enabled                               1
    zfs_deadman_synctime_ms                           1000000
    zfs_dedup_prefetch                                0
    zfs_delay_min_dirty_percent                       60
    zfs_delay_scale                                   500000
    zfs_delays_per_second                             20
    zfs_delete_blocks                                 20480
    zfs_dirty_data_max                                4294967296
    zfs_dirty_data_max_max                            4294967296
    zfs_dirty_data_max_max_percent                    25
    zfs_dirty_data_max_percent                        10
    zfs_dirty_data_sync                               67108864
    zfs_dmu_offset_next_sync                          0
    zfs_expire_snapshot                               300
    zfs_flags                                         0
    zfs_free_bpobj_enabled                            1
    zfs_free_leak_on_eio                              0
    zfs_free_max_blocks                               100000
    zfs_free_min_time_ms                              1000
    zfs_immediate_write_sz                            32768
    zfs_max_recordsize                                1048576
    zfs_mdcomp_disable                                0
    zfs_metaslab_fragmentation_threshold              70
    zfs_metaslab_segment_weight_enabled               1
    zfs_metaslab_switch_threshold                     2
    zfs_mg_fragmentation_threshold                    85
    zfs_mg_noalloc_threshold                          0
    zfs_multihost_fail_intervals                      5
    zfs_multihost_history                             0
    zfs_multihost_import_intervals                    10
    zfs_multihost_interval                            1000
    zfs_multilist_num_sublists                        0
    zfs_no_scrub_io                                   0
    zfs_no_scrub_prefetch                             0
    zfs_nocacheflush                                  0
    zfs_nopwrite_enabled                              1
    zfs_object_mutex_size                             64
    zfs_pd_bytes_max                                  52428800
    zfs_per_txg_dirty_frees_percent                   30
    zfs_prefetch_disable                              0
    zfs_read_chunk_size                               1048576
    zfs_read_history                                  0
    zfs_read_history_hits                             0
    zfs_recover                                       0
    zfs_recv_queue_length                             16777216
    zfs_resilver_delay                                2
    zfs_resilver_min_time_ms                          3000
    zfs_scan_idle                                     50
    zfs_scan_ignore_errors                            0
    zfs_scan_min_time_ms                              1000
    zfs_scrub_delay                                   4
    zfs_send_corrupt_data                             0
    zfs_send_queue_length                             16777216
    zfs_sync_pass_deferred_free                       2
    zfs_sync_pass_dont_compress                       5
    zfs_sync_pass_rewrite                             2
    zfs_sync_taskq_batch_pct                          75
    zfs_top_maxinflight                               32
    zfs_txg_history                                   0
    zfs_txg_timeout                                   5
    zfs_vdev_aggregation_limit                        131072
    zfs_vdev_async_read_max_active                    3
    zfs_vdev_async_read_min_active                    1
    zfs_vdev_async_write_active_max_dirty_percent     60
    zfs_vdev_async_write_active_min_dirty_percent     30
    zfs_vdev_async_write_max_active                   10
    zfs_vdev_async_write_min_active                   2
    zfs_vdev_cache_bshift                             16
    zfs_vdev_cache_max                                16384
    zfs_vdev_cache_size                               0
    zfs_vdev_max_active                               1000
    zfs_vdev_mirror_non_rotating_inc                  0
    zfs_vdev_mirror_non_rotating_seek_inc             1
    zfs_vdev_mirror_rotating_inc                      0
    zfs_vdev_mirror_rotating_seek_inc                 5
    zfs_vdev_mirror_rotating_seek_offset              1048576
    zfs_vdev_queue_depth_pct                          1000
    zfs_vdev_raidz_impl                               [fastest] original scalar sse2 ssse3 avx2
    zfs_vdev_read_gap_limit                           32768
    zfs_vdev_scheduler                                noop
    zfs_vdev_scrub_max_active                         2
    zfs_vdev_scrub_min_active                         1
    zfs_vdev_sync_read_max_active                     10
    zfs_vdev_sync_read_min_active                     10
    zfs_vdev_sync_write_max_active                    10
    zfs_vdev_sync_write_min_active                    10
    zfs_vdev_write_gap_limit                          4096
    zfs_zevent_cols                                   80
    zfs_zevent_console                                0
    zfs_zevent_len_max                                256
    zil_replay_disable                                0
    zil_slog_bulk                                     786432
    zio_delay_max                                     30000
    zio_dva_throttle_enabled                          1
    zio_requeue_io_start_cut_in_line                  1
    zio_taskq_batch_pct                               75
    zvol_inhibit_dev                                  0
    zvol_major                                        230
    zvol_max_discard_blocks                           16384
    zvol_prefetch_bytes                               131072
    zvol_request_sync                                 0
    zvol_threads                                      32
    zvol_volmode                                      1

borisko · Aug 4, 2019

If needed, I can get my hands on a spare 128GB SSD to use as cache, if that would help with the whole thing in your opinion (can I add L2ARC cache after the fact?). I am quite happy with the performance of the whole setup so far despite that little issue, so I don't mind adding to it to make it better instead of crippling it, if that makes any sense.

LnxBil · Aug 7, 2019

borisko said:
Can I add L2ARC cache after the fact ?

Yes

borisko said:
If needed, I can get my hands on a spare 128GB SSD to use as cache, if that would help with the whole thing in your opinion

Adding L2ARC reduces the ARC (the L2ARC headers are themselves kept in RAM), so it may not be what you want.

borisko said:
I am quite happy with the performance of the whole setup so far despite that little issue, so I don't mind adding to it to make it better instead of crippling it, if that makes any sense.

What about adding more RAM?

borisko · Aug 10, 2019

Indeed, L2ARC seems counter-productive given that my data is on a raidz1 pool of 3 SSDs. I don't see how the additional SSD would help, or did I get this wrong?

As for RAM, even if my motherboard supports up to 128GB (4 sockets), I'm currently sitting at 64GB, and I don't really see why I would need more for a single VM and a small Windows machine on the side. I will rarely need more than those two up at any time.

How much RAM do you think ZFS needs for my config? Is 8GB fine, or should I go for 16GB?

LnxBil · Aug 11, 2019

borisko said:
How much RAM do you think ZFS needs for my config? Is 8GB fine, or should I go for 16GB?

Generally speaking, the smaller your ARC, the slower your system. Whether that performance is sufficient for you, I do not know :-D

I'm running ZFS on boxes with only 4 GB of RAM, and I also ran ZFS on a Raspberry Pi with only 1 GB of RAM. It all works, but do not expect miracles from the performance. IMHO, if you have a pool of only SSDs, the ARC does not have to be so big; for spindles, this is a totally different matter.

RolandK · Mar 12, 2020

qemu/kvm memory allocation issue.

workaround: add swap

see: https://bugzilla.proxmox.com/show_bug.cgi?id=2157
see: https://github.com/openzfs/zfs/issues/6427
