[SOLVED] Backup causes ZFS: arc_reclaim/arc_prune to nearly freeze system

Yakuraku

Hi,

I have a backup job scheduled that backs up my system daily.
Sometimes, while the backup is running, the system spawns one arc_reclaim process and many arc_prune processes. Together they eat up the CPU time and cause the VMs that are still running to hit timeout errors of various kinds. Even SSH to the host lags!

Source and destination are separate ZFS pools (on separate disks) in the same server, so networking is not a factor. Both have mirrored SSDs as log devices; the source also has SSDs as cache devices.

The problem never resolved itself until I rebooted (I only had patience for a few hours).

Code:
root@host:~# pveversion
pve-manager/5.4-7/fc10404a (running kernel: 4.15.18-16-pve)

Here is a screenshot of htop when this happens:
Bildschirmfoto 2019-07-14 um 09.41.21.png


Edit: I just saw that a scheduled ZFS scrub is running on one of the pools today. I think that could be causing the problem. Is there a trick to let a scrub and a backup run at the same time (a bwlimit on the backup, maybe)?
Edit: I killed the backup job, and the ZFS arc_reclaim and arc_prune processes are still at 100% CPU :(
Edit: I stopped the scrub with zpool scrub -s, but the processes are still at max CPU :/
 
Why don't you use pve-zsync instead of backup?
qemu backups are problematic with ZFS in my experience too
 
The reason is quite simple: I use the PVE web UI for simplicity whenever I can, for example to configure backups.

I have just read a bit about pve-zsync, and it only replicates your VMs to a second PVE server, so it is not a replacement for a backup. Or am I missing something?
 
On the main topic: is there no advice on how to run a backup job (vzdump) on ZFS in a way that prevents this potentially devastating behavior? If there is none, a warning that tells end users to use alternatives would be nice. In my case, I had to force-stop the VMs to reboot the server (a 30-minute wait for the VMs to shut down was too much for me), which led to (minor) ext4 filesystem errors.

About zsync I have some questions:
- Is it a full backup?
- Can it store multiple backup versions?
- Are there plans to add creation, modification, and monitoring of (local) zsync tasks to the web UI?
 
About zsync I have some questions:
- Is it a full backup?

Yes

- Can it store multiple backup versions?

Yes, zsync takes a snapshot and always does incremental backups (except the first time, of course).
You can keep a number of snapshots, which reside on both source and destination.

EDIT: it's always a full backup, but it sends only the differences; it's very efficient and reliable


- Are there plans to add creation, modification, and monitoring of (local) zsync tasks to the web UI?

If you have a cluster, you can use the replication from the UI, but you can't keep snapshots

I did read about a disaster recovery strategy from the UI, which includes pve-zsync, I think. +1 for this feature :)
 
Hi Yakuraku,

I would like to dig a little further into the actual problem, can you please post the outputs of the following commands:

# cat /proc/spl/kstat/zfs/arcstats
# zpool status
# zfs list
# df -h
 
cat /proc/spl/kstat/zfs/arcstats
Code:
13 1 0x01 96 26112 33681678111 262688631198035
name                            type data
hits                            4    4250996872
misses                          4    285396505
demand_data_hits                4    271067561
demand_data_misses              4    25224980
demand_metadata_hits            4    3971628106
demand_metadata_misses          4    1494089
prefetch_data_hits              4    3613228
prefetch_data_misses            4    252339861
prefetch_metadata_hits          4    4687977
prefetch_metadata_misses        4    6337575
mru_hits                        4    174071806
mru_ghost_hits                  4    3404547
mfu_hits                        4    4073100157
mfu_ghost_hits                  4    176642
deleted                         4    299413094
mutex_miss                      4    3355594086
access_skip                     4    49
evict_skip                      4    47648638885
evict_not_enough                4    4407102136
evict_l2_cached                 4    2114110135296
evict_l2_eligible               4    2226493770240
evict_l2_ineligible             4    212529987584
evict_l2_skip                   4    51753796
hash_elements                   4    65983704
hash_elements_max               4    66759026
hash_collisions                 4    477077001
hash_chains                     4    8359843
hash_chain_max                  4    25
p                               4    1378427617
c                               4    6702746189
c_min                           4    2109579776
c_max                           4    8589934592
size                            4    6607118128
compressed_size                 4    58740224
uncompressed_size               4    202206208
overhead_size                   4    202035200
hdr_size                        4    3885240
data_size                       4    104285696
metadata_size                   4    156489728
dbuf_size                       4    3981488
dnode_size                      4    3787560
bonus_size                      4    1165120
anon_size                       4    739328
anon_evictable_data             4    0
anon_evictable_metadata         4    0
mru_size                        4    234991104
mru_evictable_data              4    49152
mru_evictable_metadata          4    0
mru_ghost_size                  4    0
mru_ghost_evictable_data        4    0
mru_ghost_evictable_metadata    4    0
mfu_size                        4    25044992
mfu_evictable_data              4    0
mfu_evictable_metadata          4    0
mfu_ghost_size                  4    0
mfu_ghost_evictable_data        4    0
mfu_ghost_evictable_metadata    4    0
l2_hits                         4    12736771
l2_misses                       4    272649189
l2_feeds                        4    297209
l2_rw_clash                     4    16
l2_read_bytes                   4    66372276736
l2_write_bytes                  4    493952091136
l2_writes_sent                  4    178724
l2_writes_done                  4    178724
l2_writes_error                 4    0
l2_writes_lock_retry            4    1118
l2_evict_lock_retry             4    0
l2_evict_reading                4    0
l2_evict_l1cached               4    0
l2_free_on_write                4    986513
l2_abort_lowmem                 4    321
l2_cksum_bad                    4    0
l2_io_error                     4    0
l2_size                         4    550822236160
l2_asize                        4    450288787456
l2_hdr_size                     4    6333523296
memory_throttle_count           4    0
memory_direct_count             4    27215
memory_indirect_count           4    14557
memory_all_bytes                4    67506552832
memory_free_bytes               4    15824482304
memory_available_bytes          3    14769692672
arc_no_grow                     4    0
arc_tempreserve                 4    0
arc_loaned_bytes                4    0
arc_prune                       4    1550239706
arc_meta_used                   4    6502832432
arc_meta_limit                  4    6442450944
arc_dnode_limit                 4    644245094
arc_meta_max                    4    6762190216
arc_meta_min                    4    16777216
sync_wait_for_async             4    2822054
demand_hit_predictive_prefetch  4    245042740
arc_need_free                   4    0
arc_sys_free                    4    1054789888
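
One way to read the dump above is to compare a few of its counters against each other. The following is a minimal Python sketch, not a Proxmox or ZFS tool: `parse_arcstats` and `arc_pressure_report` are hypothetical helper names, and the sample text contains only values copied from the output above. On a live host the same text would come from /proc/spl/kstat/zfs/arcstats.

```python
GIB = 1024 ** 3

# A few rows copied verbatim from the arcstats output posted above.
SAMPLE = """\
13 1 0x01 96 26112 33681678111 262688631198035
name                            type data
size                            4    6607118128
hdr_size                        4    3885240
metadata_size                   4    156489728
dbuf_size                       4    3981488
dnode_size                      4    3787560
bonus_size                      4    1165120
l2_hdr_size                     4    6333523296
arc_meta_used                   4    6502832432
arc_meta_limit                  4    6442450944
"""


def parse_arcstats(text):
    """Return {name: value} for every 'name type data' row with an integer value."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly three fields; this skips the numeric
        # header row and the 'name type data' column header.
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def arc_pressure_report(stats):
    """Summarize metadata pressure (hypothetical helper, illustration only)."""
    components = ("hdr_size", "l2_hdr_size", "metadata_size",
                  "dbuf_size", "dnode_size", "bonus_size")
    component_sum = sum(stats[name] for name in components)
    return {
        "meta_over_limit": stats["arc_meta_used"] > stats["arc_meta_limit"],
        "meta_used_gib": stats["arc_meta_used"] / GIB,
        "meta_limit_gib": stats["arc_meta_limit"] / GIB,
        "components_match_meta_used": component_sum == stats["arc_meta_used"],
        "l2_hdr_share_of_arc": stats["l2_hdr_size"] / stats["size"],
    }


if __name__ == "__main__":
    for key, value in arc_pressure_report(parse_arcstats(SAMPLE)).items():
        print(f"{key}: {value}")
```

In this dump, arc_meta_used is above arc_meta_limit, and the six component sizes in the sketch add up exactly to arc_meta_used, with l2_hdr_size (the in-RAM headers for the cache devices) making up most of it. That may be why pruning could not free enough metadata, but it is a reading of these numbers, not a confirmed diagnosis.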

Note on the pools: the system was created before ZFS native encryption was available, so the disks are encrypted with cryptsetup (LUKS) and ZFS is built on top of that.
zpool status
Code:
  pool: pve1.sh-bb.de_tank
 state: ONLINE
  scan: scrub repaired 0B in 22h10m with 0 errors on Wed Jul 17 11:32:14 2019
config:

    NAME                                                           STATE     READ WRITE CKSUM
    pve1.sh-bb.de_tank                                             ONLINE       0     0     0
      mirror-0                                                     ONLINE       0     0     0
        crypt_ata-ST4000VN000-1H4168_Z303WGWV                      ONLINE       0     0     0
        crypt_ata-WDC_WD40EZRZ-22GXCB0_WD-WCC7K7VR4C0E             ONLINE       0     0     0
        crypt_ata-WDC_WD40EZRX-00SPEB0_WD-WCC4E3LLZ611             ONLINE       0     0     0
    logs
      mirror-1                                                     ONLINE       0     0     0
        crypt_ata-Samsung_SSD_750_EVO_250GB_S33SNB0H430526L-part1  ONLINE       0     0     0
        crypt_ata-KINGSTON_SHFS37A240G_50026B725A00B1D1-part1      ONLINE       0     0     0
        crypt_ata-Crucial_CT250MX200SSD1_162112C5FE88-part1        ONLINE       0     0     0
    cache
      crypt_ata-Samsung_SSD_750_EVO_250GB_S33SNB0H430526L-part3    ONLINE       0     0     0
      crypt_ata-Crucial_CT250MX200SSD1_162112C5FE88-part3          ONLINE       0     0     0
      crypt_ata-KINGSTON_SHFS37A240G_50026B725A00B1D1-part3        ONLINE       0     0     0

errors: No known data errors

  pool: pve1_sh-bb_de_storage
 state: ONLINE
  scan: scrub repaired 0B in 20h46m with 0 errors on Sun Jul 14 21:10:05 2019
config:

    NAME                                                           STATE     READ WRITE CKSUM
    pve1_sh-bb_de_storage                                          ONLINE       0     0     0
      mirror-0                                                     ONLINE       0     0     0
        crypt_ata-ST8000AS0002-1NA17Z_Z840KK1Q                     ONLINE       0     0     0
        crypt_ata-ST8000VN0002-1Z8112_ZA11X7L6                     ONLINE       0     0     0
        crypt_ata-WDC_WD80EFZX-68UW8N0_VKKWL5SY                    ONLINE       0     0     0
    logs
      mirror-1                                                     ONLINE       0     0     0
        crypt_ata-Samsung_SSD_750_EVO_250GB_S33SNB0H430526L-part2  ONLINE       0     0     0
        crypt_ata-KINGSTON_SHFS37A240G_50026B725A00B1D1-part2      ONLINE       0     0     0
        crypt_ata-Crucial_CT250MX200SSD1_162112C5FE88-part2        ONLINE       0     0     0

errors: No known data errors

  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 6h54m with 0 errors on Sun Jul 14 11:39:00 2019
config:

    NAME                                                    STATE     READ WRITE CKSUM
    rpool                                                   ONLINE       0     0     0
      mirror-0                                              ONLINE       0     0     0
        crypt_ata-APPLE_HDD_HTS545050A7E362_TNS5193T0BAPNH  ONLINE       0     0     0
        crypt_ata-SAMSUNG_HD500LJ_S0Q3J1NP709057            ONLINE       0     0     0
        crypt_ata-SAMSUNG_HD501LJ_S0MUJ1PP408979            ONLINE       0     0     0

errors: No known data errors

zfs list
Code:
NAME                                          USED  AVAIL  REFER  MOUNTPOINT
pve1.sh-bb.de_tank                            621G  2.91T   112K  /pve1.sh-bb.de_tank
pve1.sh-bb.de_tank/backup                     104K  2.91T   104K  /pve1.sh-bb.de_tank/backup
pve1.sh-bb.de_tank/ct-disks                    96K  2.91T    96K  /pve1.sh-bb.de_tank/ct-disks
pve1.sh-bb.de_tank/vm-disks                   620G  2.91T    96K  /pve1.sh-bb.de_tank/vm-disks
pve1.sh-bb.de_tank/vm-disks/vm-10010-disk-0  1.31G  2.91T  1.31G  -
pve1.sh-bb.de_tank/vm-disks/vm-10011-disk-0  14.0G  2.91T  14.0G  -
pve1.sh-bb.de_tank/vm-disks/vm-10012-disk-0  3.82G  2.91T  3.82G  -
pve1.sh-bb.de_tank/vm-disks/vm-10013-disk-0  2.64G  2.91T  2.64G  -
pve1.sh-bb.de_tank/vm-disks/vm-10021-disk-0    68K  2.91T    68K  -
pve1.sh-bb.de_tank/vm-disks/vm-10021-disk-1  1.52G  2.91T  1.52G  -
pve1.sh-bb.de_tank/vm-disks/vm-10025-disk-0  20.5G  2.91T  20.5G  -
pve1.sh-bb.de_tank/vm-disks/vm-10026-disk-0  1.29G  2.91T  1.29G  -
pve1.sh-bb.de_tank/vm-disks/vm-10026-disk-1    68K  2.91T    68K  -
pve1.sh-bb.de_tank/vm-disks/vm-30021-disk-0  2.45G  2.91T  2.45G  -
pve1.sh-bb.de_tank/vm-disks/vm-30025-disk-0  26.6G  2.91T  26.6G  -
pve1.sh-bb.de_tank/vm-disks/vm-30025-disk-1    68K  2.91T    68K  -
pve1.sh-bb.de_tank/vm-disks/vm-30025-disk-2   107G  2.91T   107G  -
pve1.sh-bb.de_tank/vm-disks/vm-40012-disk-1   132M  2.91T   132M  -
pve1.sh-bb.de_tank/vm-disks/vm-40012-disk-2   277G  2.91T   277G  -
pve1.sh-bb.de_tank/vm-disks/vm-40019-disk-0  9.40G  2.91T  9.40G  -
pve1.sh-bb.de_tank/vm-disks/vm-40020-disk-0  2.87G  2.91T  2.87G  -
pve1.sh-bb.de_tank/vm-disks/vm-40021-disk-0  3.68G  2.91T  3.68G  -
pve1.sh-bb.de_tank/vm-disks/vm-50012-disk-0  48.5G  2.91T  48.5G  -
pve1.sh-bb.de_tank/vm-disks/vm-50012-disk-1    68K  2.91T    68K  -
pve1.sh-bb.de_tank/vm-disks/vm-50012-disk-2  36.8G  2.91T  36.8G  -
pve1.sh-bb.de_tank/vm-disks/vm-50013-disk-0  25.3G  2.91T  25.3G  -
pve1.sh-bb.de_tank/vm-disks/vm-50013-disk-1    68K  2.91T    68K  -
pve1.sh-bb.de_tank/vm-disks/vm-50014-disk-0    68K  2.91T    68K  -
pve1.sh-bb.de_tank/vm-disks/vm-50014-disk-1  3.25G  2.91T  3.25G  -
pve1.sh-bb.de_tank/vm-disks/vm-50016-disk-0  2.21G  2.91T  2.21G  -
pve1.sh-bb.de_tank/vm-disks/vm-50016-disk-1    68K  2.91T    68K  -
pve1.sh-bb.de_tank/vm-disks/vm-50019-disk-0  3.21G  2.91T  3.21G  -
pve1.sh-bb.de_tank/vm-disks/vm-50019-disk-1    68K  2.91T    68K  -
pve1.sh-bb.de_tank/vm-disks/vm-50020-disk-0  1.71G  2.91T  1.71G  -
pve1.sh-bb.de_tank/vm-disks/vm-50021-disk-0  2.72G  2.91T  2.72G  -
pve1.sh-bb.de_tank/vm-disks/vm-50021-disk-1    68K  2.91T    68K  -
pve1.sh-bb.de_tank/vm-disks/vm-90041-disk-0  11.0G  2.91T  11.0G  -
pve1.sh-bb.de_tank/vm-disks/vm-90041-disk-1    68K  2.91T    68K  -
pve1.sh-bb.de_tank/vm-disks/vm-90041-disk-2   574M  2.91T   574M  -
pve1.sh-bb.de_tank/vm-disks/vm-90041-disk-3  11.4G  2.91T  11.4G  -
pve1_sh-bb_de_storage                        3.99T  3.03T    96K  /pve1_sh-bb_de_storage
pve1_sh-bb_de_storage/backup                 3.99T  3.03T   116K  /pve1_sh-bb_de_storage/backup
pve1_sh-bb_de_storage/backup/archive         13.7G  3.03T  13.7G  /pve1_sh-bb_de_storage/backup/archive
pve1_sh-bb_de_storage/backup/local           3.98T  3.03T  3.98T  /pve1_sh-bb_de_storage/backup/local
pve1_sh-bb_de_storage/backup/zsync             96K  3.03T    96K  /pve1_sh-bb_de_storage/backup/zsync
pve1_sh-bb_de_storage/iso                      96K  3.03T    96K  /pve1_sh-bb_de_storage/iso
rpool                                        2.23G   447G    96K  /
rpool/ROOT                                   2.04G   447G    96K  none
rpool/ROOT/debian                            2.04G   447G  1.76G  /
rpool/home                                    316K   447G    96K  /home
rpool/home/root                               220K   447G   220K  /root
rpool/tmp                                     128K   447G   128K  legacy
rpool/usr                                     232K   447G    96K  /usr
rpool/usr/local                               136K   447G   136K  /usr/local
rpool/var                                     179M   447G    96K  /var
rpool/var/cache                               169M   447G   169M  /var/cache
rpool/var/lib                                  96K   447G    96K  /var/lib
rpool/var/log                                9.31M   447G  9.31M  legacy
rpool/var/spool                               780K   447G   780K  legacy
rpool/var/tmp                                 140K   447G   140K  legacy

df -h
Code:
Filesystem                            Size  Used Avail Use% Mounted on
udev                                   32G     0   32G   0% /dev
tmpfs                                 6.3G   19M  6.3G   1% /run
rpool/ROOT/debian                     450G  1.8G  448G   1% /
tmpfs                                  32G   40M   32G   1% /dev/shm
tmpfs                                 5.0M     0  5.0M   0% /run/lock
tmpfs                                  32G     0   32G   0% /sys/fs/cgroup
/dev/sdn1                             2.0G  248K  2.0G   1% /boot/efi_2
/dev/sdl1                             2.0G  248K  2.0G   1% /boot/efi_3
/dev/sdk1                             1.8G  248K  1.8G   1% /boot/efi
rpool/var/log                         448G  9.4M  448G   1% /var/log
rpool/tmp                             448G  128K  448G   1% /tmp
rpool/var/tmp                         448G  128K  448G   1% /var/tmp
rpool/var/spool                       448G  768K  448G   1% /var/spool
rpool/home                            448G  128K  448G   1% /home
pve1.sh-bb.de_tank                    3.0T  128K  3.0T   1% /pve1.sh-bb.de_tank
pve1.sh-bb.de_tank/backup             3.0T  128K  3.0T   1% /pve1.sh-bb.de_tank/backup
pve1.sh-bb.de_tank/ct-disks           3.0T  128K  3.0T   1% /pve1.sh-bb.de_tank/ct-disks
pve1.sh-bb.de_tank/vm-disks           3.0T  128K  3.0T   1% /pve1.sh-bb.de_tank/vm-disks
rpool/home/root                       448G  256K  448G   1% /root
rpool/usr/local                       448G  128K  448G   1% /usr/local
rpool/var/cache                       448G  169M  448G   1% /var/cache
/dev/fuse                              30M   32K   30M   1% /etc/pve
192.168.90.41:/nfs/vm-isos             98G   11G   83G  12% /mnt/pve/ISOs--nfs.sh-bb.de
tmpfs                                 6.3G     0  6.3G   0% /run/user/0
pve1_sh-bb_de_storage                 3.1T  128K  3.1T   1% /pve1_sh-bb_de_storage
pve1_sh-bb_de_storage/backup          3.1T  128K  3.1T   1% /pve1_sh-bb_de_storage/backup
pve1_sh-bb_de_storage/backup/archive  3.1T   14G  3.1T   1% /pve1_sh-bb_de_storage/backup/archive
pve1_sh-bb_de_storage/backup/local    7.1T  4.0T  3.1T  57% /pve1_sh-bb_de_storage/backup/local
pve1_sh-bb_de_storage/backup/zsync    3.1T  128K  3.1T   1% /pve1_sh-bb_de_storage/backup/zsync
pve1_sh-bb_de_storage/iso             3.1T  128K  3.1T   1% /pve1_sh-bb_de_storage/iso
 
An increase of zfs_arc_max resolved the problem! Thanks
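
The thread does not say which value was chosen, so the following is only a sketch of the arithmetic. zfs_arc_max is given in bytes. On Proxmox VE it is usually set through an `options zfs zfs_arc_max=...` line in /etc/modprobe.d/zfs.conf (with the initramfs refreshed when the root filesystem is on ZFS), or written to /sys/module/zfs/parameters/zfs_arc_max at runtime. The helper names and the 16 GiB figure below are illustrative assumptions, not the value used on this host.

```python
GIB = 1024 ** 3


def gib_to_bytes(gib):
    """Convert a size in GiB to the byte count that zfs_arc_max expects."""
    return int(gib * GIB)


def arc_max_option(gib):
    """Build the modprobe option line for a given ARC ceiling in GiB."""
    return f"options zfs zfs_arc_max={gib_to_bytes(gib)}"


if __name__ == "__main__":
    # 8 GiB gives the same byte count as c_max in the dump posted earlier.
    print(arc_max_option(8))
    # 16 GiB is an arbitrary example of a larger ceiling, not a recommendation.
    print(arc_max_option(16))
```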
Would you be able to say what percentage of your total memory you had to raise zfs_arc_max to?
The wiki at https://pve.proxmox.com/wiki/ZFS_on_Linux says "allocate at least 2 GiB Base + 1 GiB/TiB-Storage", which is what I have (3 GB for 1 TB of storage), but I still face this issue, even after making sure that zfs_arc_meta_limit > arc_meta_used and zfs_arc_dnode_limit > dnode_size.
I'm on the latest PVE, 6.4-13.
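
For reference, the wiki rule of thumb quoted above works out as follows. This is a small Python sketch of the arithmetic and of the two comparisons mentioned in the post. The function names are hypothetical, and the numbers passed to `limits_ok` are made-up placeholders, not measurements from any system in this thread.

```python
GIB = 1024 ** 3


def wiki_min_memory_bytes(storage_tib, base_gib=2, gib_per_tib=1):
    """Minimum memory per the quoted rule: 2 GiB base + 1 GiB per TiB of storage."""
    return int((base_gib + gib_per_tib * storage_tib) * GIB)


def limits_ok(meta_used, meta_limit, dnode_size, dnode_limit):
    """True when metadata and dnode usage are both below their limits."""
    return meta_used < meta_limit and dnode_size < dnode_limit


if __name__ == "__main__":
    # 1 TiB of storage: 2 GiB + 1 GiB = 3 GiB.
    print(wiki_min_memory_bytes(1) / GIB, "GiB")
    # Placeholder values only; real ones come from arcstats on the host.
    print(limits_ok(meta_used=1 * GIB, meta_limit=2 * GIB,
                    dnode_size=100, dnode_limit=200))
```

The wiki wording is "at least", so a host that meets the rule may still run short of ARC under a backup-heavy workload.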
 
