Good day.
I have a development PVE host with the following configuration:
i7-8700, 64 GB RAM, 2x 10 TB SATA, 2x 512 GB NVMe.
The OS is installed on one of the NVMe drives.
A ZFS pool was then created from the two SATA drives via the PVE GUI, and the second NVMe drive was added manually as log and cache devices.
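Roughly like this (reconstructed; the partition names are the same ones visible in the zpool status output below):
Code:
# second NVMe split into two partitions, attached as SLOG and L2ARC
zpool add local-zfs log /dev/disk/by-id/nvme-KXG50ZNV512G_TOSHIBA_58MS101VTYST-part1
zpool add local-zfs cache /dev/disk/by-id/nvme-KXG50ZNV512G_TOSHIBA_58MS101VTYST-part2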
When I restore a backup that is stored on the system NVMe disk, the host practically dies under the load, and at the same time the restore itself is not particularly fast.
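For reference, the restore is started from the PVE GUI; on the CLI it would be roughly equivalent to the following (VM ID and archive name match the iotop output further down; the storage ID is assumed to match the pool name):
Code:
# restore VM 131 from the compressed vzdump archive on the system NVMe to the ZFS-backed storage
qmrestore /var/lib/vz/dump/vzdump-qemu-131-2021_03_18-21_45_00.vma.zst 131 --storage local-zfs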
Could you please advise what can be done about this? Is there perhaps a misconfiguration in my ZFS pool, or is it something else?
I would be very grateful for any help.
Code:
pveversion -v
proxmox-ve: 6.3-1 (running kernel: 5.4.103-1-pve)
pve-manager: 6.3-6 (running version: 6.3-6/2184247e)
pve-kernel-5.4: 6.3-7
pve-kernel-helper: 6.3-7
pve-kernel-5.4.103-1-pve: 5.4.103-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.1.0-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve3
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.0.7
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-5
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-7
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
openvswitch-switch: 2.12.3-1
proxmox-backup-client: 1.0.10-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-6
pve-cluster: 6.2-1
pve-container: 3.3-4
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.2-2
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-8
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-8
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.3-pve2
--
zpool status local-zfs
pool: local-zfs
state: ONLINE
scan: scrub repaired 0B in 00:00:01 with 0 errors on Wed Mar 17 16:38:56 2021
config:
NAME                                              STATE     READ WRITE CKSUM
local-zfs                                         ONLINE       0     0     0
  mirror-0                                        ONLINE       0     0     0
    ata-ST10000NM0156-2AA111_ZA27Z0GK             ONLINE       0     0     0
    ata-ST10000NM0156-2AA111_ZA26AQV9             ONLINE       0     0     0
logs
  nvme-KXG50ZNV512G_TOSHIBA_58MS101VTYST-part1    ONLINE       0     0     0
cache
  nvme-KXG50ZNV512G_TOSHIBA_58MS101VTYST-part2    ONLINE       0     0     0
errors: No known data errors
--
zpool iostat -v
                                                  capacity     operations     bandwidth
pool                                            alloc   free   read  write   read  write
----------------------------------------------  -----  -----  -----  -----  -----  -----
local-zfs                                        933G  8.18T      0    456  6.55K  20.0M
  mirror                                         933G  8.18T      0    455  6.55K  19.9M
    ata-ST10000NM0156-2AA111_ZA27Z0GK               -      -      0    134  2.85K  9.97M
    ata-ST10000NM0156-2AA111_ZA26AQV9               -      -      0    320  3.69K  9.97M
logs                                                -      -      -      -      -      -
  nvme-KXG50ZNV512G_TOSHIBA_58MS101VTYST-part1   564K  69.5G      0      0      6  22.0K
cache                                               -      -      -      -      -      -
  nvme-KXG50ZNV512G_TOSHIBA_58MS101VTYST-part2   407G   194M      2     55  18.0K  5.69M
----------------------------------------------  -----  -----  -----  -----  -----  -----
Total DISK READ: 26.95 M/s | Total DISK WRITE: 56.26 M/s
Current DISK READ: 26.95 M/s | Current DISK WRITE: 125.53 M/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
14074 be/4 root 0.00 B/s 29.12 M/s 0.00 % 95.35 % vma extract -v -r /var/tmp/vzdumptmp13918.fifo - /var/tmp/vzdumptmp13918
507 be/4 root 0.00 B/s 0.00 B/s 0.00 % 40.18 % [l2arc_feed]
683 be/0 root 0.00 B/s 0.00 B/s 0.00 % 5.74 % [z_wr_int]
13920 be/4 root 26.95 M/s 0.00 B/s 0.00 % 1.18 % zstd -q -d -c /var/lib/vz/dump/vzdump-qemu-131-2021_03_18-21_45_00.vma.zst
678 be/0 root 0.00 B/s 0.00 B/s 0.00 % 1.01 % [z_wr_iss]
677 be/0 root 0.00 B/s 0.00 B/s 0.00 % 0.84 % [z_wr_iss]
675 be/0 root 0.00 B/s 0.00 B/s 0.00 % 0.82 % [z_wr_iss]
674 be/0 root 0.00 B/s 0.00 B/s 0.00 % 0.03 % [z_wr_iss]
687 be/0 root 0.00 B/s 0.00 B/s 0.00 % 0.02 % [z_wr_int]
402 be/0 root 0.00 B/s 869.85 K/s 0.00 % 0.00 % [zvol]
14023 be/0 root 0.00 B/s 866.05 K/s 0.00 % 0.00 % [zvol]
14026 be/0 root 0.00 B/s 869.85 K/s 0.00 % 0.00 % [zvol]
14028 be/0 root 0.00 B/s 869.85 K/s 0.00 % 0.00 % [zvol]
14076 be/0 root 0.00 B/s 869.85 K/s 0.00 % 0.00 % [zvol]
14077 be/0 root 0.00 B/s 869.85 K/s 0.00 % 0.00 % [zvol]
14078 be/0 root 0.00 B/s 869.85 K/s 0.00 % 0.00 % [zvol]
14079 be/0 root 0.00 B/s 869.85 K/s 0.00 % 0.00 % [zvol]
14080 be/0 root 0.00 B/s 869.85 K/s 0.00 % 0.00 % [zvol]
14081 be/0 root 0.00 B/s 869.85 K/s 0.00 % 0.00 % [zvol]
14082 be/0 root 0.00 B/s 869.85 K/s 0.00 % 0.00 % [zvol]
14083 be/0 root 0.00 B/s 869.85 K/s 0.00 % 0.00 % [zvol]
--
From /var/log/messages:
Mar 18 20:46:21 px3 kernel: [72985.361356] kvm D 0 23596 1 0x00000000
Mar 18 20:46:21 px3 kernel: [72985.361435] Call Trace:
Mar 18 20:46:21 px3 kernel: [72985.361511] __schedule+0x2e6/0x700
Mar 18 20:46:21 px3 kernel: [72985.361587] schedule+0x33/0xa0
Mar 18 20:46:21 px3 kernel: [72985.361665] schedule_timeout+0x205/0x330
Mar 18 20:46:21 px3 kernel: [72985.361781] ? zvol_request+0x271/0x2f0 [zfs]
Mar 18 20:46:21 px3 kernel: [72985.361859] ? generic_make_request+0xcf/0x310
Mar 18 20:46:21 px3 kernel: [72985.361936] io_schedule_timeout+0x1e/0x50
Mar 18 20:46:21 px3 kernel: [72985.362013] wait_for_completion_io+0xb7/0x140
Mar 18 20:46:21 px3 kernel: [72985.362091] ? wake_up_q+0x80/0x80
Mar 18 20:46:21 px3 kernel: [72985.362166] submit_bio_wait+0x61/0x90
Mar 18 20:46:21 px3 kernel: [72985.362242] blkdev_issue_flush+0x8e/0xc0
Mar 18 20:46:21 px3 kernel: [72985.362318] blkdev_fsync+0x35/0x50
Mar 18 20:46:21 px3 kernel: [72985.362394] vfs_fsync_range+0x48/0x80
Mar 18 20:46:21 px3 kernel: [72985.362470] ? __fget_light+0x59/0x70
Mar 18 20:46:21 px3 kernel: [72985.362545] do_fsync+0x3d/0x70
Mar 18 20:46:21 px3 kernel: [72985.362622] __x64_sys_fdatasync+0x17/0x20
Mar 18 20:46:21 px3 kernel: [72985.362702] do_syscall_64+0x57/0x190
Mar 18 20:46:21 px3 kernel: [72985.362779] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Mar 18 20:46:21 px3 kernel: [72985.362857] RIP: 0033:0x7f30cff112e7
Mar 18 20:46:21 px3 kernel: [72985.362935] Code: Bad RIP value.
Mar 18 20:46:21 px3 kernel: [72985.363009] RSP: 002b:00007f30c09fbcc0 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
Mar 18 20:46:21 px3 kernel: [72985.363104] RAX: ffffffffffffffda RBX: 000000000000001a RCX: 00007f30cff112e7
Mar 18 20:46:21 px3 kernel: [72985.363186] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000000001a
Mar 18 20:46:21 px3 kernel: [72985.363269] RBP: 0000558fb7b027f0 R08: 0000000000000000 R09: 00007f30c09fbc40
Mar 18 20:46:21 px3 kernel: [72985.363352] R10: 000000006053ad6e R11: 0000000000000293 R12: 0000558fb58a47da
Mar 18 20:46:21 px3 kernel: [72985.363433] R13: 0000558fb7b02858 R14: 0000558fb9132130 R15: 0000558fb8e7c000
This is what happened to another VM while the restore process was running.