CPU: AMD R9 5900HX
MEM: 32G
It freezes randomly, anywhere from a few minutes to a couple of hours after boot.
A couple of days ago I found that it would freeze every time I moved a VM disk from one physical disk to another,
but I think I fixed that by changing the dirty* options in /etc/sysctl.conf.
The random freezes still happen. These are the dirty* values I set in /etc/sysctl.conf:
Code:
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
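Values in /etc/sysctl.conf only take effect once they are loaded (at boot or via `sysctl -p`), so it can be worth confirming that the running kernel actually uses them. The following is a minimal sketch, not part of the original post; the helper names `sysctl_value` and `sysctl_matches` and the optional alternate-root argument are hypothetical and exist only for illustration.

```shell
# Print the live value of a sysctl key by reading it from procfs.
# The optional second argument (an alternate root directory) is a
# hypothetical hook for dry runs; by default /proc/sys is used.
sysctl_value() {
    key_path=$(printf '%s' "$1" | tr '.' '/')
    cat "${2:-/proc/sys}/$key_path" 2>/dev/null
}

# Succeed only if the live value equals the expected one.
sysctl_matches() {
    [ "$(sysctl_value "$1" "${3:-/proc/sys}")" = "$2" ]
}

# Compare the two settings from /etc/sysctl.conf with the running kernel.
for pair in vm.dirty_background_ratio=5 vm.dirty_ratio=10; do
    if sysctl_matches "${pair%%=*}" "${pair#*=}"; then
        echo "${pair%%=*} is active"
    else
        echo "${pair%%=*} differs from /etc/sysctl.conf; try: sysctl -p"
    fi
done
```

If a value differs, `sysctl -p` reloads /etc/sysctl.conf without a reboot.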
All of this seems to have started after I added two more disks and set up a ZFS raidz1 pool,
but it still freezes even with the ZFS pool exported and no VMs started.
Could someone help me? Where should I look?
What I have tried:
Reset the BIOS settings to default
Removed all settings in /etc/modprobe.d and /etc/modules
Reset the BIOS and GRUB settings and enabled kdump
/etc/default/grub:
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet nomodeset crashkernel=1024M"
# GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt pcie_acs_override=downstream,multifunction video=vesafb:off video=efifb:off video=simplefb:off video=vesa:off initcall_blacklist=sysfb_init"
I have kdump enabled, but it caught nothing.
Code:
dmesg -HT | grep crash
[Sat May 27 10:02:00 2023] Command line: BOOT_IMAGE=/boot/vmlinuz-6.2.6-1-pve root=/dev/mapper/pve-root ro quiet nomodeset crashkernel=1024M crashkernel=384M-:128M
[Sat May 27 10:02:00 2023] Reserving 128MB of memory at 3552MB for crashkernel (System RAM: 30617MB)
[Sat May 27 10:02:00 2023] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.2.6-1-pve root=/dev/mapper/pve-root ro quiet nomodeset crashkernel=1024M crashkernel=384M-:128M
[Sat May 27 10:02:04 2023] pstore: Using crash dump compression: deflate
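The log above shows two `crashkernel=` entries on the kernel command line and a 128MB reservation, not the 1024M set in /etc/default/grub. One possible explanation, which is an assumption and not confirmed in this thread, is that a second entry is appended by a packaged snippet under /etc/default/grub.d/ and takes precedence. The sketch below lists every `crashkernel=` parameter in a command line so that duplicates are easy to spot; the function name `crashkernel_params` is hypothetical.

```shell
# Print every crashkernel= parameter found in a kernel command line,
# one per line. Returns non-zero if there is none.
crashkernel_params() {
    printf '%s\n' "$1" | tr ' ' '\n' | grep '^crashkernel='
}

# Inspect the command line of the running kernel.
crashkernel_params "$(cat /proc/cmdline 2>/dev/null)" \
    || echo "no crashkernel= parameter on the running kernel"
```

More than one line of output means the parameter is set in more than one place, and the reservation reported in dmesg shows which one the kernel actually applied.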
Code:
ll /var/crash/
total 4.0K
-rw-r--r-- 1 root root 0 May 27 10:02 kdump_lock
-rw-r--r-- 1 root root 276 May 27 10:02 kexec_cmd
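An empty /var/crash can mean either that the crash kernel was never loaded or that the machine hung without a kernel panic, in which case kdump has nothing to capture. The sketch below checks the first case through the kernel's sysfs flag; the function name `kdump_state` and its optional path argument are hypothetical and only there for illustration.

```shell
# Report whether a crash kernel is loaded, based on the sysfs flag
# /sys/kernel/kexec_crash_loaded (1 = loaded).
# The optional argument (an alternate file path) is a hypothetical
# hook for dry runs.
kdump_state() {
    flag_file="${1:-/sys/kernel/kexec_crash_loaded}"
    if [ "$(cat "$flag_file" 2>/dev/null)" = "1" ]; then
        echo "loaded"
    else
        echo "not loaded"
    fi
}

echo "crash kernel: $(kdump_state)"

# If it reports "loaded", a deliberate panic exercises the whole
# capture path. This crashes the host immediately, so it is left
# commented out here:
# echo 1 > /proc/sys/kernel/sysrq
# echo c > /proc/sysrq-trigger
```

If the deliberate panic produces a dump but the random freezes do not, the freezes are probably hard hangs rather than panics, which points away from kdump as a diagnostic tool for this problem.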
dmesg shows one error, but it seems irrelevant.
Code:
dmesg -HT | grep error
[Sat May 27 10:02:00 2023] ACPI Error: Aborting method \_SB.GPIO._EVT due to previous error (AE_NOT_EXIST) (20221020/psparse-529)
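Since dmesg only covers the current boot, messages logged just before a freeze are gone after the reset. If the systemd journal is persistent (an assumption; it requires /var/log/journal to exist), the kernel log of the previous boot can be searched instead. The sketch below uses a broader, case-insensitive filter than `grep error`; the function name `suspicious_lines` and the keyword list are illustrative choices, not a definitive set.

```shell
# Keep only log lines that mention common trouble keywords,
# matched case-insensitively. Reads from stdin.
suspicious_lines() {
    grep -Ei 'error|fail|mce|lockup|hung|timeout'
}

# Search the kernel messages of the previous boot, if journalctl
# is available and a previous boot was recorded.
if command -v journalctl >/dev/null 2>&1; then
    journalctl -k -b -1 --no-pager 2>/dev/null | suspicious_lines \
        || echo "nothing matched, or no previous boot in the journal"
fi
```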
Code:
lsblk
NAME                         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                              8:0  0 111.8G  0 disk
├─sda1                           8:1  0   512M  0 part
└─sda2                           8:2  0 111.3G  0 part
sdb                             8:16  0   3.6T  0 disk
├─sdb1                          8:17  0    16M  0 part
└─sdb2                          8:18  0   3.6T  0 part
sdc                             8:32  0  10.9T  0 disk
├─sdc1                          8:33  0  10.9T  0 part
└─sdc9                          8:41  0     8M  0 part
sdd                             8:48  0  10.9T  0 disk
├─sdd1                          8:49  0  10.9T  0 part
└─sdd9                          8:57  0     8M  0 part
sde                             8:64  0  10.9T  0 disk
├─sde1                          8:65  0  10.9T  0 part
└─sde9                          8:73  0     8M  0 part
zd0                            230:0  0  14.9T  0 disk
└─zd0p1                        230:1  0  14.9T  0 part
nvme1n1                        259:0  0 931.5G  0 disk
└─nvme1n1p1                    259:1  0 931.5G  0 part /mnt/pve/rc20-1t
nvme0n1                        259:2  0 931.5G  0 disk
├─nvme0n1p1                    259:3  0  1007K  0 part
├─nvme0n1p2                    259:4  0   512M  0 part /boot/efi
└─nvme0n1p3                    259:5  0   931G  0 part
  ├─pve-swap                   253:0  0     8G  0 lvm  [SWAP]
  ├─pve-root                   253:1  0    96G  0 lvm  /
  ├─pve-data_tmeta             253:2  0   8.1G  0 lvm
  │ └─pve-data-tpool           253:4  0 794.8G  0 lvm
  │   ├─pve-data               253:5  0 794.8G  1 lvm
  │   ├─pve-vm--100--disk--0   253:6  0   256G  0 lvm
  │   ├─pve-vm--102--disk--0   253:7  0    64G  0 lvm
  │   └─pve-vm--100--disk--1   253:8  0     4M  0 lvm
  └─pve-data_tdata             253:3  0 794.8G  0 lvm
    └─pve-data-tpool           253:4  0 794.8G  0 lvm
      ├─pve-data               253:5  0 794.8G  1 lvm
      ├─pve-vm--100--disk--0   253:6  0   256G  0 lvm
      ├─pve-vm--102--disk--0   253:7  0    64G  0 lvm
      └─pve-vm--100--disk--1   253:8  0     4M  0 lvm
Code:
pveversion -v
proxmox-ve: 7.4-1 (running kernel: 6.2.6-1-pve)
pve-manager: 7.4-3 (running version: 7.4-3/9002ab8a)
pve-kernel-5.15: 7.4-3
pve-kernel-6.2.6-1-pve: 6.2.6-1
pve-kernel-5.15.107-2-pve: 5.15.107-2
pve-kernel-5.15.74-1-pve: 5.15.74-1
pve-kernel-5.15.30-2-pve: 5.15.30-3
ceph-fuse: 17.2.6-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4-3
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.4-1
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.6
libpve-storage-perl: 7.4-2
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.2-1
proxmox-backup-file-restore: 2.4.2-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.1-1
proxmox-widget-toolkit: 3.7.0
pve-cluster: 7.3-3
pve-container: 4.4-3
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-2
pve-firewall: 4.3-2
pve-firmware: 3.6-5
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-1
qemu-server: 7.4-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.11-pve1