I was running the following config:
AMD EPYC 7351P
128 GB RAM (2666 MHz) 8 * 16 GB
Asrock Rack EPYCD8-2T
AMD FirePro S7150 x2
Adaptec 71605 in HBA mode running 15 x 4 TB Seagate 7E8 HDDs
ZFS with an Intel 750 Series SSD as the SLOG device
Proxmox 7.2
Recently I decided to upgrade the processor to the 32-core AMD EPYC 7532. After I installed it, the system displayed a blank screen with BIOS code b2, even though the IPMI was detecting the processor, all 8 sticks of RAM, and the PCIe cards. I reseated the processor and reset the BIOS.
I was still getting stuck at code b2, so I removed the graphics card and the system booted up. It also booted fine with the Nvidia P40 installed. I then reinstalled the AMD FirePro S7150 x2, set the OpROM mode to legacy, and the system booted up.
Initially I got errors reporting that the system was out of memory and that ZFS had stopped rebuilding the L2ARC. Removing my cache and log devices made that error go away.
With only one or two Windows VMs running and a practically empty ZFS pool, I am getting major IO delays, random stuttering, and ZFS blocked-task messages (log below).
PS: I am getting the errors even after replacing the AMD S7150 x2 with a GTX 1070 Ti.
Dec 12 01:03:30 hoserver kernel: INFO: task txg_sync:1544 blocked for more than 241 seconds.
Dec 12 01:04:50 hoserver kernel: Tainted: P O 5.15.74-1-pve #1
Dec 12 01:04:50 hoserver kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 12 01:04:50 hoserver kernel: task:txg_sync state stack: 0 pid: 1544 ppid: 2 flags:0x00004000
Dec 12 01:04:50 hoserver kernel: Call Trace:
Dec 12 01:04:50 hoserver kernel: <TASK>
Dec 12 01:04:50 hoserver kernel: __schedule+0x34e/0x1740
Dec 12 01:04:50 hoserver kernel: ? lock_timer_base+0x3b/0xd0
Dec 12 01:04:50 hoserver kernel: ? __mod_timer+0x271/0x440
Dec 12 01:04:50 hoserver kernel: schedule+0x69/0x110
Dec 12 01:04:50 hoserver kernel: schedule_timeout+0x87/0x140
Dec 12 01:04:50 hoserver kernel: ? __bpf_trace_tick_stop+0x20/0x20
Dec 12 01:04:50 hoserver kernel: io_schedule_timeout+0x51/0x80
Dec 12 01:04:50 hoserver kernel: __cv_timedwait_common+0x135/0x170 [spl]
Dec 12 01:04:50 hoserver kernel: ? wait_woken+0x70/0x70
Dec 12 01:04:50 hoserver kernel: __cv_timedwait_io+0x19/0x20 [spl]
Dec 12 01:04:50 hoserver kernel: zio_wait+0x137/0x300 [zfs]
Dec 12 01:04:50 hoserver kernel: ? __cond_resched+0x1a/0x50
Dec 12 01:04:50 hoserver kernel: dsl_pool_sync+0xcc/0x4f0 [zfs]
Dec 12 01:04:50 hoserver kernel: ? spa_suspend_async_destroy+0x60/0x60 [zfs]
Dec 12 01:04:50 hoserver kernel: ? add_timer+0x20/0x30
Dec 12 01:04:50 hoserver kernel: spa_sync+0x55a/0x1020 [zfs]
Dec 12 01:04:50 hoserver kernel: ? spa_txg_history_init_io+0x10a/0x120 [zfs]
Dec 12 01:04:50 hoserver kernel: txg_sync_thread+0x278/0x400 [zfs]
Dec 12 01:04:50 hoserver kernel: ? txg_init+0x2c0/0x2c0 [zfs]
Dec 12 01:04:50 hoserver kernel: thread_generic_wrapper+0x64/0x80 [spl]
Dec 12 01:04:50 hoserver kernel: ? __thread_exit+0x20/0x20 [spl]
Dec 12 01:04:50 hoserver kernel: kthread+0x12a/0x150
Dec 12 01:04:50 hoserver kernel: ? set_kthread_struct+0x50/0x50
Dec 12 01:04:50 hoserver kernel: ret_from_fork+0x22/0x30
Dec 12 01:04:50 hoserver kernel: </TASK>
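In case it helps anyone count how often this happens, here is a minimal Python sketch that pulls the hung-task warnings out of saved kernel log lines. It assumes the warning has the same format as the "blocked for more than" line in my log above; the names `HUNG_TASK_RE` and `find_hung_tasks` are just ones I made up for illustration.

```python
import re

# Matches the kernel hung-task warning as it appears in the log above, e.g.
# "... kernel: INFO: task txg_sync:1544 blocked for more than 241 seconds."
# (Assumption: other kernels may word this line slightly differently.)
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Return (task_name, pid, seconds) for each hung-task warning in lines."""
    hits = []
    for line in lines:
        m = HUNG_TASK_RE.search(line)
        if m:
            hits.append((m.group("name"), int(m.group("pid")), int(m.group("secs"))))
    return hits


# Two lines copied from the log above; only the first is a hung-task warning.
sample = [
    "Dec 12 01:03:30 hoserver kernel: INFO: task txg_sync:1544 blocked for more than 241 seconds.",
    "Dec 12 01:04:50 hoserver kernel: Call Trace:",
]
print(find_hung_tasks(sample))
```

Feeding it the lines of a saved log file (for example via `open(path).readlines()`) gives one tuple per warning, which makes it easier to see whether it is always txg_sync that is blocked.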