Hey everyone,
A few days ago I bought a server at a Hetzner server auction: 5950X, 128 GB RAM, 2 x 3.84 TB SSD.
First I tried the 'easy' way without ordering a KVM from them: using their default Debian image and then installing PVE on top of it.
It worked at first, but I saw some instabilities.
After a few crashes I ordered a KVM from them and installed PVE directly from the PVE ISO to be on the safe side.
After the ISO install I still observed crashes, so I opened a ticket with Hetzner. They ran a hardware check on their side, but it reported that everything is OK with the server:
Code:
CPU check: OK
CPU 1: OK
Temperature: OK
Clock speed: OK
Memory module check: OK
DIMM 1 `F2EA9ADC`: OK
DIMM 2 `F2EA9AE4`: OK
DIMM 3 `F2EA9D81`: OK
DIMM 4 `F2EA9AE6`: OK
Disk check: OK
NVMe SSD `S64HNE0T346710`: OK
S.M.A.R.T Tests: OK
Error counters: OK
NVMe SSD `S64HNE0T344075`: OK
S.M.A.R.T Tests: OK
Error counters: OK
NIC check: OK
PCI-E NIC ``: OK
Negotiated speed: OK
Error counters: OK
PCI error counters: OK
Stresstest: OK
System log check: OK
After that I tried updating the microcode to the latest version via the community scripts and upgrading to the latest opt-in kernel, 6.14.5-1-bpo12-pve.
The server still keeps crashing.
These are the journalctl logs from before and after one of the crashes (nothing of interest in there, IMO):
Code:
Jun 21 10:17:01 proxmox-hetzner CRON[1837]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Jun 21 10:17:01 proxmox-hetzner CRON[1838]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
Jun 21 10:17:01 proxmox-hetzner CRON[1837]: pam_unix(cron:session): session closed for user root
Jun 21 10:31:33 proxmox-hetzner systemd[1]: Starting systemd-tmpfiles-clean.service - Cleanup of Temporary Directories...
Jun 21 10:31:33 proxmox-hetzner systemd[1]: systemd-tmpfiles-clean.service: Deactivated successfully.
Jun 21 10:31:33 proxmox-hetzner systemd[1]: Finished systemd-tmpfiles-clean.service - Cleanup of Temporary Directories.
Jun 21 10:31:33 proxmox-hetzner systemd[1]: run-credentials-systemd\x2dtmpfiles\x2dclean.service.mount: Deactivated successfully.
-- Boot b9fb281ba8b049ada38f2fd1e0854291 --
Jun 21 11:15:03 proxmox-hetzner kernel: Linux version 6.14.5-1-bpo12-pve (build@proxmox) (gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC PMX 6.14.5-1~bpo12+1 (2025-05-21T15:55Z) ()
Jun 21 11:15:03 proxmox-hetzner kernel: Command line: BOOT_IMAGE=/vmlinuz-6.14.5-1-bpo12-pve root=ZFS=/ROOT/pve-1 ro root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet
Jun 21 11:15:03 proxmox-hetzner kernel: KERNEL supported cpus:
Jun 21 11:15:03 proxmox-hetzner kernel: Intel GenuineIntel
Jun 21 11:15:03 proxmox-hetzner kernel: AMD AuthenticAMD
Jun 21 11:15:03 proxmox-hetzner kernel: Hygon HygonGenuine
Jun 21 11:15:03 proxmox-hetzner kernel: Centaur CentaurHauls
Jun 21 11:15:03 proxmox-hetzner kernel: zhaoxin Shanghai
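One thing the excerpt above does show is how long the machine was silent between the last log entry and the next boot. As a rough way to measure that for every crash, here is a minimal Python sketch. It assumes the journal was exported in the default short format shown above, with `-- Boot <id> --` separator lines; the function name `boot_gaps` and the year 2025 are my own assumptions (the short format carries no year), not anything from a Proxmox or systemd tool.

```python
from datetime import datetime, timedelta

BOOT_MARKER = "-- Boot "


def parse_stamp(line: str, year: int) -> datetime:
    # Short journal format starts with e.g. "Jun 21 10:31:33" (15 characters).
    # The year is not in the log, so it has to be supplied (assumption).
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def boot_gaps(lines, year=2025):
    """Return (last_entry, first_entry_after_boot, gap) for each boot marker."""
    gaps = []
    last = None          # timestamp of the most recent parsed log line
    pending = False      # True right after a boot marker was seen
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(BOOT_MARKER):
            # Only record a gap if there was at least one entry before the marker.
            pending = last is not None
            continue
        try:
            stamp = parse_stamp(line, year)
        except ValueError:
            continue     # skip lines that do not start with a timestamp
        if pending:
            gaps.append((last, stamp, stamp - last))
            pending = False
        last = stamp
    return gaps


# Three lines copied from the log excerpt above.
SAMPLE = [
    "Jun 21 10:31:33 proxmox-hetzner systemd[1]: run-credentials-systemd\\x2dtmpfiles\\x2dclean.service.mount: Deactivated successfully.",
    "-- Boot b9fb281ba8b049ada38f2fd1e0854291 --",
    "Jun 21 11:15:03 proxmox-hetzner kernel: KERNEL supported cpus:",
]

if __name__ == "__main__":
    for before, after, gap in boot_gaps(SAMPLE):
        print(f"last entry {before}, next boot {after}, silent for {gap}")
```

For the excerpt above this gives a gap of 0:43:30 (10:31:33 to 11:15:03). That only bounds when the machine went down and came back; it says nothing about the cause.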
What else can I check?