Proxmox VE All Items Unknown State

KingJohn

New Member
Oct 8, 2019
Hello all,

This is happening on a fresh install of Proxmox VE. After some time (maybe three or so hours) everything goes into an unknown state. Curiously, live stats on the summary page (such as CPU utilization) are working, along with being able to get into the noVNC console. No odd network config, besides the main adapter being on a VLAN. Thoughts?

(Screenshot attachment: chrome_diHRetC66Y.png)

Thanks!
 

wolfgang

Proxmox Staff Member
Oct 1, 2014
Hi,

Check whether the pvestatd.service is running.
Do you have any shared network storage?
If so, and pvestatd is hanging, please check your network and storage.
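A quick way to check this from the node's shell (standard systemd commands; the "active (running)" string is what systemctl prints for a healthy unit):

```shell
# Check whether the Proxmox status daemon is alive; systemctl prints
# "active (running)" in its status output for a healthy unit
systemctl status pvestatd.service | grep -q 'active (running)' \
  && echo "pvestatd running" \
  || echo "pvestatd not running"
```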
 

KingJohn

The service does appear to be hung; attempting to stop it also hangs. I don't have any network storage, though I do have one local disk mounted with LUKS where the VM images live. Could this be causing the problem? I'm able to read and write to this disk as normal, and all the VMs live on this disk and are responding normally.
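One generic way to see which processes are actually wedged is to list everything in uninterruptible sleep (a plain ps/awk sketch, not Proxmox-specific):

```shell
# List processes in uninterruptible sleep (state "D"); these cannot
# be killed and usually point at IO blocked inside the kernel.
# The wchan column shows the kernel function each one is waiting in.
ps -eo pid,stat,wchan:32,cmd | awk 'NR == 1 || $2 ~ /^D/'
```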
 

wolfgang

If pvestatd.service is hanging, it indicates blocking IO.
Often this comes from shared network storage, but it can also come from a local disk.
LUKS is not supported in our stack, so I have no experience with Proxmox VE in combination with LUKS.
You have to debug where it is blocking.
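One way to find where it is blocking is to read the hung process's kernel stack and watch for hung-task reports in the kernel log (the procfs path and dmesg message are standard Linux facilities; the pgrep pattern assumes the stuck process is vgs):

```shell
# Dump the kernel stack of the hung process (requires root);
# the functions listed show where in the kernel it is waiting
pid=$(pgrep -ox vgs)     # oldest process exactly named "vgs" (assumed name)
cat /proc/"$pid"/stack

# The kernel also logs tasks blocked longer than the hung-task timeout
dmesg | grep -B1 -A5 'blocked for more than'
```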
 

KingJohn

I dug a bit more into this today. I removed the LUKS store to test temporarily. It seems that /sbin/vgs is hanging, so perhaps not related to my LUKS setup? The process status is 'D' (uninterruptible IO sleep), so I'm unable to kill it without rebooting the server.

The full vgs command line was:
Code:
 /sbin/vgs --separator : --noheadings --units b --unbuffered --nosuffix --options vg_name,vg_size,vg_free,lv_count
/var/log/messages had a ton of this, which is not confidence-inspiring:

Code:
Oct 15 15:59:13 vmhost1 kernel: [ 3414.638003] perf: interrupt took too long (2513 > 2500), lowering kernel.perf_event_max_sample_rate to 79500
Oct 15 16:09:46 vmhost1 kernel: [ 4047.826241] perf: interrupt took too long (3175 > 3141), lowering kernel.perf_event_max_sample_rate to 63000
Oct 15 16:25:03 vmhost1 kernel: [ 4965.276211] perf: interrupt took too long (3989 > 3968), lowering kernel.perf_event_max_sample_rate to 50000
Oct 15 16:45:04 vmhost1 kernel: [ 6166.242205] perf: interrupt took too long (4987 > 4986), lowering kernel.perf_event_max_sample_rate to 40000
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559009] vgs             D    0 16935   1344 0x00000000
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559014] Call Trace:
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559033]  __schedule+0x2d4/0x870
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559038]  schedule+0x2c/0x70
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559043]  schedule_timeout+0x258/0x360
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559051]  ? ttwu_do_activate+0x67/0x90
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559056]  wait_for_completion+0xb7/0x140
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559059]  ? wake_up_q+0x80/0x80
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559065]  __flush_work+0x138/0x200
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559068]  ? worker_detach_from_pool+0xb0/0xb0
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559072]  ? get_work_pool+0x40/0x40
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559074]  __cancel_work_timer+0x115/0x190
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559083]  ? exact_lock+0x11/0x20
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559087]  cancel_delayed_work_sync+0x13/0x20
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559090]  disk_block_events+0x78/0x80
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559097]  __blkdev_get+0x73/0x550
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559100]  ? bd_acquire+0xd0/0xd0
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559103]  blkdev_get+0x10c/0x330
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559106]  ? bd_acquire+0xd0/0xd0
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559109]  blkdev_open+0x92/0x100
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559113]  do_dentry_open+0x143/0x3a0
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559117]  vfs_open+0x2d/0x30
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559120]  path_openat+0x2d4/0x16d0
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559123]  ? filename_lookup.part.60+0xe0/0x170
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559132]  ? strncpy_from_user+0x56/0x1b0
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559135]  do_filp_open+0x93/0x100
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559139]  ? strncpy_from_user+0x56/0x1b0
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559145]  ? __alloc_fd+0x46/0x150
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559148]  do_sys_open+0x177/0x280
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559152]  __x64_sys_openat+0x20/0x30
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559159]  do_syscall_64+0x5a/0x110
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559164]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559169] RIP: 0033:0x7f1127cee1ae
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559181] Code: Bad RIP value.
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559182] RSP: 002b:00007ffd677291f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559186] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1127cee1ae
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559188] RDX: 0000000000044000 RSI: 00005583bcc054e0 RDI: 00000000ffffff9c
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559189] RBP: 00007ffd67729350 R08: 00005583bcc3d620 R09: 0000000000000000
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559191] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd6772bedf
Oct 15 16:49:03 vmhost1 kernel: [ 6405.559192] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000




Not sure where to go from here, honestly.

EDIT: I'm going to assume there's a hardware fault with the drive I installed on. I'll install on another and see what happens.
 

wolfgang

Yes, it looks like a disk/controller failure.
The requested operation never returns.
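Before reinstalling, the disk's SMART data is worth a look to confirm a hardware fault (smartctl from the smartmontools package; /dev/sda here is a placeholder for the actual install disk):

```shell
# Overall SMART health verdict for the disk
smartctl -H /dev/sda

# The attributes that most directly indicate a dying drive:
# reallocated, pending, and uncorrectable sector counts
smartctl -A /dev/sda | grep -Ei 'reallocat|pending|uncorrect'
```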
 
