Node with question mark

Discussion in 'Proxmox VE: Installation and configuration' started by decibel83, Feb 5, 2018.

  1. speatzle_

    speatzle_ New Member

    So I'm having the same issue, and it is really annoying. I have already reinstalled Proxmox 3 times. I found this thread, but for me it is the pvestatd service that is at fault. After running for about 12 hours, this message shows up in the syslog (/var/log/syslog) once every second:

    Jul 7 19:25:53 rocinante pvestatd[1505]: malformed JSON string, neither tag, array, object, number, string or atom, at character offset 0 (before "(end of string)") at /usr/share/perl5/PVE/Tools.pm line 949, <GEN1264905> chunk 1.

    After a bit more time, this message gets mixed in with it:

    Jul 7 19:26:03 rocinante kernel: [23150.449501] traps: pvestatd[20280] general protection ip:55c07c3cf856 sp:7ffd9899f4b0 error:0 in perl[55c07c2ee000+1e6000]

    And finally this happens:

    Jul 8 04:17:25 rocinante kernel: [55031.916102] pvestatd[1505]: segfault at 7f1dcfac0031 ip 000055c07c3dd32a sp 00007ffd9899f2c0 error 4 in perl[55c07c2ee000+1e6000]
    Jul 8 04:17:25 rocinante systemd[1]: pvestatd.service: Main process exited, code=killed, status=11/SEGV
    Jul 8 04:17:26 rocinante systemd[1]: pvestatd.service: Unit entered failed state.
    Jul 8 04:17:26 rocinante systemd[1]: pvestatd.service: Failed with result 'signal'.

    I noticed this because my external metrics server (InfluxDB) with Grafana doesn't get updated anymore when this happens, and I get alerts on my phone in the middle of the night...

    For now, my workaround is to have systemd restart the pvestatd service whenever it fails, by setting this for the service:
    Restart=on-failure
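
    A minimal sketch of how that setting could be applied as a systemd drop-in, so the packaged unit file stays untouched. The path is the one `systemctl edit pvestatd` normally creates; `RestartSec=5` is an assumption and not part of the workaround described above.

```ini
# /etc/systemd/system/pvestatd.service.d/override.conf
# Drop-in override: only the keys listed here are changed,
# everything else comes from the packaged pvestatd.service.
[Service]
Restart=on-failure
# Assumption (not from the post): wait 5 seconds between restart attempts.
RestartSec=5
```

    If the file is created by hand, `systemctl daemon-reload` makes systemd pick it up. This only helps when the process actually exits with a failure; a pvestatd that hangs while systemd still considers it running is not restarted.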

    Please mail me if you want my syslog, because it is bigger than 10,000 KB.

    -Sammy

    Edit:

    So later that night the pvestatd service failed me again, but this time there was nothing in the syslog, and systemd thought it was OK until I tried restarting pvestatd. (A new record! It died after only 3 h 40 min.)

    Jul 9 14:37:29 rocinante pvestatd[19619]: start failed - can't aquire lock '/var/run/pvestatd.pid.lock' - Resource temporarily unavailable

    I got it back up and running, but for how long?
    This breaks my temporary fix. If anybody knows even a temporary solution, please tell me.

    -Sammy
     
    #41 speatzle_, Jul 8, 2018
    Last edited: Jul 9, 2018
  2. A71

    A71 New Member

    Same error again for the past month.
    So far, I have only been able to fix a host by rebooting it, and only temporarily (for a few days).

    Restarting the services as described above solves the problem only for a short time (30 minutes to 1 hour).

    Any other ideas from the community?

    My version is
    pveversion
    pve-manager/5.2-2/b1d1c7f4 (running kernel: 4.15.17-3-pve)
     
  3. grobs

    grobs Member

  4. tuxis

    tuxis Member
    Proxmox Subscriber

    We had the same issue on a cluster and found a cause.
    A failing DNS server caused it.

    pvestatd serves stats to the Proxmox GUI. In our case we let pvestatd export metrics to Graphite.
    There was no DNS at one point. pvestatd could not connect to Graphite, and that caused a lot of workers and the cosmetic question marks in the GUI.
    None of the VMs were affected by this issue.
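
    For anyone who exports metrics to an external server and wants to rule out this cause, here is a small Python sketch that checks whether a host name still resolves. The function name `can_resolve` and the default port 2003 (commonly used for Graphite's plaintext listener) are assumptions for illustration; nothing here comes from pvestatd itself.

```python
import socket
import sys


def can_resolve(hostname: str, port: int = 2003) -> bool:
    """Return True if hostname resolves to at least one TCP address.

    The port only affects the returned address tuples, not whether
    the name lookup succeeds. 2003 is an assumed default.
    """
    try:
        results = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Raised for lookup failures, including an unreachable DNS server.
        return False
    return len(results) > 0


if __name__ == "__main__":
    # Pass the metrics server names as arguments; with no arguments
    # the script does nothing.
    for host in sys.argv[1:]:
        status = "resolves" if can_resolve(host) else "DNS lookup FAILED"
        print(f"{host}: {status}")
```

    Saved under a name of your choice and run from cron with the metrics server's host name as the argument, this would at least show whether name resolution was broken around the time the question marks appeared.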
     
  5. chchang

    chchang New Member

    I have the same problem, but my DNS server did not "fail".
    Any other suggestions?
     
  6. albert_a

    albert_a New Member

    Same problem.
    Restarting the services doesn't help.
    I have some containers on the "question mark" node, and I noticed that `lxc-ls` hangs.
    Starting a container also hangs.
    The bug is probably in the kernel or in the LXC tools.
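
    When a command hangs like this, one thing worth checking is whether processes are stuck in uninterruptible sleep (state `D`), which usually points at blocked I/O or a kernel-level lock rather than at the userspace tool. A minimal Python sketch that lists such processes by reading `/proc/<pid>/stat`; the function names are made up for this example.

```python
from __future__ import annotations

import os


def parse_stat(stat_line: str) -> tuple[int, str, str]:
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or
    parentheses, so we cut at the first '(' and the last ')'.
    """
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    pid = int(stat_line[:open_idx])
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return pid, comm, state


def blocked_processes(proc_root: str = "/proc") -> list[tuple[int, str]]:
    """Return (pid, comm) for every process currently in state D."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except OSError:
            continue  # process exited while we were scanning
        if state == "D":
            blocked.append((pid, comm))
    return blocked


if __name__ == "__main__":
    for pid, comm in blocked_processes():
        print(f"{pid}\t{comm}")
```

    Processes that stay in this list across several runs are the ones to look up in `dmesg`.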
     
  7. Stoiko Ivanov

    Stoiko Ivanov Proxmox Staff Member
    Staff Member

  8. Elliott Partridge

    Elliott Partridge New Member

    I just encountered this symptom, then realized one of my LXC containers had 100% disk space usage (ZFS subvol). I was able to resize the disk in the web GUI, then the node & all running containers/VMs were restored to the normal green "play" indicator. No reboot was necessary.

    Edit:
    I'm adding more info here, in case there is another problem here that's masked by my apparent solution. Nothing that I could see was in syslog/journalctl. I checked dmesg and found the following:

    Code:
    [May30 01:09] CIFS VFS: Server <SMB host IP redacted> has not responded in 120 seconds. Reconnecting...
    [  +0.010734] CIFS VFS: Free previous auth_key.response = <redacted>
    [May30 01:26] INFO: task apache2:28526 blocked for more than 120 seconds.
    [  +0.000823]       Tainted: P          IO     4.15.18-14-pve #1
    [  +0.000661] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [  +0.000676] apache2         D    0 28526  32642 0x00000100
    [  +0.000004] Call Trace:
    [  +0.000011]  __schedule+0x3e0/0x870
    [  +0.000002]  schedule+0x36/0x80
    [  +0.000003]  rwsem_down_read_failed+0x10a/0x170
    [  +0.000005]  call_rwsem_down_read_failed+0x18/0x30
    [  +0.000002]  ? call_rwsem_down_read_failed+0x18/0x30
    [  +0.000002]  down_read+0x20/0x40
    [  +0.000003]  lookup_slow+0x60/0x170
    [  +0.000002]  ? lookup_fast+0xe8/0x300
    [  +0.000001]  walk_component+0x1c5/0x360
    [  +0.000002]  ? path_init+0x1bd/0x300
    [  +0.000002]  path_lookupat+0x73/0x220
    [  +0.000003]  ? profile_path_perm.part.7+0x78/0xa0
    [  +0.000002]  filename_lookup+0xb8/0x1a0
    [  +0.000004]  ? __check_object_size+0xb3/0x190
    [  +0.000005]  ? strncpy_from_user+0x4d/0x170
    [  +0.000002]  user_path_at_empty+0x36/0x40
    [  +0.000001]  ? user_path_at_empty+0x36/0x40
    [  +0.000004]  vfs_statx+0x76/0xe0
    [  +0.000001]  ? memzero_explicit+0x12/0x20
    [  +0.000002]  SYSC_newstat+0x3d/0x70
    [  +0.000006]  ? __secure_computing+0x3f/0x100
    [  +0.000004]  ? syscall_trace_enter+0xca/0x2e0
    [  +0.000002]  SyS_newstat+0xe/0x10
    [  +0.000002]  do_syscall_64+0x73/0x130
    [  +0.000003]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    [  +0.000003] RIP: 0033:0x7f7c88efd295
    [  +0.000001] RSP: 002b:00007ffe736f7d48 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
    [  +0.000002] RAX: ffffffffffffffda RBX: 00007ffe736f7de0 RCX: 00007f7c88efd295
    [  +0.000001] RDX: 00007ffe736f7d50 RSI: 00007ffe736f7d50 RDI: 00007ffe736f7de0
    [  +0.000001] RBP: 0000000000000002 R08: 000000000000c1de R09: 0000000000000005
    [  +0.000001] R10: 00000000000006c0 R11: 0000000000000246 R12: 00007ffe736f8e00
    [  +0.000001] R13: 000000000000000c R14: 00007ffe736f9040 R15: 00007f7c804244d0
    [May30 01:54] INFO: task apache2:17245 blocked for more than 120 seconds.
    [  +0.000780]       Tainted: P          IO     4.15.18-14-pve #1
    [  +0.000687] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [  +0.000689] apache2         D    0 17245  32642 0x00000100
    [  +0.000004] Call Trace:
    [  +0.000010]  __schedule+0x3e0/0x870
    [  +0.000004]  ? path_parentat+0x3e/0x80
    [  +0.000002]  schedule+0x36/0x80
    [  +0.000003]  rwsem_down_write_failed+0x208/0x390
    [  +0.000002]  ? getname_flags+0x4f/0x1f0
    [  +0.000004]  call_rwsem_down_write_failed+0x17/0x30
    [  +0.000002]  ? call_rwsem_down_write_failed+0x17/0x30
    [  +0.000002]  down_write+0x2d/0x40
    [  +0.000002]  do_unlinkat+0x1a5/0x310
    [  +0.000002]  SyS_unlink+0x1f/0x30
    [  +0.000004]  do_syscall_64+0x73/0x130
    [  +0.000003]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    [  +0.000003] RIP: 0033:0x7f7c88eff0e7
    [  +0.000001] RSP: 002b:00007ffe736f8038 EFLAGS: 00000217 ORIG_RAX: 0000000000000057
    [  +0.000002] RAX: ffffffffffffffda RBX: 00007f7c805fede0 RCX: 00007f7c88eff0e7
    [  +0.000001] RDX: 000000000000001a RSI: 00007f7c59623cc8 RDI: 00007ffe736f8040
    [  +0.000001] RBP: 00007ffe736f9100 R08: 000000000000c1de R09: 0000000000000000
    [  +0.000001] R10: 0000000000000000 R11: 0000000000000217 R12: 00007f7c85f36460
    [  +0.000002] R13: 0000000000000010 R14: 00007f7c804244f0 R15: 00007f7c5a929c58
    [May30 03:09] INFO: task apache2:28426 blocked for more than 120 seconds.
    [  +0.000764]       Tainted: P          IO     4.15.18-14-pve #1
    [  +0.000694] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [  +0.000909] apache2         D    0 28426  32642 0x00000100
    [  +0.000003] Call Trace:
    [  +0.000011]  __schedule+0x3e0/0x870
    [  +0.000004]  ? path_parentat+0x3e/0x80
    [  +0.000002]  schedule+0x36/0x80
    [  +0.000002]  rwsem_down_write_failed+0x208/0x390
    [  +0.000002]  ? getname_flags+0x4f/0x1f0
    [  +0.000005]  call_rwsem_down_write_failed+0x17/0x30
    [  +0.000002]  ? call_rwsem_down_write_failed+0x17/0x30
    [  +0.000002]  down_write+0x2d/0x40
    [  +0.000002]  do_unlinkat+0x1a5/0x310
    [  +0.000002]  SyS_unlink+0x1f/0x30
    [  +0.000004]  do_syscall_64+0x73/0x130
    [  +0.000003]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    [  +0.000003] RIP: 0033:0x7f7c88eff0e7
    [  +0.000001] RSP: 002b:00007ffe736f8038 EFLAGS: 00000217 ORIG_RAX: 0000000000000057
    [  +0.000002] RAX: ffffffffffffffda RBX: 00007f7c805fede0 RCX: 00007f7c88eff0e7
    [  +0.000001] RDX: 0000000000000000 RSI: 00007f7c58423cc8 RDI: 00007ffe736f8040
    [  +0.000001] RBP: 00007ffe736f9100 R08: 000000000000c1de R09: 0000000000000000
    [  +0.000001] R10: 0000000000000000 R11: 0000000000000217 R12: 00007f7c85f36460
    [  +0.000001] R13: 0000000000000010 R14: 00007f7c804244f0 R15: 00007f7c5a929c58
    [May30 03:21] INFO: task apache2:9497 blocked for more than 120 seconds.
    [  +0.000793]       Tainted: P          IO     4.15.18-14-pve #1
    [  +0.000745] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [  +0.000763] apache2         D    0  9497  32642 0x00000100
    [  +0.000003] Call Trace:
    [  +0.000011]  __schedule+0x3e0/0x870
    [  +0.000002]  schedule+0x36/0x80
    [  +0.000003]  rwsem_down_read_failed+0x10a/0x170
    [  +0.000005]  call_rwsem_down_read_failed+0x18/0x30
    [  +0.000002]  ? call_rwsem_down_read_failed+0x18/0x30
    [  +0.000002]  down_read+0x20/0x40
    [  +0.000003]  lookup_slow+0x60/0x170
    [  +0.000001]  ? lookup_fast+0xe8/0x300
    [  +0.000002]  walk_component+0x1c5/0x360
    [  +0.000002]  ? path_init+0x1bd/0x300
    [  +0.000001]  path_lookupat+0x73/0x220
    [  +0.000004]  ? profile_path_perm.part.7+0x78/0xa0
    [  +0.000002]  filename_lookup+0xb8/0x1a0
    [  +0.000004]  ? __check_object_size+0xb3/0x190
    [  +0.000004]  ? strncpy_from_user+0x4d/0x170
    [  +0.000002]  user_path_at_empty+0x36/0x40
    [  +0.000002]  ? user_path_at_empty+0x36/0x40
    [  +0.000003]  vfs_statx+0x76/0xe0
    [  +0.000001]  ? memzero_explicit+0x12/0x20
    [  +0.000002]  SYSC_newstat+0x3d/0x70
    [  +0.000006]  ? __secure_computing+0x3f/0x100
    [  +0.000004]  ? syscall_trace_enter+0xca/0x2e0
    [  +0.000002]  SyS_newstat+0xe/0x10
    [  +0.000002]  do_syscall_64+0x73/0x130
    [  +0.000003]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    [  +0.000002] RIP: 0033:0x7f7c88efd295
    [  +0.000001] RSP: 002b:00007ffe736f7d48 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
    [  +0.000002] RAX: ffffffffffffffda RBX: 00007ffe736f7de0 RCX: 00007f7c88efd295
    [  +0.000001] RDX: 00007ffe736f7d50 RSI: 00007ffe736f7d50 RDI: 00007ffe736f7de0
    [  +0.000001] RBP: 0000000000000002 R08: 000000000000c1de R09: 0000000000000005
    [  +0.000001] R10: 00000000000001f8 R11: 0000000000000246 R12: 00007ffe736f8e00
    [  +0.000002] R13: 000000000000000c R14: 00007ffe736f9040 R15: 00007f7c804244d0
    [May30 03:41] INFO: task apache2:11059 blocked for more than 120 seconds.
    [  +0.000815]       Tainted: P          IO     4.15.18-14-pve #1
    [  +0.000764] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [  +0.000767] apache2         D    0 11059  32642 0x00000100
    [  +0.000004] Call Trace:
    [  +0.000011]  __schedule+0x3e0/0x870
    [  +0.000003]  ? path_parentat+0x3e/0x80
    [  +0.000002]  schedule+0x36/0x80
    [  +0.000003]  rwsem_down_write_failed+0x208/0x390
    [  +0.000005]  call_rwsem_down_write_failed+0x17/0x30
    [  +0.000002]  ? call_rwsem_down_write_failed+0x17/0x30
    [  +0.000002]  down_write+0x2d/0x40
    [  +0.000003]  do_unlinkat+0x1a5/0x310
    [  +0.000002]  SyS_unlink+0x1f/0x30
    [  +0.000004]  do_syscall_64+0x73/0x130
    [  +0.000003]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    [  +0.000003] RIP: 0033:0x7f7c88eff0e7
    [  +0.000001] RSP: 002b:00007ffe736f8038 EFLAGS: 00000217 ORIG_RAX: 0000000000000057
    [  +0.000002] RAX: ffffffffffffffda RBX: 00007f7c805fede0 RCX: 00007f7c88eff0e7
    [  +0.000001] RDX: 0000000000000000 RSI: 00007f7c45223cc8 RDI: 00007ffe736f8040
    [  +0.000001] RBP: 00007ffe736f9100 R08: 000000000000c1de R09: 0000000000000000
    [  +0.000001] R10: 0000000000000000 R11: 0000000000000217 R12: 00007f7c85f36460
    [  +0.000001] R13: 0000000000000010 R14: 00007f7c804254f0 R15: 00007f7c5a929c58
    [  +0.000005] INFO: task apache2:28381 blocked for more than 120 seconds.
    [  +0.000761]       Tainted: P          IO     4.15.18-14-pve #1
    [  +0.000776] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [  +0.000767] apache2         D    0 28381  32642 0x00000100
    [  +0.000003] Call Trace:
    [  +0.000003]  __schedule+0x3e0/0x870
    [  +0.000002]  ? path_parentat+0x3e/0x80
    [  +0.000002]  schedule+0x36/0x80
    [  +0.000002]  rwsem_down_write_failed+0x208/0x390
    [  +0.000004]  call_rwsem_down_write_failed+0x17/0x30
    [  +0.000002]  ? call_rwsem_down_write_failed+0x17/0x30
    [  +0.000002]  down_write+0x2d/0x40
    [  +0.000001]  do_unlinkat+0x1a5/0x310
    [  +0.000002]  SyS_unlink+0x1f/0x30
    [  +0.000002]  do_syscall_64+0x73/0x130
    [  +0.000003]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    [  +0.000001] RIP: 0033:0x7f7c88eff0e7
    [  +0.000001] RSP: 002b:00007ffe736f8038 EFLAGS: 00000217 ORIG_RAX: 0000000000000057
    [  +0.000001] RAX: ffffffffffffffda RBX: 00007f7c805fede0 RCX: 00007f7c88eff0e7
    [  +0.000002] RDX: 0000000000000000 RSI: 00007f7c45023cc8 RDI: 00007ffe736f8040
    [  +0.000001] RBP: 00007ffe736f9100 R08: 000000000000c1de R09: 0000000000000000
    [  +0.000001] R10: 0000000000000000 R11: 0000000000000217 R12: 00007f7c85f36460
    [  +0.000001] R13: 0000000000000010 R14: 00007f7c804254f0 R15: 00007f7c5a929c58
    [  +0.000002] INFO: task apache2:28383 blocked for more than 120 seconds.
    [  +0.000760]       Tainted: P          IO     4.15.18-14-pve #1
    [  +0.000745] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [  +0.000774] apache2         D    0 28383  32642 0x00000100
    [  +0.000002] Call Trace:
    [  +0.000003]  __schedule+0x3e0/0x870
    [  +0.000011]  ? spl_kmem_cache_alloc+0x72/0x8c0 [spl]
    [  +0.000002]  schedule+0x36/0x80
    [  +0.000002]  rwsem_down_read_failed+0x10a/0x170
    [  +0.000003]  call_rwsem_down_read_failed+0x18/0x30
    [  +0.000001]  ? call_rwsem_down_read_failed+0x18/0x30
    [  +0.000002]  down_read+0x20/0x40
    [  +0.000002]  lookup_slow+0x60/0x170
    [  +0.000001]  ? lookup_fast+0xe8/0x300
    [  +0.000002]  walk_component+0x1c5/0x360
    [  +0.000002]  ? path_init+0x1bd/0x300
    [  +0.000002]  path_lookupat+0x73/0x220
    [  +0.000002]  ? profile_path_perm.part.7+0x78/0xa0
    [  +0.000002]  filename_lookup+0xb8/0x1a0
    [  +0.000003]  ? __check_object_size+0xb3/0x190
    [  +0.000004]  ? strncpy_from_user+0x4d/0x170
    [  +0.000002]  user_path_at_empty+0x36/0x40
    [  +0.000002]  ? user_path_at_empty+0x36/0x40
    [  +0.000003]  vfs_statx+0x76/0xe0
    [  +0.000002]  ? memzero_explicit+0x12/0x20
    [  +0.000002]  SYSC_newstat+0x3d/0x70
    [  +0.000005]  ? __secure_computing+0x3f/0x100
    [  +0.000002]  ? syscall_trace_enter+0xca/0x2e0
    [  +0.000003]  SyS_newstat+0xe/0x10
    [  +0.000001]  do_syscall_64+0x73/0x130
    [  +0.000003]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    [  +0.000001] RIP: 0033:0x7f7c88efd295
    [  +0.000001] RSP: 002b:00007ffe736f7d48 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
    [  +0.000001] RAX: ffffffffffffffda RBX: 00007ffe736f7de0 RCX: 00007f7c88efd295
    [  +0.000001] RDX: 00007ffe736f7d50 RSI: 00007ffe736f7d50 RDI: 00007ffe736f7de0
    [  +0.000002] RBP: 0000000000000002 R08: 000000000000c1de R09: 0000000000000005
    [  +0.000001] R10: 0000000000000140 R11: 0000000000000246 R12: 00007ffe736f8e00
    [  +0.000001] R13: 000000000000000c R14: 00007ffe736f9040 R15: 00007f7c804254d0
    [  +0.000007] INFO: task apache2:7623 blocked for more than 120 seconds.
    [  +0.000804]       Tainted: P          IO     4.15.18-14-pve #1
    [  +0.000803] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [  +0.000809] apache2         D    0  7623  32642 0x00000100
    [  +0.000002] Call Trace:
    [  +0.000004]  __schedule+0x3e0/0x870
    [  +0.000002]  schedule+0x36/0x80
    [  +0.000002]  rwsem_down_read_failed+0x10a/0x170
    [  +0.000002]  call_rwsem_down_read_failed+0x18/0x30
    [  +0.000002]  ? call_rwsem_down_read_failed+0x18/0x30
    [  +0.000002]  down_read+0x20/0x40
    [  +0.000002]  lookup_slow+0x60/0x170
    [  +0.000001]  ? lookup_fast+0xe8/0x300
    [  +0.000002]  walk_component+0x1c5/0x360
    [  +0.000002]  ? path_init+0x1bd/0x300
    [  +0.000001]  path_lookupat+0x73/0x220
    [  +0.000002]  ? profile_path_perm.part.7+0x78/0xa0
    [  +0.000003]  filename_lookup+0xb8/0x1a0
    [  +0.000003]  ? __check_object_size+0xb3/0x190
    [  +0.000002]  ? strncpy_from_user+0x4d/0x170
    [  +0.000002]  user_path_at_empty+0x36/0x40
    [  +0.000001]  ? user_path_at_empty+0x36/0x40
    [  +0.000002]  vfs_statx+0x76/0xe0
    [  +0.000002]  ? memzero_explicit+0x12/0x20
    [  +0.000002]  SYSC_newstat+0x3d/0x70
    [  +0.000002]  ? __secure_computing+0x3f/0x100
    [  +0.000002]  ? syscall_trace_enter+0xca/0x2e0
    [  +0.000003]  SyS_newstat+0xe/0x10
    [  +0.000001]  do_syscall_64+0x73/0x130
    [  +0.000002]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    [  +0.000002] RIP: 0033:0x7f7c88efd295
    [  +0.000000] RSP: 002b:00007ffe736f7d48 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
    [  +0.000002] RAX: ffffffffffffffda RBX: 00007ffe736f7de0 RCX: 00007f7c88efd295
    [  +0.000001] RDX: 00007ffe736f7d50 RSI: 00007ffe736f7d50 RDI: 00007ffe736f7de0
    [  +0.000001] RBP: 0000000000000002 R08: 000000000000c1de R09: 0000000000000005
    [  +0.000001] R10: 00000000000004c8 R11: 0000000000000246 R12: 00007ffe736f8e00
    [  +0.000001] R13: 000000000000000c R14: 00007ffe736f9040 R15: 00007f7c804254d0
    [  +0.000003] INFO: task apache2:9497 blocked for more than 120 seconds.
    [  +0.000825]       Tainted: P          IO     4.15.18-14-pve #1
    [  +0.000834] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [  +0.000870] apache2         D    0  9497  32642 0x00000100
    [  +0.000003] Call Trace:
    [  +0.000003]  __schedule+0x3e0/0x870
    [  +0.000002]  ? path_parentat+0x3e/0x80
    [  +0.000001]  schedule+0x36/0x80
    [  +0.000002]  rwsem_down_write_failed+0x208/0x390
    [  +0.000002]  ? getname_flags+0x4f/0x1f0
    [  +0.000003]  call_rwsem_down_write_failed+0x17/0x30
    [  +0.000001]  ? call_rwsem_down_write_failed+0x17/0x30
    [  +0.000002]  down_write+0x2d/0x40
    [  +0.000002]  do_unlinkat+0x1a5/0x310
    [  +0.000002]  SyS_unlink+0x1f/0x30
    [  +0.000002]  do_syscall_64+0x73/0x130
    [  +0.000002]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    [  +0.000001] RIP: 0033:0x7f7c88eff0e7
    [  +0.000001] RSP: 002b:00007ffe736f8038 EFLAGS: 00000217 ORIG_RAX: 0000000000000057
    [  +0.000002] RAX: ffffffffffffffda RBX: 00007f7c805fede0 RCX: 00007f7c88eff0e7
    [  +0.000001] RDX: 0000000000000000 RSI: 00007f7c59a23cc8 RDI: 00007ffe736f8040
    [  +0.000001] RBP: 00007ffe736f9100 R08: 000000000000c1de R09: 0000000000000000
    [  +0.000001] R10: 0000000000000000 R11: 0000000000000217 R12: 00007f7c85f36460
    [  +0.000002] R13: 0000000000000010 R14: 00007f7c804244f0 R15: 00007f7c5a929c58
    [  +0.000006] INFO: task apache2:20033 blocked for more than 120 seconds.
    [  +0.000857]       Tainted: P          IO     4.15.18-14-pve #1
    [  +0.000866] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [  +0.000893] apache2         D    0 20033  32642 0x00000100
    [  +0.000002] Call Trace:
    [  +0.000003]  __schedule+0x3e0/0x870
    [  +0.000002]  schedule+0x36/0x80
    [  +0.000002]  rwsem_down_read_failed+0x10a/0x170
    [  +0.000002]  call_rwsem_down_read_failed+0x18/0x30
    [  +0.000002]  ? call_rwsem_down_read_failed+0x18/0x30
    [  +0.000002]  down_read+0x20/0x40
    [  +0.000002]  lookup_slow+0x60/0x170
    [  +0.000001]  ? lookup_fast+0xe8/0x300
    [  +0.000002]  walk_component+0x1c5/0x360
    [  +0.000001]  ? path_init+0x1bd/0x300
    [  +0.000002]  path_lookupat+0x73/0x220
    [  +0.000002]  ? profile_path_perm.part.7+0x78/0xa0
    [  +0.000002]  filename_lookup+0xb8/0x1a0
    [  +0.000002]  ? __check_object_size+0xb3/0x190
    [  +0.000003]  ? strncpy_from_user+0x4d/0x170
    [  +0.000001]  user_path_at_empty+0x36/0x40
    [  +0.000002]  ? user_path_at_empty+0x36/0x40
    [  +0.000002]  vfs_statx+0x76/0xe0
    [  +0.000001]  ? memzero_explicit+0x12/0x20
    [  +0.000002]  SYSC_newstat+0x3d/0x70
    [  +0.000003]  ? __secure_computing+0x3f/0x100
    [  +0.000002]  ? syscall_trace_enter+0xca/0x2e0
    [  +0.000002]  SyS_newstat+0xe/0x10
    [  +0.000002]  do_syscall_64+0x73/0x130
    [  +0.000002]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    [  +0.000001] RIP: 0033:0x7f7c88efd295
    [  +0.000001] RSP: 002b:00007ffe736f7d48 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
    [  +0.000001] RAX: ffffffffffffffda RBX: 00007ffe736f7de0 RCX: 00007f7c88efd295
    [  +0.000001] RDX: 00007ffe736f7d50 RSI: 00007ffe736f7d50 RDI: 00007ffe736f7de0
    [  +0.000001] RBP: 0000000000000002 R08: 000000000000c1de R09: 0000000000000005
    [  +0.000001] R10: 0000000000000478 R11: 0000000000000246 R12: 00007ffe736f8e00
    [  +0.000001] R13: 000000000000000c R14: 00007ffe736f9040 R15: 00007f7c804254d0
    [May30 04:17] EXT4-fs (loop2): mounted filesystem with ordered data mode. Opts: (null)
    [May30 09:06] EXT4-fs (loop1): error count since last fsck: 1
    [  +0.000016] EXT4-fs (loop1): initial error at time 1559037951: kmmpd:178
    [  +0.000004] EXT4-fs (loop1): last error at time 1559037951: kmmpd:178
    The PVE stats are blank between 3 AM and 9 AM (when I increased the disk size). For some more info, the offending container was running Nextcloud, and I was generating image previews overnight. So it's not really surprising that disk usage blew up, but the response of Proxmox was a little concerning.
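
    Since a full container disk was the trigger here, a simple usage check run on a schedule might catch it before the node goes grey. A rough Python sketch, assuming the container root filesystems are mounted on the host at paths you pass in yourself; the 90% threshold and the function names are arbitrary choices for illustration.

```python
import shutil
import sys


def usage_percent(path: str) -> float:
    """Return the used space of the filesystem containing path, in percent."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # some pseudo filesystems report a size of zero
    return 100.0 * usage.used / usage.total


def over_threshold(paths, threshold: float = 90.0):
    """Return (path, percent) for every path at or above the threshold."""
    full = []
    for path in paths:
        pct = usage_percent(path)
        if pct >= threshold:
            full.append((path, pct))
    return full


if __name__ == "__main__":
    # Paths to check come from the command line; "/" is only a fallback.
    paths = sys.argv[1:] or ["/"]
    for path, pct in over_threshold(paths):
        print(f"WARNING: {path} is {pct:.1f}% full")
```

    Which paths to pass depends on the storage layout; for ZFS subvolumes these would be the dataset mount points on the host.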
     
    #48 Elliott Partridge, May 30, 2019
    Last edited: May 30, 2019