Server hanging: "blocked for more than 120 seconds"

Discussion in 'Proxmox VE: Installation and configuration' started by altano, May 14, 2019.

  1. altano

    altano New Member

    Joined:
    Apr 6, 2019
    Messages:
    6
    Likes Received:
    0
    Out of the blue my server started hanging and I don't know why. When it hangs a few things happen:

    1) The web GUI, ssh, and all other remote access become totally unresponsive
    2) All my VMs are hung as well
    3) I can't interact with the machine, but the console will have errors that look like:

    [screenshot: upload_2019-5-13_19-38-31.png]

    The first part transcribed, for better searchability:

    Code:
    INFO: task kworker/u256:5:279 blocked for more than 120 seconds.
          Tainted: P           O     4.15.18-14-pve #1
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    ...
    I've found a number of other threads where people have SIMILAR error messages, but they are either errors in their guest VMs and not the Proxmox host itself, or seem to be unrelated in other ways.

    Usually when I reboot everything is fine again, so I'm not sure how to diagnose this problem. Can anyone think of anything that might help me figure out what's going on here?

    Thanks!
     
  2. dietmar

    dietmar Proxmox Staff Member
    Staff Member

    Joined:
    Apr 28, 2005
    Messages:
    16,444
    Likes Received:
    304
    This is usually a problem with the storage. What is the output of

    # pvesm status
     
  3. altano

    altano New Member

    Joined:
    Apr 6, 2019
    Messages:
    6
    Likes Received:
    0
    Code:
    root@red:~# pvesm status
    Name              Type     Status           Total            Used       Available        %
    local              dir     active        28510260         2562444        24476536    8.99%
    local-lvm      lvmthin     active        62562304               0        62562304    0.00%
    nfsproxmox         nfs     active     13341864960       153466880     13188398080    1.15%
    All my VMs are on "nfsproxmox" which is an NFS share backed by a ZFS pool on another machine. "local" is an M.2 NVME ssd that proxmox is installed onto.

    And thanks for replying!
     
  4. dietmar

    dietmar Proxmox Staff Member
    Staff Member

    Joined:
    Apr 28, 2005
    Messages:
    16,444
    Likes Received:
    304
    Above is the output from pvesm when the server starts hanging?
     
  5. altano

    altano New Member

    Joined:
    Apr 6, 2019
    Messages:
    6
    Likes Received:
    0
    No it’s the output from right now while it’s running fine.

    I can’t run that command while it’s hung because the host is totally unresponsive over both the local shell and ssh. The error output above is from the local console, but typing there doesn’t do anything.
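
Since `pvesm status` cannot be run once the host is hung, one possible workaround is to record the storage state periodically while the host is still responsive, so that the last entries before a hang survive the reboot. The following is a minimal sketch, not an official Proxmox tool: `inactive_storages` is a hypothetical helper name, the log path is an assumption, and the filter assumes the column layout shown in the output earlier in the thread (Name, Type, Status, ...).

```shell
#!/bin/sh
# Sketch only: list storages that `pvesm status` does not report as "active".
# Reads `pvesm status` output on stdin and assumes this column layout:
#   Name  Type  Status  Total  Used  Available  %
# inactive_storages is a hypothetical helper, not part of Proxmox.
inactive_storages() {
    # Skip the header row, then print the name and status of every
    # storage whose third column is anything other than "active".
    awk 'NR > 1 && $3 != "active" { print $1, $3 }'
}

# Possible use from a cron job (log path is an assumption):
#   pvesm status | inactive_storages >> /var/log/storage-check.log
```

An empty result means every listed storage was active at the time of the check. If the NFS share itself is stalled, the `pvesm status` call may block as well, so a missing log entry can also be a hint.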
     
  6. altano

    altano New Member

    Joined:
    Apr 6, 2019
    Messages:
    6
    Likes Received:
    0
    Ugh, this is happening every few days. My server is unusable. Anyone have any ideas?
     
  7. pottsi

    pottsi New Member

    Joined:
    Jan 18, 2016
    Messages:
    8
    Likes Received:
    0
    I have the same issue, but I can't get error logs as it's a remote server. It's been happening for some time now. I checked my hosting logs, and over a 3-month period I've been rebooting my server on average every 2.25 days. In my hosting panel it shows as online, but I'm unable to ssh to the server and all my VMs, containers, and the GUI are unresponsive.
     
  8. altano

    altano New Member

    Joined:
    Apr 6, 2019
    Messages:
    6
    Likes Received:
    0
    I went ~two weeks without this hang and it just happened again. Interestingly, my containers and VMs are running fine. I just can't access the Proxmox web GUI or ssh into the host. Local console is also hung. Anyone have any ideas how I might go about diagnosing this problem WHILE the machine is in this state?

    Once I reboot I can look at some of the logging I turned on but since the machine is semi-usable I figured someone might have some ideas?
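
After a reboot, the hung-task reports can be pulled out of whatever kernel log survived, which at least shows which tasks were blocked and in what order. The following is a minimal sketch: `hung_tasks` is a hypothetical helper name, and the pattern assumes the message format transcribed in the first post.

```shell
#!/bin/sh
# Sketch only: extract hung-task reports from kernel log text on stdin.
# Assumes lines of the form seen in the first post:
#   INFO: task <name>:<pid> blocked for more than <N> seconds.
# hung_tasks is a hypothetical helper name.
hung_tasks() {
    # Print "<name> <pid> <seconds>" for each matching line.
    # The task name may itself contain colons (e.g. kworker/u256:5),
    # so the pid is taken as the last colon-separated number before
    # the word "blocked".
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds\..*/\1 \2 \3/p'
}

# Possible use after a reboot, assuming the journal is stored persistently:
#   journalctl -k -b -1 | hung_tasks
```

If the journal is kept only in memory, the previous boot's messages will not be available, and another saved copy of the kernel log would have to be piped in instead.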
     