
reboots hang with "watchdog did not stop"

Discussion in 'Proxmox VE: Installation and configuration' started by RobFantini, Jan 8, 2017.

  1. RobFantini

    RobFantini Active Member
    Proxmox VE Subscriber

    Joined:
    May 24, 2012
    Messages:
    1,186
    Likes Received:
    7
    Hello

    Since at least mid-December 2016, 3 of 4 nodes take a long time to reboot. When I am on site, I do a manual reset.

    The system console shows this (may not be exact):

    Code:
    watchdog  watchdog0: watchdog did not stop!
    
    As far as I know, the system will eventually restart after what seems like a long time.

    Version info:
    Code:
    # pveversion --verbose
    proxmox-ve: 4.4-77 (running kernel: 4.4.35-1-pve)
    pve-manager: 4.4-5 (running version: 4.4-5/c43015a5)
    pve-kernel-4.4.35-1-pve: 4.4.35-77
    lvm2: 2.02.116-pve3
    corosync-pve: 2.4.0-1
    libqb0: 1.0-1
    pve-cluster: 4.0-48
    qemu-server: 4.0-102
    pve-firmware: 1.1-10
    libpve-common-perl: 4.0-85
    libpve-access-control: 4.0-19
    libpve-storage-perl: 4.0-71
    pve-libspice-server1: 0.12.8-1
    vncterm: 1.2-1
    pve-docs: 4.4-1
    pve-qemu-kvm: 2.7.0-10
    pve-container: 1.0-90
    pve-firewall: 2.0-33
    pve-ha-manager: 1.0-38
    ksm-control-daemon: 1.2-1
    glusterfs-client: 3.5.2-2+deb8u2
    lxc-pve: 2.0.6-5
    lxcfs: 2.0.5-pve2
    criu: 1.6.0-1
    novnc-pve: 0.5-8
    smartmontools: 6.5+svn4324-1~pve80
    zfsutils: 0.6.5.8-pve13~bpo80
    ceph: 10.2.5-1~bpo80+1
    
     
  2. fireon

    fireon Well-Known Member
    Proxmox VE Subscriber

    Joined:
    Oct 25, 2010
    Messages:
    1,632
    Likes Received:
    59
    Same problem here, same system version.
     
  3. fireon

    fireon Well-Known Member
    Proxmox VE Subscriber

    Joined:
    Oct 25, 2010
    Messages:
    1,632
    Likes Received:
    59
  4. mir

    mir Well-Known Member
    Proxmox VE Subscriber

    Joined:
    Apr 14, 2012
    Messages:
    3,368
    Likes Received:
    79
    I see you are not running the latest kernel. Maybe a kernel upgrade will resolve the issue.
     
  5. fireon

    fireon Well-Known Member
    Proxmox VE Subscriber

    Joined:
    Oct 25, 2010
    Messages:
    1,632
    Likes Received:
    59
    It did not. It really depends on the hardware. I have never had this on new Dell servers, but I see it on every new Supermicro with ZFS, and sometimes on some HP ML350s.
     
  6. Kevo

    Kevo New Member

    Joined:
    May 7, 2017
    Messages:
    9
    Likes Received:
    0
    I'm also having this issue on my new v5 installs. Is there any solution/workaround for this yet? It really adds a significant delay to reboots.
     
  7. fireon

    fireon Well-Known Member
    Proxmox VE Subscriber

    Joined:
    Oct 25, 2010
    Messages:
    1,632
    Likes Received:
    59
    Same here, sometimes. So... NOT FIXED!
     
  8. Kevo

    Kevo New Member

    Joined:
    May 7, 2017
    Messages:
    9
    Likes Received:
    0
    I've modified the systemd config file so the watchdog timeout is 10 seconds. That seems to limit the delay to about 10 seconds, so I'm going to use that as a workaround for now.
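
    The post does not name the file or the option that was changed. A minimal sketch of what such a change might look like, assuming the setting meant is systemd's ShutdownWatchdogSec= in /etc/systemd/system.conf (its documented default is 10min, which would be consistent with the long delays reported in this thread):

    ```ini
    # /etc/systemd/system.conf
    # Assumption: this is the option the post refers to; the post itself does not say.
    # ShutdownWatchdogSec sets the hardware watchdog timeout used during reboot.
    # Newer systemd releases call this option RebootWatchdogSec= instead.
    [Manager]
    ShutdownWatchdogSec=10s
    ```

    The manager has to re-read its configuration before the new value is used, for example with `systemctl daemon-reexec` or after the next boot. This only shortens the wait; it does not address why the watchdog fails to stop.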
     
  9. fireon

    fireon Well-Known Member
    Proxmox VE Subscriber

    Joined:
    Oct 25, 2010
    Messages:
    1,632
    Likes Received:
    59
    Could this be a problem if I have a cluster?
     
  10. rwadi

    rwadi New Member

    Joined:
    Aug 5, 2013
    Messages:
    8
    Likes Received:
    0
    We are all seeing this on new v5 installs. Updates are coming from the enterprise repo and we are fully updated.

    Rebooting a node sits on "watchdog watchdog0: watchdog did not stop!" for several minutes before the host reboots.
     
