[SOLVED] Hookscript doesn't appear to run on VM starting

ArnaudN

Member
Sep 11, 2022
4
0
6
France
Hello :)

I have a VM running on PVE 8.3.3 with a hookscript setup on it.
The hookscript appears to be running only when the VM shutdowns (either from the VM itself or via PVE webui). As part of the troubleshooting of this, I have replaced my own brewed script with the example PERL script and the same behavior is observed.
On the "System Log" of the node hosting my VM (VMID is 4020) I can only see the output for the "stop" phase as below

Code:
Feb 12 13:19:37 HW-1-NDZ-001-AeosPrime pvedaemon[710948]: start VM 4020: UPID:HW-1-NDZ-001-AeosPrime:000AD924:00E15ED3:67AC91D9:qmstart:4020:root@pam:
Feb 12 13:19:37 HW-1-NDZ-001-AeosPrime pvedaemon[703744]: <root@pam> starting task UPID:HW-1-NDZ-001-AeosPrime:000AD924:00E15ED3:67AC91D9:qmstart:4020:root@pam:
[... Removed Pass-through output info for clarity ...]
Feb 12 13:19:38 HW-1-NDZ-001-AeosPrime systemd[1]: Started 4020.scope.
[... Removed Pass-through output info for clarity ...]
Feb 12 13:19:42 HW-1-NDZ-001-AeosPrime pvedaemon[710948]: VM 4020 started with PID 710982.
[... Removed Pass-through output info for clarity ...]
Feb 12 15:12:06 HW-1-NDZ-001-AeosPrime qmeventd[763273]: Starting cleanup for 4020
Feb 12 15:12:06 HW-1-NDZ-001-AeosPrime qmeventd[763273]: GUEST HOOK: 4020 post-stop
Feb 12 15:12:06 HW-1-NDZ-001-AeosPrime qmeventd[763273]: 4020 stopped. Doing cleanup.
Feb 12 15:12:06 HW-1-NDZ-001-AeosPrime qmeventd[763273]: Finished cleanup for 4020
[... Removed Pass-through output info for clarity ...]
Feb 12 15:12:09 HW-1-NDZ-001-AeosPrime systemd[1]: 4020.scope: Deactivated successfully.
Feb 12 15:12:09 HW-1-NDZ-001-AeosPrime systemd[1]: 4020.scope: Consumed 18min 25.597s CPU time.

The hookscript itself (example script cleaned up from comments):
Perl:
#!/usr/bin/perl

use strict;
use warnings;
print "GUEST HOOK: " . join(' ', @ARGV). "\n";

my $vmid = shift;
my $phase = shift;

if ($phase eq 'pre-start') {
    print "$vmid is starting, doing preparations.\n";
} elsif ($phase eq 'post-start') {
    print "$vmid started successfully.\n";
} elsif ($phase eq 'pre-stop') {
    print "$vmid will be stopped.\n";
} elsif ($phase eq 'post-stop') {
    print "$vmid stopped. Doing cleanup.\n";
} else {
    die "got unknown phase '$phase'\n";
}
exit(0);



I have no idea when it stopped working, I had CPU pining performed for the VM to contain it on one CCD of my CPU. This seems to not apply anymore and it triggering checking the hookscript.

Does any of you have an idea why the "pre-start" and "post-start" are triggering output ?
Is there any additional logs I can check to confirm the behavior of hookscript ?

I'm unsure what additional info you may want, please don't ignore if you are missing something simply tell me and I will provide.

ArnaudN
 
Hi ArnaudN!

The hookscript appears to be running only when the VM shutdowns (either from the VM itself or via PVE webui). As part of the troubleshooting of this, I have replaced my own brewed script with the example PERL script and the same behavior is observed.
On the "System Log" of the node hosting my VM (VMID is 4020) I can only see the output for the "stop" phase as below
What does your actual hook script look like?

I can see where this could be confusing, but the hook script should be run on each of those four phases, but the reason why there's only a syslog item for the "post-stop" phase is because any prints from the other three phases are discarded since they are not forwarded to the syslog. You should see this behavior if you change the print statements, e.g. to write to a file:

Perl:
#!/usr/bin/perl
use strict;
use warnings;

open(FH, '>>', '/foo/bar') or die $!;
print FH "GUEST HOOK: " . join(' ', @ARGV). "\n";

my $vmid = shift;
my $phase = shift;

if ($phase eq 'pre-start') {
    print FH "$vmid is starting, doing preparations.\n";
} elsif ($phase eq 'post-start') {
    print FH "$vmid started successfully.\n";
} elsif ($phase eq 'pre-stop') {
    print FH "$vmid will be stopped.\n";
} elsif ($phase eq 'post-stop') {
    print FH "$vmid stopped. Doing cleanup.\n";
} else {
    die "got unknown phase '$phase'\n";
}
close(FH);
exit(0);

which generates the output

Bash:
# cat /foo/bar
GUEST HOOK: 101 pre-start
101 is starting, doing preparations.
GUEST HOOK: 101 post-start
101 started successfully.
GUEST HOOK: 101 pre-stop
101 will be stopped.
GUEST HOOK: 101 post-stop
101 stopped. Doing cleanup.
 
  • Like
Reactions: ArnaudN
Hi Daniel and thank you for your answer.

The documentation about hookscript doesn't mention any of this (or I missed it a few times) and neither does the sample hookscript. Additionnaly seeing all phases have "print", I assumed I will see these output in System Log the same way does "post-stop" print.

Currently my hookscript is the one "quoted" above from the sample hookscript, I put in place to troubleshoot the behavior. So the current hookscript does absolutely nothing except the different "print".

My original hookscript is supposed to do CPUs pinning and I couldn't see the expected output from the script in System Log and I'm still unsure why the CPU pinning isn't operational if the previous script was running.

I will test writing out to a file as per your recommendation using the sample script for test run and see then if it works as intended. Then adapt my own brewed script accordingly to test further and if all goes well there too, I guess I will need to investigate the CPU pinning separetely.

Be right back with feedback about the above suggestions.
So indeed writing to file definitively shows each stages are running properly and you were right about me being confused about the 4 different phases "not running". Therefore the hookscript isn't the culprit of my CPU pinning issues, I will investigate that further then.

I will mark as Solved if possible. If I cannot figure out my CPU pinning issues I may come back for further assistance on that other topic even if it is border line of support, maybe the community will have some nice inputs to share.

Thanks again Daniel / @dakralex
 
Last edited:
Last quick update, for future readers that maybe coming here for similar hookscript not applying their CPU pinning:
I disabled the hookscript and used the Processors "CPU affinity" option on the VM Hardware configuration view. Applied and restarted the VM.
That works like a charm and confirmed using stress test that all CPU cores/threads are pinned to my wishes, this option seems new/relatively recent I have no recollection of it.
 
I will mark as Solved if possible. If I cannot figure out my CPU pinning issues I may come back for further assistance on that other topic even if it is border line of support, maybe the community will have some nice inputs to share.
Great to hear that it worked out both problems in the end!