Notes |
|
(0006342)
|
ja
|
2009-06-26 14:52
|
|
Do I understand correctly that the pid_file directive is missing from your config file?
If so, this should do the trick in /var/opt/csw/svc/method/svc-cswnrpe. Do you agree?
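'stop')
    if [ -f "$pidfile" ]; then
        # kill the pid recorded in the pid file, but only if an nrpe process exists
        [ -n "`pgrep -x -u 0,1,$NRPE_USER nrpe`" ] && /usr/bin/kill `head -1 "$pidfile"`
        rm "$pidfile"
    else
        # no pid file: fall back to killing whatever pgrep finds
        /usr/bin/kill `pgrep -x -u 0,1,$NRPE_USER nrpe`
    fi
    ;; |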
|
|
(0006344)
|
gadavis
|
2009-06-27 01:07
|
|
The restart function looks like it would still be broken, but the stop change looks like it will work.
Now that I look at things more closely, I would almost consider treating a configuration file without a pid_file declaration as an error on Solaris 10 and higher. pgrep will find multiple pids if it is run in the global zone while non-global zones are running nrpe as well. As it currently stands, the script will attempt to kill all of them if it is run without a pidfile.
You might also want to replace lines 32 and 33 with:
pidfile=`awk -F'=' '/^[ \t]*pid_file/ {print $2}' $CONFIG_FILE`
NRPE_USER=`awk -F'=' '/^[ \t]*nrpe_user/ { print $2 }' $CONFIG_FILE`
This fixes two problems: config options with spaces at the beginning of the line are now matched, and commented-out pid_file lines are ignored. |
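A minimal sketch of the same parsing idea, wrapped in a helper so it can be tried against a scratch file. The function name `cfg_value` is hypothetical and not part of the method script. It also trims whitespace around the value and stops at the first match, which goes slightly beyond the two awk lines above. It assumes an awk that supports `-v` (on Solaris that would be nawk rather than /usr/bin/awk).

```shell
# Hypothetical helper, not part of svc-cswnrpe: print the value assigned to
# KEY in an nrpe-style "key=value" config file.
# Leading blanks before the key are allowed; lines starting with '#' never
# match, so commented-out settings are skipped.
cfg_value() {
    key=$1
    file=$2
    awk -F'=' -v key="$key" '
        $0 ~ "^[ \t]*" key "[ \t]*=" {
            val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", val)   # trim blanks around the value
            print val
            exit                               # first active setting wins
        }' "$file"
}
```

In the method script this could be used as pidfile=`cfg_value pid_file "$CONFIG_FILE"`, with an empty result meaning that no active pid_file line was found.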
|
|
(0006345)
|
ja
|
2009-06-27 12:39
|
|
Good point with the zones! What do you think about this?
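'stop')
    if [ -f "$pidfile" ]; then
        # kill the pid recorded in the pid file, then remove the pid file
        [ -n "`pgrep -x -u 0,1,$NRPE_USER nrpe`" ] && /usr/bin/kill `head -1 "$pidfile"`
        rm "$pidfile"
    else
        # no pid file: fall back to pgrep (no zones on SunOS 5.8 and 5.9)
        if [ `uname -r` = 5.8 -o `uname -r` = 5.9 ]
        then
            /usr/bin/kill `pgrep -x -u 0,1,$NRPE_USER nrpe`
        else
            # restrict the match to the zone the script is running in
            /usr/bin/kill `pgrep -x -u 0,1,$NRPE_USER -z \`zonename\` nrpe`
        fi
    fi
    ;;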
This works reliably for me in a global zone and works around a missing pid_file line in the config.
Thanks for the modified lines 32 and 33, cool! |
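The zone handling can be isolated in a small helper, sketched here under the assumption that the caller passes an empty zone name on releases without zones. The name `nrpe_pgrep_args` is hypothetical. The usage comment also guards against an empty pid list, which the stop section does not do.

```shell
# Hypothetical helper: print the pgrep arguments used to find nrpe.
# user: account nrpe runs as; zone: current zone name, or empty on
# releases without zones (SunOS 5.8 and 5.9).
nrpe_pgrep_args() {
    user=$1
    zone=$2
    if [ -n "$zone" ]; then
        # limit the match to one zone so a global-zone run does not
        # pick up nrpe processes from non-global zones
        echo "-x -u 0,1,$user -z $zone nrpe"
    else
        echo "-x -u 0,1,$user nrpe"
    fi
}

# Usage sketch (commented out, needs Solaris pgrep and zonename):
# case `uname -r` in
#     5.8|5.9) zone="" ;;
#     *)       zone=`zonename` ;;
# esac
# args=`nrpe_pgrep_args "$NRPE_USER" "$zone"`
# pids=`pgrep $args`
# [ -n "$pids" ] && /usr/bin/kill $pids
```

Keeping the argument construction separate from the kill makes it possible to check which processes would be matched before anything is signalled.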
|
|
(0006360)
|
gadavis
|
2009-06-29 22:32
|
|
Looks like it should work |
|
|
(0006361)
|
ja
|
2009-06-29 23:39
|
|
I put packages with the fixed start / stop script into testing. Please test them. If there aren't any other issues, I will release them at the end of the week. |
|
|
(0006381)
|
gadavis
|
2009-07-03 02:51
|
|
I'm not quite sure where to look for this package. I don't see it on the ibiblio or purdue mirrors under unstable or testing in the 5.10 directories. Most recent version I see is: nrpe-2.12,REV=2009.06.25-SunOS5.8-sparc-CSW, and this version predates me opening this ticket.
Am I looking in the right places? |
|
|
(0006382)
|
ja
|
2009-07-03 08:55
|
|
|
|
(0006406)
|
gadavis
|
2009-07-07 18:36
|
|
I tried to install the package but got errors in the non-global zones when the zones are not booted. It only installs in zones that are currently running.
I don't think I had noticed the error before, but the old versions of the package apparently give the same error.
# zoneadm list -cv
ID NAME STATUS PATH BRAND IP
0 global running / native shared
1 anfweb-dev running /zones/anfweb-dev native shared
- anfwfproc installed /zones/anfwfproc native shared
# pkgadd -d nrpe-2.12\,REV\=2009.06.30-SunOS5.8-sparc-CSW.pkg all
## Verifying package <CSWnrpe> dependencies in zone <anfweb-dev>
## Booting non-running zone <anfwfproc> into administrative state
## Verifying package <CSWnrpe> dependencies in zone <anfwfproc>
## Restoring state of global zone <anfwfproc>
The package <CSWnrpe> contains scripts which will be executed on
zones <anfwfproc, anfweb-dev> with super-user permission during the
process of installing this package.
Do you want to continue with the installation of <CSWnrpe> [y,n,?] y
Processing package instance <CSWnrpe> from </root/nrpe-2.12,REV=2009.06.30-SunOS5.8-sparc-CSW.pkg>
## Installing package <CSWnrpe> in global zone
nrpe - nagios remote plugin executor(sparc) 2.12,REV=2009.06.30
http://downloads.sourceforge.net/nagios/ packaged for CSW by Juergen Arndt
## Executing checkinstall script.
nagios user detected
nagios group detected
## Processing package information.
## Processing system information.
2 package pathnames are already properly installed.
## Verifying package dependencies.
## Verifying disk space requirements.
## Checking for conflicts with packages already installed.
## Checking for setuid/setgid programs.
This package contains scripts which will be executed with super-user
permission during the process of installing this package.
Do you want to continue with the installation of <CSWnrpe> [y,n,?] y
Installing nrpe - nagios remote plugin executor as <CSWnrpe>
## Executing preinstall script.
## Installing part 1 of 1.
/opt/csw/bin/nrpe <symbolic link>
/opt/csw/bin/nrpe_1k
/opt/csw/bin/nrpe_8k
/opt/csw/share/doc/nrpe/LEGAL
/opt/csw/share/doc/nrpe/NRPE.pdf
/opt/csw/share/doc/nrpe/README
/opt/csw/share/doc/nrpe/README.SSL
/opt/csw/share/doc/nrpe/README_8k
/opt/csw/share/doc/nrpe/SECURITY
[ verifying class <none> ]
Restoring /etc/opt/csw/preserve/CSWnrpe/nrpe.cfg
[ verifying class <cswpreserveconf> ]
Installing class <cswinitsmf> ...
Creating /var/opt/csw/svc/manifest/application ...
Creating service script in /var/opt/csw/svc/method/svc-cswnrpe ...
Creating manifest ...
Configuring service in SMF ...
CSWnrpe is using Service Management Facility. The FMRI is svc:/application/cswnrpe:default
[ verifying class <cswinitsmf> ]
Installation of <CSWnrpe> was successful.
## Installing package <CSWnrpe> in zone <anfweb-dev>
nrpe - nagios remote plugin executor(sparc) 2.12,REV=2009.06.30
## Executing checkinstall script.
nagios user detected
nagios group detected
## Processing package information.
## Processing system information.
2 package pathnames are already properly installed.
Installing nrpe - nagios remote plugin executor as <CSWnrpe>
## Executing preinstall script.
## Installing part 1 of 1.
/opt/csw/bin/nrpe <symbolic link>
/opt/csw/bin/nrpe_1k
/opt/csw/bin/nrpe_8k
/opt/csw/share/doc/nrpe/LEGAL
/opt/csw/share/doc/nrpe/NRPE.pdf
/opt/csw/share/doc/nrpe/README
/opt/csw/share/doc/nrpe/README.SSL
/opt/csw/share/doc/nrpe/README_8k
/opt/csw/share/doc/nrpe/SECURITY
[ verifying class <none> ]
Copying sample config to /opt/csw/etc/nrpe.cfg
[ verifying class <cswpreserveconf> ]
Installing class <cswinitsmf> ...
Creating service script in /var/opt/csw/svc/method/svc-cswnrpe ...
Creating manifest ...
Configuring service in SMF ...
CSWnrpe is using Service Management Facility. The FMRI is svc:/application/cswnrpe:default
[ verifying class <cswinitsmf> ]
Installation of <CSWnrpe> on zone <anfweb-dev> was successful.
## Booting non-running zone <anfwfproc> into administrative state
## Installing package <CSWnrpe> in zone <anfwfproc>
nrpe - nagios remote plugin executor(sparc) 2.12,REV=2009.06.30
## Executing checkinstall script.
nagios user detected
nagios group detected
/var/tmp//installM_aiEa/checkinstallR_aiEa: /tmp/sh2470: cannot create
pkginstall: ERROR: checkinstall script did not complete successfully
Installation of <CSWnrpe> on zone <anfwfproc> failed.
No changes were made to the system.
## Restoring state of global zone <anfwfproc> |
|
|
(0006407)
|
gadavis
|
2009-07-07 18:42
|
|
Another oddity, and probably the reason why the system hangs when the method script errors out, is that the timeout values are all set to something huge.
[root@plinian:/root]
{516}# svccfg -s cswnrpe listprop start/timeout_seconds
start/timeout_seconds count 18446744073709551615
[root@plinian:/root]
{517}# svccfg -s cswnrpe listprop stop/timeout_seconds
stop/timeout_seconds count 18446744073709551615
[root@plinian:/root]
{518}# svccfg -s cswnrpe listprop restart/timeout_seconds
restart/timeout_seconds count 18446744073709551615
Could you tweak your manifest so that those timeout values are brought down to something reasonable like 60 seconds?
You might also consider just changing the stop/method property to ":kill" - this negates the whole pid_file problem as well as the zone problem |
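The reported value 18446744073709551615 equals 2^64 - 1, the largest unsigned 64-bit count, so the methods effectively have no timeout. Below is a sketch of how the two suggestions could be tried by hand on an installed service. The helper `smf_fix_cmds` is hypothetical and only prints the commands for review. It assumes that the method's command lives in the `stop/exec` property (the note calls it stop/method) and that the properties sit at the service level, as in the listprop output above.

```shell
# Hypothetical helper: print (not run) the svccfg/svcadm commands that would
# lower the method timeouts and switch the stop method to the built-in :kill.
# svc: service name as passed to "svccfg -s"; secs: timeout in seconds.
smf_fix_cmds() {
    svc=$1
    secs=$2
    for method in start stop restart; do
        echo "svccfg -s $svc setprop $method/timeout_seconds = count: $secs"
    done
    # assumption: the stop command is stored in stop/exec
    echo "svccfg -s $svc setprop stop/exec = astring: \":kill\""
    # make the running configuration pick up the changed properties
    echo "svcadm refresh $svc"
}

smf_fix_cmds cswnrpe 60
```

Printing the commands first keeps the sketch safe to run anywhere. The permanent fix would still belong in the generated manifest rather than in manual property changes.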
|
|
(0006426)
|
ja
|
2009-07-12 21:00
|
|
I'll try to reproduce the strange behaviour when installing on a system with zones.
As for the timeout values, I need to investigate the cause. Give me some time, because I'm rather busy these days. |
|
|
(0006429)
|
gadavis
|
2009-07-13 19:00
|
|
I get the feeling both are related to cswclassutils or MGAR, specifically the automatic manifest generation routines in cswclassutils. I actually opened bug 0003764 against cswclassutils but haven't heard back from the maintainer yet. |
|
|
(0006430)
|
gadavis
|
2009-07-13 19:21
|
|
|
|
(0010017)
|
ja
|
2012-07-12 11:05
|
|
Issue closed. Start / stop method redesigned and tested. |
|