NRPE with SmartSNO

How To Use NRPE (Nagios Remote Plugin Executor) plugins with SmartSNO

This How-To shows how to integrate plugins that are normally executed via NRPE with SmartSNO.
We will run nagios-plugins scripts and forward the statuses they return to Passive Checks probes in SmartSNO over HTTP.
If you are unfamiliar with Passive Checks, reading "Making passive checks over HTTP" first is recommended.

Installation of nagios-plugins

You need the Nagios plugins installed on your system. You can install them with $ sudo apt-get install nagios-plugins or
download them from their homepage.

If you installed via apt-get, you can find the scripts in /usr/lib/nagios/plugins.
There are many useful scripts and you can find documentation here.
In this How-To we will demonstrate the use of the check_swap plugin.

Using check_swap

$ ./check_swap -w 90% -c 10%
SWAP OK - 100% free (2043 MB out of 2045 MB) |swap=2043MB;1841;204;0;2045
$ ./check_swap -w 100% -c 10%
SWAP WARNING - 100% free (2043 MB out of 2045 MB) |swap=2043MB;2045;204;0;2045

On my machine the swap is almost entirely free (2043 MB out of 2045 MB), so when I run check_swap with the -w 90% option it returns status OK.
Only when I tell it to raise a warning as soon as less than 100% of the swap is free does it return status WARNING. Plugins report their status through the exit status of the process.
In the first example the exit status of the script was 0 and in the second it was 1. It would be 2 for CRITICAL and 3 for UNKNOWN.
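
The exit-status convention described above can be sketched in a few lines. This is only an illustration written for Python 3; the helper name run_plugin is hypothetical, and a small stand-in command is used instead of a real plugin so the example runs on any machine.

```python
import subprocess
import sys

# Exit-status convention of Nagios plugins, as described above.
STATUS_NAMES = {0: 'OK', 1: 'WARNING', 2: 'CRITICAL', 3: 'UNKNOWN'}


def run_plugin(command):
    """Run a plugin (given as a list of arguments).

    Returns a tuple (status_name, first_line_of_output).
    """
    result = subprocess.run(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    # Assumption of this sketch: any exit status outside 0-3 counts as UNKNOWN.
    status = STATUS_NAMES.get(result.returncode, 'UNKNOWN')
    lines = result.stdout.splitlines()
    return status, (lines[0] if lines else '')


if __name__ == '__main__':
    # Stand-in for a real plugin: prints one line and exits with status 1.
    fake_plugin = [sys.executable, '-c',
                   "import sys; print('SWAP WARNING - demo'); sys.exit(1)"]
    print(run_plugin(fake_plugin))
```

With a real plugin you would pass something like ['/usr/lib/nagios/plugins/check_swap', '-w', '90%', '-c', '10%'] as the command.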

Configuring SmartSNO for NRPE

  1. We start by adding a new Probe (Settings->Supported devices->Probes). We will name this probe Check Swap.
    For protocol we choose HTTP Passive and for format we select String.
    For handle we enter check_swap. The handle will be important later, when we construct the URL to post data to.
    Enable monitoring while we are at it.
  2. Next we add the newly created Probe to a Probe Group, which we will later add to a device.
    We add a new Probe Group (Settings->Supported devices->Probe Groups) named Linux Metrics.
    We add the Check Swap probe to it and set the check interval to 5 minutes.
  3. Once the Probe Group is set up, we choose a device that represents a Linux server.
    I have one with IP 10.100.100.10, but you can also use the SmartSNO server itself, which should be on 127.0.0.1.
    Select the device and under setup add our Linux Metrics probe group to it.
    Also take note of which access credentials the device is using; we will need them in the next step.
  4. The last thing we need to configure is the access credentials (Settings->Credentials) that the device is using.
    We enable HTTP passive checks and set the username and password to admin and password123.

Once we have that set up we are ready to start collecting passive checks on our probe.
The starting state is UNKNOWN and it stays that way until the first check has been received.

Using python script and crontab to post passive checks

We will use the Python script listed at the end of this How-To to run plugins and post passive checks. We start by downloading the script, saving it as /opt/nrpe_sno and making it executable.
We also have to configure a few things. SMARTSNO_URL on line 7 should be the host name of your SmartSNO server, without the https:// prefix, which the script adds itself. DEVICE_IP should be the same as is configured
in SmartSNO. In my case that is 10.100.100.10. USERNAME and PASSWORD should be the same as configured in Credentials. We used admin and password123 in our example.

The script derives the probe handle from the file name of the plugin. This is why it was important to use check_swap as the handle when configuring the Probe.
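
To make the relation between plugin path, probe handle and URL concrete, here is a minimal sketch written for Python 3. The URL layout is taken from the script at the end of this How-To; the function names are hypothetical and chosen only for this illustration.

```python
import base64
import os


def probe_handle(plugin_path):
    """The probe handle is simply the file name of the plugin."""
    return os.path.basename(plugin_path)


def passive_check_url(server, device_ip, plugin_path):
    """Build the passive check URL the same way the How-To script does."""
    return 'https://{}/pchecks/{}/{}'.format(
        server, device_ip, probe_handle(plugin_path))


def basic_auth_header(username, password):
    """Value of the Authorization header for HTTP Basic authentication."""
    credentials = '{}:{}'.format(username, password).encode('utf-8')
    return 'Basic ' + base64.b64encode(credentials).decode('ascii')


if __name__ == '__main__':
    print(passive_check_url('smartsno.company.com', '10.100.100.10',
                            '/usr/lib/nagios/plugins/check_swap'))
```

If the plugin file were renamed, the handle in the URL would change with it and would no longer match the handle configured on the Probe.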

We can post a passive check like this:

$ /opt/nrpe_sno /usr/lib/nagios/plugins/check_swap -w 90% -c 10%

This will make a single passive check with the status returned by check_swap. But we configured the Probe Group with a check interval of 5 minutes, and if we do not post another
passive check within that time the Probe status will go to FAIL.

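The solution is to add our script to crontab and configure it to execute every 5 minutes. This is done by inserting the line below into your crontab.
Note that cron treats an unescaped % in the command as a newline, so each % has to be escaped with a backslash:

*/5 * * * * /opt/nrpe_sno /usr/lib/nagios/plugins/check_swap -w 90\% -c 10\%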

If a plugin takes a long time to execute, you should configure the probe with a longer check interval or execute the script more often, so that a new result always arrives before the interval expires.
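
Another way to deal with slow plugins is to put an upper bound on their run time, so that one run can never overlap the next. The sketch below is written for Python 3 and is not part of the example script; the helper name run_with_timeout is hypothetical, and reporting exit status 3 (UNKNOWN) on a timeout is a choice made for this illustration.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Return the plugin's exit status, or 3 (UNKNOWN) if it runs too long."""
    try:
        # subprocess.run kills the child process when the timeout expires.
        return subprocess.run(command, timeout=timeout_seconds).returncode
    except subprocess.TimeoutExpired:
        return 3


if __name__ == '__main__':
    # Stand-in for a slow plugin: sleeps longer than the allowed time.
    slow_plugin = [sys.executable, '-c', 'import time; time.sleep(5)']
    print(run_with_timeout(slow_plugin, 0.5))
```

For a probe with a 5 minute interval, a timeout comfortably below 300 seconds leaves time to post the result before the interval expires.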

Possible improvements

The example script only posts statuses to the Passive Checks API, but Passive Checks also support a value.
You could parse the output of the check_swap script and send the percentage of swap that is free. If you want to chart this data
you need to set the type of the Probe in SmartSNO to Number and enable charting.
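
Parsing the plugin output could look roughly like this. The sketch is written for Python 3 and relies only on the output format shown in the check_swap examples earlier; the function names are hypothetical, and how the value is then passed to the Passive Checks API is not shown because it depends on that API.

```python
import re


def parse_free_percent(output_line):
    """Extract the 'NN% free' figure from check_swap output, or None."""
    match = re.search(r'(\d+)% free', output_line)
    return int(match.group(1)) if match else None


def parse_perfdata(output_line):
    """Parse the part after '|' into {label: (value, unit)}.

    Only the first field of each item (the measured value) is kept;
    the warning, critical, minimum and maximum fields are ignored.
    """
    if '|' not in output_line:
        return {}
    perfdata = {}
    for item in output_line.split('|', 1)[1].split():
        label, _, fields = item.partition('=')
        value_field = fields.split(';')[0]
        match = re.match(r'(-?\d+(?:\.\d+)?)(\D*)$', value_field)
        if match:
            perfdata[label] = (float(match.group(1)), match.group(2))
    return perfdata


if __name__ == '__main__':
    line = ('SWAP OK - 100% free (2043 MB out of 2045 MB) '
            '|swap=2043MB;1841;204;0;2045')
    print(parse_free_percent(line))
    print(parse_perfdata(line))
```

To get at the output, the script would have to capture the plugin's standard output instead of only reading its exit status.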

You could also improve security by verifying the host's identity. Consult the urllib2 documentation
or use the Requests package, as is recommended by the docs.
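
As a rough sketch of what host verification could look like, the following uses only the Python 3 standard library (urllib.request is the Python 3 successor of urllib2). It is not the example script itself: the function names and the optional ca_file parameter are assumptions made for this illustration, and it presumes the SmartSNO server presents a certificate that can be validated against the system CA store or the given CA file.

```python
import base64
import ssl
import urllib.parse
import urllib.request


def make_verifying_context(ca_file=None):
    """TLS context that validates the certificate chain and the host name."""
    return ssl.create_default_context(cafile=ca_file)


def build_request(url, status, username, password):
    """Build the POST request carrying the status and Basic credentials."""
    data = urllib.parse.urlencode({'status': status}).encode('ascii')
    request = urllib.request.Request(url, data=data)
    credentials = '{}:{}'.format(username, password).encode('utf-8')
    token = base64.b64encode(credentials).decode('ascii')
    request.add_header('Authorization', 'Basic ' + token)
    return request


def post_status(url, status, username, password, ca_file=None):
    """Post the status; raises an error if the certificate is not trusted."""
    request = build_request(url, status, username, password)
    context = make_verifying_context(ca_file)
    with urllib.request.urlopen(request, context=context,
                                timeout=30) as response:
        return response.getcode()
```

If the server uses a certificate signed by an internal CA, passing that CA's certificate file as ca_file avoids having to disable verification.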

Python script for running plugins and posting passive checks

#!/usr/bin/env python
"""
WARNING: this example makes POST requests without verifying the host
and is thus susceptible to man in the middle attacks.
"""

SMARTSNO_URL = 'smartsno.company.com'
DEVICE_IP = '10.100.100.10'
USERNAME = 'admin'
PASSWORD = 'password123'

import argparse
import base64
import os
import subprocess
import sys
import urllib
import urllib2

# --- no configuration needed beyond this --- #


status_mappings = {
    0: 'ok',
    1: 'warn',
    2: 'crit',
    3: 'unknown',
}


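def main(script):
    if not script:
        sys.exit('Usage: nrpe_sno /path/to/plugin [plugin arguments]')
    # The probe handle is the file name of the plugin, e.g. check_swap.
    probe = os.path.basename(script[0])
    url = 'https://{sno_url}/pchecks/{ip}/{probe_handle}'.format(sno_url=SMARTSNO_URL, ip=DEVICE_IP, probe_handle=probe)
    # Run the plugin; its exit status decides the passive check status.
    status = status_mappings.get(subprocess.call(script), 'unknown')
    values = {'status': status}
    data = urllib.urlencode(values)
    # Passing data makes urllib2 send a POST request.
    req = urllib2.Request(url, data)
    base64string = base64.b64encode('{}:{}'.format(USERNAME, PASSWORD))
    req.add_header('Authorization', 'Basic {}'.format(base64string))
    try:
        urllib2.urlopen(req)
    except urllib2.HTTPError as e:
        # The server answered with an error status; show its response body.
        sys.stderr.write(e.read())
        sys.exit(1)
    except urllib2.URLError as e:
        # The server could not be reached at all.
        sys.stderr.write('Could not reach {}: {}\n'.format(url, e.reason))
        sys.exit(1)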


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('script', action='store', nargs=argparse.REMAINDER,
                        help='Path to script you want to execute with arguments')
    args = parser.parse_args()
    main(args.script)