Monitoring your infrastructure, one of the pillars of DevOps … and Agile … with Nagios

Having a Continuous Integration and Continuous Delivery solution is a good thing. It is even a prerequisite for an Agile team. But we must also make sure that it stays operational and that people are alerted when something goes wrong. This is precisely the role that Nagios fulfills.

We can classify monitoring tools into two categories: black-box tools (like Nagios), which simply check that services answer pings or that there is enough disk space, and white-box tools (like Prometheus), which go further by checking that the services work properly.

Nagios is a server health monitoring solution first released in 1999, originally under the name NetSaint. It has a large community and a large number of plugins and extensions. Its name comes from “Nagios Ain’t Gonna Insist On Sainthood”.

Nagios lets you monitor hosts and services and reports problems (for example an unreachable host or a full disk).

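There are two classes of checks:

  • active checks: initiated by Nagios itself, they execute code at regular intervals. They are typically used to monitor HTTP services, SSH or databases;

  • passive checks: initiated by external applications or processes, which submit their results to Nagios.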

For example, Nagios can ping a web server every 10 minutes and trigger an action, such as sending an email to a predefined contact group, if the server does not respond within the time limit.
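To make the idea of an active check more concrete, here is a minimal Python sketch of a single check run. It is not how Nagios is implemented: the function names and the thresholds are purely illustrative assumptions. The only convention borrowed from Nagios is the set of plugin exit codes (0 for OK, 1 for WARNING, 2 for CRITICAL).

```python
import socket
import time

# Exit codes that Nagios plugins conventionally return.
OK, WARNING, CRITICAL = 0, 1, 2


def classify(elapsed, warn, crit):
    """Map a response time in seconds to a Nagios-style status code."""
    if elapsed >= crit:
        return CRITICAL
    if elapsed >= warn:
        return WARNING
    return OK


def check_tcp(host, port, warn=1.0, crit=5.0):
    """Try to open a TCP connection and return (status code, message).

    The warn and crit thresholds (in seconds) are illustrative values.
    """
    start = time.monotonic()
    try:
        # Give up once the critical threshold is reached.
        with socket.create_connection((host, port), timeout=crit):
            elapsed = time.monotonic() - start
    except OSError as exc:  # refused, unreachable, or timed out
        return CRITICAL, f"CRITICAL - {host}:{port} did not answer ({exc})"
    status = classify(elapsed, warn, crit)
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[status]
    return status, f"{label} - {host}:{port} answered in {elapsed:.3f}s"
```

In a real setup, Nagios is the one that schedules such a check at regular intervals and decides whom to notify. The check itself only reports a status and a one-line message.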

Nagios is configured through plain-text configuration files.

In this article, we will see how to set up Nagios with Docker, using the jasonrivers/nagios image. Here is the scenario we will follow:

  • A container runs an instance of Nagios.

  • A second container runs an instance of Nginx, which Nagios monitors.

To deploy these Docker containers, we will use Docker Compose, with a docker-compose.yml file like this one:

version: '3'
services:
  nagios:
    image: jasonrivers/nagios
    ports:
      - 8081:80
    environment:
      - NAGIOSADMIN_USER=nagiosadmin
      - NAGIOSADMIN_PASS=nagios
    volumes:
      - ./etc/:/opt/nagios/etc/
  nginx:
    image: nginx
    ports:
      - 8082:80

This file defines two services: nagios, published on port 8081, and nginx, published on port 8082. It also sets the Nagios username and password (nagiosadmin / nagios) and maps the container's /opt/nagios/etc directory to the local ./etc subdirectory. This is the directory where we store the Nagios configuration files.

We are now ready to launch the nagios and nginx services:

docker-compose up

You can verify that both services are running by opening ports 8081 (Nagios) and 8082 (Nginx) of your Docker host in a web browser.
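The same verification can be scripted. The sketch below is only an illustration: the function names are hypothetical, and the URLs in the usage note assume that Docker publishes the ports on localhost, which may not be true on your machine.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if nothing answers."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (for example 401
        # when the page is protected by a login).
        return exc.code
    except OSError:
        # Connection refused, timeout, name resolution failure...
        return None


def report(urls, probe=http_status):
    """Map each service name to its HTTP status code (None when down)."""
    return {name: probe(url) for name, url in urls.items()}
```

Calling report({"nagios": "http://localhost:8081", "nginx": "http://localhost:8082"}) returns one status code per service. Any code other than None means that something is listening on the port, even if it asks for credentials first.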

The main Nagios configuration file is nagios.cfg, located in the ./etc subdirectory. It is good practice to split the object definitions into separate files, for example hostgroups.cfg, hosts.cfg, and services.cfg.

To do so, you must declare them in nagios.cfg. In the “OBJECT CONFIGURATION FILE(S)” section, add these lines:

cfg_file=/opt/nagios/etc/objects/hostgroups.cfg
cfg_file=/opt/nagios/etc/objects/hosts.cfg
cfg_file=/opt/nagios/etc/objects/services.cfg
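Since these paths are the ones seen from inside the container, it is easy to declare a file that does not exist in the local ./etc directory. Here is a small, hypothetical pre-flight check in Python (the function names and default arguments are assumptions made for this example). It does not replace Nagios's own configuration verification (the -v option of the nagios binary), which remains the authoritative check.

```python
from pathlib import Path


def object_files(nagios_cfg_text):
    """Return the paths declared by cfg_file= directives, in order."""
    paths = []
    for line in nagios_cfg_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "cfg_file":
            paths.append(value.strip())
    return paths


def missing_files(paths, container_root="/opt/nagios/etc", local_root="./etc"):
    """List the declared files that are absent from the local directory."""
    missing = []
    for path in paths:
        try:
            # Translate the in-container path to the shared local directory.
            relative = Path(path).relative_to(container_root)
        except ValueError:
            continue  # outside the shared directory: cannot be checked here
        if not (Path(local_root) / relative).is_file():
            missing.append(path)
    return missing
```

For example, missing_files(object_files(Path("./etc/nagios.cfg").read_text())) lists the declared object files that still have to be created.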

The hostgroups.cfg file is used to define host groups. It’s very convenient when you manage a server farm. Here is an example of content:

define hostgroup{
    hostgroup_name  test-group
    alias           Test Servers
    members         my_client
}

The hosts.cfg file is used to define the hosts. Here is an example:

define host{
    use                 generic-host
    host_name           my_client
    address             IP_ADDRESS
    contact_groups      admins
    max_check_attempts  1
    notes               Test my_client
}

Replace “IP_ADDRESS” with the IP address of the host (for example 192.168.204.31).

Finally, we define the services to be tested in the services.cfg file:

define service{
    use                  generic-service
    hostgroup_name       test-group
    service_description  Ping
    check_command        check_ping!200.0,20%!600.0,60%
}

define service{
    host_name            my_client
    use                  generic-service
    service_description  Nginx Web server
    check_command        check_http_port!8082
}

define command{
    command_name  check_http_port
    command_line  $USER1$/check_http -I $HOSTADDRESS$ -p $ARG1$
}
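The two arguments passed to check_ping above deserve a word of explanation. Each one pairs a round-trip time in milliseconds with a packet loss percentage: here, a warning from 200 ms or 20% loss and a critical state from 600 ms or 60% loss. This assumes that the check_ping command forwards the first argument as the warning threshold and the second as the critical one, as the sample command definitions shipped with Nagios do. The Python sketch below only illustrates the logic; the function names are hypothetical, and whether the exact boundary value already triggers the state is an assumption.

```python
OK, WARNING, CRITICAL = 0, 1, 2  # Nagios plugin exit codes


def parse_threshold(text):
    """Split '200.0,20%' into (round-trip time in ms, packet loss in %)."""
    rta, loss = text.split(",")
    return float(rta), float(loss.rstrip("%"))


def ping_status(rta_ms, loss_pct, warn="200.0,20%", crit="600.0,60%"):
    """Classify a ping measurement against warning and critical thresholds."""
    warn_rta, warn_loss = parse_threshold(warn)
    crit_rta, crit_loss = parse_threshold(crit)
    # Either criterion (latency or packet loss) is enough to change state.
    if rta_ms >= crit_rta or loss_pct >= crit_loss:
        return CRITICAL
    if rta_ms >= warn_rta or loss_pct >= warn_loss:
        return WARNING
    return OK
```

With these thresholds, a host answering in 250 ms without packet loss is in the WARNING state, and a host that loses every packet is CRITICAL.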

What are we doing here? Well, we define two services:

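  • a first “Ping” service, which pings the servers of the test-group host group,

  • a second “Nginx Web server” service, which checks the HTTP server of my_client on port 8082.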

We use the plugin “check_http”. The basic syntax is:

check_command check_http

However, this plugin tests port 80 by default. Since our server runs on port 8082, we have to change the way we call the plugin and specify the port. This is the purpose of the check_http_port command defined above: it passes the port to check_http through the -p option.
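To see how check_http_port!8082 becomes an actual command line, here is a simplified Python sketch of the macro substitution. It is not Nagios's real implementation, and the value used for $USER1$ in the usage note (the plugin directory) is an assumption: in practice it comes from the Nagios resource file.

```python
def expand_command(check_command, commands, macros):
    """Expand a check_command such as 'check_http_port!8082'.

    commands maps a command_name to its command_line;
    macros maps macro names such as '$HOSTADDRESS$' to their values.
    """
    # The '!' character separates the command name from its arguments.
    name, *args = check_command.split("!")
    line = commands[name]
    values = dict(macros)
    # Arguments are exposed as $ARG1$, $ARG2$, ... in the command line.
    for index, arg in enumerate(args, start=1):
        values[f"$ARG{index}$"] = arg
    for macro, value in values.items():
        line = line.replace(macro, value)
    return line
```

With commands = {"check_http_port": "$USER1$/check_http -I $HOSTADDRESS$ -p $ARG1$"} and macros = {"$USER1$": "/opt/nagios/libexec", "$HOSTADDRESS$": "192.168.204.31"}, expanding check_http_port!8082 gives /opt/nagios/libexec/check_http -I 192.168.204.31 -p 8082.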

If you open the Nagios web interface, after a while you will see that the service checks succeed and that their status turns green.

Now, let’s stop the Nginx service with this command:

docker-compose stop nginx

Back in Nagios, after a short delay and a page refresh, you will see that the check of the Nginx service fails and that its status turns red.

Now, it would be nice to be alerted by email. To do this, you must add a contact, or simply modify the predefined one: open the contacts.cfg file and specify the recipient’s address on the line starting with “email”:

define contact {
    contact_name  nagiosadmin
    use           generic-contact
    alias         Nagios Admin
    email         XXXXXX
}

Here are some examples of plugins to monitor services.

First, there are plugins that verify that servers are working properly. Essentially, they check that the server returns the expected header when connecting:

  • check_http to monitor an HTTP server,

But there are also other plugins that can monitor servers:

  • check_local_disk to check the available disk space: you can, for example, raise a warning notification when less than 20% of the space is free and a critical notification when less than 10% is free;
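The disk space thresholds mentioned above can be illustrated with a short Python sketch. This is not the code of the Nagios plugin: the function names and the message format are hypothetical, and only the 20% and 10% thresholds come from the example.

```python
import shutil

OK, WARNING, CRITICAL = 0, 1, 2  # Nagios plugin exit codes


def free_percent(total, free):
    """Return the share of free space as a percentage."""
    return 100.0 * free / total


def disk_status(free_pct, warn=20.0, crit=10.0):
    """Warn below 20% free space, go critical below 10%."""
    if free_pct < crit:
        return CRITICAL
    if free_pct < warn:
        return WARNING
    return OK


def check_disk(path="/", warn=20.0, crit=10.0):
    """Measure the free space on path and return (status code, message)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    pct = free_percent(usage.total, usage.free)
    status = disk_status(pct, warn, crit)
    label = ("OK", "WARNING", "CRITICAL")[status]
    return status, f"DISK {label} - {pct:.1f}% free on {path}"
```

A partition with 15% of free space is therefore reported as WARNING, and one with 5% as CRITICAL.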

Nagios is one service monitoring solution among others. It has the advantage of being mature and of having a large number of plugins. With a solution like Nagios, supervising hosts and their services becomes a much simpler exercise. But there are other, more modern monitoring tools. We will look at them later.

More on my blog http://www.DevOpsTestLab.com

My DevOpsTestLab Youtube channel.

My LinkedIn profile: https://fr.linkedin.com/in/brunodelb

