
Monitoring IT Infrastructure with Nagios Core (on Ubuntu)

Nagios Core

Is an open source system and network monitoring application

Was initially designed to run on Linux (Ubuntu, openSUSE, Fedora, etc.)

Should work on most other Unix variants as well

Free Server & Network Monitoring Tools

Monit, Ganglia, Munin, Cacti, Nagios, Zabbix, Observium, Zenoss, Collectd, Argus

More details on these tools can be found here:

http://sixrevisions.com/tools/10-free-server-network-monitoring-tools-that-kick-ass/

http://en.wikipedia.org/wiki/Comparison_of_network_monitoring_systems

Nagios can be complicated to install and configure, but its wealth of features is unmatched by any free tool on the market

It is geared toward experienced IT network administrators

Requirements

Mandatory:
A machine running Linux (or a UNIX variant) that has network access and a C compiler installed

Optional:
You are not required to use the CGIs included with Nagios Core, but if you do, you need to have installed:
a web server (preferably Apache)
PHP
Thomas Boutell's GD library version 1.6.3 or higher (needed to use Nagios with the web interface)

Nagios Features

Monitoring of host resources (processor load, disk and memory usage, running processes, log files, etc.)

Monitoring of network services (SMTP, POP3, HTTP, NNTP, PING, etc.)

Ability to acknowledge problems via the web interface

Ability to define network host hierarchy

Detection of and distinction between hosts that are down and those that are unreachable

Contact notifications when service or host problems occur or get resolved (via email, pager, etc.)

Optional escalation of notifications to different contact groups

Ability to define event handlers that are run during service or host events, allowing proactive problem resolution

Simple authorization scheme that allows you to restrict what users can see and do from the web interface

How it works:

Installing (steps):
1) Required packages

Install Apache 2
Install PHP
Install GCC compiler and development libraries *
Install GD development libraries **

> sudo apt-get install apache2
> sudo apt-get install libapache2-mod-php5
> sudo apt-get install build-essential
> sudo apt-get install libgd2-xpm-dev    # Ubuntu 7.10 and later
> sudo apt-get install libgd2-dev        # Ubuntu 6.10 instead

* GNU Compiler Collection (GCC) is a compiler system produced by the GNU Project that supports various programming languages
** GD Graphics Library is a graphics software library by Thomas Boutell and others for dynamically manipulating images

2) Users and groups


Create a new nagios user; on Ubuntu 6.01 and earlier, also create a nagios group and add the user to it
Create a new nagcmd group for allowing external commands to be submitted through the web interface
Add the Nagios user (nagios) and the Apache user (www-data) to the nagcmd group
> sudo -s
> /usr/sbin/useradd -m -s /bin/bash nagios
> passwd nagios
On older Ubuntu (6.01 and earlier), also run:
> /usr/sbin/groupadd nagios
> /usr/sbin/usermod -G nagios nagios

> /usr/sbin/groupadd nagcmd
> /usr/sbin/usermod -a -G nagcmd nagios
> /usr/sbin/usermod -a -G nagcmd www-data

3) Download Nagios and the Plugins

> mkdir ~/downloads
> cd ~/downloads
> wget http://prdownloads.sourceforge.net/sourceforge/nagios/nagios-3.5.0.tar.gz
> wget http://prdownloads.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.16.tar.gz

4) Compile and Install Nagios

Compile (make all)
Install binaries, init script, sample config files and set permissions on the external command directory (make install, make install-init, make install-config, make install-commandmode)
> cd ~/downloads
> tar xzf nagios-3.5.0.tar.gz
> cd nagios-3.5.0
> ./configure --with-command-group=nagcmd
> make all
> make install
> make install-init
> make install-config
> make install-commandmode

5) Configure email
Change the email address associated with the nagiosadmin contact definition to the address you'd like to use for receiving alerts (in the /usr/local/nagios/etc/objects/contacts.cfg file)
6) Configure the Web Interface
Install the Nagios web config file in the Apache conf.d directory (make install-webconf)
Create a nagiosadmin account for logging into the Nagios web interface
Restart Apache
> make install-webconf
> htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
> /etc/init.d/apache2 reload

7) Compile and Install the Nagios Plugins


It might be necessary to first install the OpenSSL development library (libssl-dev) when using the latest Nagios versions (nagios-3.5.0.tar.gz, nagios-plugins-1.4.16.tar.gz)
> sudo apt-get install libssl-dev
> /etc/init.d/nagios restart
> cd ~/downloads
> tar xzf nagios-plugins-1.4.16.tar.gz
> cd nagios-plugins-1.4.16
> ./configure --with-nagios-user=nagios --with-nagios-group=nagios
> make
> make install

8) Verify and start Nagios


Start Nagios when the system boots
Verify sample Nagios configuration files
Start Nagios
> ln -s /etc/init.d/nagios /etc/rcS.d/S99nagios
> /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
> /etc/init.d/nagios start

9) Login to the Web Interface


http://<IP>/nagios/; credentials: nagiosadmin/<chosen_pass>

Nagios main files:


Nagios Config Folder -> /usr/local/nagios/etc/
Nagios Main Config File -> nagios.cfg
Resource File -> resource.cfg
CGI Config File -> cgi.cfg
Object Config Files:
commands -> .../objects/commands.cfg
contacts -> .../objects/contacts.cfg
localhost -> .../objects/localhost.cfg
printer -> .../objects/printer.cfg
switch -> .../objects/switch.cfg
templates -> .../objects/templates.cfg
timeperiods -> .../objects/timeperiods.cfg
windows -> .../objects/windows.cfg
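
A quick way to confirm that the files listed above are in place after "make install-config" is to loop over them. This is a sketch under assumptions: the function name missing_nagios_files is hypothetical, and the file names are the sample names shipped with Nagios 3 (localhost.cfg for the local machine).

```shell
#!/bin/sh
# Sketch (hypothetical helper): print every expected config file that is
# missing below the given Nagios config folder.
#   $1 = config folder (defaults to /usr/local/nagios/etc)
missing_nagios_files() {
    etc_dir=${1:-/usr/local/nagios/etc}
    for f in nagios.cfg resource.cfg cgi.cfg \
             objects/commands.cfg objects/contacts.cfg \
             objects/localhost.cfg objects/printer.cfg \
             objects/switch.cfg objects/templates.cfg \
             objects/timeperiods.cfg objects/windows.cfg; do
        # Print the relative name only when the file does not exist
        [ -f "$etc_dir/$f" ] || echo "$f"
    done
}
```

Calling missing_nagios_files with no argument checks the default folder; an empty output means all listed files were found.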

What are the Objects?


Objects are all the elements that are involved in the monitoring and notification logic

Hosts:
Are one of the central objects in the monitoring logic
Are usually physical devices on your network (servers, workstations, routers, switches, printers, etc.)
Have an address of some kind (e.g. an IP or MAC address)
Have one or more services associated with them
Can have parent/child relationships with other hosts, often representing real-world network connections, which are used in the network reachability logic
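
The reachability logic built on parent/child relationships can be illustrated with a small sketch. It follows the rule described in the Nagios documentation: a host whose check fails is DOWN if at least one parent is UP (or if it has no parents), and UNREACHABLE if all of its parents are DOWN or UNREACHABLE. The function name classify_host and the "ok"/"fail" arguments are hypothetical; Nagios performs this logic internally.

```shell
#!/bin/sh
# Sketch (hypothetical helper, simplified): decide the state of a host from
# its own check result and the states of its parent hosts.
#   $1    = "ok" or "fail" (result of the host check)
#   $2... = states of the parent hosts (UP, DOWN or UNREACHABLE)
classify_host() {
    result=$1
    shift
    if [ "$result" = "ok" ]; then
        echo UP
        return 0
    fi
    # A host without parents is treated as directly attached to the
    # monitoring server, so a failed check means DOWN.
    if [ "$#" -eq 0 ]; then
        echo DOWN
        return 0
    fi
    # If any parent is UP, the network path works and the host itself is DOWN.
    for parent_state in "$@"; do
        if [ "$parent_state" = "UP" ]; then
            echo DOWN
            return 0
        fi
    done
    # All parents are DOWN or UNREACHABLE: the host cannot be reached.
    echo UNREACHABLE
}
```

For example, classify_host fail DOWN would report UNREACHABLE for a server behind a switch that is itself down.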

Services
Are one of the central objects in the monitoring logic
Services are associated with hosts and can be:
Attributes of a host (CPU load, disk usage, uptime, etc.)
Services provided by the host (HTTP, POP3, FTP, SSH, etc.)
Other things associated with the host (DNS records, etc.)

Contacts
Are people involved in the notification process, who:
have one or more notification methods (cellphone, pager, email, instant messaging, etc.)
receive notifications for hosts and services they are responsible for

Timeperiods
Are used to control:
When hosts and services can be monitored
When contacts can receive notifications
Commands:
Are used to tell Nagios what programs, scripts, etc. it should execute to perform:
Host and service checks
Notifications
Event handlers
Host Groups/Service Groups/Contact Groups
Host/Service groups make it easier to view the status of related hosts/services in
the Nagios web interface
Contact groups make it easier to define all the people who get notified when
certain host or service problems occur

Host - Service - Command - executable command file

Sample command from /usr/local/nagios/etc/objects/commands.cfg

# 'check_local_users' command definition
define command{
        command_name    check_local_users
        command_line    $USER1$/check_users -w $ARG1$ -c $ARG2$
        }
$USER1$ is defined in resource.cfg and holds the default location for plugins (/usr/local/nagios/libexec)
check_users is an executable plugin file in the $USER1$ location
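
Plugins such as check_users report their result through one line of output and an exit code: 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN. The sketch below is a simplified stand-in and not the real check_users source: the function name users_status is hypothetical, the user count is passed in instead of being measured, and the "more than the threshold" comparison is an assumption based on the plugin's help text.

```shell
#!/bin/sh
# Sketch (hypothetical stand-in for a plugin): map a user count and two
# thresholds to a Nagios-style status line and return code.
#   $1 = number of logged-in users, $2 = warning threshold, $3 = critical threshold
users_status() {
    count=$1
    warn=$2
    crit=$3
    if [ "$count" -gt "$crit" ]; then
        echo "USERS CRITICAL - $count users currently logged in"
        return 2
    elif [ "$count" -gt "$warn" ]; then
        echo "USERS WARNING - $count users currently logged in"
        return 1
    fi
    echo "USERS OK - $count users currently logged in"
    return 0
}
```

A real plugin would end with "exit" instead of "return"; Nagios reads the exit code to decide the service state and shows the printed line in the web interface.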

Service definition from localhost.cfg:

Uses the command name as it appears in commands.cfg
$ARG1$ and $ARG2$ are replaced with concrete values
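
To make the macro replacement concrete, the sketch below expands a check_command value of the form name!arg1!arg2 against a command_line template. It only imitates what Nagios does internally: the function name expand_check_command is hypothetical, and the example value check_local_users!20!50 is assumed from the sample localhost.cfg rather than taken from this deck. The replacement values must not contain "|", "&" or backslashes, since they are passed to sed unescaped.

```shell
#!/bin/sh
# Sketch (hypothetical helper): expand $USER1$, $ARG1$ and $ARG2$ in a
# command_line template.
#   $1 = check_command value, e.g. 'check_local_users!20!50'
#   $2 = command_line template from commands.cfg
#   $3 = value of $USER1$ from resource.cfg
expand_check_command() {
    check_command=$1
    template=$2
    user1=$3
    old_ifs=$IFS
    IFS='!'
    set -f                 # disable globbing while splitting on "!"
    set -- $check_command  # $1 = command name, $2 and $3 = arguments
    set +f
    IFS=$old_ifs
    # "\$" inside the single-quoted sed patterns matches a literal dollar sign
    printf '%s\n' "$template" | sed \
        -e 's|\$USER1\$|'"$user1"'|g' \
        -e 's|\$ARG1\$|'"${2-}"'|g' \
        -e 's|\$ARG2\$|'"${3-}"'|g'
}
```

With the template from commands.cfg above, the value check_local_users!20!50 and /usr/local/nagios/libexec as $USER1$, the result is the command line that would be executed for the check, with -w 20 and -c 50.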

Monitor Windows machine

On Ubuntu monitoring server:


1. In the /usr/local/nagios/etc/nagios.cfg file, uncomment the following line so that it reads:
cfg_file=/usr/local/nagios/etc/objects/windows.cfg
2. In the /usr/local/nagios/etc/objects/windows.cfg file, replace the IP address of the host with the IP of the monitored Windows machine
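
Uncommenting the cfg_file line can also be scripted. The following is a sketch, assuming the line is present in nagios.cfg with a leading "#"; the function name enable_cfg_file is hypothetical. It rewrites the file through a temporary copy so that ownership and permissions of nagios.cfg are kept.

```shell
#!/bin/sh
# Sketch (hypothetical helper): remove the leading "#" from one cfg_file line.
#   $1 = path to nagios.cfg
#   $2 = object file named on the cfg_file line
enable_cfg_file() {
    cfg=$1
    obj=$2
    tmp=$(mktemp) || return 1
    # Only the line "#cfg_file=<obj>" is changed; other lines are copied as-is
    sed "s|^#[[:space:]]*\(cfg_file=$obj\)[[:space:]]*\$|\1|" "$cfg" > "$tmp" \
        && cat "$tmp" > "$cfg"
    rm -f "$tmp"
}

# Intended use as root:
#   enable_cfg_file /usr/local/nagios/etc/nagios.cfg \
#       /usr/local/nagios/etc/objects/windows.cfg
```

Lines that are already uncommented are left unchanged, so running the function twice is harmless.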

On the monitored Windows machine:


1. Download NSClient++ (http://sourceforge.net/projects/nscplus/)
2. Install NSClient++ (no password set)
3. Start NSClient++ service

Monitor a printer

To do this, the check_hpjd plugin must exist in the plugin directory (/usr/local/nagios/libexec)

check_hpjd plugin
The check_hpjd plugin only gets compiled and installed if the net-snmp and net-snmp-utils packages are installed on your system
Terms used:
Simple Network Management Protocol (SNMP) is an "Internet-standard protocol for managing devices on IP networks". It is one of the most commonly used technologies in network monitoring
MIB (Management Information Base) is a collection of information organized hierarchically. Its entries are accessed using a protocol such as SNMP

OIDs (Object Identifiers) uniquely identify managed objects in a MIB hierarchy
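
Because a MIB is hierarchical, an OID belongs to a subtree when it starts with the subtree's OID followed by a dot. For example, the standard "system" group is 1.3.6.1.2.1.1 and sysDescr.0 is 1.3.6.1.2.1.1.1.0. The sketch below shows that prefix test; the function name oid_in_subtree is hypothetical and it works on numeric OIDs only.

```shell
#!/bin/sh
# Sketch (hypothetical helper): succeed when a numeric OID lies inside a subtree.
#   $1 = OID to test, $2 = OID of the subtree root
oid_in_subtree() {
    oid=${1#.}          # drop an optional leading dot
    subtree=${2#.}
    case "$oid" in
        "$subtree"|"$subtree".*) return 0 ;;   # the root itself or a descendant
        *) return 1 ;;
    esac
}
```

The trailing dot in the pattern matters: 1.3.6.1.2.1.11 (the snmp group) shares a text prefix with 1.3.6.1.2.1.1 but is a sibling, not a descendant.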

Install net-snmp
1. Install snmpd if it is not already installed:
> apt-get update && apt-get install snmpd
2. Enable snmp:
Edit the file /etc/snmp/snmp.conf and comment out the line that starts with "mibs"
From the console (Ctrl-Alt-T), enter the following commands:
> sudo apt-get install snmp-mibs-downloader
> sudo download-mibs

3. Restart snmpd:
> /etc/init.d/snmpd restart

4. You should be able to test this configuration by running the following command:
> snmpwalk -v 2c -c public <InsertYourIPAddressHere> system
For example, against the local machine:
> snmpwalk -v 2c -c public 127.0.0.1 system

Re-compile Nagios plugin


1. Go to the location where the plugin archive (nagios-plugins-1.4.16.tar.gz) is:
> cd /home/c/downloads/
2. Go to the extracted source directory:
> cd nagios-plugins-1.4.16
3. Recompile:
> ./configure --with-nagios-user=nagios --with-nagios-group=nagios
> make
> make install
4. Restart Nagios:
> sudo /etc/init.d/nagios restart

Configure Nagios to display the printer


1. Remove the # sign from the following line in the main configuration file
(/usr/local/nagios/etc/nagios.cfg):

#cfg_file=/usr/local/nagios/etc/objects/printer.cfg
2. Edit the printer.cfg file (/usr/local/nagios/etc/objects/printer.cfg) and replace
printer IP (and name, alias and so on) with the correct value

3. Restart Nagios:
> sudo /etc/init.d/nagios restart

Monitor a Linux/Unix host

On Monitored Linux host


Download Nagios Plugins and NRPE Add-on
Create nagios account
Install Nagios Plugin
Install NRPE
Setup NRPE to run as daemon
Modify the /usr/local/nagios/etc/nrpe.cfg

1) Download Nagios Plugins and NRPE Add-on

2) Create nagios account
> useradd nagios
> passwd nagios

3) Install the Nagios Plugins

4) Install the NRPE (Nagios Remote Plugin Executor) add-on


> cd ~/downloads
> wget http://downloads.sourceforge.net/project/nagios/nrpe-2.x/nrpe-2.14/nrpe-2.14.tar.gz
> tar xzf nrpe-2.14.tar.gz
> cd nrpe-2.14
> ./configure
###################

If ./configure does not succeed, do this:

> apt-get install libssl-dev
> apt-file search libssl | grep libssl-dev
> ./configure --with-ssl=/usr/bin/openssl --with-ssl-lib=/usr/lib/i386-linux-gnu
(the --with-ssl-lib value is the directory where the SSL library actually is, as reported by apt-file)
####################

> make all


> make install-plugin
> make install-daemon
> make install-daemon-config
> sudo apt-get install xinetd
> make install-xinetd
5) Setup NRPE to run as daemon (i.e. as part of xinetd):
Xinetd (extended Internet daemon) is an open-source super-server daemon which runs on many Unix-like systems and manages Internet-based connectivity.
Modify /etc/xinetd.d/nrpe to add the IP address of the Nagios monitoring server to the only_from directive.
Note that there is a space between 127.0.0.1 and the Nagios monitoring server IP address (in this example, the Nagios monitoring server IP address is 192.168.1.2)
only_from = 127.0.0.1 192.168.1.2

Modify /etc/services and add the following at the end of the file:
nrpe 5666/tcp # NRPE
Start the service:
> service xinetd restart
Verify whether NRPE is listening:
> netstat -at | grep nrpe
Verify that NRPE is functioning properly:
> /usr/local/nagios/libexec/check_nrpe -H localhost
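
The two file edits in this step (only_from in /etc/xinetd.d/nrpe and the nrpe entry in /etc/services) can be scripted. The following is a sketch, assuming the xinetd file already contains an only_from line as installed by make install-xinetd; the function names allow_nrpe_from and add_nrpe_service are hypothetical.

```shell
#!/bin/sh
# Sketch (hypothetical helper): set only_from to localhost plus the
# monitoring server.
#   $1 = xinetd service file, $2 = IP address of the Nagios monitoring server
allow_nrpe_from() {
    file=$1
    ip=$2
    tmp=$(mktemp) || return 1
    # Keep the "only_from =" part and replace everything after it
    sed "s|^\([[:space:]]*only_from[[:space:]]*=\).*|\1 127.0.0.1 $ip|" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Sketch (hypothetical helper): append the nrpe port entry unless one exists.
#   $1 = services file
add_nrpe_service() {
    file=$1
    grep -q '^nrpe[[:space:]]' "$file" \
        || printf 'nrpe\t\t5666/tcp\t\t\t# NRPE\n' >> "$file"
}

# Intended use as root, with the example address from this step:
#   allow_nrpe_from /etc/xinetd.d/nrpe 192.168.1.2
#   add_nrpe_service /etc/services
```

The grep guard makes the second function safe to run more than once, which also covers systems where /etc/services already lists nrpe.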

6) Modify the /usr/local/nagios/etc/nrpe.cfg


The nrpe.cfg file located on the remote host contains the commands that are needed to check the services on the remote host. By default, nrpe.cfg comes with a few standard check commands as samples. check_users and check_load are shown below as examples.
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
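
The monitoring server can only request commands that are defined by name in nrpe.cfg, so it helps to list those names before writing service definitions that refer to them. This is a sketch with hypothetical function names (nrpe_command_names, nrpe_has_command); it only reads lines of the form command[name]=... shown above.

```shell
#!/bin/sh
# Sketch (hypothetical helper): print the command names defined in an nrpe.cfg.
#   $1 = path to nrpe.cfg
nrpe_command_names() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Sketch (hypothetical helper): succeed when a given command name is defined.
#   $1 = path to nrpe.cfg, $2 = command name
nrpe_has_command() {
    nrpe_command_names "$1" | grep -qx "$2"
}

# Intended use on the monitored host:
#   nrpe_has_command /usr/local/nagios/etc/nrpe.cfg check_disk \
#       || echo "check_disk is not defined in nrpe.cfg"
```

Commented-out sample lines are ignored because the pattern requires "command[" at the start of the line.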

On Monitoring Server
Download NRPE Add-on
Install check_nrpe
Create host and service definition for remote host
Restart the nagios service
1. Download NRPE Add-on
2. Install check_nrpe on the Nagios monitoring server
> cd ~/downloads
> wget http://downloads.sourceforge.net/project/nagios/nrpe-2.x/nrpe-2.14/nrpe-2.14.tar.gz
> tar xzf nrpe-2.14.tar.gz
> cd nrpe-2.14
> ./configure

###################

Note: If you get the error message "checking for SSL headers... configure: error: Cannot find ssl headers" while running ./configure, do the following:

> apt-get install libssl-dev
> apt-file search libssl | grep libssl-dev
> ./configure --with-ssl=/usr/bin/openssl --with-ssl-lib=/usr/lib/i386-linux-gnu
###################

> make all
> make install-plugin
Verify whether the Nagios monitoring server can talk to the remote host:
> /usr/local/nagios/libexec/check_nrpe -H 192.168.1.3
Note: 192.168.1.3 is the IP address of the monitored host described above

3. Create host and service definition for remotehost


Create a new configuration file /usr/local/nagios/etc/objects/remotehost.cfg to define the host and service for this particular remote host. A good starting point is to copy localhost.cfg to remotehost.cfg and modify it according to your needs.
define host{
        use                     linux-server
        host_name               remotehost
        alias                   Remote Host
        address                 192.168.1.3
        contact_groups          admins
        }

define service{
        use                     generic-service
        host_name               remotehost
        service_description     Root Partition
        contact_groups          admins
        check_command           check_nrpe!check_disk
        }
Note: In all the above examples, replace remotehost with the corresponding hostname of your remote host. For this service to work, check_disk must be defined in nrpe.cfg on the remote host, a check_nrpe command must be defined in commands.cfg on the monitoring server, and nagios.cfg must contain a cfg_file line for remotehost.cfg.
4. Restart the nagios service
> service nagios reload

Resources:
1. http://nagios.sourceforge.net/docs/3_0/quickstart.html
2. http://library.nagios.com/library/products/nagioscore/manuals/
3. http://sixrevisions.com/tools/10-free-server-network-monitoring-tools-that-kick-ass/
4. http://en.wikipedia.org/wiki/Comparison_of_network_monitoring_systems
5. https://nagios.demo.netways.de/nagios/
6. http://www.paessler.com/knowledgebase/en/topic/653-how-do-snmp-mibs-and-oids-work
7. http://askubuntu.com/questions/141564/what-is-snmp-used-for
8. http://www.thegeekstuff.com/2008/06/how-to-monitor-remote-linux-host-using-nagios-30/
