Monday, March 28, 2011

Nagios : Ubuntu Quickstart

Introduction

This guide is intended to provide you with simple instructions on how to install Nagios from source (code) on Ubuntu and have it monitoring your local machine inside of 20 minutes. No advanced installation options are discussed here - just the basics that will work for 95% of users who want to get started.

These instructions were written based on an Ubuntu 6.10 (desktop) installation. They should work for an Ubuntu 7.10 install as well.

What You'll End Up With

If you follow these instructions, here's what you'll end up with:
Nagios and the plugins will be installed underneath /usr/local/nagios
Nagios will be configured to monitor a few aspects of your local system (CPU load, disk usage, etc.)
The Nagios web interface will be accessible at http://localhost/nagios/


Required Packages

Make sure you've installed the following packages on your Ubuntu installation before continuing.
Apache 2
PHP
GCC compiler and development libraries
GD development libraries

You can use apt-get to install these packages by running the following commands:


sudo apt-get install apache2 
sudo apt-get install libapache2-mod-php5 
sudo apt-get install build-essential

With Ubuntu 6.10, install the gd2 library with this command:

sudo apt-get install libgd2-dev

With Ubuntu 7.10, the gd2 library name has changed, so you'll need to use the following:

sudo apt-get install libgd2-xpm-dev
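
If you script the install, the release-dependent package name can live in a small helper. This is only a sketch: the function name gd_dev_package is made up for illustration, and it knows about just the two releases mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: map an Ubuntu release to the GD development package
# named in this guide. Only 6.10 and 7.10 are covered; any other release is
# reported as unknown instead of being guessed.
gd_dev_package() {
    case $1 in
        6.10) echo libgd2-dev ;;
        7.10) echo libgd2-xpm-dev ;;
        *)    echo "unknown release: $1" >&2; return 1 ;;
    esac
}

# Example use:
#   sudo apt-get install "$(gd_dev_package 7.10)"
```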

1) Create Account Information
Become the root user.

sudo -s

Create a new nagios user account and give it a password.

/usr/sbin/useradd -m -s /bin/bash nagios
passwd nagios

On older Ubuntu server editions (6.01 and earlier), you will need to also add a nagios group (it's not created by default). You should be able to skip this step on desktop, or newer server editions of Ubuntu.

/usr/sbin/groupadd nagios 
/usr/sbin/usermod -G nagios nagios

Create a new nagcmd group for allowing external commands to be submitted through the web interface. Add both the nagios user and the apache user to the group.

/usr/sbin/groupadd nagcmd 
/usr/sbin/usermod -a -G nagcmd nagios 
/usr/sbin/usermod -a -G nagcmd www-data
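
If you expect to run this step more than once, it helps to create a group only when it is missing. The sketch below prints the command instead of running it, so you can review it first. The function name and the GROUP_FILE variable are assumptions for illustration, and it only looks at local groups in /etc/group.

```shell
#!/bin/sh
# Sketch: print the groupadd command only when the group does not exist yet.
# GROUP_FILE is an illustrative override; it defaults to /etc/group.
groupadd_if_missing() {
    group=$1
    group_file=${GROUP_FILE:-/etc/group}
    if grep -q "^${group}:" "$group_file"; then
        echo "# group $group already exists"
    else
        echo "/usr/sbin/groupadd $group"
    fi
}

# Example use (as root), after reviewing the output:
#   groupadd_if_missing nagcmd | sh
```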

2) Download Nagios and the Plugins
Create a directory for storing the downloads.

mkdir ~/downloads 
cd ~/downloads

Download the source code tarballs of both Nagios and the Nagios plugins (visit http://www.nagios.org/download/ for links to the latest versions). These directions were tested with Nagios 3.1.1 and Nagios Plugins 1.4.11; the commands below fetch Nagios 3.2.3.

wget http://prdownloads.sourceforge.net/sourceforge/nagios/nagios-3.2.3.tar.gz 
wget http://prdownloads.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.11.tar.gz
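
A failed download can leave a truncated file or an error page saved under the expected name, so it may be worth checking the tarballs before building. A minimal sketch (the function name tarball_ok is made up here):

```shell
#!/bin/sh
# Sketch: succeed only if the file can be listed as a gzip-compressed tarball.
tarball_ok() {
    tar tzf "$1" >/dev/null 2>&1
}

# Example use after the downloads above:
#   for f in nagios-3.2.3.tar.gz nagios-plugins-1.4.11.tar.gz; do
#       tarball_ok "$f" || echo "download of $f looks broken" >&2
#   done
```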

3) Compile and Install Nagios
Extract the Nagios source code tarball.

cd ~/downloads 
tar xzf nagios-3.2.3.tar.gz 
cd nagios-3.2.3

Run the Nagios configure script, passing the name of the group you created earlier like so:

./configure --with-command-group=nagcmd

Compile the Nagios source code.

make all

Install binaries, init script, sample config files and set permissions on the external command directory.

make install
make install-init
make install-config
make install-commandmode

Don't start Nagios yet - there's still more that needs to be done...

4) Customize Configuration
Sample configuration files have now been installed in the /usr/local/nagios/etc directory. These sample files should work fine for getting started with Nagios. You'll need to make just one change before you proceed...

Edit the /usr/local/nagios/etc/objects/contacts.cfg config file with your favorite editor and change the email address associated with the nagiosadmin contact definition to the address you'd like to use for receiving alerts. In the example below, the contact has also been renamed from nagiosadmin to edge.

vi /usr/local/nagios/etc/objects/contacts.cfg
define contact{
        contact_name                    edge                 ; Short name of user
        use                             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias                           EDGE         ; Full name of user

        email                           edge@blogspot.com      ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
        }

define contactgroup{
        contactgroup_name       admins
        alias                   Nagios Administrators
        members                 edge
        }
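
If you'd rather not open an editor, the email change can be done with sed. This is a sketch under a few assumptions: the function name is made up, the address must not contain "|", "&" or "\" (they are special to sed here), and every email directive in the file gets the new value, which is fine for a file with a single contact.

```shell
#!/bin/sh
# Sketch: set the value of each "email" directive in a contacts file,
# leaving indentation and any trailing ";" comment untouched.
set_contact_email() {
    file=$1
    new_email=$2
    tmp=$(mktemp) || return 1
    sed -E "s|^([[:space:]]*email[[:space:]]+)[^[:space:];]+|\1${new_email}|" "$file" > "$tmp" &&
        cat "$tmp" > "$file"   # write back in place so owner and mode are kept
    status=$?
    rm -f "$tmp"
    return $status
}

# Example use (the address is a placeholder):
#   set_contact_email /usr/local/nagios/etc/objects/contacts.cfg you@example.org
```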

5) Configure the Web Interface

Install the Nagios web config file in the Apache conf.d directory.

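make install-webconf

Create an edge account for logging into the Nagios web interface. Remember the password you assign to this account - you'll need it later.

htpasswd -c /usr/local/nagios/etc/htpasswd.users edge

The sample /usr/local/nagios/etc/cgi.cfg grants access to a user named nagiosadmin, so if you log in under a different name such as edge, you will probably need to replace nagiosadmin with that name in the authorized_for_* directives there.

Reload Apache to make the new settings take effect.

/etc/init.d/apache2 reload

Note: Consider implementing the enhanced CGI security measures described in the Nagios documentation to ensure that your web authentication credentials are not compromised.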

6) Compile and Install the Nagios Plugins
For the plugins build to include check_snmp, install net-snmp before proceeding to the next step.

Installing net-snmp:

cd ~/downloads
wget http://sourceforge.net/projects/net-snmp/files/net-snmp/5.6.1/net-snmp-5.6.1.tar.gz
tar xzf net-snmp-5.6.1.tar.gz
cd net-snmp-5.6.1
./configure
make
make install


Extract the Nagios plugins source code tarball.

cd ~/downloads 
tar xzf nagios-plugins-1.4.11.tar.gz 
cd nagios-plugins-1.4.11 

Compile and install the plugins.

./configure --with-nagios-user=nagios --with-nagios-group=nagios 
make 
make install
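
After the install you can check that the plugins you care about actually landed. A small sketch (the function name is made up; with the default prefix used in this guide the plugins go to /usr/local/nagios/libexec):

```shell
#!/bin/sh
# Sketch: print the name of every listed plugin that is not present as an
# executable file in the given directory. No output means nothing is missing.
missing_plugins() {
    dir=$1
    shift
    for name in "$@"; do
        [ -x "$dir/$name" ] || echo "$name"
    done
}

# Example use:
#   missing_plugins /usr/local/nagios/libexec check_snmp check_ping
```

If check_snmp shows up in the output, the plugins were most likely configured before net-snmp was available.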

7) Start Nagios
Configure Nagios to automatically start when the system boots.

ln -s /etc/init.d/nagios /etc/rcS.d/S99nagios

Verify the sample Nagios configuration files.

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

If there are no errors, start Nagios.

/etc/init.d/nagios start
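
The verify and start steps can be chained so Nagios is only started when the check passes. A sketch, to be run as root: NAGIOS_BIN, NAGIOS_CFG and NAGIOS_INIT are illustrative override variables invented for this example, not settings Nagios itself reads.

```shell
#!/bin/sh
# Sketch: run the configuration check and start Nagios only if it succeeds.
# The defaults match the paths used in this guide.
start_if_valid() {
    nagios_bin=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
    nagios_cfg=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
    nagios_init=${NAGIOS_INIT:-/etc/init.d/nagios}

    if "$nagios_bin" -v "$nagios_cfg" >/dev/null; then
        "$nagios_init" start
    else
        echo "configuration check failed; not starting Nagios" >&2
        return 1
    fi
}
```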

8) Login to the Web Interface

You should now be able to access the Nagios web interface at the URL below. You'll be prompted for the username and password you specified earlier (edge in this example; the stock Nagios instructions use nagiosadmin).

http://localhost/nagios/

Click on the "Service Detail" navbar link to see details of what's being monitored on your local machine. It will take a few minutes for Nagios to check all the services associated with your machine, as the checks are spread out over time.

9) Other Modifications
If you want to receive email notifications for Nagios alerts, you need to install the mailx and Postfix packages.

sudo apt-get install mailx 
sudo apt-get install postfix

You'll have to edit the Nagios email notification commands found in /usr/local/nagios/etc/objects/commands.cfg and change any '/bin/mail' references to '/usr/bin/mail'. Once you do that you'll need to restart Nagios to make the configuration changes live.

sudo /etc/init.d/nagios restart

Configuring email notifications is outside the scope of this documentation. Refer to your system documentation, search the web, or look to the Nagios Support Portal or Nagios Community Wiki for specific instructions on configuring your Ubuntu system to send email messages to external addresses.
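
The '/bin/mail' to '/usr/bin/mail' change mentioned above can also be scripted. The sketch below (function name made up) only rewrites a standalone /bin/mail, so a path that already reads /usr/bin/mail is left alone and the function is safe to run twice.

```shell
#!/bin/sh
# Sketch: point notification commands at /usr/bin/mail.
# Only "/bin/mail" at the start of a line or after whitespace, and followed
# by whitespace or end of line, is rewritten.
fix_mail_path() {
    file=$1
    tmp=$(mktemp) || return 1
    sed -E 's#(^|[[:space:]])/bin/mail([[:space:]]|$)#\1/usr/bin/mail\2#g' "$file" > "$tmp" &&
        cat "$tmp" > "$file"   # write back in place so owner and mode are kept
    status=$?
    rm -f "$tmp"
    return $status
}

# Example use (as root):
#   fix_mail_path /usr/local/nagios/etc/objects/commands.cfg
```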

10) Define host and configure hostgroups

To keep hostgroup definitions in their own file, I add this line to /usr/local/nagios/etc/nagios.cfg:

cfg_file=/usr/local/nagios/etc/objects/hostgroups.cfg
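
If you add the line from a script, it is worth checking that it isn't already there. A minimal sketch, with a made-up function name:

```shell
#!/bin/sh
# Sketch: append a cfg_file line to the main config unless that exact line
# is already present.
add_cfg_file() {
    main_cfg=$1
    object_file=$2
    line="cfg_file=$object_file"
    grep -qxF "$line" "$main_cfg" || printf '%s\n' "$line" >> "$main_cfg"
}

# Example use (as root):
#   add_cfg_file /usr/local/nagios/etc/nagios.cfg \
#       /usr/local/nagios/etc/objects/hostgroups.cfg
```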

Then create the file /usr/local/nagios/etc/objects/hostgroups.cfg and add the following content:

vi /usr/local/nagios/etc/objects/hostgroups.cfg
define hostgroup{
        hostgroup_name  linux-servers ; The name of the hostgroup
        alias           Linux Servers ; Long name of the group
        }

define hostgroup{
        hostgroup_name  windows-servers
        alias           Windows Servers
        }

define hostgroup{
        hostgroup_name  cisco-devices
        alias           Cisco Devices
        }

define hostgroup{
        hostgroup_name  printers
        alias           Printers
        }

Next, make each host template a member of the matching hostgroup by adding a hostgroups directive to the templates in /usr/local/nagios/etc/objects/templates.cfg:

define host{
        name                            linux-server    ; The name of this host template
        use                             generic-host    ; This template inherits other values from the generic-host template
        check_period                    24x7            ; By default, Linux hosts are checked round the clock
        check_interval                  5               ; Actively check the host every 5 minutes
        retry_interval                  1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts              10              ; Check each Linux host 10 times (max)
        check_command                   check-host-alive ; Default command to check Linux hosts
        notification_period             workhours       ; Linux admins hate to be woken up, so we only notify during the day
                                                        ; Note that the notification_period variable is being overridden from
                                                        ; the value that is inherited from the generic-host template!
        notification_interval           120             ; Resend notifications every 2 hours
        notification_options            d,u,r           ; Only send notifications for specific host states
        contact_groups                  admins          ; Notifications get sent to the admins by default
        hostgroups                      linux-servers   ;
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }

# Cisco host definition template - This is NOT a real host, just a template!

define host{
        name                            cisco-device    ; The name of this host template
        use                             generic-host    ; This template inherits other values from the generic-host template
        check_period                    24x7            ; By default, Cisco devices are checked round the clock
        check_interval                  5               ; Actively check the host every 5 minutes
        retry_interval                  1               ; Schedule host check retries at 1 minute intervals
        max_check_attempts              10              ; Check each Cisco device 10 times (max)
        check_command                   check-host-alive ; Default command to check Cisco devices
        notification_period             workhours       ; Admins hate to be woken up, so we only notify during the day
                                                        ; Note that the notification_period variable is being overridden from
                                                        ; the value that is inherited from the generic-host template!
        notification_interval           120             ; Resend notifications every 2 hours
        notification_options            d,u,r           ; Only send notifications for specific host states
        contact_groups                  admins          ; Notifications get sent to the admins by default
        hostgroups                      cisco-devices   ;
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }

With that in place, every new host that uses one of these templates (linux-server, cisco-device, ...) automatically belongs to the corresponding hostgroup:

define host{
        use                     linux-server            ; Name of host template to use
                                                        ; This host definition will inherit all variables that are defined
                                                        ; in (or inherited by) the linux-server host template definition.
        host_name               localhost
        alias                   localhost
        address                 127.0.0.1
        }
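
Since a host definition like the one above is just a template name plus a name and an address, it is easy to generate. A sketch: the function name and argument order are made up, and the address in the example is a placeholder.

```shell
#!/bin/sh
# Sketch: print a host definition that inherits from a host template.
# The alias is simply set to the host name.
host_definition() {
    template=$1
    host_name=$2
    address=$3
    printf 'define host{\n'
    printf '        use                     %s\n' "$template"
    printf '        host_name               %s\n' "$host_name"
    printf '        alias                   %s\n' "$host_name"
    printf '        address                 %s\n' "$address"
    printf '        }\n'
}

# Example use:
#   host_definition cisco-device router 192.0.2.1
```

Because the cisco-device template carries the hostgroups directive, a host generated this way ends up in the cisco-devices group without naming the group itself.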

11) Add Other Plugins

Download additional plugins from exchange.nagios.org and install them on your server.
Before a plugin can be used, you have to define a command for it in commands.cfg.

For example, for check_snmp_cisco_loadavg and check_snmp_cisco_memutil, I add the following commands to commands.cfg:

# 'check_snmp_cisco_loadavg' command definition
define command{
        command_name    check_snmp_cisco_loadavg
        command_line    $USER1$/check_snmp_cisco_loadavg -H $HOSTADDRESS$ -C $ARG1$ -w $ARG2$ -c $ARG3$
        }

# 'check_snmp_cisco_memutil' command definition
define command{
        command_name    check_snmp_cisco_memutil
        command_line    $USER1$/check_snmp_cisco_memutil -H $HOSTADDRESS$ -C $ARG1$ -w $ARG2$ -c $ARG3$
        }

Then define a service for a specific host as below:

define service{
       use                  generic-service ; Inherit values from a template
       host_name            router
       service_description  CPU Load Average
       check_command        check_snmp_cisco_loadavg!communitystring!50!80
       }

define service{
       use                     generic-service ; Inherit values from a template
       host_name               router
       service_description     Memory Usage
       check_command           check_snmp_cisco_memutil!communitystring!60!80
       }
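
The values after each "!" in check_command fill $ARG1$, $ARG2$, and so on in the matching command_line. The sketch below imitates that substitution so you can preview the command that would run. It is a simplified illustration, not how Nagios implements it: the function name is made up, only $USER1$, $HOSTADDRESS$ and $ARGn$ are handled, and values must not contain "|", "&" or "\".

```shell
#!/bin/sh
# Sketch: expand a command_line template using a check_command value.
expand_check_command() {
    line=$1        # command_line from the command definition
    check=$2       # check_command from the service definition
    address=$3     # value to use for $HOSTADDRESS$
    plugin_dir=$4  # value to use for $USER1$

    line=$(printf '%s\n' "$line" |
        sed -e "s|\\\$USER1\\\$|$plugin_dir|g" -e "s|\\\$HOSTADDRESS\\\$|$address|g")

    # Everything after the first "!" is the argument list.
    case $check in
        *'!'*) rest=${check#*!} ;;
        *)     rest= ;;
    esac

    n=1
    while [ -n "$rest" ]; do
        arg=${rest%%!*}    # text up to the next "!"
        line=$(printf '%s\n' "$line" | sed "s|\\\$ARG$n\\\$|$arg|g")
        case $rest in
            *'!'*) rest=${rest#*!} ;;
            *)     rest= ;;
        esac
        n=$((n + 1))
    done
    printf '%s\n' "$line"
}
```

For the CPU load service above, with a placeholder address of 192.0.2.1 and /usr/local/nagios/libexec as the plugin directory, this yields a command ending in "-C communitystring -w 50 -c 80".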

Then reload Nagios:

sudo /etc/init.d/nagios reload

12) Install nagiosgraph

Follow the nagiosgraph INSTALL instructions:

http://nagiosgraph.svn.sourceforge.net/viewvc/nagiosgraph/trunk/nagiosgraph/INSTALL
