50. Nagios Monitoring Tool

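Nagios is an open-source tool for monitoring computer systems and networks. It can monitor the state of Windows, Linux, and Unix hosts. When a system or service enters an abnormal state, it sends an alert by email or SMS so that the operations team is notified immediately, and it sends a recovery notification by email or SMS once the state returns to normal.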

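The four Nagios monitoring states:

- 0 (OK): the state is normal / green

- 1 (WARNING): a warning has been raised / yellow

- 2 (CRITICAL): a very serious error has occurred / red

- 3 (UNKNOWN): an unknown error / dark yellow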
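
Nagios reads these states from the exit code of each check plugin. The following is a minimal sketch of that convention, not a real plugin: the function name `check_value` and its three arguments are hypothetical and only show how the four codes are returned.

```shell
#!/bin/bash
# Hypothetical plugin sketch: compare a numeric value against thresholds
# and report one of the four Nagios states through the return code.
check_value() {
    local value="$1" warn="$2" crit="$3"

    # Anything that is not a non-negative integer is reported as UNKNOWN.
    case "$value" in
        ''|*[!0-9]*) echo "UNKNOWN - invalid value '$value'"; return 3 ;;
    esac

    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - value is $value"
        return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - value is $value"
        return 1
    fi
    echo "OK - value is $value"
    return 0
}

# Example: 120 lies between the warning (100) and critical (200) thresholds.
rc=0
check_value 120 100 200 || rc=$?
echo "exit code: $rc"
```

The first line of output is what Nagios would show as the status text, and the exit code decides the state shown in the web interface.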

Nagios uses the NRPE plugin to run checks on remote hosts.

 

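The idea behind Nagios configuration:

Decide which host to monitor, which services on that host to monitor, during which time period to monitor them, and which contact should receive an email notification when the monitored host fails.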

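Nagios configuration file path: /usr/local/nagios/etc/

The Nagios configuration files:

- nagios.cfg: main configuration file; defines where the other configuration files are stored

- hostgroups.cfg: host group configuration file; defines host groups

- contacts.cfg: contact configuration file; defines contacts and contact groups

- commands.cfg: command configuration file; defines which commands are used for monitoring

- timeperiods.cfg: time period configuration file; defines the time ranges in which monitoring takes place

- templates.cfg: template file; holds definitions that other objects reference

- localhost.cfg: local host configuration file; used to monitor the local machine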

Deploying the Nagios monitoring system:

Flush the firewall rules and put SELinux into permissive mode

# iptables -F
# setenforce 0

Create the user and group that Nagios runs as, create the installation directory, and set its ownership

# useradd -s /sbin/nologin nagios
# mkdir /usr/local/nagios
# chown -R nagios:nagios /usr/local/nagios

Compile and install Nagios

# tar xzvf nagios-4.0.1.tar.gz
# cd nagios-4.0.1
# ./configure --prefix=/usr/local/nagios
# make all
# make install
# make install-init
# make install-commandmode
# make install-config
# chkconfig --add nagios
# chkconfig nagios on

Install nagios-plugins

# tar xzvf nagios-plugins-1.5.tar.gz
# cd nagios-plugins-1.5
# ./configure --prefix=/usr/local/nagios
# make && make install

Install the NRPE plugin

# yum -y install openssl-devel
# tar xzvf nrpe-2.15.tar.gz
# cd nrpe-2.15
# ./configure
# make all
# make install-plugin

Install and configure Apache and PHP

# yum -y install httpd php
# vim /etc/httpd/conf/httpd.conf

Add the following content. The Nagios web pages can only be accessed after authentication:

ScriptAlias /nagios/cgi-bin "/usr/local/nagios/sbin"
<Directory "/usr/local/nagios/sbin">
    Options ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "Nagios Access"
    AuthType Basic
    # File used to authenticate access to this directory
    AuthUserFile /usr/local/nagios/etc/htpasswd.users
    Require valid-user
</Directory>

Alias /nagios "/usr/local/nagios/share"
<Directory "/usr/local/nagios/share">
    Options None
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "Nagios Access"
    AuthType Basic
    AuthUserFile /usr/local/nagios/etc/htpasswd.users
    Require valid-user
</Directory>

# service httpd restart

Create the web user nagiosadm with the password nagiosadm (htpasswd prompts for the password)

# /usr/bin/htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadm

Configure Nagios

# vim /usr/local/nagios/etc/nagios.cfg

Add these parameters:

# Directory that holds the per-host files
cfg_dir=/usr/local/nagios/etc/conf
# Location of the host group file
cfg_file=/usr/local/nagios/etc/objects/hostgroups.cfg

# mkdir /usr/local/nagios/etc/conf/         # create the directory
# cp /usr/local/nagios/etc/objects/localhost.cfg /usr/local/nagios/etc/conf/192.168.254.129.cfg   # copy the host file
# vim /usr/local/nagios/etc/conf/192.168.254.129.cfg        # edit the host file

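Change the content to:

define host{                                           ; monitored host
        use                     linux-server           ; inherit from this template
        host_name               web                    ; host name
        alias                   web                    ; alias
        address                 192.168.254.129        ; address of the monitored host
        }

define service{                                        ; service on the monitored host
        use                     local-service          ; inherit from this template
        host_name               web                    ; host name
        service_description     PING                   ; service description
        check_command           check_ping!100.0,20%!500.0,60%   ; command used for the check
        }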

# vim /usr/local/nagios/etc/objects/hostgroups.cfg

define hostgroup{                                      ; monitored host group
        hostgroup_name          webs                   ; host group name
        alias                   webs                   ; host group alias
        members                 web                    ; member hosts (references host_name)
        }
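
When several hosts need the same pair of definitions, the copy-and-edit step can be scripted. The sketch below is one possible way to do it, assuming the linux-server and local-service templates shown in this article; `write_host_cfg`, the host name `web2`, and the address 192.168.254.130 are hypothetical and used only for illustration.

```shell
#!/bin/bash
# Sketch: write a host file with one host and one PING service.
# write_host_cfg is a hypothetical helper, not part of Nagios.
write_host_cfg() {
    local host_name="$1" address="$2" out_dir="$3"
    cat > "$out_dir/$address.cfg" <<EOF
define host{
        use                     linux-server
        host_name               $host_name
        alias                   $host_name
        address                 $address
        }

define service{
        use                     local-service
        host_name               $host_name
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }
EOF
}

# Example: generate the file in a scratch directory and print it.
demo_dir="$(mktemp -d)"
write_host_cfg web2 192.168.254.130 "$demo_dir"
cat "$demo_dir/192.168.254.130.cfg"
```

After reviewing the generated file, it would be copied into /usr/local/nagios/etc/conf/ and the whole configuration checked with `/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg` before Nagios is restarted.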

 

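# vim /usr/local/nagios/etc/objects/templates.cfg              # template file

define host{
        name                            generic-host    ; template name (host)
        notifications_enabled           1               ; notifications enabled
        event_handler_enabled           1               ; event handler enabled
        flap_detection_enabled          1               ; flap detection enabled
        process_perf_data               1               ; process performance data
        retain_status_information       1               ; retain status information across restarts
        retain_nonstatus_information    1               ; retain non-status information across restarts
        notification_period             24x7            ; notification period (references a time period)
        register                        0               ; do not register (template only)
        }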

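define service{
        name                            generic-service ; template name (service)
        active_checks_enabled           1               ; active checks enabled
        passive_checks_enabled          1               ; passive checks enabled
        parallelize_check               1               ; run checks in parallel
        obsess_over_service             1               ; obsess over this service
        check_freshness                 0               ; freshness checking disabled
        notifications_enabled           1               ; notifications enabled
        event_handler_enabled           1               ; event handler enabled
        flap_detection_enabled          1               ; flap detection enabled
        process_perf_data               1               ; process performance data
        retain_status_information       1               ; retain status information across restarts
        retain_nonstatus_information    1               ; retain non-status information across restarts
        is_volatile                     0               ; the service is not volatile
        check_period                    24x7            ; check period (references a time period)
        max_check_attempts              3               ; maximum number of check attempts
        normal_check_interval           10              ; normal check interval (minutes)
        retry_check_interval            2               ; retry check interval (minutes)
        contact_groups                  admins          ; contact group (references contacts)
        notification_options            w,u,c,r         ; states that trigger a notification
        notification_interval           60              ; notification interval (minutes)
        notification_period             24x7            ; notification period (references a time period)
        register                        0               ; do not register (template only)
        }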

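define contact{
        name                            generic-contact         ; template name (contact)
        service_notification_period     24x7                    ; service notification period (references a time period)
        host_notification_period        24x7                    ; host notification period (references a time period)
        service_notification_options    w,u,c,r,f,s             ; service states that trigger a notification
        host_notification_options       d,u,r,f,s               ; host states that trigger a notification
        service_notification_commands   notify-service-by-email ; command that mails service notifications
        host_notification_commands      notify-host-by-email    ; command that mails host notifications
        register                        0                       ; do not register (template only)
        }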

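define host{
        name                            linux-server      ; template name (host)
        use                             generic-host      ; inherit the parameters of this template
        check_period                    24x7              ; check period (references a time period)
        check_interval                  5                 ; check interval (minutes)
        retry_interval                  1                 ; retry interval (minutes)
        max_check_attempts              10                ; maximum number of check attempts
        check_command                   check-host-alive  ; check command (references a command)
        notification_period             workhours         ; notification period (references a time period)
        notification_interval           120               ; notification interval (minutes)
        notification_options            d,u,r             ; host states that trigger a notification
        contact_groups                  admins            ; contact group (references contacts)
        register                        0                 ; do not register (template only)
        }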

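define service{
        name                            local-service     ; template name (service)
        use                             generic-service   ; inherit the parameters of this template
        max_check_attempts              4                 ; maximum number of check attempts
        normal_check_interval           5                 ; normal check interval (minutes)
        retry_check_interval            1                 ; retry check interval (minutes)
        register                        0                 ; do not register (template only)
        }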

# vim /usr/local/nagios/etc/objects/timeperiods.cfg        # time period file

define timeperiod{
        timeperiod_name 24x7                            ; time period name
        alias           24 Hours A Day, 7 Days A Week   ; time period alias
        sunday          00:00-24:00                     ; Sunday
        monday          00:00-24:00                     ; Monday
        tuesday         00:00-24:00                     ; Tuesday
        wednesday       00:00-24:00                     ; Wednesday
        thursday        00:00-24:00                     ; Thursday
        friday          00:00-24:00                     ; Friday
        saturday        00:00-24:00                     ; Saturday
        }

define timeperiod{
        timeperiod_name workhours                       ; time period name
        alias           Normal Work Hours               ; time period alias
        monday          09:00-17:00                     ; Monday
        tuesday         09:00-17:00                     ; Tuesday
        wednesday       09:00-17:00                     ; Wednesday
        thursday        09:00-17:00                     ; Thursday
        friday          09:00-17:00                     ; Friday
        }
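
Because the linux-server template uses workhours as its notification_period, host notifications are only sent inside that window. The sketch below mirrors the workhours definition for a quick manual check; `in_workhours` is a hypothetical helper, and treating 17:00 itself as already outside the window is an assumption of this sketch.

```shell
#!/bin/bash
# Sketch: does a given moment fall inside the "workhours" period
# (Monday to Friday, 09:00-17:00)? in_workhours is a hypothetical helper.
# $1 = day of week as printed by `date +%u` (1 = Monday ... 7 = Sunday)
# $2 = time as printed by `date +%H:%M` (two-digit hour and minute)
in_workhours() {
    local dow="$1" hour="${2%%:*}" min="${2##*:}"
    # Drop one leading zero so that 08 and 09 are not read as octal.
    hour="${hour#0}"; min="${min#0}"
    local minutes=$(( hour * 60 + min ))

    [ "$dow" -ge 1 ] && [ "$dow" -le 5 ] &&
        [ "$minutes" -ge $(( 9 * 60 )) ] && [ "$minutes" -lt $(( 17 * 60 )) ]
}

if in_workhours "$(date +%u)" "$(date +%H:%M)"; then
    echo "inside workhours: host notifications can be sent now"
else
    echo "outside workhours: host notifications are not sent now"
fi
```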

 

# vim /usr/local/nagios/etc/objects/contacts.cfg                    # contact file

define contact{
        contact_name            nagiosadmin             ; contact name
        use                     generic-contact         ; inherit from this template
        alias                   Nagios Admin            ; alias
        email                   867218859@qq.com        ; email address
        }

define contactgroup{
        contactgroup_name       admins                  ; contact group name
        alias                   Nagios Administrators   ; contact group alias
        members                 nagiosadmin             ; members
        }

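Note: the sections above explain the configuration files in detail.

Practical advice for configuring Nagios:

On a fresh Nagios installation you only need to define, according to your requirements:

The host file: which host to monitor, and which commands check which services on that host

The contacts: which contact receives notifications, and at which email address

The time periods: in which time period notifications are sent

That completes the configuration.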

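check_ping!100.0,20%!500.0,60% means: run the check_ping command; if the round-trip time is >= 100 ms or the packet loss is >= 20%, a WARNING is raised; if the round-trip time is >= 500 ms or the packet loss is >= 60%, a CRITICAL is raised; otherwise no alert is raised. '!' separates the command name and its arguments, and ',' inside a threshold pair means "or".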
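
The threshold rule can be sketched in a few lines of shell. This is only an illustration of the logic described above, not the real check_ping plugin: `ping_state` is a hypothetical helper, and the measured values are passed in by hand instead of being taken from an actual ping.

```shell
#!/bin/bash
# Sketch of the threshold logic; ping_state is a hypothetical helper.
# $1 = measured round-trip time in ms, $2 = measured packet loss in %
# $3 = warning pair "rta,loss%",       $4 = critical pair "rta,loss%"
ping_state() {
    awk -v rta="$1" -v loss="$2" -v warn="$3" -v crit="$4" 'BEGIN {
        gsub(/%/, "", warn); gsub(/%/, "", crit)
        split(warn, w, ","); split(crit, c, ",")
        # Either value reaching its threshold is enough ("," means "or").
        if (rta >= c[1] + 0 || loss >= c[2] + 0)      { print "CRITICAL"; exit 2 }
        else if (rta >= w[1] + 0 || loss >= w[2] + 0) { print "WARNING";  exit 1 }
        print "OK"; exit 0
    }'
}

# Example: 120.5 ms with no packet loss crosses only the warning threshold.
rc=0
ping_state 120.5 0 "100.0,20%" "500.0,60%" || rc=$?
echo "exit code: $rc"
```

The critical pair is tested first so that a value exceeding both thresholds is reported with the more severe state.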

Service alert levels: w: warning  u: unknown  c: critical  r: recovery  f: flapping  s: scheduled downtime

Host states: d: down  u: unreachable  r: recovery

Disable authentication

# vim /usr/local/nagios/etc/cgi.cfg

Change this setting:

use_authentication=0

Deploying the monitored host

# yum -y install openssl openssl-devel
# useradd nagios -s /sbin/nologin
# mkdir /usr/local/nagios
# tar xzvf nagios-plugins-1.5.tar.gz
# cd nagios-plugins-1.5
# ./configure --prefix=/usr/local/nagios
# make && make install
# tar xzvf nrpe-2.15.tar.gz
# cd nrpe-2.15
# ./configure --prefix=/usr/local/nagios
# make all && make install-plugin && make install-daemon && make install-daemon-config
# vim /usr/local/nagios/etc/nrpe.cfg

Add the following content. allowed_hosts names the monitoring server:

allowed_hosts=127.0.0.1,192.168.254.128
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20

# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d       # start the NRPE daemon
# netstat -lnupt | grep 5666                                            # NRPE listens on port 5666

On the monitoring server, test that NRPE responds

# /usr/local/nagios/libexec/check_nrpe -H 192.168.254.129

Restart the service

# service nagios restart
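
For Nagios to schedule the NRPE commands defined on the monitored host, the monitoring server also needs a command and a service that call check_nrpe. The sketch below writes such definitions to a scratch file for review; the command name check_nrpe and the service description "Load via NRPE" are choices made for this example, and the service reuses the host web and the local-service template from earlier in this article.

```shell
#!/bin/bash
# Sketch: object definitions that let the monitoring server call NRPE.
# The quoted heredoc keeps the Nagios macros ($USER1$ etc.) unexpanded.
out_file="$(mktemp)"
cat > "$out_file" <<'EOF'
define command{
        command_name            check_nrpe
        command_line            $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }

define service{
        use                     local-service
        host_name               web
        service_description     Load via NRPE
        check_command           check_nrpe!check_load
        }
EOF
echo "wrote $out_file"
cat "$out_file"
```

The argument after '!' must match a name in square brackets in nrpe.cfg on the monitored host. The same command can be tried by hand first with `/usr/local/nagios/libexec/check_nrpe -H 192.168.254.129 -c check_load`.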

Sending mail from Linux with Sendmail:

# yum -y install sendmail postfix mailx
# systemctl restart sendmail

Ways to send a message:

# echo 'content' | mail -s 'subject' recipient_address

# mail -v 867218859@qq.com

> Subject:

> Message body:

Press Ctrl+D to finish and send.

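Sending mail from Linux with the identity of an external IMAP mailbox:

Mail protocols:

SMTP: used to send mail; port 25

POP3: used to receive mail; port 110

IMAP: Internet Message Access Protocol, used to read and manage mail online on the server; port 143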

# yum -y install mailx
# vim /etc/mail.rc

Add these parameters:

set from=13590163240@163.com
set imap=imap.163.com
set imap-auth=login
set imap-auth-user=13590163240@163.com
set imap-auth-password=xyz110110

Ways to send a message:

# echo 'content' | mail -s 'subject' recipient_address

# mail -v 867218859@qq.com

> Subject:

> Message body:

Press Ctrl+D to finish and send.
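
Since SMTP is the protocol that actually sends mail, an external account is more commonly configured through the smtp variables of mailx. The sketch below assumes Heirloom mailx (the mailx package on CentOS/RHEL) and writes the settings to a scratch file for review; the server name, address, and password are placeholders, not values from this article.

```shell
#!/bin/bash
# Sketch: SMTP-based settings for Heirloom mailx, written to a scratch
# file for review. The address, server and password are placeholders.
rc_file="$(mktemp)"
cat > "$rc_file" <<'EOF'
set from=user@example.com
set smtp=smtp.example.com
set smtp-auth=login
set smtp-auth-user=user@example.com
set smtp-auth-password=CHANGE_ME
EOF
cat "$rc_file"
```

After replacing the placeholders with the real account details, the lines would be appended to /etc/mail.rc, and the `mail -s` command shown above can be used unchanged.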

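Fixing mail that cannot be sent

Error message

# cat /var/log/maillog

DSN: Service unavailable

Solution

Change the hostname:

1. Edit /etc/sysconfig/network and change the hostname

2. Add the hostname to /etc/hosts

3. Run hostname www.a.com to change it for the current session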
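
Step 2 is easy to get wrong, so it can help to confirm that the name really is present in the hosts file. The following is a minimal sketch; `host_in_hosts_file` is a hypothetical helper that only looks at the file and does not perform a DNS lookup.

```shell
#!/bin/bash
# Sketch: report whether a host name has an entry in a hosts file.
# host_in_hosts_file is a hypothetical helper.
# $1 = host name to look for, $2 = hosts file (defaults to /etc/hosts)
host_in_hosts_file() {
    local name="$1" hosts_file="${2:-/etc/hosts}"
    # Skip comment lines, then compare every field after the address.
    awk -v name="$name" '
        /^[[:space:]]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$hosts_file"
}

if host_in_hosts_file "$(hostname)"; then
    echo "$(hostname) is listed in /etc/hosts"
else
    echo "$(hostname) is missing from /etc/hosts"
fi
```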

 

