Class: Plugin

Inherits:
Object
Defined in:
lib/rnagios/plugin.rb

Overview

Nagios processes two types of checks: active checks and NSCA checks. Active checks run on the Nagios monitoring host and actively check services. NSCA checks run on remote hosts, and their output is sent back to the Nagios host, where the NSCA daemon processes it. In other words, active checks are “actively” run by Nagios, while NSCA checks are “passively” run by other servers.

To create an active check, create logic that measures a service (such as a database, a web service, or a custom application) and returns a fully-populated ActiveStatus object. By the same token, to create an NSCA check, create logic that measures a service and returns a fully-populated NscaHostStatus or NscaServiceStatus object. (There are many gems available that can help you connect to services and parse their output; this class does not attempt to be everything to everyone.)

Plugin formats the status message and provides the appropriate exit code so that Nagios processes the result correctly. When running on UNIX/Linux, Nagios checks the exit code along with the string in the status message to determine how to handle the result. Under Windows, Nagios only cares about the strings in the status message.
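The pairing of a status string with an exit code can be sketched as follows. The exit codes 0 to 3 are the standard Nagios plugin return codes; the constant name, the method names, and the exact layout of the status line are assumptions made for illustration and are not taken from rnagios.

```ruby
# Standard Nagios plugin exit codes. EXIT_CODES, status_line and report are
# hypothetical names used only for this sketch; they are not part of rnagios.
EXIT_CODES = {
  ok: 0,
  warning: 1,
  critical: 2,
  unknown: 3
}.freeze

# Builds the single status line a plugin prints to STDOUT.
# The "<service> <STATE>: <message>" layout is an assumed format.
def status_line(service, state, message)
  "#{service} #{state.to_s.upcase}: #{message}"
end

# Returns [line, exit_code]; unrecognized states fall back to UNKNOWN.
def report(service, state, message)
  state = :unknown unless EXIT_CODES.key?(state)
  [status_line(service, state, message), EXIT_CODES[state]]
end

line, code = report('check_service', :warning, 'response time 1.2s')
puts line # prints "check_service WARNING: response time 1.2s"
# A real plugin would finish with `exit code` (1 for WARNING here).
```

With Plugin, none of this needs to be written by hand: the Status object returned by check carries both the formatted message and the exit code.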

Plugin allows you to use an optional configuration file in case your plugin requires access to multiple configuration parameters. If you supply a configuration file, it should be a YAML file, and it will be parsed in the check method before the measure method is called. measure() can then use the @config attribute to access configuration parameters. The configuration file is expected to be in the same directory as the plugin.
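As a rough illustration of what the parsed configuration looks like, the sketch below writes a small YAML file and loads it with YAML.load_file, the same call the check method uses. The key names (timeout, database, user, name) are hypothetical. Inside measure() the resulting hash would be read through @config rather than a local variable.

```ruby
require 'yaml'
require 'tempfile'

# Hypothetical configuration contents; the key names are illustrative only.
yaml_text = <<~YAML
  timeout: 10
  database:
    user: nagios
    name: inventory
YAML

# Write the sample to a temporary file so the sketch is self-contained.
file = Tempfile.new(['check_service', '.yml'])
file.write(yaml_text)
file.close

# Parse the file into a plain Ruby hash with string keys.
config = YAML.load_file(file.path)

timeout = config['timeout']
db_user = config['database']['user']
puts "timeout=#{timeout} user=#{db_user}"

# Remove the temporary file.
file.unlink
```

Note that keys in the loaded hash are strings, so a lookup such as config[:timeout] would return nil.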

Constructor Details

#initialize(params = {}) ⇒ Plugin

Plugin expects a hash with the following symbols:

:host         # The hostname or IP address to check (required)
:port         # The port on :host to check (default: 80)
:name         # The name of the service, usually starting with 'check_' (default: <UNDEFINED>)
:config_file  # Path to YAML configuration file (optional)
:w            # Warning level (optional, usually numeric)
:c            # Critical level (optional, usually numeric)
:use_ssl      # True to use SSL, false otherwise (default: false)
:verify_ssl   # If using SSL, set to true to verify the server (default: false)

:host must be provided; otherwise a NagiosError is raised. A NagiosError is also raised when params is empty.



# File 'lib/rnagios/plugin.rb', line 62

def initialize(params={})
  if !blank?(params)
    if params[:host].nil?
      raise NagiosError.new('Hostname must be provided')
    else
      @host = params[:host]
    end
    if params[:port].nil?
      @port = 80
    elsif params[:port].is_a?(String)
      begin
        @port = params[:port].to_i
      rescue
        raise NagiosError.new('Port number must be numeric')
      end
    else
      @port = params[:port]
    end
    @name = default(params[:name])
    @config_file = params[:config_file] if !params[:config_file].nil?
    @w = params[:w] if !params[:w].nil?
    @c = params[:c] if !params[:c].nil?
    if !params[:use_ssl].nil?
      @use_ssl = params[:use_ssl]
    else
      @use_ssl = false
    end
    if !params[:verify_ssl].nil?
      @verify_ssl = params[:verify_ssl]
    else
      @verify_ssl = false
    end
  else
    raise NagiosError.new('At least hostname must be provided')
  end
end
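One detail of the listing above is worth noting: String#to_i does not raise for non-numeric input (it returns 0), so the rescue branch around the port conversion is unlikely to ever run. If strict validation is wanted in your own code, Kernel#Integer can be used instead. The helper below is a hypothetical sketch and is not part of rnagios; it raises ArgumentError rather than NagiosError so that it runs on its own.

```ruby
# Hypothetical helper, not part of rnagios: strict port parsing.
def parse_port(value, default = 80)
  return default if value.nil?

  port = begin
    # Integer() raises ArgumentError for 'abc' and TypeError for e.g. an Array
    Integer(value)
  rescue ArgumentError, TypeError
    raise ArgumentError, "Port number must be numeric: #{value.inspect}"
  end
  raise ArgumentError, "Port out of range: #{port}" unless (1..65_535).cover?(port)

  port
end

puts 'abc'.to_i          # prints 0; to_i silently accepts bad input
puts parse_port('8080')  # prints 8080
puts parse_port(nil)     # prints 80
```

Be aware that Integer() treats a leading zero as an octal prefix, so a string such as '080' is rejected.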

Instance Attribute Details

#c ⇒ Object

Critical level



# File 'lib/rnagios/plugin.rb', line 40

def c
  @c
end

#config ⇒ Object (readonly)

Internal configuration access attribute



# File 'lib/rnagios/plugin.rb', line 33

def config
  @config
end

#config_file ⇒ Object

Path to the configuration file



# File 'lib/rnagios/plugin.rb', line 42

def config_file
  @config_file
end

#host ⇒ Object

Name of host to check



# File 'lib/rnagios/plugin.rb', line 44

def host
  @host
end

#name ⇒ Object

Name of the plugin (e.g. check_service)



# File 'lib/rnagios/plugin.rb', line 35

def name
  @name
end

#port ⇒ Object

Port of host to check (defaults to 80)



# File 'lib/rnagios/plugin.rb', line 46

def port
  @port
end

#use_ssl ⇒ Object

Set to true to use SSL



# File 'lib/rnagios/plugin.rb', line 48

def use_ssl
  @use_ssl
end

#verify_ssl ⇒ Object

Whether we verify the host if using SSL to connect



# File 'lib/rnagios/plugin.rb', line 50

def verify_ssl
  @verify_ssl
end

#w ⇒ Object

Warning level



# File 'lib/rnagios/plugin.rb', line 38

def w
  @w
end

Instance Method Details

#check ⇒ Object

Returns a Status with the message to print to STDOUT and an exit code to return if running under UNIX/Linux. If the object returned by measure() is not a properly populated Status object, or if the configuration file is not valid YAML, this method raises a NagiosError.

This method runs the user-overridden measure() method. As the source listing for this method shows, check does not rescue exceptions raised inside measure(); they propagate to the caller. Handle all known error conditions within measure() so that the output of your script is meaningful to Nagios.

Scripts should call this method at the very end. The general flow of a script running on UNIX/Linux should be as follows:

class MyCheck < Plugin
  def measure
    stat = ActiveStatus.new
    # Service check code goes here; it should produce one of the
    # ActiveStatus constants (OK is used as a placeholder)
    stat.status = ActiveStatus::OK
    # Service state message construction goes here (placeholder text)
    stat.message = 'Service is responding'
    # UNIX/Linux exit codes are taken care of for you
    stat  # measure must return the status object
  end
end

plugin = MyCheck.new({ :name => 'check_service', :host => '<my_host>', :w => 15, :c => 5 })
status = plugin.check
$stdout.puts status.message  # status.message needs to be output to STDOUT
exit status.exit_code  # For UNIX/Linux

It is up to the developer to handle command-line arguments. (See the trollop gem for a good example of command-line parameter handling.) Nagios plugins usually accept “w” as a warning level and “c” as a critical level, but it is up to the plugin developer whether these values are used in the measure method.
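As one possible approach to argument handling, the sketch below uses OptionParser from the Ruby standard library (swapped in here for trollop so that the example has no gem dependencies). It builds a hash with the same keys the Plugin constructor expects. The option letters and the parse_args method name are assumptions for illustration.

```ruby
require 'optparse'

# Parses Nagios-style arguments into a hash shaped like the params hash
# that Plugin.new expects. parse_args is a hypothetical helper name.
def parse_args(argv)
  params = {}
  OptionParser.new do |opts|
    opts.banner = 'Usage: check_service -H HOST [options]'
    opts.on('-H', '--host HOST', 'Hostname or IP address to check') { |v| params[:host] = v }
    opts.on('-p', '--port PORT', Integer, 'Port to check') { |v| params[:port] = v }
    opts.on('-w', '--warning LEVEL', Float, 'Warning level') { |v| params[:w] = v }
    opts.on('-c', '--critical LEVEL', Float, 'Critical level') { |v| params[:c] = v }
  end.parse(argv) # parse (without !) leaves argv unmodified
  params
end

params = parse_args(%w[-H example.org -w 5 -c 10])
p params
```

In a real script the call would be parse_args(ARGV), and the resulting hash (plus a :name entry) could be passed to the constructor of your Plugin subclass.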



# File 'lib/rnagios/plugin.rb', line 142

def check
  start_time = Time.now
  end_time = -1
  status = nil

  # If there is a config file, load it
  if !blank? @config_file
    begin
      @config = YAML.load_file(@config_file)
    rescue Psych::SyntaxError => e
      raise NagiosError.new('Error occurred while trying to parse YAML config file; please verify that config file is valid YAML')
    end
  end

  # measure() is where the magic happens.  measure() should
  # return a Status object.  Any exceptions thrown
  # in measure() drop right out, so make sure you handle all
  # error conditions appropriately before using your plugin
  # with Nagios
  status = measure

  # Mark the end time for Nagios performance stats
  end_time = Time.now
  time_took = end_time - start_time

  # Since we can't effectively check for exceptions, we make
  # sure we get a good Status
  if !valid?(status)
    raise NagiosError.new('Returned status must be an instance or subclass of Status')
  elsif !valid_status?(status)
    raise NagiosError.new('Status must have be a valid status constant')
  elsif status.is_a?(NscaHostStatus) && !valid_passive_code?(status)
    raise NagiosError.new('Status passive code is invalid')
  elsif status.is_a? ActiveStatus
    if status.is_a? NscaServiceStatus
      if !valid_passive_code? status
        raise NagiosError.new('Status passive code is invalid')
      end
    elsif !valid_exit_code? status
      raise NagiosError.new('Status exit code is invalid')
    end
  elsif blank?(status.message)
    raise NagiosError.new('Status message must not be nil or empty')
  end

  # Status checks out as valid -- we now check the warning and critical
  # times and format the output based on the type of Status received
  if status.is_a? NscaHostStatus
    status.message = format_passive_host_check(status)
  elsif status.is_a? NscaServiceStatus
    if valid_value_w && valid_value_c
      if time_took >= @w && time_took < @c && status.status != ActiveStatus::WARNING
        status.status = ActiveStatus::WARNING
        status.message += '; check time >= ' + @w.to_s
      elsif time_took >= @c && status.status != ActiveStatus::CRITICAL
        status.status = ActiveStatus::CRITICAL
        status.message += '; check time >= ' + @c.to_s
      end
    end
    status.message = format_passive_service_check(status)
  else
    if valid_value_w && valid_value_c
      if time_took >= @w && time_took < @c && status.status != ActiveStatus::WARNING
        status.status = ActiveStatus::WARNING
        status.message += '; check time >= ' + @w.to_s
      elsif time_took >= @c && status.status != ActiveStatus::CRITICAL
        status.status = ActiveStatus::CRITICAL
        status.message += '; check time >= ' + @c.to_s
      end
    end
    status.message = format_active_service_check(status, start_time, end_time)
  end

  status
end
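The elapsed-time rule in the listing above can be isolated into a small function to make its behavior easier to see. This is a sketch that mirrors the two comparisons in check; the symbols stand in for the ActiveStatus constants, and apply_time_thresholds is a hypothetical name.

```ruby
# Stand-in status values; rnagios itself uses the ActiveStatus constants.
OK = :ok
WARNING = :warning
CRITICAL = :critical

# Mirrors the elapsed-time rule in check: a run time of at least w (but
# below c) yields WARNING, and a run time of at least c yields CRITICAL.
# Returns [status, message].
def apply_time_thresholds(status, message, time_took, w, c)
  if time_took >= w && time_took < c && status != WARNING
    [WARNING, "#{message}; check time >= #{w}"]
  elsif time_took >= c && status != CRITICAL
    [CRITICAL, "#{message}; check time >= #{c}"]
  else
    [status, message]
  end
end

p apply_time_thresholds(OK, 'All good', 1.0, 5, 10)   # unchanged
p apply_time_thresholds(OK, 'All good', 7.2, 5, 10)   # becomes WARNING
p apply_time_thresholds(OK, 'All good', 12.0, 5, 10)  # becomes CRITICAL
```

One consequence of the first condition, as written, is that a CRITICAL status returned by measure is replaced by WARNING when the run time falls between w and c, because the branch only checks that the status is not already WARNING.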

#measure ⇒ Object

This method should be overridden by classes that subclass Plugin. For active checks, measure() should return an ActiveStatus object. For passive checks, measure() should return an NscaHostStatus or NscaServiceStatus object.

The developer must determine how to best check the service in question. Below is an example of how measure() should be structured.

status = nil
message = ''
begin
  random_measure = 1 + rand(4) # Bad real-world event simulator
  if random_measure == 1
    status = ActiveStatus::OK
    message = 'Everything looks good'
  elsif random_measure == 2
    status = ActiveStatus::WARNING
    message = 'Keep an eye on this service'
  elsif random_measure == 3
    status = ActiveStatus::CRITICAL
    message = "There's a problem"
  else
    status = ActiveStatus::UNKNOWN
    message = "Can't figure out what's going on"
  end
rescue StandardError => e
  # You should decide what kind of status to return in case
  # of an error.  You should not let errors go unhandled in
  # a Nagios plugin, otherwise Nagios will handle it as a
  # failure and you may be spammed with unnecessary notifications
  # for something that should only merit a warning
  status = ActiveStatus::WARNING
  message = 'Possibly recoverable error occurred'
rescue Exception => e
  # It is wise to account for exceptions you don't anticipate
  # and create a status that is more meaningful for your service.
  # For your situation, unanticipated errors may be critical, or
  # they may be worth only a warning; you decide
  status = ActiveStatus::CRITICAL
  message = "Something we didn't anticipate occurred"
ensure
  # ensure always runs, so only supply a fallback when no status
  # was set; an unconditional assignment here would overwrite
  # every result with UNKNOWN
  if status.nil?
    status = ActiveStatus::UNKNOWN
    message = "Don't know what happened"
  end
end
ActiveStatus.new(status, message)

Raises:

NagiosError, if the base-class measure method is called instead of an overriding implementation



# File 'lib/rnagios/plugin.rb', line 266

def measure
  # Overload this method to populate the status object
  raise NagiosError.new('Call to measure should not invoke base class measure method; measure method should be overridden')
end