Sensu Kubernetes Prometheus Plugin

Sensu plugin that queries Prometheus for the data exported by node-exporter.


check_prometheus.rb /path/to/config.yml

# Debug mode: output all JSON and blacklisted checks instead of sending them to Sensu
PROM_DEBUG=true check_prometheus.rb /path/to/config.yml

Development and testing

Dependencies: docker, docker-compose

To spin up a development stack and run the integration tests:

ruby test.rb

Afterwards, you can run rspec directly to rerun the tests.

To run the dockerized version (that gitlab-ci uses)


Environment variables

Name                  Example        Default         Description
PROM_DEBUG            true           false           Debug output instead of sending checks to Sensu
PROMETHEUS_ENDPOINT   hostname:9090  localhost:9090  Connection string in the format address:port
SENSU_SOCKET_ADDRESS  hostname       localhost       Address used to connect to the Sensu socket
SENSU_SOCKET_PORT     1234           3030            Port used to connect to the Sensu socket
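
As a rough illustration of how these variables and their defaults might be consumed, here is a minimal Ruby sketch. The helper names load_settings and query_uri are hypothetical and not part of the plugin, and the http scheme is an assumption; /api/v1/query is the instant-query path of the Prometheus HTTP API.

```ruby
require 'uri'

# Defaults copied from the environment variable table.
DEFAULTS = {
  'PROM_DEBUG'           => 'false',
  'PROMETHEUS_ENDPOINT'  => 'localhost:9090',
  'SENSU_SOCKET_ADDRESS' => 'localhost',
  'SENSU_SOCKET_PORT'    => '3030'
}.freeze

# Hypothetical helper: read the settings from an env-like hash,
# falling back to the documented defaults.
def load_settings(env = ENV)
  values = DEFAULTS.merge(env.to_h.select { |key, _| DEFAULTS.key?(key) })
  {
    debug: values['PROM_DEBUG'] == 'true',
    prometheus_endpoint: values['PROMETHEUS_ENDPOINT'],
    sensu_address: values['SENSU_SOCKET_ADDRESS'],
    sensu_port: Integer(values['SENSU_SOCKET_PORT'])
  }
end

# Hypothetical helper: build an instant-query URL for the Prometheus HTTP API.
# The http scheme is an assumption.
def query_uri(endpoint, promql)
  URI("http://#{endpoint}/api/v1/query?#{URI.encode_www_form(query: promql)}")
end

settings = load_settings
puts query_uri(settings[:prometheus_endpoint], 'up')
```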


Check configuration is defined in the config.yml file under the key checks, and checks based on custom Prometheus queries are under custom. Example:

config:
  reported_by: sbppapik8s
  occurrences: 3
  whitelist: sbppapik8s.*
  use_default_source: false
checks:
  - check: service
    cfg:
      name: kube-controller-manager.service
  - check: load_per_cluster
    cfg:
      cluster: prometheus
      warn: 1.0
      crit: 2.0
      source: sbppapik8s
custom:
  - name: heartbeat
    query: up
    check:
      type: equals
      value: 1
    msg:
      0: 'OK: Endpoint is alive and kicking'
      2: 'CRIT: Endpoints not reachable!'
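
To make the file layout more concrete, the following Ruby sketch parses a small configuration and lists the entries found under the checks and custom keys described above. It is only an illustration: load_config and the inline sample are hypothetical, and the plugin's real loader may differ.

```ruby
require 'yaml'

# Hypothetical helper: parse the configuration text and return the
# entries under the documented `checks` and `custom` keys.
def load_config(yaml_text)
  data = YAML.safe_load(yaml_text) || {}
  {
    checks: data.fetch('checks', []),
    custom: data.fetch('custom', [])
  }
end

# Small inline sample so the sketch runs without a config.yml on disk.
SAMPLE = <<~YAML
  checks:
    - check: load_per_cluster
  custom:
    - name: heartbeat
      query: up
YAML

config = load_config(SAMPLE)
config[:checks].each { |c| puts "check: #{c['check']}" }
config[:custom].each { |c| puts "custom: #{c['name']} (#{c['query']})" }
```

In practice the text would come from the file passed on the command line, for example File.read('/path/to/config.yml').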


Name                      Description
service                   Checks if a systemd service is active
memory                    Checks memory usage as a percentage
load_per_cpu              Checks CPU load divided by the number of CPUs
load_per_cluster          Checks CPU load of the entire cluster divided by total CPUs
load_per_cluster_minus_n  Checks CPU load of the entire cluster divided by total CPUs minus n failures
inode                     Checks inode usage as a percentage per mountpoint
disk                      Checks filesystem usage as a percentage per mountpoint
disk_all                  Checks filesystem and inode usage of all mountpoints
predict_disk_all          Predicts if any of the disks in Prometheus will be full in x days
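
A prediction such as predict_disk_all could plausibly be expressed with PromQL's predict_linear function. The Ruby sketch below builds such an expression from the range_vector, filter and days options; predict_disk_query is a hypothetical helper, the metric name node_filesystem_avail_bytes is an assumption (it varies between node-exporter versions), and the plugin's actual query may be different.

```ruby
# Hypothetical helper: build a PromQL expression that returns a series for
# every filesystem predicted to have no space left within `days` days.
# The default metric name is an assumption, not taken from the plugin.
def predict_disk_query(range_vector:, days:, filter: nil, metric: 'node_filesystem_avail_bytes')
  selector = filter.to_s.empty? ? metric : "#{metric}{#{filter}}"
  seconds = Integer(days) * 86_400 # predict_linear expects seconds
  "predict_linear(#{selector}[#{range_vector}], #{seconds}) < 0"
end

puts predict_disk_query(range_vector: '24h', days: 14, filter: 'mountpoint="/"')
```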


Name         Example                      Description
name         heartbeat                    Name of the custom check
query        up                           Prometheus query
check.type   (equals|below|above)         Type of evaluation applied against check.value. Available: equals, below and above
check.value  1                            Value compared against the query results, using the check.type evaluation
cfg.warn     33.00                        Warning threshold level
cfg.crit     37.00                        Critical threshold level
msg.0        OK: heartbeat is up          Message used when the value evaluation is successful
msg.2        CRITICAL: heartbeat is down  Message used when the value evaluation is not successful
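
The way check.type, check.value and the msg entries fit together can be sketched as follows. This is an illustration under assumptions, not the plugin's code: evaluate and custom_result are hypothetical helpers, "below" and "above" are assumed to mean the query result is strictly below or above check.value, and the cfg.warn and cfg.crit thresholds are left out.

```ruby
# Hypothetical helper: compare a query result with check.value
# using the evaluation named by check.type.
def evaluate(type, expected, actual)
  case type
  when 'equals' then actual == expected
  when 'below'  then actual < expected
  when 'above'  then actual > expected
  else raise ArgumentError, "unknown check type: #{type}"
  end
end

# Return [status, message]: status 0 with msg.0 when the evaluation
# succeeds, status 2 with msg.2 otherwise.
def custom_result(check, msg, actual)
  status = evaluate(check['type'], check['value'], actual) ? 0 : 2
  [status, msg[status]]
end

check = { 'type' => 'equals', 'value' => 1 }
msg = { 0 => 'OK: heartbeat is up', 2 => 'CRITICAL: heartbeat is down' }
status, text = custom_result(check, msg, 1)
puts "#{status} #{text}"
```

Note that a YAML parser such as Ruby's loads the unquoted keys 0 and 2 as integers, which is why the msg hash is indexed by integer status here.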

Global Configuration Options

Name                Example       Description
reported_by         sbppapik8s    Hostname that shows up in the Sensu reported_by field
occurrences         3             Number of failures before Sensu sends an alert
whitelist           sbppapik8s.*  Regex used as a safety whitelist to make sure the source names are correct
ttl                 300           Overrides the Sensu TTL, in seconds
ttl_status          1             Overrides the status code for an expiring Sensu TTL
use_default_source  false         When true, the source of the events is the Sensu client's
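
One way to picture how these options could end up in a check result is the Ruby sketch below, which builds a JSON-ready hash and sends it to the Sensu client socket over TCP. build_event and send_event are hypothetical helpers, the exact field layout the plugin emits is an assumption, and applying the whitelist as an unanchored regex match on the source is an assumption as well.

```ruby
require 'json'
require 'socket'

# Hypothetical helper: build a Sensu check result from the global options.
def build_event(global, name:, status:, output:, source:)
  # Safety net: refuse source names that do not match the whitelist regex.
  whitelist = Regexp.new(global.fetch('whitelist', '.*'))
  raise ArgumentError, "source #{source} is not whitelisted" unless whitelist.match?(source)

  event = {
    'name' => name,
    'status' => status,
    'output' => output,
    'reported_by' => global['reported_by'],
    'occurrences' => global['occurrences']
  }
  # Leave the source out so that Sensu falls back to the client's own name.
  event['source'] = source unless global['use_default_source']
  event['ttl'] = global['ttl'] if global['ttl']
  event['ttl_status'] = global['ttl_status'] if global['ttl_status']
  event
end

# Send one event as a JSON line to the Sensu client socket.
def send_event(event, address: 'localhost', port: 3030)
  TCPSocket.open(address, port) { |socket| socket.puts(JSON.generate(event)) }
end

global = { 'reported_by' => 'sbppapik8s', 'occurrences' => 3, 'whitelist' => 'sbppapik8s.*' }
event = build_event(global, name: 'heartbeat', status: 0, output: 'OK', source: 'sbppapik8s')
puts JSON.generate(event) # call send_event(event) to deliver it to a running client
```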

Check Configuration Options

service
  Config:
    name: servicename
    state: active|deactivating|failed|inactive (default: active)
    state_required: 0|1 (default: 1)
  Example:
    name: test-service.service

memory
  Config:
    warn: warning percentage
    crit: critical percentage
  Example:
    warn: 90
    crit: 95

load_per_cpu
  Config:
    warn: warning percentage
    crit: critical percentage
  Example:
    warn: 90
    crit: 95

load_per_cluster
  Config:
    cluster: cluster name
    warn: warning percentage
    crit: critical percentage
    source: name that shows in sensu
  Example:
    cluster: nodes
    warn: 90
    crit: 95
    source: sbppapik8s

load_per_cluster_minus_n
  Config:
    cluster: cluster name
    minus_n: amount of member failures
    warn: warning percentage
    crit: critical percentage
    source: name that shows in sensu
  Example:
    cluster: nodes
    minus_n: 1
    warn: 90
    crit: 95
    source: sbppapik8s

inode
  Config:
    mount: mountpoint
    name: human readable name
    warn: warning percentage
    crit: critical percentage
  Example:
    mount: /var/lib/docker
    name: docker
    warn: 90
    crit: 95

disk
  Config:
    mount: mountpoint
    name: human readable name
    warn: warning percentage
    crit: critical percentage
  Example:
    mount: /var/lib/docker
    name: docker
    warn: 90
    crit: 95

disk_all
  Config:
    ignore_fs: regex of filesystems
    warn: warning percentage
    crit: critical percentage
  Example:
    ignore_fs: tmpfs
    warn: 90
    crit: 95

predict_disk_all
  Config:
    range_vector: Prometheus range vector used for sample size of prediction
    filter: prometheus filter to include/exclude disks
    days: prediction days
    source: sensu name
  Example:
    range_vector: 24h
    filter: mountpoint="/"
    days: 14
    source: sbppapik8s
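
Most of the checks above share the warn and crit options. As a minimal sketch of how a measured percentage might be mapped to a Sensu status code, consider the following; threshold_status is a hypothetical helper, and treating a value equal to a threshold as exceeding it is an assumption.

```ruby
# Hypothetical helper: map a measured value to a Sensu status code
# (0 = OK, 1 = WARNING, 2 = CRITICAL) using the warn and crit options.
# Values equal to a threshold are assumed to trigger it.
def threshold_status(value, warn:, crit:)
  return 2 if value >= crit
  return 1 if value >= warn

  0
end

[50, 92, 97].each do |usage|
  puts "#{usage}% -> status #{threshold_status(usage, warn: 90, crit: 95)}"
end
```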