fluent-plugin-norikra
Fluentd plugins to send/receive events to/from Norikra server.
Norikra is open source server software that provides "Stream Processing" with SQL. It is written in JRuby, runs on the JVM, and is licensed under GPLv2. For more details, see: http://norikra.github.io/ .
fluent-plugin-norikra has 3 plugins: in_norikra, out_norikra and out_norikra_filter.
- in_norikra
  - fetch events of query results from Norikra server
- out_norikra
  - send events to Norikra server
- out_norikra_filter
  - launch Norikra server as child process dynamically, as needed
  - use Norikra server as event filter (like out_exec_filter)
  - register and execute queries for newly incoming targets
Setup
fluent-plugin-norikra works with a Norikra server running on the same host as Fluentd, or anywhere reachable from Fluentd over the network.
For Norikra server setup, see: http://norikra.github.io/ .
NOTES:
- Fluentd and fluent-plugin-norikra require CRuby (MatzRuby).
- Norikra requires JRuby.
To use out_norikra_filter with dynamic Norikra server launching, check the actual path of the norikra command under the installed JRuby tree (ex: $HOME/.rbenv/versions/jruby-1.7.8/bin/norikra).
To use this plugin:
- run gem install fluent-plugin-norikra or fluent-gem install fluent-plugin-norikra to install the plugin
- edit configuration files
- execute fluentd
Configuration
For variations, see the example directory.
NorikraOutput
Sends events to a remote Norikra server. A minimal configuration:
<match data.*>
  type norikra
  norikra norikra.server.local:26571
  remove_tag_prefix data
  target_map_tag true # fluentd's tag 'data.event' -> norikra's target 'event'
</match>
NorikraOutput opens a Norikra target for each newly incoming tag. You can specify which fields to include or exclude, and the type of each field, per target (and for all targets in <default>). Definitions in <target TARGET_NAME> overwrite <default> specifications.
<match data.*>
  type norikra
  norikra norikra.server.local:26571
  target_map_tag true # fluentd's tag -> norikra's target
  remove_tag_prefix data

  # other options:
  # target_map_key KEY_NAME        # use the specified key's value in the fluentd event as target
  # target_string STRING           # use the specified fixed target name
  # drop_error_record true         # drop record chunks that include records causing ClientError on norikra server
  #                                # default: true
  #                                # (ex: specified (non-optional) fields missing or invalid value for specified type)
  # drop_server_error_record true  # drop record chunks when any ServerError occurs
  #                                # default: false (to retry)

  <default>
    include *              # send all field values to norikra
    exclude time           # exclude 'time' field from event values sent
                           # AND/OR 'include_regexp' and 'exclude_regexp' available
    field_integer seq      # field 'seq' defined as integer for all targets
    escape_fieldname yes   # escape special chars in field names (non-alphanumeric) with underscore ('_')
                           # This is friendly for query access (ex: field.key1.cpu_total)
                           # Default: no
  </default>

  <target users>
    field_string name,address
    field_integer age
    field_float height,weight
    field_boolean superuser
  </target>
</match>
With the default settings, all fields are defined as 'string', so you must use field_xxxx parameters for numerical processing in queries (for more details, see the Norikra and Esper documentation).
If Fluentd's events have many variations of field sets, you can tell the plugin not to include fields automatically, with the auto_field option:
<match data.*>
  type norikra
  norikra norikra.server.local:26571
  target_map_tag true # fluentd's tag 'data.event' -> norikra's target 'event'
  remove_tag_prefix data
  <default>
    auto_field false # norikra includes fields only used in queries.
  </default>
</match>
Fields referred to in queries are registered on the Norikra server automatically, even with auto_field false.
Use time_key FIELDNAME to include the time of Fluentd's event in a data field sent to Norikra (in milliseconds, following Norikra/Esper's rule). This is useful for queries with .win:ext_timed_batch(FIELD, PERIOD) views.
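As a rough sketch of how this fits together, the fragment below adds time_key to the minimal NorikraOutput configuration. The field name event_time is hypothetical, and placing time_key inside <default> is an assumption (the text above does not say where the parameter belongs), so check the example directory before relying on it.

```
<match data.*>
  type norikra
  norikra norikra.server.local:26571
  target_map_tag true
  remove_tag_prefix data
  <default>
    # assumption: time_key is accepted in this section;
    # 'event_time' is a made-up field name for illustration
    time_key event_time
  </default>
</match>

# A query on the Norikra side could then batch by the event's own time
# (target 'event' comes from the tag 'data.event'), for example:
#   SELECT count(*) AS cnt FROM event.win:ext_timed_batch(event_time, 1 minute)
```

With a view like this, batches are cut according to the event_time values carried in the events rather than the Norikra server's clock.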
NorikraInput
Fetches events from a Norikra server and emits them into Fluentd. NorikraInput uses Norikra's event API (for queries) and sweep API (for query groups).
A minimal configuration:
<source>
  type norikra
  norikra norikra.server.local:26571
  <fetch>
    method sweep
    # target QUERY_GROUP_NAME # not specified => default query group
    tag query_name
    tag_prefix norikra.query
    # other options:
    #   tag field FIELDNAME : tag by value with specified field name in output event
    #   tag string STRING   : fixed string specified
    interval 3s # interval to call api
  </fetch>
</source>
Available <fetch> methods are event and sweep. The target parameter is handled as a query name for event, and as a query group name for sweep.
<source>
  type norikra
  norikra norikra.server.local:26571
  <fetch>
    method event
    target data_count_1hour
    tag string data.count.1hour
    interval 60m
  </fetch>
  <fetch>
    method event
    target data_count_5min
    tag string data.count.5min
    interval 5m
  </fetch>
  <fetch>
    method sweep
    target count_queries
    tag field target_name
    tag_prefix data.count.all
    interval 15s
  </fetch>
</source>
NorikraFilterOutput
NorikraFilterOutput has all the features of both NorikraInput and NorikraOutput, plus the following:
- executes Norikra server
- runs queries for newly incoming targets
If you run Norikra as a standalone process, it is better to use NorikraInput and NorikraOutput separately. NorikraFilterOutput is intended for simple aggregation and filtering.
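A minimal sketch of that split setup, built only from the parameters shown in the NorikraOutput and NorikraInput sections above. It assumes the queries are already registered on the standalone Norikra server by some other means; that step is outside this fragment.

```
# send events to the standalone Norikra server
<match data.*>
  type norikra
  norikra norikra.server.local:26571
  target_map_tag true
  remove_tag_prefix data
</match>

# fetch results of queries in the default query group from the same server
<source>
  type norikra
  norikra norikra.server.local:26571
  <fetch>
    method sweep
    tag query_name
    tag_prefix norikra.query
    interval 3s
  </fetch>
</source>
```

The fetched events are tagged under norikra.query, which does not match data.*, so query results are not sent back to Norikra by the <match> section.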
Configuration example that receives tags like event.foo, sends records to Norikra's target foo, and gets the count of its records per minute and per hour, with a built-in Norikra server:
<match event.*>
  type norikra_filter
  <server>
    path /home/username/.rbenv/versions/jruby-1.7.4/bin/norikra
    # opts -Xmx2g # options of 'norikra start'
  </server>
  remove_tag_prefix event
  target_map_tag yes
  <default>
    <query>
      name count_min_${target}
      group count_query_group # or default when omitted
      expression SELECT count(*) AS cnt FROM ${target}.win:time_batch(1 minute)
      tag count.min.${target}
      fetch_interval 10s
    </query>
    <query>
      name count_hour_${target}
      group count_query_group
      expression SELECT count(*) AS cnt FROM ${target}.win:time_batch(1 hour)
      tag count.hour.${target}
    </query>
  </default>
</match>
Results of queries that NorikraFilterOutput registers with a tag parameter are fetched automatically by this plugin and re-emitted into Fluentd.
All other options are available in the same way as in NorikraInput and NorikraOutput: <default>, <target> and <fetch> sections; auto_field, include|exclude and field_xxxx specifiers for targets; and parameters for <fetch> sections.
Input event data filtering
If you want to send known fields only, specify exclude * together with include or include_regexp, like this:
<default>
  exclude *
  include path,status,method,bytes,rhost,referer,agent,duration
  include_regexp ^(query_|header_).*
  # ...
</default>
Or you can include everything by default and exclude some known fields:
<default>
  include *
  exclude user_secret
  include_regexp ^(header_).*
  # ...
</default>
NOTE: Configurations in a <target> section overwrite those in the <default> section.
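To illustrate the note above, here is a sketch that combines a permissive <default> with a stricter <target>. The target name users and the field_xxxx lines come from the NorikraOutput example earlier. How the two sections are merged field by field is not spelled out here, so treat the stated outcome as an assumption to verify.

```
<match data.*>
  type norikra
  norikra norikra.server.local:26571
  target_map_tag true
  remove_tag_prefix data
  <default>
    # every target: send everything except 'user_secret'
    include *
    exclude user_secret
  </default>
  <target users>
    # target 'users' only: send known fields only
    # (assumed to take precedence over <default> for this target)
    exclude *
    include name,age
    field_string name
    field_integer age
  </target>
</match>
```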
Target mapping
Norikra's target (like table name) can be generated from:
- tag
  - one target per tag
  - target_map_tag yes
- value of specified field
  - targets from values in the specified field of each record, dynamically
  - target_map_key foo
- fixed string (in configuration file)
  - all records are sent to a single target
  - target_string from_fluentd
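As an example of the second option, the sketch below picks the target from a field of each record instead of from the tag. The field name service is hypothetical; the other parameters are the ones used in the NorikraOutput section.

```
<match data.*>
  type norikra
  norikra norikra.server.local:26571
  # 'service' is a made-up field name for illustration:
  # a record like {"service":"auth", ...} would go to target 'auth',
  # and {"service":"billing", ...} to target 'billing'
  target_map_key service
</match>
```

This is convenient when many kinds of events arrive under one tag. The trade-off is that target names then depend on the data itself, so unexpected values in that field would presumably open unexpected targets.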
TODO
- write about these topics
- error logs for new target, success logs of retry
Copyright
- Copyright (c) 2013- TAGOMORI Satoshi (tagomoris)
- License
- Apache License, version 2.0