fluent-plugin-parser

Component

ParserOutput

This is a Fluentd plugin to parse strings in log messages and re-emit them.

DeparserOutput

Generates a string value from the fields of a log message, using a specified format and field list, and re-emits the message.

Configuration

ParserOutput

ParserOutput accepts the same 'format' and 'time_format' options as 'in_tail':

<match raw.apache.common.*>
  type parser
  remove_prefix raw
  format /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)$/
  time_format %d/%b/%Y:%H:%M:%S %z
  key_name message
</match>
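As a rough illustration of what the 'format' and 'time_format' options above describe, the following Ruby sketch applies the same regular expression and time format to a sample line. The helper name parse_line and the sample log line are assumptions made for this example; this is not the plugin's actual implementation.

```ruby
require 'time'

# Regular expression and time format copied from the configuration above.
APACHE_COMMON = /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)$/
TIME_FORMAT = '%d/%b/%Y:%H:%M:%S %z'

# Hypothetical helper: returns [time, record], or nil when the pattern does not match.
def parse_line(line)
  m = APACHE_COMMON.match(line)
  return nil unless m
  record = m.named_captures                  # named groups become string keys
  time = Time.strptime(record.delete('time'), TIME_FORMAT)
  [time, record]
end

# Sample line invented for this example.
time, record = parse_line('192.168.0.1 - - [28/Feb/2013:12:00:00 +0900] "GET / HTTP/1.1" 200 777')
p time
p record
```

In this sketch all captured values are strings, and the 'time' capture is removed from the record and used as the event time.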

You can also use the predefined formats 'apache' and 'syslog':

<match raw.apache.combined.*>
  type parser
  remove_prefix raw
  format apache
  key_name message
</match>

To keep the original attribute-value pairs in the re-emitted message, specify 'reserve_data':

<match raw.apache.*>
  type parser
  tag apache
  format apache
  key_name message
  reserve_data yes
</match>

The formats 'json', 'csv' and 'tsv' are also supported:

<match raw.sales.*>
  type parser
  tag sales
  format json
  key_name sales
</match>

The format 'ltsv' (Labeled Tab-Separated Values) is also supported:

<match raw.sales.*>
  type parser
  tag sales
  format ltsv
  key_name sales
</match>

LTSV is a format like the one below. Unlike JSON, it is easy to write with a simple formatter (e.g. Apache's LogFormat):

KEY1:VALUE1 [TAB] KEY2:VALUE2 [TAB] ...

About LTSV, see: http://ltsv.org/
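To make the KEY:VALUE layout concrete, here is a minimal Ruby sketch that splits one LTSV line into a hash. The helper name parse_ltsv and the sample line are assumptions for illustration only; the plugin's own parser may handle edge cases differently.

```ruby
# Hypothetical helper: split an LTSV line into a Hash of label => value.
def parse_ltsv(line)
  line.chomp.split("\t").each_with_object({}) do |field, record|
    # Split on the first colon only, so values may themselves contain colons.
    label, value = field.split(':', 2)
    record[label] = value
  end
end

# Sample line invented for this example.
record = parse_ltsv("host:127.0.0.1\ttime:[28/Feb/2013:12:00:00 +0900]\tstatus:200")
p record
```

Splitting on the first colon only is what lets a value such as a timestamp keep its own colons.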

To suppress the 'pattern not match' log, specify 'suppress_parse_error_log true' in the configuration. The default value is false.

<match in.hogelog>
  type parser
  tag hogelog
  format /^col1=(?<col1>.+) col2=(?<col2>.+)$/
  key_name message
  suppress_parse_error_log true
</match>

To store parsed values with a specified key name prefix, use the inject_key_prefix option:

<match raw.sales.*>
  type parser
  tag sales
  format json
  key_name sales
  reserve_data      yes
  inject_key_prefix sales.
</match>
# input string of 'sales': {"user":1,"num":2}
# output data: {"sales":"{\"user\":1,\"num\":2}","sales.user":1, "sales.num":2}

To store parsed values as a hash value in a field, use the hash_value_field option:

<match raw.sales.*>
  type parser
  tag sales
  format json
  key_name sales
  hash_value_field parsed
</match>
# input string of 'sales': {"user":1,"num":2}
# output data: {"parsed":{"user":1, "num":2}}

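Other options (e.g. reserve_data, inject_key_prefix) can be combined with hash_value_field. For example, adding 'reserve_data yes' and 'inject_key_prefix sales.' to the configuration above gives:

# output data: {"sales":"{\"user\":1,\"num\":2}", "parsed":{"sales.user":1, "sales.num":2}}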
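The way reserve_data, inject_key_prefix and hash_value_field interact can be sketched in a few lines of Ruby. The helper parse_record and its keyword arguments are hypothetical names chosen to mirror the configuration options; this models the documented input and output data, not the plugin's real code.

```ruby
require 'json'

# Hypothetical helper mirroring the options described above.
def parse_record(record, key_name:, reserve_data: false, inject_key_prefix: nil, hash_value_field: nil)
  parsed = JSON.parse(record[key_name])
  # inject_key_prefix: prepend the prefix to every parsed key.
  parsed = parsed.map { |k, v| [inject_key_prefix + k, v] }.to_h if inject_key_prefix
  # reserve_data: start from the original record instead of an empty one.
  base = reserve_data ? record.dup : {}
  if hash_value_field
    base[hash_value_field] = parsed   # nest the parsed values under one field
    base
  else
    base.merge(parsed)                # put the parsed values at the top level
  end
end

input = { 'sales' => '{"user":1,"num":2}' }

prefixed = parse_record(input, key_name: 'sales', reserve_data: true, inject_key_prefix: 'sales.')
nested   = parse_record(input, key_name: 'sales', hash_value_field: 'parsed')
combined = parse_record(input, key_name: 'sales', reserve_data: true,
                        inject_key_prefix: 'sales.', hash_value_field: 'parsed')
p prefixed, nested, combined
```

The three calls correspond to the three "output data" comments shown in this section.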

To skip time parsing (and keep a field such as 'time' in the record as-is), specify 'time_parse no':

<match raw.sales.*>
  type parser
  tag sales
  format json
  key_name sales
  hash_value_field parsed
  time_parse no
</match>
# input string of 'sales': {"user":1,"num":2,"time":"2013-10-31 12:48:33"}
# output data: {"parsed":{"user":1, "num":2,"time":"2013-10-31 12:48:33"}}

DeparserOutput

To build a CSV string from the fields 'store', 'item' and 'num' and store it in the field 'csv', without the raw data:

<match in.marketlog.**>
  type deparser
  remove_prefix in
  format %s,%s,%s
  format_key_names store,item,num
  key_name csv
</match>

To build the same CSV as an additional field 'csv', keeping the original fields:

<match in.marketlog>
  type deparser
  tag marketlog
  format %s,%s,%s
  format_key_names store,item,num
  key_name csv
  reserve_data yes
</match>
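The two DeparserOutput configurations above can be approximated with Ruby's string formatting. The helper deparse, its keyword arguments and the sample record are assumptions for this sketch; it only illustrates how 'format', 'format_key_names', 'key_name' and 'reserve_data' relate to each other.

```ruby
# Hypothetical helper mirroring the DeparserOutput options described above.
def deparse(record, format:, format_key_names:, key_name:, reserve_data: false)
  # Pick the listed fields in order and feed them to the format string.
  values = format_key_names.map { |k| record[k] }
  built = format % values
  # reserve_data: keep the original fields next to the generated one.
  base = reserve_data ? record.dup : {}
  base.merge(key_name => built)
end

# Sample record invented for this example.
record = { 'store' => 'tokyo', 'item' => 'apple', 'num' => 3, 'note' => 'x' }

csv_only = deparse(record, format: '%s,%s,%s',
                   format_key_names: %w[store item num], key_name: 'csv')
with_raw = deparse(record, format: '%s,%s,%s',
                   format_key_names: %w[store item num], key_name: 'csv', reserve_data: true)
p csv_only, with_raw
```

Without reserve_data only the generated 'csv' field is emitted; with it, the original fields are kept alongside.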

TODO

  • consider what to do next
  • patches welcome!

Copyright

  • Copyright
    • Copyright (c) 2012- TAGOMORI Satoshi (tagomoris)
  • License
    • Apache License, Version 2.0