Class: NCSAParser::Parser
- Inherits: Object
- Hierarchy: NCSAParser::Parser < Object
- Defined in: lib/ncsa-parser/parser.rb
Overview
A line parser for a log file. Lines are parsed via Regexps. You can inject new tokens or override existing ones by passing along a :tokens option and adding the corresponding keys to the :pattern option.
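The token override described above can be sketched without loading the library. This is a minimal illustration, not the gem's code: default_tokens is a small subset copied from TOKENS, while custom_tokens, the :duration token, and the sample line are hypothetical.

```ruby
# A small subset of the documented TOKENS, copied here for illustration.
default_tokens = {
  :host   => '(?:\d+\.\d+\.\d+\.\d+|unknown|-|::1)',
  :status => '\d+',
  :bytes  => '\d+|-'
}

# Hypothetical custom tokens: :bytes overrides a default, :duration is new.
custom_tokens = {
  :bytes    => '\d+',
  :duration => '\d+ms'
}

# Custom tokens win over defaults; unknown names raise ArgumentError.
lookup = lambda do |name|
  custom_tokens[name] || default_tokens[name] ||
    raise(ArgumentError, "Token :#{name} not found!")
end

# The new key must also appear in the pattern to be matched.
pattern = %w{ host status bytes duration }
source  = '^' + pattern.map { |name| "(#{lookup.call(name.to_sym)})" }.join(' ') + '$'
matcher = Regexp.new(source)

md = matcher.match('10.0.0.1 200 512 35ms')
puts md[4] # the text captured by the :duration group
```

Because :bytes was overridden with a stricter expression, a line with "-" in the bytes position no longer matches in this sketch.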
Constant Summary
- IP_ADDRESS =
  '\d+\.\d+\.\d+\.\d+|unknown'
- TOKENS =
  {
    :host => "(?:#{IP_ADDRESS}|-|::1)",
    :host_proxy => "(?:#{IP_ADDRESS})(?:,\\s+#{IP_ADDRESS})*|-",
    :ident => '[^\s]+',
    :username => '[^\s]+',
    :datetime => '\[[^\]]+\]',
    :request => '".+"',
    :status => '\d+',
    :bytes => '\d+|-',
    :referer => '".*"',
    :ua => '".*"',
    :usertrack => "(?:#{IP_ADDRESS})[^ ]+|-",
    :outstream => '\d+|-',
    :instream => '\d+|-',
    :ratio => '\d+%|-%'
  }
- LOG_FORMAT_COMMON =
  %w{ host ident username datetime request status bytes }
- LOG_FORMAT_COMBINED =
  %w{ host ident username datetime request status bytes referer ua }
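To show how these constants fit together, the sketch below rebuilds a matcher for the "common" format from copies of the token strings. The local variable names and the sample log line are made up for illustration. The construction (one capture group per token, joined by single spaces, anchored at both ends) follows the constructor source shown later on this page.

```ruby
# Local copies of the documented constants, so the sketch runs without the gem.
ip_address = '\d+\.\d+\.\d+\.\d+|unknown'
tokens = {
  :host     => "(?:#{ip_address}|-|::1)",
  :ident    => '[^\s]+',
  :username => '[^\s]+',
  :datetime => '\[[^\]]+\]',
  :request  => '".+"',
  :status   => '\d+',
  :bytes    => '\d+|-'
}
common_format = %w{ host ident username datetime request status bytes }

# Wrap each token in a capture group and join the groups with single spaces.
source  = '^' + common_format.map { |name| "(#{tokens[name.to_sym]})" }.join(' ') + '$'
matcher = Regexp.new(source)

# A made-up line in the common log format.
line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
md = matcher.match(line)

common_format.each_with_index do |name, i|
  puts "#{name}: #{md[i + 1]}"
end
```

Note that the captured :request and :datetime values keep their surrounding quotes and brackets, since those characters are part of the token expressions.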
Instance Attribute Summary
- #matcher ⇒ Object (readonly)
  The compiled Regexp used to match log lines.
- #pattern ⇒ Object (readonly)
  The Array of token names that make up the log line format.
- #re ⇒ Object (readonly)
  The source String from which the matcher Regexp is compiled.
Instance Method Summary
- #initialize(options = {}) ⇒ Parser (constructor)
  Creates a new Parser object.
- #parse_line(line) ⇒ Object (also: #parse)
  Parses a single line and returns an NCSAParser::ParsedLine object.
Constructor Details
#initialize(options = {}) ⇒ Parser
Creates a new Parser object.
Options
- :domain
  When parsing query strings, use this domain as the URL’s domain. The default is “www.example.com”.
- :datetime_format
  Sets the datetime format used when tokens are converted in NCSAParser::ParsedLine. The default is “[%d/%b/%Y:%H:%M:%S %Z]”.
- :pattern
  The log line format to use. The default is LOG_FORMAT_COMBINED, which matches the “combined” log format in Apache. The value for :pattern can be either a whitespace-delimited String of token names or an Array of token names.
- :browscap
  A browser capabilities object to use when sniffing out user agents. This object should respond to the query method. Several browscap extensions are available for Ruby; the author of this extension maintains one called Browscapper, available at github.com/dark-panda/browscapper.
- :token_conversions
  Converters to pass along to the line parser. See NCSAParser::ParsedLine for details.
- :tokens
  Tokens to add to the generated Regexp. Entries given here take precedence over the defaults in TOKENS.
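As an illustration of the default :datetime_format, the sketch below parses a bracketed timestamp with Ruby's standard DateTime.strptime. How NCSAParser::ParsedLine performs its conversion is not shown on this page, so the use of strptime here is an assumption, and the sample timestamp is made up.

```ruby
require 'date'

# The documented default for :datetime_format.
datetime_format = '[%d/%b/%Y:%H:%M:%S %Z]'

# A made-up value as the :datetime token would capture it, brackets included.
raw = '[10/Oct/2000:13:55:36 -0700]'

# Assumption: a strptime-style parse. The square brackets in the format
# are literal characters and must be present in the input.
parsed = DateTime.strptime(raw, datetime_format)

puts parsed.year
puts parsed.month
puts parsed.hour
```

The brackets appear in the format string because the :datetime token captures them along with the timestamp.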
# File 'lib/ncsa-parser/parser.rb', line 67

def initialize(options = {})
  # Merge caller-supplied options over the defaults.
  options = {
    :domain => 'www.example.com',
    :datetime_format => '[%d/%b/%Y:%H:%M:%S %Z]',
    :pattern => LOG_FORMAT_COMBINED
  }.merge(options)

  @options = options

  # Accept the pattern as an Array or as a whitespace-delimited String.
  @pattern = if options[:pattern].is_a?(Array)
    options[:pattern]
  else
    options[:pattern].to_s.split(/\s+/)
  end

  # Build one capture group per token, preferring caller-supplied tokens.
  @re = '^' + @pattern.collect { |tk|
    tk = tk.to_sym
    token = if options[:tokens] && options[:tokens][tk]
      options[:tokens][tk]
    elsif TOKENS[tk]
      TOKENS[tk]
    else
      raise ArgumentError.new("Token :#{tk} not found!")
    end

    "(#{token})"
  }.join(' ') + '$'

  @matcher = Regexp.new(@re)
end
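The constructor accepts :pattern in two forms. The following standalone sketch isolates that normalization step. The normalize lambda is a hypothetical helper written for this example, not part of the library.

```ruby
# Hypothetical helper mirroring the constructor's handling of :pattern:
# Arrays pass through unchanged, anything else is split on whitespace.
normalize = lambda do |pattern|
  pattern.is_a?(Array) ? pattern : pattern.to_s.split(/\s+/)
end

from_string = normalize.call('host ident username')
from_array  = normalize.call(%w{ host ident username })

puts from_string.inspect
puts from_string == from_array

# A leading space yields an empty first element, because split with a
# Regexp keeps the empty field before the first separator.
puts normalize.call(' host ident').inspect
```

Since an empty token name is not a key in TOKENS, a String pattern with leading whitespace would likely end in the "Token not found" ArgumentError, so trimming the String first is the safer choice.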
Instance Attribute Details
#matcher ⇒ Object (readonly)
Returns the compiled Regexp used to match log lines.
# File 'lib/ncsa-parser/parser.rb', line 45

def matcher
  @matcher
end
#pattern ⇒ Object (readonly)
Returns the Array of token names that make up the log line format.
# File 'lib/ncsa-parser/parser.rb', line 45

def pattern
  @pattern
end
#re ⇒ Object (readonly)
Returns the source String from which the matcher Regexp is compiled.
# File 'lib/ncsa-parser/parser.rb', line 45

def re
  @re
end
Instance Method Details
#parse_line(line) ⇒ Object Also known as: parse
Parses a single line and returns an NCSAParser::ParsedLine object. Raises BadLogLine if the line does not match the configured pattern.
# File 'lib/ncsa-parser/parser.rb', line 97

def parse_line(line)
  match = Hash.new

  if md = @matcher.match(line)
    # Capture groups are 1-indexed and follow the order of the pattern.
    @pattern.each_with_index do |k, j|
      match[k.to_sym] = md[j + 1]
    end
    match[:original] = line.strip
  else
    raise BadLogLine.new(line, @options[:pattern])
  end

  ParsedLine.new(match, @options)
end
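The mapping from capture groups to named fields can be tried in isolation. The sketch below is a simplified stand-in, not the library's method: it uses a hypothetical three-token pattern and hand-written Regexp, raises ArgumentError where the library raises BadLogLine, and returns a plain Hash where the library wraps the result in ParsedLine.

```ruby
# Hypothetical three-token format and a hand-written matcher for it.
pattern = %w{ host status bytes }
matcher = /^(\S+) (\d+) (\d+|-)$/

parse = lambda do |line|
  md = matcher.match(line)
  # Stand-in for BadLogLine, which is not defined in this sketch.
  raise ArgumentError, "Bad log line: #{line.inspect}" unless md

  fields = {}
  # Group i + 1 holds the text matched by the i-th token name.
  pattern.each_with_index { |name, i| fields[name.to_sym] = md[i + 1] }
  fields[:original] = line.strip
  fields
end

result = parse.call("10.0.0.1 404 -\n")
puts result[:host]
puts result[:status]
puts result[:original].inspect
```

A trailing newline does not prevent a match, because `$` in a Ruby Regexp matches just before a line break. The :original field stores the stripped line.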