Class: LogParser

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/logbox/log_parser.rb

Overview

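Parses a web server access log stream and produces one hash per line. Each hash holds the server attributes (:ip, :timestamp, :request, :status) plus one entry for every query parameter found in the request. Includes Enumerable, so the parsed lines can be iterated with each, map, select and similar methods.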

Defined Under Namespace

Classes: ParseError

Constant Summary

LOG_FORMAT =
/([^ ]*) [^ ]* [^ ]* \[([^\]]*)\] "([^"]*)" ([^ ]*)/
LOG_DATE_FORMAT =
"%d/%b/%Y:%H:%M:%S %z"
LOG_KEY_VALUE_FORMAT =
/[?&]([^=]+)=([^&]+)/
SERVER_ATTRIBUTES =
[:ip, :timestamp, :request, :status]
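
To make the patterns concrete, here is a minimal sketch that applies them to a single line. The helper name dissect and the sample line are made up for illustration and are not part of LogParser. The constants are copied from the listing above so the snippet runs on its own.

```ruby
require 'date'

# Patterns copied from the constants above so the snippet is self-contained.
LOG_FORMAT           = /([^ ]*) [^ ]* [^ ]* \[([^\]]*)\] "([^"]*)" ([^ ]*)/
LOG_DATE_FORMAT      = "%d/%b/%Y:%H:%M:%S %z"
LOG_KEY_VALUE_FORMAT = /[?&]([^=]+)=([^&]+)/

# Hypothetical helper (not part of LogParser): apply the three patterns
# to one line and return the raw pieces, or nil if the line does not match.
def dissect(line)
  match = LOG_FORMAT.match(line)
  return nil unless match

  ip, raw_time, request, status = match.captures
  {
    ip: ip,
    timestamp: DateTime.strptime(raw_time, LOG_DATE_FORMAT),
    request: request,
    status: status,
    pairs: request.scan(LOG_KEY_VALUE_FORMAT)
  }
end

# Made-up sample line in the usual access log layout.
sample = '127.0.0.1 - - [10/Oct/2010:13:55:36 +0200] "GET /log.gif?a=1&b=two HTTP/1.1" 200 35'
parts = dissect(sample)

puts parts[:ip]                 # 127.0.0.1
puts parts[:timestamp].iso8601  # 2010-10-10T13:55:36+02:00
puts parts[:status]             # 200
p parts[:pairs]                 # [["a", "1"], ["b", "two HTTP/1.1"]]
```

Note that LOG_KEY_VALUE_FORMAT only stops a value at the next &, so in this sample the last value also picks up the trailing protocol part of the request.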

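Class Method Summary

.parse_line(line) ⇒ Object
Parse one log line and return a hash with all attributes.

Instance Method Summary

#each ⇒ Object
Enumerable interface.
#get_next_observation ⇒ Object
#initialize(input) ⇒ LogParser
Support both strings and streams as input.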

Constructor Details

#initialize(input) ⇒ LogParser

Accepts either a String or a stream (any object that responds to gets) as input. A String is wrapped in a StringIO.



# File 'lib/logbox/log_parser.rb', line 11

def initialize(input)
  input = StringIO.new(input) if input.class == String
  @stream = input
end
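
The normalization step can be tried in isolation. This is a sketch only: to_stream is a hypothetical helper, not a LogParser method, and it uses is_a? where the constructor compares input.class, so unlike the constructor it also wraps String subclasses.

```ruby
require 'stringio'

# Hypothetical helper mirroring the constructor's normalization step:
# wrap a String in a StringIO, pass any other object through untouched.
def to_stream(input)
  input.is_a?(String) ? StringIO.new(input) : input
end

from_string = to_stream("first line\nsecond line\n")
from_stream = to_stream(StringIO.new("only line\n"))

puts from_string.gets  # first line
puts from_string.gets  # second line
p from_string.gets     # nil (end of input)
puts from_stream.gets  # only line
```

Either way the rest of the parser only needs gets, so an open File or $stdin should work as well as a StringIO.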

Class Method Details

.parse_line(line) ⇒ Object

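Parse one log line and return a hash with all attributes: :ip, :timestamp (a DateTime), :request and :status, plus one entry per query parameter in the request. A parameter that appears more than once is collected into an array. Returns nil for a blank line and raises ParseError if the line cannot be parsed.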



# File 'lib/logbox/log_parser.rb', line 34

def self.parse_line(line)
  return nil if line.strip.empty?
  
  line =~ LOG_FORMAT
  result = {}

  # Save ip, timestamp and request.
  result[:ip] = $1
  begin
    result[:timestamp] = DateTime.strptime($2, LOG_DATE_FORMAT)
  rescue ArgumentError
    raise ParseError.new("Error while parsing timestamp")
  end
  result[:request] = $3
  result[:status] = $4

  # Extract key/values pairs from the query part of the request.
  $3.scan(LOG_KEY_VALUE_FORMAT) do |key, value|
    begin
      key = CGI.unescape(key).to_sym
      value = CGI.unescape(value)
    rescue Encoding::CompatibilityError => e
      raise ParseError.new("Error while parsing query parameters")
    end

    if result.has_key? key
      if result[key].is_a? Array
        result[key] << value
      else
        result[key] = [result[key], value]
      end
    else
      result[key] = value
    end
  end

  return result
rescue ParseError
  raise
rescue
  raise ParseError.new("Unknown parsing error")
end
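
The handling of repeated query keys is the least obvious part of the method, so here is a small sketch of just that step. collect_params is a hypothetical helper written for this example, the request path is made up, and the pattern is a copy of LOG_KEY_VALUE_FORMAT.

```ruby
require 'cgi'

# Hypothetical helper (not part of LogParser): collect query parameters the
# way parse_line does, turning a repeated key into an array of its values.
def collect_params(request)
  result = {}
  # Same pattern as LOG_KEY_VALUE_FORMAT.
  request.scan(/[?&]([^=]+)=([^&]+)/) do |key, value|
    key = CGI.unescape(key).to_sym
    value = CGI.unescape(value)
    if result.key?(key)
      # Array() leaves an existing array alone and wraps a single string.
      result[key] = Array(result[key]) << value
    else
      result[key] = value
    end
  end
  result
end

params = collect_params("/log.gif?tag=a&tag=b&tag=c&title=Hello%20World")
p params[:tag]    # ["a", "b", "c"]
p params[:title]  # "Hello World"
```

A caller therefore has to be prepared for a value to be either a String or an Array, depending on how often the key occurred in the line.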

Instance Method Details

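#each ⇒ Object

Enumerable interface. Yields the hash for each parsed line in turn. Iteration ends at end of input, and also at the first blank line, because parse_line returns nil for a blank line and that nil ends the loop.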



# File 'lib/logbox/log_parser.rb', line 17

def each
  while(observation = get_next_observation)
    yield observation
  end
end
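
The same loop shape can be seen in a stripped-down class. UpcaseReader below is a hypothetical stand-in, not part of logbox: its "parser" just upcases each line and, like parse_line, returns nil for a blank line.

```ruby
require 'stringio'

# Hypothetical stand-in for LogParser with the same each/next-item loop.
class UpcaseReader
  include Enumerable

  def initialize(text)
    @stream = StringIO.new(text)
  end

  # Enumerable only needs each; map, select, to_a etc. are built on it.
  def each
    while (item = next_item)
      yield item
    end
  end

  # Returns nil at end of input and for blank lines, which ends the loop.
  def next_item
    line = @stream.gets
    return nil if line.nil? || line.strip.empty?
    line.strip.upcase
  end
end

p UpcaseReader.new("a\nb\nc\n").to_a  # ["A", "B", "C"]
p UpcaseReader.new("a\n\nb\n").to_a   # ["A"] (stops at the blank line)
```

Because the stream is consumed as it is read, each instance in this sketch can be iterated only once.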

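#get_next_observation ⇒ Object

Reads the next line from the stream and parses it with LogParser.parse_line. Returns nil at end of input or when the line is blank.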



# File 'lib/logbox/log_parser.rb', line 23

def get_next_observation
  line = @stream.gets
  line && LogParser.parse_line(line)
end