Class: RequestLogAnalyzer::Source::LogParser

Inherits:
Base
  • Object
Includes:
Enumerable
Defined in:
lib/request_log_analyzer/source/log_parser.rb

Overview

The LogParser class reads log data from a given source and uses a file format definition to parse all relevant information about requests from it. A FileFormat module should be provided that contains the definitions of the lines that occur in the log data.

The order in which lines occur is used to combine lines into a single request. If lines from different requests are interleaved, requests cannot be combined properly. This can happen when several Mongrel processes write to the log file simultaneously. The parser detects this problem and emits warnings when it occurs. LogParser supports multiple parse strategies that deal with this problem in different ways.

Constant Summary

DEFAULT_PARSE_STRATEGY =

The default parse strategy that will be used to parse the input.

'assume-correct'
PARSE_STRATEGIES =

All available parse strategies.

['cautious', 'assume-correct']

Instance Attribute Summary

Attributes inherited from Base

#current_request, #file_format, #options, #parsed_lines, #parsed_requests, #skipped_lines, #skipped_requests

Instance Method Summary

Methods inherited from Base

#finalize, #prepare

Constructor Details

#initialize(format, options = {}) ⇒ LogParser

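Initializes the log file parser instance. It applies the language-specific FileFormat module to this instance and uses the line definitions in that module to parse any input it is given (see parse_io). If no :parse_strategy option is supplied, DEFAULT_PARSE_STRATEGY is used; a strategy that is not listed in PARSE_STRATEGIES raises an error.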

format

The current file format instance

options

A hash of options that are used by the parser



# File 'lib/request_log_analyzer/source/log_parser.rb', line 30

def initialize(format, options = {})
  super(format, options)
  @parsed_lines     = 0
  @parsed_requests  = 0
  @skipped_lines    = 0
  @skipped_requests = 0
  @current_file     = nil
  @current_lineno   = nil
  @source_files     = options[:source_files]
  
  @options[:parse_strategy] ||= DEFAULT_PARSE_STRATEGY
  raise "Unknown parse strategy" unless PARSE_STRATEGIES.include?(@options[:parse_strategy])
end
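
The strategy check at the end of the constructor can be tried in isolation. The following is a minimal sketch that copies the two constants from the summary above; the ParseStrategySketch module and its resolve method are hypothetical names used only for illustration and are not part of the library.

```ruby
# Stand-alone sketch of the parse strategy check performed by #initialize.
# ParseStrategySketch and resolve are hypothetical, illustration-only names.
module ParseStrategySketch
  DEFAULT_PARSE_STRATEGY = 'assume-correct'
  PARSE_STRATEGIES = ['cautious', 'assume-correct']

  # Falls back to the default strategy and rejects unknown strategy names.
  def self.resolve(options = {})
    strategy = options[:parse_strategy] || DEFAULT_PARSE_STRATEGY
    raise "Unknown parse strategy" unless PARSE_STRATEGIES.include?(strategy)
    strategy
  end
end

puts ParseStrategySketch.resolve                                # default
puts ParseStrategySketch.resolve(:parse_strategy => 'cautious')

begin
  ParseStrategySketch.resolve(:parse_strategy => 'optimistic')
rescue RuntimeError => e
  puts "rejected: #{e.message}"
end
```

Note that strategy names are compared as strings, so passing a symbol such as :cautious would be rejected by this check.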

Instance Attribute Details

#current_fileObject (readonly)

Returns the value of attribute current_file.



# File 'lib/request_log_analyzer/source/log_parser.rb', line 22

def current_file
  @current_file
end

#current_linenoObject (readonly)

Returns the value of attribute current_lineno.



# File 'lib/request_log_analyzer/source/log_parser.rb', line 22

def current_lineno
  @current_lineno
end

#source_filesObject (readonly)

Returns the value of attribute source_files.



# File 'lib/request_log_analyzer/source/log_parser.rb', line 22

def source_files
  @source_files
end

Instance Method Details

#decompress_file?(filename) ⇒ Boolean

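Checks whether the filename has a recognized compressed-file extension (.tar.gz, .tgz, .gz, .bz2 or .zip). If it does, returns the command string used to decompress the file to standard output; otherwise returns an empty string. Despite the ? suffix, the return value is therefore a String rather than a Boolean.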

Returns:

  • (Boolean)


# File 'lib/request_log_analyzer/source/log_parser.rb', line 76

def decompress_file?(filename)
  nice_command = "nice -n 5"
  
  return "#{nice_command} gunzip -c -d #{filename}" if filename.match(/\.tar.gz$/) || filename.match(/\.tgz$/) || filename.match(/\.gz$/)
  return "#{nice_command} bunzip2 -c -d #{filename}" if filename.match(/\.bz2$/)
  return "#{nice_command} unzip -p #{filename}" if filename.match(/\.zip$/)

  return ""
end
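
The extension-to-command mapping can be sketched as a stand-alone function. This variant is an assumption-laden illustration, not the library's method: decompress_command is a hypothetical name, it returns nil rather than an empty string for uncompressed files, and it escapes the filename with Shellwords before interpolating it into the shell command, which the listing above does not do.

```ruby
require 'shellwords'

# Hypothetical stand-alone variant of the extension check shown above.
# Returns the decompression command, or nil when the file is not compressed.
def decompress_command(filename, nice_command = "nice -n 5")
  escaped = Shellwords.escape(filename) # guard against spaces and shell metacharacters
  case filename
  when /\.(tar\.gz|tgz|gz)\z/ then "#{nice_command} gunzip -c -d #{escaped}"
  when /\.bz2\z/              then "#{nice_command} bunzip2 -c -d #{escaped}"
  when /\.zip\z/              then "#{nice_command} unzip -p #{escaped}"
  end
end

puts decompress_command("production.log.gz")
puts decompress_command("my logs/app.log.bz2")
p decompress_command("production.log")
```

Returning nil lets the caller write `if (command = decompress_command(file))` and so evaluate the check only once per file.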

#each_request(options = {}, &block) ⇒ Object Also known as: each

Reads the input, which can be a file, a sequence of files, or STDIN, and parses the lines specified in the FileFormat. These lines are combined into Request instances, which are yielded to the block. The actual parsing occurs in the parse_io method.

options

A Hash of options that will be passed to parse_io.



# File 'lib/request_log_analyzer/source/log_parser.rb', line 48

def each_request(options = {}, &block) # :yields: :request, request
  
  case @source_files
  when IO
    puts "Parsing from the standard input. Press CTRL+C to finish." # FIXME: not here
    parse_stream(@source_files, options, &block) 
  when String
    parse_file(@source_files, options, &block) 
  when Array
    parse_files(@source_files, options, &block) 
  else
    raise "Unknown source provided"
  end
end
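
The dispatch on the type of the source can be explored without the library. In this sketch, parse_method_for is a hypothetical helper that mirrors the case statement above and only reports which parse method would be chosen. One consequence worth knowing is that StringIO does not inherit from IO, so a StringIO source falls through to the error branch.

```ruby
require 'stringio'

# Hypothetical helper mirroring the case statement in each_request:
# it names the parse method that would handle the given source.
def parse_method_for(source)
  case source
  when IO     then :parse_stream
  when String then :parse_file
  when Array  then :parse_files
  else raise "Unknown source provided"
  end
end

p parse_method_for($stdin)
p parse_method_for("log/production.log")
p parse_method_for(["log/a.log", "log/b.log"])

begin
  parse_method_for(StringIO.new("Processing ..."))
rescue RuntimeError => e
  puts "StringIO: #{e.message}"
end
```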

#parse_file(file, options = {}, &block) ⇒ Object

Parses a log file. Creates an IO stream for the provided file and sends it to parse_io for further handling. This method supports progress updates that can be used to display a progress bar.

If the log file is compressed, it is decompressed to stdout by an external command and read through a pipe. No progress handler is active in that case.

TODO: Check if IO.popen encounters problems with the given command line.

TODO: Fix progress bar that is broken for IO.popen, as it returns a single string.

file

The file that should be parsed.

options

A Hash of options that will be passed to parse_io.



# File 'lib/request_log_analyzer/source/log_parser.rb', line 95

def parse_file(file, options = {}, &block)

  @current_source = File.expand_path(file)
  @source_changes_handler.call(:started, @current_source) if @source_changes_handler
  
  if decompress_file?(file).empty?

    @progress_handler = @dormant_progress_handler
    @progress_handler.call(:started, file) if @progress_handler
    
    File.open(file, 'r') { |f| parse_io(f, options, &block) }
    
    @progress_handler.call(:finished, file) if @progress_handler
    @progress_handler = nil
  else
    IO.popen(decompress_file?(file), 'r') { |f| parse_io(f, options, &block) }
  end
  
  @source_changes_handler.call(:finished, @current_source) if @source_changes_handler
  
  @current_source = nil

end

#parse_files(files, options = {}, &block) ⇒ Object

Parses a list of subsequent files of the same format, by calling parse_file for every file in the array.

files

The Array of files that should be parsed

options

A Hash of options that will be passed to parse_io.



# File 'lib/request_log_analyzer/source/log_parser.rb', line 70

def parse_files(files, options = {}, &block) # :yields: request
  files.each { |file| parse_file(file, options, &block) }
end

#parse_io(io, options = {}, &block) ⇒ Object

This method loops over each line of the input stream. It tries to parse each line as one of the line types defined by the current file format (see RequestLogAnalyzer::FileFormat). It then combines the parsed lines into requests using heuristics. These requests (see RequestLogAnalyzer::Request) are yielded for further processing in the pipeline.

  • RequestLogAnalyzer::LineDefinition#matches is called to test if a line matches a line definition of the file format.

  • update_current_request is used to combine parsed lines into requests using heuristics.

  • The method will yield progress updates if a progress handler is installed using progress=

  • The method will yield parse warnings if a warning handler is installed using warning=

io

The IO instance to use as source

options

A hash of options that can be used by the parser.



# File 'lib/request_log_analyzer/source/log_parser.rb', line 139

def parse_io(io, options = {}, &block) # :yields: request
  @current_lineno = 1
  while line = io.gets
    @progress_handler.call(:progress, io.pos) if @progress_handler && @current_lineno % 255 == 0
    
    if request_data = file_format.parse_line(line) { |wt, message| warn(wt, message) }
      @parsed_lines += 1
      update_current_request(request_data.merge(:source => @current_source, :lineno => @current_lineno), &block)
    end
    
    @current_lineno += 1
  end
  
  warn(:unfinished_request_on_eof, "End of file reached, but last request was not completed!") unless @current_request.nil?
  @current_lineno = nil
end
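
The shape of this loop can be tried on an in-memory stream. The sketch below is self-contained and makes several assumptions: ToyFormat, its two line types, and parse_io_sketch are hypothetical stand-ins, and the combining rule (a "Processing" line opens a request, a "Completed" line closes and yields it) is a deliberate simplification of the heuristics in update_current_request.

```ruby
require 'stringio'

# Hypothetical stand-in for a file format with two line definitions.
class ToyFormat
  # Returns a hash for recognized lines and nil for everything else.
  def parse_line(line)
    case line
    when /\AProcessing (\S+)/     then { :line_type => :processing, :action => $1 }
    when /\ACompleted in (\d+)ms/ then { :line_type => :completed, :duration => $1.to_i }
    end
  end
end

# Simplified version of the parse loop: counts parsed lines, tags each
# parsed line with its line number, and yields completed requests.
def parse_io_sketch(io, format)
  parsed_lines = 0
  current = nil
  lineno = 1
  while (line = io.gets)
    if (data = format.parse_line(line))
      parsed_lines += 1
      data = data.merge(:lineno => lineno)
      if data[:line_type] == :processing
        current = [data]          # a header line opens a new request
      elsif current
        yield(current << data)    # a footer line closes and yields it
        current = nil
      end
    end
    lineno += 1
  end
  parsed_lines
end

log = StringIO.new(<<~LOG)
  Processing PostsController#index
  some unrelated noise
  Completed in 12ms
  Processing PostsController#show
  Completed in 7ms
LOG

requests = []
count = parse_io_sketch(log, ToyFormat.new) { |request| requests << request }
puts "#{count} lines parsed, #{requests.size} requests"
# prints "4 lines parsed, 2 requests"
```

Unrecognized lines still advance the line counter, so the :lineno values attached to parsed lines refer to positions in the original input.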

#parse_stream(stream, options = {}, &block) ⇒ Object

Parses an IO stream. It will simply call parse_io. This function does not support progress updates because the length of a stream is not known.

stream

The IO stream that should be parsed.

options

A Hash of options that will be passed to parse_io.



# File 'lib/request_log_analyzer/source/log_parser.rb', line 123

def parse_stream(stream, options = {}, &block)
  parse_io(stream, options, &block)
end

#progress=(proc) ⇒ Object

Assign a proc to this attribute to install a progress handler. The handler stays dormant until parse_file activates it for an uncompressed file.

proc

The proc that will be called to handle progress update messages



# File 'lib/request_log_analyzer/source/log_parser.rb', line 158

def progress=(proc)
  @dormant_progress_handler = proc
end
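
Judging from the calls in parse_file and parse_io, a progress handler receives a message symbol and one value: the file name for :started and :finished, and the current IO position for :progress. A minimal sketch of such a handler follows; the calls at the end are simulated by hand so the example runs without the library, and the file name and byte offset are made-up values.

```ruby
# Collects human-readable progress events; a real handler might instead
# drive a progress bar.
events = []
progress_handler = lambda do |message, value|
  case message
  when :started  then events << "started #{value}"
  when :progress then events << "at byte #{value}"
  when :finished then events << "finished #{value}"
  end
end

# Simulated calls, in the order parse_file and parse_io would issue them.
# With a real parser instance the lambda would be assigned via progress=.
progress_handler.call(:started, "production.log")
progress_handler.call(:progress, 20_480)
progress_handler.call(:finished, "production.log")

puts events
```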

#source_changes=(proc) ⇒ Object

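Assign a proc to this attribute to install a source change handler. It is called with :started and :finished, together with the expanded path of the file, around the parsing of each file.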

proc

The proc that will be called to handle source changes



# File 'lib/request_log_analyzer/source/log_parser.rb', line 170

def source_changes=(proc)
  @source_changes_handler = proc
end

#warn(type, message) ⇒ Object

This method is called by the parser if it encounters any parsing problems. It calls the installed warning handler, if any, with the warning type, the message, and the current line number.

By default, RequestLogAnalyzer::Controller will install a warning handler that passes the warnings to each aggregator so they can do something useful with them.

type

The warning type (a Symbol)

message

A message explaining the warning



183
184
185
# File 'lib/request_log_analyzer/source/log_parser.rb', line 183

def warn(type, message)
  @warning_handler.call(type, message, @current_lineno) if @warning_handler
end

#warning=(proc) ⇒ Object

Assign a proc to this attribute to install a warning handler that is called while parsing.

proc

The proc that will be called to handle parse warning messages



# File 'lib/request_log_analyzer/source/log_parser.rb', line 164

def warning=(proc)
  @warning_handler = proc
end
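
The interplay between warning= and warn can be sketched with a small stand-in class. HandlerDemo is hypothetical and copies only the handler plumbing from the listings above; the writer for the line number exists solely so the example can set it by hand.

```ruby
# Minimal stand-in that reproduces the warning handler plumbing.
# HandlerDemo is a hypothetical class, not the real LogParser.
class HandlerDemo
  attr_writer :current_lineno # illustration only; the real parser sets this itself

  def warning=(proc)
    @warning_handler = proc
  end

  # Forwards the warning to the handler, or does nothing without one.
  def warn(type, message)
    @warning_handler.call(type, message, @current_lineno) if @warning_handler
  end
end

collected = []
demo = HandlerDemo.new

# No handler installed yet: the warning is silently dropped.
demo.warn(:teaser, "nobody is listening")

# The handler receives three arguments: type, message and line number.
demo.warning = lambda { |type, message, lineno| collected << [type, message, lineno] }
demo.current_lineno = 42
demo.warn(:unfinished_request_on_eof, "End of file reached, but last request was not completed!")

p collected
```

Because warnings raised before a handler is installed are discarded, the handler should be assigned before parsing starts.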