Class: RequestLogAnalyzer::Source::LogParser
- Includes:
- Enumerable
- Defined in:
- lib/request_log_analyzer/source/log_parser.rb
Overview
The LogParser class reads log data from a given source and uses a file format definition to parse all relevant information about requests from the file. A FileFormat module should be provided that contains the definitions of the lines that occur in the log data.
The order in which lines occur is used to combine lines into a single request. If these lines are mixed, requests cannot be combined properly. This can happen when different Mongrel processes write to the log file simultaneously. The parser detects this problem and emits warnings when it occurs. LogParser supports multiple parse strategies that deal differently with this problem.
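The effect of interleaved lines can be illustrated with a small, self-contained sketch. The `combine` method and the `[type, pid]` line representation below are hypothetical stand-ins, not the library's code, and the real parser's behavior depends on the selected parse strategy.

```ruby
# Hypothetical stand-in for the request-combining step; not the library's code.
# Each parsed line is a [type, pid] pair: :start opens a request, :end closes it.
def combine(lines)
  requests = []
  warnings = []
  current = nil
  lines.each do |type, pid|
    case type
    when :start
      # A new header while a request is still open suggests interleaved output.
      warnings << "unclosed request from pid #{current[:pid]}" if current
      current = { pid: pid, lines: [type] }
    when :end
      if current
        current[:lines] << type
        requests << current
        current = nil
      else
        warnings << "footer without header from pid #{pid}"
      end
    end
  end
  [requests, warnings]
end

clean = [[:start, 1], [:end, 1], [:start, 2], [:end, 2]]
mixed = [[:start, 1], [:start, 2], [:end, 1], [:end, 2]]

clean_requests, clean_warnings = combine(clean)
mixed_requests, mixed_warnings = combine(mixed)
```

With the ordered input both requests are recovered without warnings. With the mixed input only one request is assembled (and it pairs lines from different processes), while two warnings are recorded.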
Constant Summary
- DEFAULT_MAX_LINE_LENGTH = 8096
The maximum number of bytes to read from a line.
- DEFAULT_LINE_DIVIDER = "\n"
- DEFAULT_PARSE_STRATEGY = 'assume-correct'
The default parse strategy that will be used to parse the input.
- PARSE_STRATEGIES = ['cautious', 'assume-correct']
All available parse strategies.
Instance Attribute Summary
-
#current_file ⇒ Object
readonly
Returns the value of attribute current_file.
-
#current_lineno ⇒ Object
readonly
Returns the value of attribute current_lineno.
-
#parsed_lines ⇒ Object
readonly
Returns the value of attribute parsed_lines.
-
#parsed_requests ⇒ Object
readonly
Returns the value of attribute parsed_requests.
-
#processed_files ⇒ Object
readonly
Returns the value of attribute processed_files.
-
#skipped_lines ⇒ Object
readonly
Returns the value of attribute skipped_lines.
-
#skipped_requests ⇒ Object
readonly
Returns the value of attribute skipped_requests.
-
#source_files ⇒ Object
readonly
Returns the value of attribute source_files.
-
#warnings ⇒ Object
readonly
Returns the value of attribute warnings.
Attributes inherited from Base
#current_request, #file_format, #options
Instance Method Summary
-
#decompress_file?(filename) ⇒ Boolean
Checks whether the filename has a compressed-file extension.
-
#each_request(options = {}, &block) ⇒ Object
(also: #each)
Reads the input, which can either be a file, sequence of files or STDIN to parse lines specified in the FileFormat.
-
#initialize(format, options = {}) ⇒ LogParser
constructor
Initializes the log file parser instance.
- #line_divider ⇒ Object
- #max_line_length ⇒ Object
-
#parse_file(file, options = {}, &block) ⇒ Object
Parses a log file.
-
#parse_files(files, options = {}, &block) ⇒ Object
Parses a list of subsequent files of the same format, by calling parse_file for every file in the array.
-
#parse_io_18(io, options = {}, &block) ⇒ Object
This method loops over each line of the input stream.
-
#parse_io_19(io, options = {}, &block) ⇒ Object
This method loops over each line of the input stream.
-
#parse_line(line, &block) ⇒ Object
Parses a single line using the current file format.
-
#parse_stream(stream, options = {}, &block) ⇒ Object
Parses an IO stream.
-
#parse_string(string, options = {}, &block) ⇒ Object
Parses a string.
-
#progress=(proc) ⇒ Object
Add a block to this method to install a progress handler while parsing.
-
#source_changes=(proc) ⇒ Object
Add a block to this method to install a source change handler while parsing.
-
#warn(type, message) ⇒ Object
This method is called by the parser if it encounters any parsing problems.
-
#warning=(proc) ⇒ Object
Add a block to this method to install a warning handler while parsing.
Methods inherited from Base
Constructor Details
#initialize(format, options = {}) ⇒ LogParser
Initializes the log file parser instance. It will apply the language specific FileFormat module to this instance. It will use the line definitions in this module to parse any input that it is given (see parse_io).
format-
The current file format instance
options-
A hash of options that are used by the parser
# File 'lib/request_log_analyzer/source/log_parser.rb', line 34
def initialize(format, options = {})
  super(format, options)
  @warnings         = 0
  @parsed_lines     = 0
  @parsed_requests  = 0
  @skipped_lines    = 0
  @skipped_requests = 0
  @current_request  = nil
  @current_source   = nil
  @current_file     = nil
  @current_lineno   = nil
  @processed_files  = []
  @source_files     = options[:source_files]
  @progress_handler = nil
  @warning_handler  = nil

  @options[:parse_strategy] ||= DEFAULT_PARSE_STRATEGY
  unless PARSE_STRATEGIES.include?(@options[:parse_strategy])
    # Interpolate the rejected strategy name into the error message.
    fail "Unknown parse strategy: #{@options[:parse_strategy]}"
  end
end
Instance Attribute Details
#current_file ⇒ Object (readonly)
Returns the value of attribute current_file.
# File 'lib/request_log_analyzer/source/log_parser.rb', line 25
def current_file
  @current_file
end
#current_lineno ⇒ Object (readonly)
Returns the value of attribute current_lineno.
# File 'lib/request_log_analyzer/source/log_parser.rb', line 25
def current_lineno
  @current_lineno
end
#parsed_lines ⇒ Object (readonly)
Returns the value of attribute parsed_lines.
# File 'lib/request_log_analyzer/source/log_parser.rb', line 26
def parsed_lines
  @parsed_lines
end
#parsed_requests ⇒ Object (readonly)
Returns the value of attribute parsed_requests.
# File 'lib/request_log_analyzer/source/log_parser.rb', line 26
def parsed_requests
  @parsed_requests
end
#processed_files ⇒ Object (readonly)
Returns the value of attribute processed_files.
# File 'lib/request_log_analyzer/source/log_parser.rb', line 25
def processed_files
  @processed_files
end
#skipped_lines ⇒ Object (readonly)
Returns the value of attribute skipped_lines.
# File 'lib/request_log_analyzer/source/log_parser.rb', line 26
def skipped_lines
  @skipped_lines
end
#skipped_requests ⇒ Object (readonly)
Returns the value of attribute skipped_requests.
# File 'lib/request_log_analyzer/source/log_parser.rb', line 26
def skipped_requests
  @skipped_requests
end
#source_files ⇒ Object (readonly)
Returns the value of attribute source_files.
# File 'lib/request_log_analyzer/source/log_parser.rb', line 25
def source_files
  @source_files
end
#warnings ⇒ Object (readonly)
Returns the value of attribute warnings.
# File 'lib/request_log_analyzer/source/log_parser.rb', line 26
def warnings
  @warnings
end
Instance Method Details
#decompress_file?(filename) ⇒ Boolean
Checks whether the filename has a compressed-file extension. If the extension is recognized, returns the command string used to decompress the file; otherwise returns an empty string.
# File 'lib/request_log_analyzer/source/log_parser.rb', line 97
def decompress_file?(filename)
  nice_command = 'nice -n 5'

  return "#{nice_command} gunzip -c -d #{filename}" if filename.match(/\.tar.gz$/) || filename.match(/\.tgz$/) || filename.match(/\.gz$/)
  return "#{nice_command} bunzip2 -c -d #{filename}" if filename.match(/\.bz2$/)
  return "#{nice_command} unzip -p #{filename}" if filename.match(/\.zip$/)

  ''
end
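To make the return values concrete, here is a standalone sketch of the same extension check. The method name `decompress_command` is hypothetical and exists only so the example runs without the library; it assumes the same three commands as the listing above.

```ruby
# Standalone illustration of the extension check; not the library's method.
def decompress_command(filename)
  nice_command = 'nice -n 5'
  case filename
  when /\.(tar\.gz|tgz|gz)\z/ then "#{nice_command} gunzip -c -d #{filename}"
  when /\.bz2\z/              then "#{nice_command} bunzip2 -c -d #{filename}"
  when /\.zip\z/              then "#{nice_command} unzip -p #{filename}"
  else ''
  end
end

gz_command    = decompress_command('production.log.gz')
plain_command = decompress_command('production.log')
```

A caller can test the result with `empty?`, which is how parse_file decides between opening the file directly and reading from a decompression command.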
#each_request(options = {}, &block) ⇒ Object Also known as: each
Reads the input, which can be a file, a sequence of files, or STDIN, and parses the lines specified in the FileFormat. These lines are combined into Request instances, which are yielded. The actual parsing occurs in the parse_io method.
options-
A Hash of options that will be passed to parse_io.
# File 'lib/request_log_analyzer/source/log_parser.rb', line 68
def each_request(options = {}, &block) # :yields: :request, request
  case @source_files
  when IO
    if @source_files == $stdin
      puts 'Parsing from the standard input. Press CTRL+C to finish.' # FIXME: not here
    end
    parse_stream(@source_files, options, &block)
  when String
    parse_file(@source_files, options, &block)
  when Array
    parse_files(@source_files, options, &block)
  else
    fail 'Unknown source provided'
  end
end
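The dispatch on the source type can be sketched in isolation. The `source_kind` helper below is hypothetical; it only mirrors the `case` statement's class matching and does no parsing.

```ruby
# Hypothetical helper mirroring the class-based dispatch; it performs no parsing.
def source_kind(source)
  case source
  when IO     then :stream # e.g. STDIN
  when String then :file   # a single path
  when Array  then :files  # a list of paths
  else raise ArgumentError, 'Unknown source provided'
  end
end

kinds = [STDIN, 'production.log', ['a.log', 'b.log']].map { |source| source_kind(source) }
```

Note that a StringIO is not an IO subclass in Ruby, so a source of that type would fall through to the error branch in a dispatch written this way.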
#line_divider ⇒ Object
# File 'lib/request_log_analyzer/source/log_parser.rb', line 60
def line_divider
  file_format.line_divider || DEFAULT_LINE_DIVIDER
end
#max_line_length ⇒ Object
# File 'lib/request_log_analyzer/source/log_parser.rb', line 56
def max_line_length
  file_format.max_line_length || DEFAULT_MAX_LINE_LENGTH
end
#parse_file(file, options = {}, &block) ⇒ Object
Parses a log file. Creates an IO stream for the provided file and sends it to parse_io for further handling. This method supports progress updates that can be used to display a progress bar. If the given path is a directory, every file in it is parsed.
If the log file is compressed, it is decompressed to stdout and read from there.
TODO: Check if IO.popen encounters problems with the given command line.
TODO: Fix the progress bar, which is broken for IO.popen, as it returns a single string.
file-
The file that should be parsed.
options-
A Hash of options that will be passed to parse_io.
# File 'lib/request_log_analyzer/source/log_parser.rb', line 116
def parse_file(file, options = {}, &block)
  if File.directory?(file)
    parse_files(Dir["#{ file }/*"], options, &block)
    return
  end

  @current_source = File.expand_path(file)
  @source_changes_handler.call(:started, @current_source) if @source_changes_handler

  if decompress_file?(file).empty?
    @progress_handler = @dormant_progress_handler
    @progress_handler.call(:started, file) if @progress_handler

    File.open(file, 'rb') { |f| parse_io(f, options, &block) }

    @progress_handler.call(:finished, file) if @progress_handler
    @progress_handler = nil

    @processed_files.push(@current_source.dup)
  else
    IO.popen(decompress_file?(file), 'rb') { |f| parse_io(f, options, &block) }
  end

  @source_changes_handler.call(:finished, @current_source) if @source_changes_handler

  @current_source = nil
end
#parse_files(files, options = {}, &block) ⇒ Object
Parses a list of subsequent files of the same format, by calling parse_file for every file in the array.
files-
The Array of files that should be parsed
options-
A Hash of options that will be passed to parse_io.
# File 'lib/request_log_analyzer/source/log_parser.rb', line 91
def parse_files(files, options = {}, &block) # :yields: request
  files.each { |file| parse_file(file, options, &block) }
end
#parse_io_18(io, options = {}, &block) ⇒ Object
This method loops over each line of the input stream. It tries to parse each line as one of the lines defined by the current file format (see RequestLogAnalyzer::FileFormat). It then combines the parsed lines into requests using heuristics. These requests (see RequestLogAnalyzer::Request) are yielded for further processing in the pipeline.
-
RequestLogAnalyzer::LineDefinition#matches is called to test if a line matches a line definition of the file format.
-
update_current_request is used to combine parsed lines into requests using heuristics.
-
The method will yield progress updates if a progress handler is installed using progress=
-
The method will yield parse warnings if a warning handler is installed using warning=
This is a Ruby 1.8 specific version that doesn’t offer memory protection.
io-
The IO instance to use as source
options-
A hash of options that can be used by the parser.
# File 'lib/request_log_analyzer/source/log_parser.rb', line 203
def parse_io_18(io, options = {}, &block) # :yields: request
  @line_divider = options[:line_divider] || line_divider
  @current_lineno = 0
  while line = io.gets(@line_divider)
    @current_lineno += 1
    @progress_handler.call(:progress, io.pos) if @progress_handler && @current_lineno % 255 == 0
    parse_line(line, &block)
  end

  warn(:unfinished_request_on_eof, 'End of file reached, but last request was not completed!') unless @current_request.nil?
  @current_lineno = nil
end
#parse_io_19(io, options = {}, &block) ⇒ Object
This method loops over each line of the input stream. It tries to parse each line as one of the lines defined by the current file format (see RequestLogAnalyzer::FileFormat). It then combines the parsed lines into requests using heuristics. These requests (see RequestLogAnalyzer::Request) are yielded for further processing in the pipeline.
-
RequestLogAnalyzer::LineDefinition#matches is called to test if a line matches a line definition of the file format.
-
update_current_request is used to combine parsed lines into requests using heuristics.
-
The method will yield progress updates if a progress handler is installed using progress=
-
The method will yield parse warnings if a warning handler is installed using warning=
This is a Ruby 1.9 specific version that offers memory protection.
io-
The IO instance to use as source
options-
A hash of options that can be used by the parser.
# File 'lib/request_log_analyzer/source/log_parser.rb', line 175
def parse_io_19(io, options = {}, &block) # :yields: request
  @max_line_length = options[:max_line_length] || max_line_length
  @line_divider    = options[:line_divider] || line_divider
  @current_lineno  = 0
  while line = io.gets(@line_divider, @max_line_length)
    @current_lineno += 1
    @progress_handler.call(:progress, io.pos) if @progress_handler && @current_lineno % 255 == 0
    parse_line(line, &block)
  end

  warn(:unfinished_request_on_eof, 'End of file reached, but last request was not completed!') unless @current_request.nil?
  @current_lineno = nil
end
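The memory protection appears to come from the second argument to gets, which caps how many bytes a single read can return. The sketch below shows that behavior on a StringIO with an artificially small limit of 8 bytes; the input string and the limit are made up for illustration.

```ruby
require 'stringio'

# One short line, one 20-character line, and a final short line.
io = StringIO.new("short\n" + ('x' * 20) + "\nlast\n")

chunks = []
# gets(separator, limit) returns at most `limit` bytes per call.
while (line = io.gets("\n", 8))
  chunks << line
end
```

Each chunk stays within the limit, so an overlong line cannot exhaust memory. The trade-off is that such a line reaches the parser as several pieces, each counted as its own line.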
#parse_line(line, &block) ⇒ Object
Parses a single line using the current file format. If successful, the parsed information is used to build a request.
line-
The line to parse
block-
The block to send fully parsed requests to.
# File 'lib/request_log_analyzer/source/log_parser.rb', line 222
def parse_line(line, &block) # :yields: request
  if request_data = file_format.parse_line(line) { |wt, message| warn(wt, message) }
    @parsed_lines += 1
    update_current_request(request_data.merge(source: @current_source, lineno: @current_lineno), &block)
  end
end
#parse_stream(stream, options = {}, &block) ⇒ Object
Parses an IO stream. It will simply call parse_io. This function does not support progress updates because the length of a stream is not known.
stream-
The IO stream that should be parsed.
options-
A Hash of options that will be passed to parse_io.
# File 'lib/request_log_analyzer/source/log_parser.rb', line 150
def parse_stream(stream, options = {}, &block)
  parse_io(stream, options, &block)
end
#parse_string(string, options = {}, &block) ⇒ Object
Parses a string. It will simply call parse_io. This function does not support progress updates.
string-
The string that should be parsed.
options-
A Hash of options that will be passed to parse_io.
# File 'lib/request_log_analyzer/source/log_parser.rb', line 157
def parse_string(string, options = {}, &block)
  parse_io(StringIO.new(string), options, &block)
end
#progress=(proc) ⇒ Object
Add a block to this method to install a progress handler while parsing.
proc-
The proc that will be called to handle progress update messages
# File 'lib/request_log_analyzer/source/log_parser.rb', line 231
def progress=(proc)
  @dormant_progress_handler = proc
end
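Judging from the parse_file and parse_io listings, a progress handler receives a message symbol and a value: :started and :finished with the file name, and :progress with the stream position every 255 lines. The following sketch replays that call pattern against a StringIO; the file name and the input are invented, and no library code is involved.

```ruby
require 'stringio'

updates = []
# A handler shaped like the calls in the listings: (message, value).
progress_handler = lambda { |message, value| updates << [message, value] }

io = StringIO.new("line\n" * 600)
lineno = 0

progress_handler.call(:started, 'example.log')
while io.gets("\n")
  lineno += 1
  # Report the byte position only every 255 lines to keep overhead low.
  progress_handler.call(:progress, io.pos) if lineno % 255 == 0
end
progress_handler.call(:finished, 'example.log')
```

For 600 lines the handler is called four times: once at the start, twice for progress (after lines 255 and 510), and once at the end.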
#source_changes=(proc) ⇒ Object
Add a block to this method to install a source change handler while parsing.
proc-
The proc that will be called to handle source changes
# File 'lib/request_log_analyzer/source/log_parser.rb', line 243
def source_changes=(proc)
  @source_changes_handler = proc
end
#warn(type, message) ⇒ Object
This method is called by the parser if it encounters any parsing problems. It calls the installed warning handler, if any.
By default, RequestLogAnalyzer::Controller will install a warning handler that will pass the warnings to each aggregator so they can do something useful with it.
type-
The warning type (a Symbol)
message-
A message explaining the warning
# File 'lib/request_log_analyzer/source/log_parser.rb', line 256
def warn(type, message)
  @warnings += 1
  @warning_handler.call(type, message, @current_lineno) if @warning_handler
end
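The interplay between the warning counter and an optional handler can be tried out with a minimal stand-in class. `WarningDemo` is hypothetical and copies only the warning-related methods from the listings; it is not a LogParser.

```ruby
# Hypothetical stand-in that mirrors only the warning-related methods.
class WarningDemo
  attr_reader :warnings

  def initialize
    @warnings = 0
    @warning_handler = nil
    @current_lineno = nil
  end

  def warning=(proc)
    @warning_handler = proc
  end

  def warn(type, message)
    @warnings += 1
    @warning_handler.call(type, message, @current_lineno) if @warning_handler
  end
end

received = []
demo = WarningDemo.new

# Counted, but not delivered: no handler is installed yet.
demo.warn(:example_warning, 'raised before a handler was installed')

demo.warning = lambda { |type, message, lineno| received << [type, message, lineno] }
demo.warn(:unfinished_request_on_eof, 'End of file reached, but last request was not completed!')
```

Both warnings increment the counter, but only the second reaches the handler. The line number passed along is nil here because no input is being parsed.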
#warning=(proc) ⇒ Object
Add a block to this method to install a warning handler while parsing.
proc-
The proc that will be called to handle parse warning messages
# File 'lib/request_log_analyzer/source/log_parser.rb', line 237
def warning=(proc)
  @warning_handler = proc
end