Class: RequestLogAnalyzer::Source::LogParser
- Includes:
- Enumerable
- Defined in:
- lib/request_log_analyzer/source/log_parser.rb
Overview
The LogParser class reads log data from a given source and uses a file format definition to parse all relevant information about requests from the file. A FileFormat module should be provided that contains the definitions of the lines that occur in the log data.
The order in which lines occur is used to combine lines into a single request. If these lines are mixed, requests cannot be combined properly. This can happen when different mongrel processes write to the log file simultaneously. The parser detects this problem and emits warnings when it occurs. LogParser supports multiple parse strategies that deal with this problem in different ways.
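The effect of line order can be illustrated with a small, self-contained sketch. It is not the library's implementation: a "line" here is just a hash with a `:type` key, and the helper name `combine_lines` and the warning symbols are hypothetical. It only shows why interleaved output from two processes leads to warnings and lost requests.

```ruby
# Simplified illustration only. The real parser relies on the line definitions
# of a FileFormat; here each line is a hash with a :type key (an assumption).
def combine_lines(lines)
  requests = []
  warnings = []
  current  = nil

  lines.each do |line|
    case line[:type]
    when :header
      # A new header while a request is still open suggests interleaved output.
      warnings << :unclosed_request if current
      current = [line]
    when :footer
      if current
        current << line
        requests << current
        current = nil
      else
        warnings << :no_current_request
      end
    else
      if current
        current << line
      else
        warnings << :no_current_request
      end
    end
  end

  [requests, warnings]
end

ordered = [{ type: :header, id: 1 }, { type: :body, id: 1 }, { type: :footer, id: 1 }]
mixed   = [{ type: :header, id: 1 }, { type: :header, id: 2 },
           { type: :footer, id: 1 }, { type: :footer, id: 2 }]

ordered_requests, ordered_warnings = combine_lines(ordered)
mixed_requests, mixed_warnings     = combine_lines(mixed)

puts "ordered: #{ordered_requests.size} request(s), warnings: #{ordered_warnings.inspect}"
puts "mixed:   #{mixed_requests.size} request(s), warnings: #{mixed_warnings.inspect}"
```

In the mixed case the sketch recovers only one of the two requests and pairs the wrong header with the wrong footer, which is the kind of situation the parse strategies are meant to handle.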
Constant Summary
- DEFAULT_PARSE_STRATEGY = 'assume-correct'
The default parse strategy that will be used to parse the input.
- PARSE_STRATEGIES = ['cautious', 'assume-correct']
All available parse strategies.
Instance Attribute Summary
-
#current_file ⇒ Object
readonly
Returns the value of attribute current_file.
-
#current_lineno ⇒ Object
readonly
Returns the value of attribute current_lineno.
-
#source_files ⇒ Object
readonly
Returns the value of attribute source_files.
Attributes inherited from Base
#current_request, #file_format, #options, #parsed_lines, #parsed_requests, #skipped_lines, #skipped_requests
Instance Method Summary
-
#decompress_file?(filename) ⇒ Boolean
Check if a file has a compressed extension in the filename.
-
#each_request(options = {}, &block) ⇒ Object
(also: #each)
Reads the input, which can be a file, a sequence of files, or STDIN, and parses the lines specified in the FileFormat.
-
#initialize(format, options = {}) ⇒ LogParser
constructor
Initializes the log file parser instance.
-
#parse_file(file, options = {}, &block) ⇒ Object
Parses a log file.
-
#parse_files(files, options = {}, &block) ⇒ Object
Parses a list of subsequent files of the same format, by calling parse_file for every file in the array.
-
#parse_io(io, options = {}, &block) ⇒ Object
This method loops over each line of the input stream.
-
#parse_stream(stream, options = {}, &block) ⇒ Object
Parses an IO stream.
-
#progress=(proc) ⇒ Object
Add a block to this method to install a progress handler while parsing.
-
#source_changes=(proc) ⇒ Object
Add a block to this method to install a source change handler while parsing.
-
#warn(type, message) ⇒ Object
This method is called by the parser if it encounters any parsing problems.
-
#warning=(proc) ⇒ Object
Add a block to this method to install a warning handler while parsing.
Methods inherited from Base
Constructor Details
#initialize(format, options = {}) ⇒ LogParser
Initializes the log file parser instance. It will apply the language specific FileFormat module to this instance. It will use the line definitions in this module to parse any input that it is given (see parse_io).
format
-
The current file format instance
options
-
A hash of options that are used by the parser
    # File 'lib/request_log_analyzer/source/log_parser.rb', line 30
    def initialize(format, options = {})
      super(format, options)
      @parsed_lines     = 0
      @parsed_requests  = 0
      @skipped_lines    = 0
      @skipped_requests = 0
      @current_file     = nil
      @current_lineno   = nil
      @source_files     = options[:source_files]

      @options[:parse_strategy] ||= DEFAULT_PARSE_STRATEGY
      raise "Unknown parse strategy" unless PARSE_STRATEGIES.include?(@options[:parse_strategy])
    end
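The default-then-validate step at the end of the constructor can be tried in isolation. The sketch below is a stand-in and does not load the library: the two constants repeat the values documented in the Constant Summary, and `resolve_parse_strategy` is a hypothetical helper name.

```ruby
# Stand-in constants repeating the documented values.
DEFAULT_PARSE_STRATEGY = 'assume-correct'
PARSE_STRATEGIES       = ['cautious', 'assume-correct']

# Hypothetical helper: applies the same default-then-validate step as the
# constructor and returns the strategy that would be used.
def resolve_parse_strategy(options = {})
  options[:parse_strategy] ||= DEFAULT_PARSE_STRATEGY
  raise "Unknown parse strategy" unless PARSE_STRATEGIES.include?(options[:parse_strategy])
  options[:parse_strategy]
end

default_strategy  = resolve_parse_strategy
cautious_strategy = resolve_parse_strategy(parse_strategy: 'cautious')

# An unlisted strategy name raises a RuntimeError.
unknown_error = begin
  resolve_parse_strategy(parse_strategy: 'optimistic')
  nil
rescue RuntimeError => e
  e.message
end

puts default_strategy
puts cautious_strategy
puts unknown_error
```

Note that strategies are compared as strings, so a symbol such as `:cautious` would presumably be rejected by the same check.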
Instance Attribute Details
#current_file ⇒ Object (readonly)
Returns the value of attribute current_file.
    # File 'lib/request_log_analyzer/source/log_parser.rb', line 22
    def current_file
      @current_file
    end
#current_lineno ⇒ Object (readonly)
Returns the value of attribute current_lineno.
    # File 'lib/request_log_analyzer/source/log_parser.rb', line 22
    def current_lineno
      @current_lineno
    end
#source_files ⇒ Object (readonly)
Returns the value of attribute source_files.
    # File 'lib/request_log_analyzer/source/log_parser.rb', line 22
    def source_files
      @source_files
    end
Instance Method Details
#decompress_file?(filename) ⇒ Boolean
Check if a file has a compressed extension in the filename. If the extension is recognized, return the command string used to decompress the file; otherwise return an empty string.
    # File 'lib/request_log_analyzer/source/log_parser.rb', line 76
    def decompress_file?(filename)
      nice_command = "nice -n 5"

      return "#{nice_command} gunzip -c -d #{filename}" if filename.match(/\.tar.gz$/) || filename.match(/\.tgz$/) || filename.match(/\.gz$/)
      return "#{nice_command} bunzip2 -c -d #{filename}" if filename.match(/\.bz2$/)
      return "#{nice_command} unzip -p #{filename}" if filename.match(/\.zip$/)

      return ""
    end
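To see which command string each extension maps to, the logic of the listing can be copied into a standalone function. The name `decompress_command_for` and the sample filenames are made up for this sketch; the body follows the listing above.

```ruby
# Standalone copy of the extension check, renamed to make clear that it is
# not the library method itself.
def decompress_command_for(filename)
  nice_command = "nice -n 5"
  return "#{nice_command} gunzip -c -d #{filename}"  if filename.match(/\.tar.gz$/) || filename.match(/\.tgz$/) || filename.match(/\.gz$/)
  return "#{nice_command} bunzip2 -c -d #{filename}" if filename.match(/\.bz2$/)
  return "#{nice_command} unzip -p #{filename}"      if filename.match(/\.zip$/)
  ""
end

# Hypothetical filenames, one per branch.
commands = %w[production.log production.log.gz archive.bz2 logs.zip].to_h { |name| [name, decompress_command_for(name)] }

commands.each { |name, command| puts "#{name.ljust(20)} => #{command.inspect}" }
```

The filename is interpolated into the command without quoting, so a path containing spaces or shell metacharacters would likely need escaping (for example with `Shellwords.escape` from the standard library) before being handed to a shell.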
#each_request(options = {}, &block) ⇒ Object Also known as: each
Reads the input, which can be a file, a sequence of files, or STDIN, and parses the lines specified in the FileFormat. These lines will be combined into Request instances, which will be yielded. The actual parsing occurs in the parse_io method.
options
-
A Hash of options that will be passed to parse_io.
    # File 'lib/request_log_analyzer/source/log_parser.rb', line 48
    def each_request(options = {}, &block) # :yields: :request, request
      case @source_files
      when IO
        puts "Parsing from the standard input. Press CTRL+C to finish." # FIXME: not here
        parse_stream(@source_files, options, &block)
      when String
        parse_file(@source_files, options, &block)
      when Array
        parse_files(@source_files, options, &block)
      else
        raise "Unknown source provided"
      end
    end
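The dispatch on the type of the source can be sketched without parsing anything. The helper `parse_method_for` is hypothetical and only reports which method the case statement would select; the file paths are placeholders.

```ruby
require 'stringio'

# Hypothetical helper that mirrors the case statement in each_request and
# returns the name of the method that would be called.
def parse_method_for(source)
  case source
  when IO     then :parse_stream
  when String then :parse_file
  when Array  then :parse_files
  else raise "Unknown source provided"
  end
end

stdin_choice = parse_method_for(STDIN)
file_choice  = parse_method_for('log/production.log')
files_choice = parse_method_for(['log/a.log', 'log/b.log'])

# StringIO does not inherit from IO, so it falls through to the else branch.
stringio_error = begin
  parse_method_for(StringIO.new("line\n"))
rescue RuntimeError => e
  e.message
end

puts stdin_choice, file_choice, files_choice, stringio_error
```

One consequence worth keeping in mind when testing: an in-memory StringIO would not be accepted as a source by this kind of check, although it can be handed to a method that only needs `gets` and `pos`.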
#parse_file(file, options = {}, &block) ⇒ Object
Parses a log file. Creates an IO stream for the provided file and sends it to parse_io for further handling. This method supports progress updates that can be used to display a progress bar.
If the log file is compressed, it is decompressed to stdout by an external command and read from there.
TODO: Check if IO.popen encounters problems with the given command line.
TODO: Fix progress bar that is broken for IO.popen, as it returns a single string.
file
-
The file that should be parsed.
options
-
A Hash of options that will be passed to parse_io.
    # File 'lib/request_log_analyzer/source/log_parser.rb', line 95
    def parse_file(file, options = {}, &block)

      @current_source = File.expand_path(file)
      @source_changes_handler.call(:started, @current_source) if @source_changes_handler

      if decompress_file?(file).empty?

        @progress_handler = @dormant_progress_handler
        @progress_handler.call(:started, file) if @progress_handler

        File.open(file, 'r') { |f| parse_io(f, options, &block) }

        @progress_handler.call(:finished, file) if @progress_handler
        @progress_handler = nil
      else
        IO.popen(decompress_file?(file), 'r') { |f| parse_io(f, options, &block) }
      end

      @source_changes_handler.call(:finished, @current_source) if @source_changes_handler

      @current_source = nil
    end
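The order in which the handlers fire for an uncompressed file can be sketched with a temporary file. `read_plain_file` is a hypothetical stand-in for that branch of parse_file; it counts lines instead of parsing them and passes the plain path to both handlers, which is a simplification.

```ruby
require 'tempfile'

events = []
source_changes_handler = ->(event, _path)  { events << [:source, event] }
progress_handler       = ->(event, _value) { events << [:progress, event] }

# Hypothetical stand-in for the uncompressed branch: notify, read, notify.
def read_plain_file(file, source_changes_handler, progress_handler)
  source_changes_handler.call(:started, file) if source_changes_handler
  progress_handler.call(:started, file) if progress_handler

  count = File.open(file, 'r') { |f| f.each_line.count }

  progress_handler.call(:finished, file) if progress_handler
  source_changes_handler.call(:finished, file) if source_changes_handler
  count
end

line_count = Tempfile.create('sample_log') do |tmp|
  tmp.write("line one\nline two\n")
  tmp.flush
  read_plain_file(tmp.path, source_changes_handler, progress_handler)
end

puts line_count
events.each { |kind, event| puts "#{kind}: #{event}" }
```

In the listing above, the compressed branch calls only the source change handler, so a progress bar driven by the progress handler would stay silent for compressed input.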
#parse_files(files, options = {}, &block) ⇒ Object
Parses a list of subsequent files of the same format, by calling parse_file for every file in the array.
files
-
The Array of files that should be parsed
options
-
A Hash of options that will be passed to parse_io.
    # File 'lib/request_log_analyzer/source/log_parser.rb', line 70
    def parse_files(files, options = {}, &block) # :yields: request
      files.each { |file| parse_file(file, options, &block) }
    end
#parse_io(io, options = {}, &block) ⇒ Object
This method loops over each line of the input stream. It will try to parse each line as any of the lines that are defined by the current file format (see RequestLogAnalyzer::FileFormat). It will then combine these parsed lines into requests using heuristics. These requests (see RequestLogAnalyzer::Request) will then be yielded for further processing in the pipeline.
-
RequestLogAnalyzer::LineDefinition#matches is called to test if a line matches a line definition of the file format.
-
update_current_request is used to combine parsed lines into requests using heuristics.
-
The method will yield progress updates if a progress handler is installed using progress=
-
The method will yield parse warnings if a warning handler is installed using warning=
io
-
The IO instance to use as source
options
-
A hash of options that can be used by the parser.
    # File 'lib/request_log_analyzer/source/log_parser.rb', line 139
    def parse_io(io, options = {}, &block) # :yields: request
      @current_lineno = 1
      while line = io.gets
        @progress_handler.call(:progress, io.pos) if @progress_handler && @current_lineno % 255 == 0

        if request_data = file_format.parse_line(line) { |wt, message| warn(wt, message) }
          @parsed_lines += 1
          update_current_request(request_data.merge(:source => @current_source, :lineno => @current_lineno), &block)
        end

        @current_lineno += 1
      end

      warn(:unfinished_request_on_eof, "End of file reached, but last request was not completed!") unless @current_request.nil?
      @current_lineno = nil
    end
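The shape of this loop (read a line, report progress every 255 lines, keep what parses) can be reproduced on an in-memory stream. In the sketch below, `scan_lines` is a hypothetical function, and the block stands in for `file_format.parse_line`; neither is part of the library.

```ruby
require 'stringio'

# Minimal stand-in for the loop in parse_io. The block plays the role of the
# file format's line parser and is an assumption of this sketch.
def scan_lines(io, progress_handler = nil)
  lineno = 1
  parsed = []
  while (line = io.gets)
    # Report the current byte position every 255 lines, as parse_io does.
    progress_handler.call(:progress, io.pos) if progress_handler && lineno % 255 == 0

    data = yield(line)
    parsed << data.merge(lineno: lineno) if data

    lineno += 1
  end
  parsed
end

# 600 lines: even-numbered lines look like requests, odd-numbered ones are noise.
log = StringIO.new((1..600).map { |i| i.even? ? "GET /page/#{i}\n" : "noise\n" }.join)

progress_positions = []
handler = ->(event, pos) { progress_positions << pos if event == :progress }

parsed = scan_lines(log, handler) do |line|
  { path: line.split.last } if line.start_with?('GET ')
end

puts "parsed #{parsed.size} lines, #{progress_positions.size} progress updates"
```

With 600 input lines the handler fires at lines 255 and 510. Reporting `io.pos` rather than the line number lets a caller compare the value against the file size to estimate completion.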
#parse_stream(stream, options = {}, &block) ⇒ Object
Parses an IO stream. It will simply call parse_io. This function does not support progress updates because the length of a stream is not known.
stream
-
The IO stream that should be parsed.
options
-
A Hash of options that will be passed to parse_io.
    # File 'lib/request_log_analyzer/source/log_parser.rb', line 123
    def parse_stream(stream, options = {}, &block)
      parse_io(stream, options, &block)
    end
#progress=(proc) ⇒ Object
Add a block to this method to install a progress handler while parsing.
proc
-
The proc that will be called to handle progress update messages
    # File 'lib/request_log_analyzer/source/log_parser.rb', line 158
    def progress=(proc)
      @dormant_progress_handler = proc
    end
#source_changes=(proc) ⇒ Object
Add a block to this method to install a source change handler while parsing.
proc
-
The proc that will be called to handle source changes
    # File 'lib/request_log_analyzer/source/log_parser.rb', line 170
    def source_changes=(proc)
      @source_changes_handler = proc
    end
#warn(type, message) ⇒ Object
This method is called by the parser if it encounters any parsing problems. It will call the installed warning handler, if any.
By default, RequestLogAnalyzer::Controller will install a warning handler that passes the warnings to each aggregator so they can do something useful with them.
type
-
The warning type (a Symbol)
message
-
A message explaining the warning
    # File 'lib/request_log_analyzer/source/log_parser.rb', line 183
    def warn(type, message)
      @warning_handler.call(type, message, @current_lineno) if @warning_handler
    end
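A warning handler therefore receives three arguments: the type, the message, and the current line number. The sketch below uses a stand-in class, `WarningSource`, that exposes the same `warning=` and `warn` pair; it is not the library class, and the `:example_warning` type is invented for the illustration.

```ruby
# Stand-in class with the same warning= / warn pair as documented here.
class WarningSource
  attr_accessor :current_lineno

  def warning=(proc)
    @warning_handler = proc
  end

  def warn(type, message)
    @warning_handler.call(type, message, @current_lineno) if @warning_handler
  end
end

source = WarningSource.new

# Without a handler installed, the warning is silently dropped.
source.warn(:example_warning, "ignored: no handler installed yet")

collected = []
source.warning = lambda { |type, message, lineno| collected << [type, message, lineno] }

source.current_lineno = 42
source.warn(:unfinished_request_on_eof, "End of file reached, but last request was not completed!")

collected.each { |type, message, lineno| puts "line #{lineno}: [#{type}] #{message}" }
```

Because a lambda checks its arity strictly, a handler written as a lambda needs to accept all three arguments; a plain proc or block would be more forgiving.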
#warning=(proc) ⇒ Object
Add a block to this method to install a warning handler while parsing.
proc
-
The proc that will be called to handle parse warning messages
    # File 'lib/request_log_analyzer/source/log_parser.rb', line 164
    def warning=(proc)
      @warning_handler = proc
    end