Class: HTTPTools::Parser

Inherits:
Object
  • Object
Includes:
Encoding
Defined in:
lib/http_tools/parser.rb

Overview

HTTPTools::Parser is a pure Ruby HTTP request & response parser with an evented API.

The HTTP message can be fed into the parser piece by piece as it comes over the wire, and the parser will call its callbacks as it works its way through the message.

Example:

parser = HTTPTools::Parser.new
parser.on(:header) do |header|
  puts "#{parser.status_code} #{parser.message}"
  puts parser.header.inspect
end
parser.on(:stream) {|chunk| print chunk}

parser << "HTTP/1.1 200 OK\r\n"
parser << "Content-Length: 20\r\n\r\n"
parser << "<h1>Hello world</h1>"

Prints:

200 OK
{"Content-Length" => "20"}
<h1>Hello world</h1>

Constant Summary

COLON =
":".freeze
KEY_TERMINATOR =
": ".freeze
CONTENT_LENGTH =
"Content-Length".freeze
TRANSFER_ENCODING =
"Transfer-Encoding".freeze
TRAILER =
"Trailer".freeze
CONNECTION =
"Connection".freeze
CLOSE =
"close".freeze
CHUNKED =
"chunked".freeze
EVENTS =
%W{header stream trailer finish error}.map do |event|
  event.freeze
end.freeze
REQUEST_METHOD =
"REQUEST_METHOD".freeze
PATH_INFO =
"PATH_INFO".freeze
QUERY_STRING =
"QUERY_STRING".freeze
REQUEST_URI =
"REQUEST_URI".freeze
FRAGMENT =
"FRAGMENT".freeze
PROTOTYPE_ENV =
{
"SCRIPT_NAME" => "".freeze,
PATH_INFO => "/".freeze,
QUERY_STRING => "".freeze,
"rack.version" => [1, 1].freeze,
"rack.url_scheme" => "http".freeze,
"rack.errors" => STDERR,
"rack.multithread" => false,
"rack.multiprocess" => false,
"rack.run_once" => false}.freeze
HTTP_ =
"HTTP_".freeze
LOWERCASE =
"a-z-".freeze
UPPERCASE =
"A-Z_".freeze

Constants included from Encoding

Encoding::AMPERSAND, Encoding::CHUNK_FORMAT, Encoding::EQUALS, Encoding::HEX_BIG_ENDIAN_2_BYTES, Encoding::HEX_BIG_ENDIAN_REPEATING, Encoding::PERCENT, Encoding::PLUS

Methods included from Encoding

transfer_encoding_chunked_decode, transfer_encoding_chunked_encode, url_decode, url_encode, www_form_decode, www_form_encode

Constructor Details

#initialize ⇒ Parser

:call-seq: Parser.new -> parser

Create a new HTTPTools::Parser.



# File 'lib/http_tools/parser.rb', line 82

def initialize
  @state = :start
  @buffer = StringScanner.new("")
  @buffer_backup_reference = @buffer
  @header = {}
  @trailer = {}
end

Instance Attribute Details

#allow_html_without_header ⇒ Object

Allow responses with no status line or headers if it looks like HTML.



# File 'lib/http_tools/parser.rb', line 76

def allow_html_without_header
  @allow_html_without_header
end

#force_no_body ⇒ Object

Skip parsing the body, e.g. with the response to a HEAD request.



# File 'lib/http_tools/parser.rb', line 73

def force_no_body
  @force_no_body
end

#force_trailer ⇒ Object

Force the parser to expect and parse a trailer when the Trailer header is missing.



# File 'lib/http_tools/parser.rb', line 70

def force_trailer
  @force_trailer
end

#fragment ⇒ Object (readonly)

Returns the value of attribute fragment.



# File 'lib/http_tools/parser.rb', line 66

def fragment
  @fragment
end

#header ⇒ Object (readonly)

Returns the value of attribute header.



# File 'lib/http_tools/parser.rb', line 66

def header
  @header
end

#message ⇒ Object (readonly)

Returns the value of attribute message.



# File 'lib/http_tools/parser.rb', line 66

def message
  @message
end

#path_info ⇒ Object (readonly)

Returns the value of attribute path_info.



# File 'lib/http_tools/parser.rb', line 66

def path_info
  @path_info
end

#query_string ⇒ Object (readonly)

Returns the value of attribute query_string.



# File 'lib/http_tools/parser.rb', line 66

def query_string
  @query_string
end

#request_method ⇒ Object (readonly)

Returns the value of attribute request_method.



# File 'lib/http_tools/parser.rb', line 66

def request_method
  @request_method
end

#request_uri ⇒ Object (readonly)

Returns the value of attribute request_uri.



# File 'lib/http_tools/parser.rb', line 66

def request_uri
  @request_uri
end

#state ⇒ Object (readonly)

:nodoc:



# File 'lib/http_tools/parser.rb', line 65

def state
  @state
end

#status_code ⇒ Object (readonly)

Returns the value of attribute status_code.



# File 'lib/http_tools/parser.rb', line 66

def status_code
  @status_code
end

#trailer ⇒ Object (readonly)

Returns the value of attribute trailer.



# File 'lib/http_tools/parser.rb', line 66

def trailer
  @trailer
end

#version ⇒ Object (readonly)

Returns the value of attribute version.



# File 'lib/http_tools/parser.rb', line 66

def version
  @version
end

Instance Method Details

#add_listener(event, proc = nil, &block) ⇒ Object Also known as: on

:call-seq:
  parser.add_listener(event) {|arg1 [, arg2]| block} -> parser
  parser.add_listener(event, proc) -> parser
  parser.on(event) {|arg1 [, arg2]| block} -> parser
  parser.on(event, proc) -> parser

Available events are :header, :stream, :trailer, :finish, and :error.

Adding a second callback for an event will overwrite the existing callback or delegate.

Events:

header

Called when headers are complete

stream

Supplied with one argument, the last chunk of body data fed in to the parser as a String, e.g. “<h1>Hello”

trailer

Called on the completion of the trailer, if present

finish

Supplied with one argument, any data left in the parser’s buffer after the end of the HTTP message (likely nil, but possibly the start of the next message)

error

Supplied with one argument, the error encountered while parsing, as an HTTPTools::ParseError. If no listener is registered for this event, an exception will be raised when an error is encountered.



# File 'lib/http_tools/parser.rb', line 231

def add_listener(event, proc=nil, &block)
  instance_variable_set(:"@#{event}_callback", proc || block)
  self
end
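
The overwrite behaviour follows from storing one callback per event in an instance variable. The sketch below reproduces only that storage pattern in a hypothetical stand-in class (TinyEmitter and its emit helper are illustrative names, not part of HTTPTools), so it runs without the library:

```ruby
# Hypothetical stand-in class; it mirrors only the callback storage shown above.
class TinyEmitter
  def add_listener(event, proc = nil, &block)
    # One instance variable per event, so a later registration replaces an earlier one.
    instance_variable_set(:"@#{event}_callback", proc || block)
    self
  end
  alias on add_listener

  # Illustrative helper, not part of the documented API: invoke the handler if one is set.
  def emit(event, *args)
    name = :"@#{event}_callback"
    callback = instance_variable_defined?(name) ? instance_variable_get(name) : nil
    callback.call(*args) if callback
  end
end

seen = []
emitter = TinyEmitter.new
emitter.on(:stream) { |chunk| seen << "first: #{chunk}" }
emitter.on(:stream) { |chunk| seen << "second: #{chunk}" } # replaces the first handler
emitter.emit(:stream, "<h1>Hello")
emitter.emit(:finish) # no handler registered, so nothing happens
```

Only the second :stream handler runs. Because add_listener returns self, registrations can also be chained.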

#concat(data) ⇒ Object Also known as: <<

:call-seq:
  parser.concat(data) -> parser
  parser << data -> parser

Feed data in to the parser and trigger callbacks.

Will raise HTTPTools::ParseError on error, unless a callback has been set for the :error event, in which case the callback will receive the error instead.



# File 'lib/http_tools/parser.rb', line 99

def concat(data)
  @buffer << data
  @state = send(@state)
  self
end
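
The line `@state = send(@state)` treats each state as a method that consumes what it can from the buffer and returns the name of the next state. A minimal sketch of that pattern, assuming a hypothetical TinyStateMachine with made-up states (start_line, header_line, done) that only collects header lines:

```ruby
require "strscan"

# Hypothetical class; state names and behaviour are illustrative, not HTTPTools internals.
class TinyStateMachine
  attr_reader :state, :lines

  def initialize
    @state = :start_line
    @buffer = StringScanner.new(String.new)
    @lines = []
  end

  def concat(data)
    @buffer << data
    @state = send(@state) # run the current state; it returns the next one
    self
  end
  alias << concat

  private

  def start_line
    line = @buffer.scan(/.*?\r\n/)
    return :start_line unless line # incomplete line, wait for more data
    @lines << line.chomp
    header_line
  end

  def header_line
    while (line = @buffer.scan(/.*?\r\n/))
      return :done if line == "\r\n" # blank line ends the header block
      @lines << line.chomp
    end
    :header_line # ran out of complete lines, resume here on the next concat
  end

  def done
    :done
  end
end

machine = TinyStateMachine.new
machine << "HTTP/1.1 200 OK\r\nContent-Len" # second line is split across two writes
machine << "gth: 20\r\n\r\n"
```

Because unconsumed bytes stay in the StringScanner, a line split across two writes is picked up once the rest arrives.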

#env ⇒ Object

:call-seq: parser.env -> hash or nil

Returns a Rack compatible environment hash. Will return nil if called before headers are complete.

The following are not supplied, and must be added to make the environment hash fully Rack compliant: SERVER_NAME, SERVER_PORT, rack.input



# File 'lib/http_tools/parser.rb', line 114

def env
  return unless @header_complete
  env = PROTOTYPE_ENV.merge(
    REQUEST_METHOD => @request_method,
    REQUEST_URI => @request_uri)
  if @path_info
    env[PATH_INFO] = @path_info
    env[QUERY_STRING] = @query_string
  end
  env[FRAGMENT] = @fragment if @fragment
  @header.each {|k, val| env[HTTP_ + k.tr(LOWERCASE, UPPERCASE)] = val}
  @trailer.each {|k, val| env[HTTP_ + k.tr(LOWERCASE, UPPERCASE)] = val}
  env
end
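
The header-to-environment key translation in the source above relies on String#tr with the LOWERCASE and UPPERCASE constants. A standalone sketch of that step (rack_key is a hypothetical helper name; the constants are copied from the Constant Summary):

```ruby
HTTP_ = "HTTP_".freeze
LOWERCASE = "a-z-".freeze
UPPERCASE = "A-Z_".freeze

# Hypothetical helper: upcase the letters and turn "-" into "_", then add the prefix.
def rack_key(header_name)
  HTTP_ + header_name.tr(LOWERCASE, UPPERCASE)
end

header = {"Content-Length" => "20", "X-Request-Id" => "abc"}
env = {}
header.each { |k, val| env[rack_key(k)] = val }

rack_key("Content-Length") # → "HTTP_CONTENT_LENGTH"
```

As the source above shows, every header and trailer key receives the HTTP_ prefix, including Content-Length.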

#finish ⇒ Object

:call-seq: parser.finish -> parser

Used to notify the parser that the message has finished when this cannot be determined from the message itself.

For example, when a server does not set a content length, and instead relies on closing the connection to signify the body end.

until parser.finished?
  begin
    parser << socket.sysread(1024 * 16)
  rescue EOFError
    parser.finish
    break
  end
end

This method cannot be used to interrupt parsing from within a callback.

As the source below shows, this will raise HTTPTools::EmptyMessageError if called before any data has been fed in, or HTTPTools::MessageIncompleteError if called at any other point where the message cannot be ended, unless a callback has been set for the :error event, in which case the callback will receive the error instead.



# File 'lib/http_tools/parser.rb', line 152

def finish
  if @state == :body_on_close
    @state = end_of_message
  elsif @state == :body_chunked && @header[CONNECTION] == CLOSE &&
    !@header[TRAILER] && @buffer.eos?
    @state = end_of_message
  elsif @state == :start && @buffer.string.length < 1
    raise EmptyMessageError.new("Message empty")
  else
    raise MessageIncompleteError.new("Message ended early")
  end
  self
end
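
The read loop from the description above can be tried without a network connection. The following is a sketch under stated assumptions: a StringIO stands in for the socket, and CloseDelimitedBody is a hypothetical stand-in for a parser whose body only ends when the connection closes. Neither is part of HTTPTools.

```ruby
require "stringio"

# Hypothetical stand-in for a parser in the "body ends on close" situation.
class CloseDelimitedBody
  attr_reader :body

  def initialize
    @body = String.new
    @finished = false
  end

  def <<(data)
    @body << data
    self
  end

  def finish
    @finished = true
    self
  end

  def finished?
    @finished
  end
end

socket = StringIO.new("<h1>Hello world</h1>") # plays the role of the socket
parser = CloseDelimitedBody.new

until parser.finished?
  begin
    parser << socket.sysread(8) # sysread raises EOFError at end of input
  rescue EOFError
    parser.finish               # the close is the only end-of-body signal
    break
  end
end
```

The loop reads in small pieces until sysread raises EOFError, and only then tells the parser that the message is over.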

#finished? ⇒ Boolean

:call-seq: parser.finished? -> bool

Returns true when the parser has come to the end of the message, false otherwise.

Some HTTP servers may not supply the information needed to determine the end of the message (e.g., no content length) and instead close the connection to signify the end of the message; see #finish for how to deal with this.

Returns:

  • (Boolean)


# File 'lib/http_tools/parser.rb', line 176

def finished?
  @state == :end_of_message
end

#reset ⇒ Object

:call-seq: parser.reset -> parser

Reset the parser so it can be used to process a new request. Callbacks/delegates will not be removed.



# File 'lib/http_tools/parser.rb', line 185

def reset
  @state = :start
  @buffer = @buffer_backup_reference
  @buffer.string.replace("")
  @buffer.reset
  @request_method = nil
  @path_info = nil
  @query_string = nil
  @request_uri = nil
  @fragment = nil
  @version = nil
  @status_code = nil
  @header = {}
  @trailer = {}
  @last_key = nil
  @content_left = nil
  self
end
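
The buffer handling in reset reuses the existing StringScanner rather than allocating a new one: the underlying string is emptied in place and the scan pointer is rewound. A small standalone sketch of just those two calls, using only the standard strscan library:

```ruby
require "strscan"

buffer = StringScanner.new(String.new("HTTP/1.1 200 OK\r\n"))
buffer.scan(/HTTP\/\d\.\d /) # consume part of the input
pos_before = buffer.pos      # scan pointer now sits after "HTTP/1.1 "

buffer.string.replace("")    # empty the underlying string in place
buffer.reset                 # move the scan pointer back to the start

buffer << "GET / HTTP/1.1\r\n" # the same scanner is ready for the next message
```

After these calls the scanner holds only the newly appended data, with the scan pointer at position 0.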