Method: Grubby#initialize

Defined in: lib/grubby.rb

#initialize(journal = nil) ⇒ `Grubby`

Returns a new instance of Grubby.

Parameters:

journal (Pathname, String) (defaults to: nil) —

Optional journal file used to ensure that #fulfill processes each resource only once across multiple program runs.
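
The idea of a journal that enforces only-once processing across runs can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `SimpleJournal` class, its one-key-per-line file format, and the `once` method are hypothetical and are not Grubby's actual journal implementation or file format.

```ruby
require "set"
require "tmpdir"

# Hypothetical journal (not Grubby's real format): one processed key per line.
class SimpleJournal
  def initialize(path)
    @path = path.to_s # accepts a Pathname or a String
    @seen = File.exist?(@path) ? Set.new(File.readlines(@path, chomp: true)) : Set.new
  end

  # Runs the block only if `key` has not been recorded before, then records
  # the key on disk. Returns true if the block ran, false if it was skipped.
  def once(key)
    return false if @seen.include?(key)
    yield
    File.open(@path, "a") { |file| file.puts(key) }
    @seen << key
    true
  end
end

Dir.mktmpdir do |dir|
  journal_path = File.join(dir, "journal.txt")
  # Two separate instances stand in for two separate program runs.
  SimpleJournal.new(journal_path).once("page-1") { puts "processing page-1" }
  SimpleJournal.new(journal_path).once("page-1") { puts "never printed" }
end
```

Because the key is written only after the block returns, a run that fails partway through leaves the resource unrecorded, so a later run would retry it.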

# File 'lib/grubby.rb', line 42

def initialize(journal = nil)
  super()

  # Prevent "memory leaks", and prevent mistakenly blank urls from
  # resolving.  (Blank urls resolve as a path relative to the last
  # history entry.  Without this setting, an erroneous `agent.get("")`
  # could sometimes successfully fetch a page.)
  self.max_history = 0

  # Prevent files of unforeseen content type from being buffered into
  # memory by default, in case they are very large.  However, increase
  # the threshold for what is considered "large", to prevent
  # unnecessary writes to disk.
  #
  # References:
  #   - http://docs.seattlerb.org/mechanize/Mechanize/PluggableParser.html
  #   - http://docs.seattlerb.org/mechanize/Mechanize/Download.html
  #   - http://docs.seattlerb.org/mechanize/Mechanize/File.html
  self.max_file_buffer = 1_000_000 # only applies to Mechanize::Download
  self.pluggable_parser.default = Mechanize::Download
  self.pluggable_parser["text/plain"] = Mechanize::File
  self.pluggable_parser["application/json"] = Grubby::JsonParser

  # Set up configurable rate limiting, and choose a reasonable default
  # rate limit.
  self.pre_connect_hooks << Proc.new{ self.send(:sleep_between_requests) }
  self.post_connect_hooks << Proc.new do |agent, uri, response, body|
    self.send(:mark_last_request_time, (Time.now unless response.code.to_s.start_with?("3")))
  end
  self.time_between_requests = 1.0

  self.journal = journal
end
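
The rate-limiting hooks in the listing above call two private methods, `sleep_between_requests` and `mark_last_request_time`, whose bodies are not shown here. The following is a minimal sketch of how such a pair might behave, assuming the pre-connect step waits out whatever remains of the interval since the last recorded request, and that a redirect (3xx) response clears the recorded time so the follow-up request is not delayed. The `RateLimiter` class, its method names, and the injectable `clock` and `sleeper` parameters are hypothetical and exist only to make the logic testable without real waiting.

```ruby
# Hypothetical stand-in for the two connect hooks; not Grubby's actual code.
class RateLimiter
  attr_reader :last_request_time

  # `clock` and `sleeper` are injectable so the logic can run without waiting.
  def initialize(time_between_requests,
                 clock: -> { Time.now },
                 sleeper: ->(seconds) { sleep(seconds) })
    @time_between_requests = time_between_requests
    @clock = clock
    @sleeper = sleeper
    @last_request_time = nil
  end

  # Pre-connect step: sleep for the unexpired part of the interval, if any.
  # Returns the number of seconds waited.
  def before_request
    return 0.0 if @last_request_time.nil?
    remaining = @time_between_requests - (@clock.call - @last_request_time)
    return 0.0 unless remaining > 0
    @sleeper.call(remaining)
    remaining
  end

  # Post-connect step: record the time, except after a 3xx response.
  def after_request(response_code)
    @last_request_time = response_code.to_s.start_with?("3") ? nil : @clock.call
  end
end

demo = RateLimiter.new(1.0)
demo.after_request("302") # redirect: no time is recorded
demo.before_request       # nothing recorded, so this does not sleep
```

Passing `nil` for redirects matches the expression `(Time.now unless response.code.to_s.start_with?("3"))` in the listing, which evaluates to `nil` whenever the response code begins with "3".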
