Class: Mechanize

Inherits:
Object
Defined in:
lib/mechanize.rb

Overview

The Mechanize library is used for automating interactions with a website. It can follow links and submit forms. Form fields can be populated and submitted. A history of URLs is maintained and can be queried.

Example

require 'mechanize'
require 'logger'

agent = Mechanize.new
agent.log = Logger.new "mech.log"
agent.user_agent_alias = 'Mac Safari'

page = agent.get "http://www.google.com/"
search_form = page.form_with :name => "f"
search_form.field_with(:name => "q").value = "Hello"

search_results = agent.submit search_form
puts search_results.body

Defined Under Namespace

Modules: ElementMatcher, Parser

Classes: ContentTypeError, Cookie, CookieJar, Download, Error, File, FileConnection, FileRequest, FileResponse, FileSaver, Form, HTTP, Headers, History, Page, PluggableParser, RedirectLimitReachedError, RedirectNotGetOrHeadError, ResponseCodeError, ResponseReadError, RobotsDisallowedError, TestCase, UnauthorizedError, UnsupportedSchemeError, Util

Constant Summary

VERSION =

The version of Mechanize you are using.

'2.1'
AGENT_ALIASES =

Supported User-Agent aliases for use with user_agent_alias=. The description in parentheses is for informational purposes and is not part of the alias name.

  • Linux Firefox (3.6.1)

  • Linux Konqueror (3)

  • Linux Mozilla

  • Mac FireFox (3.6)

  • Mac Mozilla

  • Mac Safari (5)

  • Mac Safari 4

  • Mechanize (default)

  • Windows IE 6

  • Windows IE 7

  • Windows IE 8

  • Windows IE 9

  • Windows Mozilla

  • iPhone (3.0)

Example:

agent = Mechanize.new
agent.user_agent_alias = 'Mac Safari'
{
  'Mechanize' => "Mechanize/#{VERSION} Ruby/#{ruby_version} (http://github.com/tenderlove/mechanize/)",
  'Linux Firefox' => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.1) Gecko/20100122 firefox/3.6.1',
  'Linux Konqueror' => 'Mozilla/5.0 (compatible; Konqueror/3; Linux)',
  'Linux Mozilla' => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030624',
  'Mac FireFox' => 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6',
  'Mac Mozilla' => 'Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4a) Gecko/20030401',
  'Mac Safari 4' => 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_2; de-at) AppleWebKit/531.21.8 (KHTML, like Gecko) Version/4.0.4 Safari/531.21.10',
  'Mac Safari' => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_2) AppleWebKit/534.51.22 (KHTML, like Gecko) Version/5.1.1 Safari/534.51.22',
  'Windows IE 6' => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',
  'Windows IE 7' => 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)',
  'Windows IE 8' => 'Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727)',
  'Windows IE 9' => 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)',
  'Windows Mozilla' => 'Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4b) Gecko/20030516 Mozilla Firebird/0.6',
  'iPhone' => 'Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en) AppleWebKit/420+ (KHTML, like Gecko) Version/3.0 Mobile/1C28 Safari/419.3',
}

Constructor Details

#initialize {|_self| ... } ⇒ Mechanize

Creates a new mechanize instance. If a block is given, the created instance is yielded to the block for setting up pre-connection state such as SSL parameters or proxies:

agent = Mechanize.new do |a|
  # proxy_addr and proxy_port are read-only, so use set_proxy
  a.set_proxy 'proxy.example', 8080
end

Yields:

  • (_self)

Yield Parameters:

  • _self (Mechanize)

    the object that the method was called on



# File 'lib/mechanize.rb', line 114

def initialize
  @agent = Mechanize::HTTP::Agent.new
  @agent.context = self
  @log = nil

  # attr_accessors
  @agent.user_agent = AGENT_ALIASES['Mechanize']
  @watch_for_set    = nil
  @history_added    = nil

  # attr_readers
  @pluggable_parser = PluggableParser.new

  @keep_alive_time  = 0

  # Proxy
  @proxy_addr = nil
  @proxy_port = nil
  @proxy_user = nil
  @proxy_pass = nil

  @html_parser = self.class.html_parser

  @default_encoding = nil
  @force_default_encoding = false

  yield self if block_given?

  @agent.set_proxy @proxy_addr, @proxy_port, @proxy_user, @proxy_pass
  @agent.set_http
end
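
The ordering in the constructor above matters: the block runs after the defaults are assigned but before the proxy and HTTP settings are applied, so values set inside the block take effect. A minimal standalone sketch of that pattern follows. TinyClient and its attributes are hypothetical stand-ins, not part of Mechanize.

```ruby
# Hypothetical stand-in class illustrating the yield-self constructor pattern.
class TinyClient
  attr_accessor :proxy_port   # writable so the block can change it
  attr_reader :applied_port   # records what was "applied" after the block ran

  def initialize
    @proxy_port = nil              # defaults are assigned first
    yield self if block_given?     # the caller may override them here
    @applied_port = @proxy_port    # settings are applied after the block
  end
end

client = TinyClient.new { |c| c.proxy_port = 8080 }
puts client.applied_port
```

Without a block the defaults are applied unchanged.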

Class Attribute Details

.html_parser ⇒ Object

Default HTML parser for all mechanize instances

Mechanize.html_parser = Nokogiri::XML


# File 'lib/mechanize.rb', line 464

def html_parser
  @html_parser
end

.log ⇒ Object

Default logger for all mechanize instances

Mechanize.log = Logger.new $stderr


# File 'lib/mechanize.rb', line 471

def log
  @log
end

Instance Attribute Details

#agent ⇒ Object (readonly)

:section: Utilities



# File 'lib/mechanize.rb', line 959

def agent
  @agent
end

#default_encoding ⇒ Object

A default encoding name used when parsing HTML. When set, it is used after any other encoding. The default is nil.



# File 'lib/mechanize.rb', line 479

def default_encoding
  @default_encoding
end

#force_default_encoding ⇒ Object

Overrides the encodings given by the HTTP server and the HTML page with the default_encoding when set to true.



# File 'lib/mechanize.rb', line 485

def force_default_encoding
  @force_default_encoding
end

#history_added ⇒ Object

Callback which is invoked with the page that was added to history.



# File 'lib/mechanize.rb', line 218

def history_added
  @history_added
end

#html_parser ⇒ Object

The HTML parser to be used when parsing documents



# File 'lib/mechanize.rb', line 490

def html_parser
  @html_parser
end

#keep_alive_time ⇒ Object

HTTP/1.0 keep-alive time. This is no longer supported by mechanize, as it now uses net-http-persistent, which only supports HTTP/1.1 persistent connections.



# File 'lib/mechanize.rb', line 497

def keep_alive_time
  @keep_alive_time
end

#pluggable_parser ⇒ Object (readonly)

:nodoc:



# File 'lib/mechanize.rb', line 961

def pluggable_parser
  @pluggable_parser
end

#proxy_addr ⇒ Object (readonly)

The HTTP proxy address



# File 'lib/mechanize.rb', line 502

def proxy_addr
  @proxy_addr
end

#proxy_pass ⇒ Object (readonly)

The HTTP proxy password



# File 'lib/mechanize.rb', line 507

def proxy_pass
  @proxy_pass
end

#proxy_port ⇒ Object (readonly)

The HTTP proxy port



# File 'lib/mechanize.rb', line 512

def proxy_port
  @proxy_port
end

#proxy_user ⇒ Object (readonly)

The HTTP proxy username



# File 'lib/mechanize.rb', line 517

def proxy_user
  @proxy_user
end

#watch_for_set ⇒ Object

The value of watch_for_set is passed to pluggable parsers for retrieved content



# File 'lib/mechanize.rb', line 834

def watch_for_set
  @watch_for_set
end

Class Method Details

.inherited(child) ⇒ Object

:nodoc:



# File 'lib/mechanize.rb', line 98

def self.inherited(child) # :nodoc:
  child.html_parser ||= html_parser
  child.log ||= log
  super
end
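
The inherited hook above copies the class-level html_parser and log settings into each subclass at the moment the subclass is defined. A minimal standalone sketch of the same mechanism follows. BaseAgent and ChildAgent are hypothetical names used only for illustration.

```ruby
# Hypothetical classes showing how an inherited hook seeds class-level defaults.
class BaseAgent
  class << self
    attr_accessor :log

    def inherited(child)
      child.log ||= log   # copy the parent's value unless the child has one
      super
    end
  end
end

BaseAgent.log = :parent_logger
class ChildAgent < BaseAgent; end   # the hook fires here
BaseAgent.log = :replaced_later     # later changes do not reach ChildAgent

p ChildAgent.log
p BaseAgent.log
```

Because the value is copied rather than looked up, a subclass keeps the setting it received when it was defined.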

Instance Method Details

#auth(user, password) ⇒ Object Also known as: basic_auth

Sets the user and password to be used for HTTP authentication.



# File 'lib/mechanize.rb', line 522

def auth(user, password)
  @agent.user     = user
  @agent.password = password
end

#back ⇒ Object

Equivalent to the browser back button. Returns the previous page visited.



# File 'lib/mechanize.rb', line 153

def back
  @agent.history.pop
end

#ca_file ⇒ Object

Path to an OpenSSL server certificate file



# File 'lib/mechanize.rb', line 844

def ca_file
  @agent.ca_file
end

#ca_file=(ca_file) ⇒ Object

Sets the certificate file used for SSL connections



# File 'lib/mechanize.rb', line 851

def ca_file= ca_file
  @agent.ca_file = ca_file
end

#cert ⇒ Object

An OpenSSL client certificate or the path to a certificate file.



# File 'lib/mechanize.rb', line 858

def cert
  @agent.cert
end

#cert=(cert) ⇒ Object

Sets the OpenSSL client certificate cert to the given path or certificate instance



# File 'lib/mechanize.rb', line 866

def cert= cert
  @agent.cert = cert
end

#cert_store ⇒ Object

An OpenSSL certificate store for verifying server certificates. This defaults to the default certificate store.



# File 'lib/mechanize.rb', line 874

def cert_store
  @agent.cert_store
end

#cert_store=(cert_store) ⇒ Object

Sets the OpenSSL certificate store to cert_store.



# File 'lib/mechanize.rb', line 881

def cert_store= cert_store
  @agent.cert_store = cert_store
end

#certificate ⇒ Object

What is this?

Why is it different from #cert?



# File 'lib/mechanize.rb', line 890

def certificate # :nodoc:
  @agent.certificate
end

#click(link) ⇒ Object

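If the parameter is a String or Regexp, finds the link on the current page whose text matches it, or failing that the submit button whose value matches it, and clicks it. If the parameter is a Mechanize::Page::Link, clicks that link. Any other object is followed through its href (or src) attribute. Returns the page fetched, or nil if no matching link or button is found.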



# File 'lib/mechanize.rb', line 245

def click link
  case link
  when Page::Link then
    referer = link.page || current_page()
    if @agent.robots
      if (referer.is_a?(Page) and referer.parser.nofollow?) or
         link.rel?('nofollow') then
        raise RobotsDisallowedError.new(link.href)
      end
    end
    if link.rel?('noreferrer')
      href = @agent.resolve(link.href, link.page || current_page)
      referer = Page.new(nil, {'content-type'=>'text/html'})
    else
      href = link.href
    end
    get href, [], referer
  when String, Regexp then
    if real_link = page.link_with(:text => link)
      click real_link
    else
      button = nil
      form = page.forms.find do |f|
        button = f.button_with(:value => link)
        button.is_a? Form::Submit
      end
      submit form, button if form
    end
  else
    referer = current_page()
    href = link.respond_to?(:href) ? link.href :
      (link['href'] || link['src'])
    get href, [], referer
  end
end

#conditional_requests ⇒ Object

Are If-Modified-Since conditional requests enabled?



# File 'lib/mechanize.rb', line 532

def conditional_requests
  @agent.conditional_requests
end

#conditional_requests=(enabled) ⇒ Object

Enables or disables If-Modified-Since conditional requests (enabled by default). Pass false to disable them.



# File 'lib/mechanize.rb', line 539

def conditional_requests= enabled
  @agent.conditional_requests = enabled
end

#content_encoding_hooks ⇒ Object

A list of hooks to call before reading the response header ‘content-encoding’.

Each hook is called with the agent making the request, the URI of the request, the response, and an IO containing the response body.



# File 'lib/mechanize.rb', line 211

def content_encoding_hooks
  @agent.content_encoding_hooks
end

#cookie_jar ⇒ Object

A Mechanize::CookieJar which stores cookies



# File 'lib/mechanize.rb', line 546

def cookie_jar
  @agent.cookie_jar
end

#cookie_jar=(cookie_jar) ⇒ Object

Replaces the cookie jar with cookie_jar



# File 'lib/mechanize.rb', line 553

def cookie_jar= cookie_jar
  @agent.cookie_jar = cookie_jar
end

#cookies ⇒ Object

Returns a list of cookies stored in the cookie jar.



# File 'lib/mechanize.rb', line 560

def cookies
  @agent.cookie_jar.to_a
end

#current_page ⇒ Object Also known as: page

Returns the latest page loaded by Mechanize



# File 'lib/mechanize.rb', line 160

def current_page
  @agent.current_page
end

#delete(uri, query_params = {}, headers = {}) ⇒ Object

Sends a DELETE request to uri with query_params and the given headers:

delete('http://example/', {'q' => 'foo'}, {})


# File 'lib/mechanize.rb', line 286

def delete(uri, query_params = {}, headers = {})
  page = @agent.fetch(uri, :delete, headers, query_params)
  add_to_history(page)
  page
end

#follow_meta_refresh ⇒ Object

Follow HTML meta refresh and HTTP Refresh headers. If set to :anywhere meta refresh tags outside of the head element will be followed.



# File 'lib/mechanize.rb', line 568

def follow_meta_refresh
  @agent.follow_meta_refresh
end

#follow_meta_refresh=(follow) ⇒ Object

Controls following of HTML meta refresh and HTTP Refresh headers in responses.



# File 'lib/mechanize.rb', line 576

def follow_meta_refresh= follow
  @agent.follow_meta_refresh = follow
end

#follow_meta_refresh_self ⇒ Object

Follow an HTML meta refresh and HTTP Refresh headers that have no “url=” in the content attribute.

Defaults to false to prevent infinite refresh loops.



# File 'lib/mechanize.rb', line 586

def follow_meta_refresh_self
  @agent.follow_meta_refresh_self
end

#follow_meta_refresh_self=(follow) ⇒ Object

Alters the following of HTML meta refresh and HTTP Refresh headers that point to the same page.



# File 'lib/mechanize.rb', line 594

def follow_meta_refresh_self= follow
  @agent.follow_meta_refresh_self = follow
end

#get(uri, parameters = [], referer = nil, headers = {}) {|page| ... } ⇒ Object

GET the uri with the given request parameters, referer and headers.

The referer may be a URI or a page.

Yields:

  • (page)



# File 'lib/mechanize.rb', line 298

def get(uri, parameters = [], referer = nil, headers = {})
  method = :get

  referer ||=
    if uri.to_s =~ %r{\Ahttps?://}
      Page.new(nil, {'content-type'=>'text/html'})
    else
      current_page || Page.new(nil, {'content-type'=>'text/html'})
    end

  # FIXME: Huge hack so that using a URI as a referer works.  I need to
  # refactor everything to pass around URIs but still support
  # Mechanize::Page#base
  unless Mechanize::Parser === referer then
    referer = referer.is_a?(String) ?
    Page.new(URI.parse(referer), {'content-type' => 'text/html'}) :
      Page.new(referer, {'content-type' => 'text/html'})
  end

  # fetch the page
  headers ||= {}
  page = @agent.fetch uri, method, headers, parameters, referer
  add_to_history(page)
  yield page if block_given?
  page
end
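
When no referer is passed, the method above picks one based on whether uri looks like an absolute http or https URL: a blank page for absolute URLs, otherwise the current page. The check can be tried in isolation. absolute_http? is a hypothetical helper name, and the sketch only reproduces the regular expression from the source above.

```ruby
require 'uri'

# Hypothetical helper mirroring the test in #get: true when the URI is an
# absolute http or https URL, false for anything else.
def absolute_http?(uri)
  !!(uri.to_s =~ %r{\Ahttps?://})
end

puts absolute_http?('http://example.com/a')
puts absolute_http?(URI.parse('https://example.com/'))
puts absolute_http?('/relative/path')
puts absolute_http?('ftp://example.com/')
```

The pattern is case-sensitive and anchored at the start of the string, so relative paths and other schemes fall through to the current-page branch.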

#get_file(url) ⇒ Object

GET url and return only its contents



# File 'lib/mechanize.rb', line 328

def get_file(url)
  get(url).body
end

#gzip_enabled ⇒ Object

Is gzip compression of responses enabled?



# File 'lib/mechanize.rb', line 601

def gzip_enabled
  @agent.gzip_enabled
end

#gzip_enabled=(enabled) ⇒ Object

Enables or disables HTTP/1.1 gzip compression (enabled by default). Pass false to disable it.



# File 'lib/mechanize.rb', line 608

def gzip_enabled=enabled
  @agent.gzip_enabled = enabled
end

#head(uri, query_params = {}, headers = {}) {|page| ... } ⇒ Object

Sends a HEAD request to uri with query_params and the given headers:

head('http://example/', {'q' => 'foo'}, {})

Yields:

  • (page)



# File 'lib/mechanize.rb', line 337

def head(uri, query_params = {}, headers = {})
  # fetch the page
  page = @agent.fetch(uri, :head, headers, query_params)
  yield page if block_given?
  page
end

#history ⇒ Object

The history of this mechanize run



# File 'lib/mechanize.rb', line 169

def history
  @agent.history
end

#idle_timeout ⇒ Object

Connections that have not been used in this many seconds will be reset.



# File 'lib/mechanize.rb', line 615

def idle_timeout
  @agent.idle_timeout
end

#idle_timeout=(idle_timeout) ⇒ Object

Sets the idle timeout to idle_timeout. The default timeout is 5 seconds. If you experience “too many connection resets”, reducing this value may help.



# File 'lib/mechanize.rb', line 623

def idle_timeout= idle_timeout
  @agent.idle_timeout = idle_timeout
end

#keep_alive ⇒ Object

Are HTTP/1.1 keep-alive connections enabled?



# File 'lib/mechanize.rb', line 630

def keep_alive
  @agent.keep_alive
end

#keep_alive=(enable) ⇒ Object

Disables HTTP/1.1 keep-alive connections if enable is false. If you are experiencing “too many connection resets” errors, setting this to false will eliminate them.

You should first investigate reducing idle_timeout.



# File 'lib/mechanize.rb', line 641

def keep_alive= enable
  @agent.keep_alive = enable
end

#key ⇒ Object

An OpenSSL private key or the path to a private key



# File 'lib/mechanize.rb', line 897

def key
  @agent.key
end

#key=(key) ⇒ Object

Sets the OpenSSL client key to the given path or key instance



# File 'lib/mechanize.rb', line 904

def key= key
  @agent.key = key
end

#log ⇒ Object

The current logger. If no logger has been set Mechanize.log is used.



# File 'lib/mechanize.rb', line 648

def log
  @log || Mechanize.log
end
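
The fallback above means a logger set on one instance overrides the class-wide default for that instance only. A minimal standalone sketch of the lookup order follows. LoggingAgent is a hypothetical class, not the Mechanize implementation.

```ruby
require 'logger'
require 'stringio'

# Hypothetical class with a class-wide default and a per-instance override.
class LoggingAgent
  class << self
    attr_accessor :log   # default shared by all instances
  end

  attr_writer :log       # per-instance override

  def log
    @log || LoggingAgent.log   # instance logger first, then the class default
  end
end

shared  = Logger.new(StringIO.new)
private_log = Logger.new(StringIO.new)

LoggingAgent.log = shared
first  = LoggingAgent.new
second = LoggingAgent.new
second.log = private_log

puts first.log.equal?(shared)
puts second.log.equal?(private_log)
```

Instances without their own logger keep following the class default, even if it is replaced later.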

#log=(logger) ⇒ Object

Sets the logger used by this instance of mechanize



# File 'lib/mechanize.rb', line 655

def log= logger
  @log = logger
end

#max_file_buffer ⇒ Object

Responses larger than this will be written to a Tempfile instead of stored in memory. The default is 10240 bytes.



# File 'lib/mechanize.rb', line 663

def max_file_buffer
  @agent.max_file_buffer
end

#max_file_buffer=(bytes) ⇒ Object

Sets the maximum size of a response body that will be stored in memory to bytes



# File 'lib/mechanize.rb', line 671

def max_file_buffer= bytes
  @agent.max_file_buffer = bytes
end

#max_history ⇒ Object

Maximum number of items allowed in the history.



# File 'lib/mechanize.rb', line 176

def max_history
  @agent.history.max_size
end

#max_history=(length) ⇒ Object

Sets the maximum number of items allowed in the history to length.



# File 'lib/mechanize.rb', line 183

def max_history= length
  @agent.history.max_size = length
end

#open_timeout ⇒ Object

Length of time, in seconds, to wait for a connection to be opened



# File 'lib/mechanize.rb', line 678

def open_timeout
  @agent.open_timeout
end

#open_timeout=(open_timeout) ⇒ Object

Sets the connection open timeout to open_timeout



# File 'lib/mechanize.rb', line 685

def open_timeout= open_timeout
  @agent.open_timeout = open_timeout
end

#parse(uri, response, body) ⇒ Object

Parses the body of the response from uri using the pluggable parser that matches its content type



# File 'lib/mechanize.rb', line 967

def parse uri, response, body
  content_type = nil

  unless response['Content-Type'].nil?
    data, = response['Content-Type'].split ';', 2
    content_type, = data.downcase.split ',', 2 unless data.nil?
  end

  # Find our pluggable parser
  parser_klass = @pluggable_parser.parser content_type

  unless parser_klass <= Mechanize::Download then
    body = case body
           when IO, Tempfile, StringIO then
             body.read
           else
             body
           end
  end

  parser_klass.new uri, response, body, response.code do |parser|
    parser.mech = self if parser.respond_to? :mech=

    parser.watch_for_set = @watch_for_set if
      @watch_for_set and parser.respond_to?(:watch_for_set=)
  end
end
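
The first step of the method above reduces the Content-Type header to a bare, lower-cased media type before looking up a parser. That step can be sketched on its own. media_type is a hypothetical helper name, and the code reproduces only the two split calls from the source above.

```ruby
# Hypothetical helper: extract the media type the way #parse does.
def media_type(content_type_header)
  return nil if content_type_header.nil?

  data, = content_type_header.split ';', 2   # drop parameters such as charset
  return nil if data.nil?                     # an empty header yields no type

  type, = data.downcase.split ',', 2          # keep only the first listed type
  type
end

p media_type('Text/HTML; charset=UTF-8')
p media_type('text/html,text/plain')
p media_type(nil)
```

A missing or empty header leaves the type as nil, which is then handed to the pluggable parser lookup as is.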

#pass ⇒ Object

OpenSSL client key password



# File 'lib/mechanize.rb', line 911

def pass
  @agent.pass
end

#pass=(pass) ⇒ Object

Sets the client key password to pass



# File 'lib/mechanize.rb', line 918

def pass= pass
  @agent.pass = pass
end

#post(uri, query = {}, headers = {}) ⇒ Object

POST to the given uri with the given query. The query is specified by either a string, or a list of key-value pairs represented by a hash or an array of arrays.

Examples:

agent.post 'http://example.com/', "foo" => "bar"

agent.post 'http://example.com/', [%w[foo bar]]

agent.post('http://example.com/', "<message>hello</message>",
           'Content-Type' => 'application/xml')


# File 'lib/mechanize.rb', line 357

def post(uri, query={}, headers={})
  return request_with_entity(:post, uri, query, headers) if String === query

  node = {}
  # Create a fake form
  class << node
    def search(*args); []; end
  end
  node['method'] = 'POST'
  node['enctype'] = 'application/x-www-form-urlencoded'

  form = Form.new(node)

  query.each { |k, v|
    if v.is_a?(IO)
      form.enctype = 'multipart/form-data'
      ul = Form::FileUpload.new({'name' => k.to_s},::File.basename(v.path))
      ul.file_data = v.read
      form.file_uploads << ul
    else
      form.fields << Form::Field.new({'name' => k.to_s},v)
    end
  }
  post_form(uri, form, headers)
end
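
A hash and an array of two-element arrays are interchangeable in the method above because both are walked with the same each { |k, v| ... } loop, and every key is converted with to_s. The sketch below isolates that loop. query_pairs is a hypothetical helper, and it leaves out the IO branch, which in the source above switches the form to multipart/form-data.

```ruby
# Hypothetical helper: collect name/value pairs the way the loop in #post
# walks its query argument.
def query_pairs(query)
  pairs = []
  query.each { |k, v| pairs << [k.to_s, v] }   # works for Hash and for [[k, v], ...]
  pairs
end

p query_pairs('foo' => 'bar')
p query_pairs([%w[foo bar]])
p query_pairs(:page => 2)
```

A String query never reaches this loop, since it is sent as the raw request body through request_with_entity.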

#post_connect_hooks ⇒ Object

A list of hooks to call after retrieving a response. Hooks are called with the agent and the response returned.



# File 'lib/mechanize.rb', line 224

def post_connect_hooks
  @agent.post_connect_hooks
end

#pre_connect_hooks ⇒ Object

A list of hooks to call before making a request. Hooks are called with the agent and the request to be performed.



# File 'lib/mechanize.rb', line 232

def pre_connect_hooks
  @agent.pre_connect_hooks
end

#pretty_print(q) ⇒ Object

:nodoc:



# File 'lib/mechanize.rb', line 995

def pretty_print(q) # :nodoc:
  q.object_group(self) {
    q.breakable
    q.pp cookie_jar
    q.breakable
    q.pp current_page
  }
end

#put(uri, entity, headers = {}) ⇒ Object

Sends a PUT request to uri with entity as the body and the given headers:

put('http://example/', 'new content', {'Content-Type' => 'text/plain'})


# File 'lib/mechanize.rb', line 388

def put(uri, entity, headers = {})
  request_with_entity(:put, uri, entity, headers)
end

#read_timeout ⇒ Object

Length of time to wait for data from the server



# File 'lib/mechanize.rb', line 692

def read_timeout
  @agent.read_timeout
end

#read_timeout=(read_timeout) ⇒ Object

Sets the timeout for each chunk of data read from the server to read_timeout. A single request may read many chunks of data.



# File 'lib/mechanize.rb', line 700

def read_timeout= read_timeout
  @agent.read_timeout = read_timeout
end

#redirect_ok ⇒ Object Also known as: follow_redirect?

Controls how mechanize deals with redirects. The following values are allowed:

:all, true

All 3xx redirects are followed (default)

:permanent

Only 301 Moved Permanently redirects are followed

false

No redirects are followed



# File 'lib/mechanize.rb', line 712

def redirect_ok
  @agent.redirect_ok
end

#redirect_ok=(follow) ⇒ Object

Sets the mechanize redirect handling policy. See redirect_ok for allowed values



# File 'lib/mechanize.rb', line 722

def redirect_ok= follow
  @agent.redirect_ok = follow
end

#redirection_limit ⇒ Object

Maximum number of redirections to follow



# File 'lib/mechanize.rb', line 729

def redirection_limit
  @agent.redirection_limit
end

#redirection_limit=(limit) ⇒ Object

Sets the maximum number of redirections to follow to limit



# File 'lib/mechanize.rb', line 736

def redirection_limit= limit
  @agent.redirection_limit = limit
end

#request_headers ⇒ Object

A hash of custom request headers that will be sent on every request



# File 'lib/mechanize.rb', line 743

def request_headers
  @agent.request_headers
end

#request_headers=(request_headers) ⇒ Object

Replaces the custom request headers that will be sent on every request with request_headers



# File 'lib/mechanize.rb', line 751

def request_headers= request_headers
  @agent.request_headers = request_headers
end

#request_with_entity(verb, uri, entity, headers = {}) ⇒ Object

Makes an HTTP request to uri using HTTP method verb. entity is used as the request body, if allowed.



# File 'lib/mechanize.rb', line 396

def request_with_entity(verb, uri, entity, headers = {})
  cur_page = current_page || Page.new(nil, {'content-type'=>'text/html'})

  headers = {
    'Content-Type' => 'application/octet-stream',
    'Content-Length' => entity.size.to_s,
  }.update headers

  page = @agent.fetch uri, verb, headers, [entity], cur_page
  add_to_history(page)
  page
end
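
The header handling above starts from two defaults and lets the caller's headers win, because Hash#update overwrites existing keys. A standalone sketch of just that merge follows. entity_headers is a hypothetical helper name.

```ruby
# Hypothetical helper: build request headers the way request_with_entity does.
def entity_headers(entity, headers = {})
  {
    'Content-Type'   => 'application/octet-stream',   # default body type
    'Content-Length' => entity.size.to_s,             # length taken from the entity
  }.update headers                                    # caller-supplied headers win
end

p entity_headers('hello')
p entity_headers('<message>hello</message>', 'Content-Type' => 'application/xml')
```

Note that String#size counts characters on Ruby 1.9 and later, so for bodies containing multibyte characters the computed Content-Length can differ from the byte length. Passing an explicit 'Content-Length' header overrides the default.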

#retry_change_requests ⇒ Object

Retry POST and other non-idempotent requests. See RFC 2616 9.1.2.



# File 'lib/mechanize.rb', line 758

def retry_change_requests
  @agent.retry_change_requests
end

#retry_change_requests=(retry_change_requests) ⇒ Object

When setting retry_change_requests to true you are stating that, for all the URLs you access with mechanize, making POST and other non-idempotent requests is safe and will not cause data duplication or other harmful results.

If you are experiencing “too many connection resets” errors, you should instead investigate reducing the idle_timeout or disabling keep_alive connections.



# File 'lib/mechanize.rb', line 772

def retry_change_requests= retry_change_requests
  @agent.retry_change_requests = retry_change_requests
end

#robots ⇒ Object

Will /robots.txt files be obeyed?



# File 'lib/mechanize.rb', line 779

def robots
  @agent.robots
end

#robots=(enabled) ⇒ Object

When enabled, mechanize will retrieve and obey robots.txt files.



# File 'lib/mechanize.rb', line 787

def robots= enabled
  @agent.robots = enabled
end

#scheme_handlers ⇒ Object

The handlers for HTTP and other URI protocols.



# File 'lib/mechanize.rb', line 794

def scheme_handlers
  @agent.scheme_handlers
end

#scheme_handlers=(scheme_handlers) ⇒ Object

Replaces the URI scheme handler table with scheme_handlers



# File 'lib/mechanize.rb', line 801

def scheme_handlers= scheme_handlers
  @agent.scheme_handlers = scheme_handlers
end

#set_proxy(address, port, user = nil, password = nil) ⇒ Object

Sets the proxy address at port with an optional user and password



# File 'lib/mechanize.rb', line 1007

def set_proxy address, port, user = nil, password = nil
  @proxy_addr = address
  @proxy_port = port
  @proxy_user = user
  @proxy_pass = password

  @agent.set_proxy address, port, user, password
  @agent.set_http
end

#submit(form, button = nil, headers = {}) ⇒ Object

Submits form with an optional button.

Without a button:

page = agent.get('http://example.com')
agent.submit(page.forms.first)

With a button:

agent.submit(page.forms.first, page.forms.first.buttons.first)


# File 'lib/mechanize.rb', line 421

def submit(form, button=nil, headers={})
  form.add_button_to_query(button) if button

  case form.method.upcase
  when 'POST'
    post_form(form.action, form, headers)
  when 'GET'
    get(form.action.gsub(/\?[^\?]*$/, ''),
        form.build_query,
        form.page,
        headers)
  else
    raise ArgumentError, "unsupported method: #{form.method.upcase}"
  end
end
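
For GET forms, the method above removes any query string already present on the form's action before appending the query built from the form fields. The substitution can be checked in isolation. strip_query is a hypothetical helper wrapping the same regular expression.

```ruby
# Hypothetical helper: drop a trailing query string as the GET branch of
# #submit does before adding the form's own query.
def strip_query(action)
  action.gsub(/\?[^\?]*$/, '')
end

puts strip_query('http://example.com/search?q=old')
puts strip_query('http://example.com/search')
```

An action without a query string passes through unchanged.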

#transact ⇒ Object

Runs given block, then resets the page history as it was before. self is given as a parameter to the block. Returns the value of the block.



# File 'lib/mechanize.rb', line 441

def transact
  history_backup = @agent.history.dup
  begin
    yield self
  ensure
    @agent.history = history_backup
  end
end
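
Because the restore in the method above sits in an ensure clause, the history is put back even when the block raises. A minimal standalone sketch of the same backup-and-restore pattern follows. HistoryAgent is a hypothetical class that keeps its history in a plain array.

```ruby
# Hypothetical class demonstrating the backup/ensure pattern used by #transact.
class HistoryAgent
  attr_reader :history

  def initialize
    @history = []
  end

  def visit(url)
    @history << url
    url
  end

  def transact
    history_backup = @history.dup   # snapshot before running the block
    begin
      yield self                    # the block's value becomes the return value
    ensure
      @history = history_backup     # always restore, even after an exception
    end
  end
end

agent = HistoryAgent.new
agent.visit 'http://example.com/'
count = agent.transact do |a|
  a.visit 'http://example.com/detour'
  a.history.size
end
p count
p agent.history
```

Inside the block the detour is visible in the history; afterwards only the pages visited before the call remain.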

#user_agent ⇒ Object

The identification string for the client initiating a web request



# File 'lib/mechanize.rb', line 808

def user_agent
  @agent.user_agent
end

#user_agent=(user_agent) ⇒ Object

Sets the User-Agent used by mechanize to user_agent. See also user_agent_alias=.



# File 'lib/mechanize.rb', line 816

def user_agent= user_agent
  @agent.user_agent = user_agent
end

#user_agent_alias=(name) ⇒ Object

Set the user agent for the Mechanize object based on the given name.

See also AGENT_ALIASES



# File 'lib/mechanize.rb', line 825

def user_agent_alias= name
  self.user_agent = AGENT_ALIASES[name] ||
    raise(ArgumentError, "unknown agent alias #{name.inspect}")
end
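
The lookup above is an exact hash lookup, so alias names must match the keys of AGENT_ALIASES character for character, including case. The sketch below reproduces the lookup against a small table. SAMPLE_ALIASES and resolve_alias are hypothetical names, and the two entries are copied from the constant listed earlier.

```ruby
# Hypothetical two-entry table copied from AGENT_ALIASES for illustration.
SAMPLE_ALIASES = {
  'Linux Konqueror' => 'Mozilla/5.0 (compatible; Konqueror/3; Linux)',
  'Windows IE 9'    => 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)',
}

# Same shape as user_agent_alias=: return the string or raise for unknown names.
def resolve_alias(name)
  SAMPLE_ALIASES[name] ||
    raise(ArgumentError, "unknown agent alias #{name.inspect}")
end

puts resolve_alias('Windows IE 9')

begin
  resolve_alias('windows ie 9')   # wrong case, so the lookup fails
rescue ArgumentError => e
  puts e.message
end
```

An unknown or misspelled alias therefore fails immediately with an ArgumentError rather than silently keeping the previous User-Agent.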

#verify_callback ⇒ Object

A callback for additional certificate verification. See OpenSSL::SSL::SSLContext#verify_callback

The callback can be used for debugging or to ignore errors by always returning true. Specifying nil uses the default method that was valid when the SSLContext was created.



# File 'lib/mechanize.rb', line 930

def verify_callback
  @agent.verify_callback
end

#verify_callback=(verify_callback) ⇒ Object

Sets the OpenSSL certificate verification callback



# File 'lib/mechanize.rb', line 937

def verify_callback= verify_callback
  @agent.verify_callback = verify_callback
end

#verify_mode ⇒ Object

The OpenSSL server certificate verification method. The default is OpenSSL::SSL::VERIFY_PEER, and certificate verification uses the default system certificates. See also cert_store.



# File 'lib/mechanize.rb', line 946

def verify_mode
  @agent.verify_mode
end

#verify_mode=(verify_mode) ⇒ Object

Sets the OpenSSL server certificate verification method.



# File 'lib/mechanize.rb', line 953

def verify_mode= verify_mode
  @agent.verify_mode = verify_mode
end

#visited?(url) ⇒ Boolean Also known as: visited_page

Returns the visited page for the url passed in, or nil if it has not been visited. url may also be an object that responds to href, such as a link.

Returns:

  • (Boolean)


# File 'lib/mechanize.rb', line 190

def visited? url
  url = url.href if url.respond_to? :href

  @agent.visited_page url
end
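
The first line of the method above lets callers pass either a URL or a link-like object. A minimal sketch of that duck-typed normalization follows. FakeLink and url_for_lookup are hypothetical names standing in for a real link object and the inline check.

```ruby
# Hypothetical link-like object exposing only #href.
FakeLink = Struct.new(:href)

# Same normalization as the first line of #visited?.
def url_for_lookup(url)
  url = url.href if url.respond_to? :href
  url
end

puts url_for_lookup('http://example.com/')
puts url_for_lookup(FakeLink.new('http://example.com/about'))
```

Plain strings pass through untouched, while anything with an href method is replaced by that value before the history lookup.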