Class: CurlAgent

Inherits:
Object
Defined in:
lib/curl_agent.rb

Defined Under Namespace

Classes: IO

Constructor Details

#initialize(url, options = {}) ⇒ CurlAgent

See CurlAgent.open for an explanation of the options.



# File 'lib/curl_agent.rb', line 8

def initialize(url, options = {})
  @curl = Curl::Easy.new(url)
  # Defaults
  @curl.headers['User-Agent'] = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.6) Gecko/2009011913 Firefox/3.0.6'
  @curl.follow_location = true
  @curl.max_redirects = 2
  @curl.enable_cookies = true
  @curl.connect_timeout = 5
  @curl.timeout = 30
  @performed = false

  options ||= {}
  options.each {|k, v|
    # Strings will be passed as headers, as in original open-uri
    next unless k.is_a? Symbol
    @curl.send("#{k}=".intern, v)
    options.delete(k)
  }

  # All that's left should be considered headers
  @curl.headers.merge!(options)
end
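The way the constructor separates configuration from headers can be illustrated with a minimal sketch. `FakeEasy` and `apply_options` are hypothetical names introduced here so that the example runs without the curb gem; they are not part of CurlAgent.

```ruby
# Hypothetical stand-in for Curl::Easy, so the sketch runs without curb.
FakeEasy = Struct.new(:timeout, :max_redirects, :headers)

# Split an options hash as the constructor does: Symbol keys become
# setter calls, every other key is treated as an HTTP header.
def apply_options(easy, options)
  settings, headers = (options || {}).partition { |key, _| key.is_a?(Symbol) }
  settings.each { |key, value| easy.public_send("#{key}=", value) }
  easy.headers.merge!(headers.to_h)
  easy
end

easy = FakeEasy.new(30, 2, { 'User-Agent' => 'Mozilla/5.0' })
apply_options(easy, { :timeout => 10, 'User-Agent' => 'curl' })

puts easy.timeout               # prints 10
puts easy.max_redirects         # prints 2 (default left untouched)
puts easy.headers['User-Agent'] # prints curl
```

Unlike the listed source, which deletes the symbol keys from the hash it was given, this sketch leaves the caller's hash unmodified.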

Dynamic Method Handling

This class handles dynamic methods through the method_missing method

#method_missing(symbol, *args) ⇒ Object

Forwards every call that CurlAgent does not define itself to the wrapped Curl::Easy instance.



# File 'lib/curl_agent.rb', line 57

def method_missing(symbol, *args)
  @curl.send(symbol, *args)
end
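As a general illustration of this delegation pattern, the sketch below forwards unknown calls to a wrapped object. `ForwardingProxy` is a hypothetical class, not part of CurlAgent. It differs from the listed source in two ways: it also passes blocks through, and it defines `respond_to_missing?` so that `respond_to?` stays consistent with what `method_missing` handles.

```ruby
# Hypothetical wrapper used only for illustration.
class ForwardingProxy
  def initialize(target)
    @target = target
  end

  # Forward unknown calls, including any block, to the wrapped object.
  def method_missing(symbol, *args, &block)
    if @target.respond_to?(symbol)
      @target.public_send(symbol, *args, &block)
    else
      super
    end
  end

  # Keep respond_to? in line with what method_missing can handle.
  def respond_to_missing?(symbol, include_private = false)
    @target.respond_to?(symbol, include_private) || super
  end
end

proxy = ForwardingProxy.new([3, 1, 2])
puts proxy.sort.inspect                 # prints [1, 2, 3]
puts proxy.map { |n| n * 2 }.inspect    # prints [6, 2, 4]
puts proxy.respond_to?(:sort)           # prints true
puts proxy.respond_to?(:no_such_method) # prints false
```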

Class Method Details

.open(name, *rest, &block) ⇒ Object

Opens the URL and returns an IO object (CurlAgent::IO). If a block is given, it is called with that object and the block's value is returned instead. Symbol keys in the options hash override the defaults and are passed to Curl::Easy as configuration directives, for example: open('www.example.com/', :timeout => 10). All remaining keys are sent as request headers, for example: open('www.example.com/', :timeout => 10, 'User-Agent' => 'curl').

Raises:

  • (ArgumentError) if extra arguments are given or the access mode is not read-only


# File 'lib/curl_agent.rb', line 68

def self.open(name, *rest, &block)
  mode, perm, rest = scan_open_optional_arguments(*rest)
  options = rest.shift if !rest.empty? && Hash === rest.first
  raise ArgumentError.new("extra arguments") if !rest.empty?

  unless mode == nil || mode == 'r' || mode == 'rb' || mode == File::RDONLY
    raise ArgumentError.new("invalid access mode #{mode} (resource is read only.)")
  end

  agent = CurlAgent.new(name, options)

  agent.perform!
  io = IO.new(agent.body_str, agent.header_str)
  io.base_uri = URI.parse(agent.last_effective_url) rescue nil
  io.status = [agent.response_code, '']
  if block
    block.call(io)
  else
    io
  end
end

.scan_open_optional_arguments(*rest) ⇒ Object

:nodoc:



# File 'lib/curl_agent.rb', line 90

def self.scan_open_optional_arguments(*rest) # :nodoc:
  if !rest.empty? && (String === rest.first || Integer === rest.first)
    mode = rest.shift
    if !rest.empty? && Integer === rest.first
      perm = rest.shift
    end
  end
  return mode, perm, rest
end
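To see what this helper hands back to open, the sketch below reproduces it as a standalone method (same logic as the listing, with the two locals initialised explicitly) and runs it on two argument lists. The sample arguments are illustrative only.

```ruby
# Standalone copy of the helper so the example runs on its own.
def scan_open_optional_arguments(*rest)
  mode = perm = nil
  if !rest.empty? && (String === rest.first || Integer === rest.first)
    mode = rest.shift
    perm = rest.shift if !rest.empty? && Integer === rest.first
  end
  [mode, perm, rest]
end

# A mode string followed by an options hash.
mode, perm, rest = scan_open_optional_arguments('r', { :timeout => 10 })
puts mode.inspect  # prints "r"
puts perm.inspect  # prints nil
puts rest.length   # prints 1 (the options hash is left for open to pick up)

# No mode at all: the hash is passed through untouched.
mode2, perm2, rest2 = scan_open_optional_arguments({ 'User-Agent' => 'curl' })
puts mode2.inspect # prints nil
puts rest2.length  # prints 1
```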

Instance Method Details

#charset ⇒ Object

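Returns the charset of the page in lowercase. The value is taken from the Content-Type response header or, failing that, from a <meta http-equiv="Content-Type"> tag within the first 1000 characters of the body. Returns an empty string if neither source names a charset. Performs the fetch first if it has not been performed yet.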



# File 'lib/curl_agent.rb', line 38

def charset
  perform! unless @performed
  content_type = @curl.content_type || ''
  charset = if content_type.match(/charset\s*=\s*([a-zA-Z0-9-]+)/ni)
      $1
    elsif ! body_str.nil? and (m = body_str.slice(0,1000).match(%r{<meta.*http-equiv\s*=\s*['"]?Content-Type['"]?.*?>}mi)) and
      m[0].match(%r{content=['"]text/html.*?charset=(.*?)['"]}mi)
      $1
    else
      ''
    end.downcase
end
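The header-first, meta-tag-second lookup can be tried in isolation with the simplified sketch below. `detect_charset` is a hypothetical helper, not part of CurlAgent, and its regular expressions are a reduced version of the ones in the listing rather than an exact copy.

```ruby
# Hypothetical helper: look in the Content-Type header first, then in a
# <meta http-equiv="Content-Type"> tag near the top of the body.
def detect_charset(content_type, body)
  if (m = content_type.to_s.match(/charset\s*=\s*["']?([a-zA-Z0-9_-]+)/i))
    return m[1].downcase
  end

  head = body.to_s[0, 1000]
  meta = head[/<meta[^>]*http-equiv\s*=\s*['"]?Content-Type['"]?[^>]*>/mi]
  if meta && (m = meta.match(/charset\s*=\s*([a-zA-Z0-9_-]+)/i))
    return m[1].downcase
  end

  '' # neither source names a charset
end

html = '<html><head><meta http-equiv="Content-Type" ' \
       'content="text/html; charset=ISO-8859-1"></head></html>'

puts detect_charset('text/html; charset=UTF-8', nil)  # prints utf-8
puts detect_charset('text/html', html)                # prints iso-8859-1
puts detect_charset(nil, '<p>plain</p>').inspect      # prints ""
```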

#perform! ⇒ Object

Performs the actual fetch. Once it returns, body_str is available.



# File 'lib/curl_agent.rb', line 32

def perform!
  @curl.perform
  @performed = true
end
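The @performed flag lets other methods, such as charset, trigger the fetch only when it has not happened yet. A minimal sketch of that guard follows. `LazyFetcher` is a hypothetical class, and a block stands in for the network request.

```ruby
# Hypothetical class showing the "perform once, then reuse" guard.
class LazyFetcher
  attr_reader :fetch_count

  def initialize(&fetch)
    @fetch = fetch        # stands in for the real network request
    @performed = false
    @fetch_count = 0
  end

  # Run the fetch unconditionally and remember that it happened.
  def perform!
    @body = @fetch.call
    @fetch_count += 1
    @performed = true
  end

  # Fetch on first access only; later calls reuse the stored body.
  def body
    perform! unless @performed
    @body
  end
end

fetcher = LazyFetcher.new { 'hello' }
puts fetcher.body         # prints hello
puts fetcher.body         # prints hello
puts fetcher.fetch_count  # prints 1
```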

#respond_to?(symbol) ⇒ Boolean

Reports whether the wrapped Curl::Easy instance responds to the given method.

Returns:

  • (Boolean)


# File 'lib/curl_agent.rb', line 52

def respond_to?(symbol)
  @curl.respond_to?(symbol)
end