Class: Mechanize::PluggableParser

Inherits: Object

Defined in: lib/mechanize/pluggable_parsers.rb

Overview

This class is used to register and maintain pluggable parsers for Mechanize to use.

Mechanize allows different parsers for different content types. Mechanize uses PluggableParser to determine which parser to use for any content type. To use your own pluggable parser or to change the default pluggable parsers, register them with this class.

The default parser for unregistered content types is Mechanize::File.

The module Mechanize::Parser provides basic functionality for any content type, so you may use it in custom parsers you write. For small files you wish to perform in-memory operations on, you should subclass Mechanize::File. For large files you should subclass Mechanize::Download as the content is only loaded into memory in small chunks.

Example

To create your own parser, define a class whose constructor takes four parameters: the URI, the response, the body, and the response code. The following example registers a pluggable parser that handles CSV files:

require 'csv'

class CSVParser < Mechanize::File
  attr_reader :csv

  def initialize uri = nil, response = nil, body = nil, code = nil
    super uri, response, body, code
    @csv = CSV.parse body
  end
end

agent = Mechanize.new
agent.pluggable_parser.csv = CSVParser
agent.get('http://example.com/test.csv')  # => CSVParser

Now any response with a content type of ‘text/csv’ will initialize a CSVParser and return that object to the caller.

To register a pluggable parser for a content type that PluggableParser has no shortcut method for, use the hash syntax:

agent.pluggable_parser['text/something'] = SomeClass

To set the default parser, use #default=:

agent.pluggable_parser.default = Mechanize::Download

Now all unknown content types will be saved to disk and not loaded into memory.
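The registration flow above can be tried without a network call. The following is a minimal sketch, not Mechanize's own code: SketchFile is a hypothetical stand-in for Mechanize::File so that the snippet runs with only the standard library, and JSONParser and the 'application/json' registration are illustrative names. A plain Hash plays the role of the registry; the call klass.new(uri, response, body, code) assumes the four-parameter constructor described earlier.

```ruby
require 'json'

# Hypothetical stand-in for Mechanize::File: it only stores the four
# constructor arguments that a pluggable parser receives.
class SketchFile
  attr_reader :uri, :response, :body, :code

  def initialize(uri = nil, response = nil, body = nil, code = nil)
    @uri = uri
    @response = response
    @body = body
    @code = code
  end
end

# Illustrative parser: decodes the response body as JSON on construction.
class JSONParser < SketchFile
  attr_reader :json

  def initialize(uri = nil, response = nil, body = nil, code = nil)
    super
    @json = JSON.parse(body)
  end
end

# With Mechanize you would instead write:
#   agent.pluggable_parser['application/json'] = JSONParser
registry = { 'application/json' => JSONParser }

# Look up the class by content type and build it from the four values.
klass = registry['application/json']
doc = klass.new('http://example.com/data.json', nil, '{"name": "mechanize"}', '200')
puts doc.json['name']
```

In real use, JSONParser would subclass Mechanize::File (or Mechanize::Download for large bodies) rather than the stand-in.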

Constant Summary

CONTENT_TYPES =

{
  :html  => 'text/html',
  :wap   => 'application/vnd.wap.xhtml+xml',
  :xhtml => 'application/xhtml+xml',
  :pdf   => 'application/pdf',
  :csv   => 'text/csv',
  :xml   => 'text/xml',
}

Instance Attribute Summary

#default ⇒ Object

Returns the value of attribute default.

Instance Method Summary

#[](content_type) ⇒ Object

Retrieves the parser for content_type content.

#[]=(content_type, klass) ⇒ Object

Sets the parser for content_type content to klass.

#csv=(klass) ⇒ Object

Registers klass as the parser for text/csv content.

#html=(klass) ⇒ Object

Registers klass as the parser for text/html and application/xhtml+xml content.

#initialize ⇒ PluggableParser (constructor)

A new instance of PluggableParser.

#parser(content_type) ⇒ Object

Returns the parser registered for the given content_type, or the default parser if none is registered.

#pdf=(klass) ⇒ Object

Registers klass as the parser for application/pdf content.

#register_parser(content_type, klass) ⇒ Object

Registers klass as the parser for content_type. Marked :nodoc: in the source.

#xhtml=(klass) ⇒ Object

Registers klass as the parser for application/xhtml+xml content.

#xml=(klass) ⇒ Object

Registers klass as the parser for text/xml content.

Constructor Details

#initialize ⇒ `PluggableParser`

Returns a new instance of PluggableParser.

# File 'lib/mechanize/pluggable_parsers.rb', line 71

def initialize
  @parsers = {
    CONTENT_TYPES[:html]  => Mechanize::Page,
    CONTENT_TYPES[:xhtml] => Mechanize::Page,
    CONTENT_TYPES[:wap]   => Mechanize::Page,
  }

  @default = Mechanize::File
end

Instance Attribute Details

#default ⇒ `Object`

Returns the value of attribute default.

# File 'lib/mechanize/pluggable_parsers.rb', line 69

def default
  @default
end

Instance Method Details

#[](content_type) ⇒ `Object`

Retrieves the parser for content_type content.

# File 'lib/mechanize/pluggable_parsers.rb', line 132

def [](content_type)
  @parsers[content_type]
end

#[]=(content_type, klass) ⇒ `Object`

Sets the parser for content_type content to klass.

# File 'lib/mechanize/pluggable_parsers.rb', line 139

def []=(content_type, klass)
  @parsers[content_type] = klass
end

#csv=(klass) ⇒ `Object`

Registers klass as the parser for text/csv content.

# File 'lib/mechanize/pluggable_parsers.rb', line 118

def csv=(klass)
  register_parser(CONTENT_TYPES[:csv], klass)
end

#html=(klass) ⇒ `Object`

Registers klass as the parser for text/html and application/xhtml+xml content.

# File 'lib/mechanize/pluggable_parsers.rb', line 96

def html=(klass)
  register_parser(CONTENT_TYPES[:html], klass)
  register_parser(CONTENT_TYPES[:xhtml], klass)
end
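Because #html= writes two entries while #xhtml= writes one, the order of the two calls matters. The sketch below illustrates this with DualRegistry, a hypothetical stand-in that copies only the method bodies shown in this section; HtmlParser and XhtmlParser are placeholder classes, not part of Mechanize.

```ruby
# Hypothetical stand-in that copies the method bodies shown in this section.
class DualRegistry
  CONTENT_TYPES = {
    :html  => 'text/html',
    :xhtml => 'application/xhtml+xml',
  }

  def initialize
    @parsers = {}
  end

  def [](content_type)
    @parsers[content_type]
  end

  def register_parser(content_type, klass)
    @parsers[content_type] = klass
  end

  # Registers klass for both HTML and XHTML content types.
  def html=(klass)
    register_parser(CONTENT_TYPES[:html], klass)
    register_parser(CONTENT_TYPES[:xhtml], klass)
  end

  # Registers klass for the XHTML content type only.
  def xhtml=(klass)
    register_parser(CONTENT_TYPES[:xhtml], klass)
  end
end

# Placeholder parser classes (hypothetical names).
HtmlParser = Class.new
XhtmlParser = Class.new

registry = DualRegistry.new
registry.html = HtmlParser    # sets text/html and application/xhtml+xml
registry.xhtml = XhtmlParser  # overrides only application/xhtml+xml

puts registry['text/html']              # prints HtmlParser
puts registry['application/xhtml+xml']  # prints XhtmlParser
```

Reversing the two assignments would leave HtmlParser registered for both content types, since #html= overwrites the XHTML entry.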

#parser(content_type) ⇒ `Object`

Returns the parser registered for the given content_type, falling back to the default parser when content_type is nil or has no registered parser.

# File 'lib/mechanize/pluggable_parsers.rb', line 84

def parser(content_type)
  content_type.nil? ? default : @parsers[content_type] || default
end
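The difference between #[] and #parser is easy to miss: #[] is a raw lookup that returns nil for an unregistered type, while #parser applies the default. A minimal sketch, assuming a hypothetical FallbackRegistry that copies the two method bodies shown in this section and uses symbols in place of parser classes:

```ruby
# Hypothetical stand-in; symbols stand in for parser classes.
class FallbackRegistry
  attr_accessor :default

  def initialize
    @parsers = { 'text/html' => :page }
    @default = :file
  end

  # Raw lookup: no fallback.
  def [](content_type)
    @parsers[content_type]
  end

  # Lookup with fallback to the default parser.
  def parser(content_type)
    content_type.nil? ? default : @parsers[content_type] || default
  end
end

registry = FallbackRegistry.new
p registry['text/html']         # :page
p registry.parser('text/html')  # :page
p registry['image/png']         # nil
p registry.parser('image/png')  # :file
p registry.parser(nil)          # :file

# Changing the default affects every unregistered type.
registry.default = :download
p registry.parser('image/png')  # :download
```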

#pdf=(klass) ⇒ `Object`

Registers klass as the parser for application/pdf content.

# File 'lib/mechanize/pluggable_parsers.rb', line 111

def pdf=(klass)
  register_parser(CONTENT_TYPES[:pdf], klass)
end

#register_parser(content_type, klass) ⇒ `Object`

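Registers klass as the parser for content_type. The method is marked :nodoc: in the source, so it is intended as an internal helper for the shortcut setters.
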
# File 'lib/mechanize/pluggable_parsers.rb', line 88

def register_parser(content_type, klass) # :nodoc:
  @parsers[content_type] = klass
end

#xhtml=(klass) ⇒ `Object`

Registers klass as the parser for application/xhtml+xml content.

# File 'lib/mechanize/pluggable_parsers.rb', line 104

def xhtml=(klass)
  register_parser(CONTENT_TYPES[:xhtml], klass)
end

#xml=(klass) ⇒ `Object`

Registers klass as the parser for text/xml content.

# File 'lib/mechanize/pluggable_parsers.rb', line 125

def xml=(klass)
  register_parser(CONTENT_TYPES[:xml], klass)
end
