Class: Ripli::CustomParserTemplate
- Inherits: CustomParser
  - Object
  - CustomParser
  - Ripli::CustomParserTemplate
- Defined in: lib/ripli/customparser_template.rb
Overview
The class should inherit from CustomParser. The class name should relate to the name of the site being scraped.
Constant Summary
- CONSTANT = 'Your constants'
  Define your own constants here. From the superclass you inherit these constants:
  - LOG_DIR = 'log': directory where files with proxies are saved
  - DEFAULT_MAX_TIMEOUT = 1000: maximum timeout of a proxy response, in ms
Constants inherited from CustomParser
Ripli::CustomParser::DEFAULT_MAX_TIMEOUT, Ripli::CustomParser::DEFAULT_MECHANIZE_TIMEOUT, Ripli::CustomParser::LOG_DIR
Instance Method Summary
- #initialize ⇒ CustomParserTemplate (constructor)
  Define it if you need to initialize instance variables or perform preparations (creating directories, etc.).
- #parse(type, opts = {}) ⇒ Object
  Required method. The site-scraping logic goes here. type: proxy type, one of [:https, :socks4, :socks5]. opts: additional parameters, if you need them. Returns an array of strings in the format “<type>t<ip>tt<port>”.
Methods inherited from CustomParser
Constructor Details
#initialize ⇒ CustomParserTemplate
Define it if you need to initialize instance variables or perform preparations (creating directories, etc.).
# File 'lib/ripli/customparser_template.rb', line 18
def initialize
  super # required for creating logger and directory
  # define @mechanize = Mechanize.new { |agent| agent.open_timeout...} if you need add some options to mechanize agent
  # your code here
end
Instance Method Details
#parse(type, opts = {}) ⇒ Object
Required method. The site-scraping logic goes here.
- type: proxy type, one of [:https, :socks4, :socks5]
- opts: additional parameters, if you need them
- return: array of strings in the format “<type>t<ip>tt<port>”
# File 'lib/ripli/customparser_template.rb', line 29
def parse(type, opts = {})
  []
  # for downloading use @mechanize.get(url)
  @logger.info 'Use @logger for print logs in STDOUT'
rescue Net::OpenTimeout, Net::ReadTimeout
  # rescue exception during downloading page, DEFAULT_MECHANIZE_TIMEOUT=10s
end
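As an illustration of how the template might be filled in, here is a minimal sketch of a concrete parser. Several things in it are assumptions rather than part of Ripli: the class name ExampleProxiesParser, the URL, the ROW pattern, and the opts[:body] key (used so the parsing logic can be exercised without a network call) are all hypothetical. The small CustomParser stub only stands in for the real base class so the sketch runs on its own. The sketch also assumes that the `t` separators in the documented return format stand for tab characters; adjust the format string if your version of Ripli expects something else.

```ruby
require 'logger'
require 'net/http' # defines Net::OpenTimeout and Net::ReadTimeout

module Ripli
  # Stand-in for the real base class so this sketch runs without the gem.
  # With ripli installed, require it and delete this stub.
  class CustomParser
    def initialize
      @logger = Logger.new($stdout)
    end
  end

  # Hypothetical parser for an imaginary site; named after the site,
  # as the overview recommends.
  class ExampleProxiesParser < CustomParser
    # Hypothetical page layout: "ip:port" pairs anywhere in the body.
    ROW = /(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})/

    def initialize
      super # sets up the logger (and, in the real base class, the directory)
      @base_url = 'https://proxies.example/list' # hypothetical URL
    end

    def parse(type, opts = {})
      # opts[:body] is a hypothetical hook for passing page content directly;
      # otherwise the page is downloaded with the mechanize agent.
      body = opts.fetch(:body) { @mechanize.get("#{@base_url}?type=#{type}").body }
      # Assumption: the separators in the documented format are tabs.
      lines = body.scan(ROW).map { |ip, port| "#{type}\t#{ip}\t\t#{port}" }
      @logger.info "parsed #{lines.size} #{type} proxies"
      lines
    rescue Net::OpenTimeout, Net::ReadTimeout => e
      @logger.warn "download timed out: #{e.class}"
      [] # keep the return type an array even when the download fails
    end
  end
end
```

Returning the array as the last expression of parse, and an empty array from the rescue branch, keeps the return value an array of strings in both the success and the timeout case.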