Module: Wayfarer::Base

Extended by:
ActiveSupport::Concern
Defined in:
lib/wayfarer/base.rb

Class Attribute Details

.route ⇒ Wayfarer::Routing::DSL (readonly)

The job's Routing::DSL that maps URLs to instance methods or to a Handler.

Examples:

Append a host route

route.host "example.com", to: :index

Returns:

  • (Wayfarer::Routing::DSL)


# File 'lib/wayfarer/base.rb', line 37

Instance Attribute Details

#action ⇒ Symbol, Object (readonly)

Returns the action that the task URL was routed to.

Returns:

  • (Symbol, Object)

    the action that the task URL was routed to



# File 'lib/wayfarer/base.rb', line 14

module Base
  extend ActiveSupport::Concern
  # @!method stage(urls)
  # Adds URLs to an internal staging set so that they get enqueued
  # once the job has executed successfully.
  # @overload stage(urls)
  #   @param urls [Array<String>] URLs to add to the staging set.
  # @overload stage(url)
  #   @param url [String] URL to add to the staging set.

  # @!method fetch(url, follow: 3)
  # @param url [String] URL to fetch using plain HTTP(S).
  # @param follow [Integer] Number of redirects to follow.
  # Retrieves the given URL to a {Page}.

  # @!method page(live: false)
  # @param live [Boolean] whether to retrieve a new {Page}.
  # @return [Wayfarer::Page]
  # Returns the most recently retrieved page, or a new page
  # for the current task URL if the `live` keyword is true.

  # @!scope class

  # @!attribute [r] route
  # @return [Wayfarer::Routing::DSL]
  # The job's {Wayfarer::Routing::DSL} that maps URLs to instance methods
  # or to a {Handler}.
  # @example Append a host route
  #   route.host "example.com", to: :index

  # @!method content_types(*content_types)
  # @param content_types [*Array<String, Regexp>] Content-Types to whitelist
  # Whitelists Content-Types. Once at least one Content-Type is set, only
  # those Content-Types will be processed.

  # @!group Callbacks

  # @!method before_fetch
  # @overload before_fetch(callback)
  #   @param callback [Symbol] Instance method to call
  # @overload before_fetch(&block)
  #   @yield [Wayfarer::Task]
  # Registers a callback that is called before the page is fetched.
  # If a symbol is passed, an instance method with the same name will be
  # called.
  # @example Accessing the user agent in {#before_fetch}
  #   before_fetch do |task|
  #     user_agent # => the user agent that will fetch the page
  #   end

  # @!method around_fetch
  # @overload around_fetch(callback)
  #   @param callback [Symbol] Instance method to call
  # @overload around_fetch(&block)
  #   @yield [Wayfarer::Task]
  # Registers a callback that is called around the page getting fetched.
  # If a symbol is passed, an instance method with the same name will be
  # called.

  # @!method after_fetch
  # @overload after_fetch(callback)
  #   @param callback [Symbol] Instance method to call
  # @overload after_fetch(&block)
  #   @yield [Wayfarer::Task]
  # Registers a callback that is called after the page was fetched.
  # If a symbol is passed, an instance method with the same name will be
  # called.

  # @!method before_perform
  # @overload before_perform(callback)
  #   @param callback [Symbol] Instance method to call
  # @overload before_perform(&block)
  #   @yield [Wayfarer::Task]
  # Registers a callback that is called before the task is performed.
  # If a symbol is passed, an instance method with the same name will be
  # called.

  # @!method around_perform
  # @overload around_perform(callback)
  #   @param callback [Symbol] Instance method to call
  # @overload around_perform(&block)
  #   @yield [Wayfarer::Task]
  # Registers a callback that is called around the task getting performed.
  # If a symbol is passed, an instance method with the same name will be
  # called.

  # @!method after_perform
  # @overload after_perform(callback)
  #   @param callback [Symbol] Instance method to call
  # @overload after_perform(&block)
  #   @yield [Wayfarer::Task]
  # Registers a callback that is called after the task was performed.
  # If a symbol is passed, an instance method with the same name will be
  # called.

  # @!endgroup

  included do
    include Wayfarer::Middleware::Controller

    # Implement ActiveJob's #perform by calling into our own middleware chain
    alias_method :perform, :call

    # Middleware stack
    use Wayfarer::Middleware::Redis
    use Wayfarer::Middleware::BatchCompletion
    use Wayfarer::Middleware::UriParser
    use Wayfarer::Middleware::Normalize
    use Wayfarer::Middleware::Dedup
    use Wayfarer::Middleware::Stage
    use Wayfarer::Middleware::Router
    use Wayfarer::Middleware::UserAgent
    use Wayfarer::Middleware::ContentType
    use Wayfarer::Middleware::Dispatch
  end

  class_methods do
    def crawl(url, batch: SecureRandom.uuid)
      Task.new(url, batch).tap do |task|
        perform_later(task)
      end
    end
  end
end
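
The `included` block above registers middleware with `use` and aliases `perform` to `call`. The following is a minimal sketch of how such a chain can work, written in plain Ruby. It is not Wayfarer's implementation: `SketchChain`, `Upcase`, `HaltOnSeen`, `Record` and `SketchJob` are hypothetical names, and the convention that a middleware continues the chain by yielding is an assumption made for illustration.

```ruby
# Hypothetical stand-in for a middleware chain; not Wayfarer's actual code.
module SketchChain
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Middleware classes in registration order.
    def middlewares
      @middlewares ||= []
    end

    def use(middleware)
      middlewares << middleware
    end
  end

  # Runs each middleware in order. A middleware continues the chain by
  # yielding and halts it by returning without yielding.
  def call(task)
    chain = self.class.middlewares.map(&:new)
    run = lambda do |index|
      next if index == chain.size

      chain[index].call(task) { run.call(index + 1) }
    end
    run.call(0)
    task
  end
end

# Rewrites the task before passing it on.
class Upcase
  def call(task)
    task[:url] = task[:url].upcase
    yield
  end
end

# Halts the chain for URLs that were already processed.
class HaltOnSeen
  SEEN = []

  def call(task)
    return if SEEN.include?(task[:url]) # halt: do not yield

    SEEN << task[:url]
    yield
  end
end

# Last link: marks the task as dispatched.
class Record
  def call(task)
    task[:dispatched] = true
    yield
  end
end

class SketchJob
  include SketchChain

  use Upcase
  use HaltOnSeen
  use Record

  # Same trick as in the source above: #perform is the chain's entry point.
  alias_method :perform, :call
end
```

In this sketch, performing the same URL twice runs `Record` only the first time, because `HaltOnSeen` returns without yielding on the second run.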

#params ⇒ HashWithIndifferentAccess (readonly)

Returns path parameters collected from routes.

Returns:

  • (HashWithIndifferentAccess)

    path parameters collected from routes



# File 'lib/wayfarer/base.rb', line 14


#task ⇒ Wayfarer::Task (readonly)

Returns the current task.

Returns:

  • (Wayfarer::Task)


# File 'lib/wayfarer/base.rb', line 14


#uri ⇒ Addressable::URI (readonly)

Returns the parsed task URL.

Returns:

  • (Addressable::URI)

    the parsed task URL



# File 'lib/wayfarer/base.rb', line 14


#user_agent ⇒ Object (readonly)

Returns the user agent that retrieved the page.

Returns:

  • (Object)

    the user agent that retrieved the page



# File 'lib/wayfarer/base.rb', line 14


Class Method Details

.after_fetch(callback) ⇒ Object .after_fetch {|Wayfarer::Task| ... } ⇒ Object

Registers a callback that is called after the page was fetched. If a symbol is passed, an instance method with the same name will be called.

Overloads:

  • .after_fetch(callback) ⇒ Object

    Parameters:

    • callback (Symbol)

      Instance method to call

  • .after_fetch {|Wayfarer::Task| ... } ⇒ Object

    Yields:

    • (Wayfarer::Task)


# File 'lib/wayfarer/base.rb', line 73

.after_perform(callback) ⇒ Object .after_perform {|Wayfarer::Task| ... } ⇒ Object

Registers a callback that is called after the task was performed. If a symbol is passed, an instance method with the same name will be called.

Overloads:

  • .after_perform(callback) ⇒ Object

    Parameters:

    • callback (Symbol)

      Instance method to call

  • .after_perform {|Wayfarer::Task| ... } ⇒ Object

    Yields:

    • (Wayfarer::Task)


# File 'lib/wayfarer/base.rb', line 100

.around_fetch(callback) ⇒ Object .around_fetch {|Wayfarer::Task| ... } ⇒ Object

Registers a callback that is called around the page getting fetched. If a symbol is passed, an instance method with the same name will be called.

Overloads:

  • .around_fetch(callback) ⇒ Object

    Parameters:

    • callback (Symbol)

      Instance method to call

  • .around_fetch {|Wayfarer::Task| ... } ⇒ Object

    Yields:

    • (Wayfarer::Task)


# File 'lib/wayfarer/base.rb', line 64

.around_perform(callback) ⇒ Object .around_perform {|Wayfarer::Task| ... } ⇒ Object

Registers a callback that is called around the task getting performed. If a symbol is passed, an instance method with the same name will be called.

Overloads:

  • .around_perform(callback) ⇒ Object

    Parameters:

    • callback (Symbol)

      Instance method to call

  • .around_perform {|Wayfarer::Task| ... } ⇒ Object

    Yields:

    • (Wayfarer::Task)


# File 'lib/wayfarer/base.rb', line 91

.before_fetch(callback) ⇒ Object .before_fetch {|Wayfarer::Task| ... } ⇒ Object

Registers a callback that is called before the page is fetched. If a symbol is passed, an instance method with the same name will be called.

Examples:

Accessing the user agent in #before_fetch

before_fetch do |task|
  user_agent # => the user agent that will fetch the page
end

Overloads:

  • .before_fetch(callback) ⇒ Object

    Parameters:

    • callback (Symbol)

      Instance method to call

  • .before_fetch {|Wayfarer::Task| ... } ⇒ Object

    Yields:

    • (Wayfarer::Task)


# File 'lib/wayfarer/base.rb', line 51
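
The callback methods in this section accept either a Symbol or a block. As a rough illustration of that registration pattern, here is a plain-Ruby sketch. It does not reproduce Wayfarer's internals: `SketchCallbacks`, `SketchCrawler`, `run_before_fetch` and `announce` are hypothetical, and calling the Symbol callback without arguments is an assumption.

```ruby
# Hypothetical model of symbol-or-block callback registration.
module SketchCallbacks
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def before_fetch_callbacks
      @before_fetch_callbacks ||= []
    end

    # Accepts either an instance method name or a block.
    def before_fetch(callback = nil, &block)
      before_fetch_callbacks << (callback || block)
    end
  end

  # Runs the callbacks in registration order.
  def run_before_fetch(task)
    self.class.before_fetch_callbacks.each do |callback|
      if callback.is_a?(Symbol)
        send(callback) # instance method with the same name
      else
        instance_exec(task, &callback) # block runs in the instance's context
      end
    end
  end
end

class SketchCrawler
  include SketchCallbacks

  attr_reader :log

  def initialize
    @log = []
  end

  before_fetch :announce
  before_fetch { |task| log << "block saw #{task}" }

  def announce
    log << "announce"
  end
end
```

Because the block is evaluated with `instance_exec`, it can call instance methods such as `log` directly, which mirrors how the `before_fetch` example above reads `user_agent` inside the block.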

.before_perform(callback) ⇒ Object .before_perform {|Wayfarer::Task| ... } ⇒ Object

Registers a callback that is called before the task is performed. If a symbol is passed, an instance method with the same name will be called.

Overloads:

  • .before_perform(callback) ⇒ Object

    Parameters:

    • callback (Symbol)

      Instance method to call

  • .before_perform {|Wayfarer::Task| ... } ⇒ Object

    Yields:

    • (Wayfarer::Task)


# File 'lib/wayfarer/base.rb', line 82

.content_types(*content_types) ⇒ Object

Whitelists Content-Types. Once at least one Content-Type is set, only those Content-Types will be processed.

Parameters:

  • content_types (*Array<String, Regexp>)

    Content-Types to whitelist



# File 'lib/wayfarer/base.rb', line 44
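
To illustrate the whitelisting rule described for `.content_types` (an empty whitelist lets everything through, otherwise only matching Content-Types are processed), here is a small sketch. `content_type_allowed?` and `ALLOWED` are hypothetical names, and matching with `===` is an assumption rather than Wayfarer's documented behavior.

```ruby
# Hypothetical helper, not part of Wayfarer.
# Returns true if the Content-Type passes the whitelist.
def content_type_allowed?(content_type, allowed)
  return true if allowed.empty? # nothing configured: process everything

  allowed.any? do |pattern|
    # Regexp#=== tests for a match, String#=== tests for equality.
    pattern === content_type
  end
end

# A whitelist mixing a String and a Regexp, as the parameter type allows.
ALLOWED = ["text/html", %r{\Aapplication/(json|xml)\z}].freeze
```

Note that string entries compare for exact equality in this sketch, so a header value carrying a charset suffix would need a Regexp entry to match.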

Instance Method Details

#fetch(url, follow: 3) ⇒ Object

Retrieves the given URL to a Page.

Parameters:

  • url (String)

    URL to fetch using plain HTTP(S).

  • follow (Integer) (defaults to: 3)

    Number of redirects to follow.



# File 'lib/wayfarer/base.rb', line 24
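
The `follow` keyword bounds how many redirects are followed. The sketch below shows one way such a limit can behave, using an in-memory table instead of real HTTP so that it runs offline. `RESPONSES`, `TooManyRedirects` and `sketch_fetch` are hypothetical, and raising once the limit is exhausted is an assumption; the source does not say what `#fetch` does in that case.

```ruby
# Hypothetical in-memory "web": URL => [status, redirect target or body].
RESPONSES = {
  "http://example.com/a" => [301, "http://example.com/b"],
  "http://example.com/b" => [302, "http://example.com/c"],
  "http://example.com/c" => [200, "hello"]
}.freeze

class TooManyRedirects < StandardError; end

# Follows at most `follow` redirects, then gives up.
def sketch_fetch(url, follow: 3)
  status, payload = RESPONSES.fetch(url)
  return payload unless (300..399).cover?(status) # not a redirect: done

  raise TooManyRedirects, url if follow.zero?

  # Each hop consumes one unit of the redirect budget.
  sketch_fetch(payload, follow: follow - 1)
end
```

With the default budget of 3, the two redirects in the table resolve to the final body; with `follow: 1` the second redirect exceeds the budget.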

#page(live: false) ⇒ Wayfarer::Page

Returns the most recently retrieved page, or a new page for the current task URL if the live keyword is true.

Parameters:

  • live (Boolean) (defaults to: false)

    whether to retrieve a new Page.

Returns:

  • (Wayfarer::Page)


# File 'lib/wayfarer/base.rb', line 29

#stage(urls) ⇒ Object #stage(url) ⇒ Object

Adds URLs to an internal staging set so that they get enqueued once the job has executed successfully.

Overloads:

  • #stage(urls) ⇒ Object

    Parameters:

    • urls (Array<String>)

      URLs to add to the staging set.

  • #stage(url) ⇒ Object

    Parameters:

    • url (String)

      URL to add to the staging set.



# File 'lib/wayfarer/base.rb', line 16
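
The staging behavior described for `#stage` (collect URLs during the job, enqueue them only after the job succeeds) can be sketched as follows. This is an illustrative model, not Wayfarer's code: `SketchStaging`, its `perform` method and the `enqueue` callable are hypothetical.

```ruby
require "set"

# Hypothetical model of a staging set.
class SketchStaging
  def initialize
    @staged = Set.new
  end

  # Accepts a single URL or an array of URLs; duplicates collapse in the set.
  def stage(urls)
    @staged.merge(Array(urls))
  end

  # Runs the job body, then hands the staged URLs to `enqueue`.
  # If the body raises, the exception propagates and nothing is enqueued.
  def perform(enqueue)
    yield self
    @staged.each { |url| enqueue.call(url) }
  end
end
```

Deferring the enqueue step until after the body returns means a failed job leaves no follow-up work behind, which matches the "once the job has executed successfully" wording above.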