Class: Mass::Source
- Inherits: BlackStack::Base
  - Object
  - BlackStack::Base
  - Mass::Source
- Defined in: lib/base-line/source.rb
Instance Attribute Summary
- #type ⇒ Object
  Returns the value of attribute type.
Class Method Summary
- .object_name ⇒ Object
Instance Method Summary
- #child_class_instance ⇒ Object
  Creates an instance of the source type using the class defined in the `desc` attribute.
- #class_name_from_source_type ⇒ Object
  Converts the source_type into the Ruby class used to create an instance.
- #do(job:, logger: nil) ⇒ Object
  Returns a hash descriptor of the events found.
- #event_elements(job:) ⇒ Object
  Returns an array of event elements.
- #initialize(h) ⇒ Source
  Constructor. Returns a new instance of Source.
- #normalized_source_url(url:) ⇒ Object
  Returns the same URL in a normalized form, with all GET parameters and trailing slashes removed.
- #show_up_event_elements(job:, event_limit:, max_scrolls:, logger: nil) ⇒ Object
  Scrolls down the page until N event elements are shown.
- #valid_source_params?(params:) ⇒ Boolean
  Raises an exception if the profile `access` is not `:api`.
- #valid_source_url?(url:) ⇒ Boolean
  Raises an exception if the profile `access` is not `:rpa`.
Constructor Details
#initialize(h) ⇒ Source
Returns a new instance of Source.
    # File 'lib/base-line/source.rb', line 9
    def initialize(h)
      super(h)
      self.type = Mass::SourceType.new(h['source_type_desc']).child_class_instance
    end
Instance Attribute Details
#type ⇒ Object
Returns the value of attribute type.
    # File 'lib/base-line/source.rb', line 3
    def type
      @type
    end
Class Method Details
.object_name ⇒ Object
    # File 'lib/base-line/source.rb', line 5
    def self.object_name
      'source'
    end
Instance Method Details
#child_class_instance ⇒ Object
Creates an instance of the source type using the class defined in the `desc` attribute. Overrides the base method.
    # File 'lib/base-line/source.rb', line 23
    def child_class_instance
      source_type = self.desc['source_type']
      key = self.class_name_from_source_type
      raise "Source code of source type #{source_type} not found. Create a class #{key} in the folder `/lib` of your mass-sdk." unless Kernel.const_defined?(key)
      ret = Kernel.const_get(key).new(self.desc)
      return ret
    end
#class_name_from_source_type ⇒ Object
Converts the source_type into the Ruby class used to create an instance. Example: ApolloAPI -> Mass::ApolloAPI
    # File 'lib/base-line/source.rb', line 16
    def class_name_from_source_type
      source_type = self.desc['source_type']
      "Mass::#{source_type}"
    end
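The name-to-class lookup used by `class_name_from_source_type` and `child_class_instance` can be tried in isolation. The following is a minimal sketch, not the SDK's implementation: the `Demo` module, the `Demo::Apollo` class, and the helpers `class_name_for` and `build_source` are hypothetical stand-ins for `Mass` and its source classes.

```ruby
# Hypothetical stand-ins: nothing in this block is part of the real SDK.
module Demo
  class Apollo
    attr_reader :desc

    def initialize(desc)
      @desc = desc
    end
  end
end

# Build the fully qualified class name from the descriptor's source_type.
def class_name_for(desc)
  "Demo::#{desc['source_type']}"
end

# Resolve the class by name and instantiate it, failing loudly when it is missing.
def build_source(desc)
  key = class_name_for(desc)
  raise "Class #{key} not found for source type #{desc['source_type']}." unless Object.const_defined?(key)
  Object.const_get(key).new(desc)
end

source = build_source({ 'source_type' => 'Apollo' })
puts source.class # prints "Demo::Apollo"
```

Checking `const_defined?` before `const_get` lets the caller raise a message that names the missing class instead of surfacing a bare `NameError`.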
#do(job:, logger: nil) ⇒ Object
Returns a hash descriptor of the events found.
Parameters:
- If the profile `access` is `:rpa`, then the `bot_driver` parameter is mandatory.
- If the profile `access` is `:api`, then the `api_key` parameter is mandatory.
- If the profile `access` is `:mta`, raise an exception.
- If the profile `access` is `:rpa`, then the `bot_url` parameter is mandatory, and it must be a valid URL.
- If the profile `access` is `:api`, then the `api_params` parameter is mandatory and it must be a hash.
- The `event_count` is for scrolling down (or performing any other required action) until `event_count` events are found.
Output:
    {
      'status' => :performed, # any other value is an error description.
      'snapshot' => 'https://foo.com/snapshot.png',
      'screenshots' => [
        # array of URLs to screenshots
      ],
      'events' => [
        {
          'url' => 'https://facebook.com/john-doe/posts/12345', # normalized URL of the event
          'title' => 'Join my Facebook Community!',
          'content' => 'My name is John Doe and I invite everyone to join my Facebook Community: facebook.com/groups/john-doe-restaurants!',
          'pictures' => [
            # array of URLs to pictures scraped from the post and uploaded to our DropBox.
          ],
          'lead' => {
            'name' => 'John Doe',
            'url' => 'https://facebook.com/john-doe',
            'headline' => "Founder & CEO at Doe's Restaurants",
            'picture' => 'https://foo.com/john-doe.png'
          }
        }
      ],
    }
    # File 'lib/base-line/source.rb', line 157
    def do(job:, logger:nil)
      # If the profile `access` is `:rpa`, then the `bot_driver` parameter is mandatory.
      #raise "The parameter `bot_driver` is mandatory." if bot_driver.nil? if self.profile_type.desc['access'].to_sym == :rpa

      # If the profile `access` is `:api`, then the `api_key` parameter is mandatory.
      #raise "The parameter `api_key` is mandatory." if api_key.nil? if self.profile_type.desc['access'].to_sym == :api

      # If the profile `access` is `:mta`, raise an exception.
      raise "The method `do` is not allowed for #{self.profile_type.desc['access'].to_s} access." if self.profile_type.desc['access'].to_sym == :mta

      # If the profile `access` is `:rpa`, then the `bot_url` parameter is mandatory, and it must be a valid URL.
      #raise "The parameter `bot_url` is mandatory." if bot_url.nil? if self.profile_type.desc['access'].to_sym == :rpa

      # If the profile `access` is `:api`, then the `api_params` parameter is mandatory and it must be a hash.
      #raise "The parameter `api_params` is mandatory." if api_params.nil? if self.profile_type.desc['access'].to_sym == :api

      # The `event_count` is for scrolling down (or perform any other required action) until finding `event_count` events.
      #raise "The parameter `event_count` must be an integer higher or equal then 0." if !event_count.is_a?(Integer) || event_count < 0

      # return
      return {
        'status' => :performed, # if it is not 'success', then it is an error description.
        'screenshots' => [
          # array of URLs to screenshots
        ],
        # array of URLs to HTML snapshots
        'snapshot_url' => nil,
        'events' => [
          # array of event descriptors
        ],
      }
    end
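A caller typically inspects the hash returned by `do` before using its events. The sketch below shows one way to do that, assuming that any `status` other than `:performed` signals an error; the `SAMPLE_RESULT` constant and the `event_summaries` helper are hypothetical and only mirror the documented output shape.

```ruby
# Hypothetical result, shaped like the hash documented for `do`.
SAMPLE_RESULT = {
  'status' => :performed,
  'screenshots' => [],
  'snapshot_url' => nil,
  'events' => [
    {
      'url' => 'https://facebook.com/john-doe/posts/12345',
      'title' => 'Join my Facebook Community!',
      'lead' => { 'name' => 'John Doe', 'url' => 'https://facebook.com/john-doe' }
    }
  ]
}.freeze

# Assumption: any status other than :performed is an error description.
def event_summaries(result)
  raise "Source failed: #{result['status']}" unless result['status'] == :performed
  result['events'].map { |event| "#{event['lead']['name']}: #{event['title']}" }
end

puts event_summaries(SAMPLE_RESULT)
```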
#event_elements(job:) ⇒ Object
Returns an array of event elements. The base implementation raises an exception; child classes must override it.
    # File 'lib/base-line/source.rb', line 84
    def event_elements(job:)
      raise "The method `event_elements` is not implemented for #{self.class.name}."
    end
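Since the base `event_elements` only raises, each concrete source is expected to supply its own version. A minimal sketch of that pattern follows; `BaseSource` and `ListSource` are hypothetical stand-ins, and the job is assumed to be a plain hash, whereas a real source would query the browser driver.

```ruby
# Hypothetical stand-in for the base class: the method raises until overridden.
class BaseSource
  def event_elements(job:)
    raise "The method `event_elements` is not implemented for #{self.class.name}."
  end
end

# Hypothetical child source that supplies its own implementation.
class ListSource < BaseSource
  # Assumption: the job carries its items under the 'items' key.
  def event_elements(job:)
    job.fetch('items', [])
  end
end

puts ListSource.new.event_elements(job: { 'items' => %w[post-1 post-2] }).size
```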
#normalized_source_url(url:) ⇒ Object
Returns the same URL in a normalized form:
- remove all GET parameters.
- remove all trailing slashes.
If the profile `access` is not `:rpa`, raise an exception. If the `url` is not valid, raise an exception. Return the normalized URL.
Override this method in the child class.
    # File 'lib/base-line/source.rb', line 55
    def normalized_source_url(url:)
      # If the profile `access` is not `:rpa`, raise an exception.
      raise "The method `normalized_source_url` is not allowed for #{self.profile_type.desc['access'].to_s} access." if self.profile_type.desc['access'] != :rpa

      # If the `url` is not valid, raise an exception.
      raise "The URL is not valid." if !self.valid_source_url?(url: url)

      # Return the same URL in a normalized form:
      # - remove all GET parameters.
      # - remove all trailing slashes.
      url = url.gsub(/\?.*$/, '').strip
      url = url.gsub(/\/+$/, '')

      # Return the normalized URL.
      url
    end
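The two normalization steps can be exercised on their own, without a profile or an access check. This sketch reuses the same regular expressions in a hypothetical standalone helper named `normalize_url`; the sample URLs are illustrative.

```ruby
# Hypothetical helper: drops the query string, then any trailing slashes.
def normalize_url(url)
  without_query = url.gsub(/\?.*$/, '').strip
  without_query.gsub(/\/+$/, '')
end

puts normalize_url('https://facebook.com/john-doe/posts/12345/?ref=share')
# prints "https://facebook.com/john-doe/posts/12345"
```

Removing the query string first matters: a URL such as `.../12345/?ref=share` only exposes its trailing slash after the parameters are gone.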
#show_up_event_elements(job:, event_limit:, max_scrolls:, logger: nil) ⇒ Object
Scrolls down the page until N event elements are shown.
    # File 'lib/base-line/source.rb', line 89
    def show_up_event_elements(job:, event_limit:, max_scrolls:, logger:nil)
      l = logger || BlackStack::DummyLogger.new(nil)
      driver = job.profile.driver
      # scroll down
      i = 0
      prev_n_events = 0
      security_height = 150
      lis = self.event_elements(job: job)
      n_events = lis.size
      while (i<max_scrolls || n_events>prev_n_events) && n_events<event_limit
        i += 1
        prev_n_events = n_events
        lis = self.event_elements(job: job)
        n_events = lis.size
        # scroll down the exact height of the viewport
        # reference: https://stackoverflow.com/questions/1248081/how-to-get-the-browser-viewport-dimensions
        l.logs "Scrolling down (#{i.to_s.blue}/#{max_scrolls.to_s.blue} - #{n_events.to_s.blue}/#{event_limit.to_s.blue} events showed up)... "
        step = self.desc['scrolling_step'] + rand(self.desc['scrolling_step_random'].to_i)
        driver.execute_script("window.scrollTo(0, #{i.to_s}*#{step})")
        #driver.execute_script("window.scrollTo(0, #{i.to_s}*(Math.max(document.documentElement.clientHeight || 0, window.innerHeight || 0)-#{security_height}))")
        sleep(5)
        l.logf "done".green
        # screenshot
        l.logs 'Screenshot... '
        job.desc['screenshots'] << job.profile.screenshot if job.profile.desc['allow_browser_to_download_multiple_files']
        l.logf 'done'.green + " (#{job.desc['screenshots'].size.to_s.blue} total)"
      end
    end
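The loop condition in `show_up_event_elements` is easier to reason about without a browser. The sketch below replays the same condition against a hypothetical `FakePage` that reveals a fixed number of events per scroll; logging, sleeping, and screenshots are left out, and `scroll_until` is an illustrative helper, not part of the SDK.

```ruby
# Hypothetical page that reveals `per_scroll` more events on every scroll, up to `total`.
class FakePage
  attr_reader :visible, :scrolls

  def initialize(total:, per_scroll:)
    @total = total
    @per_scroll = per_scroll
    @visible = [per_scroll, total].min
    @scrolls = 0
  end

  def scroll!
    @scrolls += 1
    @visible = [@visible + @per_scroll, @total].min
  end
end

# Same condition and ordering as the method above: count events, then scroll.
# Returns the number of scrolls performed.
def scroll_until(page, event_limit:, max_scrolls:)
  i = 0
  prev_n_events = 0
  n_events = page.visible
  while (i < max_scrolls || n_events > prev_n_events) && n_events < event_limit
    i += 1
    prev_n_events = n_events
    n_events = page.visible
    page.scroll!
  end
  page.scrolls
end

puts scroll_until(FakePage.new(total: 10, per_scroll: 3), event_limit: 8, max_scrolls: 5)
```

Because the two stop conditions are joined with `||`, scrolling continues past `max_scrolls` for as long as each pass still reveals new events; only `event_limit` stops it unconditionally.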
#valid_source_params?(params:) ⇒ Boolean
If the profile `access` is not `:api`, raise an exception. The parameter `params` must be a hash. Returns `true` if the `params` are valid and `false` otherwise.
    # File 'lib/base-line/source.rb', line 73
    def valid_source_params?(params:)
      # If the profile `access` is not `:api`, raise an exception.
      raise "The method `valid_source_params?` is not allowed for #{self.profile_type.desc['access'].to_s} access." if self.profile_type.desc['access'] != :api

      # Parameter `params` must be a hash.
      raise "The parameter `params` must be a hash." if !params.is_a?(Hash)

      # Return `true` if the `params` are valid.
      # Return `false` if the `params` are not valid.
      true
    end
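The base method accepts any hash, so a child class would normally add its own checks. The following sketch shows one possible check under stated assumptions: the required keys `keyword` and `country`, the `REQUIRED_KEYS` constant, and the `valid_params?` helper are all hypothetical and not taken from the SDK.

```ruby
# Hypothetical list of keys an API-based source might require.
REQUIRED_KEYS = %w[keyword country].freeze

# Raises for non-hash input, mirroring the base method; otherwise returns
# true only when every required key is present with a non-blank value.
def valid_params?(params)
  raise 'The parameter `params` must be a hash.' unless params.is_a?(Hash)
  REQUIRED_KEYS.all? { |key| params.key?(key) && !params[key].to_s.strip.empty? }
end

puts valid_params?({ 'keyword' => 'pizza', 'country' => 'US' })
```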
#valid_source_url?(url:) ⇒ Boolean
If the profile `access` is not `:rpa`, raise an exception. Returns `true` if the `url` is valid and `false` otherwise.
Override this method in the child class.
    # File 'lib/base-line/source.rb', line 37
    def valid_source_url?(url:)
      # If the profile `access` is not `:rpa`, raise an exception.
      raise "The method `valid_source_url?` is not allowed for #{self.profile_type.desc['access'].to_s} access." if self.profile_type.desc['access'] != :rpa

      # Return `true` if the `url` is valid.
      # Return `false` if the `url` is not valid.
      true
    end
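The base method always returns `true`, leaving the real check to child classes. One way such an override might validate a URL is sketched below with Ruby's standard `uri` library; the `valid_url?` helper and the `facebook.com` default host are assumptions chosen for illustration.

```ruby
require 'uri'

# Hypothetical check: accepts only http(s) URLs on the given host or its subdomains.
def valid_url?(url, host: 'facebook.com')
  uri = URI.parse(url.to_s.strip)
  return false unless uri.is_a?(URI::HTTP) && uri.host
  uri.host == host || uri.host.end_with?(".#{host}")
rescue URI::InvalidURIError
  false
end

puts valid_url?('https://www.facebook.com/john-doe')
```

Matching on the parsed host rather than on a substring avoids accepting look-alike domains that merely contain the expected name.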