Class: CloudCrowd::Action
- Inherits: Object (CloudCrowd::Action < Object)
- Defined in: lib/cloud_crowd/action.rb
Overview
As you write your custom actions, have them inherit from CloudCrowd::Action. All actions must implement a process method, which should return a JSON-serializable object that will be used as the output for the work unit. See the default actions for examples.
Optionally, actions may define split and merge methods to do mapping and reducing around the input. split should return an array of URLs, to be mapped into WorkUnits and processed in parallel. In the merge step, input will be an array of all the resulting outputs from calling process.
All actions have use of an individual work_directory for scratch files, and spend their duration inside of it, so relative paths work well.
Note that Actions inherit a backticks (`) method that raises an Exception if the external command fails.
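To make the split / process / merge contract concrete, here is a minimal sketch. WordCount is a hypothetical action, and the small CloudCrowd::Action class at the top is only a stand-in (an assumption, not the gem's implementation) so the example runs without the gem; it copies the documented constructor signature and treats a single input as an already-local path.

```ruby
# Stand-in for CloudCrowd::Action so the sketch runs without the gem.
# Assumption: it mirrors only the documented constructor signature and a
# few readers; the real class also downloads input and manages storage.
module CloudCrowd
  class Action
    attr_reader :input, :input_path, :options

    def initialize(status, input, options, store)
      @status, @input, @options, @store = status, input, options, store
      # Assumption for this sketch: a single input is already a local path.
      @input_path = input if input.is_a?(String)
    end

    def process
      raise NotImplementedError, "override 'process'"
    end
  end
end

# Hypothetical action that counts words across several text files.
class WordCount < CloudCrowd::Action
  # split: turn a newline-separated list into an array of inputs
  # (URLs in a real deployment), one per work unit.
  def split
    input.lines.map(&:strip).reject(&:empty?)
  end

  # process: return a JSON-serializable value for one work unit.
  def process
    File.read(input_path).split.size
  end

  # merge: input is now the array of outputs from every process call.
  def merge
    { 'total_words' => input.sum }
  end
end
```

When the real gem is loaded, the stand-in class should be dropped and only the subclass kept.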
Constant Summary
- FILE_URL = /\Afile:\/\//
Instance Attribute Summary
- #file_name ⇒ Object (readonly)
  Returns the value of attribute file_name.
- #input ⇒ Object (readonly)
  Returns the value of attribute input.
- #input_path ⇒ Object (readonly)
  Returns the value of attribute input_path.
- #options ⇒ Object (readonly)
  Returns the value of attribute options.
- #work_directory ⇒ Object (readonly)
  Returns the value of attribute work_directory.
Instance Method Summary
- #`(command) ⇒ Object
  Actions have a backticks command that raises a CommandFailed exception on failure, so that processing doesn’t just blithely continue.
- #cleanup_work_directory ⇒ Object
  After the Action has finished, we remove the work directory and return to the root directory (where workers run by default).
- #download(url, path) ⇒ Object
  Download a file to the specified path.
- #initialize(status, input, options, store) ⇒ Action (constructor)
  Initializing an Action sets up all of the read-only variables that form the bulk of the API for action subclasses.
- #process ⇒ Object
  Each Action subclass must implement a process method, overriding this.
- #save(file_path) ⇒ Object
  Takes a local filesystem path, saves the file to S3, and returns the public (or authenticated) url on S3 where the file can be accessed.
Constructor Details
#initialize(status, input, options, store) ⇒ Action
Initializing an Action sets up all of the read-only variables that form the bulk of the API for action subclasses (paths to read from and write to). It creates the work_directory and moves into it. If we’re not merging multiple results, it downloads the input file into the work_directory before starting.
# File 'lib/cloud_crowd/action.rb', line 29
def initialize(status, input, options, store)
  @input, @options, @store = input, options, store
  @job_id, @work_unit_id = options['job_id'], options['work_unit_id']
  @work_directory = File.expand_path(File.join(@store.temp_storage_path, storage_prefix))
  FileUtils.mkdir_p(@work_directory) unless File.exists?(@work_directory)
  parse_input
  download_input
end
Instance Attribute Details
#file_name ⇒ Object (readonly)
Returns the value of attribute file_name.
# File 'lib/cloud_crowd/action.rb', line 22
def file_name
  @file_name
end
#input ⇒ Object (readonly)
Returns the value of attribute input.
# File 'lib/cloud_crowd/action.rb', line 22
def input
  @input
end
#input_path ⇒ Object (readonly)
Returns the value of attribute input_path.
# File 'lib/cloud_crowd/action.rb', line 22
def input_path
  @input_path
end
#options ⇒ Object (readonly)
Returns the value of attribute options.
# File 'lib/cloud_crowd/action.rb', line 22
def options
  @options
end
#work_directory ⇒ Object (readonly)
Returns the value of attribute work_directory.
# File 'lib/cloud_crowd/action.rb', line 22
def work_directory
  @work_directory
end
Instance Method Details
#`(command) ⇒ Object
Actions have a backticks command that raises a CommandFailed exception on failure, so that processing doesn’t just blithely continue.
# File 'lib/cloud_crowd/action.rb', line 75
def `(command)
  result = super(command)
  exit_code = $?.to_i
  raise Error::CommandFailed.new(result, exit_code) unless exit_code == 0
  result
end
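The same pattern can be tried outside CloudCrowd. The sketch below overrides the backtick method on a plain class so that a failing command raises. CommandFailed and ShellStep are hypothetical names introduced for illustration, and the sketch reads the exit code with $?.exitstatus rather than $?.to_i, which is a choice made here and not the library's. It assumes a POSIX shell is available.

```ruby
# Hypothetical error type standing in for CloudCrowd's Error::CommandFailed.
class CommandFailed < StandardError
  attr_reader :exit_code

  def initialize(output, exit_code)
    @exit_code = exit_code
    super("command exited with #{exit_code}: #{output}")
  end
end

# Hypothetical class whose backtick literals raise on failure.
class ShellStep
  # Override Kernel#` so a non-zero exit raises instead of passing silently.
  def `(command)
    result = super(command)
    raise CommandFailed.new(result, $?.exitstatus) unless $?.success?
    result
  end

  # A command that succeeds: its output is returned as usual.
  def greet
    `echo hello`.strip
  end

  # A command that fails: the override raises CommandFailed.
  def fail_loudly
    `sh -c "exit 3"`
  end
end
```

Inside instances of such a class, every backtick literal goes through the override, so a failed external tool stops the work unit instead of producing an empty or partial result.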
#cleanup_work_directory ⇒ Object
After the Action has finished, we remove the work directory and return to the root directory (where workers run by default).
# File 'lib/cloud_crowd/action.rb', line 69
def cleanup_work_directory
  FileUtils.rm_r(@work_directory) if File.exists?(@work_directory)
end
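The work directory lifecycle (create it, work inside it with relative paths, remove it afterwards) can be sketched in isolation. with_work_directory is a hypothetical helper, not part of CloudCrowd, and the root and prefix arguments stand in for the store's temp path and the storage prefix.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: create a scratch directory, run the block inside it,
# and remove the directory afterwards even if the block raises.
def with_work_directory(root, prefix)
  work_directory = File.expand_path(File.join(root, prefix))
  FileUtils.mkdir_p(work_directory)
  # Relative paths inside the block resolve against work_directory.
  Dir.chdir(work_directory) { yield work_directory }
ensure
  FileUtils.rm_r(work_directory) if work_directory && File.exist?(work_directory)
end

# Usage: write a scratch file by relative path, then let cleanup remove it.
Dir.mktmpdir do |root|
  with_work_directory(root, 'job_1/unit_2') do |dir|
    File.write('scratch.txt', 'intermediate data')
    puts File.exist?(File.join(dir, 'scratch.txt'))
  end
end
```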
#download(url, path) ⇒ Object
Download a file to the specified path.
# File 'lib/cloud_crowd/action.rb', line 44
def download(url, path)
  `curl -s "#{url}" > "#{path}"`
  return path
  # The previous implementation is below, and, although it would be
  # wonderful not to shell out, RestClient wasn't handling URLs with encoded
  # entities (%20, for example), and doesn't let you download to a given
  # location. Getting a RestClient patch in would be ideal.
  #
  # if url.match(FILE_URL)
  #   FileUtils.cp(url.sub(FILE_URL, ''), path)
  # else
  #   resp = RestClient::Request.execute(:url => url, :method => :get, :raw_response => true)
  #   FileUtils.mv resp.file.path, path
  # end
end
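The commented-out branch relies on FILE_URL to detect local inputs. As a small illustration of that check only, the sketch below copies file:// inputs directly and reports whether it handled the URL. copy_if_local is a hypothetical helper, not part of CloudCrowd, and the network branch is deliberately left out.

```ruby
require 'fileutils'

# Same pattern as the FILE_URL constant documented for CloudCrowd::Action.
FILE_URL = /\Afile:\/\//

# Hypothetical helper: copy file:// inputs directly, and return false for
# anything else so the caller can fall back to a network download.
def copy_if_local(url, path)
  return false unless url.match?(FILE_URL)
  # Strip the scheme to get the filesystem path, then copy it into place.
  FileUtils.cp(url.sub(FILE_URL, ''), path)
  true
end
```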
#process ⇒ Object
Each Action subclass must implement a process method, overriding this.
# File 'lib/cloud_crowd/action.rb', line 39
def process
  raise NotImplementedError, "CloudCrowd::Actions must override 'process' with their own processing code."
end
#save(file_path) ⇒ Object
Takes a local filesystem path, saves the file to S3, and returns the public (or authenticated) url on S3 where the file can be accessed.
# File 'lib/cloud_crowd/action.rb', line 62
def save(file_path)
  save_path = File.join(storage_prefix, File.basename(file_path))
  @store.save(file_path, save_path)
end
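To show how the save path is built, here is a sketch with a stand-in store. LocalStore and save_with_prefix are hypothetical: the store copies files under a local root and returns a file:// URL instead of talking to S3, and the storage prefix is passed in explicitly because its real derivation is not shown in this page.

```ruby
require 'fileutils'

# Hypothetical store: copies files under a root directory and returns a
# file:// URL. It only imitates the save(local_path, save_path) call shape.
class LocalStore
  def initialize(root)
    @root = root
  end

  def save(local_path, save_path)
    target = File.join(@root, save_path)
    FileUtils.mkdir_p(File.dirname(target))
    FileUtils.cp(local_path, target)
    "file://#{target}"
  end
end

# Mirrors the shape of Action#save: the stored path is the storage prefix
# joined with the file's basename, so local directories are not preserved.
def save_with_prefix(store, storage_prefix, file_path)
  save_path = File.join(storage_prefix, File.basename(file_path))
  store.save(file_path, save_path)
end
```

A process method would typically end by returning the value of save, so that the stored URL becomes the work unit's output.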