Class: CDMDEXER::ETLWorker
- Inherits: Object
- Extended by: Forwardable
- Includes: Sidekiq::Worker
- Defined in: lib/cdmdexer/etl_worker.rb
Overview
Extracts records from OAI, deletes records marked for deletion, and sends everything else to a transformation / load worker.
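The split described in the overview can be illustrated with a small sketch. It assumes a hypothetical record shape (a `Struct` with `id` and `status`) and relies on the OAI-PMH convention that deleted records carry a header status of `deleted`. The `HarvestedRecord` struct and the `split_batch` helper are illustrative names, not part of CDMDEXER.

```ruby
# Hypothetical record shape; CDMDEXER's real extractor output may differ.
HarvestedRecord = Struct.new(:id, :status)

# Separate records flagged as deleted from records that should be
# transformed and loaded. Returns the ids for each group.
def split_batch(records)
  deletable, updatable = records.partition { |record| record.status == 'deleted' }
  { delete_ids: deletable.map(&:id), transform_ids: updatable.map(&:id) }
end

batch = [
  HarvestedRecord.new('coll:1', 'deleted'),
  HarvestedRecord.new('coll:2', nil)
]
p split_batch(batch)
```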
Instance Attribute Summary
- #batch_size ⇒ Object (readonly)
  Returns the value of attribute batch_size.
- #cdm_endpoint ⇒ Object (readonly)
  Returns the value of attribute cdm_endpoint.
- #completed_callback_klass ⇒ Object
  Because Sidekiq serializes params to JSON, we provide custom setters for dependencies (normally these would be default params in the constructor) so that they may be mocked and tested.
- #config ⇒ Object (readonly)
  Returns the value of attribute config.
- #etl_worker_klass ⇒ Object
- #field_mappings ⇒ Object (readonly)
  Returns the value of attribute field_mappings.
- #is_recursive ⇒ Object (readonly)
  Returns the value of attribute is_recursive.
- #load_worker_klass ⇒ Object
- #oai_endpoint ⇒ Object (readonly)
  Returns the value of attribute oai_endpoint.
- #oai_request_klass ⇒ Object
- #resumption_token ⇒ Object (readonly)
  Returns the value of attribute resumption_token.
- #solr_config ⇒ Object (readonly)
  Returns the value of attribute solr_config.
- #transform_worker_klass ⇒ Object
Instance Method Summary
- #perform(config) ⇒ Object
- #run_next_batch! ⇒ Object
  Recurse through OAI batches one at a time.
Instance Attribute Details
#batch_size ⇒ Object (readonly)
Returns the value of attribute batch_size.
    # File 'lib/cdmdexer/etl_worker.rb', line 14

    def batch_size
      @batch_size
    end
#cdm_endpoint ⇒ Object (readonly)
Returns the value of attribute cdm_endpoint.
    # File 'lib/cdmdexer/etl_worker.rb', line 14

    def cdm_endpoint
      @cdm_endpoint
    end
#completed_callback_klass ⇒ Object
Because Sidekiq serializes params to JSON, we provide custom setters for dependencies (normally these would be default params in the constructor) so that they may be mocked and tested.
    # File 'lib/cdmdexer/etl_worker.rb', line 55

    def completed_callback_klass
      @completed_callback_klass ||= CDMDEXER::CompletedCallback
    end
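The setter-based dependency pattern described above can be sketched without Sidekiq. This is a minimal illustration, not the library's code: `SketchWorker`, `DefaultCallback`, `RecordingCallback`, and the `finish` method are hypothetical names, and the use of `attr_writer` for the setter is an assumption.

```ruby
# Hypothetical stand-in for the default collaborator class.
class DefaultCallback
  def self.call!(config)
    "completed #{config['oai_endpoint']}"
  end
end

# Hypothetical test double that records every call it receives.
class RecordingCallback
  def self.calls
    @calls ||= []
  end

  def self.call!(config)
    calls << config
    :recorded
  end
end

# Only the dependency plumbing is shown; there is no real job logic here.
class SketchWorker
  # Setter that lets a test swap in a double (assumed to be attr_writer).
  attr_writer :completed_callback_klass

  # Lazy default: used whenever no replacement has been assigned.
  def completed_callback_klass
    @completed_callback_klass ||= DefaultCallback
  end

  def finish(config)
    completed_callback_klass.call!(config)
  end
end

worker = SketchWorker.new
worker.completed_callback_klass = RecordingCallback
worker.finish({ 'oai_endpoint' => 'https://example.org/oai' })
```

Because job arguments must survive a JSON round trip, a class object cannot be passed to `perform` directly; assigning it on the worker instance before calling `perform` in a test avoids that limitation.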
#config ⇒ Object (readonly)
Returns the value of attribute config.
    # File 'lib/cdmdexer/etl_worker.rb', line 14

    def config
      @config
    end
#etl_worker_klass ⇒ Object
    # File 'lib/cdmdexer/etl_worker.rb', line 59

    def etl_worker_klass
      @etl_worker_klass ||= ETLWorker
    end
#field_mappings ⇒ Object (readonly)
Returns the value of attribute field_mappings.
    # File 'lib/cdmdexer/etl_worker.rb', line 14

    def field_mappings
      @field_mappings
    end
#is_recursive ⇒ Object (readonly)
Returns the value of attribute is_recursive.
    # File 'lib/cdmdexer/etl_worker.rb', line 14

    def is_recursive
      @is_recursive
    end
#load_worker_klass ⇒ Object
    # File 'lib/cdmdexer/etl_worker.rb', line 67

    def load_worker_klass
      @load_worker_klass ||= LoadWorker
    end
#oai_endpoint ⇒ Object (readonly)
Returns the value of attribute oai_endpoint.
    # File 'lib/cdmdexer/etl_worker.rb', line 14

    def oai_endpoint
      @oai_endpoint
    end
#oai_request_klass ⇒ Object
    # File 'lib/cdmdexer/etl_worker.rb', line 63

    def oai_request_klass
      @oai_request_klass ||= OaiRequest
    end
#resumption_token ⇒ Object (readonly)
Returns the value of attribute resumption_token.
    # File 'lib/cdmdexer/etl_worker.rb', line 14

    def resumption_token
      @resumption_token
    end
#solr_config ⇒ Object (readonly)
Returns the value of attribute solr_config.
    # File 'lib/cdmdexer/etl_worker.rb', line 14

    def solr_config
      @solr_config
    end
#transform_worker_klass ⇒ Object
    # File 'lib/cdmdexer/etl_worker.rb', line 71

    def transform_worker_klass
      @transform_worker_klass ||= TransformWorker
    end
Instance Method Details
#perform(config) ⇒ Object
    # File 'lib/cdmdexer/etl_worker.rb', line 29

    def perform(config)
      # Sidekiq stores params in JSON, so we can't inject dependencies. This
      # results in the long set of arguments that follows. Otherwise, we'd
      # simply inject the OAI request and extractor objects
      @config = config
      @solr_config = config.fetch('solr_config').symbolize_keys
      @cdm_endpoint = config.fetch('cdm_endpoint')
      @oai_endpoint = config.fetch('oai_endpoint')
      @field_mappings = config.fetch('field_mappings', false)
      @resumption_token = config.fetch('resumption_token', nil)
      @batch_size = config.fetch('batch_size', 5).to_i
      @is_recursive = config.fetch('is_recursive', true)

      @oai_request = oai_request_klass.new(
        endpoint_url: oai_endpoint,
        resumption_token: resumption_token,
        set_spec: config.fetch('set_spec', nil)
      )

      run_batch!
      run_next_batch!
    end
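As a rough illustration of the configuration that `perform` reads, the sketch below applies the same `fetch` calls and defaults to a plain hash. The `extract_settings` helper and all URLs are hypothetical placeholders. `transform_keys(&:to_sym)` is used here as a plain-Ruby stand-in for ActiveSupport's `symbolize_keys`.

```ruby
require 'json'

# Reads the same keys that #perform reads. Required keys raise KeyError
# when missing; optional keys fall back to the defaults shown in #perform.
def extract_settings(config)
  {
    solr_config: config.fetch('solr_config').transform_keys(&:to_sym),
    cdm_endpoint: config.fetch('cdm_endpoint'),
    oai_endpoint: config.fetch('oai_endpoint'),
    field_mappings: config.fetch('field_mappings', false),
    resumption_token: config.fetch('resumption_token', nil),
    batch_size: config.fetch('batch_size', 5).to_i,
    is_recursive: config.fetch('is_recursive', true)
  }
end

# Placeholder values; none of these URLs come from the library.
job_args = {
  'solr_config' => { 'url' => 'http://localhost:8983/solr/example' },
  'cdm_endpoint' => 'https://cdm.example.org/dmwebservices/index.php',
  'oai_endpoint' => 'https://cdm.example.org/oai/oai.php',
  'batch_size' => '10'
}

# Round-trip through JSON to mimic what a worker receives from the queue:
# string keys only, no symbols and no Ruby objects.
settings = extract_settings(JSON.parse(JSON.generate(job_args)))
p settings
```

Note that the keys must be strings: a hash built with symbol keys only works after it has passed through JSON serialization, which is one reason to test with a round trip like the one above.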
#run_next_batch! ⇒ Object
Recurse through OAI batches one at a time.
    # File 'lib/cdmdexer/etl_worker.rb', line 76

    def run_next_batch!
      if next_resumption_token && is_recursive
        etl_worker_klass.perform_async(next_config)
      else
        completed_callback_klass.call!(config)
      end
    end
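The branching in `run_next_batch!` can be exercised with stand-in classes. In this sketch, `FakeETLWorker` and `FakeCallback` are hypothetical doubles that record calls instead of talking to Redis, and the top-level `run_next_batch` method mirrors the branch above. The source does not show `next_config`; merging the next resumption token into the current config is an assumption made for illustration.

```ruby
# Hypothetical double for a Sidekiq worker class: perform_async records
# the job arguments instead of enqueuing anything.
class FakeETLWorker
  def self.jobs
    @jobs ||= []
  end

  def self.perform_async(config)
    jobs << config
  end
end

# Hypothetical double for the completion callback.
class FakeCallback
  def self.completed
    @completed ||= []
  end

  def self.call!(config)
    completed << config
  end
end

# Mirrors the branch in run_next_batch!: enqueue another job while a
# resumption token remains and recursion is on, otherwise signal completion.
def run_next_batch(config, next_resumption_token, is_recursive: true)
  if next_resumption_token && is_recursive
    # Assumed shape of next_config: same config plus the new token.
    FakeETLWorker.perform_async(config.merge('resumption_token' => next_resumption_token))
  else
    FakeCallback.call!(config)
  end
end

config = { 'oai_endpoint' => 'https://cdm.example.org/oai/oai.php' }
run_next_batch(config, 'token-2') # another batch remains: enqueue a job
run_next_batch(config, nil)       # no token left: fire the callback
```

Each job handles a single batch and then schedules its successor, so the "recursion" happens through the queue rather than the call stack.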