Class: Traject::Indexer::Context
- Inherits: Object (hierarchy: Object, Traject::Indexer::Context)
- Defined in: lib/traject/indexer/context.rb
Instance Attribute Summary
- #clipboard ⇒ Object: Returns the value of attribute clipboard.
- #index_step ⇒ Object: Returns the value of attribute index_step.
- #input_name ⇒ Object: When there are multiple inputs, input_name names the current one and position_in_input gives the position of the record within it. Both may be blank when unknown.
- #logger ⇒ Object: Returns the value of attribute logger.
- #output_hash ⇒ Object: Returns the value of attribute output_hash.
- #position ⇒ Object: 'position' is a 1-based position in the stream of processed records.
- #position_in_input ⇒ Object: When there are multiple inputs, input_name names the current one and position_in_input gives the position of the record within it. Both may be blank when unknown.
- #settings ⇒ Object: Returns the value of attribute settings.
- #skipmessage ⇒ Object: Message explaining why this record is being skipped, as set by #skip!.
- #source_record ⇒ Object: Returns the value of attribute source_record.
- #source_record_id_proc ⇒ Object: Returns the value of attribute source_record_id_proc.
Instance Method Summary
- #add_output(field_name, *values) ⇒ Traject::Context: Add values to an array in context.output_hash with the specified key/field_name(s).
- #initialize(hash_init = {}) ⇒ Context (constructor): A new instance of Context.
- #record_inspect ⇒ Object: A string label that can be used to refer to a particular record in log messages and exceptions.
- #skip!(msg = '(no message given)') ⇒ Object: Mark this record to be skipped, with an optional message.
- #skip? ⇒ Boolean: Should we skip this record?
- #source_record_id ⇒ Object: Useful for describing a record in a log or especially error message.
Constructor Details
#initialize(hash_init = {}) ⇒ Context
Returns a new instance of Context.
# File 'lib/traject/indexer/context.rb', line 8
def initialize(hash_init = {})
  # TODO, argument checking for required args?

  self.clipboard = {}
  self.output_hash = {}

  hash_init.each_pair do |key, value|
    self.send("#{key}=", value)
  end

  @skip = false
end
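The constructor above assigns each hash_init entry through the matching writer method. A minimal sketch of that behavior follows, using a hypothetical stand-in class (ConstructorSketch, with an arbitrary subset of the attributes) rather than the real Traject::Indexer::Context:

```ruby
# Hypothetical stand-in that mirrors the constructor listing above.
# It is NOT the real Traject::Indexer::Context.
class ConstructorSketch
  attr_accessor :clipboard, :output_hash, :settings, :source_record, :position

  def initialize(hash_init = {})
    self.clipboard = {}
    self.output_hash = {}

    # Every key must have a matching writer, e.g. :position -> position=
    hash_init.each_pair do |key, value|
      send("#{key}=", value)
    end

    @skip = false
  end
end

context = ConstructorSketch.new(position: 1, settings: { "some_setting" => "value" })
puts context.position          # 1
puts context.clipboard.inspect # {}

# A key with no writer method raises NoMethodError, a consequence of send.
begin
  ConstructorSketch.new(no_such_attribute: 1)
rescue NoMethodError => e
  puts "rejected: #{e.class}"
end
```

Because assignment goes through send, a misspelled key fails at construction time rather than being silently ignored.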
Instance Attribute Details
#clipboard ⇒ Object
Returns the value of attribute clipboard.
# File 'lib/traject/indexer/context.rb', line 21
def clipboard
  @clipboard
end
#index_step ⇒ Object
Returns the value of attribute index_step.
# File 'lib/traject/indexer/context.rb', line 22
def index_step
  @index_step
end
#input_name ⇒ Object
When there are multiple inputs, input_name names the current one and position_in_input gives the position of the record within it. Both may be blank when unknown.
# File 'lib/traject/indexer/context.rb', line 28
def input_name
  @input_name
end
#logger ⇒ Object
Returns the value of attribute logger.
# File 'lib/traject/indexer/context.rb', line 21
def logger
  @logger
end
#output_hash ⇒ Object
Returns the value of attribute output_hash.
# File 'lib/traject/indexer/context.rb', line 21
def output_hash
  @output_hash
end
#position ⇒ Object
'position' is a 1-based position in the stream of processed records.
# File 'lib/traject/indexer/context.rb', line 24
def position
  @position
end
#position_in_input ⇒ Object
When there are multiple inputs, input_name names the current one and position_in_input gives the position of the record within it. Both may be blank when unknown.
# File 'lib/traject/indexer/context.rb', line 28
def position_in_input
  @position_in_input
end
#settings ⇒ Object
Returns the value of attribute settings.
# File 'lib/traject/indexer/context.rb', line 22
def settings
  @settings
end
#skipmessage ⇒ Object
Message explaining why this record is being skipped, as set by #skip!.
# File 'lib/traject/indexer/context.rb', line 31
def skipmessage
  @skipmessage
end
#source_record ⇒ Object
Returns the value of attribute source_record.
# File 'lib/traject/indexer/context.rb', line 22
def source_record
  @source_record
end
#source_record_id_proc ⇒ Object
Returns the value of attribute source_record_id_proc.
# File 'lib/traject/indexer/context.rb', line 22
def source_record_id_proc
  @source_record_id_proc
end
Instance Method Details
#add_output(field_name, *values) ⇒ Traject::Context
Add values to an array in context.output_hash with the specified key/field_name(s). Creates array in output_hash if currently nil.
Post-processing/filtering:
- Removes duplicate values from the accumulator, unless settings["allow_duplicate_values"] is set.
- Removes nil values unless settings["allow_nil_values"] is set.
- Will not add an empty array to output_hash (will leave it nil instead) unless settings["allow_empty_fields"] is set.
Multiple values can be added by passing multiple arguments. An array argument is not treated as multiple values, to accommodate odd use cases where the array itself is the desired value in output_hash.
Note: for historical reasons the relevant settings key names are held in constants in Traject::Indexer::ToFieldStep, but the settings do not apply only to ToFieldSteps.
# File 'lib/traject/indexer/context.rb', line 117
def add_output(field_name, *values)
  values.compact! unless self.settings && self.settings[Traject::Indexer::ToFieldStep::ALLOW_NIL_VALUES]

  return self if values.empty? and not (self.settings && self.settings[Traject::Indexer::ToFieldStep::ALLOW_EMPTY_FIELDS])

  Array(field_name).each do |key|
    accumulator = (self.output_hash[key.to_s] ||= [])
    accumulator.concat values
    accumulator.uniq! unless self.settings && self.settings[Traject::Indexer::ToFieldStep::ALLOW_DUPLICATE_VALUES]
  end

  return self
end
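The filtering rules described for add_output can be tried out with the sketch below. AddOutputSketch is a hypothetical stand-in, not the real class, and its three constant values are assumptions taken from the setting names quoted in the description above.

```ruby
# Hypothetical stand-in reproducing only add_output from the listing above.
class AddOutputSketch
  # Assumed values, based on the setting names in the method description.
  ALLOW_NIL_VALUES       = "allow_nil_values"
  ALLOW_EMPTY_FIELDS     = "allow_empty_fields"
  ALLOW_DUPLICATE_VALUES = "allow_duplicate_values"

  attr_accessor :output_hash, :settings

  def initialize(settings = {})
    self.output_hash = {}
    self.settings = settings
  end

  def add_output(field_name, *values)
    values.compact! unless settings && settings[ALLOW_NIL_VALUES]
    return self if values.empty? && !(settings && settings[ALLOW_EMPTY_FIELDS])

    Array(field_name).each do |key|
      accumulator = (output_hash[key.to_s] ||= [])
      accumulator.concat values
      accumulator.uniq! unless settings && settings[ALLOW_DUPLICATE_VALUES]
    end
    self
  end
end

context = AddOutputSketch.new
context.add_output(:title, "Moby Dick", nil, "Moby Dick") # nil dropped, duplicate removed
context.add_output(["author", "creator"], "Melville")     # same value stored under two keys
context.add_output(:subject)                              # no values, so no key is created
context.add_output(:pair, ["a", "b"])                     # an array argument is kept as one value

p context.output_hash["title"]        # ["Moby Dick"]
p context.output_hash["creator"]      # ["Melville"]
p context.output_hash.key?("subject") # false
p context.output_hash["pair"]         # [["a", "b"]]

# With duplicates allowed, repeated values are kept.
lenient = AddOutputSketch.new({ "allow_duplicate_values" => true })
lenient.add_output(:title, "a", "a")
p lenient.output_hash["title"]        # ["a", "a"]
```

Since the method returns the context itself, calls can also be chained, for example context.add_output(:a, 1).add_output(:b, 2).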
#record_inspect ⇒ Object
A string label that can be used to refer to a particular record in log messages and exceptions. Which parts it includes depends on what information is available.
# File 'lib/traject/indexer/context.rb', line 59
def record_inspect
  str = "<"

  str << "record ##{position}" if position

  if input_name && position_in_input
    str << " (#{input_name} ##{position_in_input}), "
  elsif position
    str << ", "
  end

  if source_id = source_record_id
    str << "source_id:#{source_id} "
  end

  if output_id = self.output_hash["id"]
    str << "output_id:#{[output_id].join(',')}"
  end

  str.chomp!(" ")
  str.chomp!(",")
  str << ">"

  str
end
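To see which labels record_inspect produces for different combinations of attributes, here is a sketch built on a hypothetical stand-in class (InspectSketch). The file name, ids, and the id-extracting lambda are made-up illustration values.

```ruby
# Hypothetical stand-in carrying only what record_inspect needs.
class InspectSketch
  attr_accessor :position, :input_name, :position_in_input,
                :output_hash, :source_record, :source_record_id_proc

  def initialize(attrs = {})
    self.output_hash = {}
    attrs.each_pair { |key, value| send("#{key}=", value) }
  end

  def source_record_id
    source_record_id_proc && source_record_id_proc.call(source_record)
  end

  def record_inspect
    str = +"<" # unary plus keeps the string mutable under frozen_string_literal
    str << "record ##{position}" if position

    if input_name && position_in_input
      str << " (#{input_name} ##{position_in_input}), "
    elsif position
      str << ", "
    end

    if (source_id = source_record_id)
      str << "source_id:#{source_id} "
    end
    if (output_id = output_hash["id"])
      str << "output_id:#{[output_id].join(',')}"
    end

    # Trim the trailing separator left by whichever part came last.
    str.chomp!(" ")
    str.chomp!(",")
    str << ">"
    str
  end
end

full = InspectSketch.new(
  position: 3,
  input_name: "batch1.mrc",
  position_in_input: 2,
  source_record: { "id" => "rec1" },
  source_record_id_proc: ->(record) { record["id"] },
  output_hash: { "id" => ["out1"] }
)
puts full.record_inspect
# <record #3 (batch1.mrc #2), source_id:rec1 output_id:out1>

puts InspectSketch.new(position: 5).record_inspect
# <record #5>
```

The two chomp! calls are what keep the position-only case from ending in a dangling ", ".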
#skip!(msg = '(no message given)') ⇒ Object
Mark this record to be skipped, with an optional message.
# File 'lib/traject/indexer/context.rb', line 35
def skip!(msg = '(no message given)')
  @skipmessage = msg
  @skip = true
end
#skip? ⇒ Boolean
Should we skip this record?
# File 'lib/traject/indexer/context.rb', line 41
def skip?
  @skip
end
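How skip!, skip? and skipmessage work together can be sketched with a hypothetical stand-in class (SkipSketch) that copies only those three members from the listings above:

```ruby
# Hypothetical stand-in with only the skip-related members.
class SkipSketch
  attr_accessor :skipmessage

  def initialize
    @skip = false
  end

  def skip!(msg = '(no message given)')
    @skipmessage = msg
    @skip = true
  end

  def skip?
    @skip
  end
end

context = SkipSketch.new
puts context.skip?        # false

context.skip!("missing title")
puts context.skip?        # true
puts context.skipmessage  # missing title

# Without an argument, the default message is recorded.
other = SkipSketch.new
other.skip!
puts other.skipmessage    # (no message given)
```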
#source_record_id ⇒ Object
Useful for describing a record in a log or especially error message. May be useful to combine with #position in output messages, especially since this method may sometimes return empty string if info on record id is not available.
Returns id from source_record (if we can get it from a source_record_id_proc), then a slash, then output_hash["id"] -- if both are present. Otherwise may return just one, or even an empty string.
# File 'lib/traject/indexer/context.rb', line 53
def source_record_id
  source_record_id_proc && source_record_id_proc.call(source_record)
end
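A small sketch of the listing above, using a hypothetical stand-in class (SourceIdSketch) and a hypothetical Struct-based record. It exercises only the code as shown: the proc is called with the source record, and when no proc is set the expression evaluates to nil.

```ruby
# Hypothetical record type, used only for illustration.
SketchRecord = Struct.new(:id)

# Hypothetical stand-in with only the members source_record_id needs.
class SourceIdSketch
  attr_accessor :source_record, :source_record_id_proc

  def source_record_id
    source_record_id_proc && source_record_id_proc.call(source_record)
  end
end

with_proc = SourceIdSketch.new
with_proc.source_record = SketchRecord.new("rec42")
with_proc.source_record_id_proc = ->(record) { record.id }
p with_proc.source_record_id    # "rec42"

# No proc configured: the && short-circuits and nil comes back.
without_proc = SourceIdSketch.new
without_proc.source_record = SketchRecord.new("rec42")
p without_proc.source_record_id # nil
```

Callers building log messages may therefore want to handle a nil result, for example by combining it with #position.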