Module: Traject::Macros::Transformation
- Included in:
- Indexer
- Defined in:
- lib/traject/macros/transformation.rb
Overview
Macros intended to be mixed into an Indexer and used in config as second or further args to #to_field, to transform existing accumulator values.
They have the same form as any proc/block passed to #to_field, but operate on an existing accumulator, intended to be used as non-first-step transformations.
Some of these are extracted from extract_marc options, so they can be used with any first-step extract method. Others were informed by the needs of current users.
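As a rough illustration of how such non-first-step transformations chain, the sketch below applies three lambdas of the same shape as the macros documented on this page to one accumulator. It is a stand-alone approximation that does not load traject; the variable names, the sample values, and the field and MARC spec in the comment are assumptions made for the example.

```ruby
# Stand-ins for macro return values: each step is a lambda taking
# (record, accumulator) and mutating the accumulator in place.
strip_step = lambda do |_rec, acc|
  acc.collect! { |v| v.sub(/\A[[:space:]]+/, '').sub(/[[:space:]]+\Z/, '') }
end
unique_step     = lambda { |_rec, acc| acc.uniq! }
first_only_step = lambda { |_rec, acc| acc.slice!(1, acc.length) }

# Hypothetical output of a first-step extractor.
accumulator = ["  Ruby  ", "Ruby", " Perl "]

# In a config file this chain might read (field name and spec are made up):
#   to_field "language", extract_marc("041a"), strip, unique, first_only
[strip_step, unique_step, first_only_step].each do |step|
  step.call(nil, accumulator)
end
```

After the three steps run in order, the accumulator holds the single value "Ruby".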
Instance Method Summary
- #append(suffix) ⇒ Object
  Append argument to end of each value in accumulator.
- #default(default_value) ⇒ Object
  Adds a literal to accumulator if accumulator was empty.
- #delete_if(arg) ⇒ Object
  Run ruby delete_if on the accumulator, removing values that arg includes (array, set) or that arg matches with ===.
- #first_only ⇒ Object
  Removes all but the first value from accumulator, if more values were present.
- #gsub(pattern, replace) ⇒ Object
  Run ruby gsub on each value in accumulator, with pattern and replace value given.
- #prepend(prefix) ⇒ Object
  Prepend argument to beginning of each value in accumulator.
- #select(arg) ⇒ Object
  Run ruby select! on the accumulator, keeping only values that arg includes (array, set) or that arg matches with ===.
- #split(separator) ⇒ Object
  Run ruby split on each value in the accumulator, with separator given, flatten all results into single array as accumulator.
- #strip ⇒ Object
  For each value in accumulator, remove all leading or trailing whitespace (unicode aware).
- #transform(a_proc = nil, &block) ⇒ Object
  Pass in a proc/lambda arg or a block (or both), that will be called on each value already in the accumulator, to transform it.
- #translation_map(*translation_map_specifier) ⇒ Object
  Maps all values on accumulator through a Traject::TranslationMap.
- #unique ⇒ Object
  Calls ruby uniq! on accumulator, removes any duplicate values.
Instance Method Details
#append(suffix) ⇒ Object
Append argument to end of each value in accumulator.
# File 'lib/traject/macros/transformation.rb', line 141
def append(suffix)
  lambda do |rec, acc|
    acc.collect! { |v| v + suffix }
  end
end
#default(default_value) ⇒ Object
Adds a literal to accumulator if accumulator was empty
# File 'lib/traject/macros/transformation.rb', line 85
def default(default_value)
  lambda do |rec, acc|
    if acc.empty?
      acc << default_value
    end
  end
end
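A minimal sketch of the behavior, using a lambda with the same body as the one returned by the macro. The literal "Unknown" and the variable names are assumptions for the example.

```ruby
# Same logic as the lambda built by default("Unknown").
default_step = lambda do |_rec, acc|
  acc << "Unknown" if acc.empty?
end

empty_acc  = []
filled_acc = ["Known"]

default_step.call(nil, empty_acc)   # empty: the literal is added
default_step.call(nil, filled_acc)  # not empty: left untouched
```

Only the empty accumulator receives the default; an accumulator that already has values is unchanged.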
#delete_if(arg) ⇒ Object
Run ruby delete_if on the accumulator. If arg responds to include? (for example an array or set), values that arg includes are removed; otherwise values that arg matches with === are removed.
It will also accept an array, set, regex pattern, proc or lambda as an argument.
# File 'lib/traject/macros/transformation.rb', line 166
def delete_if(arg)
  p = if arg.respond_to? :include?
    proc { |v| arg.include?(v) }
  else
    proc { |v| arg === v }
  end

  ->(_, acc) { acc.delete_if(&p) }
end
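The following sketch shows how the different argument types behave. The helper name delete_if_step is hypothetical; its body mirrors the source above so that it runs without traject, and the sample values are made up.

```ruby
# Hypothetical stand-alone copy of the macro logic.
def delete_if_step(arg)
  matcher = if arg.respond_to?(:include?)
              proc { |v| arg.include?(v) }  # array or set: membership test
            else
              proc { |v| arg === v }        # regex, proc, lambda: case equality
            end
  ->(_, acc) { acc.delete_if(&matcher) }
end

acc_array = ["a", "b", "c"]
delete_if_step(["a", "b"]).call(nil, acc_array)

acc_regex = ["apple", "banana", "cherry"]
delete_if_step(/an/).call(nil, acc_regex)

acc_lambda = ["x", "yy", "zzz"]
delete_if_step(->(v) { v.length > 1 }).call(nil, acc_lambda)
```

Note that a String argument also responds to include?, so in this logic a plain string removes values that are substrings of the argument, which may not be what is intended.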
#first_only ⇒ Object
Removes all but the first value from accumulator, if more values were present.
# File 'lib/traject/macros/transformation.rb', line 97
def first_only
  lambda do |rec, acc|
    # kind of esoteric, but slice! used this way removes everything
    # after the first element, mutating the accumulator in place
    acc.slice!(1, acc.length)
  end
end
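Because the slice! idiom is easy to misread, here is a small sketch of what it does to accumulators of different sizes. The lambda has the same body as the one the macro returns; the sample arrays are assumptions.

```ruby
# Same logic as the lambda built by first_only.
first_only_step = lambda { |_rec, acc| acc.slice!(1, acc.length) }

many   = ["first", "second", "third"]
single = ["only"]
empty  = []

# slice!(1, n) deletes elements from index 1 onward and keeps index 0.
[many, single, empty].each { |acc| first_only_step.call(nil, acc) }
```

The three-element array is reduced to its first value, while the one-element and empty arrays are left as they were.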
#gsub(pattern, replace) ⇒ Object
Run ruby gsub on each value in accumulator, with pattern and replace value given.
# File 'lib/traject/macros/transformation.rb', line 155
def gsub(pattern, replace)
  lambda do |rec, acc|
    acc.collect! { |v| v.gsub(pattern, replace) }
  end
end
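As an illustration, the sketch below strips trailing punctuation from each value. The helper name gsub_step is hypothetical (it copies the macro body so the example runs without traject), and the pattern and sample titles are assumptions.

```ruby
# Hypothetical stand-alone copy of the macro logic.
def gsub_step(pattern, replace)
  lambda do |_rec, acc|
    acc.collect! { |v| v.gsub(pattern, replace) }
  end
end

titles = ["Title /", "Other :"]

# Remove optional whitespace plus a trailing slash or colon.
gsub_step(/\s*[\/:]\z/, "").call(nil, titles)
```

Each value is replaced by the result of String#gsub, so values that do not match the pattern pass through unchanged.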
#prepend(prefix) ⇒ Object
Prepend argument to beginning of each value in accumulator.
# File 'lib/traject/macros/transformation.rb', line 148
def prepend(prefix)
  lambda do |rec, acc|
    acc.collect! { |v| prefix + v }
  end
end
#select(arg) ⇒ Object
Run ruby select! on the accumulator. If arg responds to include? (for example an array or set), only values that arg includes are kept; otherwise only values that arg matches with === are kept.
It accepts an array, set, regex pattern, proc or lambda as an argument.
# File 'lib/traject/macros/transformation.rb', line 181
def select(arg)
  p = if arg.respond_to? :include?
    proc { |v| arg.include?(v) }
  else
    proc { |v| arg === v }
  end

  ->(_, acc) { acc.select!(&p) }
end
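A short sketch of select with an allow-list and with a regex. The helper name select_step is hypothetical and copies the macro body so the example runs on its own; the language codes and dates are made-up sample data.

```ruby
# Hypothetical stand-alone copy of the macro logic.
def select_step(arg)
  matcher = if arg.respond_to?(:include?)
              proc { |v| arg.include?(v) }  # array or set: membership test
            else
              proc { |v| arg === v }        # regex, proc, lambda: case equality
            end
  ->(_, acc) { acc.select!(&matcher) }
end

codes = ["eng", "fre", "xxx"]
select_step(["eng", "fre", "ger"]).call(nil, codes)  # keep only listed codes

dates = ["1999", "n.d.", "2004"]
select_step(/\A\d{4}\z/).call(nil, dates)            # keep only four-digit years
```

This is the mirror image of delete_if: the same matching rule decides which values stay rather than which are removed.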
#split(separator) ⇒ Object
Run ruby split on each value in the accumulator, with separator given, flatten all results into single array as accumulator.
Will generally result in more individual values in accumulator as output than were there in input, as input values are split up into multiple values.
# File 'lib/traject/macros/transformation.rb', line 134
def split(separator)
  lambda do |rec, acc|
    acc.replace( acc.flat_map { |v| v.split(separator) } )
  end
end
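To make the flattening concrete, this sketch splits semicolon-delimited values. The helper name split_step is hypothetical and mirrors the macro body; the separator regex and subject strings are assumptions.

```ruby
# Hypothetical stand-alone copy of the macro logic.
def split_step(separator)
  lambda do |_rec, acc|
    acc.replace(acc.flat_map { |v| v.split(separator) })
  end
end

subjects = ["History ; Art", "Science"]

# Split on a semicolon with optional surrounding whitespace.
split_step(/\s*;\s*/).call(nil, subjects)
```

Two input values become three output values, all at the top level of the accumulator rather than nested in sub-arrays.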
#strip ⇒ Object
For each value in accumulator, remove all leading or trailing whitespace (unicode aware). Like ruby #strip, but aware of unicode whitespace.
# File 'lib/traject/macros/transformation.rb', line 121
def strip
  lambda do |rec, acc|
    acc.collect! do |v|
      # unicode whitespace class aware
      v.sub(/\A[[:space:]]+/,'').sub(/[[:space:]]+\Z/, '')
    end
  end
end
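The difference from String#strip shows up with non-ASCII whitespace. The sketch below uses a lambda with the same body as the macro's and a made-up value padded with a no-break space (U+00A0) and an ideographic space (U+3000).

```ruby
# Same logic as the lambda built by the strip macro.
strip_step = lambda do |_rec, acc|
  acc.collect! do |v|
    v.sub(/\A[[:space:]]+/, '').sub(/[[:space:]]+\Z/, '')
  end
end

padded = "\u00A0 title \u3000"
values = [padded.dup]

strip_step.call(nil, values)

# For comparison: String#strip leaves the unicode whitespace in place.
plain_strip = padded.strip
```

The POSIX bracket class [[:space:]] matches unicode whitespace in Ruby regular expressions, which is why the macro removes padding that plain String#strip does not.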
#transform(a_proc = nil, &block) ⇒ Object
Pass in a proc/lambda arg or a block (or both), that will be called on each value already in the accumulator, to transform it (i.e., with #map!/#collect! on your proc(s)).
Due to how ruby syntax precedence works, the block form is probably not too useful in traject config files, except with the &: trick.
The "stabby lambda" may be convenient for passing an explicit proc argument.
You can pass both an explicit proc arg and a block, in which case the proc arg will be applied first.
# File 'lib/traject/macros/transformation.rb', line 60
def transform(a_proc=nil, &block)
  unless a_proc || block
    raise ArgumentError, "Needs a transform proc arg or block arg"
  end

  transformer_callable = if a_proc && block
    # need to make a combo wrapper.
    ->(val) { block.call(a_proc.call(val)) }
  elsif a_proc
    a_proc
  else
    block
  end

  lambda do |rec, acc|
    acc.collect! do |value|
      transformer_callable.call(value)
    end
  end
end
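The three calling styles can be sketched as follows. The helper name transform_step is hypothetical and copies the macro body so the example runs without traject; the sample values and transformations are assumptions.

```ruby
# Hypothetical stand-alone copy of the macro logic.
def transform_step(a_proc = nil, &block)
  unless a_proc || block
    raise ArgumentError, "Needs a transform proc arg or block arg"
  end

  callable = if a_proc && block
               ->(val) { block.call(a_proc.call(val)) }  # proc first, then block
             else
               a_proc || block
             end

  ->(_rec, acc) { acc.collect! { |value| callable.call(value) } }
end

with_lambda = ["ruby", "traject"]
transform_step(->(v) { v.upcase }).call(nil, with_lambda)   # stabby lambda arg

with_symbol = [" x "]
transform_step(&:strip).call(nil, with_symbol)              # the &: trick

with_both = ["abc"]
transform_step(->(v) { v + "def" }, &:reverse).call(nil, with_both)
```

In the last call the lambda appends "def" before the block reverses the string, which shows the proc-then-block ordering described above.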
#translation_map(*translation_map_specifier) ⇒ Object
Maps all values on accumulator through a Traject::TranslationMap.
A Traject::TranslationMap is a hash-like mapping from input to output, usually defined in a yaml or dot-properties file, which can be looked up in the load path with a file name as arg. See Traject::TranslationMap header comments for details.
Using this macro, you can pass in one TranslationMap initializer arg, but you can also pass in multiple, and they will be merged into each other in the order given (the last one is merged last), so you can use this to apply overrides: either from another on-disk map, or even from an inline hash (since a Hash is a valid TranslationMap initialization arg too).
# File 'lib/traject/macros/transformation.rb', line 34
def translation_map(*translation_map_specifier)
  translation_map = translation_map_specifier.
    collect { |spec| Traject::TranslationMap.new(spec) }.
    reduce(:merge)

  lambda do |rec, acc|
    translation_map.translate_array! acc
  end
end
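The override order can be approximated with plain Ruby hashes. This sketch is only a stand-in: it uses Hash#merge instead of Traject::TranslationMap, it assumes the merged map behaves like a hash merge in which later entries win, and it does not model how TranslationMap treats unmapped values or defaults. The map contents are made up.

```ruby
# Plain hashes standing in for two translation map specifiers.
base_map  = { "eng" => "English", "fre" => "French" }
overrides = { "fre" => "French (override)" }

# Same reduce(:merge) shape as the macro: later maps are merged last.
merged = [base_map, overrides].reduce(:merge)

languages = ["eng", "fre"]

# Simplified translation step; every sample value is present in the map.
languages.collect! { |v| merged.fetch(v) }
```

The entry from the later hash replaces the one from the base map, while untouched keys keep their original translation.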
#unique ⇒ Object
Calls ruby uniq! on accumulator, removes any duplicate values.
# File 'lib/traject/macros/transformation.rb', line 109
def unique
  lambda do |rec, acc|
    acc.uniq!
  end
end