Module: RedAmber::VectorSelectable
Included in: Vector
Defined in: lib/red_amber/vector_selectable.rb
Overview
Mix-in for class Vector.
Provides methods for selecting data from a Vector.
Instance Method Summary
-
#[](*args) {|specifier.| ... } ⇒ scalar, Array
Selects elements of self by indices or booleans.
-
#drop_nil ⇒ Vector
Drops nil values from self and returns the result as a new Vector.
-
#filter(*booleans, &block) ⇒ Vector
(also: #select, #find_all)
Selects elements of self by booleans.
-
#first ⇒ Object
Returns first element of self.
-
#index(element) ⇒ integer?
Returns the index of the first occurrence of element in self, or nil if it is not found.
-
#is_in(*values) ⇒ Vector
Checks whether each element of self is included in the given values.
-
#last ⇒ Object
Returns last element of self.
-
#rank(order = :ascending, tie: :first, null_placement: :at_end) ⇒ Vector
Returns 1-based numerical rank of self.
-
#sample(n_or_prop = nil) ⇒ Object
Picks elements at random.
-
#sort(order = :ascending) ⇒ Vector
Sorts the values in the Vector.
-
#take(*indices, &block) ⇒ Vector
Selects elements of self by indices.
Instance Method Details
#[](*args) {|specifier.| ... } ⇒ scalar, Array
Selects elements of self by indices or booleans. Returns nil when called without arguments.
# File 'lib/red_amber/vector_selectable.rb', line 103
def [](*args)
  array =
    case args
    in [Vector => v]
      return scalar_or_array(take_by_vector(v)) if v.numeric?
      return scalar_or_array(filter_by_array(v.data)) if v.boolean?

      raise VectorTypeError, "Argument must be numeric or boolean: #{args}"
    in [Arrow::BooleanArray => ba]
      return scalar_or_array(filter_by_array(ba))
    in []
      return nil
    in [Arrow::Array => arrow_array]
      arrow_array
    in [Range => r]
      Arrow::Array.new(parse_range(r, size))
    else
      Arrow::Array.new(args.flatten)
    end

  return scalar_or_array(filter_by_array(array)) if array.boolean?

  vector = Vector.new(array)
  return scalar_or_array(take_by_vector(vector)) if vector.numeric?

  raise VectorArgumentError, "Invalid argument: #{args}"
end
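The dispatch above can be modeled in plain Ruby on an Array. This is only a sketch of the rule "booleans filter, numbers take", not RedAmber's implementation: `select_elements` is a hypothetical name, only Integer indices and true/false masks are handled, and it assumes that the private helper `scalar_or_array` returns a bare element when the result holds at most one element, which the source shown here does not confirm.

```ruby
# Plain-Ruby model of the #[] dispatch (hypothetical helper, not RedAmber code).
def select_elements(values, *args)
  return nil if args.empty?

  # A single Range is expanded into explicit indices; other arguments are flattened.
  spec =
    if args.size == 1 && args[0].is_a?(Range)
      (0...values.size).to_a[args[0]]
    else
      args.flatten
    end

  picked =
    if spec.all? { |e| e == true || e == false }
      # Boolean mask: keep the elements whose mask entry is true.
      values.select.with_index { |_, i| spec[i] }
    elsif spec.all?(Integer)
      # Integer indices: pick the elements at those positions.
      spec.map { |i| values.fetch(i) }
    else
      raise ArgumentError, "Invalid argument: #{args}"
    end

  # Assumption: results with at most one element collapse to a scalar.
  picked.size > 1 ? picked : picked[0]
end

v = [10, 20, 30, 40]
select_elements(v, 1)                        # => 20
select_elements(v, 0, 2)                     # => [10, 30]
select_elements(v, 1..-2)                    # => [20, 30]
select_elements(v, true, false, false, true) # => [10, 40]
```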
#drop_nil ⇒ Vector
Drops nil values from self and returns the result as a new Vector.
# File 'lib/red_amber/vector_selectable.rb', line 200
def drop_nil
  datum = find(:drop_null).execute([data])
  Vector.create(datum.value)
end
#filter(*booleans, &block) ⇒ Vector
Also known as: select, find_all
Selects elements of self by booleans. The booleans can be given as arguments or returned from a block, but not both.
TODO: support for the option `null_selection_behavior: :drop`
# File 'lib/red_amber/vector_selectable.rb', line 61
def filter(*booleans, &block)
  if block
    unless booleans.empty?
      raise VectorArgumentError, 'Must not specify both arguments and block.'
    end

    booleans = [yield]
  end

  case booleans
  in [Vector => v]
    raise VectorTypeError, 'Argument is not a boolean.' unless v.boolean?

    Vector.create(filter_by_array(v.data))
  in [Arrow::BooleanArray => ba]
    Vector.create(filter_by_array(ba))
  in []
    Vector.new
  else
    booleans.flatten!
    a = Arrow::Array.new(booleans)
    if a.boolean?
      Vector.create(filter_by_array(a))
    elsif booleans.compact.empty? # [nil, nil] becomes string array
      Vector.new
    else
      raise VectorTypeError, "Argument is not a boolean: #{booleans}"
    end
  end
end
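The argument handling above (arguments or a block, never both; an empty mask gives an empty result; non-booleans are rejected) can be sketched on a plain Array. `filter_sketch` is a hypothetical stand-in, not RedAmber code; it ignores nil entries in the mask, and the length check is an assumption of this sketch rather than something the source shown here states.

```ruby
# Plain-Ruby model of boolean-mask filtering (hypothetical helper).
def filter_sketch(values, *booleans, &block)
  if block
    unless booleans.empty?
      raise ArgumentError, 'Must not specify both arguments and block.'
    end

    booleans = [block.call]
  end

  mask = booleans.flatten
  return [] if mask.empty?

  unless mask.all? { |b| b == true || b == false }
    raise TypeError, "Argument is not a boolean: #{mask}"
  end
  # Assumption of this sketch: the mask must be as long as the data.
  raise ArgumentError, 'mask size must match' unless mask.size == values.size

  values.select.with_index { |_, i| mask[i] }
end

filter_sketch([1, 2, 3], true, false, true)    # => [1, 3]
filter_sketch([1, 2, 3]) { [false, true, true] } # => [2, 3]
filter_sketch([1, 2, 3])                       # => []
```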
#first ⇒ Object
Returns first element of self.
# File 'lib/red_amber/vector_selectable.rb', line 181
def first
  data[0]
end
#index(element) ⇒ integer?
Returns the index of the first occurrence of element in self, or nil if it is not found. Passing nil as element searches for the first nil value.
# File 'lib/red_amber/vector_selectable.rb', line 158
def index(element)
  if element.nil?
    datum = find(:is_null).execute([data])
    value = Arrow::Scalar.resolve(true, :boolean)
  else
    datum = data
    value = Arrow::Scalar.resolve(element, type)
  end
  datum = find(:index).execute([datum], value: value)
  index = get_scalar(datum)
  if index.negative?
    nil
  else
    index
  end
end
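The two steps in the method above, searching a null mask when the element is nil and converting a negative result to nil, can be illustrated on a plain Array. `index_sketch` is a hypothetical name, and the `-1` sentinel merely stands in for the negative value that the method converts to nil.

```ruby
# Plain-Ruby model of the index lookup (hypothetical helper).
def index_sketch(values, element)
  haystack, needle =
    if element.nil?
      # Searching for nil: look for the first true in the "is nil" mask.
      [values.map(&:nil?), true]
    else
      [values, element]
    end

  raw = haystack.index(needle) || -1 # -1 models a "not found" sentinel
  raw.negative? ? nil : raw
end

index_sketch([3, nil, 5, 3], 3)   # => 0
index_sketch([3, nil, 5, 3], nil) # => 1
index_sketch([3, nil, 5, 3], 9)   # => nil
```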
#is_in(*values) ⇒ Vector
Checks whether each element of self is included in the given values. Nil entries in values are ignored; if no values remain, the result is all false.
# File 'lib/red_amber/vector_selectable.rb', line 138
def is_in(*values)
  enum =
    case values
    in [] | [[]] | [nil] |[[nil]]
      return Vector.new([false] * size)
    in [Vector | Arrow::Array | Arrow::ChunkedArray]
      values[0].each
    else
      parse_args(values, size, symbolize: false)
    end
  enum.filter_map { self == _1 unless _1.nil? }.reduce(&:|)
end
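The last line of the method builds one element-wise equality mask per candidate value and combines the masks with a logical OR. A plain-Ruby sketch of that idea follows; `is_in_sketch` is a hypothetical name, and unlike a Vector comparison it yields false rather than nil for nil elements of the data.

```ruby
# Plain-Ruby model of membership testing via OR-ed equality masks
# (hypothetical helper).
def is_in_sketch(values, *candidates)
  candidates = candidates.flatten.compact
  return [false] * values.size if candidates.empty?

  # One boolean mask per candidate value.
  masks = candidates.map { |c| values.map { |v| v == c } }
  # Combine the masks element by element with logical OR.
  masks.reduce { |acc, mask| acc.zip(mask).map { |a, b| a | b } }
end

is_in_sketch([1, 2, 3, 4], 2, 4) # => [false, true, false, true]
is_in_sketch([1, 2, 3, 4])       # => [false, false, false, false]
```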
#last ⇒ Object
Returns last element of self.
# File 'lib/red_amber/vector_selectable.rb', line 191
def last
  data[-1]
end
#rank(order = :ascending, tie: :first, null_placement: :at_end) ⇒ Vector
Returns 1-based numerical rank of self.
- Nil values are considered greater than any value.
- NaN values are considered greater than any value but smaller than nil values.
- Elements are ranked in ascending order by default. This can be changed with the parameter `order = :descending`.
- Ties are ranked in order of appearance by default, which corresponds to the option `tie: :first`.
- Null values (nil and NaN) are placed at the end by default. This can be changed with the option `null_placement: :at_start`.
# File 'lib/red_amber/vector_selectable.rb', line 337
def rank(order = :ascending, tie: :first, null_placement: :at_end)
  func = find(:rank)
  options = func.default_options
  order =
    case order.to_sym
    when :+, :ascending, :increasing
      :ascending
    when :-, :descending, :decreasing
      :descending
    else
      raise VectorArgumentError, "illegal order option: #{order}"
    end
  options.sort_keys = [Arrow::SortKey.resolve('', order)]
  options.tiebreaker = tie
  options.null_placement = null_placement
  Vector.create(func.execute([data], options).value)
end
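The ranking rules described above (1-based ranks, ties broken by order of appearance, nil values placed at the end or the start) can be modeled on a plain Array. `rank_sketch` is a hypothetical name; it covers only the `tie: :first` behavior and does not model NaN.

```ruby
# Plain-Ruby model of 1-based ranking with first-appearance tie-breaking
# (hypothetical helper; NaN is not handled).
def rank_sketch(values, order = :ascending, null_placement: :at_end)
  pairs = values.each_with_index.to_a
  nils, present = pairs.partition { |value, _| value.nil? }

  sorted = present.sort do |(a, i), (b, j)|
    by_value = order == :descending ? b <=> a : a <=> b
    # Equal values keep their order of appearance.
    by_value.zero? ? i <=> j : by_value
  end

  ordered = null_placement == :at_start ? nils + sorted : sorted + nils
  ranks = Array.new(values.size)
  ordered.each_with_index do |(_, original_index), position|
    ranks[original_index] = position + 1
  end
  ranks
end

rank_sketch([30, 10, 20, 10])                           # => [4, 1, 3, 2]
rank_sketch([30, 10, 20, 10], :descending)              # => [1, 3, 2, 4]
rank_sketch([2, nil, 1])                                # => [2, 3, 1]
rank_sketch([2, nil, 1], null_placement: :at_start)     # => [3, 1, 2]
```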
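#sample ⇒ scalar
#sample(n) ⇒ Vector
#sample(prop) ⇒ Vector
This method requires the 'arrow-numo-narray' gem.
Picks elements at random. With no argument, returns a single element. With an Integer n, returns n elements. With a Float prop, returns (prop * size).truncate elements. Returns nil if self is empty, and raises VectorArgumentError for a negative argument.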
# File 'lib/red_amber/vector_selectable.rb', line 444
def sample(n_or_prop = nil)
  require 'arrow-numo-narray'

  return nil if size == 0

  n_sample =
    case n_or_prop
    in Integer
      n_or_prop
    in Float
      (n_or_prop * size).truncate
    in nil
      return to_a.sample
    else
      raise VectorArgumentError, "must specify Integer or Float but #{n_or_prop}"
    end
  if n_or_prop < 0
    raise VectorArgumentError, '#sample does not accept negative number.'
  end
  return Vector.new([]) if n_sample == 0

  over_sample = [8 * size, n_sample].max
  over_size = n_sample > size ? n_sample / size * size * 2 : size
  over_vector =
    Vector.create(Numo::UInt32.new(over_size).rand(over_sample).to_arrow_array)
  indices = over_vector.rank.take(*0...n_sample)
  take(indices - ((indices / size) * size))
end
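How the sample count is derived and how indices wrap around when more elements are requested than exist can be sketched without Numo or Arrow. In this sketch `sample_sketch` is a hypothetical name, and a shuffled index pool replaces the "rank of random numbers" step of the method above; it is a rough model of the idea, not a reproduction of the same random distribution.

```ruby
# Plain-Ruby model of #sample (hypothetical helper; uses Array#shuffle
# instead of ranking random numbers).
def sample_sketch(values, n_or_prop = nil, random: Random.new)
  return nil if values.empty?
  return values.sample(random: random) if n_or_prop.nil?

  n_sample =
    case n_or_prop
    when Integer then n_or_prop
    when Float then (n_or_prop * values.size).truncate
    else raise ArgumentError, "must specify Integer or Float but #{n_or_prop}"
    end
  raise ArgumentError, 'negative number is not accepted' if n_or_prop.negative?
  return [] if n_sample.zero?

  size = values.size
  # Enlarge the index pool when more samples than elements are requested.
  pool_size = n_sample > size ? n_sample / size * size * 2 : size
  picked = (0...pool_size).to_a.shuffle(random: random).first(n_sample)
  # Fold the oversized indices back into the valid range.
  picked.map { |i| values[i % size] }
end

data = [1, 2, 3, 4, 5]
sample_sketch(data, 3).size   # => 3, without repetition
sample_sketch(data, 0.4).size # => 2
sample_sketch(data, 12).size  # => 12, with repetition
```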
#sort(order = :ascending) ⇒ Vector
Sorts the values in the Vector. The order can be given as :+, :ascending or :increasing (the default), or as :-, :descending or :decreasing.
# File 'lib/red_amber/vector_selectable.rb', line 232
def sort(order = :ascending)
  order =
    case order.to_sym
    when :+, :ascending, :increasing
      :ascending
    when :-, :descending, :decreasing
      :descending
    else
      raise VectorArgumentError, "illegal order option: #{order}"
    end
  take(sort_indices(order: order))
end
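The method first normalizes the order option and then takes elements by their sort indices. A plain-Ruby sketch of both steps follows; `normalize_order` and `sort_sketch` are hypothetical names, and the sketch assumes the data contains no nil.

```ruby
# Plain-Ruby model of order normalization and sort-by-indices
# (hypothetical helpers; nil values are not handled).
def normalize_order(order)
  case order.to_sym
  when :+, :ascending, :increasing then :ascending
  when :-, :descending, :decreasing then :descending
  else raise ArgumentError, "illegal order option: #{order}"
  end
end

def sort_sketch(values, order = :ascending)
  order = normalize_order(order)
  # Compute the indices that would sort the data, then pick by index.
  indices = values.each_index.sort { |i, j| values[i] <=> values[j] }
  indices.reverse! if order == :descending
  indices.map { |i| values[i] }
end

sort_sketch([3, 1, 2])              # => [1, 2, 3]
sort_sketch([3, 1, 2], '-')         # => [3, 2, 1]
sort_sketch([3, 1, 2], :decreasing) # => [3, 2, 1]
```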
#take(*indices, &block) ⇒ Vector
Selects elements of self by indices. The indices can be given as arguments or returned from a block, but not both.
TODO: support for the option `boundscheck: true`
# File 'lib/red_amber/vector_selectable.rb', line 23
def take(*indices, &block)
  if block
    unless indices.empty?
      raise VectorArgumentError, 'Must not specify both arguments and block.'
    end

    indices = [yield]
  end

  vector =
    case indices
    in [Vector => v] if v.numeric?
      return Vector.create(take_by_vector(v))
    in []
      return Vector.new
    in [(Arrow::Array | Arrow::ChunkedArray) => aa]
      Vector.create(aa)
    else
      Vector.new(indices.flatten)
    end

  unless vector.numeric?
    raise VectorArgumentError, "argument must be a integers: #{indices}"
  end

  Vector.create(take_by_vector(vector))
end
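The argument handling above mirrors that of #filter, with indices in place of booleans. A plain-Ruby sketch on an Array follows; `take_sketch` is a hypothetical name, only Integer indices are accepted, and the treatment of negative or out-of-range indices (Array#fetch semantics) is an assumption of the sketch, not something the source shown here specifies.

```ruby
# Plain-Ruby model of index-based selection (hypothetical helper).
def take_sketch(values, *indices, &block)
  if block
    unless indices.empty?
      raise ArgumentError, 'Must not specify both arguments and block.'
    end

    indices = [block.call]
  end

  indices = indices.flatten
  return [] if indices.empty?

  unless indices.all?(Integer)
    raise ArgumentError, "argument must be integers: #{indices}"
  end

  # Assumption: Array#fetch semantics for negative and invalid indices.
  indices.map { |i| values.fetch(i) }
end

letters = %w[a b c d]
take_sketch(letters, 0, 2)       # => ["a", "c"]
take_sketch(letters, [3, 0])     # => ["d", "a"]
take_sketch(letters) { [1, 1] }  # => ["b", "b"]
```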