Module: RedAmber::VectorUpdatable

Included in:
Vector
Defined in:
lib/red_amber/vector_updatable.rb

Overview

A mix-in for class Vector. It provides functions that make up (fill in or replace) data, especially missing values, to produce a new Vector.

Instance Method Details

#if_else(true_choice, false_choice) ⇒ Object

[Ternary element-wise]: boolean_vector.func(if_true, else) => vector

Raises:

  • (VectorTypeError)

    if the receiver is not a boolean Vector.



# File 'lib/red_amber/vector_updatable.rb', line 59

def if_else(true_choice, false_choice)
  true_choice = true_choice.data if true_choice.is_a? Vector
  false_choice = false_choice.data if false_choice.is_a? Vector
  raise VectorTypeError, 'Reciever must be a boolean' unless boolean?

  datum = find(:if_else).execute([data, true_choice, false_choice])
  Vector.create(datum.value)
end
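
The element-wise selection can be modelled in plain Ruby without Arrow. The following is a minimal sketch, not the library's implementation: if_else_sketch is a hypothetical helper that works on Arrays, uses TypeError as a stand-in for VectorTypeError, broadcasts scalar choices, and assumes that a nil condition yields nil.

```ruby
# Plain-Ruby model of element-wise if_else (hypothetical helper, not RedAmber API).
def if_else_sketch(booleans, true_choice, false_choice)
  unless booleans.all? { |b| b.nil? || b == true || b == false }
    raise TypeError, 'Receiver must be boolean' # stand-in for VectorTypeError
  end

  booleans.each_with_index.map do |cond, i|
    next nil if cond.nil? # assumption: a nil condition propagates as nil

    choice = cond ? true_choice : false_choice
    choice.is_a?(Array) ? choice[i] : choice # scalar choices are broadcast
  end
end

p if_else_sketch([true, false, nil], [1, 2, 3], 0) # => [1, 0, nil]
```

Each choice may be a scalar or an Array of the same length as the receiver, mirroring how the method above accepts either a scalar or a Vector.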

#list_flatten ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Flattens a list Vector into rows.



# File 'lib/red_amber/vector_updatable.rb', line 150

def list_flatten
  Vector.create find(:list_flatten).execute([data]).value
end

#list_separate ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Separates a list Vector into columns.



# File 'lib/red_amber/vector_updatable.rb', line 130

def list_separate
  len = list_sizes.data
  min, max = Arrow::Function.find(:min_max).execute([len]).value.value.map(&:value)

  result = []
  (0...min).each do |i|
    result << Vector.create(find(:list_element).execute([data, i]).value)
  end
  return result if min == max

  (min...max).each do |i|
    result << Vector.new(data.map { |e| e&.[](i) })
  end
  result
end

#list_sizes ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Returns a Vector holding the number of elements in each list of a list Vector.



# File 'lib/red_amber/vector_updatable.rb', line 122

def list_sizes
  Vector.create find(:list_value_length).execute([data]).value
end

#merge(other, sep: ' ') ⇒ Vector

Merges a String or another string Vector into self.

Self must be a string Vector.

Parameters:

  • other (String, Vector)

the value to merge from the right. A scalar String is broadcast to every element.

  • sep (String) (defaults to: ' ')

    separator.

Returns:

  • (Vector)

    a new string Vector of the merged elements.



# File 'lib/red_amber/vector_updatable.rb', line 185

def merge(other, sep: ' ')
  if empty? || !string?
    raise VectorTypeError,
          "self is not a string Vector: #{self}"
  end
  unless sep.is_a?(String)
    raise VectorArgumentError, "separator is not a String: #{sep}"
  end

  other_array =
    case other
    in String => s
      [s] * size
    in (Vector | Arrow::Array | Arrow::ChunkedArray) => x if x.string?
      x.to_a
    else
      raise VectorArgumentError,
            "other is not a String or a string Vector: #{self}"
    end

  list = Arrow::Array.new(to_a.zip(other_array))
  datum = find(:binary_join).execute([list, sep])
  Vector.create(datum.value)
end
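
The zip-and-join step above can be illustrated with plain Arrays. This is a minimal sketch under stated assumptions, not the library's code: merge_sketch is a hypothetical helper, ArgumentError stands in for VectorArgumentError, and a nil on either side is assumed to produce nil for that row.

```ruby
# Plain-Ruby model of merging two string columns (hypothetical helper, not RedAmber API).
def merge_sketch(strings, other, sep: ' ')
  unless sep.is_a?(String)
    raise ArgumentError, "separator is not a String: #{sep}" # stand-in for VectorArgumentError
  end

  # A scalar String is broadcast to the length of the receiver.
  others = other.is_a?(String) ? [other] * strings.size : other

  strings.zip(others).map do |pair|
    pair.include?(nil) ? nil : pair.join(sep) # assumption: nil in a pair gives nil
  end
end

p merge_sketch(%w[a b], 'x', sep: '-') # => ["a-x", "b-x"]
p merge_sketch(%w[a b], %w[1 2])       # => ["a 1", "b 2"]
```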

#primitive_invert ⇒ Object

Inverts a boolean Vector with the same behavior as Ruby's ! operator, so nil is treated as falsy: ![true, false, nil] #=> [false, true, true]

Raises:

  • (VectorTypeError)

    if self is not a boolean Vector.



# File 'lib/red_amber/vector_updatable.rb', line 70

def primitive_invert
  raise VectorTypeError, "Not a boolean Vector: #{self}" unless boolean?

  is_nil.if_else(false, self).invert
end
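
The two steps in the method body (replace nil with false, then invert) can be traced on a plain Array. This is a sketch for illustration only; primitive_invert_sketch is a hypothetical helper and not part of RedAmber.

```ruby
# Plain-Ruby model of primitive_invert (hypothetical helper, not RedAmber API).
def primitive_invert_sketch(booleans)
  filled = booleans.map { |e| e.nil? ? false : e } # models is_nil.if_else(false, self)
  filled.map { |e| !e }                            # models invert
end

p primitive_invert_sketch([true, false, nil]) # => [false, true, true]
```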

#replace(specifier, replacer) ⇒ Vector

Replaces the elements of self selected by specifier with replacer.

Parameters:

  • specifier (Array, Vector, Arrow::Array)

indices or booleans that select the elements to replace.

  • replacer (Scalar, Array, Vector, Arrow::Array)

new data to put in place of the selected elements.

Returns:

  • (Vector)

a new Vector with the replacements applied. If specifier selects no element (no true), self is returned.



# File 'lib/red_amber/vector_updatable.rb', line 19

def replace(specifier, replacer)
  vector = Vector.new(parse_args(Array(specifier), size))
  return self if vector.empty? || empty?

  booleans =
    if vector.boolean?
      vector
    elsif vector.numeric?
      Vector.new(indices).is_in(vector)
    else
      raise VectorArgumentError, "Invalid data type #{specifier}"
    end
  return self if booleans.sum.zero?

  replacer_array =
    case replacer
    in []
      return self
    in nil | [nil]
      return replace_to_nil(booleans.data)
    in Arrow::Array
    # nop
    in Vector
      replacer.data
    in Array
      Arrow::Array.new(replacer)
    else # Broadcast scalar to Array
      Arrow::Array.new(Array(replacer) * booleans.to_a.count(true))
    end
  if booleans.sum != replacer_array.length
    raise VectorArgumentError, 'Replacements size unmatch'
  end

  replace_with(booleans.data, replacer_array)
end
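
The flow of the method (normalize the specifier to a boolean mask, broadcast a scalar replacer, check sizes, then substitute) can be sketched on plain Arrays. This is an illustrative model under assumptions, not the library's implementation: replace_sketch is a hypothetical helper, ArgumentError stands in for VectorArgumentError, and negative indices and Ranges are not handled.

```ruby
# Plain-Ruby model of replace (hypothetical helper, not RedAmber API).
def replace_sketch(values, specifier, replacer)
  # Normalize the specifier to a boolean mask, as the method does with is_in.
  booleans =
    if specifier.all? { |s| s.nil? || s == true || s == false }
      specifier
    else
      (0...values.size).map { |i| specifier.include?(i) }
    end
  count = booleans.count(true)
  return values if count.zero? # nothing selected: return the receiver

  # A scalar replacer is broadcast to the number of selected elements.
  replacements = replacer.is_a?(Array) ? replacer.dup : [replacer] * count
  raise ArgumentError, 'Replacements size unmatch' if replacements.size != count

  # Replacements are consumed in position order in this sketch.
  values.each_with_index.map { |v, i| booleans[i] == true ? replacements.shift : v }
end

p replace_sketch([1, 2, 3], [0, 2], [10, 30])           # => [10, 2, 30]
p replace_sketch([1, 2, 3], [false, true, false], 0)    # => [1, 0, 3]
```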

#shift(amount = 1, fill: nil) ⇒ Object

Shifts the elements of self by amount and fills the vacated positions with fill. A positive amount shifts toward the end, a negative amount shifts toward the beginning, and zero returns self.

Raises:

  • (VectorArgumentError)

    if the absolute value of amount is larger than the size of self.



# File 'lib/red_amber/vector_updatable.rb', line 76

def shift(amount = 1, fill: nil)
  raise VectorArgumentError, 'Shift amount is too large' if amount.abs > size

  if amount.positive?
    replace(amount..-1, self[0...-amount]).replace(0...amount, fill)
  elsif amount.negative?
    replace(0...amount, self[-amount..]).replace(amount..-1, fill)
  else # amount == 0
    self
  end
end
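
The effect of the two chained replace calls can be modelled on a plain Array. This is a minimal sketch for illustration, not the library's code: shift_sketch is a hypothetical helper and ArgumentError stands in for VectorArgumentError.

```ruby
# Plain-Ruby model of shift (hypothetical helper, not RedAmber API).
def shift_sketch(values, amount = 1, fill: nil)
  raise ArgumentError, 'Shift amount is too large' if amount.abs > values.size

  if amount.positive?
    # Elements move toward the end; the head is filled.
    Array.new(amount, fill) + values[0...-amount]
  elsif amount.negative?
    # Elements move toward the beginning; the tail is filled.
    values[-amount..] + Array.new(-amount, fill)
  else
    values
  end
end

p shift_sketch([1, 2, 3, 4])              # => [nil, 1, 2, 3]
p shift_sketch([1, 2, 3, 4], -1, fill: 0) # => [2, 3, 4, 0]
```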

#split(sep = nil, limit = 0) ⇒ Object

Note:

If sep is not specified, Arrow's ascii_split_whitespace is used. It splits each string on ASCII whitespace.

Note:

If sep is specified, sep and limit are passed to String#split.

Splits each element of a string Vector by the separator and returns the result as a list Vector.



# File 'lib/red_amber/vector_updatable.rb', line 160

def split(sep = nil, limit = 0)
  if empty? || !string?
    raise VectorTypeError, "self is not a valid string Vector: #{self}"
  end
  if self[0].nil? && uniq.to_a == [nil] # Avoid heavy check to be activated always.
    raise VectorTypeError, 'self contains only nil'
  end

  list =
    if sep
      Arrow::Array.new(to_a.map { |e| e&.split(sep, limit) })
    else
      find(:ascii_split_whitespace).execute([data]).value
    end
  Vector.create(list)
end
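
The branch taken when sep is given can be reproduced with plain Ruby, since it simply maps String#split over the elements. The sketch below is illustrative only: split_sketch is a hypothetical helper, and when sep is nil it falls back to Ruby's own whitespace splitting, which only approximates Arrow's ascii_split_whitespace and may differ in edge cases.

```ruby
# Plain-Ruby model of split (hypothetical helper, not RedAmber API).
def split_sketch(strings, sep = nil, limit = 0)
  # nil elements stay nil, as with e&.split in the method above.
  strings.map { |e| e&.split(sep, limit) }
end

p split_sketch(['a,b,c', nil, 'd'], ',') # => [["a", "b", "c"], nil, ["d"]]
p split_sketch(['a,b,c'], ',', 2)        # => [["a", "b,c"]]
```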

#split_to_columns(sep = nil, limit = 0) ⇒ Array<Vector>

Note:

A nil element is separated into nils in the same row, e.g. nil => [nil, nil].

Splits a string Vector and returns an Array of columns.

Parameters:

  • sep (nil, String, Regexp) (defaults to: nil)

separator. If sep is nil (or no argument is given), each element is split by Arrow's split function on any ASCII whitespace. Otherwise sep is passed to String#split.

  • limit (Integer) (defaults to: 0)

    maximum number to limit separation. Passed to String#split.

Returns:

  • (Array<Vector>)

    an Array of Vectors.



# File 'lib/red_amber/vector_updatable.rb', line 98

def split_to_columns(sep = nil, limit = 0)
  l = split(sep, limit)
  l.list_separate
end
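
Splitting into columns amounts to splitting each element and then reading the resulting lists column by column, padding short or nil rows with nil. The following is a plain-Ruby sketch of that idea, not the library's implementation; split_to_columns_sketch is a hypothetical helper and only the case where sep is given is modelled faithfully.

```ruby
# Plain-Ruby model of split_to_columns (hypothetical helper, not RedAmber API).
def split_to_columns_sketch(strings, sep, limit = 0)
  lists = strings.map { |e| e&.split(sep, limit) } # nil elements stay nil
  width = lists.compact.map(&:size).max || 0       # longest row decides the column count

  # Column i takes the i-th item of every row; missing items become nil.
  (0...width).map { |i| lists.map { |list| list&.[](i) } }
end

p split_to_columns_sketch(['a-b', 'c-d-e', nil], '-')
# => [["a", "c", nil], ["b", "d", nil], [nil, "e", nil]]
```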

#split_to_rows(sep = nil, limit = 0) ⇒ Vector

Note:

A nil element is separated into nils in the same row, e.g. nil => [nil, nil].

Splits a string Vector and flattens the result into rows.

Parameters:

  • sep (nil, String, Regexp) (defaults to: nil)

separator. If sep is nil (or no argument is given), each element is split by Arrow's split function on any ASCII whitespace. Otherwise sep is passed to String#split.

  • limit (Integer) (defaults to: 0)

    maximum number to limit separation. Passed to String#split.

Returns:

  • (Vector)

a flattened Vector.



# File 'lib/red_amber/vector_updatable.rb', line 113

def split_to_rows(sep = nil, limit = 0)
  l = split(sep, limit)
  l.list_flatten
end