Class: DataModeler::Dataset::Accessor

Inherits:
Object
Includes:
ConvertingTimeAndIndices, IteratingBasedOnNext
Defined in:
lib/data_modeler/dataset/accessor.rb

Overview

Builds complex inputs and targets from the data, for training the model.

Instance Method Summary

Methods included from ConvertingTimeAndIndices

#idx, #time

Methods included from IteratingBasedOnNext

#each, #map, #to_a

Constructor Details

#initialize(data, inputs:, targets:, first_idx:, end_idx:, ninput_points:, tspread:, look_ahead:) ⇒ Accessor

Note:

We expect Dataset indices to be used with left inclusion and right exclusion, i.e. targets are considered in the range `[from, to)`.

Returns a new instance of Accessor.

Parameters:

  • data (Hash)

the data, as an object that can be accessed by key and returns one time series per key. It must include a series named `time` and be sorted by it, and all series must have equal length.

  • inputs (Array)

    data key accessors for input series

  • targets (Array)

    data key accessors for target series

  • first_idx (Integer)

    index where the dataset starts on data

  • end_idx (Integer)

    index where the dataset ends on data

  • ninput_points (Integer)

    number of lines/datapoints to be used to construct the input

  • tspread (Numeric)

distance (in `time`!) between the `ninput_points` lines/datapoints used to construct the input

  • look_ahead (Numeric)

distance (in `time`!) between the most recent line/datapoint used for the input and the target, i.e. how far ahead the model is trained to predict
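To make the relationship between `ninput_points`, `tspread` and `look_ahead` concrete, here is a minimal sketch of the times at which inputs would be sampled for one target. It is derived only from the parameter descriptions above; `input_times` is a hypothetical helper and not part of DataModeler, and the library's own index computation may differ.

```ruby
# Hypothetical helper (not part of DataModeler): lists the times at which
# inputs would be sampled for a target at `target_time`, assuming the most
# recent input lies `look_ahead` before the target and earlier inputs are
# spaced `tspread` apart.
def input_times(target_time, ninput_points:, tspread:, look_ahead:)
  # Offsets back from the target: look_ahead, look_ahead + tspread, ...
  offsets = Array.new(ninput_points) { |n| look_ahead + n * tspread }
  # Oldest point first, most recent point last
  offsets.reverse.map { |offset| target_time - offset }
end

times = input_times(10, ninput_points: 3, tspread: 2, look_ahead: 1)
# times == [5, 7, 9]
```

With a target at time 10, the three inputs fall at times 5, 7 and 9: two units apart, with the last one a single unit before the target.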



# File 'lib/data_modeler/dataset/accessor.rb', line 26

def initialize data, inputs:, targets:, first_idx:, end_idx:, ninput_points:, tspread:, look_ahead:
  @data = data
  @input_series = inputs
  @target_series = targets
  @first_idx = first_idx
  @end_idx = end_idx
  @ninput_points = ninput_points
  @nrows = data[:time].size
  @tspread = tspread
  @look_ahead = look_ahead
  reset_iteration
end
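The requirements on `data` (a `time` series present, sorted by time, all series of equal length) can be checked before building an accessor. The sketch below is an illustration only: `valid_data?` is a hypothetical helper, not something the library provides, and the constructor shown above performs no such validation.

```ruby
# Hypothetical check (not part of DataModeler) for the documented
# requirements on `data`.
def valid_data?(data)
  times = data[:time]
  return false if times.nil?
  # Sorted by time: each timestamp is not smaller than the previous one
  sorted = times.each_cons(2).all? { |a, b| a <= b }
  # Every series has as many entries as the time series
  same_length = data.values.all? { |series| series.size == times.size }
  sorted && same_length
end

good = { time: [1, 2, 3, 4], s1: [10, 20, 30, 40], s2: [0.1, 0.2, 0.3, 0.4] }
unsorted = { time: [2, 1], s1: [10, 20] }

ok = valid_data?(good)       # true
bad = valid_data?(unsorted)  # false
```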

Instance Attribute Details

#data ⇒ Object (readonly)

Returns the value of attribute data.

# File 'lib/data_modeler/dataset/accessor.rb', line 5

def data
  @data
end

#end_idx ⇒ Object (readonly)

Returns the value of attribute end_idx.

# File 'lib/data_modeler/dataset/accessor.rb', line 5

def end_idx
  @end_idx
end

#first_idx ⇒ Object (readonly)

Returns the value of attribute first_idx.

# File 'lib/data_modeler/dataset/accessor.rb', line 5

def first_idx
  @first_idx
end

#input_idxs ⇒ Object (readonly)

Returns the value of attribute input_idxs.

# File 'lib/data_modeler/dataset/accessor.rb', line 5

def input_idxs
  @input_idxs
end

#input_series ⇒ Object (readonly)

Returns the value of attribute input_series.

# File 'lib/data_modeler/dataset/accessor.rb', line 5

def input_series
  @input_series
end

#look_ahead ⇒ Object (readonly)

Returns the value of attribute look_ahead.

# File 'lib/data_modeler/dataset/accessor.rb', line 5

def look_ahead
  @look_ahead
end

#ninput_points ⇒ Object (readonly)

Returns the value of attribute ninput_points.

# File 'lib/data_modeler/dataset/accessor.rb', line 5

def ninput_points
  @ninput_points
end

#nrows ⇒ Object (readonly)

Returns the value of attribute nrows.

# File 'lib/data_modeler/dataset/accessor.rb', line 5

def nrows
  @nrows
end

#target_idx ⇒ Object (readonly)

Returns the value of attribute target_idx.

# File 'lib/data_modeler/dataset/accessor.rb', line 5

def target_idx
  @target_idx
end

#target_series ⇒ Object (readonly)

Returns the value of attribute target_series.

# File 'lib/data_modeler/dataset/accessor.rb', line 5

def target_series
  @target_series
end

#tspread ⇒ Object (readonly)

Returns the value of attribute tspread.

# File 'lib/data_modeler/dataset/accessor.rb', line 5

def tspread
  @tspread
end

Instance Method Details

#==(other) ⇒ true|false

Equality operator – most useful in testing

Parameters:

  • other (Accessor)

    what needs comparing to

Returns:

  • (true|false)


# File 'lib/data_modeler/dataset/accessor.rb', line 99

def == other
  self.class == other.class && # terminate check here if wrong class
    data.object_id == other.data.object_id && # both `data` point to same object
    (instance_variables - [:@data]).all? do |var|
      self.instance_variable_get(var) == other.instance_variable_get(var)
    end
end
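One consequence of the comparison above is that two accessors are equal only when they share the very same `data` object; an equal copy of the data is not enough. The sketch below reproduces that logic on a stripped-down stand-in class. `MiniAccessor` is hypothetical and keeps a single extra attribute for brevity.

```ruby
# Hypothetical stand-in (not the real Accessor) mirroring the equality
# logic shown above.
class MiniAccessor
  attr_reader :data

  def initialize(data, first_idx:)
    @data = data
    @first_idx = first_idx
  end

  def ==(other)
    self.class == other.class && # different class: not equal
      data.object_id == other.data.object_id && # must be the same object
      (instance_variables - [:@data]).all? { |var|
        instance_variable_get(var) == other.instance_variable_get(var)
      }
  end
end

data = { time: [1, 2, 3] }
shared = MiniAccessor.new(data, first_idx: 0) == MiniAccessor.new(data, first_idx: 0)
copied = MiniAccessor.new(data, first_idx: 0) == MiniAccessor.new(data.dup, first_idx: 0)
# shared is true; copied is false, because `data.dup` is a different object
```

Comparing by identity avoids an element-by-element comparison of potentially large series, at the cost of treating equal copies as different.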

#inputs ⇒ Array

Builds inputs for the model

Returns:

  • (Array)


# File 'lib/data_modeler/dataset/accessor.rb', line 44

def inputs
  input_idxs.flat_map do |idx|
    input_series.collect do |s|
      data[s][idx]
    end
  end
end
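The nesting above determines the layout of the input vector: values are grouped by index first, then by series within each index. A small standalone illustration, using made-up data and assumed values for `input_idxs`:

```ruby
# Made-up data and indices, for illustration only
data = { time: [1, 2, 3, 4], s1: [10, 20, 30, 40], s2: [11, 21, 31, 41] }
input_series = [:s1, :s2]
input_idxs = [0, 2] # assumed positions of the input datapoints

# Same traversal as #inputs: outer loop over indices, inner loop over series
inputs = input_idxs.flat_map do |idx|
  input_series.collect { |s| data[s][idx] }
end
# inputs == [10, 11, 30, 31]
```

The result is a flat array of `input_idxs.size * input_series.size` values, with all series of the oldest datapoint first.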

#next ⇒ Array

Returns the next triple `[trg_time, inputs, targets]` and advances the target index

Returns:

  • (Array)


# File 'lib/data_modeler/dataset/accessor.rb', line 78

def next
  peek.tap do
    @target_idx += 1
    @input_idxs = init_inputs
  end
end
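The `peek.tap` idiom returns the current element and only then advances the position, so `next` never skips an element. The sketch below shows the same pattern on a hypothetical `MiniCursor` class, which iterates over a plain array rather than a dataset.

```ruby
# Hypothetical stand-in (not the real Accessor) showing the peek/next pattern
class MiniCursor
  def initialize(items)
    @items = items
    @idx = 0
  end

  # Current element, without moving
  def peek
    raise StopIteration if @idx >= @items.size
    @items[@idx]
  end

  # Current element; the position moves forward as a side effect
  def next
    # `tap` returns the peeked value and runs the block afterwards
    peek.tap { @idx += 1 }
  end
end

cursor = MiniCursor.new(%w[a b])
first = cursor.next    # "a"
upcoming = cursor.peek # "b"
```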

#peek ⇒ Array

Returns the next triple `[trg_time, inputs, targets]` without advancing the target index

Returns:

  • (Array)

Raises:

  • (StopIteration)

    when the target index is past the dataset limits



# File 'lib/data_modeler/dataset/accessor.rb', line 71

def peek
  raise StopIteration if target_idx >= end_idx
  [trg_time, inputs, targets]
end
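Raising `StopIteration` fits Ruby's built-in iteration protocol: `Kernel#loop` rescues that exception and ends the loop quietly. The standalone sketch below uses a lambda in place of `peek` and made-up index values; it illustrates the protocol and does not show how `IteratingBasedOnNext` is implemented.

```ruby
# Made-up limits, for illustration only
end_idx = 3
target_idx = 0

# Stand-in for #peek: raises once the index reaches the exclusive end
peek = lambda do
  raise StopIteration if target_idx >= end_idx
  target_idx
end

seen = []
loop do
  seen << peek.call # raises StopIteration on the fourth pass
  target_idx += 1
end
# seen == [0, 1, 2]; `loop` swallowed the StopIteration
```

Index 3 is never visited, consistent with the right-exclusive range described in the constructor note.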

#targets ⇒ Array

Builds targets for the model

Returns:

  • (Array)


# File 'lib/data_modeler/dataset/accessor.rb', line 54

def targets
  target_series.collect do |s|
    data[s][target_idx]
  end
end

#trg_time ⇒ type of `data[:time]`

Returns the time of the current target

Returns:

  • (type of `data[:time]`)


# File 'lib/data_modeler/dataset/accessor.rb', line 62

def trg_time
  data[:time][target_idx]
end

#values ⇒ Array<Array>

Provided for compatibility with Hash: returns a list of the series' data arrays

Returns:

  • (Array<Array>)

    list of values for each series



# File 'lib/data_modeler/dataset/accessor.rb', line 92

def values
  to_a.transpose
end
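To see what the transposition produces, assume `to_a` yields one `[trg_time, inputs, targets]` triple per target, as returned by `#peek`. That is an assumption here, since `to_a` comes from `IteratingBasedOnNext` and its source is not shown. Under that assumption, `transpose` turns a list of triples into three parallel lists:

```ruby
# Made-up rows in the shape returned by #peek: [trg_time, inputs, targets]
rows = [
  [3, [10, 20], [40]],
  [4, [20, 30], [50]]
]

# Transposing regroups the rows by position within the triple
times, inputs, targets = rows.transpose
# times   == [3, 4]
# inputs  == [[10, 20], [20, 30]]
# targets == [[40], [50]]
```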