Class: MoreMath::Sequence

Inherits:
Object
Includes:
Enumerable, MovingAverage
Defined in:
lib/more_math/sequence.rb,
lib/more_math/sequence/moving_average.rb

Overview

This class is used to contain elements and compute various statistical values for them.

Defined Under Namespace

Modules: MovingAverage, Refinement

Instance Attribute Summary

Instance Method Summary

Methods included from MovingAverage

#simple_moving_average

Constructor Details

#initialize(elements) ⇒ Sequence

Returns a new instance of Sequence.



# File 'lib/more_math/sequence.rb', line 10

def initialize(elements)
  @elements = elements.dup.freeze
end
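
The constructor stores a frozen duplicate of its argument. The following sketch illustrates that copy-then-freeze idiom with plain arrays only; it uses no MoreMath classes, and the variable names are made up for the illustration. Note that dup makes a shallow copy, so the element objects themselves are shared.

```ruby
# Plain-Array illustration of the copy-then-freeze idiom (no MoreMath involved).
source = [1, 2, 3]
stored = source.dup.freeze    # independent, immutable snapshot of the array

source << 4                   # later changes to the caller's array
                              # do not reach the snapshot

begin
  stored << 5                 # mutating the frozen copy raises
rescue FrozenError => e       # FrozenError exists since Ruby 2.5
  error_class = e.class
end
```

Because the stored array cannot change, cached statistics computed from it cannot silently go stale when the caller keeps modifying the original array.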

Instance Attribute Details

#elements ⇒ Object (readonly)

Returns the array of elements.



# File 'lib/more_math/sequence.rb', line 15

def elements
  @elements
end

Instance Method Details

#autocorrelation ⇒ Object

Returns the array of autocorrelation values c_k / c_0 (of length size - 1).



# File 'lib/more_math/sequence.rb', line 290

def autocorrelation
  c = autovariance
  Array.new(c.size) { |k| c[k] / c[0] }
end

#autovariance ⇒ Object

Returns the array of autovariances (of length size - 1).



# File 'lib/more_math/sequence.rb', line 278

def autovariance
  Array.new(size - 1) do |k|
    s = 0.0
    0.upto(size - k - 1) do |i|
      s += (@elements[i] - arithmetic_mean) * (@elements[i + k] - arithmetic_mean)
    end
    s / size
  end
end
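
As a rough standalone sketch of the two computations above, the following works on a plain Array. The helper names autocovariances and autocorrelations are hypothetical and not part of MoreMath; the mean is computed inline because the sketch has no memoized arithmetic_mean.

```ruby
# Hypothetical helpers for illustration; they are not MoreMath methods.
def autocovariances(values)
  n = values.size
  mean = values.sum.to_f / n
  Array.new(n - 1) do |k|
    # Sum the products of deviations that are k positions apart.
    s = (0...(n - k)).sum { |i| (values[i] - mean) * (values[i + k] - mean) }
    s / n
  end
end

def autocorrelations(values)
  c = autocovariances(values)
  c.map { |c_k| c_k / c[0] }   # normalise by the lag-0 value
end

r = autocorrelations([1.0, 2.0, 3.0, 4.0, 5.0])
```

For this steadily rising input the lag-0 entry is 1.0 by construction, the lag-1 entry is positive, and the values turn negative at larger lags as the compared points drift to opposite sides of the mean.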

#common_standard_deviation(other) ⇒ Object

Returns an estimate of the common standard deviation of the elements of this Sequence and other.



# File 'lib/more_math/sequence.rb', line 219

def common_standard_deviation(other)
  Math.sqrt(common_variance(other))
end

#common_variance(other) ⇒ Object

Returns an estimate of the common variance of the elements of this Sequence and other.



# File 'lib/more_math/sequence.rb', line 225

def common_variance(other)
  # Pool both sample variances, weighted by their degrees of freedom.
  ((size - 1) * sample_variance + (other.size - 1) * other.sample_variance) /
    (size + other.size - 2)
end
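
A common (pooled) variance estimate is conventionally the weighted average of both sample variances, with weights equal to their degrees of freedom. The sketch below works through that formula on plain arrays; sample_variance and pooled_variance are hypothetical helpers, and the n - 1 divisor for the sample variance is an assumption about what the library's sample_variance computes.

```ruby
# Hypothetical helpers for illustration; they are not MoreMath methods.
def sample_variance(values)
  mean = values.sum.to_f / values.size
  values.sum { |x| (x - mean)**2 } / (values.size - 1)   # assumes the n - 1 divisor
end

def pooled_variance(a, b)
  ((a.size - 1) * sample_variance(a) + (b.size - 1) * sample_variance(b)) /
    (a.size + b.size - 2)
end

a = [1.0, 2.0, 3.0]         # sample variance 1.0
b = [2.0, 4.0, 6.0, 8.0]    # sample variance 20/3
pooled = pooled_variance(a, b)
pooled_sd = Math.sqrt(pooled)
```

Here the numerator is 2 * 1.0 + 3 * 20/3 = 22 and the denominator is 5, so the pooled variance comes out at 4.4. The parentheses around the whole numerator matter: without them only the second term would be divided.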

#compute_student_df(other) ⇒ Object

Computes the number of degrees of freedom for Student’s t-test.



# File 'lib/more_math/sequence.rb', line 231

def compute_student_df(other)
  size + other.size - 2
end

#compute_welch_df(other) ⇒ Object

Use an approximation of the Welch-Satterthwaite equation to compute the degrees of freedom for Welch’s t-test.



# File 'lib/more_math/sequence.rb', line 200

def compute_welch_df(other)
  (sample_variance / size + other.sample_variance / other.size) ** 2 / (
    (sample_variance ** 2 / (size ** 2 * (size - 1))) +
    (other.sample_variance ** 2 / (other.size ** 2 * (other.size - 1))))
end
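
To get a feel for the Welch-Satterthwaite expression, here is a small standalone sketch that takes the two sample variances and sizes as plain numbers. The function name welch_df and its parameters are hypothetical; only the formula mirrors the method above.

```ruby
# Hypothetical helper for illustration; variances must be Floats to avoid
# integer division.
def welch_df(var_a, n_a, var_b, n_b)
  numerator = (var_a / n_a + var_b / n_b)**2
  denominator = var_a**2 / (n_a**2 * (n_a - 1)) +
                var_b**2 / (n_b**2 * (n_b - 1))
  numerator / denominator
end

equal   = welch_df(4.0, 10, 4.0, 10)   # equal variances, equal sizes
unequal = welch_df(1.0, 10, 9.0, 10)   # strongly unequal variances
```

With equal variances and equal sizes the result coincides with the Student value n_a + n_b - 2 (18 here). As the variances diverge, the estimate drops towards the degrees of freedom of the noisier sample, which makes the test more conservative.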

#confidence_interval(alpha = 0.05) ⇒ Object

Returns the confidence interval for the arithmetic mean of this Sequence instance's elements at significance level alpha, as a Range object.



# File 'lib/more_math/sequence.rb', line 270

def confidence_interval(alpha = 0.05)
  td = TDistribution.new(size - 1)
  t = td.inverse_probability(alpha / 2).abs
  delta = t * sample_standard_deviation / Math.sqrt(size)
  (arithmetic_mean - delta)..(arithmetic_mean + delta)
end

#cover?(other, alpha = 0.05) ⇒ Boolean

Returns true if this Sequence instance covers other, that is, if Welch’s t-test does not reject the hypothesis that both arithmetic means are equal at the alpha error level.

Returns:

  • (Boolean)


# File 'lib/more_math/sequence.rb', line 262

def cover?(other, alpha = 0.05)
  t = t_welch(other)
  td = TDistribution.new(compute_welch_df(other))
  t.abs < td.inverse_probability(1 - alpha.abs / 2.0)
end

#detect_autocorrelation(lags = 20, alpha_level = 0.05) ⇒ Object

This method tries to detect autocorrelation with the Ljung-Box statistic. If enough lags can be considered, it returns a hash with the results; otherwise nil is returned. The keys are

:lags

the number of lags,

:alpha_level

the alpha level for the test,

:q

the value of the ljung_box_statistic,

:p

the probability computed for q from the chi-square distribution with lags degrees of freedom; autocorrelation is reported when p is at least 1 - alpha_level,

:detected

true if a correlation was found.



# File 'lib/more_math/sequence.rb', line 323

def detect_autocorrelation(lags = 20, alpha_level = 0.05)
  if q = ljung_box_statistic(lags)
    p = ChiSquareDistribution.new(lags).probability(q)
    return {
      :lags         => lags,
      :alpha_level  => alpha_level,
      :q            => q,
      :p            => p,
      :detected     => p >= 1 - alpha_level,
    }
  end
end

#detect_outliers(factor = 3.0, epsilon = 1E-5) ⇒ Object

Returns a result hash with the number of :very_low, :low, :high, and :very_high outliers, determined by the box plot algorithm. The hash also contains the :median, the :iqr, and the :factor that were used. If no outliers were found or the iqr is less than epsilon, nil is returned.



# File 'lib/more_math/sequence.rb', line 340

def detect_outliers(factor = 3.0, epsilon = 1E-5)
  half_factor = factor / 2.0
  quartile1 = percentile(25)
  quartile3 = percentile(75)
  iqr = quartile3 - quartile1
  iqr < epsilon and return
  result = @elements.inject(Hash.new(0)) do |h, t|
    extreme =
      case t
      when -Infinity..(quartile1 - factor * iqr)
        :very_low
      when (quartile1 - factor * iqr)..(quartile1 - half_factor * iqr)
        :low
      when (quartile3 + half_factor * iqr)..(quartile3 + factor * iqr)
        :high
      when (quartile3 + factor * iqr)..Infinity
        :very_high
      end and h[extreme] += 1
    h
  end
  unless result.empty?
    result[:median] = median
    result[:iqr] = iqr
    result[:factor] = factor
    result
  end
end
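
The box plot classification can be sketched independently of the class as follows. This is only an illustration under stated assumptions: classify_outliers is a hypothetical helper, the quartiles are passed in instead of being computed, and the upper fences are assumed to mirror the lower ones around the third quartile.

```ruby
# Hypothetical helper for illustration; it is not a MoreMath method.
def classify_outliers(values, quartile1, quartile3, factor = 3.0)
  iqr = quartile3 - quartile1
  half_factor = factor / 2.0
  values.each_with_object(Hash.new(0)) do |value, counts|
    label =
      if value <= quartile1 - factor * iqr
        :very_low
      elsif value <= quartile1 - half_factor * iqr
        :low
      elsif value >= quartile3 + factor * iqr
        :very_high
      elsif value >= quartile3 + half_factor * iqr
        :high
      end
    counts[label] += 1 if label   # values inside the fences are not counted
  end
end

counts = classify_outliers([-50, 0, 10, 12, 14, 16, 30, 60], 10.0, 16.0)
```

With quartiles 10 and 16 the iqr is 6, so the inner fences sit at 1 and 25 and the outer fences at -8 and 34. The sample input places exactly one value in each of the four outlier classes.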

#durbin_watson_statistic ⇒ Object

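Returns the d-value of the Durbin-Watson statistic, computed on the residuals of the linear regression. The value is well below 2 (d << 2) for positive autocorrelation, well above 2 (d >> 2) for negative autocorrelation, and around 2 for no autocorrelation. If there is at most one residual, 2.0 is returned.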



# File 'lib/more_math/sequence.rb', line 297

def durbin_watson_statistic
  e = linear_regression.residuals
  e.size <= 1 and return 2.0
  (1...e.size).inject(0.0) { |s, i| s + (e[i] - e[i - 1]) ** 2 } /
    e.inject(0.0) { |s, x| s + x ** 2 }
end
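
The statistic itself is a ratio of two sums and can be tried out on any list of residuals. The sketch below assumes the residuals are already available as an Array of Floats; durbin_watson is a hypothetical helper, whereas the method above obtains its residuals from linear_regression.

```ruby
# Hypothetical helper for illustration; it takes residuals directly.
def durbin_watson(residuals)
  return 2.0 if residuals.size <= 1
  # Squared differences of neighbouring residuals ...
  numerator = residuals.each_cons(2).sum { |(a, b)| (b - a)**2 }
  # ... relative to the total squared residuals.
  denominator = residuals.sum { |e| e**2 }
  numerator / denominator
end

alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])             # sign flips every step
clustered   = durbin_watson([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])  # long runs of one sign
```

Residuals that flip sign at every step give a value above 2 (negative autocorrelation), while residuals that stay on one side for long runs give a value below 2 (positive autocorrelation).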

#each(&block) ⇒ Object

Calls the block for every element of this Sequence.



# File 'lib/more_math/sequence.rb', line 18

def each(&block)
  @elements.each(&block)
end

#empty? ⇒ Boolean

Returns true if this sequence is empty, otherwise false.

Returns:

  • (Boolean)


# File 'lib/more_math/sequence.rb', line 24

def empty?
  @elements.empty?
end

#histogram(bins) ⇒ Object

Returns a Histogram instance for the elements of this Sequence, using bins as the number of bins.



# File 'lib/more_math/sequence.rb', line 377

def histogram(bins)
  Histogram.new(self, bins)
end

#ljung_box_statistic(lags = 20) ⇒ Object

Returns the q value of the Ljung-Box statistic for the given number of lags. A higher value might indicate autocorrelation in the elements of this Sequence instance. This method returns nil if lags is not smaller than the number of available autocorrelation values (size - 1).



# File 'lib/more_math/sequence.rb', line 308

def ljung_box_statistic(lags = 20)
  r = autocorrelation
  lags >= r.size and return
  n = size
  n * (n + 2) * (1..lags).inject(0.0) { |s, i| s + r[i] ** 2 / (n - i) }
end
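
Written out, the statistic is Q = n (n + 2) * sum over k = 1..lags of r_k^2 / (n - k). A minimal sketch under the assumption that the autocorrelation values are already at hand as an Array whose index is the lag; ljung_box_q is a hypothetical helper name.

```ruby
# Hypothetical helper for illustration; r[k] is the autocorrelation at lag k,
# n is the number of observations the autocorrelations were computed from.
def ljung_box_q(r, n, lags)
  return nil if lags >= r.size   # not enough lags available
  n * (n + 2) * (1..lags).sum { |k| r[k]**2 / (n - k) }
end

r = [1.0, 0.4, -0.1, -0.4]       # autocorrelations of a five-element series
q = ljung_box_q(r, 5, 2)
too_many = ljung_box_q(r, 5, 4)  # nil, because only lags 1..3 exist
```

Each squared autocorrelation is weighted by 1 / (n - k), so higher lags, which are estimated from fewer pairs, contribute relatively more per unit of correlation.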

#percentile(p = 50) ⇒ Object Also known as: median

Returns the p-percentile of the elements. There are many methods to compute the percentile; this method uses the weighted average at x_(n + 1)p, which allows p to be in 0...100 (excluding 100).



# File 'lib/more_math/sequence.rb', line 177

def percentile(p = 50)
  (0...100).include?(p) or
    raise ArgumentError, "p = #{p}, but has to be in (0...100)"
  p /= 100.0
  sorted_elements = sorted
  r = p * (sorted_elements.size + 1)
  r_i = r.to_i
  r_f = r - r_i
  if r_i >= 1
    result = sorted_elements[r_i - 1]
    if r_i < sorted_elements.size
      result += r_f * (sorted_elements[r_i] - sorted_elements[r_i - 1])
    end
  else
    result = sorted_elements[0]
  end
  result
end
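
The interpolation rule is easiest to follow on a small input. The sketch below restates the same steps as a standalone function over a plain Array; the function and its local names are chosen for the illustration, and sorting is done inline instead of through the class's sorted method.

```ruby
# Standalone restatement for illustration; it is not the library method itself.
def percentile(values, p = 50)
  (0...100).include?(p) or
    raise ArgumentError, "p = #{p}, but has to be in (0...100)"
  sorted = values.sort
  rank = p / 100.0 * (sorted.size + 1)   # fractional 1-based rank
  whole = rank.to_i
  fraction = rank - whole
  return sorted[0] if whole < 1          # rank falls before the first element
  result = sorted[whole - 1]
  if whole < sorted.size
    # Move the fractional part of the way towards the next element.
    result += fraction * (sorted[whole] - sorted[whole - 1])
  end
  result
end

median   = percentile([10, 20, 30, 40], 50)   # rank 2.5
quartile = percentile([10, 20, 30, 40], 25)   # rank 1.25
top      = percentile([10, 20, 30, 40], 90)   # rank 4.5, clamped to the last element
```

For four elements and p = 50 the rank is 0.5 * 5 = 2.5, so the result lies halfway between the second and third element. Ranks below 1 or at the last element are clamped to the smallest and largest value.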

#push(element) ⇒ Object Also known as: <<

Returns a new Sequence instance with element appended as its last element; this Sequence itself is not modified.



# File 'lib/more_math/sequence.rb', line 47

def push(element)
  Sequence.new(@elements.dup.push(element))
end

#reset ⇒ Object

Reset all memoized values of this sequence.



# File 'lib/more_math/sequence.rb', line 34

def reset
  self.class.mize_cache_clear
  self
end

#size ⇒ Object

Returns the number of elements, on which the analysis is based.



# File 'lib/more_math/sequence.rb', line 29

def size
  @elements.size
end

#suggested_sample_size(other, alpha = 0.05, beta = 0.05) ⇒ Object

Computes a sample size that is more likely to reveal a mean difference between this instance’s elements and those of other. alpha and beta are used as the levels for type I and type II errors.



# File 'lib/more_math/sequence.rb', line 249

def suggested_sample_size(other, alpha = 0.05, beta = 0.05)
  alpha, beta = alpha.abs, beta.abs
  signal = arithmetic_mean - other.arithmetic_mean
  df = size + other.size - 2
  pooled_variance_estimate = (sum_of_squares + other.sum_of_squares) / df
  td = TDistribution.new df
  (((td.inverse_probability(alpha) + td.inverse_probability(beta)) *
    Math.sqrt(pooled_variance_estimate)) / signal) ** 2
end

#t_student(other) ⇒ Object

Returns the t value of Student’s t-test between this Sequence instance and other.



# File 'lib/more_math/sequence.rb', line 237

def t_student(other)
  signal = arithmetic_mean - other.arithmetic_mean
  noise = common_standard_deviation(other) *
    Math.sqrt(size ** -1 + other.size ** -1)
  signal / noise
rescue Errno::EDOM
  0.0
end
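
For reference, the textbook form of the pooled two-sample t value divides the mean difference by the pooled standard deviation times sqrt(1/n_a + 1/n_b), using both sample sizes. The following standalone sketch assumes plain arrays; mean, sample_variance and student_t are hypothetical helpers, not library methods.

```ruby
# Hypothetical helpers for illustration; they are not MoreMath methods.
def mean(values)
  values.sum.to_f / values.size
end

def sample_variance(values)
  m = mean(values)
  values.sum { |x| (x - m)**2 } / (values.size - 1)
end

def student_t(a, b)
  pooled = ((a.size - 1) * sample_variance(a) + (b.size - 1) * sample_variance(b)) /
           (a.size + b.size - 2)
  signal = mean(a) - mean(b)
  # The noise term uses the sizes of both samples.
  noise = Math.sqrt(pooled) * Math.sqrt(1.0 / a.size + 1.0 / b.size)
  signal / noise
end

t = student_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0])
```

The sign of t follows the order of the arguments: it is negative here because the first sample has the smaller mean.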

#t_welch(other) ⇒ Object

Returns the t value of Welch’s t-test between this Sequence instance and other.



# File 'lib/more_math/sequence.rb', line 208

def t_welch(other)
  signal = arithmetic_mean - other.arithmetic_mean
  noise = Math.sqrt(sample_variance / size +
    other.sample_variance / other.size)
  signal / noise
rescue Errno::EDOM
  0.0
end
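
Unlike the pooled variant, Welch's t value keeps the two variances separate, each divided by its own sample size. A small standalone sketch on plain arrays follows; mean, sample_variance and welch_t are hypothetical helper names, and the n - 1 divisor is an assumption about the library's sample_variance.

```ruby
# Hypothetical helpers for illustration; they are not MoreMath methods.
def mean(values)
  values.sum.to_f / values.size
end

def sample_variance(values)
  m = mean(values)
  values.sum { |x| (x - m)**2 } / (values.size - 1)   # assumes the n - 1 divisor
end

def welch_t(a, b)
  signal = mean(a) - mean(b)
  noise = Math.sqrt(sample_variance(a) / a.size + sample_variance(b) / b.size)
  signal / noise
end

a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0, 8.0]
t = welch_t(a, b)
same = welch_t(a, a)   # identical samples have no mean difference
```

For these inputs the variance terms are 1/3 and 5/3, so the noise is sqrt(2) and the mean difference of -3 is scaled by that amount. Comparing a sample with itself gives a t value of zero.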

#to_ary ⇒ Object Also known as: to_a

Returns a copy of the elements as an Array.



# File 'lib/more_math/sequence.rb', line 39

def to_ary
  @elements.dup
end