Module: FeldtRuby::Statistics

Included in:
FeldtRuby
Defined in:
lib/feldtruby/statistics/time_series/sax.rb,
lib/feldtruby/statistics.rb,
lib/feldtruby/statistics/design_of_experiments.rb,
lib/feldtruby/statistics/distance/string_distance.rb

Overview

Implements the basic SAX (Symbolic Aggregate approXimation) from the paper:

Jessica Lin, Eamonn Keogh, Stefano Lonardi, Bill Chiu,
"A Symbolic Representation of Time Series, with Implications for Streaming Algorithms", DMKD 2003.

available from: www.cs.ucr.edu/~eamonn/SAX.pdf
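
As a rough illustration of what SAX does, the sketch below z-normalizes a series, averages it into equal-width frames (Piecewise Aggregate Approximation), and maps each frame mean to a letter. It is a minimal standalone sketch, not the FeldtRuby SAX class: the names `z_normalize`, `paa`, and `sax_word` are hypothetical, the alphabet size is fixed at 4, and the series length is assumed to be divisible by the number of segments.

```ruby
# Breakpoints that cut a standard normal distribution into 4 equiprobable
# regions (alphabet size 4). Assumption: fixed alphabet for this sketch.
BREAKPOINTS = [-0.67, 0.0, 0.67].freeze
ALPHABET = %w[a b c d].freeze

# Rescale the series to mean 0 and standard deviation 1.
def z_normalize(values)
  mean = values.sum.to_f / values.length
  sd = Math.sqrt(values.sum { |v| (v - mean)**2 } / values.length)
  return values.map { 0.0 } if sd.zero? # constant series: avoid division by zero
  values.map { |v| (v - mean) / sd }
end

# Piecewise Aggregate Approximation: the mean of each equal-width frame.
def paa(values, segments)
  values.each_slice(values.length / segments).map { |frame| frame.sum / frame.length }
end

# Hypothetical helper: turn a numeric series into a SAX word.
def sax_word(values, segments)
  unless (values.length % segments).zero?
    raise ArgumentError, "series length must be divisible by segments"
  end
  paa(z_normalize(values), segments).map do |mean|
    # The symbol index is the number of breakpoints at or below the frame mean.
    ALPHABET[BREAKPOINTS.count { |b| mean >= b }]
  end.join
end

puts sax_word([1, 2, 3, 4, 5, 6, 7, 8], 4)
```

A steadily rising series is mapped to letters in increasing order, which is the behaviour one would expect from the symbolic representation.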

Defined Under Namespace

Modules: DesignOfExperiments, Plotting
Classes: CompressionBasedDissimilarityMeasure, DiffusionKDE, NormalizedCompressionDistance, SAX, StringDistance

Instance Method Details

#cdm(string1, string2) ⇒ Object



# File 'lib/feldtruby/statistics/distance/string_distance.rb', line 45

def cdm(string1, string2)
  (@cdm ||= CompressionBasedDissimilarityMeasure.new).distance(string1, string2)
end
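
The compression-based dissimilarity measure is usually defined as the compressed size of the concatenation divided by the sum of the individual compressed sizes. The sketch below shows that ratio with Ruby's standard `zlib` library. It is an assumption that this matches what `CompressionBasedDissimilarityMeasure` does internally; the compressor choice and the names `deflated_size` and `cdm_sketch` are hypothetical.

```ruby
require "zlib"

# Size in bytes of the zlib-compressed string (assumed compressor).
def deflated_size(string)
  Zlib::Deflate.deflate(string).bytesize
end

# CDM(x, y) = C(xy) / (C(x) + C(y)); lower values suggest more shared structure.
def cdm_sketch(string1, string2)
  deflated_size(string1 + string2).to_f /
    (deflated_size(string1) + deflated_size(string2))
end

text = "the quick brown fox jumps over the lazy dog " * 20
rng = Random.new(1)
noise = Array.new(text.length) { (97 + rng.rand(26)).chr }.join

puts cdm_sketch(text, text)  # similar inputs compress well together
puts cdm_sketch(text, noise) # unrelated inputs do not
```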

#chi_squared_test(aryOrHashOfCounts) ⇒ Object



# File 'lib/feldtruby/statistics.rb', line 181

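def chi_squared_test(aryOrHashOfCounts)
  # Debug output: echoes the input to stdout on every call.
  puts "aryOrHashOfCounts = #{aryOrHashOfCounts}"
  # Accept either a hash of counts or an array of raw values to be counted.
  counts = (Hash === aryOrHashOfCounts) ? aryOrHashOfCounts : aryOrHashOfCounts.counts
  vs = counts.values
  # Run R's chisq.test on the counts and return its p-value.
  res = RC.call("chisq.test", vs)
  res.p_value
end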
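
When R's `chisq.test` is given a single vector of counts, it performs a goodness-of-fit test against equal expected counts. The sketch below computes only the test statistic in plain Ruby, to show what is being measured; it does not compute the p-value, which comes from a chi-squared distribution with k - 1 degrees of freedom. The name `chi_squared_statistic` is hypothetical, and `Array#tally` (Ruby 2.7+) is assumed to be an adequate stand-in for the `counts` array extension used above.

```ruby
# Hypothetical helper: chi-squared statistic against a uniform expectation.
def chi_squared_statistic(ary_or_hash_of_counts)
  observed =
    if ary_or_hash_of_counts.is_a?(Hash)
      ary_or_hash_of_counts.values
    else
      ary_or_hash_of_counts.tally.values # count occurrences of each unique value
    end
  expected = observed.sum.to_f / observed.length
  # Sum of squared deviations from the expected count, scaled by that count.
  observed.sum { |o| (o - expected)**2 / expected }
end

puts chi_squared_statistic({ a: 10, b: 20, c: 30 })
puts chi_squared_statistic(%i[a a b b])
```

A statistic of zero means the observed counts are exactly equal; larger values indicate stronger departure from equal proportions.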

#correlation(ary1, ary2) ⇒ Object



# File 'lib/feldtruby/statistics.rb', line 189

def correlation(ary1, ary2)
  RC.call("cor", ary1, ary2)
end
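
R's `cor` computes the Pearson correlation coefficient by default. For readers without an R installation, the following is a minimal pure-Ruby sketch of that coefficient; the name `pearson_correlation` is hypothetical and the function is not part of FeldtRuby.

```ruby
# Hypothetical helper: Pearson correlation of two equally long numeric arrays.
def pearson_correlation(xs, ys)
  raise ArgumentError, "arrays must have equal length" unless xs.length == ys.length

  n = xs.length
  mean_x = xs.sum.to_f / n
  mean_y = ys.sum.to_f / n
  # Sum of cross-deviations and sums of squared deviations from the means.
  cross = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  ss_x = xs.sum { |x| (x - mean_x)**2 }
  ss_y = ys.sum { |y| (y - mean_y)**2 }
  cross / Math.sqrt(ss_x * ss_y)
end

puts pearson_correlation([1, 2, 3], [2, 4, 6])
puts pearson_correlation([1, 2, 3], [6, 4, 2])
```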

#density_estimation(values, n = 2**9, min = nil, max = nil) ⇒ Object

Perform a kernel density estimation of the sampled values using n bins (rounded up to the nearest power of 2). The optional min and max bounds are passed on only when both are given.



# File 'lib/feldtruby/statistics.rb', line 221

def density_estimation(values, n = 2**9, min = nil, max = nil)
  # Ensure we have loaded the diffusion.kde code
  RC.load_feldtruby_r_script("diffusion_kde.R")
  args = [values, n]
  if min && max
    args << min
    args << max
  end
  DiffusionKDE.new RC.call("diffusion.kde", *args)
end
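
Two details of this method can be sketched without R: how a bin count is rounded up to a power of 2, and how the argument list is assembled. Both helpers below, `next_power_of_two` and `kde_arguments`, are hypothetical names, and it is an assumption that the rounding happens inside the R script rather than in Ruby.

```ruby
# Smallest power of 2 that is greater than or equal to n.
def next_power_of_two(n)
  n <= 1 ? 1 : 1 << (n - 1).bit_length
end

# Mirror of the argument handling above: the bounds are appended only when
# both min and max are supplied.
def kde_arguments(values, n = 2**9, min = nil, max = nil)
  args = [values, n]
  args.push(min, max) if min && max
  args
end

puts next_power_of_two(500)
p kde_arguments([1.0, 2.5, 3.0], 16, 0.0, 4.0)
p kde_arguments([1.0, 2.5, 3.0], 16, 0.0) # max missing, so min is dropped too
```

One consequence of this argument handling is that supplying only one of the two bounds has no effect.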

#ncd(string1, string2) ⇒ Object



# File 'lib/feldtruby/statistics/distance/string_distance.rb', line 30

def ncd(string1, string2)
  (@ncd ||= NormalizedCompressionDistance.new).distance(string1, string2)
end
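
The normalized compression distance is commonly defined as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size. The sketch below applies that formula with Ruby's standard `zlib` library. Whether `NormalizedCompressionDistance` uses this compressor is an assumption, and the names `zlib_size` and `ncd_sketch` are hypothetical.

```ruby
require "zlib"

# Size in bytes of the zlib-compressed string (assumed compressor).
def zlib_size(string)
  Zlib::Deflate.deflate(string).bytesize
end

# NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))
def ncd_sketch(string1, string2)
  c1 = zlib_size(string1)
  c2 = zlib_size(string2)
  c12 = zlib_size(string1 + string2)
  (c12 - [c1, c2].min).to_f / [c1, c2].max
end

text = "the quick brown fox jumps over the lazy dog " * 20
rng = Random.new(1)
noise = Array.new(text.length) { (97 + rng.rand(26)).chr }.join

puts ncd_sketch(text, text)  # close to 0 for identical inputs
puts ncd_sketch(text, noise) # close to 1 for unrelated inputs
```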

#probability_of_same_proportions(aryOrHashOfCounts) ⇒ Object

Calculate the p-value for the hypothesis that the unique values in an array (or a hash of counts of the values) occur in statistically equal proportions.



# File 'lib/feldtruby/statistics.rb', line 174

def probability_of_same_proportions(aryOrHashOfCounts)
  counts = (Hash === aryOrHashOfCounts) ? aryOrHashOfCounts : aryOrHashOfCounts.counts
  vs = counts.values
  res = RC.call("prop.test", vs, ([vs.sum] * vs.length))
  res.p_value
end
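
To make the call to `prop.test` easier to follow, the sketch below reproduces only the argument preparation in plain Ruby: each unique value's count is treated as a number of successes out of the total number of observations. The name `prop_test_arguments` is hypothetical, and `Array#tally` (Ruby 2.7+) is assumed to behave like the `counts` array extension used above.

```ruby
# Hypothetical helper: build the two vectors handed to R's prop.test.
def prop_test_arguments(ary_or_hash_of_counts)
  counts =
    if ary_or_hash_of_counts.is_a?(Hash)
      ary_or_hash_of_counts
    else
      ary_or_hash_of_counts.tally # count occurrences of each unique value
    end
  successes = counts.values
  # Every group is compared against the same total number of observations.
  trials = [successes.sum] * successes.length
  [successes, trials]
end

p prop_test_arguments(%w[a a b c])
p prop_test_arguments({ "x" => 5, "y" => 7 })
```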