Class: RubyStatistics::StatisticalTest::KolmogorovSmirnovTest

Inherits:
Object
Defined in:
lib/ruby-statistics/statistical_test/kolmogorov_smirnov_test.rb

Class Method Summary

Class Method Details

.two_samples(group_one:, group_two:, alpha: 0.05) ⇒ Object

The common alpha and the critical D are calculated following the formulas from: en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test#Two-sample_Kolmogorov%E2%80%93Smirnov_test
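
As a point of comparison, the critical value from the referenced Wikipedia section can be sketched in a few lines of plain Ruby. The helper names `ks_c_alpha` and `ks_critical_d` are hypothetical and are not part of this library; the sketch follows the article's c(alpha) = sqrt(-ln(alpha / 2) / 2), whereas the method's source passes alpha to Math.log without halving it, which is the step its TODO comment flags for validation.

```ruby
# Hypothetical helpers for illustration only; not part of ruby-statistics.

# c(alpha) as given in the Wikipedia two-sample section:
# c(alpha) = sqrt(-ln(alpha / 2) * 1/2)
def ks_c_alpha(alpha)
  Math.sqrt(-0.5 * Math.log(alpha / 2.0))
end

# Critical D for two samples of sizes n and m:
# D_crit = c(alpha) * sqrt((n + m) / (n * m))
def ks_critical_d(alpha:, n:, m:)
  ks_c_alpha(alpha) * Math.sqrt((n + m).to_f / (n * m))
end

puts ks_c_alpha(0.05).round(3)
puts ks_critical_d(alpha: 0.05, n: 10, m: 10).round(3)
```

With alpha = 0.05 the coefficient comes out at roughly 1.358, which matches the value tabulated in the article.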



# File 'lib/ruby-statistics/statistical_test/kolmogorov_smirnov_test.rb', line 5

def self.two_samples(group_one:, group_two:, alpha: 0.05)
  samples = group_one + group_two # The groups may be unbalanced (different sizes)

  ecdf_one = Distribution::Empirical.new(samples: group_one)
  ecdf_two = Distribution::Empirical.new(samples: group_two)

  d_max = samples.sort.map do |sample|
    d1 = ecdf_one.cumulative_function(x: sample)
    d2 = ecdf_two.cumulative_function(x: sample)

    (d1 - d2).abs
  end.max

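  # TODO: Validate the calculation of the common alpha. The referenced Wikipedia
  # formula is c(alpha) = sqrt(-ln(alpha / 2) / 2); the line below uses ln(alpha).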
  common_alpha = Math.sqrt((-0.5 * Math.log(alpha)))
  radicand = (group_one.size + group_two.size) / (group_one.size * group_two.size).to_r

  critical_d = common_alpha * Math.sqrt(radicand)
  # critical_d = self.critical_d(alpha: alpha, n: samples.size)

  # We are unable to calculate the p-value because the Kolmogorov distribution is not
  # defined here. We reject the null hypothesis if d_max is greater than critical_d.
  { d_max: d_max,
    d_critical: critical_d,
    total_samples: samples.size,
    alpha: alpha,
    null: d_max <= critical_d,
    alternative: d_max > critical_d,
    confidence_level: 1.0 - alpha }
end
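
The D statistic computed above can be illustrated with a self-contained sketch that does not depend on the library. The `ecdf` and `ks_d_max` helpers are hypothetical stand-ins; in particular, the sketch assumes that `Distribution::Empirical#cumulative_function` returns the fraction of samples less than or equal to `x`, which the source shown here does not confirm.

```ruby
# Hypothetical stand-in for Distribution::Empirical#cumulative_function:
# the fraction of observations less than or equal to x (an assumption).
def ecdf(samples, x)
  samples.count { |s| s <= x } / samples.size.to_f
end

# Largest absolute gap between the two empirical CDFs,
# evaluated at every observed value from both groups.
def ks_d_max(group_one, group_two)
  (group_one + group_two).map do |x|
    (ecdf(group_one, x) - ecdf(group_two, x)).abs
  end.max
end

group_one = [1.0, 2.0, 3.0, 4.0, 5.0]
group_two = [3.5, 4.5, 5.5, 6.5, 7.5]

d_max = ks_d_max(group_one, group_two)
puts d_max.round(2)
```

The method documented here then compares this value against the critical D and reports the outcome in the `null` and `alternative` keys of the returned hash.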