Module: Ai4r::Data::Statistics
- Defined in:
- lib/ai4r/data/statistics.rb
Overview
This module provides basic statistics functions (mean, variance, standard deviation, mode, minimum and maximum) that operate on a single attribute of a data set.
Class Method Summary
- .max(data_set, attribute) ⇒ Object
  Get the maximum value of an attribute in the data set.
- .mean(data_set, attribute) ⇒ Object
  Get the sample mean.
- .min(data_set, attribute) ⇒ Object
  Get the minimum value of an attribute in the data set.
- .mode(data_set, attribute) ⇒ Object
  Get the sample mode.
- .standard_deviation(data_set, attribute, variance = nil) ⇒ Object
  Get the standard deviation.
- .variance(data_set, attribute, mean = nil) ⇒ Object
  Get the variance.
Class Method Details
.max(data_set, attribute) ⇒ Object
Get the maximum value of an attribute in the data set. Returns -Infinity if the data set has no items.
    # File 'lib/ai4r/data/statistics.rb', line 62
    def self.max(data_set, attribute)
      index = data_set.get_index(attribute)
      item = data_set.data_items.max {|x,y| x[index] <=> y[index]}
      return (item) ? item[index] : (-1.0/0)
    end
.mean(data_set, attribute) ⇒ Object
Get the sample mean of an attribute. The result is NaN if the data set has no items.
    # File 'lib/ai4r/data/statistics.rb', line 20
    def self.mean(data_set, attribute)
      index = data_set.get_index(attribute)
      sum = 0.0
      data_set.data_items.each { |item| sum += item[index] }
      return sum / data_set.data_items.length
    end
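As a minimal sketch of how the attribute lookup and the averaging fit together, the snippet below runs without the gem. `TinyDataSet` is a hypothetical stand-in, not the library's data set class: it offers only `get_index` and `data_items`, the two calls the listing makes, and assumes `get_index` maps a label to its column position. The body of `.mean` is copied into a hypothetical `sample_mean` helper.

```ruby
# Hypothetical stand-in for a data set: it only offers the two calls
# the listing uses, get_index and data_items.
TinyDataSet = Struct.new(:data_labels, :data_items) do
  # Assumption: an attribute label resolves to its column position.
  def get_index(attribute)
    data_labels.index(attribute)
  end
end

# Same logic as .mean, copied into a local helper (the name is illustrative).
def sample_mean(data_set, attribute)
  index = data_set.get_index(attribute)
  sum = 0.0
  data_set.data_items.each { |item| sum += item[index] }
  sum / data_set.data_items.length
end

people = TinyDataSet.new(%w[age height], [[20, 170], [30, 180], [40, 175]])
empty  = TinyDataSet.new(%w[age height], [])

puts sample_mean(people, 'age')      # prints 30.0
puts sample_mean(empty, 'age').nan?  # prints true, since 0.0 / 0 is NaN
```

Callers that may pass an empty data set would need to check for NaN themselves, since no exception is raised.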
.min(data_set, attribute) ⇒ Object
Get the minimum value of an attribute in the data set. Returns Infinity if the data set has no items.
    # File 'lib/ai4r/data/statistics.rb', line 69
    def self.min(data_set, attribute)
      index = data_set.get_index(attribute)
      item = data_set.data_items.min {|x,y| x[index] <=> y[index]}
      return (item) ? item[index] : (1.0/0)
    end
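The empty-data-set fallbacks of `.max` and `.min` can be sketched as follows. This is an illustration rather than the library's code path: the hypothetical helpers `column_max` and `column_min` take the rows and a column index directly, skipping the `get_index` lookup, and use `Float::INFINITY`, which is the same value as `1.0/0`.

```ruby
# Hypothetical helpers mirroring .max and .min on plain arrays of rows.
def column_max(rows, index)
  item = rows.max { |x, y| x[index] <=> y[index] }
  item ? item[index] : -Float::INFINITY
end

def column_min(rows, index)
  item = rows.min { |x, y| x[index] <=> y[index] }
  item ? item[index] : Float::INFINITY
end

rows = [[20, 170], [30, 180], [40, 175]]

puts column_max(rows, 1)  # prints 180
puts column_min(rows, 1)  # prints 170

# With no rows, max and min return nil, so the sentinels are returned.
# They behave as identities when folded into further comparisons:
puts [column_max([], 1), 5].max  # prints 5
puts [column_min([], 1), 5].min  # prints 5
```

Returning an infinite sentinel instead of nil lets a caller combine results from several data sets without special-casing the empty ones.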
.mode(data_set, attribute) ⇒ Object
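Get the sample mode, that is, the most frequent value of an attribute. If several values tie, the one that reaches the highest count first is returned. An empty data set yields nil.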
    # File 'lib/ai4r/data/statistics.rb', line 45
    def self.mode(data_set, attribute)
      index = data_set.get_index(attribute)
      count = Hash.new {0}
      max_count = 0
      mode = nil
      data_set.data_items.each do |data_item|
        attr_value = data_item[index]
        attr_count = (count[attr_value] += 1)
        if attr_count > max_count
          mode = attr_value
          max_count = attr_count
        end
      end
      return mode
    end
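Because the running maximum is replaced only on a strictly greater count, a tie goes to the value that reaches the highest count first in iteration order. A small sketch of that behaviour, using a hypothetical `column_mode` helper that takes rows and a column index directly instead of a data set and an attribute:

```ruby
# Hypothetical helper with the same counting loop as .mode.
def column_mode(rows, index)
  count = Hash.new(0)
  max_count = 0
  mode = nil
  rows.each do |row|
    value = row[index]
    value_count = (count[value] += 1)
    # Strict comparison: a later value that only ties does not replace the mode.
    if value_count > max_count
      mode = value
      max_count = value_count
    end
  end
  mode
end

rows = [['red'], ['blue'], ['blue'], ['red']]
puts column_mode(rows, 0)  # prints blue, because 'blue' reaches a count of 2 first
p column_mode([], 0)       # prints nil for an empty list of rows
```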
.standard_deviation(data_set, attribute, variance = nil) ⇒ Object
Get the standard deviation. If you have already computed the variance, pass it in to avoid computing it again.
    # File 'lib/ai4r/data/statistics.rb', line 39
    def self.standard_deviation(data_set, attribute, variance = nil)
      variance ||= variance(data_set, attribute)
      Math.sqrt(variance)
    end
.variance(data_set, attribute, mean = nil) ⇒ Object
Get the sample variance (the sum of squared deviations from the mean, divided by n - 1). If you have already computed the mean, pass it in to avoid computing it again.
    # File 'lib/ai4r/data/statistics.rb', line 29
    def self.variance(data_set, attribute, mean = nil)
      index = data_set.get_index(attribute)
      # Use the caller-supplied mean when given; compute it otherwise.
      mean ||= mean(data_set, attribute)
      sum = 0.0
      data_set.data_items.each { |item| sum += (item[index]-mean)**2 }
      return sum / (data_set.data_items.length-1)
    end
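As a worked example of the `length-1` denominator and of reusing an already computed variance for the standard deviation, here is a sketch on plain arrays. The helper names `column_variance` and `column_std` are hypothetical. They take rows and a column index rather than a data set and an attribute, and the sketch uses a supplied mean when one is given, as the description states.

```ruby
# Hypothetical helpers; same arithmetic as .variance and .standard_deviation.
def column_variance(rows, index, mean = nil)
  # Compute the mean only when the caller did not supply it.
  mean ||= rows.sum { |row| row[index] }.to_f / rows.length
  sum = 0.0
  rows.each { |row| sum += (row[index] - mean)**2 }
  sum / (rows.length - 1)
end

def column_std(rows, index, variance = nil)
  variance ||= column_variance(rows, index)
  Math.sqrt(variance)
end

rows = [[1], [2], [3], [4], [5]]

var = column_variance(rows, 0)          # mean is 3.0, squared deviations sum to 10.0
puts var                                # prints 2.5, which is 10.0 / 4
puts column_std(rows, 0, var).round(4)  # prints 1.5811, reusing var
```

Passing `var` to the second call skips a full pass over the rows, which is the saving the optional parameters are meant to provide.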