Module: Nimbus::LossFunctions
Defined in: lib/nimbus/loss_functions.rb
Overview
Math functions.
The LossFunctions module provides mathematical helper functions as class methods. Tree and Forest use them to compute predictions, errors and loss values on training and testing data.
Class Method Summary
- .average(ids, value_table) ⇒ Object
  Simple average: sum(x) / n.
- .class_sizes(ids, value_table, classes) ⇒ Object
  Array with the size of each class in the given list of individuals.
- .class_sizes_in_list(list, classes) ⇒ Object
  Array with the size of each class in the given list of classes.
- .gini_index(ids, value_table, classes) ⇒ Object
  Gini index of a list of classified individuals.
- .majority_class(ids, value_table, classes) ⇒ Object
  Majority class of a list of classified individuals.
- .majority_class_in_list(list, classes) ⇒ Object
  Majority class of a list of classes.
- .mean_squared_error(ids, value_table, mean = nil) ⇒ Object
  Sum of squared errors: sum (x-y)^2. Despite the name, the result is not divided by n.
- .pseudo_huber_error(ids, value_table, mean = nil) ⇒ Object
  Simplified Huber function: sum log(cosh(x-y)).
- .pseudo_huber_loss(ids, value_table, mean = nil) ⇒ Object
  Simplified Huber loss function: PHE / n.
- .quadratic_loss(ids, value_table, mean = nil) ⇒ Object
  Quadratic loss: mean of the squared errors, sum (x-y)^2 / n.
- .squared_difference(x, y) ⇒ Object
  Difference between two values, squared.
Class Method Details
.average(ids, value_table) ⇒ Object
Simple average: sum(x) / n, where x runs over the values of the given ids and n is the number of ids.
# File 'lib/nimbus/loss_functions.rb', line 17
def average(ids, value_table)
  ids.inject(0.0){|sum, i| sum + value_table[i]} / ids.size
end
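A minimal usage sketch of the averaging step. `LossSketch` is a hypothetical stand-in that copies the method body listed above so the snippet runs without the gem, and the Hash-based `value_table` (individual id to value) is an assumption about the data layout; the method only requires that `value_table[i]` returns a number.

```ruby
# Hypothetical stand-in for the documented method (not the gem itself).
module LossSketch
  def self.average(ids, value_table)
    ids.inject(0.0) { |sum, i| sum + value_table[i] } / ids.size
  end
end

# Assumed layout: individual id => numeric value.
value_table = { 1 => 2.0, 2 => 4.0, 3 => 9.0, 4 => 100.0 }

# Only the listed ids contribute, so id 4 is ignored here.
subset_mean = LossSketch.average([1, 2, 3], value_table)
puts subset_mean # 5.0

# An empty id list divides 0.0 by 0, which yields NaN rather than raising.
puts LossSketch.average([], value_table).nan? # true
```

Callers that may pass an empty node would probably want to guard against the NaN case before using the result.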
.class_sizes(ids, value_table, classes) ⇒ Object
Array with the list of sizes of each class in the given list of individuals.
# File 'lib/nimbus/loss_functions.rb', line 81
def class_sizes(ids, value_table, classes)
  classes.map{|c| ids.count{|i| value_table[i] == c}}
end
.class_sizes_in_list(list, classes) ⇒ Object
Array with the list of sizes of each class in the given list of classes.
# File 'lib/nimbus/loss_functions.rb', line 86
def class_sizes_in_list(list, classes)
  classes.map{|c| list.count{|i| i == c}}
end
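The two counting helpers differ only in their input: `class_sizes` looks classes up by individual id, while `class_sizes_in_list` counts a list that already holds class labels. The sketch below illustrates that relationship. `LossSketch` is a hypothetical stand-in copying the listed method bodies, and the string labels and Hash-based `value_table` are assumptions made for the example.

```ruby
# Hypothetical stand-in for the documented methods (not the gem itself).
module LossSketch
  def self.class_sizes(ids, value_table, classes)
    classes.map { |c| ids.count { |i| value_table[i] == c } }
  end

  def self.class_sizes_in_list(list, classes)
    classes.map { |c| list.count { |i| i == c } }
  end
end

classes     = ['0', '1']
value_table = { 1 => '0', 2 => '1', 3 => '1', 4 => '0', 5 => '1' }
ids         = [1, 2, 3, 5]

# Counts follow the order of `classes`: one '0' and three '1'.
by_id = LossSketch.class_sizes(ids, value_table, classes)
p by_id # [1, 3]

# Resolving the ids to labels first gives the same counts.
labels  = ids.map { |i| value_table[i] }
by_list = LossSketch.class_sizes_in_list(labels, classes)
p by_list # [1, 3]
```

A class that does not occur in the input still gets an entry, with size 0, because the result is built by mapping over `classes`.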
.gini_index(ids, value_table, classes) ⇒ Object
Gini index of a list of classified individuals.
If a dataset T contains examples from n classes, then gini(T) = 1 - Sum (Pj)^2, where Pj is the relative frequency of class j in T. The result is rounded to 5 decimal places.
# File 'lib/nimbus/loss_functions.rb', line 57
def gini_index(ids, value_table, classes)
  total_size = ids.size.to_f
  gini = 1 - class_sizes(ids, value_table, classes).inject(0.0){|sum, size| sum + (size/total_size)**2}
  gini.round(5)
end
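A small worked example of the Gini formula. `LossSketch` is a hypothetical stand-in copying the listed method bodies, and the class labels and Hash-based `value_table` are assumed for illustration. With one individual of class '0' and three of class '1', the relative frequencies are 0.25 and 0.75, so gini = 1 - (0.0625 + 0.5625) = 0.375.

```ruby
# Hypothetical stand-in for the documented methods (not the gem itself).
module LossSketch
  def self.class_sizes(ids, value_table, classes)
    classes.map { |c| ids.count { |i| value_table[i] == c } }
  end

  def self.gini_index(ids, value_table, classes)
    total_size = ids.size.to_f
    gini = 1 - class_sizes(ids, value_table, classes)
               .inject(0.0) { |sum, size| sum + (size / total_size)**2 }
    gini.round(5)
  end
end

classes     = ['0', '1']
value_table = { 1 => '0', 2 => '1', 3 => '1', 4 => '1', 5 => '0', 6 => '0' }

# Mixed node: one '0' and three '1'.
puts LossSketch.gini_index([1, 2, 3, 4], value_table, classes) # 0.375

# Pure node: every individual is '1', so impurity is zero.
puts LossSketch.gini_index([2, 3, 4], value_table, classes) # 0.0

# Evenly split node: the maximum for two classes.
puts LossSketch.gini_index([1, 2, 3, 5], value_table, classes) # 0.5
```

Lower values mean a purer node, which is why a split that reduces the index is preferable when growing a classification tree.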
.majority_class(ids, value_table, classes) ⇒ Object
Majority class of a list of classified individuals. If more than one class has the same number of individuals, one of the majority classes is selected randomly.
# File 'lib/nimbus/loss_functions.rb', line 67
def majority_class(ids, value_table, classes)
  sizes = class_sizes(ids, value_table, classes)
  Hash[classes.zip sizes].keep_if{|k,v| v == sizes.max}.keys.sample
end
.majority_class_in_list(list, classes) ⇒ Object
Majority class of a list of classes. If more than one class has the same number of individuals, one of the majority classes is selected randomly.
# File 'lib/nimbus/loss_functions.rb', line 75
def majority_class_in_list(list, classes)
  sizes = classes.map{|c| list.count{|i| i == c}}
  Hash[classes.zip sizes].keep_if{|k,v| v == sizes.max}.keys.sample
end
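The random tie-breaking described for both majority methods can be illustrated as follows. `LossSketch` is a hypothetical stand-in copying the listed method bodies; the three string labels and the Hash-based `value_table` are assumptions for the example. Because ties are resolved with `Array#sample`, the tied result varies between calls.

```ruby
# Hypothetical stand-in for the documented methods (not the gem itself).
module LossSketch
  def self.class_sizes(ids, value_table, classes)
    classes.map { |c| ids.count { |i| value_table[i] == c } }
  end

  def self.majority_class(ids, value_table, classes)
    sizes = class_sizes(ids, value_table, classes)
    # Keep only the classes whose size equals the maximum, then pick one.
    Hash[classes.zip(sizes)].keep_if { |_k, v| v == sizes.max }.keys.sample
  end

  def self.majority_class_in_list(list, classes)
    sizes = classes.map { |c| list.count { |i| i == c } }
    Hash[classes.zip(sizes)].keep_if { |_k, v| v == sizes.max }.keys.sample
  end
end

classes     = ['0', '1', '2']
value_table = { 1 => '0', 2 => '1', 3 => '1', 4 => '2', 5 => '2' }

# Clear winner: '1' appears twice, '0' once.
puts LossSketch.majority_class([1, 2, 3], value_table, classes) # 1

# Tie between '1' and '2': either may be returned on a given call.
tied = LossSketch.majority_class([1, 2, 3, 4, 5], value_table, classes)
puts ['1', '2'].include?(tied) # true

# Same logic on a plain list of labels.
puts LossSketch.majority_class_in_list(%w[0 0 1], classes) # 0
```

Code that needs reproducible predictions would have to control the random seed, since the tie-break is not deterministic.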
.mean_squared_error(ids, value_table, mean = nil) ⇒ Object
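Sum of squared errors: sum (x-y)^2, where y is the given mean, or the average of the listed values when no mean is passed. Despite the name, the result is not divided by n; see quadratic_loss for the averaged version.
# File 'lib/nimbus/loss_functions.rb', line 22
def mean_squared_error(ids, value_table, mean = nil)
  mean ||= self.average ids, value_table
  ids.inject(0.0){|sum, i| sum + ((value_table[i] - mean)**2) }
end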
.pseudo_huber_error(ids, value_table, mean = nil) ⇒ Object
Simplified Huber function: sum log(cosh(x-y)), where y is the given mean, or the average of the listed values when no mean is passed.
# File 'lib/nimbus/loss_functions.rb', line 40
def pseudo_huber_error(ids, value_table, mean = nil)
  mean ||= self.average ids, value_table
  ids.inject(0.0){|sum, i| sum + (Math.log(Math.cosh(value_table[i] - mean))) }
end
.pseudo_huber_loss(ids, value_table, mean = nil) ⇒ Object
Simplified Huber loss function: PHE / n
# File 'lib/nimbus/loss_functions.rb', line 46
def pseudo_huber_loss(ids, value_table, mean = nil)
  self.pseudo_huber_error(ids, value_table, mean) / ids.size
end
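The sketch below illustrates why a log-cosh error is less sensitive to outliers than a squared error: for small residuals r, log(cosh(r)) is close to r^2 / 2, while for large residuals it grows roughly like |r| - log(2). `LossSketch` is a hypothetical stand-in copying the listed method bodies, the data are made up, and the squared-error figure is computed inline only for comparison.

```ruby
# Hypothetical stand-in for the documented methods (not the gem itself).
module LossSketch
  def self.average(ids, value_table)
    ids.inject(0.0) { |sum, i| sum + value_table[i] } / ids.size
  end

  def self.pseudo_huber_error(ids, value_table, mean = nil)
    mean ||= average(ids, value_table)
    ids.inject(0.0) { |sum, i| sum + Math.log(Math.cosh(value_table[i] - mean)) }
  end

  def self.pseudo_huber_loss(ids, value_table, mean = nil)
    pseudo_huber_error(ids, value_table, mean) / ids.size
  end
end

# Small residual: behaves like half the squared residual.
small = Math.log(Math.cosh(0.01))
puts (small - 0.01**2 / 2).abs < 1e-8 # true

# Large residual: behaves like |r| - log(2), i.e. linear growth.
large = Math.log(Math.cosh(10.0))
puts (large - (10.0 - Math.log(2))).abs < 1e-6 # true

# Made-up data with one outlier (id 4); the mean is 2.5.
value_table = { 1 => 0.0, 2 => 0.0, 3 => 0.0, 4 => 10.0 }
ids = [1, 2, 3, 4]

huber = LossSketch.pseudo_huber_loss(ids, value_table)
mean  = LossSketch.average(ids, value_table)
# Averaged squared error, computed inline for comparison only.
quadratic = ids.sum { |i| (value_table[i] - mean)**2 } / ids.size

puts huber < quadratic # true
```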
.quadratic_loss(ids, value_table, mean = nil) ⇒ Object
Quadratic loss: the result of mean_squared_error divided by the number of individuals, sum (x-y)^2 / n.
Default loss function for regression forests.
# File 'lib/nimbus/loss_functions.rb', line 30
def quadratic_loss(ids, value_table, mean = nil)
  self.mean_squared_error(ids, value_table, mean) / ids.size
end
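A short sketch of how `quadratic_loss` relates to `mean_squared_error`, and of the effect of the optional `mean` argument. `LossSketch` is a hypothetical stand-in copying the listed method bodies; the values and the Hash-based `value_table` are assumptions for the example.

```ruby
# Hypothetical stand-in for the documented methods (not the gem itself).
module LossSketch
  def self.average(ids, value_table)
    ids.inject(0.0) { |sum, i| sum + value_table[i] } / ids.size
  end

  def self.mean_squared_error(ids, value_table, mean = nil)
    mean ||= average(ids, value_table)
    ids.inject(0.0) { |sum, i| sum + ((value_table[i] - mean)**2) }
  end

  def self.quadratic_loss(ids, value_table, mean = nil)
    mean_squared_error(ids, value_table, mean) / ids.size
  end
end

value_table = { 1 => 2.0, 2 => 4.0, 3 => 6.0, 4 => 8.0 }
ids = [1, 2, 3, 4]

# Without a mean, the average of the listed values (5.0) is used:
# squared residuals 9 + 1 + 1 + 9 = 20, averaged over 4 individuals.
puts LossSketch.mean_squared_error(ids, value_table) # 20.0
puts LossSketch.quadratic_loss(ids, value_table)     # 5.0

# Passing a mean scores the same individuals against an external prediction.
puts LossSketch.quadratic_loss(ids, value_table, 4.0) # 6.0
```

The explicit `mean` is useful when the prediction comes from somewhere other than the individuals being scored, for example when evaluating testing data against a value learned on training data.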
.squared_difference(x, y) ⇒ Object
Difference between two values, squared. (x-y)^2
# File 'lib/nimbus/loss_functions.rb', line 35
def squared_difference(x,y)
  0.0 + (x-y)**2
end
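The leading `0.0 +` in the listing appears to be there to coerce the result to a Float even when both arguments are Integers. A minimal sketch, with `LossSketch` as a hypothetical stand-in copying the listed method body:

```ruby
# Hypothetical stand-in for the documented method (not the gem itself).
module LossSketch
  def self.squared_difference(x, y)
    0.0 + (x - y)**2
  end
end

d = LossSketch.squared_difference(3, 1)
puts d       # 4.0
puts d.class # Float

# The result does not depend on argument order.
puts LossSketch.squared_difference(1, 3) == d # true
```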