Class: Nimbus::Forest
- Inherits: Object (Object → Nimbus::Forest)
- Defined in: lib/nimbus/forest.rb
Overview
Forest represents the random forest that the application object either generates from a training set or uses to test samples.
Instance Attribute Summary
- #bag ⇒ Object (writeonly)
  Sets the attribute bag.
- #options ⇒ Object
  Returns the value of attribute options.
- #predictions ⇒ Object
  Returns the value of attribute predictions.
- #size ⇒ Object
  Returns the value of attribute size.
- #snp_importances ⇒ Object
  Returns the value of attribute snp_importances.
- #tree_errors ⇒ Object
  Returns the value of attribute tree_errors.
- #trees ⇒ Object
  Returns the value of attribute trees.
Instance Method Summary
- #classification? ⇒ Boolean
- #grow ⇒ Object
  Creates a random forest based on the TrainingSet included in the configuration, creating N random trees (size N defined in the configuration).
- #initialize(config) ⇒ Forest (constructor)
  Initialize Forest object with options included in the Nimbus::Configuration object received.
- #regression? ⇒ Boolean
- #to_yaml ⇒ Object
  Returns the array containing every tree in the forest, in YAML format.
- #traverse ⇒ Object
  Traverse a testing set through every tree of the forest.
- #traverse_classification_forest ⇒ Object
  Traverse a testing set through every classification tree of the forest and get majority class predictions for every individual in the sample.
- #traverse_regression_forest ⇒ Object
  Traverse a testing set through every regression tree of the forest and get averaged predictions for every individual in the sample.
Constructor Details
#initialize(config) ⇒ Forest
Initialize Forest object with options included in the Nimbus::Configuration object received.
# File 'lib/nimbus/forest.rb', line 12
def initialize(config)
  @trees = []
  @tree_errors = []
  @options = config
  @size = config.forest_size
  @predictions = {}
  @times_predicted = []
  @snp_importances = {}
  @tree_snp_importances = []
  raise Nimbus::ForestError, "Forest size parameter (#{@size}) is invalid. You need at least one tree." if @size < 1
end
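The size guard in the constructor can be tried in isolation. What follows is a minimal stand-in, not the Nimbus source: `MiniForest`, `MiniForestError` and `FakeConfig` are hypothetical names, and the only thing assumed of the configuration object is the `forest_size` reader used in the listing.

```ruby
# Hypothetical stand-ins for Nimbus::ForestError and Nimbus::Configuration.
class MiniForestError < StandardError; end

FakeConfig = Struct.new(:forest_size)

# Reduced copy of the constructor logic: keep the size, reject empty forests.
class MiniForest
  attr_reader :size, :trees

  def initialize(config)
    @trees = []
    @size = config.forest_size
    # Same guard as the real constructor: a forest needs at least one tree.
    if @size < 1
      raise MiniForestError,
            "Forest size parameter (#{@size}) is invalid. You need at least one tree."
    end
  end
end

puts MiniForest.new(FakeConfig.new(3)).size

begin
  MiniForest.new(FakeConfig.new(0))
rescue MiniForestError => e
  puts "rejected: #{e.message}"
end
```

Raising inside the constructor means an invalid size is reported before any tree is grown.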
Instance Attribute Details
#bag=(value) ⇒ Object
Sets the attribute bag.
# File 'lib/nimbus/forest.rb', line 8
def bag=(value)
  @bag = value
end
#options ⇒ Object
Returns the value of attribute options.
# File 'lib/nimbus/forest.rb', line 9
def options
  @options
end
#predictions ⇒ Object
Returns the value of attribute predictions.
# File 'lib/nimbus/forest.rb', line 8
def predictions
  @predictions
end
#size ⇒ Object
Returns the value of attribute size.
# File 'lib/nimbus/forest.rb', line 8
def size
  @size
end
#snp_importances ⇒ Object
Returns the value of attribute snp_importances.
# File 'lib/nimbus/forest.rb', line 8
def snp_importances
  @snp_importances
end
#tree_errors ⇒ Object
Returns the value of attribute tree_errors.
# File 'lib/nimbus/forest.rb', line 8
def tree_errors
  @tree_errors
end
#trees ⇒ Object
Returns the value of attribute trees.
# File 'lib/nimbus/forest.rb', line 8
def trees
  @trees
end
Instance Method Details
#classification? ⇒ Boolean
# File 'lib/nimbus/forest.rb', line 91
def classification?
  @options.tree[:classes]
end
#grow ⇒ Object
Creates a random forest based on the TrainingSet included in the configuration, creating N random trees (size N defined in the configuration).
This is the method called when the application’s configuration flags training on.
It performs these tasks:
- grow the forest (all N random trees)
- store the generalization error of every tree
- obtain averaged importances for all the SNPs
- calculate averaged predictions for all individuals in the training sample
Every tree of the forest is created with a different random sample of the individuals in the training set.
# File 'lib/nimbus/forest.rb', line 36
def grow
  @size.times do |i|
    Nimbus.write("\rCreating trees: #{i+1}/#{@size} ")
    tree_individuals_bag = individuals_random_sample
    tree_out_of_bag = oob tree_individuals_bag
    tree_class = (classification? ? ClassificationTree : RegressionTree)
    tree = tree_class.new @options.tree
    @trees << tree.seed(@options.training_set.individuals, tree_individuals_bag, @options.training_set.ids_fenotypes)
    @tree_errors << tree.generalization_error_from_oob(tree_out_of_bag)
    @tree_snp_importances << tree.estimate_importances(tree_out_of_bag) if @options.do_importances
    acumulate_predictions tree.predictions
    Nimbus.clear_line!
  end
  average_snp_importances if @options.do_importances
  totalize_predictions
end
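grow calls two helpers that are not listed on this page, individuals_random_sample and oob. As a rough illustration of what such helpers typically do in a bagging scheme, the sketch below assumes the bag is drawn with replacement and the out-of-bag set is every id that was never drawn. The names `bootstrap_bag` and `out_of_bag` and the fixed seed are illustrative only and may differ from the Nimbus implementation.

```ruby
# Draw a bootstrap sample: as many ids as the training set holds,
# picked with replacement, so some ids repeat and others are left out.
def bootstrap_bag(ids, rng)
  Array.new(ids.size) { ids.sample(random: rng) }.sort
end

# Out-of-bag individuals: the ids that never made it into the bag.
def out_of_bag(ids, bag)
  ids - bag
end

ids = (1..10).to_a
rng = Random.new(42) # fixed seed only to make the run repeatable
bag = bootstrap_bag(ids, rng)

puts "bag: #{bag.inspect}"
puts "oob: #{out_of_bag(ids, bag).inspect}"
```

Because the out-of-bag individuals were not used to build the tree, they can serve as a held-out sample for that tree's generalization error and importance estimates.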
#regression? ⇒ Boolean
# File 'lib/nimbus/forest.rb', line 95
def regression?
  @options.tree[:classes].nil?
end
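Both predicates depend only on whether the tree options carry a :classes entry. The sketch below uses plain hashes in place of @options.tree; the `:min_leaf` key and the method names `classification_mode?` and `regression_mode?` are invented for illustration. As listed, classification? returns the :classes value itself (truthy or nil) rather than a strict true or false; the sketch coerces it with `!!`.

```ruby
# Plain hashes standing in for @options.tree (the :min_leaf key is invented).
CLASSIFICATION_OPTS = { classes: %w[0 1], min_leaf: 5 }.freeze
REGRESSION_OPTS     = { classes: nil, min_leaf: 5 }.freeze

# Classification when a list of classes is configured.
def classification_mode?(tree_options)
  !!tree_options[:classes]
end

# Regression when no classes are configured.
def regression_mode?(tree_options)
  tree_options[:classes].nil?
end

puts classification_mode?(CLASSIFICATION_OPTS) # => true
puts regression_mode?(CLASSIFICATION_OPTS)     # => false
puts regression_mode?(REGRESSION_OPTS)         # => true
```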
#to_yaml ⇒ Object
Returns the array containing every tree in the forest, in YAML format.
# File 'lib/nimbus/forest.rb', line 87
def to_yaml
  @trees.to_yaml
end
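Since to_yaml delegates to Array#to_yaml, the result is one YAML sequence with an entry per tree. The round trip below is a sketch that assumes each tree can be represented as a nested hash; the node layout and the SNP keys shown are invented, and `dump_trees` and `load_trees` are hypothetical helper names.

```ruby
require 'yaml'

# Toy trees as nested hashes: a split node maps a SNP name to the
# subtrees for genotype values 0, 1 and 2; leaves are plain numbers.
TOY_TREES = [
  { 'SNP_3' => [0.5, 1.25, { 'SNP_7' => [2.0, 2.5, 3.0] }] },
  { 'SNP_1' => [0.75, 1.0, 1.5] }
].freeze

# Serialize the whole array of trees, as Forest#to_yaml does with @trees.
def dump_trees(trees)
  trees.to_yaml
end

# Read the trees back; safe_load suffices for hashes, arrays and numbers.
def load_trees(yaml_text)
  YAML.safe_load(yaml_text)
end

yaml_text = dump_trees(TOY_TREES)
puts yaml_text
puts load_trees(yaml_text) == TOY_TREES
```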
#traverse ⇒ Object
Traverse a testing set through every tree of the forest.
This is the method called when the application’s configuration flags testing on.
# File 'lib/nimbus/forest.rb', line 56
def traverse
  classification? ? traverse_classification_forest : traverse_regression_forest
end
#traverse_classification_forest ⇒ Object
Traverse a testing set through every classification tree of the forest and record, for every individual in the sample, the share of trees that voted for each class; the majority class is the one with the largest share.
# File 'lib/nimbus/forest.rb', line 74
def traverse_classification_forest
  @predictions = {}
  @options.read_testing_data{|individual|
    individual_prediction = []
    trees.each do |t|
      individual_prediction << Nimbus::Tree.traverse(t, individual.snp_list)
    end
    class_sizes = Nimbus::LossFunctions.class_sizes_in_list(individual_prediction, @options.tree[:classes]).map{|p| (p/individual_prediction.size.to_f).round(3)}
    @predictions[individual.id] = Hash[@options.tree[:classes].zip class_sizes].map{|k,v| "'#{k}': #{v}"}.join(' , ')
  }
end
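The vote aggregation for a single individual can be sketched without the rest of the library. Here `class_counts` is a hypothetical stand-in for Nimbus::LossFunctions.class_sizes_in_list, whose source is not shown on this page; it is assumed to return one count per class, in the order of the class list. `vote_shares`, `format_shares` and `majority_class` are likewise illustrative names.

```ruby
# Count how many trees voted for each class, in the order given by `classes`.
# Stand-in for Nimbus::LossFunctions.class_sizes_in_list (assumed behaviour).
def class_counts(votes, classes)
  classes.map { |klass| votes.count(klass) }
end

# Turn the votes of all trees for one individual into per-class shares.
def vote_shares(votes, classes)
  shares = class_counts(votes, classes).map { |n| (n / votes.size.to_f).round(3) }
  classes.zip(shares).to_h
end

# Same string layout as the values stored in @predictions.
def format_shares(shares)
  shares.map { |klass, share| "'#{klass}': #{share}" }.join(' , ')
end

# The majority class is the one with the largest share of votes.
def majority_class(shares)
  shares.max_by { |_klass, share| share }.first
end

votes  = %w[1 0 1 1] # one vote per tree
shares = vote_shares(votes, %w[0 1])
puts format_shares(shares)  # prints '0': 0.25 , '1': 0.75
puts majority_class(shares) # prints 1
```

Storing the shares rather than only the winning class keeps the confidence of each prediction visible in the output.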
#traverse_regression_forest ⇒ Object
Traverse a testing set through every regression tree of the forest and get averaged predictions for every individual in the sample.
# File 'lib/nimbus/forest.rb', line 61
def traverse_regression_forest
  @predictions = {}
  prediction_count = trees.size
  @options.read_testing_data{|individual|
    individual_prediction = 0.0
    trees.each do |t|
      individual_prediction = (individual_prediction + Nimbus::Tree.traverse(t, individual.snp_list)).round(5)
    end
    @predictions[individual.id] = (individual_prediction / prediction_count).round(5)
  }
end
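The averaging step for one individual can be sketched as follows. Each tree is modelled as a lambda that maps a SNP list to a number, which is an assumption standing in for Nimbus::Tree.traverse(t, individual.snp_list); `average_prediction` is a hypothetical name. The running sum is rounded to 5 decimals at every step, mirroring the listing.

```ruby
# Sum the prediction of every tree for one individual, rounding the
# running total to 5 decimals, then divide by the number of trees.
def average_prediction(trees, snp_list)
  total = trees.reduce(0.0) { |sum, tree| (sum + tree.call(snp_list)).round(5) }
  (total / trees.size).round(5)
end

# Toy "trees": lambdas standing in for real tree traversal.
toy_trees = [
  ->(snps) { snps.sum * 0.5 },    # 1.5 for [1, 0, 2]
  ->(snps) { snps.first + 1.0 },  # 2.0 for [1, 0, 2]
  ->(_snps) { 2.0 }               # constant prediction
]

puts average_prediction(toy_trees, [1, 0, 2]) # prints 1.83333
```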