Class: Nimbus::Forest

Inherits:
Object
Defined in:
lib/nimbus/forest.rb

Overview

Forest represents the random forest that the application object either generates during training or uses to test samples.

Constructor Details

#initialize(config) ⇒ Forest

Initializes the Forest object with the options from the Nimbus::Configuration object it receives. Raises Nimbus::ForestError if the configured forest size is less than 1.



# File 'lib/nimbus/forest.rb', line 12

def initialize(config)
  @trees = []
  @tree_errors = []
  @options = config
  @size = config.forest_size
  @predictions = {}
  @times_predicted = []
  @snp_importances = {}
  @tree_snp_importances = []
  raise Nimbus::ForestError, "Forest size parameter (#{@size}) is invalid. You need at least one tree." if @size < 1
end
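
The size check in the constructor can be tried in isolation with the sketch below. SketchConfig, SketchForest and SketchForestError are hypothetical stand-ins introduced only for illustration; they are not the Nimbus classes, and only the guard clause is taken from the source above.

```ruby
# Hypothetical stand-ins for illustration; these are not the Nimbus classes.
SketchConfig = Struct.new(:forest_size)
class SketchForestError < StandardError; end

class SketchForest
  attr_reader :size

  def initialize(config)
    @size = config.forest_size
    # Guard clause mirroring the size validation in the constructor above.
    raise SketchForestError, "Forest size parameter (#{@size}) is invalid. You need at least one tree." if @size < 1
  end
end

# A valid size is accepted.
forest = SketchForest.new(SketchConfig.new(3))
puts forest.size

# A size below 1 is rejected with a descriptive error.
begin
  SketchForest.new(SketchConfig.new(0))
rescue SketchForestError => e
  puts e.message
end
```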

Instance Attribute Details

#bag=(value) ⇒ Object

Sets the attribute bag

Parameters:

  • value

    the value to set the attribute bag to.



# File 'lib/nimbus/forest.rb', line 8

def bag=(value)
  @bag = value
end

#options ⇒ Object

Returns the value of attribute options.



# File 'lib/nimbus/forest.rb', line 9

def options
  @options
end

#predictions ⇒ Object

Returns the value of attribute predictions.



# File 'lib/nimbus/forest.rb', line 8

def predictions
  @predictions
end

#size ⇒ Object

Returns the value of attribute size.



# File 'lib/nimbus/forest.rb', line 8

def size
  @size
end

#snp_importances ⇒ Object

Returns the value of attribute snp_importances.



# File 'lib/nimbus/forest.rb', line 8

def snp_importances
  @snp_importances
end

#tree_errors ⇒ Object

Returns the value of attribute tree_errors.



# File 'lib/nimbus/forest.rb', line 8

def tree_errors
  @tree_errors
end

#trees ⇒ Object

Returns the value of attribute trees.



# File 'lib/nimbus/forest.rb', line 8

def trees
  @trees
end

Instance Method Details

#grow ⇒ Object

Creates a random forest from the TrainingSet included in the configuration by growing N random trees, where N is the forest size defined in the configuration.

This method is called when training is enabled in the application’s configuration.

It performs these tasks:

  • grow the forest (all the N random trees)

  • store generalization errors for every tree

  • obtain averaged importances for all the SNPs

  • calculate averaged predictions for all individuals in the training sample

Every tree of the forest is created with a different random sample of the individuals in the training set.



# File 'lib/nimbus/forest.rb', line 36

def grow
  @size.times do |i|
    Nimbus.write("\rCreating trees: #{i+1}/#{@size} ")
    tree_individuals_bag = individuals_random_sample
    tree_out_of_bag = oob tree_individuals_bag
    tree_class = (classification? ? ClassificationTree : RegressionTree)
    tree = tree_class.new @options.tree
    @trees << tree.seed(@options.training_set.individuals, tree_individuals_bag, @options.training_set.ids_fenotypes)
    @tree_errors << tree.generalization_error_from_oob(tree_out_of_bag)
    @tree_snp_importances << tree.estimate_importances(tree_out_of_bag)
    acumulate_predictions tree.predictions
    Nimbus.clear_line!
  end
  average_snp_importances
  totalize_predictions
end
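
The per-tree sampling step can be sketched as follows. This is a minimal illustration that assumes each tree's bag is drawn with replacement (bootstrap sampling) and that the out-of-bag set is every individual missing from that bag. The private helpers individuals_random_sample and oob are not shown on this page, so bootstrap_sample and out_of_bag are hypothetical names and may differ from what Nimbus actually does.

```ruby
# Hypothetical helpers for illustration; the real individuals_random_sample
# and oob methods are not shown on this page.

# Assumption: the bag has as many entries as there are individuals and is
# drawn with replacement, so some ids repeat and others are left out.
def bootstrap_sample(ids, rng)
  Array.new(ids.size) { ids[rng.rand(ids.size)] }.sort
end

# Individuals that never made it into the bag; these are the ones a tree
# could use to estimate its generalization error.
def out_of_bag(ids, bag)
  ids - bag
end

rng = Random.new(42)   # fixed seed so repeated runs draw the same bag
ids = (1..10).to_a

bag = bootstrap_sample(ids, rng)
oob = out_of_bag(ids, bag)

puts "bag: #{bag.inspect}"
puts "oob: #{oob.inspect}"
```

Because the bag and the out-of-bag set are disjoint, errors and importances estimated on the out-of-bag individuals come from data the tree did not see while it was grown.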

#to_yaml ⇒ Object

Returns the array containing every tree in the forest, serialized to YAML.



# File 'lib/nimbus/forest.rb', line 86

def to_yaml
  @trees.to_yaml
end

#traverse ⇒ Object

Traverses a testing set through every tree of the forest, dispatching to the classification or regression variant as appropriate.

This method is called when testing is enabled in the application’s configuration.



# File 'lib/nimbus/forest.rb', line 56

def traverse
  classification? ? traverse_classification_forest : traverse_regression_forest
end

#traverse_classification_forest ⇒ Object

Traverse a testing set through every classification tree of the forest and get majority class predictions for every individual in the sample.



# File 'lib/nimbus/forest.rb', line 74

def traverse_classification_forest
  @predictions = {}
  @options.read_testing_data{|individual|
    individual_prediction = []
    trees.each do |t|
      individual_prediction << Nimbus::Tree.traverse(t, individual.snp_list)
    end
    @predictions[individual.id] = Nimbus::LossFunctions.majority_class_in_list(individual_prediction, @options.tree[:classes])
  }
end
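
The voting step can be illustrated with a small stand-alone sketch. The helper majority_class below is hypothetical and only mimics the call shape of Nimbus::LossFunctions.majority_class_in_list (a list of per-tree votes plus the list of known classes); the real implementation, including how it resolves ties, is not shown on this page.

```ruby
# Hypothetical helper for illustration; not the Nimbus implementation.
# votes   - one predicted class per tree
# classes - every class the forest can predict
def majority_class(votes, classes)
  # Pick the class that appears most often among the votes.
  # How ties should be broken is left open here, since the source does not say.
  classes.max_by { |klass| votes.count(klass) }
end

tree_votes = ["case", "control", "case"]
puts majority_class(tree_votes, ["case", "control"])
```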

#traverse_regression_forest ⇒ Object

Traverse a testing set through every regression tree of the forest and get averaged predictions for every individual in the sample.



# File 'lib/nimbus/forest.rb', line 61

def traverse_regression_forest
  @predictions = {}
  prediction_count = trees.size
  @options.read_testing_data{|individual|
    individual_prediction = 0.0
    trees.each do |t|
      individual_prediction = (individual_prediction + Nimbus::Tree.traverse(t, individual.snp_list)).round(5)
    end
    @predictions[individual.id] = (individual_prediction / prediction_count).round(5)
  }
end
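
The averaging for a single individual can be reproduced outside the forest with the sketch below. The helper averaged_prediction is a hypothetical name; it takes the per-tree outputs as a plain array instead of calling Nimbus::Tree.traverse, and it follows the same pattern as the method above: round the running sum to 5 decimals after each tree, then divide by the number of trees and round again.

```ruby
# Hypothetical helper for illustration; tree_outputs stands in for the values
# that each tree would return for one individual.
def averaged_prediction(tree_outputs)
  # Accumulate the per-tree outputs, rounding the running sum to 5 decimals.
  total = tree_outputs.reduce(0.0) { |sum, value| (sum + value).round(5) }
  # Average over the number of trees and round the result as well.
  (total / tree_outputs.size).round(5)
end

puts averaged_prediction([1.0, 2.0, 4.0])
```

The sketch assumes at least one tree output; with an empty array the division would not yield a usable prediction.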