Class: Nimbus::Forest

Inherits:
Object
Defined in:
lib/nimbus/forest.rb

Overview

Forest represents the random forest that the application object either grows from a training set or uses to predict the samples of a testing set.

Constructor Details

#initialize(config) ⇒ Forest

Initializes the Forest with the options of the Nimbus::Configuration object received. Raises Nimbus::ForestError if the configured forest size is less than 1.


# File 'lib/nimbus/forest.rb', line 12

def initialize(config)
  @trees = []
  @tree_errors = []
  @options = config
  @size = config.forest_size
  @predictions = {}
  @times_predicted = []
  @snp_importances = {}
  @tree_snp_importances = []
  raise Nimbus::ForestError, "Forest size parameter (#{@size}) is invalid. You need at least one tree." if @size < 1
end
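
The size check in the constructor can be tried in isolation. The following is a minimal sketch, not the library's code: `ForestError`, `SketchConfig` and `checked_forest_size` are hypothetical stand-ins so the example runs without the nimbus gem, and only the validation rule and message are taken from the listing above.

```ruby
# Hypothetical stand-ins (assumptions) so the sketch runs without the nimbus gem.
class ForestError < StandardError; end
SketchConfig = Struct.new(:forest_size)

# Mirrors only the size validation performed by Forest#initialize.
def checked_forest_size(config)
  size = config.forest_size
  raise ForestError, "Forest size parameter (#{size}) is invalid. You need at least one tree." if size < 1
  size
end

puts checked_forest_size(SketchConfig.new(300))

begin
  checked_forest_size(SketchConfig.new(0))
rescue ForestError => e
  puts e.message
end
```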

Instance Attribute Details

#bag=(value) ⇒ Object

Sets the attribute bag

Parameters:

  • value

    the value to set the attribute bag to.


# File 'lib/nimbus/forest.rb', line 8

def bag=(value)
  @bag = value
end

#options ⇒ Object

Returns the value of attribute options


# File 'lib/nimbus/forest.rb', line 9

def options
  @options
end

#predictions ⇒ Object

Returns the value of attribute predictions


# File 'lib/nimbus/forest.rb', line 8

def predictions
  @predictions
end

#size ⇒ Object

Returns the value of attribute size


# File 'lib/nimbus/forest.rb', line 8

def size
  @size
end

#snp_importances ⇒ Object

Returns the value of attribute snp_importances


# File 'lib/nimbus/forest.rb', line 8

def snp_importances
  @snp_importances
end

#tree_errors ⇒ Object

Returns the value of attribute tree_errors


# File 'lib/nimbus/forest.rb', line 8

def tree_errors
  @tree_errors
end

#trees ⇒ Object

Returns the value of attribute trees


# File 'lib/nimbus/forest.rb', line 8

def trees
  @trees
end

Instance Method Details

#classification? ⇒ Boolean

Returns:

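  • (Boolean) truthy when the tree options define :classes (the method returns the :classes value itself, so nil means a regression forest)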

# File 'lib/nimbus/forest.rb', line 91

def classification?
  @options.tree[:classes]
end

#grow ⇒ Object

Grows a random forest of N random trees from the TrainingSet included in the configuration, where N is the forest size defined in the configuration.

The application calls this method when its configuration enables training.

It performs these tasks:

  • grow the forest (all the N random trees)

  • store generalization errors for every tree

  • obtain averaged importances for all the SNPs (only when the configuration sets do_importances)

  • calculate averaged predictions for all individuals in the training sample

Every tree of the forest is created with a different random sample of the individuals in the training set.


# File 'lib/nimbus/forest.rb', line 36

def grow
  @size.times do |i|
    Nimbus.write("\rCreating trees: #{i+1}/#{@size} ")
    tree_individuals_bag = individuals_random_sample
    tree_out_of_bag = oob tree_individuals_bag
    tree_class = (classification? ? ClassificationTree : RegressionTree)
    tree = tree_class.new @options.tree
    @trees << tree.seed(@options.training_set.individuals, tree_individuals_bag, @options.training_set.ids_fenotypes)
    @tree_errors << tree.generalization_error_from_oob(tree_out_of_bag)
    @tree_snp_importances << tree.estimate_importances(tree_out_of_bag) if @options.do_importances
    acumulate_predictions tree.predictions
    Nimbus.clear_line!
  end
  average_snp_importances if @options.do_importances
  totalize_predictions
end
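
The per-tree sampling step in the loop above can be illustrated on its own. This is a sketch under stated assumptions: the private methods `individuals_random_sample` and `oob` are not shown on this page, so `bootstrap_sample` and `out_of_bag` below are hypothetical helpers that assume the usual random-forest scheme, a bag drawn with replacement and an out-of-bag set made of the individuals that were never drawn.

```ruby
# Hypothetical helpers (assumptions); the real individuals_random_sample and oob
# are private methods of Forest whose source is not shown on this page.
def bootstrap_sample(ids, rng: Random.new)
  # Draw ids.size ids with replacement, sorted for readability.
  Array.new(ids.size) { ids[rng.rand(ids.size)] }.sort
end

def out_of_bag(ids, bag)
  # Individuals that were never drawn into the bag.
  ids - bag
end

ids = (1..10).to_a
bag = bootstrap_sample(ids, rng: Random.new(42))
oob = out_of_bag(ids, bag)
puts "bag: #{bag.inspect}"
puts "oob: #{oob.inspect}"
```

Under this assumption, each tree is trained on its bag and its generalization error is estimated on the individuals left out of it, which is why the loop passes `tree_out_of_bag` to `generalization_error_from_oob`.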

#regression? ⇒ Boolean

Returns:

  • (Boolean) true when the tree options define no :classes

# File 'lib/nimbus/forest.rb', line 95

def regression?
  @options.tree[:classes].nil?
end
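
Both predicates depend only on the `:classes` key of the tree options. The sketch below assumes a plain hash in place of `@options.tree`. The helper names are hypothetical, and the classification variant is coerced to a strict boolean, which the listed `classification?` does not do.

```ruby
# Hypothetical tree options (assumption): only the :classes key matters here.
classification_options = { classes: ['0', '1'] }
regression_options = { classes: nil }

# Same tests as classification? and regression?, coerced to strict booleans.
def classification_options?(tree_options)
  !tree_options[:classes].nil?
end

def regression_options?(tree_options)
  tree_options[:classes].nil?
end

puts classification_options?(classification_options) # true
puts regression_options?(regression_options)         # true
```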

#to_yaml ⇒ Object

Returns the array containing every tree in the forest, serialized to YAML format.


# File 'lib/nimbus/forest.rb', line 87

def to_yaml
  @trees.to_yaml
end
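
Since `to_yaml` simply delegates to `Array#to_yaml`, the round trip can be sketched with Ruby's standard yaml library. The tree structures below are hypothetical placeholders; the real trees are whatever `seed` returns, which this page does not show.

```ruby
require 'yaml'

# Hypothetical tree structures (assumption), standing in for the forest's @trees.
trees = [{ 'snp1' => [0.5, 1.2] }, { 'snp7' => [0.1, 0.9] }]

yaml = trees.to_yaml          # same call that Forest#to_yaml makes on @trees
puts yaml

restored = YAML.load(yaml)    # plain hashes, arrays, strings and floats load back unchanged
puts restored == trees
```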

#traverse ⇒ Object

Traverse a testing set through every tree of the forest.

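The application calls this method when its configuration enables testing. It dispatches to traverse_classification_forest or traverse_regression_forest depending on classification?.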


# File 'lib/nimbus/forest.rb', line 56

def traverse
  classification? ? traverse_classification_forest : traverse_regression_forest
end

#traverse_classification_forest ⇒ Object

Traverse a testing set through every classification tree of the forest and get majority class predictions for every individual in the sample.


# File 'lib/nimbus/forest.rb', line 74

def traverse_classification_forest
  @predictions = {}
  @options.read_testing_data{|individual|
    individual_prediction = []
    trees.each do |t|
      individual_prediction << Nimbus::Tree.traverse(t, individual.snp_list)
    end
    class_sizes = Nimbus::LossFunctions.class_sizes_in_list(individual_prediction, @options.tree[:classes]).map{|p| (p/individual_prediction.size.to_f).round(3)}
    @predictions[individual.id] = Hash[@options.tree[:classes].zip class_sizes].map{|k,v| "'#{k}': #{v}"}.join(' , ')
  }
end
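
The vote summary built for each individual can be reproduced without the library. This sketch assumes that `Nimbus::LossFunctions.class_sizes_in_list` counts how many predictions fall in each class, in the order of the configured classes. `class_vote_fractions` is a hypothetical helper that combines that count with the division and rounding from the listing above.

```ruby
# Hypothetical helper (assumption): fraction of trees voting for each class,
# in the same order as `classes`, rounded to 3 decimals as in the method above.
def class_vote_fractions(votes, classes)
  classes.map { |c| (votes.count(c) / votes.size.to_f).round(3) }
end

classes = ['0', '1']
votes = ['1', '0', '1', '1'] # one predicted class per tree

fractions = class_vote_fractions(votes, classes)
summary = Hash[classes.zip(fractions)].map { |k, v| "'#{k}': #{v}" }.join(' , ')
puts summary # prints '0': 0.25 , '1': 0.75
```

Note that the stored prediction is a formatted string of per-class vote fractions rather than a single winning class, so the majority class is the one with the largest fraction.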

#traverse_regression_forest ⇒ Object

Traverse a testing set through every regression tree of the forest and get averaged predictions for every individual in the sample.


# File 'lib/nimbus/forest.rb', line 61

def traverse_regression_forest
  @predictions = {}
  prediction_count = trees.size
  @options.read_testing_data{|individual|
    individual_prediction = 0.0
    trees.each do |t|
      individual_prediction = (individual_prediction + Nimbus::Tree.traverse(t, individual.snp_list)).round(5)
    end
    @predictions[individual.id] = (individual_prediction / prediction_count).round(5)
  }
end
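
The averaging performed for each individual reduces to a rounded running sum divided by the number of trees. A minimal sketch follows, with the per-tree predictions passed in as an array of floats instead of being obtained from `Nimbus::Tree.traverse`. The helper name `averaged_prediction` is hypothetical.

```ruby
# Hypothetical helper (assumption): averages the per-tree predictions for one
# individual, rounding the running sum and the mean to 5 decimals as above.
def averaged_prediction(tree_predictions)
  sum = tree_predictions.inject(0.0) { |acc, p| (acc + p).round(5) }
  (sum / tree_predictions.size).round(5)
end

puts averaged_prediction([1.0, 2.0, 4.0]) # prints 2.33333
```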