Class: Nimbus::Tree

Inherits:
Object
Defined in:
lib/nimbus/tree.rb

Overview

Tree object representing a random tree.

A tree is generated following these steps:

• 1: Calculate the loss function for the individuals in the node (the first node contains all the individuals).

• 2: Take a random sample of the SNPs (of size m << total count of SNPs).

• 3: Compute the loss function for the split of the sample based on the value of every SNP.

• 4: If the split on the SNP with the minimum loss function also has a lower loss than the node itself, split the individuals into three nodes, based on their value for that SNP [0, 1, or 2].

• 5: Repeat from 1 for every node until:

• a) The individuals count in that node is < minimum size, OR

• b) None of the SNP splits has a loss function smaller than the node loss function.

• 6: When a node stops, label the node with the average phenotype value (for regression problems) or the majority class (for classification problems) of the individuals in the node.
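The steps above can be sketched in a few lines of Ruby. This is a minimal illustration for the regression case only, not the Nimbus implementation: the method names (`node_loss`, `build_node_sketch`), the squared-error loss, the input hashes, and the returned `{ snp_index => [child_0, child_1, child_2] }` layout are all assumptions made for the example.

```ruby
# Hypothetical sketch of the tree-growing steps (regression case only).
# genotypes:  { individual_id => [value of SNP 0, value of SNP 1, ...] }, values in 0..2
# phenotypes: { individual_id => Float }

def mean(values)
  values.sum.to_f / values.size
end

# Assumed loss: sum of squared deviations from the mean phenotype of the group.
def node_loss(ids, phenotypes)
  return 0.0 if ids.empty?
  avg = mean(ids.map { |id| phenotypes[id] })
  ids.sum { |id| (phenotypes[id] - avg)**2 }
end

def build_node_sketch(ids, genotypes, phenotypes, snp_count:, sample_size:, min_size:, rng: Random.new)
  label = mean(ids.map { |id| phenotypes[id] })
  # Stop condition (a): the node is smaller than the minimum size.
  return label if ids.size < min_size

  current_loss = node_loss(ids, phenotypes)                          # step 1
  snp_sample = (0...snp_count).to_a.sample(sample_size, random: rng) # step 2

  # Step 3: loss of the three-way split for every sampled SNP.
  best_snp = nil
  best_loss = current_loss
  best_groups = nil
  snp_sample.each do |snp|
    groups = ids.group_by { |id| genotypes[id][snp] }
    loss = groups.values.sum { |group| node_loss(group, phenotypes) }
    next unless loss < best_loss
    best_snp = snp
    best_loss = loss
    best_groups = groups
  end

  # Stop condition (b): no split improves on the node loss, so label the node (step 6).
  return label if best_snp.nil?

  # Steps 4 and 5: split by SNP value 0, 1, 2 and recurse into each child.
  children = [0, 1, 2].map do |value|
    group = best_groups.fetch(value, [])
    if group.empty?
      label # an empty child inherits the parent label in this sketch
    else
      build_node_sketch(group, genotypes, phenotypes,
                        snp_count: snp_count, sample_size: sample_size,
                        min_size: min_size, rng: rng)
    end
  end
  { best_snp => children }
end

# Toy data in which SNP 0 fully determines the phenotype.
genotypes  = { 1 => [0, 1], 2 => [0, 2], 3 => [1, 0], 4 => [1, 1], 5 => [2, 2], 6 => [2, 0] }
phenotypes = { 1 => 1.0, 2 => 1.0, 3 => 2.0, 4 => 2.0, 5 => 3.0, 6 => 3.0 }
tree = build_node_sketch(genotypes.keys, genotypes, phenotypes,
                         snp_count: 2, sample_size: 2, min_size: 2)
# tree splits on SNP 0, with leaves 1.0, 2.0 and 3.0 for the values 0, 1 and 2.
```

In the toy call the sample size equals the SNP count so that the result is deterministic; in practice m is much smaller than the total, which is what makes the tree random.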

Constant Summary

NODE_SPLIT_01_2 = `"zero"`

NODE_SPLIT_0_12 = `"two"`

Instance Attribute Summary

• Returns the value of attribute generalization_error.

• Returns the value of attribute id_to_fenotype.

• Returns the value of attribute importances.

• Returns the value of attribute individuals.

• Returns the value of attribute node_min_size.

• Returns the value of attribute predictions.

• Returns the value of attribute snp_sample_size.

• Returns the value of attribute snp_total_count.

• Returns the value of attribute structure.

• Returns the value of attribute used_snps.

Class Method Summary

• .traverse(tree_structure, data) ⇒ Object: Class method to traverse a single individual through a tree structure.

Instance Method Summary

• #build_node(individuals_ids, y_hat) ⇒ Object: Creates a node by taking a random sample of the SNPs and computing the loss function for every split by SNP of that sample.

• #estimate_importances(oob_ids) ⇒ Object: Estimation of importance for every SNP.

• #generalization_error_from_oob(oob_ids) ⇒ Object: Compute generalization error for the tree.

• #initialize(options) ⇒ Tree (constructor): Initializes a Tree object with the configuration options received (as in Nimbus::Configuration.tree).

• #seed(all_individuals, individuals_sample, ids_fenotypes) ⇒ Object: Creates the structure of the tree, as a hash of SNP splits and values.

Constructor Details

#initialize(options) ⇒ Tree

Initializes a Tree object with the configuration options received (as in Nimbus::Configuration.tree).

```ruby
# File 'lib/nimbus/tree.rb', line 25
def initialize(options)
  @snp_total_count = options[:snp_total_count]
  @snp_sample_size = options[:snp_sample_size]
  @node_min_size = options[:tree_node_min_size]
end
```

Instance Attribute Details

#generalization_error ⇒ Object

Returns the value of attribute generalization_error.

```ruby
# File 'lib/nimbus/tree.rb', line 18
def generalization_error
  @generalization_error
end
```

#id_to_fenotype ⇒ Object

Returns the value of attribute id_to_fenotype.

```ruby
# File 'lib/nimbus/tree.rb', line 19
def id_to_fenotype
  @id_to_fenotype
end
```

#importances ⇒ Object

Returns the value of attribute importances.

```ruby
# File 'lib/nimbus/tree.rb', line 18
def importances
  @importances
end
```

#individuals ⇒ Object

Returns the value of attribute individuals.

```ruby
# File 'lib/nimbus/tree.rb', line 19
def individuals
  @individuals
end
```

#node_min_size ⇒ Object

Returns the value of attribute node_min_size.

```ruby
# File 'lib/nimbus/tree.rb', line 18
def node_min_size
  @node_min_size
end
```

#predictions ⇒ Object

Returns the value of attribute predictions.

```ruby
# File 'lib/nimbus/tree.rb', line 18
def predictions
  @predictions
end
```

#snp_sample_size ⇒ Object

Returns the value of attribute snp_sample_size.

```ruby
# File 'lib/nimbus/tree.rb', line 18
def snp_sample_size
  @snp_sample_size
end
```

#snp_total_count ⇒ Object

Returns the value of attribute snp_total_count.

```ruby
# File 'lib/nimbus/tree.rb', line 18
def snp_total_count
  @snp_total_count
end
```

#structure ⇒ Object

Returns the value of attribute structure.

```ruby
# File 'lib/nimbus/tree.rb', line 18
def structure
  @structure
end
```

#used_snps ⇒ Object

Returns the value of attribute used_snps.

```ruby
# File 'lib/nimbus/tree.rb', line 18
def used_snps
  @used_snps
end
```

Class Method Details

.traverse(tree_structure, data) ⇒ Object

Class method to traverse a single individual through a tree structure.

Returns the prediction for that individual (the label of the final node reached by the individual).

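Raises:

• Nimbus::TreeError if the tree structure is neither a label (Numeric or String) nor a Hash with exactly one key.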

```ruby
# File 'lib/nimbus/tree.rb', line 57
def self.traverse(tree_structure, data)
  return tree_structure if tree_structure.is_a?(Numeric) || tree_structure.is_a?(String)

  raise Nimbus::TreeError, "Forest data has invalid structure. Please check your forest data (file)." if !(tree_structure.is_a?(Hash) && tree_structure.keys.size == 1)

  branch = tree_structure.values.first
  split_type = branch[1].to_s
  datum = data_traversing_value(data[tree_structure.keys.first - 1], split_type)

  return self.traverse(branch[datum], data)
end
```
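The listing relies on a helper, `data_traversing_value`, that is not shown on this page. The following self-contained sketch illustrates one plausible reading of the traversal. It assumes that each branch is a three-element array `[child, split_type, child]` and that a genotype of 1 is routed to index 0 for a `"zero"` split and to index 2 for a `"two"` split. Both assumptions are inferred from the constant names and from the use of `branch[1]` as the split type. The names `branch_index` and `traverse_sketch` are hypothetical.

```ruby
# Split labels copied from the constants documented on this page.
SPLIT_01_2 = "zero" # genotypes 0 and 1 share a branch, 2 is apart
SPLIT_0_12 = "two"  # genotype 0 is apart, 1 and 2 share a branch

# Hypothetical stand-in for data_traversing_value (assumed behaviour):
# genotypes 0 and 2 index the branch directly, genotype 1 follows the split type.
def branch_index(genotype, split_type)
  return genotype unless genotype == 1
  case split_type
  when SPLIT_01_2 then 0
  when SPLIT_0_12 then 2
  else genotype
  end
end

# Walks a nested { snp_id => [child, split_type, child] } structure until a label is reached.
def traverse_sketch(node, genotypes)
  return node if node.is_a?(Numeric) || node.is_a?(String)
  raise ArgumentError, "invalid tree structure" unless node.is_a?(Hash) && node.size == 1

  snp_id, branch = node.first
  split_type = branch[1].to_s
  # SNP ids are 1-based in the structure, the genotype array is 0-based.
  index = branch_index(genotypes[snp_id - 1], split_type)
  traverse_sketch(branch[index], genotypes)
end

example_tree = { 2 => [1.5, "zero", { 1 => [0.2, "two", 3.0] }] }
first  = traverse_sketch(example_tree, [0, 1]) # SNP 2 is 1 under a "zero" split, so it takes index 0
second = traverse_sketch(example_tree, [0, 2]) # SNP 2 is 2, then SNP 1 is 0 in the inner node
third  = traverse_sketch(example_tree, [1, 2]) # SNP 2 is 2, then SNP 1 is 1 under a "two" split
```

With this structure the three calls reach the labels 1.5, 0.2 and 3.0 respectively.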

Instance Method Details

#build_node(individuals_ids, y_hat) ⇒ Object

Creates a node by taking a random sample of the SNPs and computing the loss function for every split by SNP of that sample.

```ruby
# File 'lib/nimbus/tree.rb', line 43
def build_node(individuals_ids, y_hat)
end
```

#estimate_importances(oob_ids) ⇒ Object

Estimation of importance for every SNP.

```ruby
# File 'lib/nimbus/tree.rb', line 51
def estimate_importances(oob_ids)
end
```

#generalization_error_from_oob(oob_ids) ⇒ Object

Compute generalization error for the tree.

```ruby
# File 'lib/nimbus/tree.rb', line 47
def generalization_error_from_oob(oob_ids)
end
```
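The method body is empty in this base class, so the page does not say how the error is computed. As an illustration only, a generalization error for a regression tree could be estimated as the mean squared error over the out-of-bag individuals. The helper name `oob_mean_squared_error`, the block-based prediction interface, and the choice of squared error are assumptions, not the Nimbus API.

```ruby
# Hypothetical sketch: mean squared error over out-of-bag (OOB) individuals.
# oob_ids:    ids of individuals that were not used to grow the tree
# phenotypes: { individual_id => observed value }
# The block receives an id and returns the tree's prediction for that individual.
def oob_mean_squared_error(oob_ids, phenotypes)
  return 0.0 if oob_ids.empty?
  squared_errors = oob_ids.sum { |id| (yield(id) - phenotypes[id])**2 }
  squared_errors.to_f / oob_ids.size
end

observed  = { 1 => 1.0, 2 => 3.0 }
predicted = { 1 => 1.5, 2 => 2.0 }
error = oob_mean_squared_error([1, 2], observed) { |id| predicted[id] }
# error is (0.25 + 1.0) / 2 = 0.625
```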

#seed(all_individuals, individuals_sample, ids_fenotypes) ⇒ Object

Creates the structure of the tree, as a hash of SNP splits and values.

It initializes the needed variables and then defines the first node of the tree. The rest of the tree structure is computed recursively, building every node by calling `build_node`.

```ruby
# File 'lib/nimbus/tree.rb', line 35
def seed(all_individuals, individuals_sample, ids_fenotypes)
  @individuals = all_individuals
  @id_to_fenotype = ids_fenotypes
  @predictions = {}
  @used_snps = []
end
```